Statistical models for multilevel data with “Don’t know” category: implication for program evaluation

Huang, Beili
Background: The “Don’t know” (DK) response category is increasingly used in surveys for longitudinal research, which creates unique challenges for data analysis and program evaluation. Strategies that handle DKs with missing data methods may yield biased and inaccurate estimates and discard valuable information.
Objectives: (i) To illustrate, through simulation, the advantages of the proposed two-part mixed effects model over other methods for longitudinal outcomes with DKs; (ii) to apply the proposed model to a mental health program (Project 11) to evaluate the program’s effects on participants’ awareness and level of connectedness.
Methods: We applied a two-part mixed effects model for longitudinal outcomes containing DKs. A simulation study was designed to illustrate the advantages of the proposed model over other methods; it varied sample size, DK proportion, and the strength of correlation between DKs and non-DK responses under different DK mechanisms. We also compared the proposed model with other approaches by applying them to data from Project 11. In that analysis, we further extended the two-part model to account for within-cluster correlation among students within schools and to explore gender differences in program effects.
Results: The proposed two-part mixed effects model outperformed other methods (i.e., CCA, SI, and ML) in estimating both intervention and random effects under all DK mechanisms. In contrast, methods that treated DKs as missing showed problems in at least some scenarios. Application of the two-part model to the Project 11 data suggested a significant intervention effect on improving connectedness among boys ($\hat{\beta}_{11} = -0.071$, p = 0.049), whereas no significant improvement was observed among girls. Significant correlations were also found between the likelihood of a DK response and connectedness level at both the student level and the school level.
Conclusions: The proposed two-part mixed effects model is highly recommended for analyzing data with DK responses, based on the results of both simulations and empirical data analysis. Missing data techniques should be avoided due to potentially biased and/or imprecise estimates and the loss of information conveyed by DK responses.
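For readers unfamiliar with this model class, a generic two-part mixed effects model can be sketched as below. This is only an illustrative sketch under stated assumptions, not the specification used in the thesis: it assumes a continuous non-DK response, a single random intercept per subject in each part, and hypothetical symbols ($D_{ij}$, $Y_{ij}$, $\boldsymbol{\alpha}$, $\boldsymbol{\gamma}$, $a_i$, $b_i$, $\rho$) that do not correspond to the abstract's $\hat{\beta}_{11}$.

```latex
% Illustrative notation only (assumed, not taken from the thesis):
%   D_{ij} = 1 if subject i gives a DK response at occasion j, 0 otherwise
%   Y_{ij} = observed non-DK response, defined only when D_{ij} = 0
%   x_{ij} = covariate vector (e.g., time, intervention group)
\begin{align*}
  % Part 1: mixed effects logistic model for the probability of a DK response
  \operatorname{logit} \Pr(D_{ij} = 1 \mid a_i)
    &= \mathbf{x}_{ij}^{\top} \boldsymbol{\alpha} + a_i, \\
  % Part 2: linear mixed effects model for the response, given a non-DK answer
  Y_{ij} \mid (D_{ij} = 0,\, b_i)
    &= \mathbf{x}_{ij}^{\top} \boldsymbol{\gamma} + b_i + \varepsilon_{ij},
    \qquad \varepsilon_{ij} \sim N(0, \sigma^2), \\
  % Link between the parts: correlated random intercepts
  \begin{pmatrix} a_i \\ b_i \end{pmatrix}
    &\sim N\!\left(
      \begin{pmatrix} 0 \\ 0 \end{pmatrix},
      \begin{pmatrix}
        \sigma_a^2 & \rho\,\sigma_a \sigma_b \\
        \rho\,\sigma_a \sigma_b & \sigma_b^2
      \end{pmatrix}
    \right).
\end{align*}
```

In a sketch of this kind, the correlation $\rho$ is what lets the likelihood of a DK response carry information about the response level; extending to students nested within schools would add a further pair of correlated random effects at the school level.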
Keywords: “Don’t know (DK)” response, Two-part mixed effects model, Longitudinal data, Connectedness, Mental health, Program evaluation