Analysis of multivariate responses in patient reported outcome measures: missing data and auxiliary variables
Abstract
Patient-reported outcome measures (PROMs) are increasingly used in clinical registries and clinical trials to collect information about patients' perspectives on their own health. Item non-response, or missing data, which occurs when patients fail to complete or respond to PROM questions, threatens the validity of findings from the assessment of group differences or longitudinal change in PROMs. The goal of this research was to develop and evaluate methods for addressing item non-response in PROMs. Four related studies were undertaken using data from the population-based Winnipeg Regional Health Authority Joint Replacement Registry and simulated data. The first study compared the performance of non-negative matrix factorization (NNMF), an unsupervised machine-learning method that uses optimization techniques to detect low-dimensional structure in the data, with that of full information maximum likelihood (FIML). The methods were applied to test for differential item functioning in multidimensional PROMs. The second study evaluated the performance of NNMF, FIML, and multiple imputation (MI) with a conditional proportional odds model when estimating longitudinal change in latent variable means. The third study investigated the use of auxiliary variables, which are potential correlates of missingness in the data, in the imputation model and compared the precision and bias of FIML and of MI with and without auxiliary variables when estimating longitudinal change in PROM scores. The fourth study proposed an enhanced weighted NNMF, which uses observed item responses as an auxiliary variable to define weights for item-level imputation, and compared its performance with that of FIML and NNMF. We found that the Type I error rates and statistical power for NNMF were comparable to those of the FIML method. The NNMF method is relatively efficient when the sample size is large (i.e., >500) and the percentage of non-response is high, but less optimal under other data-analytic conditions.
We also showed that including auxiliary information in the imputation model increased the precision and reduced the bias of the estimated parameters. This research contributes to the statistical literature on methods to address missing data in PROMs, with potential applications in clinical and quality-of-life research. It also demonstrates the practicality of using observed item responses to define an auxiliary variable, which provides a basis for an accessible approach to identifying auxiliary variables in PROMs.
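
As an illustration of the general idea behind NNMF-based item-level imputation, the following minimal sketch fits a low-rank non-negative factorization to the observed item responses only and uses the fitted values to fill in the missing ones. It is not the enhanced weighted NNMF proposed in this research: the weights here are simply 0 for missing and 1 for observed responses, and the function name `nmf_impute`, the multiplicative-update scheme, the default settings, and the toy data are all assumptions made for illustration.

```python
import numpy as np


def nmf_impute(X, rank=2, n_iter=500, seed=0, eps=1e-9):
    """Fill NaN entries of a non-negative matrix using masked NMF.

    Hypothetical helper: rows are respondents, columns are PROM items,
    and NaN marks item non-response.
    """
    X = np.asarray(X, dtype=float)
    mask = ~np.isnan(X)           # True where an item response was observed
    X0 = np.where(mask, X, 0.0)   # zero-filled copy; zeros are masked out below

    rng = np.random.default_rng(seed)
    n, p = X.shape
    W = rng.random((n, rank)) + 0.1   # respondent-by-factor loadings
    H = rng.random((rank, p)) + 0.1   # factor-by-item loadings

    # Multiplicative updates restricted to observed cells (binary weights).
    for _ in range(n_iter):
        WH = (W @ H) * mask
        W *= (X0 @ H.T) / (WH @ H.T + eps)
        WH = (W @ H) * mask
        H *= (W.T @ X0) / (W.T @ WH + eps)

    X_hat = W @ H
    # Keep observed responses as they are; fill only the missing cells.
    return np.where(mask, X, X_hat)


# Toy item-response matrix with three missing responses (made-up data).
responses = np.array(
    [
        [1.0, 2.0, 3.0, 2.0],
        [2.0, 4.0, np.nan, 4.0],
        [3.0, np.nan, 9.0, 6.0],
        [1.0, 2.0, 3.0, np.nan],
    ]
)
completed = nmf_impute(responses, rank=1)
print(np.round(completed, 2))
```

For ordinal PROM items, the filled-in values would typically be rounded and clipped to the valid response categories before analysis. Replacing the binary mask with graded weights is one way a weighted variant could be built, although the specific weighting used in the fourth study is not reproduced here.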