Regularized regression in generalized linear measurement error models with instrumental variables - variable selection and parameter estimation

dc.contributor.author: Xue, Lin
dc.contributor.examiningcommittee: Wang, Xikui (Statistics)
dc.contributor.examiningcommittee: Torabi, Mahmoud (Community Health Sciences)
dc.contributor.examiningcommittee: He, Wenqing (Western University)
dc.contributor.supervisor: Wang, Liqun (Statistics)
dc.date.accessioned: 2020-09-09T15:50:20Z
dc.date.available: 2020-09-09T15:50:20Z
dc.date.copyright: 2020-08-25
dc.date.issued: 2020
dc.date.submitted: 2020-08-25T21:52:33Z
dc.degree.discipline: Statistics
dc.degree.level: Doctor of Philosophy (Ph.D.)
dc.description.abstract: Regularization is a commonly used technique in high-dimensional data analysis. With a properly chosen tuning parameter for certain penalty functions, the resulting estimator is consistent in both variable selection and parameter estimation. Most regularization methods assume that the data can be observed and precisely measured. However, it is well known that measurement error (ME) is ubiquitous in real-world datasets. In many situations some or all covariates cannot be observed directly or are measured with error. For example, in studies of cardiovascular disease, the goal is to identify important risk factors such as blood pressure, cholesterol level and body mass index, which cannot be measured precisely. Instead, the corresponding proxies are used in the analysis. If the ME is ignored in regularized regression, the resulting naive estimator can have high selection and estimation bias. As a result, important covariates may be falsely dropped from the model and redundant covariates incorrectly retained. We illustrate how ME affects variable selection and parameter estimation through theoretical analysis and several numerical examples. To correct for the effects of ME, we propose an instrumental variable assisted regularization method for linear and generalized linear models. We show that the proposed estimator has the oracle property, in the sense that it is consistent in both variable selection and parameter estimation, and we derive its asymptotic distribution. In addition, we show that the implementation of the proposed method is equivalent to the plug-in approach under linear models, and that the asymptotic variance-covariance matrix has a compact form. Extensive simulation studies in linear, logistic and Poisson log-linear regression show that the proposed estimator outperforms the naive estimator in both linear and generalized linear models.
Although the focus of this study is classical ME, we also discuss variable selection and estimation in the setting of Berkson ME. In particular, our finite-sample simulation studies show that, in contrast to estimation in linear regression, Berkson ME may cause bias in variable selection and estimation. Finally, the proposed method is applied to real datasets on diabetes and from the Framingham Heart Study.
dc.description.note: October 2020
dc.identifier.uri: http://hdl.handle.net/1993/35023
dc.language.iso: eng
dc.rights: open access
dc.subject: Measurement error
dc.subject: Regularization
dc.subject: Instrumental variable
dc.title: Regularized regression in generalized linear measurement error models with instrumental variables - variable selection and parameter estimation
dc.type: doctoral thesis
Files
Original bundle
Name: Xue_Lin.pdf
Size: 1.33 MB
Format: Adobe Portable Document Format