Regularized regression in generalized linear measurement error models with instrumental variables -variable selection and parameter estimation

Xue, Lin

dc.contributor.author	Xue, Lin
dc.contributor.examiningcommittee	Wang, Xikui (Statistics)	en_US
dc.contributor.examiningcommittee	Torabi, Mahmoud (Community Health Sciences)	en_US
dc.contributor.examiningcommittee	He, Wenqing (Western University)	en_US
dc.contributor.supervisor	Wang, Liqun (Statistics)	en_US
dc.date.accessioned	2020-09-09T15:50:20Z
dc.date.available	2020-09-09T15:50:20Z
dc.date.copyright	2020-08-25
dc.date.issued	2020	en_US
dc.date.submitted	2020-08-25T21:52:33Z	en_US
dc.degree.discipline	Statistics	en_US
dc.degree.level	Doctor of Philosophy (Ph.D.)	en_US
dc.description.abstract	Regularization is a commonly used technique in high-dimensional data analysis. With a properly chosen tuning parameter for certain penalty functions, the resulting estimator is consistent in both variable selection and parameter estimation. Most regularization methods assume that the data can be observed and precisely measured. However, it is well known that measurement error (ME) is ubiquitous in real-world datasets. In many situations, some or all covariates cannot be observed directly or are measured with error. For example, in studies of cardiovascular disease, the goal is to identify important risk factors such as blood pressure, cholesterol level and body mass index, which cannot be measured precisely. Instead, the corresponding proxies are used for analysis. If the ME is ignored in regularized regression, the resulting naive estimator can have high selection and estimation bias. As a result, important covariates may be falsely dropped from the model and redundant covariates incorrectly retained. We illustrate how ME affects variable selection and parameter estimation through theoretical analysis and several numerical examples. To correct for the ME effects, we propose an instrumental variable assisted regularization method for linear and generalized linear models. We show that the proposed estimator has the oracle property, that is, it is consistent in both variable selection and parameter estimation. The asymptotic distribution of the estimator is derived. In addition, we show that the implementation of the proposed method is equivalent to the plug-in approach under linear models, and that the asymptotic variance-covariance matrix has a compact form. Extensive simulation studies in linear, logistic and Poisson log-linear regression show that the proposed estimator outperforms the naive estimator in both linear and generalized linear models.
Although the focus of this study is classical ME, we also discuss variable selection and estimation in the setting of Berkson ME. In particular, our finite-sample simulation studies show that, in contrast to estimation in linear regression, Berkson ME may cause bias in variable selection and estimation. Finally, the proposed method is applied to real datasets on diabetes and from the Framingham Heart Study.	en_US
dc.description.note	October 2020	en_US
dc.identifier.uri	http://hdl.handle.net/1993/35023
dc.language.iso	eng	en_US
dc.rights	open access	en_US
dc.subject	Measurement error	en_US
dc.subject	Regularization	en_US
dc.subject	Instrumental variable	en_US
dc.title	Regularized regression in generalized linear measurement error models with instrumental variables -variable selection and parameter estimation	en_US
dc.type	doctoral thesis	en_US

Files

Original bundle

Name: Xue_Lin.pdf
Size: 1.33 MB
Format: Adobe Portable Document Format

License bundle

Name: license.txt
Size: 2.2 KB
Format: Item-specific license agreed to upon submission

Collections

FGS - Electronic Theses and Practica