Robust regression analysis using median nomination sampling
Abstract
In this thesis, we introduce a novel methodology for robust regression analysis in settings where conventional mean regression performs poorly because of heteroscedasticity or outliers. Instead of relying on simple random sampling (SRS), the methodology uses median nomination sampling (MedNS), which exploits readily available ranking information to obtain training data that better capture the central tendency of the underlying population, so that the sample remains representative even when outliers are extensive. We propose a new loss function that incorporates the additional rank information in MedNS data when fitting the regression model, yielding a form of robust regression. Through simulation studies, including a high-dimensional setting and a non-linear quantile regression setting, and a real data application, we compare the proposed approach with its SRS counterpart in terms of the integrated mean squared error of the regression estimates. The proposed method achieves higher efficiency relative to its SRS counterpart when outliers are present in the data.
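To make the sampling design concrete, the following is a minimal Python sketch of how a MedNS sample might be drawn, assuming the usual nomination-sampling scheme: for each training unit, draw a small set at random, rank its members, and keep only the unit ranked in the middle. The function name `medns_indices`, the `set_size` parameter, the ranking by the response value itself (perfect ranking), and the contaminated toy population are all illustrative assumptions; this is not the thesis's implementation, and the proposed loss function is not reproduced here.

```python
import numpy as np


def medns_indices(rank_values, n, set_size, rng):
    """Return indices of n units chosen by median nomination sampling.

    Assumptions (hypothetical, for illustration only):
    - rank_values holds the variable used to rank units within a set;
    - ranking is perfect, i.e. based on the actual values;
    - set_size is odd, so each set has a unique middle rank.
    """
    rank_values = np.asarray(rank_values)
    if set_size % 2 == 0:
        raise ValueError("set_size must be odd for a unique median")
    chosen = np.empty(n, dtype=int)
    for i in range(n):
        # Draw one set of candidate units without replacement.
        candidates = rng.choice(len(rank_values), size=set_size, replace=False)
        # Rank the candidates and nominate the one in the middle.
        order = np.argsort(rank_values[candidates])
        chosen[i] = candidates[order[set_size // 2]]
    return chosen


if __name__ == "__main__":
    rng = np.random.default_rng(2024)
    # Toy population: linear signal plus noise, with 10% gross outliers.
    x = rng.uniform(0.0, 1.0, size=5000)
    y = 2.0 * x + rng.normal(0.0, 0.2, size=5000)
    outliers = rng.random(5000) < 0.10
    y[outliers] += rng.normal(0.0, 10.0, size=outliers.sum())

    srs = rng.choice(len(y), size=200, replace=False)
    medns = medns_indices(y, n=200, set_size=5, rng=rng)

    # Compare the share of outlying units in each training sample.
    print("outlier share, SRS:  ", outliers[srs].mean())
    print("outlier share, MedNS:", outliers[medns].mean())
```

In this sketch the sets are ranked on the response alone; in practice the ranking would come from whatever inexpensive information is available, and the selected units would then be passed, along with their rank information, to the regression fit.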