On fractionally-supervised classification with nominated samples

dc.contributor.author: Wang, Jingyu
dc.contributor.examiningcommittee: Johnson, Brad (Statistics) [en_US]
dc.contributor.examiningcommittee: Turgeon, Max (Statistics) [en_US]
dc.contributor.supervisor: Jafari Jozani, Mohammad (Statistics) [en_US]
dc.date.accessioned: 2021-01-18T14:18:02Z
dc.date.available: 2021-01-18T14:18:02Z
dc.date.copyright: 2021-01-14
dc.date.issued: 2020 [en_US]
dc.date.submitted: 2021-01-14T19:05:52Z [en_US]
dc.degree.discipline: Statistics [en_US]
dc.degree.level: Master of Science (M.Sc.) [en_US]
dc.description.abstract: Fractionally-supervised classification (FSC) is a recently proposed classification method that combines the finite mixture model (FMM), weighted likelihood, and the Expectation-Maximization (EM) algorithm to adjust the weight of labeled (unlabeled) data in the training process of a classifier and obtain the best classification result. All results in the literature pertinent to FSC are based on simple random sampling (SRS). In this thesis, we extend the FSC approach to a rank-based sampling design called nominated sampling (NS), which collects more representative data than SRS from the tails of the underlying population. We show that the usual EM algorithm for finite mixture modeling using nominated samples leads to incorrect maximization problems. We propose a set of proper latent variables and modify the usual EM algorithm for the FSC approach based on maxima (minima) nominated samples, and we evaluate the resulting estimation and classification results. We compare the mean squared error (MSE) of the estimates obtained by FSC with the two EM algorithms and observe that the EM algorithm with proper latent variables has a higher relative efficiency when applied to nominated samples. Moreover, we compute the adjusted Rand index (ARI) to assess classification performance under different weights of unlabeled data and determine the best choice of weight for the purpose of FSC. [en_US]
dc.description.note: February 2021 [en_US]
dc.identifier.uri: http://hdl.handle.net/1993/35257
dc.language.iso: eng [en_US]
dc.rights: open access [en_US]
dc.subject: Statistics [en_US]
dc.subject: EM-Algorithm [en_US]
dc.subject: Finite Mixture Models [en_US]
dc.subject: Nomination Sampling [en_US]
dc.subject: Fractionally-Supervised Classification [en_US]
dc.title: On fractionally-supervised classification with nominated samples [en_US]
dc.type: master thesis [en_US]
Files
Original bundle
Name: wang_jingyu.pdf
Size: 686 KB
Format: Adobe Portable Document Format
Description: MSc. thesis
License bundle
Name: license.txt
Size: 2.2 KB
Format: Item-specific license agreed to upon submission
Description: