On fractionally-supervised classification with nominated samples
Abstract
Fractionally-supervised classification (FSC) is a recently proposed classification method that combines the finite mixture model (FMM), weighted likelihood, and the Expectation-Maximization (EM) algorithm. It adjusts the relative weight of labeled and unlabeled data when training a classifier in order to obtain the best classification result. All existing results on FSC are based on simple random sampling (SRS). In this thesis, we extend the FSC approach to a rank-based sampling design called nominated sampling (NS), which collects more representative data than SRS from the tails of the underlying population. We show that the usual EM algorithm for finite mixture modeling leads to incorrect maximization problems when applied to nominated samples. We therefore propose a set of proper latent variables and modify the usual EM algorithm for the FSC approach based on maxima (minima) nominated samples, and we evaluate the resulting estimation and classification performance. We compare the mean squared error (MSE) of the estimates obtained by FSC under the two EM algorithms and observe that the EM algorithm with proper latent variables has higher relative efficiency for nominated samples. Moreover, we compute the adjusted Rand index (ARI) to assess classification performance under different weights of the unlabeled data and to determine the best choice of weight for FSC.
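To make the sampling design concrete, the following is a minimal Python sketch of a maxima nominated sample compared with a simple random sample. It is an illustration under stated assumptions, not the procedure used in the thesis: the two-component normal mixture, the set size of 5, the sample size of 200, and the function names are all hypothetical choices, and the ranking within each set is assumed to be perfect. Each nominated observation is the largest of `set_size` independent draws, so its density is k F(y)^(k-1) f(y) rather than the population density f(y), which suggests why applying the usual mixture EM algorithm directly to such data can target the wrong likelihood.

```python
import numpy as np


def two_component_mixture(rng, shape):
    """Hypothetical population: 0.5 * N(0, 1) + 0.5 * N(3, 1)."""
    from_second = rng.random(shape) < 0.5
    return np.where(
        from_second,
        rng.normal(3.0, 1.0, shape),
        rng.normal(0.0, 1.0, shape),
    )


def maxima_nominated_sample(draw, n, set_size, rng):
    """Draw n independent sets of `set_size` units and keep each set's maximum.

    Assumes perfect ranking, i.e. the nominated unit really is the largest.
    """
    sets = draw(rng, (n, set_size))
    return sets.max(axis=1)


rng = np.random.default_rng(0)
srs = two_component_mixture(rng, 200)
mns = maxima_nominated_sample(two_component_mixture, n=200, set_size=5, rng=rng)

# The nominated sample concentrates in the upper tail of the population.
print("SRS mean:", srs.mean())
print("Maxima nominated sample mean:", mns.mean())
```

A minima nominated sample would follow the same pattern with `sets.min(axis=1)`.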