Pattern recognition using robust discrimination and fuzzy set theoretic preprocessing
Pizzi, Nicolino John.
Classification is the empirical process of creating a mapping from individual patterns to a set of classes and its subsequent use in predicting the classes to which new patterns belong. Tremendous effort has been expended in developing systems for the creation of the mapping component. Less effort has been devoted to the nature and analysis of the data component, namely, strategies that transform the data in order to simplify, in some sense, the classification process. The purpose of this thesis is to partially redress this imbalance by introducing two novel preprocessing methodologies. Fuzzy interquartile encoding determines the respective degrees to which a feature belongs to a collection of fuzzy sets and subsequently uses these membership grades in place of the original feature. Burnishing tarnished gold standards compensates for the possible imprecision of a well-established reference test by adjusting, if necessary, the class labels in the design set while maintaining the test's vital discriminatory power. The methodologies were applied to several synthetic data sets as well as biomedical spectra acquired from magnetic resonance and infrared spectrometers. Both fuzzy encoding and burnishing consistently improved the discriminatory power of the underlying classifiers. Both methods are insensitive to outliers and often reduce the training time for iterative classifiers such as the multi-layer perceptron. With burnishing, reclassification only occurs for data within the design set; outliers within the test set are flagged but not altered. Therefore, the accepted gold standard is left in a pristine state sullied only by its original tarnish.
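To make the idea of fuzzy encoding concrete, the following is a minimal sketch of replacing a single feature by its membership grades in a collection of fuzzy sets. It is an illustration under stated assumptions, not the definition used in the thesis: the function name `fuzzy_encode`, the triangular shape of the membership functions, and the choice of the design-set quartiles as the set centers are all hypothetical choices made here.

```python
import numpy as np

def fuzzy_encode(x, centers):
    """Map each value in x to its membership grades in triangular fuzzy sets.

    ASSUMPTION: triangular sets peaking at `centers` (strictly increasing);
    the thesis abstract does not specify the membership functions.
    Returns an array of shape (len(x), len(centers)).
    """
    x = np.atleast_1d(np.asarray(x, dtype=float))
    c = np.asarray(centers, dtype=float)
    # Values beyond the outermost centers saturate in the edge sets,
    # so an extreme value cannot produce an extreme encoding.
    xc = np.clip(x, c[0], c[-1])
    grades = np.zeros((x.size, c.size))
    for k in range(c.size):
        # Rising edge from the previous center up to this one.
        left = (xc - c[k - 1]) / (c[k] - c[k - 1]) if k > 0 else np.ones_like(xc)
        # Falling edge from this center down to the next one.
        right = (c[k + 1] - xc) / (c[k + 1] - c[k]) if k < c.size - 1 else np.ones_like(xc)
        grades[:, k] = np.clip(np.minimum(left, right), 0.0, 1.0)
    return grades

# Hypothetical usage: centers taken from the quartiles of a design-set feature.
design_feature = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 100.0])
quartile_centers = np.percentile(design_feature, [25, 50, 75])
encoded = fuzzy_encode(design_feature, quartile_centers)
```

In this sketch each feature value becomes a row of grades that sum to one, and the classifier would be trained on those rows instead of the raw feature. Because values outside the outer centers are clipped, the value 100.0 above is encoded the same way as any other value at or beyond the upper quartile.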