Modified K-means clustering algorithms for feature selection

Akhter, Ayeasha

Files

Akhter_Ayeasha.pdf(864 KB)

Date

2023-06-14

Authors

Akhter, Ayeasha

Abstract

Processing high-dimensional data with hundreds or thousands of features is computationally expensive. Features that do not significantly influence class predictions add to the computing load during classification. Feature selection, as a dimensionality reduction strategy, picks a small subset of important features from the original set by eliminating unnecessary, redundant, or noisy ones. This study describes two new feature selection methods built on k-means-based clustering. The approach reduces the number of features by clustering the D features into k (k < D) clusters and then choosing representatives for each cluster: either the feature closest to the cluster center or the highest-weighted features among the cluster members. After removing 41.4% of the features from the VIRUS-MNIST dataset, both of our proposed methods deliver accuracy equivalent to that obtained on the original dataset, in less time. On the Wine dataset, our proposed methods outperform sparse k-means, PCA, LLE, and the wk-means-based feature selection method for clustering by ANN following feature reduction. On the CNAE dataset, our second method is more accurate than the modified k-means feature selection method while using fewer features.
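The first selection rule in the abstract (cluster the D features into k groups, then keep the feature closest to each cluster center) can be illustrated with a minimal sketch. This is not the thesis implementation: the function names `kmeans` and `select_features`, the per-feature standardization, the squared Euclidean distance, and the random initialization are all assumptions made for illustration. The weighted variant is not shown because the abstract does not say how the weights are computed.

```python
import numpy as np


def assign(points, centers):
    """Index of the nearest center (squared Euclidean distance) for each point."""
    dists = ((points[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    return dists.argmin(axis=1)


def kmeans(points, k, n_iter=100, seed=0):
    """Plain Lloyd's k-means. Returns (centers, labels)."""
    rng = np.random.default_rng(seed)
    # Initialize centers from k distinct points (an assumed, simple choice).
    start = rng.choice(len(points), size=k, replace=False)
    centers = points[start].astype(float)
    for _ in range(n_iter):
        labels = assign(points, centers)
        new_centers = centers.copy()
        for j in range(k):
            members = points[labels == j]
            if len(members) > 0:  # leave an empty cluster's center unchanged
                new_centers[j] = members.mean(axis=0)
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return centers, assign(points, centers)


def select_features(X, k, seed=0):
    """Cluster the D columns of X into k groups and return, for each
    non-empty cluster, the index of the column closest to its center."""
    # Assumption: standardize each feature so distances compare shape, not scale.
    std = X.std(axis=0)
    std[std == 0] = 1.0
    feats = ((X - X.mean(axis=0)) / std).T  # one row per feature
    centers, labels = kmeans(feats, k, seed=seed)
    selected = []
    for j in range(k):
        idx = np.flatnonzero(labels == j)
        if idx.size == 0:
            continue
        d = ((feats[idx] - centers[j]) ** 2).sum(axis=1)
        selected.append(int(idx[d.argmin()]))
    return sorted(selected)
```

Calling `select_features(X, k)` on an array `X` of shape (samples, D) returns at most k column indices, and `X[:, selected]` is then the reduced dataset. Because the features themselves are the points being clustered, the matrix is transposed before k-means runs.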

Keywords

Modified K-means, K-means, Feature selection

URI

http://hdl.handle.net/1993/37434

Collections

FGS - Electronic Theses and Practica
