Modified K-means clustering algorithms for feature selection

dc.contributor.author: Akhter, Ayeasha
dc.contributor.examiningcommittee: Mohammed, Noman (Computer Science)
dc.contributor.examiningcommittee: McLeod, Bob (Electrical and Computer Engineering)
dc.contributor.supervisor: Ferens, Ken
dc.date.accessioned: 2023-07-28T15:09:44Z
dc.date.available: 2023-07-28T15:09:44Z
dc.date.issued: 2023-06-14
dc.date.submitted: 2023-07-25T21:35:32Z (en_US)
dc.degree.discipline: Electrical and Computer Engineering (en_US)
dc.degree.level: Master of Science (M.Sc.)
dc.description.abstract: Computation becomes expensive when dealing with high-dimensional data that has hundreds or thousands of features. Features that do not significantly influence class predictions add to the computing load during classification. Feature selection, as a dimensionality reduction strategy, tries to pick a small subset of important features from the original set by eliminating unnecessary, redundant, or noisy features. This study describes two new feature selection methods that build on the effectiveness of k-means-based clustering. The aim is to reduce the number of features by clustering the D features into k (k < D) clusters and representing each cluster by one of its members: either the feature closest to the cluster center or the highest-weighted feature among the cluster members. After removing 41.4% of the features from the VIRUS-MNIST dataset, both proposed methods deliver accuracy equivalent to that obtained on the original dataset, in a shorter amount of time. On the Wine dataset, following feature reduction, the proposed methods outperform sparse k-means, PCA, LLE, and the wk-means-based feature selection method for clustering by ANN. On the CNAE dataset, the second method achieves higher accuracy than the modified k-means feature selection method while using fewer features.
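The first selection rule in the abstract (cluster the D features into k groups, then keep the feature closest to each cluster center) can be illustrated with a short sketch. This is not the thesis's modified k-means: it assumes plain Lloyd's k-means with Euclidean distance and random initialization from existing features, and the names `select_features`, `nearest`, and `sq_dist` are hypothetical. The weight-based second rule is not shown because the abstract does not define the weights.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest(f, centers):
    """Index of the center closest to feature vector f."""
    return min(range(len(centers)), key=lambda c: sq_dist(f, centers[c]))


def select_features(X, k, n_iter=50, seed=0):
    """Cluster the D feature columns of X into k groups and return one
    representative column index per non-empty cluster (sorted).

    X is a list of samples, each a list of D feature values.
    Assumption: standard Lloyd's k-means, not the thesis's modified variant.
    """
    # Treat each feature (column) as one point in sample space.
    features = [list(col) for col in zip(*X)]
    rng = random.Random(seed)
    # Assumption: initial centers are k distinct, randomly chosen features.
    centers = [features[i][:] for i in rng.sample(range(len(features)), k)]

    for _ in range(n_iter):
        # Assignment step: each feature joins its nearest center.
        labels = [nearest(f, centers) for f in features]
        # Update step: each center becomes the mean of its member features.
        new_centers = []
        for c in range(k):
            members = [f for f, lab in zip(features, labels) if lab == c]
            if members:
                new_centers.append([sum(v) / len(members) for v in zip(*members)])
            else:
                new_centers.append(centers[c])  # leave an empty cluster's center alone
        if new_centers == centers:
            break
        centers = new_centers

    # Final assignment against the final centers.
    labels = [nearest(f, centers) for f in features]

    # Representative step: keep the member feature closest to each center.
    selected = []
    for c in range(k):
        members = [j for j, lab in enumerate(labels) if lab == c]
        if members:
            selected.append(min(members, key=lambda j: sq_dist(features[j], centers[c])))
    return sorted(selected)


# Toy data: 3 samples, 4 features; columns 0/1 and columns 2/3 are near-duplicates.
X = [
    [0.0, 0.1, 10.0, 10.1],
    [0.0, 0.0, 10.0, 10.0],
    [0.0, 0.0, 10.0, 10.0],
]
print(select_features(X, k=2))
```

On the toy data, each pair of near-duplicate columns forms one cluster, so one column from each pair is kept and the redundant one is dropped. Columns on very different scales would likely need normalization first, since Euclidean distance is scale-sensitive.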
dc.description.note: October 2023
dc.identifier.uri: http://hdl.handle.net/1993/37434
dc.language.iso: eng
dc.rights: open access (en_US)
dc.subject: Modified K-means
dc.subject: K-means
dc.subject: Feature selection
dc.title: Modified K-means clustering algorithms for feature selection
dc.type: master thesis (en_US)
local.subject.manitoba: no
Files
Original bundle
Name: Akhter_Ayeasha.pdf
Size: 864 KB
Format: Adobe Portable Document Format