Cognitive vector quantization for malware detection

Heim, Ainslee
Malware detection remains a persistent challenge because malware designs and techniques keep evolving. Many machine learning algorithms, such as artificial neural networks (ANNs), use supervised learning, which relies on a labeled dataset. In real-time systems, however, training data are unlabeled and obtained as the system runs. In this case, unsupervised machine learning algorithms may be used. A problem with them is that the generated clusters are not identified as either malware or benign. Much work has been done on incorporating cognition into ANNs to improve their performance. This thesis explores using a Vector Quantization Artificial Neural Network (VQ-ANN) to classify a malware dataset using unsupervised learning. Because the VQ-ANN is unsupervised, the basic algorithm does not know the classification of examples during training and attempts to sort the dataset based on similarities. This thesis uses a novel method to identify the clusters as either malware or benign by using elements of cognition, in the form of the Variance Fractal Dimension, to label the clusters formed in a VQ-ANN. Compared with currently used clustering methods (U-Matrix), our algorithm consistently produced a higher accuracy, with an average accuracy of 98.1%.
Vector quantization, Cognition, Fractal Dimensions, Malware detection, Machine learning, Self-organizing map
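
The unsupervised sorting step mentioned in the abstract can be illustrated with a minimal sketch of generic winner-take-all vector quantization. This is not the thesis's implementation: the function names (train_vq, assign_clusters), the parameters (n_codewords, epochs, lr, seed), and the choice of Euclidean distance are all assumptions made here for illustration, and the Variance Fractal Dimension labeling step is not shown.

```python
import numpy as np

def train_vq(data, n_codewords=2, epochs=20, lr=0.1, seed=0):
    """Generic winner-take-all vector quantization (illustrative assumption,
    not the thesis's VQ-ANN). Returns an (n_codewords, n_features) codebook."""
    rng = np.random.default_rng(seed)
    # Initialise codewords from randomly chosen training samples.
    start = rng.choice(len(data), size=n_codewords, replace=False)
    codebook = data[start].astype(float)
    for _ in range(epochs):
        # Present the samples in a fresh random order each epoch.
        for x in data[rng.permutation(len(data))]:
            # The codeword nearest to the sample wins the competition...
            winner = np.argmin(np.linalg.norm(codebook - x, axis=1))
            # ...and only the winner is moved a fraction lr toward the sample.
            codebook[winner] += lr * (x - codebook[winner])
    return codebook

def assign_clusters(data, codebook):
    """Map each sample to the index of its nearest codeword."""
    dists = np.linalg.norm(data[:, None, :] - codebook[None, :, :], axis=2)
    return np.argmin(dists, axis=1)
```

The indices returned by assign_clusters are arbitrary cluster numbers; nothing in this sketch says which cluster is malware and which is benign, which is the labeling gap the abstract describes.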