Bio-inspired constrained clustering: A case study on aspect-based sentiment analysis

Date
2018-07-04
Authors
Qasem, Mohammed
Abstract
Clustering is an important problem in the era of big data. Exact clustering algorithms are too costly for many real-world applications (RWA), which therefore require innovative approximation algorithms. Among these are bio-inspired (nature-inspired) techniques such as the “ant brood clustering algorithm” (ACA), inspired by how real ants sort the brood in their nests. ACA's mathematical model assumes a static radius of perception, which does not adapt well to RWA. I address this issue by developing an adaptive clustering algorithm, “ACA with Adaptive Radius” (ACA-AR), which uses kernel density estimation, a non-parametric statistical model, to measure the average dissimilarity of data objects in an ant’s neighborhood. I extend this algorithm to a search-based, semi-supervised constrained clustering algorithm (CACA-AR) that incorporates supervisory information to guide the clustering towards solutions in which constraints are minimally violated. I evaluate the accuracy of CACA-AR on benchmark datasets and provide a feasibility study on one RWA, aspect-based sentiment analysis. The F1-score results show that CACA-AR outperforms baseline techniques, multi-class logistic regression, and lexicon-based approaches by 20%.
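To illustrate what "minimally violated" constraints can mean in practice, here is a minimal sketch that counts how many pairwise constraints a candidate clustering breaks. It assumes the supervisory information takes the form of must-link and cannot-link pairs, a common convention in constrained clustering that the abstract itself does not confirm. The function name `count_violations` and the toy data are hypothetical and are not taken from the thesis.

```python
from typing import List, Sequence, Tuple

Pair = Tuple[int, int]  # indices of two data objects


def count_violations(
    labels: Sequence[int],
    must_link: List[Pair],
    cannot_link: List[Pair],
) -> int:
    """Count pairwise constraints broken by a cluster assignment.

    Assumption: constraints are must-link / cannot-link pairs; this is
    an illustrative convention, not necessarily the thesis's encoding.
    """
    # A must-link pair is violated when its objects land in different clusters.
    broken_must = sum(1 for i, j in must_link if labels[i] != labels[j])
    # A cannot-link pair is violated when its objects share a cluster.
    broken_cannot = sum(1 for i, j in cannot_link if labels[i] == labels[j])
    return broken_must + broken_cannot


# Toy example: five objects assigned to three clusters.
labels = [0, 0, 1, 1, 2]
must_link = [(0, 1), (2, 4)]
cannot_link = [(0, 2), (2, 3)]
violations = count_violations(labels, must_link, cannot_link)
```

A search-based method could use a count like this as a penalty term when comparing candidate clusterings, preferring assignments with fewer broken constraints.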
Keywords
ant, constrained clustering, sentiment analysis