Scalable vertical mining for big data analytics

Date
2016
Authors
Zhang, Hao
Abstract
The growing scale of modern applications produces huge amounts of data, which in turn poses new challenges for data mining and big data analytics. Researchers often use the five V’s (Volume, Velocity, Variety, Veracity, and Value) to describe the characteristics of big data. Interest in discovering patterns from large collections of data has risen in both academia and industry. On-line social networks such as Facebook and Twitter are examples of rich sources of big data, and useful information and knowledge are embedded in their users’ online social activities. Although some algorithms have recently been proposed to mine large-scale data, they mostly focus on the volume aspect. Unfortunately, few approaches address data variety, which is also a critical factor in the mining process. A dataset may be sparse, dense, or not uniformly distributed. For example, a list of common friends in an on-line social network is dense if two people share many common friends, and sparse otherwise. For my MSc thesis, I design and implement a big data analytics algorithm that tackles both the volume and variety aspects of big data.
Keywords
Data mining, Frequent pattern mining, Big data, Data analytics