Fast and scalable MapReduce-based vertical mining

Date
2018-07-12
Authors
Yu, Jialiang
Abstract

Mining uncertain data is challenging because uncertainty is usually represented by real numbers, which can take infinitely many values (in contrast to the integer occurrence counts used when mining precise data). Such values are therefore not easy to store in a data structure. Although some data mining algorithms for handling uncertain data exist, they become inefficient when the data become very large. Vertical data mining algorithms have the advantages that they run fast and require little memory. Hence, for my M.Sc. thesis, I propose two MapReduce-based vertical mining algorithms that mine big uncertain data. Analytical and experimental evaluation results show that, of these two algorithms, MR-UV-Eclat is fast and scalable.

Keywords
data mining