Fast and scalable MapReduce-based vertical mining

Authors

Yu, Jialiang

Abstract

Mining uncertain data is challenging because uncertainty is usually represented by real numbers, which can take infinitely many values (cf. the integer occurrence counts used when mining precise data). As a result, these values are not easy to store in a data structure. Although some data mining algorithms for handling uncertain data exist, they become inefficient when the data become very large. Vertical data mining algorithms have the advantages of running fast and requiring little memory. Hence, for my M.Sc. thesis, I propose two MapReduce-based vertical mining algorithms for mining big uncertain data. Analytical and experimental evaluation results show that, of these two algorithms, MR-UV-Eclat is fast and scalable.

Keywords

data mining