Mining frequent itemsets from uncertain data: extensions to constrained mining and stream mining

Hao, Boyu
Most studies on frequent itemset mining focus on precise data. However, there are situations in which the data are uncertain, which calls for the mining of uncertain data. There are also situations in which users are interested only in frequent itemsets that satisfy user-specified aggregate constraints, which calls for constrained mining of uncertain data. Moreover, many situations produce floods of uncertain data, which calls for stream mining of uncertain data. In this M.Sc. thesis, we propose algorithms to deal with all these situations. We first design a tree-based mining algorithm to find all frequent itemsets from databases of uncertain data. We then extend it in two directions: to mine databases of uncertain data for only those frequent itemsets that satisfy user-specified aggregate constraints, and to mine streams of uncertain data for all frequent itemsets. Experimental results show the effectiveness of all these algorithms.
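To make the problem setting concrete, the following is a minimal sketch and not the tree-based algorithm of the thesis. It assumes the expected-support model commonly used for uncertain data, in which each item in a transaction carries an existential probability and an itemset's expected support is the sum, over transactions, of the product of its items' probabilities. The toy database, the `prices` table, the `budget` constraint, and all function names are hypothetical and chosen only for illustration; the enumeration is brute force.

```python
from itertools import combinations

# Hypothetical toy database: each transaction maps an item to its
# existential probability (the chance that the item is really present).
transactions = [
    {"a": 0.9, "b": 0.8, "c": 0.5},
    {"a": 0.6, "b": 0.5},
    {"a": 1.0, "c": 0.4},
]

def expected_support(itemset, db):
    """Sum over transactions of the product of the items' probabilities."""
    total = 0.0
    for t in db:
        prob = 1.0
        for item in itemset:
            prob *= t.get(item, 0.0)  # an absent item has probability 0
        total += prob
    return total

def frequent_itemsets(db, minsup, constraint=lambda itemset: True):
    """Brute-force enumeration (assumption: small item domain).

    Keeps every itemset that passes `constraint` and whose expected
    support is at least `minsup`.
    """
    items = sorted({item for t in db for item in t})
    result = {}
    for k in range(1, len(items) + 1):
        for itemset in combinations(items, k):
            if not constraint(itemset):
                continue
            support = expected_support(itemset, db)
            if support >= minsup:
                result[itemset] = support
    return result

# Unconstrained mining with a hypothetical threshold of 1.0.
all_frequent = frequent_itemsets(transactions, minsup=1.0)

# Constrained mining: a hypothetical aggregate constraint, sum(price) <= 30.
prices = {"a": 10, "b": 25, "c": 5}
budget = lambda itemset: sum(prices[i] for i in itemset) <= 30
constrained = frequent_itemsets(transactions, minsup=1.0, constraint=budget)
```

In this toy data, the unconstrained call keeps three itemsets, `("a",)`, `("b",)`, and `("a", "b")`, while the budget constraint removes `("a", "b")` because its total price of 35 exceeds 30. A stream setting would additionally need to process transactions in batches rather than hold the whole database, which this sketch does not attempt.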
Data Mining, Databases
Leung, C.K.-S., Carmichael, C.L., Hao, B. (2007). Efficient mining of frequent patterns from uncertain data. In Proc. IEEE ICDM Workshops 2007: 489-494.
Leung, C.K.-S., Hao, B. (2009). Mining of frequent itemsets from streams of uncertain data. In Proc. IEEE ICDE 2009: 1663-1670.
Leung, C.K.-S., Hao, B., Jiang, F. (2010). Constrained frequent itemset mining from uncertain data streams. In Proc. IEEE ICDE Workshops 2010: 120-127.
Leung, C.K.-S., Hao, B., Brajczuk, D.A. (2010). Mining uncertain data for frequent itemsets that satisfy aggregate constraints. In Proc. ACM SAC 2010: 1034-1038.