Seeing the forest for the trees: tree-based uncertain frequent pattern mining
dc.contributor.author | MacKinnon, Richard Kyle | |
dc.contributor.examiningcommittee | Wang, Yang (Computer Science); Wang, Xikui (Statistics) | en_US |
dc.contributor.supervisor | Leung, Carson K.-S. (Computer Science) | en_US |
dc.date.accessioned | 2016-01-13T22:43:27Z | |
dc.date.available | 2016-01-13T22:43:27Z | |
dc.date.issued | 2014-05 | en_US |
dc.date.issued | 2014-09 | en_US |
dc.date.issued | 2014-09 | en_US |
dc.date.issued | 2014-12 | en_US |
dc.date.issued | 2014-12 | en_US |
dc.degree.discipline | Computer Science | en_US |
dc.degree.level | Master of Science (M.Sc.) | en_US |
dc.description.abstract | Many frequent pattern mining algorithms operate on precise data, where each data point is an exact accounting of a phenomenon (e.g., I have exactly two sisters). Alas, reasoning this way is a simplification for many real-world observations. Measurements, predictions, environmental factors, human error, etc. all introduce a degree of uncertainty into the mix. Tree-based frequent pattern mining algorithms such as FP-growth are particularly efficient due to their compact in-memory representations of the input database, but their uncertain extensions can require many more tree nodes. I propose two new algorithms, Tube-S and Tube-P, which use tightened upper bounds to expected support to mine frequent patterns from uncertain data. Extensive experimentation and analysis on datasets with different probability distributions show the tightness of my bounds in different situations. | en_US |
dc.description.note | February 2016 | en_US |
dc.identifier.citation | MacKinnon, R.K., Leung, C.K.-S., Tanbeer, S.K. (2014) A scalable data analytics algorithm for mining frequent patterns from uncertain data. In Proc. PAKDDW 2014: 404-416. Springer International Publishing. | en_US |
dc.identifier.citation | Leung, C.K.-S., MacKinnon, R.K. (2014) BLIMP: a compact tree structure for uncertain frequent pattern mining. In Proc. DaWaK 2014: 115-123. Springer International Publishing. | en_US |
dc.identifier.citation | Leung, C.K.-S., MacKinnon, R.K., Tanbeer, S.K. (2014) Tightening upper bounds to the expected support for uncertain frequent pattern mining. In Proc. KES 2014: 328-337. Elsevier. | en_US |
dc.identifier.citation | MacKinnon, R.K., Strauss, T.D., Leung, C.K.-S. (2014) DISC: efficient uncertain frequent pattern mining with tightened upper bounds. In Proc. ICDMW 2014: 1038-1045. IEEE Computer Society Press. | en_US |
dc.identifier.citation | Leung, C.K.-S., MacKinnon, R.K., Tanbeer, S.K. (2014) Fast algorithms for frequent itemset mining from uncertain data. In Proc. ICDM 2014: 893-898. IEEE Computer Society Press. | en_US |
dc.identifier.uri | http://hdl.handle.net/1993/31059 | |
dc.language.iso | eng | en_US |
dc.publisher | Springer International Publishing | en_US |
dc.publisher | Springer International Publishing | en_US |
dc.publisher | Elsevier | en_US |
dc.publisher | IEEE Computer Society Press | en_US |
dc.publisher | IEEE Computer Society Press | en_US |
dc.rights | open access | en_US |
dc.subject | Data mining | en_US |
dc.subject | Databases | en_US |
dc.title | Seeing the forest for the trees: tree-based uncertain frequent pattern mining | en_US |
dc.type | master thesis | en_US |