dc.contributor.supervisor Leung, Carson K. (Computer Science) en_US
dc.contributor.author Brajczuk, Dale A.
dc.date.accessioned 2011-09-01T14:38:45Z
dc.date.available 2011-09-01T14:38:45Z
dc.date.issued 2011-09-01
dc.identifier.uri http://hdl.handle.net/1993/4814
dc.description.abstract Existing frequent-sequence mining algorithms perform multiple scans of a database, or of a structure that captures the database. In this M.Sc. thesis, I propose a frequent-sequence mining algorithm that mines each database row as it reads it, so that it can potentially complete mining in the time it takes to read the database once. I achieve this by having my algorithm enumerate all sub-sequences of each row as the row is read. Since sub-sequence enumeration is time-consuming, I create a method that distributes the work over multiple computers, processors, and thread units, balances the load across all resources, and limits the amount of communication so that my algorithm scales well with the number of computers used. Experimental results show that my algorithm is effective and can potentially complete the mining process in close to the time it takes to perform one scan of the input database. en_US
dc.subject data mining en_US
dc.subject databases en_US
dc.subject distributed computing en_US
dc.title Mining frequent sequences in one database scan using distributed computers en_US
dc.degree.discipline Computer Science en_US
dc.contributor.examiningcommittee Irani, Pourang (Computer Science); Rajapakse, Athula (Electrical & Computer Engineering) en_US
dc.degree.level Master of Science (M.Sc.) en_US
dc.description.note October 2011 en_US
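
The one-scan idea described in the abstract can be illustrated with a minimal single-machine sketch. This is not the thesis's algorithm: it assumes each row is a sequence of single items, treats a sub-sequence as any order-preserving (not necessarily contiguous) selection of items, and omits the distribution and load balancing entirely. The names `subsequences`, `mine_one_pass`, and `min_support` are hypothetical, chosen only for illustration.

```python
from collections import Counter
from itertools import combinations


def subsequences(row):
    """Return the set of distinct non-empty sub-sequences of one row.

    Assumption: a row is a sequence of single items, and a sub-sequence
    keeps the original order but may skip items.
    """
    return {
        tuple(row[i] for i in idx)
        for k in range(1, len(row) + 1)
        for idx in combinations(range(len(row)), k)
    }


def mine_one_pass(rows, min_support):
    """Count sub-sequences while reading each row exactly once."""
    counts = Counter()
    for row in rows:  # single scan over the input
        # Each distinct sub-sequence is counted once per row.
        counts.update(subsequences(row))
    # Keep only sub-sequences that occur in at least min_support rows.
    return {seq: n for seq, n in counts.items() if n >= min_support}


rows = [("a", "b", "c"), ("a", "c"), ("b", "c")]
frequent = mine_one_pass(rows, min_support=2)
```

A row of length n has up to 2^n - 1 non-empty sub-sequences, so the per-row cost grows exponentially in this sketch, which is consistent with the abstract's remark that enumeration is time-consuming and motivates spreading the work across machines.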

