Mining frequent sequences in one database scan using distributed computers

dc.contributor.author: Brajczuk, Dale A.
dc.contributor.examiningcommittee: Irani, Pourang (Computer Science); Rajapakse, Athula (Electrical & Computer Engineering) [en_US]
dc.contributor.supervisor: Leung, Carson K. (Computer Science) [en_US]
dc.date.accessioned: 2011-09-01T14:38:45Z
dc.date.available: 2011-09-01T14:38:45Z
dc.date.issued: 2011-09-01
dc.degree.discipline: Computer Science [en_US]
dc.degree.level: Master of Science (M.Sc.) [en_US]
dc.description.abstract: Existing frequent-sequence mining algorithms perform multiple scans of a database, or of a structure that captures the database. In this M.Sc. thesis, I propose a frequent-sequence mining algorithm that mines each database row as it reads it, so that it can potentially complete mining in the time it takes to read the database once. I achieve this by having my algorithm enumerate all sub-sequences of each row as it reads that row. Since sub-sequence enumeration is time-consuming, I create a method that distributes the work over multiple computers, processors, and thread units, balancing the load across all resources and limiting communication so that my algorithm scales well with the number of computers used. Experimental results show that my algorithm is effective and can potentially complete the mining process in close to the time it takes to perform one scan of the input database. [en_US]
dc.description.note: October 2011 [en_US]
dc.identifier.uri: http://hdl.handle.net/1993/4814
dc.language.iso: eng [en_US]
dc.rights: open access [en_US]
dc.subject: data mining [en_US]
dc.subject: databases [en_US]
dc.subject: distributed computing [en_US]
dc.title: Mining frequent sequences in one database scan using distributed computers [en_US]
dc.type: master thesis [en_US]
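
Illustrative note: the abstract describes mining each row as it is read by enumerating all of that row's sub-sequences. The sketch below shows that idea only in its simplest single-process form. It is not the thesis's algorithm and omits the distribution and load balancing the abstract mentions. The names `enumerate_subsequences`, `mine_one_scan`, `min_support`, and `max_len` are hypothetical, and the sketch assumes that each row is an ordered tuple of items and that a sub-sequence preserves order without needing to be contiguous.

```python
from collections import Counter
from itertools import combinations


def enumerate_subsequences(row, max_len=None):
    """Return the set of distinct non-empty subsequences of `row`.

    Hypothetical helper: `max_len` optionally caps the subsequence length.
    """
    limit = len(row) if max_len is None else min(max_len, len(row))
    found = set()
    for k in range(1, limit + 1):
        # combinations() emits items in their original positional order,
        # so every tuple is an order-preserving subsequence of the row.
        found.update(combinations(row, k))
    return found


def mine_one_scan(rows, min_support, max_len=None):
    """Count subsequence support in a single pass over `rows`."""
    counts = Counter()
    for row in rows:  # each row is read exactly once
        # A set is used so a row contributes at most 1 to each subsequence.
        counts.update(enumerate_subsequences(row, max_len))
    # Keep only subsequences that occur in at least `min_support` rows.
    return {seq: n for seq, n in counts.items() if n >= min_support}


# Toy database, invented for illustration.
rows = [("a", "b", "c"), ("a", "c"), ("b", "c")]
frequent = mine_one_scan(rows, min_support=2)
```

A row of n items has up to 2^n - 1 non-empty sub-sequences, which is consistent with the abstract calling enumeration time-consuming and motivating the spread of work over several machines.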
Files
Original bundle
Name: brajczuk_dale.pdf
Size: 12.85 MB
Format: Adobe Portable Document Format
Description:
License bundle
Name: license.txt
Size: 2.25 KB
Format: Item-specific license agreed to upon submission
Description: