Scalable high-utility pattern mining from data streams
Traditional high-utility mining focuses mainly on discovering high-utility patterns efficiently from static databases, under the simplifying assumption that the unit utility of each item is constant. However, little research effort has so far gone into mining data streams in which profit changes dynamically. The emergence of big data poses performance challenges, so an appropriate big data management technique is required to discover useful knowledge from dynamic data streams. Traditional static data mining algorithms cannot be applied directly to dynamic data. Furthermore, information in a data stream may not be uniformly distributed, which makes the data harder to process. To mine real-world data streams, it is natural to use big data stream processing frameworks, and leveraging these frameworks requires scalable algorithms. Hence, for my MSc thesis, I design and develop a high-utility data stream mining framework that reduces execution time and remains flexible enough to adapt to mining requirements as the data are dynamically modified. With the proposed algorithm, data stream mining performance is expected to improve on both synthetic and real-world datasets.