Frequent pattern mining from dense graph streams
Date
Authors
Journal Title
Journal ISSN
Volume Title
Publisher
Abstract
As technology advances, streams of data can be produced in many applications such as social networks, sensor networks, bioinformatics, and chemical informatics. These kinds of streaming data share a property in common--namely, they can be modeled in terms of graph-structured data. Here, the data streams generated by graph data sources in these applications are graph streams. To extract implicit, previously unknown, and potentially useful frequent patterns from these streams, efficient data mining algorithms are in demand. Many existing algorithms capture important streaming data and assume that the captured data can fit into main memory. However, problems arise when such an assumption does not hold (e.g., when the available memory is limited). In this paper, we propose a data structure called DSMatrix for capturing important data from the streams--especially, dense graph streams--onto the disk when the memory space is limited. In addition, we also propose two stream mining algorithms that use DSMatrix to mine frequent patterns. The tree-based horizontal mining algorithm applies an effective frequency counting approach to avoid recursive construction of sub-trees as in many tree-based mining. The vertical mining algorithm makes good use of the information captured in the DSMatrix for mining.