Frequent pattern mining from dense graph streams

Loading...
Thumbnail Image
Date
2014
Authors
Cameron, Juan J.
Cuzzocrea, Alfredo
Jiang, Fan
Leung, Carson K.
Journal Title
Journal ISSN
Volume Title
Publisher
CEUR Workshop Proceedings
Abstract
As technology advances, streams of data can be produced in many applications such as social networks, sensor networks, bioinformatics, and chemical informatics. These kinds of streaming data share a property in common--namely, they can be modeled in terms of graph-structured data. Here, the data streams generated by graph data sources in these applications are graph streams. To extract implicit, previously unknown, and potentially useful frequent patterns from these streams, efficient data mining algorithms are in demand. Many existing algorithms capture important streaming data and assume that the captured data can fit into main memory. However, problems arise when such an assumption does not hold (e.g., when the available memory is limited). In this paper, we propose a data structure called DSMatrix for capturing important data from the streams--especially, dense graph streams--onto the disk when the memory space is limited. In addition, we also propose two stream mining algorithms that use DSMatrix to mine frequent patterns. The tree-based horizontal mining algorithm applies an effective frequency counting approach to avoid recursive construction of sub-trees as in many tree-based mining. The vertical mining algorithm makes good use of the information captured in the DSMatrix for mining.
Description
J.J. Cameron, A. Cuzzocrea, F. Jiang, & C.K. Leung. Frequent pattern mining from dense graph streams. In Proc. EDBT/ICDT Workshops 2014, pp. 240-247. This paper is published in the Workshop Proceedings of the EDBT/ICDT 2014 Joint Conference (March 28, 2014, Athens, Greece) on CEUR-WS.org (ISSN 1613-0073) under the terms of the Creative Commons license CC-by-nc-nd 4.0 (http://creativecommons.org/licenses/by-nc-nd/4.0).
Keywords
data mining, frequent pattern discovery, graph patterns, graph-structured data, social networks, extending database technology, database theory
Citation