Graduation Year
2017
Document Type
Thesis
Degree
M.S.
Degree Name
Master of Science (M.S.)
Degree Granting Department
Engineering
Major Professor
Yicheng Tu, Ph.D.
Committee Member
Yao Liu, Ph.D.
Committee Member
Paul Rosen, Ph.D.
Keywords
Data Mining, Vertex Overlap, Anti-monotonic, Linear Programming, Hypergraph
Abstract
In recent years, the popularity of graph databases has grown rapidly. This paper focuses on single-graph as an effective model to represent information and its related graph mining techniques. In frequent pattern mining in a single-graph setting, there are two main problems: support measure and search scheme. In this paper, we propose a novel framework for constructing support measures that brings together existing minimum-image-based and overlap-graph-based support measures. Our framework is built on the concept of occurrence / instance hypergraphs. Based on that, we present two new support measures: minimum instance (MI) measure and minimum vertex cover (MVC) measure, that combine the advantages of existing measures. In particular, we show that the existing minimum-image-based support measure is an upper bound of the MI measure, which is also linear-time computable and results in counts that are close to number of instances of a pattern. Although the MVC measure is NP-hard, it can be approximated to a constant factor in polynomial time. We also provide polynomial-time relaxations for both measures and bounding theorems for all presented support measures in the hypergraph setting. We further show that the hypergraph-based framework can unify all support measures studied in this paper. This framework is also flexible in that more variants of support measures can be defined and profiled in it.
Scholar Commons Citation
Meng, Jinghan, "Flexible and Feasible Support Measures for Mining Frequent Patterns in Large Labeled Graphs" (2017). USF Tampa Graduate Theses and Dissertations.
https://digitalcommons.usf.edu/etd/6900