Graduation Year
2020
Document Type
Dissertation
Degree
Ph.D.
Degree Name
Doctor of Philosophy (Ph.D.)
Degree Granting Department
Computer Science and Engineering
Major Professor
Yicheng Tu, Ph.D.
Committee Member
John Licato, Ph.D.
Committee Member
John Murray-Bruce, Ph.D.
Committee Member
Ankit Shah, Ph.D.
Committee Member
Feng Cheng, Ph.D.
Keywords
Data Mining, Graph Mining, Hypergraph
Abstract
In recent years, the popularity of graph datasets has grown rapidly, and frequent subgraph mining (FSM) has become an important subject in computer science research. In this dissertation, we study the single graph as an effective model for representing information, together with its related graph mining techniques. Frequent pattern mining in a single-graph setting poses two main problems: the support measure and the search scheme. We study the development of support measures, which are functions that map a pattern to its frequency count in a database. Our work is based on the hypergraph framework, using the concept of occurrence/instance hypergraphs. We present improved hardness and approximation theorems among the major support measures and a general form for minimum-image-based measures. To guide the development of new support measures, we present general sufficient conditions for designing support measures in the hypergraph framework, which can be applied to MNI and other support measures that are not included in the overlap graph framework. We use the sufficient conditions to generalize MNI and the minimum instance measure (MI) for designing user-defined linear-time measures. From the sufficient conditions, we develop a new efficient polynomial-time support measure, the maximum independent subedge set (MISS) measure, which combines the advantages of existing measures. We also show that MISS can fill the gap between MIS and MI in computational complexity and support count. Finally, we present and review experimental evaluations of the major support measures.
Scholar Commons Citation
Meng, Jinghan, "Design of Support Measures for Counting Frequent Patterns in Graphs" (2020). USF Tampa Graduate Theses and Dissertations.
https://digitalcommons.usf.edu/etd/9015