Graduation Year
2014
Document Type
Dissertation
Degree
Ph.D.
Degree Name
Doctor of Philosophy (Ph.D.)
Degree Granting Department
Industrial and Management Systems Engineering
Major Professor
Bo Zeng, Ph.D.
Co-Major Professor
Xiaoning Qian, Ph.D.
Committee Member
Jose Zayas-Castro, Ph.D.
Committee Member
Tapas Das, Ph.D.
Committee Member
Kendra Vehik, M.P.H., Ph.D.
Keywords
Biomarker Identification, Column Generation, Combinatorial Optimization, Integer Programming, Maximum Clique Problem
Abstract
We introduce and study a novel graph optimization problem that searches for multiple cliques with the maximum overall weight, which we call the Maximum Weighted Multiple Clique Problem (MWMCP). This problem arises in research involving network-based data mining, specifically in bioinformatics, where complex diseases, such as various types of cancer and diabetes, are conjectured to be triggered and influenced by a combination of genetic and environmental factors. To integrate potential effects from the interplay among underlying candidate factors, we propose a new network-based framework that identifies effective biomarkers by searching for "groups" of synergistic risk factors with high predictive power for disease outcome. An interaction network is constructed in which vertex weights represent the individual predictive power of candidate factors and edge weights represent pairwise synergistic interactions among factors. This network-based biomarker identification problem is then formulated as an MWMCP. To achieve near-optimal solutions for large-scale networks, an analytical algorithm based on the column generation method and a fast greedy heuristic have been derived. In addition, to obtain exact solutions, an advanced branch-price-and-cut algorithm is designed after studying the properties of the problem. Our algorithms for MWMCP have been implemented and tested on random graphs, with promising results. They are also used to analyze two biomedical datasets: a Type 1 Diabetes (T1D) dataset from the Diabetes Prevention Trial-Type 1 (DPT-1) Study, and a breast cancer genomics dataset for metastasis prognosis. The results demonstrate that our network-based methods can identify important biomarkers with better prediction accuracy than conventional feature selection, which considers only individual effects.
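As a rough illustration of the problem setting described in the abstract, the sketch below scores a clique on a small weighted interaction network and grows a few cliques greedily. It is not the dissertation's greedy heuristic, column generation method, or branch-price-and-cut algorithm. The objective (vertex weights plus edge weights inside each clique), the requirement that cliques be vertex-disjoint, the parameters k and max_size, the toy weights, and all function names are assumptions made purely for illustration.

```python
from itertools import combinations


def clique_weight(clique, vertex_w, edge_w):
    """Assumed objective: vertex weights plus all pairwise edge weights."""
    total = sum(vertex_w[v] for v in clique)
    for u, v in combinations(clique, 2):
        total += edge_w[frozenset((u, v))]
    return total


def greedy_multiple_cliques(vertex_w, edge_w, k, max_size):
    """Grow up to k vertex-disjoint cliques greedily (illustrative only)."""
    unused = set(vertex_w)
    cliques = []
    while unused and len(cliques) < k:
        # Seed each clique with the heaviest vertex not yet used.
        seed = max(sorted(unused), key=lambda v: vertex_w[v])
        clique = [seed]
        unused.remove(seed)
        while len(clique) < max_size:
            best, best_gain = None, 0.0
            for v in sorted(unused):
                # v must be adjacent to every member to keep a clique.
                if all(frozenset((u, v)) in edge_w for u in clique):
                    gain = vertex_w[v] + sum(
                        edge_w[frozenset((u, v))] for u in clique
                    )
                    if gain > best_gain:
                        best, best_gain = v, gain
            if best is None:
                break  # no adjacent vertex improves the clique
            clique.append(best)
            unused.remove(best)
        cliques.append(clique)
    return cliques


# Toy network: vertex weight = individual predictive power of a factor,
# edge weight = pairwise synergy between two factors (made-up numbers).
vertex_w = {"a": 0.5, "b": 0.4, "c": 0.3, "d": 0.6, "e": 0.2}
edge_w = {
    frozenset(("a", "b")): 0.3,
    frozenset(("a", "c")): 0.2,
    frozenset(("b", "c")): 0.1,
    frozenset(("d", "e")): 0.4,
}

groups = greedy_multiple_cliques(vertex_w, edge_w, k=2, max_size=3)
for g in groups:
    print(g, round(clique_weight(g, vertex_w, edge_w), 3))
```

On this toy network the sketch selects the groups d, e and a, b, c. A greedy procedure of this kind gives no optimality guarantee, which is consistent with the abstract's distinction between near-optimal heuristics and an exact algorithm.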
Scholar Commons Citation
Sajjadi, Seyed Javad, "Novel Models and Efficient Algorithms for Network-based Optimization in Biomedical Applications" (2014). USF Tampa Graduate Theses and Dissertations.
https://digitalcommons.usf.edu/etd/5300