Graduation Year


Document Type




Degree Granting Department

Industrial Engineering

Major Professor

Tapas K. Das, Ph.D.

Committee Member

Ali Yalcin, Ph.D.

Committee Member

Donald Berndt, Ph.D.


rough sets, reinforcement learning, solitary pulmonary nodule, Markov decision process


Determining the most efficient use of diagnostic tests is one of the complex issues facing medical practitioners. It is generally accepted that excessive use of tests is common practice in medical diagnosis. Many tests are performed even though the incremental knowledge gained does not affect the course of diagnosis. With the soaring cost of healthcare in the US, there is a critical need to cut the cost of diagnostic tests while achieving a higher level of diagnostic accuracy. Various decision-making tools that assist physicians in diagnosis management have been presented in the literature. One such method, the analytical hierarchy process, uses a multilevel structure of decision criteria for sequential pairwise comparison of the available test choices. Many of the decision-analytic methods are based on Bayes' theorem and decision trees. These methods use threshold treatment probabilities and performance characteristics of the tests, such as true-positive and false-positive rates, to choose among the available alternatives. Sequential testing approaches tend to prolong the diagnosis process, whereas the parallel testing approach generally involves a higher number of tests.

This research focuses on developing a machine-learning-based methodology for finding an efficient testing strategy for medical diagnosis. Based on the patient parameters (both observed and tested), the method recommends one or more tests with the objective of optimizing a measure of performance for the diagnosis process. The performance measure combines the cost of testing, the risk and discomfort associated with the tests, and the time taken to reach a diagnosis. It also accounts for the diagnostic ability of the tests.

The methodology is developed by combining tools from the fields of data mining (rough set theory, in particular), utility theory, Markov decision processes (MDP), and reinforcement learning (RL). Rough set theory is used to extract diagnostic information, in the form of rules, from medical databases. Utility theory is used to bring three nonhomogeneous measures (cost of testing, risk and discomfort, and diagnostic ability) into one cost-based measure of performance. The MDP framework, together with an RL algorithm, facilitates obtaining efficient testing strategies. The methodology is implemented on a sample problem of diagnosing a solitary pulmonary nodule (SPN). The results obtained are compared with those from four other approaches. It is shown that the RL-based methodology holds significant promise for improving the performance of the diagnostic process.
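To make the MDP and RL combination more concrete, the following is a minimal sketch of tabular Q-learning on a toy test-selection problem. The abstract does not name the RL algorithm used, so Q-learning is an assumption here, and the two tests, their costs and accuracies, the prior, and the misdiagnosis penalty are all invented for illustration. The rough set and utility theory steps are not modeled; a single cost-like reward stands in for the combined performance measure.

```python
import random

# Hypothetical toy problem: two tests and a binary diagnosis (all numbers invented).
TESTS = {
    "test_A": {"cost": 1.0, "sens": 1.0, "spec": 1.0},
    "test_B": {"cost": 5.0, "sens": 0.8, "spec": 0.7},
}
DIAGNOSES = {"diagnose_benign": 0, "diagnose_malignant": 1}
PRIOR_MALIGNANT = 0.5
MISDIAGNOSIS_PENALTY = 100.0
START = (None, None)  # (result of test_A, result of test_B); None = not yet ordered


def valid_actions(state):
    """Tests not yet ordered, plus the two terminal diagnosis actions."""
    pending = [name for i, name in enumerate(TESTS) if state[i] is None]
    return pending + list(DIAGNOSES)


def step(state, action, malignant, rng):
    """Apply an action; return (next_state, reward, done)."""
    if action in DIAGNOSES:
        correct = DIAGNOSES[action] == malignant
        return state, (0.0 if correct else -MISDIAGNOSIS_PENALTY), True
    test = TESTS[action]
    p_positive = test["sens"] if malignant else 1.0 - test["spec"]
    result = 1 if rng.random() < p_positive else 0
    idx = list(TESTS).index(action)
    next_state = tuple(result if i == idx else r for i, r in enumerate(state))
    return next_state, -test["cost"], False


def greedy_action(q, state):
    """Action with the highest learned value; unseen pairs default to 0."""
    return max(valid_actions(state), key=lambda a: q.get((state, a), 0.0))


def q_learning(episodes=5000, alpha=0.1, gamma=1.0, epsilon=0.2, seed=0):
    """Tabular Q-learning with epsilon-greedy exploration."""
    rng = random.Random(seed)
    q = {}
    for _ in range(episodes):
        # Each episode is one simulated patient with a hidden true condition.
        malignant = 1 if rng.random() < PRIOR_MALIGNANT else 0
        state, done = START, False
        while not done:
            if rng.random() < epsilon:
                action = rng.choice(valid_actions(state))
            else:
                action = greedy_action(q, state)
            next_state, reward, done = step(state, action, malignant, rng)
            future = 0.0 if done else max(
                q.get((next_state, a), 0.0) for a in valid_actions(next_state))
            old = q.get((state, action), 0.0)
            q[(state, action)] = old + alpha * (reward + gamma * future - old)
            state = next_state
    return q
```

After training, calling `greedy_action(q, state)` for a given pattern of test results gives the recommended next step in this toy setting: either another test or a final diagnosis. In a realistic application the state would carry the patient parameters and the reward would come from the combined cost measure described above.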