Graduation Year

2020

Document Type

Dissertation

Degree

Ph.D.

Degree Name

Doctor of Philosophy (Ph.D.)

Degree Granting Department

Mathematics and Statistics

Major Professor

Chris P. Tsokos, Ph.D.

Committee Member

Kandethody M. Ramachandran, Ph.D.

Committee Member

Lu Lu, Ph.D.

Committee Member

Zhuo Lu, Ph.D.

Keywords

Ordinary Bayes, Non-Homogeneous Poisson Process, Loss function, operating systems, Cyber

Abstract

As most of humankind now lives in an era of deep dependence on multiple technologies and complex systems to store and manage sensitive information, researchers are constantly urged to obtain and improve measurements and methodologies capable of evaluating system reliability and security. The objectives of the present dissertation are to improve the Bayesian reliability estimation of a software package where the Power Law Process, also known as the Non-Homogeneous Poisson Process, is the underlying failure model, and to develop a set of statistical models evaluating computer operating system vulnerabilities. Furthermore, we develop a reliability function of a computer network system using the Common Vulnerability Scoring System framework.

In the context of software reliability, we propose a Bayesian reliability analysis approach for the Power Law Process under the Higgins-Tsokos loss function for modeling software failure times. We demonstrate, using real data, that the shape parameter of the Power Law Process behaves as a random variable. Based on Monte Carlo simulations and real data, we show that the Bayesian estimate of the shape parameter and the proposed estimate of the scale parameter outperform the approximate maximum likelihood estimates, although they are sensitive to the choice of prior. Using this result, we obtain a Bayesian reliability estimate of the Power Law Process. The results of this study have the potential to contribute not only to the reliability analysis field but also to other fields that employ the Power Law Process.
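
To make the underlying model concrete: the Power Law Process has intensity function lambda(t) = (beta/theta)(t/theta)^(beta-1), and the approximate (failure-truncated) maximum likelihood estimates against which the dissertation's Bayesian estimates are compared have a simple closed form. The following is a minimal sketch of those standard MLE formulas, not code from the dissertation; the sample failure times are invented for illustration.

```python
import numpy as np

def plp_mle(times):
    """Approximate MLEs of the Power Law Process (failure-truncated case).

    times: ordered failure times t_1 < ... < t_n.
    Returns (beta_hat, theta_hat) for the intensity
        lambda(t) = (beta / theta) * (t / theta) ** (beta - 1).
    """
    t = np.asarray(times, dtype=float)
    n = len(t)
    # Standard closed-form MLEs for the failure-truncated PLP.
    beta_hat = n / np.sum(np.log(t[-1] / t[:-1]))
    theta_hat = t[-1] / n ** (1.0 / beta_hat)
    return beta_hat, theta_hat

def plp_intensity(t, beta, theta):
    """Power Law Process intensity (failure rate) at time t."""
    return (beta / theta) * (t / theta) ** (beta - 1)
```

When the estimated shape parameter is below one, the fitted intensity decreases over time, which is the reliability-growth situation the software-failure analysis targets.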

We further illustrate the robustness of the Higgins-Tsokos loss function versus the commonly used squared-error loss function in Bayesian reliability analysis of the Power Law Process. Based on extensive Monte Carlo simulations and real data, the Bayesian estimate of the shape parameter and the proposed estimate of the scale parameter were not only as robust as the Bayesian estimates under the squared-error loss function but also performed better. Because the reliability function of the Power Law Process is a function of the intensity function, relative efficiency is used to compare the intensity function estimates. The intensity function using the Bayesian estimate of the shape parameter under the Higgins-Tsokos loss function, together with its influence on the scale parameter estimate, is more efficient than the one using the Bayesian estimate under the squared-error loss function. Moreover, using Monte Carlo simulations for different sample sizes, we show the efficiency and superior performance of Bayesian reliability analysis under the Higgins-Tsokos loss function, recognizing that it is sensitive to the selection of its parameter values and of the shape parameter's prior density function. An interactive user interface application was developed so that a user, without any prior coding knowledge, can compute and visualize the Bayesian and maximum likelihood estimates of the intensity and reliability functions of the Power Law Process for a given dataset.
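
The two loss functions being compared lead to different Bayes estimators. Under squared-error loss the Bayes estimate is the posterior mean; under the Higgins-Tsokos loss with parameters f1, f2 > 0, the Bayes estimate takes the form (1/(f1+f2)) ln(E[exp(f1*beta)] / E[exp(-f2*beta)]), with expectations over the posterior. A minimal sketch, approximating both by Monte Carlo averages over posterior samples (the sampling of the posterior itself is assumed done elsewhere):

```python
import numpy as np

def bayes_squared_error(post_samples):
    """Bayes estimate under squared-error loss: the posterior mean."""
    return np.mean(post_samples)

def bayes_higgins_tsokos(post_samples, f1, f2):
    """Bayes estimate under the Higgins-Tsokos loss with parameters f1, f2 > 0:
    (1/(f1+f2)) * ln( E[exp(f1*b)] / E[exp(-f2*b)] ),
    approximated here by Monte Carlo averages over posterior samples."""
    s = np.asarray(post_samples, dtype=float)
    num = np.mean(np.exp(f1 * s))
    den = np.mean(np.exp(-f2 * s))
    return np.log(num / den) / (f1 + f2)
```

For a posterior that is symmetric about its mean and f1 = f2, the two estimates coincide; choosing f1 != f2 penalizes over- and under-estimation asymmetrically and shifts the Higgins-Tsokos estimate accordingly, which is one source of the sensitivity to parameter values noted above.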

In addition, we propose a new approach using Copula theory to obtain Bayesian estimates of the Power Law Process intensity function parameters. We first demonstrate, using real data, the random behavior of the shape and scale parameters of the Power Law Process. We then show the applicability of Copula theory in capturing the dependency structure of the subject parameters and develop a bivariate probability distribution that best characterizes their joint probabilistic behavior. Copula-based Bayesian analysis, under the squared-error loss function and the developed bivariate probability distribution, was studied, in which Copula-based Bayesian estimates of the shape and scale parameters of the Power Law Process are obtained simultaneously, treating both parameters as unknown and random quantities. Monte Carlo simulations using real data demonstrated the superiority of the simultaneous Copula-based Bayesian estimates of the shape and scale parameters over their corresponding approximate maximum likelihood estimates and the Jeffreys Bayesian estimates under the Higgins-Tsokos loss function, where only the shape parameter is considered an unknown and random quantity. A Copula-based Bayesian reliability estimate of the Power Law Process is then obtained using these Copula-based parameter estimates. This result is expected to be widely used in the different fields that employ the Power Law Process.
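
The core idea of the copula construction is that a joint prior on (beta, theta) can be assembled from two marginal distributions plus a copula that carries only the dependence structure. A minimal sketch using a Gaussian copula; the Gaussian choice, the correlation value, and the Gamma/LogNormal marginals below are illustrative assumptions, not the bivariate model fitted in the dissertation.

```python
import numpy as np
from scipy import stats

def gaussian_copula_prior(n, rho, beta_marginal, theta_marginal, seed=0):
    """Draw n joint prior samples of (beta, theta) whose dependence is a
    Gaussian copula with correlation rho; the marginals are supplied as
    frozen scipy distributions."""
    rng = np.random.default_rng(seed)
    cov = np.array([[1.0, rho], [rho, 1.0]])
    # correlated standard normals -> uniforms -> desired marginals
    z = rng.multivariate_normal([0.0, 0.0], cov, size=n)
    u = stats.norm.cdf(z)                 # probability-integral transform
    beta = beta_marginal.ppf(u[:, 0])     # inverse-CDF to the beta marginal
    theta = theta_marginal.ppf(u[:, 1])   # inverse-CDF to the theta marginal
    return beta, theta
```

Samples drawn this way retain the chosen marginal behavior of each parameter while exhibiting the dependence encoded by rho, which is what lets both parameters be treated as dependent random quantities in the simultaneous Bayesian analysis.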

In the context of cybersecurity, we examine the vulnerability scores of the commonly used computer operating systems from 1999 to 2019. We then show the difficulty of performing a parametric analysis and proceed to develop non-parametric analytical estimates of the probability density and cumulative distribution functions of the vulnerability scores of Microsoft, Apple, and Linux computer operating systems, both combined and individually. We also obtain and compare the probabilities and expected values of their low, medium, and high vulnerability scores. The results show that the vulnerability scores of Microsoft computer operating systems have higher expected and median values than those of Linux and Apple computer operating systems. The developed estimates of the probability density and cumulative distribution functions of computer operating system vulnerability scores, along with their graphical figures, should help IT managers better understand their behavior probabilistically and serve as an important marketing tool.
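
A non-parametric density estimate of this kind can be built with a kernel density estimator, from which band probabilities follow by integrating the estimated density over each severity range. A minimal sketch, assuming a simple Gaussian KDE; the sample scores in the test are invented, and this is not the fitted estimator reported in the dissertation.

```python
from scipy.stats import gaussian_kde

def severity_probabilities(scores):
    """Kernel density estimate of CVSS-style scores and the implied
    probabilities of the low (0-3.9), medium (4-6.9), and high (7-10)
    severity bands."""
    kde = gaussian_kde(scores)
    bands = {"low": (0.0, 3.9), "medium": (4.0, 6.9), "high": (7.0, 10.0)}
    # integrate the estimated density over each severity band
    return {name: kde.integrate_box_1d(a, b) for name, (a, b) in bands.items()}
```

Because the Gaussian kernel places some mass outside [0, 10], the three band probabilities sum to slightly less than one for small samples; a boundary-corrected kernel would tighten this in practice.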

Moreover, we propose analytical classification and prediction models, based on the Random Forest method subject to 13 risk factors, to classify the severity level (low (0-3.9), medium (4-6.9), and high (7-10)) and predict the score of a given computer operating system vulnerability. Using the National Vulnerability Database, we evaluated the most commonly used classification methods for developing analytical models to classify computer operating system vulnerabilities. Evaluation showed the Random Forest method to have the best performance on the subject vulnerabilities compared to the other methods across multiple evaluation metrics. We also apply the Random Forest classification method to develop an analytical model, subject to 13 risk factors, that classifies whether a given computer operating system vulnerability will allow attackers to cause Denial of Service to the subject system. Furthermore, we develop a ranking of the risk factors of computer operating system vulnerabilities using the Random Forest regression method, which should help prioritize the remediation process. The developed analytical models will assist not only vendors and information technology specialists but also end-users in managing and understanding the impact of the unfixed and newly discovered vulnerabilities of their computer operating systems.
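
The classification setup above can be sketched in a few lines: map scores to the three severity bands stated in the text, then fit a Random Forest on a 13-feature design matrix. The feature values below are synthetic stand-ins for the 13 risk factors (the real models are trained on National Vulnerability Database data), so only the shape of the pipeline, not the data, reflects the dissertation.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

def severity_level(score):
    """Severity bands from the text: low (0-3.9), medium (4-6.9), high (7-10)."""
    if score < 4.0:
        return "low"
    if score < 7.0:
        return "medium"
    return "high"

# Synthetic stand-ins: 200 vulnerabilities x 13 risk factors.
rng = np.random.default_rng(42)
X = rng.random((200, 13))
scores = 10.0 * X[:, 0]          # pretend the first factor drives the score
y = np.array([severity_level(s) for s in scores])

clf = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)
```

The same `feature_importances_` attribute of the fitted forest is the standard mechanism behind ranking risk factors with Random Forest models, as done for the remediation-prioritization ranking described above.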

Finally, we expand upon previous efforts and utilize two structured statistical models to simulate the expected path length and the minimum number of steps needed to hack a computer network system with very high probability. We illustrate a process of identifying, both parametrically and non-parametrically, probability distribution functions that characterize the probabilistic behavior of the expected path length and the minimum number of steps. A mixture of Gamma and LogNormal probability distributions was found to fit the data well. We also develop non-parametric analytical density estimates of the expected path length and of the minimum number of steps needed to hack the computer network system with very high probability. In addition, we perform both a parametric and a non-parametric reliability analysis of the computer network system for maintenance purposes and other administrative management. The analytical procedure and methodology presented in this study can be applied to larger computer network systems.
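
The Gamma/LogNormal mixture named above, and the parametric reliability function it implies, can be written down directly: the mixture density is a weighted sum of the component densities, and the reliability function is the corresponding weighted sum of survival functions, R(t) = 1 - F(t). A minimal sketch; the mixing weight and component parameters in the test are illustrative, not the values fitted in the study.

```python
from scipy import stats

def mixture_pdf(x, w, gamma_dist, lognorm_dist):
    """Density of a two-component Gamma/LogNormal mixture with weight w
    on the Gamma component; both components are frozen scipy distributions."""
    return w * gamma_dist.pdf(x) + (1.0 - w) * lognorm_dist.pdf(x)

def mixture_reliability(t, w, gamma_dist, lognorm_dist):
    """Parametric reliability R(t) = 1 - F(t) of the same mixture,
    via the component survival functions."""
    return w * gamma_dist.sf(t) + (1.0 - w) * lognorm_dist.sf(t)
```

Evaluating R(t) at a candidate maintenance horizon gives the probability that the network survives (is not successfully hacked) beyond that horizon, which is the quantity driving the maintenance decisions mentioned above.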
