Graduation Year


Document Type




Degree Granting Department

Chemical Engineering

Major Professor

Aydin K. Sunol, Ph.D.

Committee Member

John A. Llewellyn, Ph.D.

Committee Member

Scott W. Campbell, Ph.D.


machine learning, function identification, statistical analysis, modeling, nonlinear regression


The symbolic regression problem is to find a function, in symbolic form, that fits a given data set. Symbolic regression thus provides a means of function identification. This research describes an adaptive hybrid system for symbolic function identification of thermophysical models that combines genetic programming with a modified Marquardt nonlinear regression algorithm.

A genetic programming (GP) system can extract knowledge from data in the form of symbolic expressions, i.e., tree structures. These expressions are used to model and derive equations of state, mixing rules, and phase behavior from experimental data (property estimation).
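As an illustration of what a symbolic expression held as a tree can look like, the following is a minimal sketch and not the system described in this work. The nested-tuple encoding, the function set `FUNCTIONS`, the helpers `evaluate` and `to_infix`, and the ideal-gas-style example tree are all assumptions introduced here for illustration.

```python
import operator

# Hypothetical function set; the primitives used in the thesis are not listed in the abstract.
FUNCTIONS = {
    "+": operator.add,
    "-": operator.sub,
    "*": operator.mul,
    "/": operator.truediv,
}

def evaluate(node, variables):
    """Recursively evaluate an expression tree.

    A node is a tuple (op, left, right), a string naming a variable,
    or a numeric constant.
    """
    if isinstance(node, tuple):
        op, left, right = node
        return FUNCTIONS[op](evaluate(left, variables), evaluate(right, variables))
    if isinstance(node, str):
        return variables[node]
    return float(node)

def to_infix(node):
    """Render the tree as a fully parenthesized infix string."""
    if isinstance(node, tuple):
        op, left, right = node
        return f"({to_infix(left)} {op} {to_infix(right)})"
    return str(node)

# Illustrative tree for P = R * T / V (chosen here only as an example).
tree = ("/", ("*", "R", "T"), "V")
```

Calling `to_infix(tree)` gives the readable form of the candidate model, and `evaluate(tree, {"R": 8.314, "T": 300.0, "V": 0.025})` computes its value for one data point. In a GP run, crossover and mutation would operate on subtrees of such structures.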

During the automatic evolution process of GP, the functional structure of a generated individual can become highly complicated. To ensure convergence of the regression, a modified Marquardt regression algorithm is used. Two stopping criteria are added to the traditional Marquardt algorithm to force it to repeat the regression process before it stops.
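The abstract does not state what the two added stopping criteria are, so the sketch below should be read as one plausible arrangement rather than the modification used in this work. It assumes that iteration continues until both the decrease in the sum of squared errors and the relative parameter step are small. The names `marquardt`, `tol_sse`, and `tol_par`, the forward-difference Jacobian, and the factor-of-ten damping schedule are likewise assumptions.

```python
import math

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x

def sse(model, params, xs, ys):
    """Sum of squared errors of model(x, params) against the data."""
    return sum((y - model(x, params)) ** 2 for x, y in zip(xs, ys))

def jacobian(model, params, xs, h=1e-6):
    """Forward-difference Jacobian of the model with respect to its parameters."""
    J = []
    for x in xs:
        f0 = model(x, params)
        row = []
        for k in range(len(params)):
            p = list(params)
            step = h * max(1.0, abs(p[k]))
            p[k] += step
            row.append((model(x, p) - f0) / step)
        J.append(row)
    return J

def marquardt(model, params, xs, ys, lam=1e-3, tol_sse=1e-10, tol_par=1e-6, max_iter=200):
    """Marquardt iteration that stops only when BOTH assumed criteria hold."""
    params = list(params)
    cost = sse(model, params, xs, ys)
    n = len(params)
    for _ in range(max_iter):
        J = jacobian(model, params, xs)
        r = [y - model(x, params) for x, y in zip(xs, ys)]
        JTJ = [[sum(row[i] * row[j] for row in J) for j in range(n)] for i in range(n)]
        JTr = [sum(row[i] * ri for row, ri in zip(J, r)) for i in range(n)]
        # Marquardt damping: scale the diagonal of J^T J by (1 + lam).
        A = [[JTJ[i][j] * ((1.0 + lam) if i == j else 1.0) for j in range(n)] for i in range(n)]
        delta = solve(A, JTr)
        trial = [p + d for p, d in zip(params, delta)]
        try:
            trial_cost = sse(model, trial, xs, ys)
        except (OverflowError, ZeroDivisionError, ValueError):
            trial_cost = math.inf  # treat a numerical failure as a rejected step
        if trial_cost < cost:
            # Assumed criterion 1: small decrease in the sum of squared errors.
            small_sse = (cost - trial_cost) <= tol_sse * (1.0 + cost)
            # Assumed criterion 2: small relative change in every parameter.
            small_step = all(abs(d) <= tol_par * max(abs(p), 1.0) for p, d in zip(params, delta))
            params, cost, lam = trial, trial_cost, lam / 10.0
            if small_sse and small_step:
                break
        else:
            lam *= 10.0  # reject the step and increase damping
    return params, cost
```

For example, with a hypothetical model `lambda x, p: p[0] * math.exp(p[1] * x)` and data generated from it, `marquardt(model, [1.0, -1.0], xs, ys)` returns the fitted parameters and the final sum of squared errors. Requiring both criteria at once is what keeps the loop repeating when only one of them is met.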

Statistical analysis is applied to the fitted model. Residual plots are used to assess the goodness of fit, and the χ²-test is used to test the model's adequacy.
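A χ²-based adequacy check of the kind mentioned above can be sketched as follows. This is a generic textbook form, not necessarily the exact procedure used in this work. It assumes a known, constant measurement standard error `sigma` and a critical value taken from a χ² table by the caller. The function names and the sample numbers are hypothetical.

```python
def chi_square_statistic(observed, predicted, sigma):
    """Sum of squared residuals, each scaled by the measurement standard error."""
    return sum(((o - p) / sigma) ** 2 for o, p in zip(observed, predicted))

def adequacy_test(observed, predicted, sigma, n_params, critical_value):
    """Compare the chi-square statistic with a tabulated critical value.

    Degrees of freedom are the number of data points minus the number of
    fitted parameters. The model is judged adequate when the statistic
    does not exceed the critical value.
    """
    chi2 = chi_square_statistic(observed, predicted, sigma)
    dof = len(observed) - n_params
    return {
        "chi2": chi2,
        "dof": dof,
        "reduced_chi2": chi2 / dof,
        "adequate": chi2 <= critical_value,
    }

# Hypothetical data: 7 points, 2 fitted parameters, sigma = 0.1.
observed = [1.1, 1.9, 3.2, 3.9, 5.1, 6.0, 6.8]
predicted = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0]
# 11.070 is the tabulated 5% critical value for 5 degrees of freedom.
result = adequacy_test(observed, predicted, sigma=0.1, n_params=2, critical_value=11.070)
```

Here the statistic is about 12 on 5 degrees of freedom, so `result["adequate"]` is `False` at the 5% level. A reduced χ² far above 1 points to lack of fit or an underestimated `sigma`.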

Ten experiments are run with different forms of input variables, numbers of data points, levels of standard error added to the data set, and fitness functions. The results show that the system is able to find models and optimize their parameters successfully.