Graduation Year


Document Type




Degree Name

Doctor of Philosophy (Ph.D.)

Degree Granting Department

Biology (Cell Biology, Microbiology, Molecular Biology)

Major Professor

Sameer Varma, Ph.D.

Committee Member

Bin Xue, Ph.D.

Committee Member

Sagar Pandit, Ph.D.

Committee Member

Yicheng Tu, Ph.D.


Keywords

Protein Allostery, Prediction, Machine Learning


Regulation of protein activity is essential for normal cell functionality. Many proteins are regulated allosterically, that is, with spatial gaps between stimulation and active sites. Biological stimuli that regulate proteins allosterically include, for example, ions and small molecules, post-translational modifications, and intensive state variables like temperature and pH. These effectors can not only switch activities on and off, but also fine-tune them. Understanding the underpinnings of allostery, that is, how signals are propagated between distant sites and how transmitted signals manifest themselves as regulation of protein activity, has been one of the central foci of biology for over 50 years. Today, the importance of such studies goes beyond basic pedagogical interests, as bioengineers seek design features to control protein function for myriad purposes, including the design of nano-biosensors, drug delivery vehicles, synthetic cells and organic-synthetic interfaces. The current phenomenological view of allostery is that signaling and activity control occur via effector-induced changes in protein conformational ensembles. If the structures of two states of a protein differ from each other significantly, then thermal fluctuations can be neglected and an atomically detailed model of regulation can be constructed in terms of how the minimum-energy structures differ between states. However, when the minimum-energy structures of states differ from each other only marginally and the difference is comparable to thermal fluctuations, a mechanistic model cannot be constructed solely on the basis of differences in protein structure. Understanding the mechanism of such dynamic allostery requires assessment not only of the high-dimensional conformational ensembles of the various individual states, including inactive, transition and active states, but also of the relationships between them.
This challenge faces many diverse protein families, including G-protein-coupled receptors, immune cell receptors, heat shock proteins, nuclear transcription factors and viral attachment proteins, whose mechanisms, despite numerous studies, remain poorly understood. This dissertation develops new methods that significantly boost the applicability of molecular simulation techniques to probing dynamic allostery in these proteins. Specifically, it presents two methods: one to obtain quantitative estimates of subtle differences between conformational ensembles, and the other to relate conformational ensemble differences to allosteric signal communication. Both methods are enabled by a new application of the mathematical framework of machine learning. These methods are applied to (a) identify specific effects of the employed force fields on conformational ensembles, (b) compare multiple ensembles against each other to determine common signaling pathways induced by different effectors, (c) identify the effects of point mutations on conformational ensemble shifts in proteins, and (d) understand the mechanism of dynamic allostery in a PDZ domain. These diverse applications demonstrate the generality of the developed approaches and set the foundation for future studies on PDZ domains and viral attachment proteins.