The Effects of Mutations on Protein Function: A Comparative Study of Three Databases of Mutations in Humans

Single-nucleotide polymorphisms (SNPs) in protein-coding regions of the human genome are a major factor in determining human variation in health and disease. Here, we analyze the amino acid changes and functional effects caused by non-synonymous SNPs. Three datasets were used: (i) Variation, mutations found in the general human population; (ii) Cosmic, mutations found in cancer cells; and (iii) Pathogenic, a curated subset of the mutations in Variation that are associated with diseases. We analyzed the distributions of amino acid changes in these datasets and show that mutations in the Pathogenic dataset, in particular, tend to introduce order-promoting residues. We also studied the effects of the mutations using the program PolyPhen-2, which predicts the functional impact of non-synonymous mutations. To evaluate the significance of the predicted effects, we compared them, as a control, with the effects predicted for the same amino acid replacements introduced at other positions in the same proteins. A mutation can be deleterious because the amino acid change is drastic (for example, a change from a hydrophobic residue to a hydrophilic one) or because of its location in the protein. We found that, on both counts, mutations in the Variation dataset tend to be less deleterious than expected by chance, whereas mutations in the Pathogenic dataset tend to be more deleterious than their control mutations. Mutations in the Cosmic dataset are more deleterious than those in their control set but less deleterious than those in Pathogenic.
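The control described in the abstract (the same amino acid replacement introduced at other positions in the same protein) can be illustrated with a minimal Python sketch. The function name `control_mutations`, the 1-based position convention, and the toy sequence are assumptions made for this illustration and are not taken from the paper. The sketch only enumerates candidate control mutations; scoring them (for example with PolyPhen-2) is not shown.

```python
def control_mutations(sequence, position, new_residue):
    """List control mutations for an observed substitution.

    sequence    -- protein sequence as a string of one-letter codes
    position    -- 1-based position of the observed mutation (assumed convention)
    new_residue -- one-letter code of the residue introduced by the mutation

    Returns (position, original_residue, new_residue) tuples for every
    other position in the same protein that carries the same original residue.
    """
    original = sequence[position - 1]
    return [
        (i, original, new_residue)
        for i, residue in enumerate(sequence, start=1)
        if residue == original and i != position
    ]


# Hypothetical toy sequence: the observed mutation is L4P.
toy_sequence = "MKTLLVAGLK"
controls = control_mutations(toy_sequence, 4, "P")
print(controls)  # [(5, 'L', 'P'), (9, 'L', 'P')]
```

Because the controls share the observed amino acid replacement, comparing their predicted effects with that of the observed mutation isolates the contribution of location in the protein from the contribution of the residue change itself.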

Citation / Publisher Attribution

Israel Journal of Chemistry, v. 53, issue 3-4, p. 217-226