Genes Encoding Intrinsic Disorder in Eukaryota Have High GC Content

Keywords

disorder prediction, DNA-binding protein, GC content, protein evolution, RNA-binding protein

We analyze the correlation between the GC content of genes in 12 eukaryotic species and the level of intrinsic disorder in the corresponding proteins. Comprehensive computational analysis reveals that disordered regions in eukaryotes are encoded by GC-enriched gene regions, that this enrichment correlates with the amount of disorder, and that it is present across proteins and species characterized by varying amounts of disorder. The GC enrichment results from a higher frequency of amino acids encoded by GC-rich codons in the disordered regions. Individual amino acids have the same GC-content profile across species. Eukaryotic proteins whose disordered regions are encoded by GC-enriched gene segments carry out important biological functions, including interactions with RNA, DNA, and nucleotides and binding of calcium and metal ions; they are involved in transcription, transport, cell division, and certain signaling pathways, and are localized primarily in the nucleus, cytosol, and cytoplasm. We also investigate a possible relationship between GC content, intrinsic disorder, and protein evolution. Analysis of a devised “age” of amino acids, their disorder-promoting capacity, and the GC enrichment of their codons suggests that the early amino acids are mostly disorder-promoting and have GC-rich codons, while most late amino acids are order-promoting.
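To make the notion of "amino acids encoded by GC-rich codons" concrete, the following Python sketch computes the GC fraction of a nucleotide sequence and the mean GC fraction of the synonymous codons for each amino acid under the standard genetic code. It is an illustration only, not the authors' pipeline: the names `gc_fraction` and `mean_codon_gc` are hypothetical, and the unweighted average over synonymous codons is an assumption (a real analysis would more likely weight codons by their observed usage in each species).

```python
from collections import defaultdict
from itertools import product

# Standard genetic code (NCBI translation table 1), with codons enumerated
# in TCAG order for the first, second, and third base. '*' marks stop codons.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def gc_fraction(seq):
    """Return the fraction of G and C bases in a nucleotide sequence."""
    seq = seq.upper()
    if not seq:
        raise ValueError("empty sequence")
    return sum(base in "GC" for base in seq) / len(seq)


def mean_codon_gc():
    """Map each amino acid to the unweighted mean GC fraction of its codons.

    Assumption: every synonymous codon counts equally; codon usage bias
    is ignored in this sketch.
    """
    per_aa = defaultdict(list)
    for codon, aa in CODON_TABLE.items():
        if aa != "*":  # skip stop codons
            per_aa[aa].append(gc_fraction(codon))
    return {aa: sum(vals) / len(vals) for aa, vals in per_aa.items()}


if __name__ == "__main__":
    # List amino acids from the most to the least GC-rich codon set.
    ranked = sorted(mean_codon_gc().items(), key=lambda kv: -kv[1])
    for aa, gc in ranked:
        print(f"{aa}\t{gc:.2f}")
```

Under this unweighted scheme, glycine, alanine, and proline have the most GC-rich codon sets, while lysine, phenylalanine, and several other amino acids have AT-rich ones. Relating these values to disorder propensity would additionally require per-residue disorder predictions, which this sketch does not attempt.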

Citation / Publisher Attribution

Intrinsically Disordered Proteins, v. 4, issue 1, art. e1262225