When DNA words started to repeat and took over the world!

Exploring Evolution with Flying Colors
Read the paper

Protein Repeats Show Clade-Specific Volatility in Aves - Molecular Biology

Abstract Protein repeats are a source of rapid evolutionary and functional novelty. Repeats are crucial in development, neurogenesis, immunity, and disease. Repeat length variability and purity can alter the outcome of a pathway by altering the protein structure and affecting the protein−protein interaction affinity. Such rampant alterations can facilitate species to rapidly adapt to new environments or acquire various morphological/physiological features. With more than 11000 species, the avian clade is one of the most speciose vertebrate clades, with near-ubiquitous distribution globally. Explosive adaptive radiation and functional diversification facilitated the birds to occupy various habitats. High diversity in morphology, physiology, flight pattern, behavior, coloration, and life histories make birds ideal for studying protein repeats’ role in evolutionary novelty. Our results demonstrate a similar repeat diversity and proportion of repeats across all the avian orders considered, implying an essential role of repeats in necessary pathways. We detected positively selected sites (PSS) in the polyQ repeat of RUNX2 in the avian clade; and considerable repeat length contraction in the Psittacopasserae. The repeats show a species-wide bias towards a contraction in Galloanseriformes. Interestingly, we detected the length contrast of polyS repeat in PCDH20 between Galliformes and Anseriformes. We speculate the length variability of serine repeat and its interaction with β-catenin in the Wnt/β-catenin signaling pathway could have facilitated fowls to adapt to their respective environmental conditions. We believe our study emphasizes the role of protein repeats in functional/morphological diversification in birds. We also provide an extensive list of genes with considerable repeat length contrast to further explore the role of length volatility in evolutionary novelty and rapid functional diversification.

The Beginning

Almost six years ago, our new evolutionary genomics lab at the Indian Institute of Science Education and Research, Bhopal, began exploring the impact of gene loss on organisms' evolution and adaptation. In our initial study of the CYP8B1 gene, we discovered that gene loss is primarily preceded by the relaxation of purifying selection [1]. This laid the foundation for our subsequent research. As we delved deeper into relaxed selection, gene loss, and evolution, we observed that the interplay of these factors helps organisms adapt to specific environmental demands. We studied this phenomenon extensively in the PLGRKT gene [2].


Exploring gene loss with next-generation sequencing data required careful, definitive methods to identify gene loss across various vertebrate species. We developed a five-pass strategy to identify gene loss and its role in organisms' evolution (Figure 1).

Figure 1: Five-pass strategy to detect evidence of gene loss using next-generation sequencing data across species.

Linking Loss and Function

Our previous studies established the essential role of gene loss and relaxed selection in evolution. We next aimed to link this phenomenon with its consequences for morphology and function. This led us to investigate COA1 loss in vertebrates [3], where we identified a significant link between the loss of flight in birds and the loss of the COA1 gene (Figure 2).

Figure 2: Overall status of the COA1 gene across the phylogenetic tree.

Exploring New Horizons

Having explored gene loss, relaxed selection, and their effects on function, we turned our attention to another phenomenon in genome evolution: amino acid repeat variability. Changes in repeat length, which can occur around 100,000 times more frequently than point mutations, became the focus of our study on immune genes. We examined the effects of amino acid repeats on morphological variations and evolutionary novelties [4].

New Challenges: New Innovations

Detecting general patterns in the distribution and types of amino acid repeats was straightforward, but exploring their variability required new strategies to quantify repeat length contrasts between species empirically. For this purpose, we used phylogenetically independent contrasts (PIC), which account for the shared ancestry of related species, together with statistical tests. This approach helped us identify candidate genes for future exploration (Figure 3).

Figure 3: The pie chart represents the protein-coding repeats' expansion and contraction in the primate clade.
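For readers curious about what a phylogenetically independent contrast looks like in practice, here is a minimal Python sketch of Felsenstein's contrast calculation on a four-species tree. It is an illustration only, not the pipeline used in the study: the `Node` class, the tree, the species names, the branch lengths, and the repeat lengths are all made up for the example.

```python
from dataclasses import dataclass
from math import sqrt
from typing import Optional


@dataclass
class Node:
    """A node in a rooted binary tree (hypothetical helper for this sketch)."""
    name: str
    branch_length: float            # length of the branch leading to this node
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def independent_contrasts(root, trait):
    """Return (node name, standardized contrast) for every internal node."""
    contrasts = []

    def visit(node):
        # Returns (trait estimate at this node, adjusted branch length above it).
        if node.left is None and node.right is None:
            return trait[node.name], node.branch_length
        x1, v1 = visit(node.left)
        x2, v2 = visit(node.right)
        # Difference between the two daughters, scaled by its expected spread.
        contrasts.append((node.name, (x1 - x2) / sqrt(v1 + v2)))
        # Ancestral estimate: average weighted by the inverse branch lengths.
        estimate = (x1 / v1 + x2 / v2) / (1 / v1 + 1 / v2)
        # Lengthen the branch above to reflect uncertainty in that estimate.
        adjusted = node.branch_length + (v1 * v2) / (v1 + v2)
        return estimate, adjusted

    visit(root)
    return contrasts


# Toy tree ((A:1,B:1):1,(C:1,D:1):1) with invented repeat lengths per species.
tree = Node("root", 0.0,
            Node("AB", 1.0, Node("A", 1.0), Node("B", 1.0)),
            Node("CD", 1.0, Node("C", 1.0), Node("D", 1.0)))
repeat_length = {"A": 10, "B": 12, "C": 20, "D": 26}

for name, value in independent_contrasts(tree, repeat_length):
    print(f"{name}: {value:.3f}")
```

Each internal node yields one contrast, so a tree with four species gives three values. Because these contrasts are statistically independent under a Brownian motion model of trait evolution, they can be passed to ordinary statistical tests, which is the general idea behind combining PIC with such tests.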

Taking Flight with Birds

We then applied what we had learned to a fascinating group of species: Aves (Figure 4).

Figure 4: Sandhya Sharma (first author) explains the potential significance of amino acid repeats in Aves to Dr. Nagarjun Vijay (left side; corresponding author), and Lokdeep Teekas (right side; co-author) notes down the critical discussion points of the project.

Having gained insights into the role of amino acid repeats in providing evolutionary innovations, we focused on birds, one of the most diverse and species-rich vertebrate clades (Figure 5).

Figure 5: Exploring the avian diversity in Bhopal, India (photos credit: Lokdeep Teekas).

This project was particularly exciting for us, and our findings motivated us to continue. We identified overall patterns of length expansion and contraction of amino acid repeats in each species and clade along the bird phylogeny [5]. We also found that the polyS repeat in the PCDH20 gene differs in length between waterfowl and landfowl, potentially connecting differences in hearing abilities to their environment (Figure 6).

Figure 6: PolyS repeat variability in the PCDH20 gene of Galloanseriformes.
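To make the idea of repeat length contrast concrete, the sketch below finds runs of a single amino acid in a protein sequence and compares the longest polyS run between two sequences. It is a simplified illustration, not the repeat-detection method from the paper: the function name `find_homorepeats`, the minimum run length of four residues, and both sequences are assumptions made up for the example (they are not real PCDH20 sequences).

```python
import re


def find_homorepeats(protein, min_length=4):
    """Return (residue, start, length) for each run of one amino acid.

    min_length is an arbitrary threshold chosen for this sketch.
    """
    # ([A-Z]) captures a residue; \1{n,} requires it to repeat n more times.
    pattern = re.compile(r"([A-Z])\1{%d,}" % (min_length - 1))
    return [(m.group(1), m.start(), len(m.group(0)))
            for m in pattern.finditer(protein)]


def longest_run(protein, residue, min_length=4):
    """Length of the longest run of `residue`, or 0 if none reaches min_length."""
    lengths = [length for aa, _, length in find_homorepeats(protein, min_length)
               if aa == residue]
    return max(lengths, default=0)


# Invented toy sequences standing in for orthologs from two species.
species_a = "MKSSSSSSAQQQQLLP"
species_b = "MKSSSSSSSSSSAQQQQLLP"

print(find_homorepeats(species_a))
print("polyS length contrast:",
      longest_run(species_b, "S") - longest_run(species_a, "S"))
```

A real analysis would also need to handle impure repeats (runs interrupted by other residues) and confirm that the compared repeats are orthologous, which this sketch does not attempt.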

What’s Next?

Our journey continues on this exciting path as we explore the role of amino acid repeats in all protein-coding genes across the available high-quality genomes of Tetrapoda species [6].

After graduating, I (Sandhya Sharma) plan to apply for postdoctoral positions abroad so that I can continue my research.


  1. Shinde S.S., Teekas L., Sharma S., Vijay N. 2019. Signatures of Relaxed Selection in the CYP8B1 Gene of Birds and Mammals. J. Mol. Evol. 87, 209–20.
  2. Sharma S., Shinde S.S., Teekas L., Vijay N. 2020. Evidence for the loss of plasminogen receptor KT gene in chicken. Immunogenetics. 72, 507–15.
  3. Shinde S.S., Sharma S., Teekas L., Sharma A., Vijay N. 2021. Recurrent erosion of COA1/MITRAC15 exemplifies conditional gene dispensability in oxidative phosphorylation. Sci. Rep. 11, 24437.
  4. Teekas L., Sharma S., Vijay N. 2022. Lineage-specific protein repeat expansions and contractions reveal malleable regions of immune genes. Genes Immun. 1–17.
  5. Sharma S., Teekas L., Vijay N. 2023. Protein Repeats Show Clade-Specific Volatility in Aves. Mol. Biol. 57, 1199–211.
  6. Teekas L., Sharma S., Vijay N. 2023. Terminal regions of a protein are a hotspot for low complexity regions (LCRs) and selection. bioRxiv. 2023.07.05.547895.
