De novo atomic protein structure modeling for cryoEM density maps

Cryo2Struct determines atomic structures of large proteins directly from cryo-EM density maps, advancing biomedical research and drug design.

In recent years, cryo-electron microscopy (cryo-EM) has emerged as a key technology for experimentally determining the structures of large protein complexes and assemblies. The 3D arrangement of atoms provides a mechanistic understanding of how these molecules work, offering insight into the fundamental processes of life. Accurately modeling protein atomic structures from cryo-EM density maps is crucial because it reveals how proteins perform their biological functions and interact with other molecules, such as substrates, inhibitors, DNA, RNA, and other proteins. This understanding also enables the design of molecules that interact precisely with proteins, leading to more effective and specific drugs. Additionally, determining the structures of proteins from pathogens, such as bacteria and viruses, can inform the development of vaccines and treatments.

Our Approach:

We developed Cryo2Struct, a fully automated, ab initio modeling method that generates 3D atomic structures solely from cryo-EM density maps, without using predicted or homologous structures as templates. This method allows for the modeling of atomic structures through direct observation of atoms in the density maps.

Cryo2Struct employs a Transformer-based deep learning model with an attention mechanism to identify atoms and their amino acid types in cryo-EM density maps. It then uses a generative Hidden Markov Model (HMM) and a tailored Viterbi algorithm to align protein sequences with the predicted atoms and amino acid types, generating atomic backbone structures. Cryo2Struct was tested on 628 density maps in a stringent ab initio modeling setting, where no homologous or predicted structures were used as templates, and it demonstrated substantially improved modeling accuracy over the widely used Phenix method. Additionally, Cryo2Struct provides per-residue confidence estimates ranging from 0 to 1 for both C-alpha atom and amino acid type predictions. These confidence scores, similar to the pLDDT scores assigned by AlphaFold, reflect the degree of certainty in the predictions: higher scores indicate more reliable predictions, and lower scores flag areas that may require further scrutiny.
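To make the alignment step more concrete, here is a minimal sketch of Viterbi decoding on a toy problem. It is not the Cryo2Struct implementation. It assumes that hidden states are predicted C-alpha candidates, that emission scores come from predicted amino acid type probabilities, and that transitions favor consecutive C-alpha atoms roughly 3.8 Å apart. The function name `viterbi_align`, the Gaussian transition score, and the `sigma` parameter are illustrative choices, and the scores are unnormalized log values rather than true probabilities. Unlike a tailored algorithm, this plain version does not stop an atom from being reused later in the path.

```python
import math

def viterbi_align(sequence, atoms, aa_probs, ideal=3.8, sigma=1.0):
    """Assign one predicted C-alpha candidate to each residue of `sequence`.

    atoms    : list of (x, y, z) coordinates of predicted C-alpha candidates
    aa_probs : one dict per atom mapping amino acid letter -> probability
    Returns a list of atom indices, one per residue (hypothetical helper).
    """
    n = len(atoms)

    def log_trans(i, j):
        # Assumed transition score: penalize deviation from the ideal
        # distance between consecutive C-alpha atoms.
        if i == j:
            return -math.inf  # an atom cannot directly follow itself
        d = math.dist(atoms[i], atoms[j])
        return -((d - ideal) ** 2) / (2 * sigma ** 2)

    def log_emit(j, aa):
        # Emission score: how strongly atom j is predicted to be residue type aa.
        return math.log(max(aa_probs[j].get(aa, 0.0), 1e-9))

    # Initialization: any atom may carry the first residue.
    score = [log_emit(j, sequence[0]) for j in range(n)]
    back = []

    # Recursion: extend the best path ending at each atom by one residue.
    for aa in sequence[1:]:
        new_score, ptr = [], []
        for j in range(n):
            best_i = max(range(n), key=lambda i: score[i] + log_trans(i, j))
            new_score.append(score[best_i] + log_trans(best_i, j) + log_emit(j, aa))
            ptr.append(best_i)
        score = new_score
        back.append(ptr)

    # Backtracking: recover the highest-scoring path.
    path = [max(range(n), key=lambda j: score[j])]
    for ptr in reversed(back):
        path.append(ptr[path[-1]])
    path.reverse()
    return path

# Toy example: three candidates spaced 3.8 Å apart along the x axis.
toy_atoms = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0)]
toy_probs = [
    {"G": 0.90, "A": 0.05, "K": 0.05},
    {"G": 0.05, "A": 0.90, "K": 0.05},
    {"G": 0.05, "A": 0.05, "K": 0.90},
]
toy_path = viterbi_align("GAK", toy_atoms, toy_probs)
```

In this toy case the decoded path visits the atoms in order, because both the spacing and the predicted amino acid types agree with the sequence. On real maps there are far more candidates than residues, which is why the transition and emission scores have to do the work of picking out the chain.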

Cryo2Struct builds the atomic structure of proteins from cryo-EM density maps.

Results:

Cryo2Struct performed substantially better than Phenix, the most widely used de novo modeling method, across multiple evaluation metrics: C-alpha recall, F1 score, global normalized TM-score, aligned C-alpha length, C-alpha match score, C-alpha sequence match score, and C-alpha quality score. In general, it builds much more accurate and more complete protein structures from cryo-EM density maps than Phenix. This advances the state of the art in ab initio modeling of protein structures from cryo-EM density maps and gives the community a useful means of building better structural models, from both existing maps and new ones generated to support biomedical research. More detailed results and analysis are available in the manuscript, which can be accessed here: https://doi.org/10.1038/s41467-024-49647-6
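As an illustration of how two of these metrics can be read, the sketch below computes C-alpha recall, precision, and F1 for a predicted model against a reference structure. It is a simplified stand-in, not the evaluation code used in the manuscript: the 3.0 Å matching cutoff, the greedy one-to-one matching, and the function name `ca_recall_precision_f1` are all assumptions made for this example.

```python
import math

def ca_recall_precision_f1(predicted, reference, cutoff=3.0):
    """Score predicted C-alpha coordinates against reference coordinates.

    A reference atom counts as recovered if an unused predicted atom lies
    within `cutoff` Å of it (assumed cutoff, greedy matching).
    """
    unused = set(range(len(predicted)))
    matched = 0
    for ref in reference:
        # Closest predicted atom that has not been matched yet.
        best = min(unused, key=lambda k: math.dist(predicted[k], ref), default=None)
        if best is not None and math.dist(predicted[best], ref) <= cutoff:
            unused.remove(best)
            matched += 1

    recall = matched / len(reference) if reference else 0.0
    precision = matched / len(predicted) if predicted else 0.0
    f1 = 2 * precision * recall / (precision + recall) if matched else 0.0
    return recall, precision, f1

# Toy example: four reference atoms, three predicted atoms (one far off).
reference_ca = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0), (11.4, 0.0, 0.0)]
predicted_ca = [(0.5, 0.0, 0.0), (4.0, 0.0, 0.0), (50.0, 0.0, 0.0)]
recall, precision, f1 = ca_recall_precision_f1(predicted_ca, reference_ca)
```

Here two of the four reference atoms are recovered, so recall is 0.5, and two of the three predicted atoms are correct, so precision is about 0.67. Recall rewards completeness, precision penalizes spurious atoms, and F1 balances the two.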

Code:

The source code for Cryo2Struct is open-source and available in the GitHub repository: https://github.com/jianlin-cheng/Cryo2Struct. This repository also includes instructions on running Cryo2Struct on cryo-EM maps to generate 3D atomic protein structures.

