Can machines learn 3D chemical bond distributions?

Published in Chemistry, Materials, and Physics

The mean-field, or self-consistent field (SCF), method is a ubiquitous computational approach to complex scientific problems formulated as sets of coupled differential equations. It appears in a wide range of contexts, such as the Landau theory of phase transitions, the Bogoliubov-de Gennes equations for superconductivity, Gummel’s equations for semiconductor devices, and Kohn-Sham density functional theory (DFT) for ab initio electronic structure calculations, to name a few. In the case of DFT, which has become the standard computational tool across many science and engineering fields, the SCF solution of the Kohn-Sham (KS) equations identifies the three-dimensional (3D) ground-state electron density and, in doing so, yields the variationally minimized total energy.

Despite its success, DFT is typically limited to systems of a few hundred to a thousand atoms because its computational cost scales cubically with the number of atoms. Recently, there has been much interest in using artificial intelligence (AI) techniques to accelerate DFT calculations. However, compared with machine learning (ML) strategies for predicting macroscopic materials properties and atomic forces, progress in applying AI to the prediction of quantum mechanical electronic structure information has been slow.

In a paper recently published in npj Computational Materials, our team at KAIST demonstrates for the first time how SCF iterations in DFT calculations can be bypassed efficiently yet accurately by encoding spatial chemical bonding distributions with 3D convolutional neural networks (CNNs).

According to the Hohenberg-Kohn theorem of DFT, the 3D electron density distribution fully embodies the quantum mechanical electronic structure information. In practice, however, the true electron density is unknown, so DFT calculations start from an educated guess of the electron density and then repeat the SCF step: construct the Kohn-Sham Hamiltonian from the current density, solve the Schrödinger-like Kohn-Sham equations, and update the density. The sum of the individual atomic electron densities is often taken as the initial guess, and the SCF iteration is repeated until the electron density or Hamiltonian converges and the total energy is minimized. The number of SCF cycles typically grows with system size and complexity and can reach several hundred.
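The loop described above can be sketched on a toy problem. The following is a minimal illustration, not a DFT code and not the implementation from the paper: it assumes a hypothetical 8-site one-dimensional chain whose on-site energy depends on the local density, a uniform density as a stand-in for the sum of atomic densities, and simple linear mixing. All function names and parameter values are made up for the example.

```python
import numpy as np

def build_hamiltonian(density, site_potential, hopping=1.0, coupling=1.0):
    """Toy mean-field Hamiltonian: 1D chain with a density-dependent on-site term."""
    n_sites = density.size
    h = np.diag(site_potential + coupling * density)
    idx = np.arange(n_sites - 1)
    h[idx, idx + 1] = -hopping  # nearest-neighbour hopping
    h[idx + 1, idx] = -hopping
    return h

def density_from_hamiltonian(h, n_occupied):
    """Diagonalize and fill the lowest n_occupied orbitals (one electron each)."""
    _, orbitals = np.linalg.eigh(h)
    occupied = orbitals[:, :n_occupied]
    return np.sum(occupied**2, axis=1)

def run_scf(initial_density, site_potential, n_occupied,
            mixing=0.3, tol=1e-8, max_iter=500):
    """Repeat build -> solve -> update until input and output densities agree."""
    density = initial_density.copy()
    for iteration in range(1, max_iter + 1):
        h = build_hamiltonian(density, site_potential)
        new_density = density_from_hamiltonian(h, n_occupied)
        if np.max(np.abs(new_density - density)) < tol:
            return new_density, iteration
        # Linear mixing damps the update so the iteration stays stable.
        density = (1.0 - mixing) * density + mixing * new_density
    raise RuntimeError("SCF did not converge within max_iter cycles")

# Hypothetical setup: 8 sites, 4 electrons, a linear ramp as external potential.
n_sites, n_occupied = 8, 4
site_potential = np.linspace(-0.5, 0.5, n_sites)
initial_density = np.full(n_sites, n_occupied / n_sites)  # crude initial guess

scf_density, n_cycles = run_scf(initial_density, site_potential, n_occupied)
print(f"converged in {n_cycles} cycles, electrons = {scf_density.sum():.6f}")
```

Even in this small model the loop needs many cycles, because each cycle only partially corrects the guess. A better starting density shortens the loop, and a sufficiently accurate predicted density would in principle allow a single Hamiltonian build and diagonalization.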

The crux of the newly developed ML approach, which we termed "DeepSCF", is twofold. First, we recognized that it is not the total electron density itself but the residual electron density (the difference between the total electron density and the sum of atomic electron densities) that carries the chemical bonding information. Second, we devised efficient strategies to machine-learn this residual density with 3D CNNs. Specifically, we learned the residual density projected onto a 3D spatial grid using several atomic fingerprints, adopted a molecular database that includes diverse chemical bonds, and enhanced the accuracy and transferability of the algorithm by randomly modifying molecular structures within the database.
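To make the residual density concrete, here is a minimal sketch of how it could be formed on a 3D grid. Everything in it is an assumption made for illustration: the Gaussian blobs are stand-ins for atomic densities, the "total" density is a hand-built model that moves charge into the bond region rather than the output of a DFT calculation, and the grid size and widths are arbitrary. It does not reproduce the fingerprints or the network used in the paper.

```python
import numpy as np

def gaussian_density(grid, center, n_electrons, width):
    """Normalized 3D Gaussian holding n_electrons (stand-in for an atomic density)."""
    r2 = sum((g - c) ** 2 for g, c in zip(grid, center))
    norm = n_electrons / (2.0 * np.pi * width**2) ** 1.5
    return norm * np.exp(-r2 / (2.0 * width**2))

# Uniform 3D grid (arbitrary units), indexed as [x, y, z].
axis = np.linspace(-6.0, 6.0, 49)
grid = np.meshgrid(axis, axis, axis, indexing="ij")
voxel_volume = (axis[1] - axis[0]) ** 3

# Two hypothetical one-electron "atoms" placed along the x axis.
centers = [(-0.7, 0.0, 0.0), (0.7, 0.0, 0.0)]
atomic_sum = sum(gaussian_density(grid, c, 1.0, 0.8) for c in centers)

# Model "total" density: 20% of the atomic charge is moved to the bond midpoint,
# so the electron count is unchanged (0.8 * 2 + 0.4 = 2).
total = 0.8 * atomic_sum + gaussian_density(grid, (0.0, 0.0, 0.0), 0.4, 0.6)

# Residual density: total density minus the superposition of atomic densities.
residual = total - atomic_sum

mid = len(axis) // 2  # grid index of the bond midpoint (x = y = z = 0)
print("integrated residual:", residual.sum() * voxel_volume)
print("residual at bond midpoint:", residual[mid, mid, mid])
```

For a neutral system the residual integrates to approximately zero, since bonding redistributes electrons without creating them. In this model it is positive where charge accumulates between the atoms and negative near the nuclei. A 3D array of this kind is the sort of target a 3D CNN can be trained to predict.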

We demonstrated that DeepSCF achieves significant computational speedups over conventional SCF DFT calculations while retaining comparable accuracy. The method proved effective not only for molecules and crystals but also for complex device models such as a carbon nanotube-based DNA sequencer.

In addition to offering a foundational principle for AI-based acceleration of materials simulations across scales, our study should contribute to AI applications more broadly by showcasing a powerful example of integrating advanced deep learning models with high-performance scientific computing.
