Behind the Paper

A crystal graph neural network model for defect formation in clean energy materials

Broad materials screening for defect properties requires fast methods that circumvent the need for first-principles supercell calculations of a vast number of possible defect configurations. We construct and train a defect graph neural network to screen oxides for clean energy applications.

Published in Computational Sciences

Aug 10, 2023

Stephan Lany

Sr. Scientist, National Renewable Energy Laboratory

The properties and functionality of solid-state materials are governed by their chemical composition and crystal structure. Often, however, it is not only the ideal crystal structure that matters but also the material’s ability to form defects. This is especially true of oxides for renewable energy applications, where defects enable, for example, the ionic conductivity in solid electrolytes or the redox processes for solar thermochemical hydrogen (STCH) generation.

To efficiently screen a large number of materials for such applications, we sought a machine learning (ML) model that predicts the propensity of materials to form these enabling defects, avoiding the need for a very large number (possibly up to a million) of comparatively slow and tedious density functional theory (DFT) supercell calculations. Graph neural networks (GNNs) are promising for this task because they automatically extract crystal structure information as input features, and therefore go beyond the capabilities of simpler models based only on chemical and compositional characteristics that can be derived by hand. However, existing GNN models were primarily designed to describe the ideal crystal without defects. They therefore predict properties of the entire input structure, not the site-specific properties needed here.

The basic premise of our work is that the properties of defects, such as O vacancies, should be encoded in the crystal structure of the ideal oxide. Thus, an ML model could predict the vacancy formation energies without the need to determine the supercell structure of the defective material beforehand. We implemented node-level pooling operations in our defect-GNN model (dGNN), which extracts site-specific properties from the underlying crystal structure. To generate the training data for the ML model, we performed DFT supercell calculations to obtain defect formation energies for ~1500 unique defect sites in about 200 different oxides representing a chemical space of 15 different metal cations. Extensive cross-validation statistics confirmed the numerical robustness of the model and showed that additional training data can further increase its accuracy.
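The difference between a structure-level and a site-level prediction comes down to the readout step. The following is a minimal sketch, not the dGNN code: the function names and the toy feature vectors are assumptions made for illustration, and the features stand in for what a trained network would produce.

```python
from statistics import fmean

def graph_level_readout(node_features):
    """Average all node feature vectors into a single vector for the whole structure."""
    n_dim = len(node_features[0])
    return [fmean(vec[d] for vec in node_features) for d in range(n_dim)]

def node_level_readout(node_features, site_indices):
    """Keep the feature vector of each requested site, one entry per site."""
    return {i: node_features[i] for i in site_indices}

# Toy features (hypothetical): atoms 0-1 are metals, atoms 2-3 are two inequivalent O sites.
features = [[1.0, 0.0], [1.0, 0.0], [0.0, 2.0], [0.0, 4.0]]

whole_structure = graph_level_readout(features)  # one vector; site detail is averaged away
per_site = node_level_readout(features, [2, 3])  # one vector per O site
```

With graph-level pooling the two oxygen sites become indistinguishable, whereas the node-level readout keeps a separate vector for each site that a later layer can map to its own formation energy.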

**Figure 1**. Defect graph neural network. Predicting vacancy defect formation enthalpies starts with identifying all unique symmetry sites in the DFT-relaxed host crystal structure. Direct first-principles (DFT-based) defect predictions require computationally expensive supercell calculations with atomic relaxations for each symmetry site, so we perform them only for a smaller training set. The dGNN approach, in contrast, encodes the host crystal structure as a graph and updates the feature vectors for each symmetry site through graph convolutions. The last key step is node pooling followed by a multi-layer perceptron (MLP) that predicts the vacancy formation enthalpy of each symmetry site. The machine-learned prediction of defect energies is computationally inexpensive and can be performed on the fly during database screening.
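The steps in the figure can be sketched end to end on a toy structure. This is an illustration under strong assumptions, not the published model: the distance cutoff, the one-hot features, the sum aggregation, and the single linear layer standing in for the MLP are all hypothetical, the weights are untrained, and periodic images are ignored.

```python
import math

def build_edges(positions, cutoff):
    """Connect every pair of atoms closer than `cutoff` (no periodic images in this toy)."""
    edges = {i: [] for i in range(len(positions))}
    for i, p in enumerate(positions):
        for j, q in enumerate(positions):
            if i != j and math.dist(p, q) < cutoff:
                edges[i].append(j)
    return edges

def convolve(features, edges):
    """One message-passing step: add the sum of the neighbors' features to each node."""
    updated = []
    for i, own in enumerate(features):
        message = [sum(features[j][d] for j in edges[i]) for d in range(len(own))]
        updated.append([a + b for a, b in zip(own, message)])
    return updated

def linear_head(vec, weights, bias):
    """Stand-in for the final MLP: one linear layer mapping a site vector to a scalar."""
    return sum(w * x for w, x in zip(weights, vec)) + bias

# Toy chain M - O - M - O; the last O has only one metal neighbor.
positions = [(0.0, 0.0, 0.0), (2.0, 0.0, 0.0), (4.0, 0.0, 0.0), (6.0, 0.0, 0.0)]
features = [[1.0, 0.0], [0.0, 1.0], [1.0, 0.0], [0.0, 1.0]]  # one-hot: [metal, oxygen]
oxygen_sites = [1, 3]  # the unique sites for which a prediction is wanted

edges = build_edges(positions, cutoff=2.5)
hidden = convolve(features, edges)
predictions = {i: linear_head(hidden[i], weights=[0.5, 1.0], bias=0.0) for i in oxygen_sites}
```

Both oxygen atoms start with identical features, yet they end up with different outputs because the convolution mixes in their different local environments. That is the sense in which defect properties can be read from the ideal host structure.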

The dGNN model enables us to screen large databases. As a first exercise, we applied the model to over 2000 oxides in the Materials Project database that match the elements of the training set. Progressively stringent multi-objective selection criteria narrow this space down to a manageable number of candidates that are worth further investigation. We also connected the database screening with thermodynamic modeling, which facilitates the creation of experimentally relevant temperature-pressure diagrams and the analysis of the reduction entropy as an additional selection criterion in a high-throughput fashion.
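A progressive screening funnel of this kind can be sketched as an ordered list of filters. The record fields, compound labels, numbers, and thresholds below are invented for illustration and are not the criteria used in the paper.

```python
def screen(candidates, criteria):
    """Apply each (name, predicate) in order and record how many candidates survive."""
    survivors = list(candidates)
    funnel = [("start", len(survivors))]
    for name, keep in criteria:
        survivors = [c for c in survivors if keep(c)]
        funnel.append((name, len(survivors)))
    return survivors, funnel

# Hypothetical records; in practice the vacancy enthalpy would come from the ML model.
candidates = [
    {"id": "oxide-A", "min_vacancy_enthalpy_eV": 3.1, "hull_distance_eV": 0.00},
    {"id": "oxide-B", "min_vacancy_enthalpy_eV": 5.4, "hull_distance_eV": 0.01},
    {"id": "oxide-C", "min_vacancy_enthalpy_eV": 2.9, "hull_distance_eV": 0.12},
    {"id": "oxide-D", "min_vacancy_enthalpy_eV": 1.2, "hull_distance_eV": 0.02},
]

# Assumed criteria and thresholds, ordered from loosest to most specific.
criteria = [
    ("stable enough", lambda c: c["hull_distance_eV"] <= 0.05),
    ("vacancy enthalpy in window", lambda c: 2.0 <= c["min_vacancy_enthalpy_eV"] <= 5.0),
]

shortlist, funnel = screen(candidates, criteria)
```

Recording the survivor count after each criterion makes it easy to see which filter does most of the narrowing and to tune thresholds before committing to follow-up calculations.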

The dGNN approach is a versatile framework for all crystal/symmetry classes and chemistries, without the need to engineer descriptors and feature vectors. It is generally independent of the underlying graph encoding and convolutional architectures, so it can be readily integrated with new developments in these areas. The defect data and the codes implementing the dGNN approach are publicly available in Zenodo and GitHub repositories. We envision a broadening of the current chemical space, providing access to a much larger number of candidate materials, as well as future extensions to more complex defects that could further enhance materials performance in clean energy technologies.

Original research paper: Matthew D. Witman, Anuj Goyal, Tadashi Ogitsu, Anthony H. McDaniel, and Stephan Lany. Defect graph neural networks for materials discovery in high-temperature clean-energy applications. Nat. Comput. Sci. (2023).
