Behind the Paper

Machine learning insights into predicting biogas separation in metal-organic frameworks

This blog is written by Dr. Isabel Cooley, who completed her PhD in 2023 with Prof. Elena Besley at the University of Nottingham and is currently a postdoctoral fellow at Loughborough University.

Published in Chemistry, Materials, and Statistics

May 15, 2024

Elena Besley and Isabel Cooley

With the modern world increasingly affected by climate change, it is imperative to pursue every available avenue for increasing sustainability across all sectors. One such avenue is the use of biogas as an alternative fuel. It burns more cleanly than fossil fuels and can be obtained from renewable sources, in particular existing waste stocks from agricultural and industrial processes. However, a challenge in achieving more widespread use of biogas as a fuel is obtaining sufficiently pure gas streams. The cleanest and most efficient biofuel is composed of pure methane, CH4, but biogas sources contain contaminants including CO2. Efficient, effective, and sustainable methods to separate CO2/CH4 streams and upgrade the biogas are therefore needed. One possible method is separation by an appropriate porous material. Adsorption, the gathering of molecules on the material’s surface, is key here. A well-selected material can adsorb CO2 while leaving CH4 largely free to pass, and the process is enhanced by specific chemical interactions. In this work, we used computational chemistry and machine learning methods to search material databases for promising materials based on their predicted properties.

The materials we focused on are metal-organic frameworks (MOFs). These crystalline solids have well-defined structures in which fragments containing metal atoms bond to fragments containing carbon. By altering the chemical makeup of the fragments, their bonding geometry, or a combination of the two, it is theoretically possible to access hundreds of thousands of MOFs with different properties. Certain MOFs perform separations very well for specific targeted gas mixtures, through a separation process that depends on tailored chemical interactions.

Finding a MOF with excellent performance among the many possibilities is a formidable task; experimentally creating and testing even a small fraction of the potential structures is not possible. However, several databases exist that hold the chemical structures of large numbers of MOFs. These databases are a valuable tool: they make it possible to predict properties quickly for thousands of accessible structures, which in turn supports refinement and material selection. Computational statistical thermodynamics methods with a sufficiently accurate underlying model can make reasonable predictions, but they are slow. Machine learning models can be trained to reproduce their results much more quickly, expediting database searching and accelerating the development of new materials for biogas upgrading.

Useful as these databases are, some of the information they contain is flawed: some of the structures defy chemical rules and do not represent realistic materials. The scale of this issue is surprising given that MOF databases are ultimately based on experimentally characterised structures. Nevertheless, the problems exist in all major MOF databases and affect a large fraction of entries, often over 50%. They arise because data are handled on a large scale using automatic processing methods which miss chemical nuance. Where unreasonable chemical structures are present in materials searches, they can lead to misleading results, particularly if they are used to train machine learning models. We therefore curated the databases used in our work, applying several checks to the chemical structures and removing unreasonable data.
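The post does not spell out the individual checks, so the following is only a minimal sketch of one plausible kind of check: flagging structures in which two atoms sit unphysically close together. The function names, the 0.6 Å threshold, and the toy coordinates are all illustrative assumptions, and periodic boundary conditions are ignored for brevity.

```python
import math
from itertools import combinations

# Hypothetical threshold in angstroms: any two atoms closer than this are
# treated as overlapping. The value is an assumption chosen for illustration.
MIN_DISTANCE = 0.6


def has_overlapping_atoms(atoms, min_distance=MIN_DISTANCE):
    """Return True if any pair of atoms is closer than min_distance.

    atoms is a list of (element, (x, y, z)) tuples in Cartesian angstroms.
    Periodic images are not considered in this sketch.
    """
    for (_, pos_a), (_, pos_b) in combinations(atoms, 2):
        if math.dist(pos_a, pos_b) < min_distance:
            return True
    return False


def curate(structures):
    """Keep only the structures that pass the overlap check."""
    return {
        name: atoms
        for name, atoms in structures.items()
        if not has_overlapping_atoms(atoms)
    }


# Invented toy structures: one reasonable, one with two atoms on top of each other.
structures = {
    "toy_ok": [("Zn", (0.0, 0.0, 0.0)), ("O", (2.0, 0.0, 0.0)), ("C", (3.2, 0.5, 0.0))],
    "toy_bad": [("Zn", (0.0, 0.0, 0.0)), ("O", (0.1, 0.0, 0.0))],
}
kept = curate(structures)
print(sorted(kept))
```

A real curation pipeline would combine several such checks and would need to account for the periodic crystal lattice, which this sketch leaves out.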

Once suitable databases were selected and unreasonable structures removed, we used computational chemistry methods based on statistical thermodynamics to predict the ability of the MOFs to separate CO2 and CH4. We looked for MOFs expected to adsorb a high total quantity of CO2 while keeping CH4 adsorption low, and we picked out a handful of MOFs based on this. We also identified geometrical properties displayed by promising MOFs to help guide future searches.

Figure 1: Chemical structures of six promising MOFs selected by our computational chemistry search.
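The screening criterion described above (high CO2 uptake, low CH4 uptake) can be sketched in a few lines. One common way to express the trade-off is the adsorption selectivity, the ratio of adsorbed amounts divided by the ratio of gas-phase mole fractions. The post does not state which exact metrics or cut-offs were used, so the equimolar composition, the thresholds, and the uptake values below are assumptions made purely for illustration.

```python
import math


def selectivity(q_co2, q_ch4, y_co2=0.5, y_ch4=0.5):
    """Adsorption selectivity of CO2 over CH4.

    q_* are adsorbed amounts (same units), y_* are gas-phase mole fractions.
    The equimolar default composition is an assumption for illustration.
    """
    if q_ch4 == 0:
        return math.inf
    return (q_co2 / q_ch4) / (y_co2 / y_ch4)


def shortlist(uptakes, min_co2_uptake, min_selectivity):
    """Return MOF names passing both cut-offs, highest CO2 uptake first."""
    passing = [
        name
        for name, (q_co2, q_ch4) in uptakes.items()
        if q_co2 >= min_co2_uptake and selectivity(q_co2, q_ch4) >= min_selectivity
    ]
    return sorted(passing, key=lambda name: uptakes[name][0], reverse=True)


# Invented uptakes in mol/kg: name -> (CO2 uptake, CH4 uptake).
uptakes = {
    "MOF-A": (5.0, 0.5),
    "MOF-B": (6.0, 3.0),   # high CO2 uptake but poorly selective
    "MOF-C": (1.0, 0.05),  # selective but low CO2 uptake
    "MOF-D": (4.0, 0.4),
}
candidates = shortlist(uptakes, min_co2_uptake=3.0, min_selectivity=5.0)
print(candidates)
```

Requiring both a minimum uptake and a minimum selectivity rules out materials that do well on only one of the two measures, such as the toy entries MOF-B and MOF-C.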

Calculations of this kind, while faster than experiments, can take hours or even days each, leading to a sizeable total simulation time. Therefore, we used the results from this property search (the chemical structures of thousands of MOFs and predictions of their separation ability) as input data to train machine learning models that speed up similar property searches in future. These mathematical models find relationships between readily available data about a material and other, less accessible data. They can then be applied to new materials to predict the less accessible data almost instantly. In this case, the models predict separation ability from chemical structure and other readily available information. The models we trained were of two main types, regression and classification. Regression models make numerical predictions, so if they are of high enough quality their output can be highly nuanced and specific. Classification models were trained to label each MOF as either high-performing or low-performing; they may be used in the early stages of a property search to eliminate low-performing MOFs before any more laborious calculations are needed, focusing efforts on the best materials. We trained a variety of both regression and classification models for comparison.
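To make the distinction between the two model types concrete, here is a minimal sketch using a nearest-neighbour approach in plain Python. It is not the set of models trained in the paper (the post does not name them here); the descriptors (pore diameter and void fraction), the training values, and the high/low threshold are all invented for illustration.

```python
import math

# Toy training data: (features, simulated CO2 uptake in mol/kg).
# Features are (pore diameter in angstroms, void fraction); all values invented.
TRAIN = [
    ((4.0, 0.40), 5.2),
    ((4.5, 0.45), 4.8),
    ((5.0, 0.50), 4.1),
    ((9.0, 0.75), 1.5),
    ((10.0, 0.80), 1.2),
    ((11.0, 0.85), 0.9),
]
HIGH_THRESHOLD = 3.0  # hypothetical cut-off between "high" and "low" performers


def nearest(features, k):
    """Return the k training entries closest to features (Euclidean distance)."""
    return sorted(TRAIN, key=lambda item: math.dist(item[0], features))[:k]


def predict_uptake(features, k=3):
    """Regression: average the target value over the k nearest neighbours."""
    neighbours = nearest(features, k)
    return sum(target for _, target in neighbours) / len(neighbours)


def predict_class(features, k=3):
    """Classification: majority vote over neighbours labelled high or low."""
    neighbours = nearest(features, k)
    votes_high = sum(target >= HIGH_THRESHOLD for _, target in neighbours)
    return "high" if votes_high * 2 > len(neighbours) else "low"


new_mof = (4.2, 0.42)
print(predict_uptake(new_mof), predict_class(new_mof))
```

The regression function returns a number that can be ranked directly, while the classifier only returns a label, which is enough for discarding weak candidates early. In practice the features would be scaled before computing distances, and an established machine learning library would normally replace these hand-written functions.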

The models performed well on the training and validation sets from our computational chemistry search. We selected the most accurate of them and tested them on a set of unseen MOFs from a different database. We still saw strong performance, although, as expected, accuracy was lower than on the training data. During these tests, we also picked out the MOFs in the test set that the machine learning models identified as most likely to perform strongly.

Figure 2: Chemical structures of six promising MOFs selected by our machine learning search.
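Testing on unseen data of the kind described above usually comes down to comparing model predictions with reference values and then ranking the test set by predicted performance. The sketch below assumes mean absolute error for the regression models and accuracy for the classification models; the post does not say which metrics were actually reported, and the test-set names and numbers are invented.

```python
def mean_absolute_error(y_true, y_pred):
    """Average absolute difference between reference and predicted values."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def accuracy(labels_true, labels_pred):
    """Fraction of labels predicted correctly."""
    matches = sum(t == p for t, p in zip(labels_true, labels_pred))
    return matches / len(labels_true)


def label(uptake, threshold=3.0):
    """Hypothetical high/low labelling rule based on CO2 uptake."""
    return "high" if uptake >= threshold else "low"


def top_candidates(predicted, n):
    """Names of the n MOFs with the highest predicted uptake."""
    return sorted(predicted, key=predicted.get, reverse=True)[:n]


# Invented external test set: name -> (reference uptake, model prediction).
test_set = {"T1": (5.0, 4.5), "T2": (4.0, 2.5), "T3": (1.0, 1.5), "T4": (2.0, 1.0)}

reference = [ref for ref, _ in test_set.values()]
predicted = [pred for _, pred in test_set.values()]

mae = mean_absolute_error(reference, predicted)
acc = accuracy([label(r) for r in reference], [label(p) for p in predicted])
best = top_candidates({name: pred for name, (_, pred) in test_set.items()}, n=2)
print(mae, acc, best)
```

Computing the same metrics on the training data and on the external set gives a direct measure of how much accuracy is lost when the models meet MOFs from a different database.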

To understand the models as fully as possible and pave the way for their future improvement, we extensively analysed their accuracy in the context of the data. We compared the models to each other, identified the kinds of MOFs for which predictions were most accurate, and proposed ways in which accuracy could be improved in future work. In the interests of open access and reproducibility, all relevant scripts and datasets from this work can be downloaded from a GitHub repository.

Overall, in this work we have identified promising MOFs which may promote the sustainable purification of biogas mixtures. We have also made available fast and efficient methods which may be used to expedite similar structure searches in future. Therefore, this work can help to unlock the potential of biogas fuel in promoting sustainability and waste reduction across several sectors.

Communications Chemistry

An open access journal from Nature Portfolio publishing high-quality research, reviews and commentary in all areas of the chemical sciences.
