Deep learning approach to genome of two-dimensional materials with flat electronic bands

The rapidly expanding materials databases call for a new approach for materials discovery. Machine learning could help us navigate vast databases to find the materials of choice.
Materials define our physical world. Their properties are determined by the underlying electronic and crystal structures. With the development of computational science, new materials are being predicted every day, quickly building up massive materials databases. The real barrier between current and future technologies is therefore no longer a lack of materials but the difficulty of mining these prohibitively vast databases for the materials of choice.

In our paper “Deep learning approach to genome of two-dimensional materials with flat electronic bands”, we developed an algorithm that identifies 2D materials with flat bands and extracts their crystal fingerprints for clustering (see Fig. 1 for the architecture of our machine learning algorithm). The resulting charts of flat band sublattices could serve as a roadmap for future searches for 2D flat band lattices, enabling the exploration of strongly correlated physics. Our code is available at, and can be adopted or further enhanced to explore computed materials databases in search of materials with new and exotic properties.

Fig. 1 | Architecture of the machine learning algorithm used in our work. A CNN was trained to identify flat band materials using segmented band structure images from the database, followed by identification of the sublattices responsible for the flat dispersion using the element-projected DOS. Then, density-based clustering combined with t-SNE was used to classify the assigned structural fingerprints and to identify classes of flat band 2D materials. Figure from

In our work, we first train a convolutional neural network (CNN) to identify materials with flat bands across the whole 2D Materials Encyclopedia (2Dmatpedia) database, a supervised machine learning process. Identifying flat bands with machine learning is not straightforward, even for simple band structures. Previous works that relied on parametrized bands and a predefined bandwidth would overlook potential flat band materials because of, for example, band crossings. Instead, we use band structure images for flat band identification. A critical step is the segmentation of the band structure images, horizontally along the energy band width and vertically along the symmetry points (Fig. 1). This makes it easier for the CNN to recognize all the flat segments with high throughput and accuracy, and allows us to reveal materials with flat bands spanning the whole Brillouin zone (BZ). Using this method, we found 2127 plane-flat materials in the 2Dmatpedia database. At this point, in principle, one could go through all these flat band materials to find the one(s) that host interesting physics.
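
The segmentation step can be illustrated with a minimal sketch. It is not the code from our paper: the function names, the grid size, and the toy image are all assumptions made for illustration, and `looks_flat` is a trivial hypothetical stand-in for the trained CNN. The sketch cuts a band structure image into energy strips and path segments, then reports the energy strips that are judged flat in every path segment.

```python
import numpy as np


def segment_band_image(image, n_energy, n_kpath):
    """Split a 2D image into an n_energy x n_kpath grid of tiles.

    Rows of the image are taken as the energy axis and columns as the
    k-path axis (an assumption about the image layout).
    """
    height, width = image.shape
    row_edges = np.linspace(0, height, n_energy + 1).astype(int)
    col_edges = np.linspace(0, width, n_kpath + 1).astype(int)
    return [
        [image[row_edges[i]:row_edges[i + 1], col_edges[j]:col_edges[j + 1]]
         for j in range(n_kpath)]
        for i in range(n_energy)
    ]


def looks_flat(tile):
    """Hypothetical stand-in for the CNN classifier.

    A tile counts as flat if at least one pixel row is drawn across
    its full width. A trained network would replace this rule.
    """
    return bool(np.any(np.all(tile > 0, axis=1)))


def flat_energy_strips(image, n_energy, n_kpath, classify=looks_flat):
    """Return indices of energy strips that are flat in every k-path segment."""
    tiles = segment_band_image(image, n_energy, n_kpath)
    return [i for i, strip in enumerate(tiles) if all(classify(t) for t in strip)]


# Toy image: one perfectly flat band and one dispersive (staircase) band.
image = np.zeros((60, 90), dtype=np.uint8)
image[25, :] = 1                      # flat band across the whole path
for x in range(90):
    image[40 + x // 5, x] = 1         # dispersive band, never flat over a tile

flat_strips = flat_energy_strips(image, n_energy=6, n_kpath=3)
print(flat_strips)
```

Classifying small tiles rather than whole images means that a band crossing in one part of the image does not hide a flat segment elsewhere, which is the motivation for segmenting in the first place.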

But we took a step further and generalized the structural features that underpin flat bands. This is achieved through unsupervised machine learning, where we extract the sublattices that are responsible for the flat dispersion and classify them into clusters. Based on the spatial localization of the electrons forming the flat band, we conjectured that flat bands originate from a single element in the compound. We assessed the validity of this conjecture with the primary sublattice score (S1): a higher score means that the flat bands are dominated by one elemental sublattice. Among all the materials, more than three quarters have S1 > 0.7, strongly supporting our conjecture. The conjecture thus allows us to use the flat band element sublattice instead of the full crystal structure for further analysis, greatly simplifying the process while maintaining high accuracy.
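
To make the idea of a primary sublattice score concrete, here is a minimal sketch. It assumes, for illustration only, that S1 is the largest element-projected DOS weight inside the flat band energy window divided by the total weight in that window; the exact definition used in the paper may differ. The function name and the example weights are hypothetical.

```python
def primary_sublattice_score(projected_weights):
    """Return (dominant element, S1) for one material.

    projected_weights maps each element to its projected DOS weight
    integrated over the flat band energy window. Defining S1 as the
    dominant share of the total weight is an assumption of this sketch.
    """
    total = sum(projected_weights.values())
    if total <= 0:
        raise ValueError("total projected weight must be positive")
    element, weight = max(projected_weights.items(), key=lambda kv: kv[1])
    return element, weight / total


# Made-up weights for a hypothetical three-element compound.
element, s1 = primary_sublattice_score({"Cu": 8.0, "O": 1.5, "H": 0.5})
print(element, s1)
```

With these made-up weights the flat band would be attributed to the Cu sublattice, and the material would pass the S1 > 0.7 threshold quoted above.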

Fig. 2 | Clustering of the identified flat band sublattices. (a) Phylogenetic tree expression of the hierarchical relations. (b) t-SNE 2D visualization of the structure fingerprint space; different coordination patterns are color-coded. Insets are representative coordination templates, where atoms in different atomic planes of the templates are colored differently. Figure from

The extracted flat band sublattices are then each represented as a 244-dimensional vector for further clustering based on their structural fingerprints. A combination of the density-based algorithm HDBSCAN and a t-SNE plot was employed to achieve an optimal clustering that represents both global clusters and local neighbourhood information. Our results are visualised as both a phylogenetic tree and a 2D t-SNE plot to emphasize the structural fingerprints that are responsible for flat bands and the evolution among sublattices (see Fig. 2). Many of the sublattice structures identified in our work have been confirmed in the literature, yet we also identified many new sublattices that lie outside the known paradigm. Future work is expected to reveal their origin and the rich physics behind them, and to complete our understanding of the structure-property relation of materials.
