Materials define our physical world. Their properties are determined by their underlying electronic and crystal structures. With the development of computational science, new materials are predicted every day and quickly accumulate into massive materials databases. The real barrier between current and future technologies is no longer a lack of materials, but the difficulty of mining these prohibitively vast databases to find the materials of choice.
In our paper “Deep learning approach to genome of two-dimensional materials with flat electronic bands”, we developed an algorithm to identify 2D materials with flat bands and to extract their crystal fingerprints for clustering (see Fig. 1 for the architecture of our machine learning algorithm). The resulting flat band sublattice charts could serve as a roadmap for searching for 2D flat band lattices in the future, enabling the exploration of strongly correlated physics. Our code is available at https://github.com/Anupam-Bh/ML_2D_flat_band, and can be adopted or further enhanced to explore computed materials databases in search of materials with new and exotic properties.
In our work, we first train a convolutional neural network (CNN) to identify materials with flat bands across the whole 2D Materials Encyclopedia (2Dmatpedia) database, which is a supervised machine learning process. Identifying flat bands with machine learning is not a straightforward task, even for simple band structures. Previous works that relied on parametrized bands and a predefined bandwidth would overlook potential flat band materials because of, for example, band crossings. Instead, we use band structure images for flat band identification. A critical step is the segmentation of the band structure images, horizontally along the energy bandwidth and vertically along the symmetry points (Fig. 1). This makes it easier for the CNN to recognize all the flat segments with high throughput and accuracy, and allows us to reveal materials with flat bands spanning the whole Brillouin zone (BZ). Using this method, we found 2127 plane-flat materials in the 2Dmatpedia database. At this point, in principle, one could go through all these flat band materials to find the one(s) hosting interesting physics.
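The segmentation step can be illustrated with a minimal sketch. It assumes the band structure is already rasterized as a 2D array whose rows run along the energy axis and whose columns run along the k-path. The function name `segment_band_image`, the equal-height energy strips, and the `kpath_boundaries` argument are hypothetical choices made for illustration; they are not taken from the published code.

```python
import numpy as np

def segment_band_image(image, n_energy_strips, kpath_boundaries):
    """Cut a band-structure image into tiles (illustrative sketch only).

    image: 2D array, rows = energy axis, columns = k-path axis.
    n_energy_strips: number of equal-height strips along the energy axis (assumption).
    kpath_boundaries: column indices of the symmetry points, including 0 and the width.
    Returns a dict mapping (strip_index, segment_index) to a tile.
    """
    row_edges = np.linspace(0, image.shape[0], n_energy_strips + 1).astype(int)
    tiles = {}
    for i in range(n_energy_strips):
        for j in range(len(kpath_boundaries) - 1):
            tiles[(i, j)] = image[
                row_edges[i]:row_edges[i + 1],
                kpath_boundaries[j]:kpath_boundaries[j + 1],
            ]
    return tiles

# Toy image: a single perfectly flat "band" drawn at row 40.
image = np.zeros((120, 300))
image[40, :] = 1.0

tiles = segment_band_image(image, n_energy_strips=4, kpath_boundaries=[0, 100, 200, 300])

# The flat band lights up every k-segment of the same energy strip.
flat_strip_sums = [float(tiles[(1, j)].sum()) for j in range(3)]
```

Each tile could then be passed to a classifier on its own. A band that is flat across the whole Brillouin zone shows up as a flat segment in every k-path tile of one energy strip.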
But we went a step further and generalized the structural features that underpin flat bands. This is achieved through unsupervised machine learning, where we extract the sublattices responsible for the flat dispersion and classify them into clusters. Based on the spatial localization of the electrons forming the flat band, we conjectured that flat bands originate from a single element in the compound. We assessed the validity of this conjecture with a primary sublattice score (S1): a higher score means the flat bands are dominated by one elemental sublattice. Among all the materials, more than three quarters have S1 > 0.7, strongly supporting our conjecture. The conjecture thus allows us to use the flat band element sublattice instead of the full crystal structure for further analysis, greatly simplifying the process while maintaining high accuracy.
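As a rough illustration of how a score like S1 could work, the sketch below assumes that S1 is the fraction of the flat band weight carried by the most dominant element. This definition, the function name `primary_sublattice_score`, and the example weights are assumptions made for illustration only; the exact definition is given in the paper.

```python
def primary_sublattice_score(element_weights):
    """Return the dominant element and its share of the total flat-band weight.

    element_weights: dict mapping element symbol -> non-negative weight on the
    flat band (hypothetical input, e.g. an element-projected contribution).
    """
    total = sum(element_weights.values())
    if total <= 0:
        raise ValueError("element weights must sum to a positive number")
    element, weight = max(element_weights.items(), key=lambda kv: kv[1])
    return element, weight / total

# Made-up weights for a hypothetical three-element compound.
weights = {"Cu": 8.0, "O": 1.5, "H": 0.5}
element, s1 = primary_sublattice_score(weights)

# With these toy numbers the flat band is dominated by a single element.
is_single_element_dominated = s1 > 0.7
```

Under this assumed definition, a material with a score above the 0.7 threshold quoted above would be analysed through the sublattice of its dominant element only.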
The extracted flat band sublattices are then each represented as a 244-dimensional vector for clustering based on their structural fingerprints. A combination of the density-based algorithm HDBSCAN and a t-SNE plot was employed to achieve an optimal clustering that represents both global clusters and local neighbourhood information. Our results are visualised as both a phylogenetic tree and a 2D t-SNE plot to emphasize the structural fingerprints responsible for flat bands and the evolution among sublattices (see Fig. 2). Many of the sublattice structures identified in our work have been confirmed in the literature, yet we also identified many new sublattices that lie outside the known paradigm. Future work is expected to reveal their origin and the rich physics behind them, and to complete our understanding of the structure-property relations of materials.