Learning low-rank latent mesoscale structures in networks
Researchers in many fields use networks to encode interactions between entities in complex systems. To study the large-scale behavior of complex systems, it is useful to examine mesoscale structures in networks as building blocks that influence such behavior. In many studies of mesoscale structures, subgraph patterns (i.e., the connection patterns of small sets of nodes) have been studied as building blocks of network structure at various mesoscales. In particular, researchers often identify motifs as k-node subgraph patterns (where k is typically between 3 and 5) of a network. These patterns are unexpectedly more common than in a baseline network (which is constructed through a random process). In the last two decades, the study of motifs has yielded insights into networked systems in many areas, including biology, sociology, and economics. However, not much is known about how to use such motifs (or related mesoscale structures), after their discovery, as building blocks to reconstruct a network.
Networks have low-rank subgraph structure
The figure above shows examples of 20-node subgraphs that include a path (in red) that spans the entire subgraph. These subgraphs are subsets of various real-world and synthetic networks. The depicted subgraphs have diverse connection patterns, which depend on the structure of the original network. One of our key findings is that roughly speaking, these subgraphs have specific patterns. Specifically, we find that many real-world networks possess a small set of latent motifs that effectively approximate most subgraphs at a fixed mesoscale. The figure below shows "network dictionaries" of 25 latent motifs for Facebook “friendship” networks from the universities UCLA and Caltech Facebook. By using subgraph sampling and nonnegative matrix factorization, we are able to discover these latent motifs.



Selecting appropriate latent motifs is essential for our approach. These latent motifs act as building blocks of a network; they help us recreate different parts of it. If we don't pick an appropriate set of latent motifs, we cannot expect to accurately capture the structure of a network. This is analogous to using the correct puzzle pieces to assemble a picture. If we use the wrong pieces, we will assemble a picture that doesn’t match the original picture.
Motivating Application: Anomalous-subgraph detection
A common problem in network analysis is the detection of anomalous subgraphs of a network. The connection pattern of an anomalous subgraph distinguishes it from the rest of a network. This anomalous-subgraph-detection problem has numerous high-impact applications, including in security, finance, and healthcare.



Our method can also be used for "link-prediction" problems, in which one seeks to figure out the most likely new edges given an observed network structure. The reasoning is similar to that for anomalous-subgraph detection. We learn latent motifs from an observed network and use them to reconstruct the network. The edges with the largest weights in the reconstructed network that were non-edges in the observed network are our predictions as the most likely edges. As we show in the figure below, our latent-motif approach is competitive with popular methods for anomalous-subgraph detection and link prediction.
Conclusion
We introduced a mesoscale network structure, which we call latent motifs, that consists of k-node subgraphs that are building blocks of the connected k-node subgraphs of a network. By using combinations of latent motifs, we can approximate k-node subgraphs that are induced by uniformly random k-paths of a network. We also established algorithmically and theoretically that one can accurately approximate a network if one has a dictionary of latent motifs that can accurately approximate mesoscale structures of the network.
Our computational experiments demonstrate that latent motifs can have distinctive network structures and that various social, collaboration, and protein–protein interaction networks have low-rank mesoscale structures, in the sense that a few learned latent motifs are able to reconstruct, infer, and denoise the edges of a network. We hypothesize that such low-rank mesoscale structures are a common feature of networks beyond the examined networks.
Acknowledgements
We thank Manolo Blaufuss, Binhao Chen, Gopal Hari, Agam Goyal, Sara L. Murley, Krirk Nirunwiroj, Eunice Son, Bella Wu, and Haiyuan Yu for helpful comments on a draft of this blog entry.
Follow the Topic
-
Nature Communications
An open access, multidisciplinary journal dedicated to publishing high-quality research in all areas of the biological, health, physical, chemical and Earth sciences.
Related Collections
With collections, you can get published faster and increase your visibility.
Applications of Artificial Intelligence in Cancer
Publishing Model: Open Access
Deadline: Mar 31, 2025
Biology of rare genetic disorders
Publishing Model: Open Access
Deadline: Apr 30, 2025
Please sign in or register for FREE
If you are a registered user on Research Communities by Springer Nature, please sign in