Structural balance theory (SBT) is a long established framework for studying signed networks consisting of positive and negative ties (e.g. friendship and enmity relations) and the resulting patterns of polarization, or division into mutually hostile groups with positive in-group and negative out-group ties. The origins of SBT can be traced back at least to Gestalt psychology and the work of Fritz Heider, but quickly turned out to be of much interest to sociologists and more recently also to network physicists and computer scientists.
The core idea of SBT, as formalized in graph-theoretical terms in 1956 by Cartwright and Harrary, is the realization that a signed network can be partitioned into two mutually hostile groups if and only if all of its cycles are positive, that is, have positive products over their constitutive edges. This result was soon generalized to the problem of a division into an arbitrary number of groups by Davis who showed that in this more general case the requirement is that a network has no cycles with exactly one negative edge.
Real-world networks are of course hardly ever ordered enough to satisfy the SBT conditions exactly, but we may still be interested in the extent to which they satisfy them, or how close they are to being perfectly balanced and clustered into antagonistic groups (the figure below provides some examples of the connection between positive/negative cycles and polarization) . This is why a lot of work in SBT has been concerned with developing different measures of Degree of Balance (DoB). Many methods have been proposed but one can say, at some risk of oversimplification, that typically they try to assess the relative frequencies of positive and negative cycles.
Hence, in general the situation seems to be simple – it is enough to determine the signs of all cycles and aggregate them into a single numerical descriptor to know the degree of balance of a network (or how “structurally balanced” it is, following the lingo of the theory), right?
Not so fast. Unfortunately, the situation is not as simple in practice for several reasons. This is why SBT is still so interesting even after several decades of intensive research. Below I briefly describe several key issues:
- Enumerating and counting cycles is computationally prohibitive. At least in large graphs where their number grows combinatorially (as a matter of fact it is known that enumeration of Hamiltonian cycles is a #P-complete problem). Thus, in order to scale SBT to large systems we need smart approximations.
- It is not clear how DoB across different cycle lengths should be aggregated. On one hand, according to the clusterability conditions of SBT, we should care only about the overall relative frequency of positive cycles. However, already in 1956 Cartwright and Harrary noted that perhaps a reasonable DoB measure should consider shorter cycles more important. This intuition was quickly corroborated by empirical evidence provided by social psychologists such as Robert Zajonc (who is the patron of my institute at the University of Warsaw) and Eugene Burnstein, who showed that it is easier for people to memorize the valences of ties in shorter cycles. In more mathematical terms, it is easy to define a DoB measure for cycles of a particular length simply as a fraction of positive cycles. But it is not at all clear how such partial DoB measures should be aggregated across different cycle lengths.
- It is not clear how cycle-based DoB can be used for finding mutually antagonistic clusters. It is of course useful to have a DoB score for a network, but arguably often it is even more useful to be actually able to find antagonistic clusters. In this context it is helpful to use the notion of a network partition and its frustration. Namely, the frustration count of a network partition is the number of frustrated edges, that is, positive out-group and negative in-group edges. With this notion in hand, it is easy to see that it would be ideal to have DoB measures that are helpful for finding low frustration network partitions. However, it is often not immediately clear how to connect DoB measures to this problem.
Multiscale Semiwalk Balance
In a recently published paper, we have proposed a comprehensive Multiscale Semiwalk Balance (MSB) framework that consistently addresses all the above-mentioned problems (and some more), which was inspired by multiple earlier works, but particularly by the walk-based approach proposed in 2014 by Ernesto Estrada and Michele Benzi. The framework we propose is computationally efficient, general and multiscale in the following senses:
- Efficient. It approximates cycles with closed walks, which can be counted very efficiently with standard methods of linear algebra. Even though at first such an approximation may seem to be somewhat unjustified, there are very good reasons for using it. Firstly, the SBT conditions have been formulated in terms of cycles, but analogous conditions expressed in terms of closed walks are logically equivalent as a presence of a negative cycle implies a presence of negative closed walk and vice versa. Moreover, in a 2019 paper Ernesto Estrada argued convincingly that, from the perspective of a dynamical structural balance and diffusion on signed graphs, it is actually walks which are more important than cycles. Thus, even though the debate on the strengths and weaknesses of cycle- and walk-based formalization of structural balance theory has only recently started, there is a chance that walk-based methods may paint a fuller picture of SBT (and anyway, as we show in our work, the numerical differences between them are rather small).
- General. It can be applied to any (simple) signed network including directed and/or weighted networks.
- Multiscale. It provides both “grayscale” DoB measures at particular closed walk lengths as well as a global score coinciding with a particular weighted average of the “grayscale” measures. The weighting scheme is defined following the Locality Principle – shorter walks should be more important than longer ones. Moreover, MSB also defines DoB-like measures for particular nodes and for pairs of nodes. The latter allows for designing effective SBT-aware clustering methods useful for describing the mesoscopic network structure.
In the paper we tried to do the justice and delineate the fascinating background of previous works thanks to which we were able to develop our framework (you can find citations for all observations and claims from this post, and many more, in the paper, especially in the introduction part). We also illustrated the validity of our approach based on several empirical case studies. In particular:
- We did a re-analysis of the famous Sampson’s Monks dataset and showed that our methods allows for finding partitions with lower frustration than a popular “ground truth” solution, which also helps to get some important insight in the social dynamics driving the social structure of the network and its evolution in time.
- Applied our framework to a sequence of networks of cooperation ties between representatives and senators in the U.S. Congress over several decades and found strong patterns of increasing polarization within the legislature (a result consistent with multiple other studies).
Last but not least, the paper is accompanied by a user-friendly Python implementation of all the measures defined in our framework. With this contribution, we hope to offer a valuable resource for a variety of scholars interested in studying polarization, signed networks, and structural balance. We proposed a consistent and general SBT framework that is applicable to all kinds of simple signed networks (also directed and/or weighted), while also being an efficient computational tool allowing scaling of SBT-motivated analyses even to very large systems.
Acknowledgments
I thank Massimo Stella and Andreia Sofia Teixeira (who both co-authored the paper) for contributing to writing this blog post.
Please sign in or register for FREE
If you are a registered user on Research Communities by Springer Nature, please sign in