Deep learning as Ricci flow

Deep neural networks (DNNs) efficiently approximate complex data distributions by applying geometric transformations to the data. While some understanding exists for smooth activation functions, non-smooth functions like the widely used ReLU need further exploration.

Geometry in the context of Deep Learning

The use of geometry in deep learning is a dynamic field, significantly shaped by the seminal works of Michael Bronstein, which have established Geometric Deep Learning (GDL) as a framework for understanding deep learning from a geometric perspective. This approach provides a unified taxonomy of the field and fosters a clearer and deeper understanding of this rapidly evolving discipline. We encourage interested readers to explore the blog posts published by M. Bronstein on GDL:

Towards Geometric Deep Learning I: On the Shoulders of Giants

Towards Geometric Deep Learning II: The Perceptron Affair

Towards Geometric Deep Learning III: First Geometric Architectures

The GDL framework provides a new perspective on deep learning through the lens of geometry. Within this context, the idea that deep neural networks (DNNs) orchestrate topological and geometric changes in data structures has led some researchers to explore differential geometry for analysing DNN efficacy. For instance, a Riemannian geometry framework has been proposed in which each activation layer of the DNN is treated as a coordinate transformation of the previous layer. While this analytical approach is elegant, it relies on continuity assumptions, making it inapplicable to learning with non-smooth activation functions like the widely used rectified linear unit (ReLU). There is therefore a need for new geometric tools that can practically assess the topological simplifications made by deep learning models and aid the intuitive interpretation of the underlying process.

Ricci flow in the context of Geometry

The dynamics of topological spaces, especially manifolds, have been a focal point in differential geometry. Hamilton's seminal work on Ricci flow provided a powerful tool for exploring the topological implications of deforming a manifold's metric according to its Ricci curvature, ultimately leading to Perelman's groundbreaking solution to the Poincaré conjecture. In other words, this differential geometry tool smooths the curvature of a manifold to reveal its underlying topology.
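Concretely, Hamilton's flow evolves a Riemannian metric g in the direction of its Ricci curvature:

```latex
\frac{\partial g_{ij}}{\partial t} = -2\, R_{ij}
```

so positively curved regions contract while negatively curved regions expand, progressively smoothing the geometry.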

In recent years, discrete analogues of differential geometric tools have emerged, particularly for defining the curvature of a graph, which can be seen as a discrete equivalent of a manifold. In this context, several definitions have been proposed, such as the Forman-Ricci curvature, rooted in discrete Morse theory, and the Ollivier-Ricci curvature, derived from optimal transport theory. Discrete versions of the Ricci flow have also been formulated for these notions of curvature.
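To make the Ollivier-Ricci notion concrete, here is a minimal, unoptimised sketch that computes the curvature of a single edge of an unweighted graph by solving the underlying optimal-transport problem directly; the idleness parameter alpha = 0.5 is an arbitrary choice, and dedicated libraries such as GraphRicciCurvature are better suited to real workloads:

```python
import networkx as nx
import numpy as np
from scipy.optimize import linprog

def ollivier_ricci(G, x, y, alpha=0.5):
    """kappa(x, y) = 1 - W1(mu_x, mu_y) / d(x, y) for a connected, unweighted graph."""
    # Shortest-path distances serve as the ground metric for W1.
    d = dict(nx.all_pairs_shortest_path_length(G))
    # Support of the two lazy random-walk measures: x, y and their neighbours.
    support = sorted({x, y} | set(G[x]) | set(G[y]))
    idx = {v: i for i, v in enumerate(support)}
    n = len(support)

    def measure(v):
        # Lazy walk: stay at v with probability alpha, else step to a uniform neighbour.
        mu = np.zeros(n)
        mu[idx[v]] = alpha
        for w in G[v]:
            mu[idx[w]] += (1 - alpha) / G.degree(v)
        return mu

    mu_x, mu_y = measure(x), measure(y)
    # Transport cost between support points, flattened row-major like the plan pi.
    cost = np.array([[d[u][v] for v in support] for u in support]).ravel()
    # Marginal constraints: row sums of pi equal mu_x, column sums equal mu_y.
    A_eq = np.zeros((2 * n, n * n))
    for i in range(n):
        A_eq[i, i * n:(i + 1) * n] = 1
        A_eq[n + i, i::n] = 1
    b_eq = np.concatenate([mu_x, mu_y])
    W1 = linprog(cost, A_eq=A_eq, b_eq=b_eq).fun
    return 1.0 - W1 / d[x][y]

G = nx.karate_club_graph()
print(ollivier_ricci(G, 0, 1))  # curvature of one edge
```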

In the context of neural network (NN) layers, graphs can be constructed using the k-nearest neighbors (kNN) method, where each node is connected to its k nearest nodes under a chosen distance measure. Then, for each layer of the NN, a corresponding graph is created, and a Ricci curvature can be defined on each graph. Based on this construction, we can explore to what extent the evolution between two successive layers of the NN is equivalent to a Ricci flow on the corresponding graphs built from those layers.
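For concreteness, a minimal sketch of how such per-layer graphs might be built with scikit-learn; the list `activations` (one matrix of hidden activations per layer) and the choice k = 10 are illustrative assumptions:

```python
import networkx as nx
from sklearn.neighbors import kneighbors_graph

def layer_graphs(activations, k=10):
    """Build one kNN graph per layer from that layer's activation matrix."""
    graphs = []
    for X in activations:  # X: (n_samples, n_features) array for one layer
        # Sparse adjacency weighted by Euclidean distance to the k nearest points.
        A = kneighbors_graph(X, n_neighbors=k, mode="distance")
        # Symmetrise: keep an edge if either endpoint selected the other.
        A = A.maximum(A.T)
        graphs.append(nx.from_scipy_sparse_array(A))
    return graphs
```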

Building on the Ricci flow idea, we propose that the geometric transformations executed by deep neural networks (DNNs) during classification tasks are analogous to the transformations a manifold undergoes during Hamilton's Ricci flow.

Deep Learning as Ricci flow

Central to the power of deep neural networks (DNNs) is their ability to generalise, which has been linked to their capacity to identify broad geometric principles. Several studies have shown that consistent geometric and topological changes occur in data clouds as they pass through the layers of a well-trained DNN classifier. These transformations highlight the network's ability to disentangle complex data geometries, a behaviour conceptually similar to that expected in Ricci flow.

We present a computational framework that adapts tools from discrete Ricci flow to study the transformations occurring in DNNs. This framework introduces a novel metric called the Ricci coefficient, which measures the strength and significance of global Ricci network flow-like behaviour in the data clouds as they propagate through the layers of a DNN. The Ricci coefficient provides insight into how networks separate distinct data points while converging similar ones, a balance critical for successful classification.
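The paper gives the precise definition of the Ricci coefficient; as a rough, hypothetical illustration of the idea only, one could measure how strongly edge curvature anti-correlates with the change in edge length between consecutive layers, since Ricci flow contracts positively curved directions. The function below and its inputs (`kappa`, `dist_l`, `dist_next`) are illustrative assumptions, not the paper's implementation:

```python
import numpy as np
from scipy.stats import pearsonr

def ricci_coefficient(kappa, dist_l, dist_next):
    """kappa: {edge: Ollivier-Ricci curvature at layer l};
    dist_l, dist_next: {edge: edge length at layers l and l+1}."""
    edges = [e for e in kappa if e in dist_l and e in dist_next]
    k = np.array([kappa[e] for e in edges])
    delta = np.array([dist_next[e] - dist_l[e] for e in edges])
    r, p = pearsonr(k, delta)
    # Negate so a positive coefficient means flow-like behaviour:
    # positively curved edges contract, negatively curved edges expand.
    return -r, p
```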

Figure 1: Datasets for binary classification and the DNN architectures trained.

Through an empirical study (Figure 1) involving more than 1500 DNN classifiers of varying architectures (depth and width) and datasets (both synthetic and real-world), we found that stronger Ricci flow-like behaviour correlates with higher classification accuracy, independently of the network's architecture or the dataset. This suggests that the Ricci coefficient could serve as a predictor of a DNN's performance on a specific task, such as classification, and it reveals a trade-off between the separation of disparate points and the convergence of similar points, highlighting the connection between manifold simplification and data separation.

To conclude, our framework is both computationally and conceptually simple, offering a scalable way to assess DNN behaviour. It provides a new lens for evaluating deep learning models. These findings motivate further investigation into the use of differential and discrete geometry to enhance the explainability and transparency of DNNs.
