As computational neuroscientists trained in mathematics, engineering, and neuroscience, we have long worked on theoretical neuroscience and computational models of brain dynamics. Because of this background, extending our work toward brain disease diagnosis felt like a natural next step. We were initially surprised to discover that, with epilepsy as a notable exception, direct measures of brain activity play almost no role in clinical diagnosis and therapy. This gap seemed like an ideal opportunity for theory-driven approaches to make a meaningful impact.
However, as we explored the literature, we found many unsuccessful attempts to develop reliable brain activity-based biomarkers. Naturally, we thought we could do better. But after several years of analysing fMRI and M/EEG data to classify patients with disorders such as Parkinson’s disease, Alzheimer’s disease, and anxiety, we encountered the same frustration ourselves. Despite sophisticated analyses and increasingly powerful AI methods, classification performance rarely reached a level that would be genuinely useful or reliable in clinical practice. The literature is filled with similar attempts that produced promising but ultimately non-generalisable results. This led us to rethink the entire program of defining biomarkers of brain diseases. This perspective paper is an outcome of that reflection.
Why do brain-based biomarkers fail so often?
One major reason is the degeneracy of brain features. Different underlying brain mechanisms can produce the same behavioral symptom, while similar brain activity patterns may lead to different outcomes depending on the individual and the context. Therefore, there is no simple one-to-one mapping between symptoms and brain activity measurements.
This problem is amplified by the standard cohort-comparison approach (see the cartoon illustration). Most biomarker studies compare groups of patients against healthy controls while implicitly assuming that each group is internally homogeneous and that there is a one-to-one mapping between brain activity and brain disorders. However, if the same diagnosis can arise from very different underlying brain states, averaging across cohorts may actually obscure the relevant mechanisms rather than reveal them.
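The masking effect of cohort averaging can be illustrated with a toy calculation. The sketch below uses made-up numbers (not data from any study) for a single hypothetical brain feature, and it assumes that the patient cohort contains two hidden subgroups that deviate from controls in opposite directions.

```python
from statistics import mean

# Hypothetical values of one brain feature (arbitrary units); illustrative only.
controls = [0.9, 1.0, 1.1, 1.0]
patients_subgroup_a = [1.9, 2.0, 2.1]   # assumed subgroup with an elevated feature
patients_subgroup_b = [-0.1, 0.0, 0.1]  # assumed subgroup with a reduced feature
patients = patients_subgroup_a + patients_subgroup_b

# Cohort-level comparison: absolute difference between the two group means.
group_difference = abs(mean(patients) - mean(controls))

# Individual-level view: how far each patient sits from the control mean.
control_mean = mean(controls)
individual_deviations = [abs(p - control_mean) for p in patients]

print(f"group mean difference:         {group_difference:.2f}")
print(f"smallest individual deviation: {min(individual_deviations):.2f}")
```

In this constructed case the group means coincide, so a cohort comparison would report no effect, even though every single patient lies far from the control mean. Real cohorts are of course noisier and less symmetric; the point is only that a null group difference does not imply that individuals are unaffected.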
Another issue is what we describe as the subsampling problem. Brain function emerges from interactions across multiple scales: molecular, cellular, network, behavioral, and environmental. Yet most studies sample only a small part of this system, often focusing on a single modality or spatial scale. Non-invasive recordings of brain activity (e.g. fMRI and M/EEG) remain measurements at a particular macro-scale. Even though these data are highly multivariate, they still capture only a limited slice of the processes underlying disease.
Moreover, these measurements are often only snapshots taken during an individual’s lifetime. Even when longitudinal data are available, detecting meaningful differences remains extremely challenging. And when differences are found, interpreting them is often an ill-posed problem because of the degeneracy present at this level of organization.
The sheer number of possible analytical approaches further complicates the situation. With enough preprocessing choices, connectivity metrics, machine-learning pipelines, and statistical methods, it becomes easy to generate findings that are difficult to reproduce or interpret. This is why we advocate for more theory- and model-driven analyses rather than purely data-driven exploration.
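A small simulation can make this concern concrete. The sketch below is a deliberately simplified stand-in, not a model of any real analysis: each "pipeline" is assumed to produce predictions that are independent of the diagnostic labels, as would happen if the data contained no signal at all. The cohort size and the number of pipelines are arbitrary, hypothetical choices.

```python
import random

random.seed(0)  # fixed seed so the sketch is repeatable

N_SUBJECTS = 40     # hypothetical cohort size
N_PIPELINES = 200   # hypothetical number of analysis variants tried

# Balanced diagnostic labels (0 = control, 1 = patient).
labels = [0, 1] * (N_SUBJECTS // 2)


def accuracy_of_random_pipeline():
    """Accuracy of predictions that are independent of the labels.

    Stands in for one analysis variant applied to data with no real signal.
    """
    predictions = [random.randint(0, 1) for _ in labels]
    hits = sum(p == y for p, y in zip(predictions, labels))
    return hits / len(labels)


accuracies = [accuracy_of_random_pipeline() for _ in range(N_PIPELINES)]

mean_accuracy = sum(accuracies) / len(accuracies)
best_accuracy = max(accuracies)

print(f"mean accuracy over pipelines: {mean_accuracy:.2f}")
print(f"best single pipeline:         {best_accuracy:.2f}")
```

Averaged over all variants, accuracy stays near chance, but the best variant looks better than chance purely through selection. Real pipelines are correlated with one another and are usually evaluated with cross-validation, so the size of the effect will differ in practice; the sketch only shows why reporting the best of many analyses is hazardous.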
There are also practical considerations. Advanced neuroimaging methods such as MEG and fMRI are expensive, computationally demanding, and often require extensive tuning and expertise. Complex analyses are unlikely to become routine clinical tools if they remain too time-consuming or opaque for clinicians to use and interpret confidently.
So what kind of approach should we encourage instead?
We argue for a broader and more individualized framework for biomarker discovery. Rather than searching for single “magic” biomarkers, we likely need multimodal approaches that combine genetics, neurotransmitter measurements, imaging, electrophysiology, behavior, and other measurements simultaneously. Complex brain disorders cannot realistically be reduced to one feature alone.
We also believe longitudinal data are essential, because progress depends on comparing a patient with their own past state. Disease progression often unfolds dynamically, and these trajectories may carry more informative signals than static snapshots.
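As a minimal illustration of why a patient's own baseline can be more informative than a cohort reference, the sketch below standardizes the same measurement in two ways. All numbers are hypothetical and chosen for illustration; the z-score is simply one convenient way to express the comparison, not a proposed clinical criterion.

```python
from statistics import mean, stdev


def z_score(value, reference):
    """Standardized distance of `value` from a reference sample."""
    return (value - mean(reference)) / stdev(reference)


# Hypothetical values of one biomarker (arbitrary units); illustrative only.
cohort_reference = [8.0, 9.0, 10.0, 11.0, 12.0, 13.0, 14.0]  # one value per healthy control
own_baseline = [10.0, 10.2, 9.8, 10.1]  # earlier visits of a single patient
latest_visit = 11.5                     # the same patient's most recent visit

z_vs_cohort = z_score(latest_visit, cohort_reference)
z_vs_self = z_score(latest_visit, own_baseline)

print(f"z relative to cohort:       {z_vs_cohort:.2f}")
print(f"z relative to own baseline: {z_vs_self:.2f}")
```

Because variability between individuals is assumed here to be much larger than variability within one individual, the latest value looks unremarkable against the cohort but stands out clearly against the patient's own history.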
At the same time, patients may not belong to one uniform disease category but rather to biological subgroups or “solution islands.” Individuals sharing similar trajectories or multimodal profiles could form clinically meaningful subtypes, even if they currently receive the same diagnosis.
Finally, we advocate for mechanistic and interpretable AI approaches. AI systems should not simply output classifications; they should help explain why a prediction was made and ideally provide insight into the mechanisms involved. Black-box accuracy alone is unlikely to be sufficient for clinical neuroscience. What is needed are interpretable, biologically grounded models that can connect scales of brain organization and guide meaningful biomarker discovery.
Many online repositories of clinical data are now available, and they entice AI/ML experts with exciting data analysis (classification) challenges. However, most repositories do not contain longitudinal data. Furthermore, multimodal data on brain activity, chemistry, connectivity, and structure are often collected from different subjects rather than from the same individuals. We argue that such repositories are not useful for the individualized, multimodal biomarker discovery described above. Thus, our perspective also implicitly offers guidelines for choosing a suitable online database.
Ultimately, our perspective emerged less from a single successful result than from repeated difficulties and limitations encountered while working with brain data. In many ways, this paper is an invitation to reconsider some of the assumptions underlying current biomarker research – and to encourage approaches that are broader, more interpretable, and more centered on individual dynamics. Of course, this is a very difficult path – but we should not continue losing time on simpler approaches that are ultimately doomed to fail.