Our 20-Year Journey.
High-content screening (HCS) has become a core tool in early-stage drug discovery. Over the past two decades, labs around the world have applied high-content imaging to profile cellular responses induced by small molecules. These rich, high-dimensional "phenotypic profiles" can reveal how a compound acts, often without any knowledge of its target or structure. Our journey began in 2004 as a collaboration with the Mitchison lab, when we demonstrated that automated microscopy and machine learning could be combined to characterize drug function by quantifying and comparing cellular responses induced by treatment. This work showed that compounds with similar mechanisms of action produced similar cellular phenotypes, and conversely that similar phenotypes pointed to similar mechanisms, even when measured only through microscopy.
A few years later, we expanded this work by examining patterns of single-cell heterogeneity, showing that the variability within cell populations also carries important biological information. Since then, we and the broader field have seen an explosion in both scale and complexity: more compounds, more cell lines, more imaging markers, more sophisticated computational methods, and a growing ecosystem of enabling academic and commercial software packages.
The Challenge We Didn't Anticipate.
What we didn't anticipate was that the creation and adoption of this scalable, image-based profiling platform would introduce a new challenge: most of these datasets live in silos. Other profiling modalities, such as those for DNA or RNA, have well-defined features that are easily combined across studies, leading to biological discoveries greater than any one study alone. HCS is different: each experiment uses its own microscope, its own cell line, its own dyes, and its own image analysis pipeline. As a result, the phenotypic profiles produced in one lab often can't be directly compared to those from another. This inability to pool and leverage knowledge across HCS community efforts has kept the field from realizing the full potential of this rich data.
That's the problem we set out to solve with CLIPn. CLIPn uses contrastive deep learning to align heterogeneous HCS datasets in a shared space, enabling what we call 'transitive prediction': predicting the function of compounds screened in one dataset based on their similarity to reference compounds from entirely different datasets.
CLIPn: Bridging the Silos.
The idea behind CLIPn grew out of this challenge and out of a mounting frustration that so much high-quality data was going unused. Our approach takes inspiration from CLIP, a model developed by OpenAI that learned to align images and text into a shared representation space. We wondered: could we do something similar for phenotypic profiles, using shared drug categories as 'references' to align these diverse HCS datasets?
Of course, the problem is messier in reality. Features differ across datasets. Overlap between reference compounds is often partial. And datasets vary in dimensionality and signal quality. So we designed CLIPn to be contrastive, modular, and scalable: it learns a unique encoder for each dataset, and uses overlapping drug category labels to guide alignment without ever needing raw images.
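To make that design concrete, here is a minimal sketch of the per-dataset encoder idea in PyTorch. The two-layer MLP, the hidden width, the 64-dimensional shared space, and the `feature_dims` numbers are all illustrative assumptions for this post, not the exact CLIPn implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class DatasetEncoder(nn.Module):
    """Maps one dataset's profile features into the shared embedding space.
    The two-layer MLP and its widths are illustrative choices."""
    def __init__(self, in_dim: int, embed_dim: int = 64, hidden: int = 256):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(in_dim, hidden),
            nn.ReLU(),
            nn.Linear(hidden, embed_dim),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # L2-normalize so that similarity in the shared space is cosine similarity.
        return F.normalize(self.net(x), dim=-1)

# One encoder per dataset: each screen can have a different feature count.
feature_dims = {"screen_A": 453, "screen_B": 812, "screen_C": 97}  # hypothetical
encoders = nn.ModuleDict({name: DatasetEncoder(dim) for name, dim in feature_dims.items()})
```

Because each encoder owns its own input layer, screens with very different feature sets can still land in the same shared space.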
Once trained, the CLIPn model embeds all compounds, reference or uncharacterized, into a unified latent space. Compounds with similar biological effects cluster together, regardless of their dataset of origin. This enables transitive functional prediction: a compound screened in one dataset can be matched to similar compounds from other datasets.
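In its simplest form, that matching can be a nearest-neighbor vote in the shared space. The snippet below is a simplified illustration, assuming L2-normalized embeddings and a plain k-nearest-neighbor majority vote over pooled reference compounds; the scoring we use in the paper differs in its details.

```python
import torch

def predict_category(query_emb: torch.Tensor,
                     ref_embs: torch.Tensor,
                     ref_labels: list[str],
                     k: int = 5) -> str:
    """Assign a query compound the majority category among its k nearest
    reference compounds (cosine similarity; embeddings assumed normalized).

    query_emb:  (d,) embedding of the uncharacterized compound
    ref_embs:   (n_refs, d) embeddings of reference compounds pooled
                from all other datasets
    ref_labels: n_refs drug-category labels, aligned with ref_embs
    """
    sims = ref_embs @ query_emb              # cosine similarities, (n_refs,)
    topk = sims.topk(k).indices.tolist()     # indices of the nearest references
    votes = [ref_labels[i] for i in topk]
    return max(set(votes), key=votes.count)  # majority vote
```

The key point is that `ref_embs` can pool references from every dataset at once, which is exactly what makes the prediction "transitive".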
Making It Work.
A key innovation was a rotating "pivot–auxiliary" framework: at each training step, one dataset is treated as the pivot, and contrastive losses are computed relative to the rest. This ensures all datasets contribute equally to the integrated space, even when category overlap is sparse.
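A single step of that rotation might look like the following sketch, where a soft cross-entropy over label-match targets stands in for the contrastive loss: auxiliary compounds sharing a pivot compound's drug-category label are treated as positives, everything else as negatives. The temperature and the exact loss form here are illustrative choices rather than the precise objective from the paper.

```python
import torch
import torch.nn.functional as F

def pivot_step(pivot_emb, pivot_labels, aux_embs, aux_labels, temperature=0.1):
    """One illustrative pivot-auxiliary contrastive step.

    pivot_emb:    (n_p, d) normalized embeddings for the pivot dataset batch
    pivot_labels: (n_p,) integer drug-category labels for the pivot batch
    aux_embs:     list of (n_a, d) normalized embeddings, one per auxiliary dataset
    aux_labels:   list of (n_a,) integer labels, aligned with aux_embs
    """
    loss = 0.0
    for a_emb, a_lab in zip(aux_embs, aux_labels):
        logits = pivot_emb @ a_emb.T / temperature             # (n_p, n_a)
        # Positives: auxiliary compounds with the same category label.
        match = (pivot_labels[:, None] == a_lab[None, :]).float()
        # Spread each row's probability mass over its positives
        # (rows with no positive simply contribute zero loss).
        targets = match / match.sum(dim=1, keepdim=True).clamp(min=1.0)
        loss = loss + F.cross_entropy(logits, targets)         # soft targets
    return loss / len(aux_embs)
```

Over an epoch, the pivot role rotates through all datasets, which is what keeps smaller screens from being drowned out by larger ones.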
We benchmarked CLIPn extensively on simulated data, on 13 real HCS datasets spanning 20 years, and on a set of 10,000+ uncharacterized compounds. As we had hoped, CLIPn's integrated alignment yielded more confident and more accurate predictions than single-dataset models.
To truly test its utility, we focused on unknown compounds from previous screens that had low-confidence predictions and would typically be ignored. CLIPn was able to take these weak predictions and boost their confidence in the integrated space. When we experimentally validated these rescued compounds, 38 out of 55 (about 70%) proved accurate. These were real opportunities that had been missed simply because each dataset, in isolation, lacked sufficient context.
Building a Foundation for the Future.
We see CLIPn as part of a broader mission: to move from isolated drug screens toward a connected, reusable knowledge base. Just as our early papers helped establish image-based profiling as a viable strategy for drug characterization, we hope CLIPn helps establish cross-dataset alignment as the next frontier.