To expand our knowledge of all aspects of life science, we have developed a high-plex, high-sensitivity platform that enables the simultaneous detection of RNA and proteins in situ in formalin-fixed, paraffin-embedded (FFPE) tissue. Representatives from our development teams shared the behind-the-scenes stories of that development – from the initial proof-of-principle stage to the launch of the CosMx™ Spatial Molecular Imager (SMI).
Joseph M Beechem – Leading NanoString Research & Development Efforts
I have been overseeing all the projects in R&D at NanoString and leading our teams to launch multiple transcriptomic instruments and publish seminal platform papers, all in Nature Biotechnology. To describe the stories behind the development of these platforms, the newly published CosMx SMI in particular, I interviewed representatives from the R&D teams (shown in Fig. 1). In this blog, I summarize the efforts required to launch the CosMx SMI and the background stories of how the knowledge we accumulated while developing platforms over the years helped make the CosMx SMI a commercial reality.
Dwayne Dunaway – History of Development of Three Platforms: nCounter®, GeoMx® Digital Spatial Profiler (DSP), and CosMx™ SMI
Dwayne has been involved in the development of three platforms – nCounter (Geiss et al, https://www.nature.com/articles/nbt1385), GeoMx DSP (Merritt et al, https://www.nature.com/articles/s41587-020-0472-9), and CosMx SMI (He et al, https://www.nature.com/articles/s41587-022-01483-z). The chemistry and technologies of these platforms are all published in Nature Biotechnology. Launching each platform was a big effort and accomplishment: the chemistry, hardware, and software all had to be designed to work seamlessly together. For the original platform, nCounter, it was amazing to see what you could do with a small team – we started with ~20 scientists focusing on it. For the newer spatial biology platforms (GeoMx and CosMx), many more people and outside contracting resources were required to handle the increased complexity of the platforms. Launching any new platform requires a huge amount of teamwork and process organization.
It is difficult to explain how challenging it is to launch a brand-new platform. It takes multiple years of late nights, early mornings, working weekends and holidays, and incredible pressure to meet deadlines. When you are done with a platform launch, you say to yourself, “never again”, yet somehow, the very next year you find yourself signing up to do it all over again… simply because, when the scientific community starts making amazing discoveries with the new technology, the feeling is just SO REWARDING!
The CosMx instrument originally started out as a prototype device to accomplish hybridization-based sequencing – aka, the Hyb & Seq project – by adapting an existing NanoString platform commercialized as “Sprint” (prototype called “FrankenSprint”, Fig. 2) to build a commercial CosMx prototype with a custom objective and huge fields-of-view (FOV). It turned out that the prototype instrument and chemistry were very adaptable and allowed us to explore converting this technology to a full-time high-plex multi-omic spatial biology platform (the CosMx SMI described in this paper). The full history of the intermediate prototypes for the CosMx SMI is shown in Fig. 2.
The GeoMx and CosMx Spatial Biology platforms required really big efforts, especially for the software and data analytic components. The sheer mass of data from CosMx is huge: each sample is over 10 terabytes, 20,000 times more than the original nCounter data. Our previous experience with GeoMx data sets greatly helped us to overcome some of the tough problems with data handling during the development of the CosMx platform. To deal with the massive data, it only made sense to have the platform directly connected to a Cloud-based storage system. In the end, we expanded the Cloud-based data storage for CosMx into a complete Spatial Informatics Platform, called AtoMx™, where image sharing and collaboration, data visualization, spatial-analysis pipelines, data analytics, open-source module incorporation, and extensive pipelines can all reside together in the same place.
While developing these platforms, Dwayne climbed the 100 highest mountain peaks in the State of Washington (Bulger List: https://www.peakbagger.com/list.aspx?lid=5003). He emphasized that developing the three platforms was more challenging than climbing those highest mountains!
Rustem Khafizov – CosMx SMI Chemistry
Rustem has been working with NanoString for over 10 years. He originally worked for me as an intern at Applied Biosystems (Foster City, CA) in my Next-Gen Sequencing group. I “cold-called” him a number of times to talk him into working for NanoString, and eventually it worked! Rustem became the technical lead of a new microfluidic version of our nCounter Sprint platform, and absolutely pioneered the development of a disposable-card, 12-channel, automated 2-stage magnetic bead nucleic-acid purifier that became the “heart” of the nCounter Sprint.
Not too long after that, I approached Rustem about a new concept for building a completely hybridization-based sequencer using a new class of optical barcodes that read DNA/RNA hexamers. All you need is 4,096 hexamer barcodes (one for each of the 4^6 possible six-base sequences) and you have a sequencer! The original barcodes we invented were too big, and the DNA Origami barcodes we also explored were too slow. Rustem was able to conceive and develop a new optical barcode tens of nanometers in size (in 2014) that achieved fast hybridization and became the “heart” of the imaging reagents used for the CosMx SMI. Around that time, we also started developing protein assays using the same imaging reagents.
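To make the “4,096 hexamer barcodes” arithmetic concrete, here is a minimal Python sketch. It is purely illustrative and is not the Hyb & Seq chemistry or software: the function names `all_kmers` and `kmers_in` and the integer barcode IDs are hypothetical. It simply enumerates every possible six-base sequence and shows how a short target decomposes into overlapping hexamers, each of which would need its own barcode.

```python
from itertools import product

BASES = "ACGT"


def all_kmers(k: int) -> list[str]:
    """Enumerate every DNA sequence of length k over the four bases."""
    return ["".join(p) for p in product(BASES, repeat=k)]


def kmers_in(seq: str, k: int = 6) -> list[str]:
    """List the overlapping k-mers contained in a target sequence."""
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]


# One (hypothetical) integer barcode ID per hexamer: 4**6 = 4,096 in total.
hexamers = all_kmers(6)
barcode_id = {hexamer: i for i, hexamer in enumerate(hexamers)}

print(len(hexamers))                                  # 4096
print([barcode_id[h] for h in kmers_in("ACGTACGT")])  # IDs of the 3 hexamers in the target
```

Because the set covers every possible hexamer, any target sequence can in principle be described as a series of barcode readouts, which is the sense in which 4,096 barcodes are “all you need”.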
The project was officially named ‘Hyb & Seq’ in April 2016. Rustem’s twins were born the same year; Fig. 3 shows them wearing the “Hyb” (left) and “Seq” (right) onesies, a gift from R&D colleagues. Now, these twins are 6 years old and running around the school yard!
Gary Geiss – CosMx Protein Profiling
“It has been amazing to be part of the development teams of spatial platforms”, said Gary. His most recent role has been developing the 108-plex protein assay for the CosMx SMI, which was the highest protein plex in FFPE samples of any published platform.
The novel barcode designed for the first platform, nCounter (Geiss et al, Nat Biotechnol 26, 317–325 (2008), https://www.nature.com/articles/nbt1385), for in-solution RNA and DNA detection was 2,000 nm in size. An extension of this barcode was used for the original GeoMx DSP, and we then converted the barcode to the NGS-based photocleavable barcode detection strategy for unlimited plex. For the CosMx SMI, we created a small imaging barcode, 20 nm in size: a 100-fold (2-log) decrease in size from the original nCounter barcode. It is incredible to see how the technologies have improved over time.
When the nCounter paper was published in 2008, Gary’s son was 2 years old (Fig. 4, left). Just before the recent CosMx paper came out, Gary was sitting next to his son, who was now the one driving (Fig. 4, right). As a fun fact, it is well-known in R&D that Gary wears local brewery T-shirts and rides a bike to work every day. He emphasizes that he enjoys working with the great people at NanoString.
Shanshan He – High-Plex CosMx RNA Profiling
When we started the Hyb & Seq project, we didn’t know whether we could make our technology work. Then, Shanshan joined the team in 2017. At the time, we had a much more experienced technical lead for the high-plex RNA project, but they were making very little progress and were also introducing additional sample-prep steps upstream of the assay (e.g., tissue clearing) that they felt were needed. Shanshan, however, had no preconceived notions of what was possible or impossible. She just rolled up her sleeves and tried to make this work without any additional sample pre-processing steps (requiring only a normal FFPE up-front workflow). Shanshan and her very small team of (essentially) one Research Assistant started to generate results, working more quickly and successfully than the much larger, more experienced RNA team. Thus, Shanshan very quickly took the high-plex RNA program from “something we didn’t know we could do” to “something that we really can do”.
Initially, the project was tough. Shanshan’s small team tried the prototype technology on tissue and could detect only one transcript per 50-100 cells, with low specificity. But she didn’t give up and kept going. In 2019, she saw the first indication that made her believe this approach was going to work. She set up an “instant-readout” model system, consisting of different cell lines with unique RNAs associated with each line. She generated fluorescence barcode readouts for each cell line that allowed her to visualize and report on the success of an experiment via “green”, “yellow”, or “red” barcode readouts (linked to the cell line-specific RNAs; see Fig. 5). In this manner, she could explore many different conditions (e.g., the concentration of ISH probes, the concentration of reporter groups, multiple protease concentrations, etc.) and simply look at the ratios of the various colors linked to each cell line to know whether the high-plex RNA system had improved or not.
Her team further optimized the assay to address issues such as sample stability and autofluorescence in tissue. The individual processing of Z-stacked images also needed to be improved, because calling spots (genes) on condensed 2D projections contaminated about 90% of the data with out-of-plane RNAs. This 3D image processing and registration problem was solved through a collaboration with the data processing team, led by Mithra Korukonda (also a co-author on this paper).
The updated 3D image processing strategy enhanced the sensitivity of detection, and we started comparing the SMI data to the datasets from GeoMx DSP, RNAscope, bulk RNA-seq, and other RNA detection methods. At that point, we knew that our new platform would be competitive in the field. In 2019, the assay started as 12-plex, and Shanshan’s team increased the plex size to 50 and then 300. Going from 50-plex to 300-plex was a big step – we had to change the encoding scheme to enable the detection of high expressors. After this pivot at 300-plex, we moved on to 1,000-plex for the CosMx SMI in 2020 and to the whole transcriptome level for GeoMx DSP in 2021.
During the Spatial Molecular Imager development, Shanshan’s daughter was a frequent visitor to NanoString headquarters (Fig. 6), as Mom worked diligently in the lab with the rest of her team.
Margaret Hoang – RNA Quality in Tissue
Margaret was originally hired for the Hyb & Seq program. She said, “I was called Spot 2”, meaning that she worked on the project to establish the second 14-base-pair sequence for reporter hybridization. Her team called the overall reporter structure an “exploding Christmas tree” because of the structure connected with photocleavable linkers. At the time, there was a lot of activity in the development of both the GeoMx and nCounter platforms; in particular, we were increasing the RNA-plex number in GeoMx by sequencing the barcodes for readout. In 2019, Margaret’s team launched the GeoMx Cancer Transcriptome Atlas, which was designed to profile over 1,800 RNA targets in situ. This led to the release of the commercial product GeoMx Whole Transcriptome Atlas in 2021, which utilized as many as 22,000 in situ hybridization probes with an NGS readout. For the development of the CosMx SMI, we could use all the knowledge acquired during GeoMx DSP development to fast-track the debugging of our high-plex imaging assays.
For the CosMx manuscript, Margaret’s team assessed the RNA integrity of the FFPE non-small cell lung cancer (NSCLC) tissues that were profiled. Her team extracted RNA and measured its integrity on a Bioanalyzer, using RIN scores and DV200 metrics. What Margaret found was that even when the RNA integrity was so low that bulk sequencing wasn’t possible (DV200 ~20%), CosMx was still able to image over 97% of the cells in that sample at 1,000-plex. Although the SMI chemistry was designed to have this type of “ideal behavior” with highly degraded samples, actually seeing our design translate into exceptional behavior gave a huge boost in morale to everyone on the project!
Patrick Danaher – Tertiary Data Analysis
Patrick’s extensive PhD experience in the data analysis of bulk RNA expression brought him to NanoString in 2013. His work originally focused on bulk-transcriptomic analysis of FFPE samples, often obtained from clinical trial specimens. Shanshan and Patrick shared an office, and Patrick started to “look over the shoulder” of Shanshan as she was analyzing single-cell spatially resolved transcripts. Needless to say, the minute someone sees this type of powerful data, there is almost no turning back to bulk-based transcriptomics. On his own initiative, Patrick took it upon himself to generate the complete SMI spatial single-cell gene-expression analysis framework. In the process, Patrick worked his way directly into the role of Technical Lead for Data Analysis of our new CosMx SMI platform. It is interesting to think about this now, in the era of COVID-19 and the hybrid workplace: how many other “physical co-localization” opportunities for leadership or scientific breakthroughs were missed during this pandemic due to work-from-home?
Since then, Patrick has developed multiple critical methods for data analysis of single-cell and spatial profiling in many tissue types. Just the data in the SMI paper could support a lifetime of analyses. In the published SMI public domain dataset, there are 1,000 genes times 20 cell types, times 30 pairs of unique neighborhoods, times… etc., etc. We could ask thousands and thousands of spatial combinatorial questions. So, where to begin? So much to see!
In 2019, Patrick developed the initial cell-typing assay using just 30-plex (where the program was at that time), yet it was clear that we could obtain several key cell types even at this low plex. As our plex number grew to hundreds, cell typing became a transformational part of the program. His paper describing the mixed-cell cell-typing deconvolution method developed for GeoMx data was published in Nature Communications (https://www.nature.com/articles/s41467-022-28020-5). The following year, Patrick invented a method to analyze spatial cell “neighborhoods”, which is now a tool that everyone uses.
Patrick has been skiing some of the most challenging mountains in the Pacific Northwest, even skiing the north face of Whitehorse Mountain. This impressed everyone, including Dwayne, another co-author on this paper, who summited the 100 tallest peaks in Washington State. Patrick had his first baby in 2016 and his second baby in 2018. Regrettably, I don’t have pictures of his children on skis atop Whitehorse yet…
Zachary Reitz – Ligand-Receptor Interaction
Zachary was hired as a Bioinformatics Scientist during the COVID lockdown in 2020 and did not meet his teammates in person for over a year. In the fall, he started “playing around” with this new class of spatially resolved 1,000-plex datasets. He was trying to think about what kind of questions people really want to ask about this data. Zachary got very interested in modeling how cells interact, how they are localized side by side, how they communicate with each other, and how those communications affect their gene expression patterns.
Zachary knew that a lot of work had been done on cell interactions using single-cell data and protein-protein interaction data, and that there were many curated, annotated ligand-receptor-pair databases. So, he applied the strategies designed for single-cell data to spatial data; for example, to ask which cell types might be communicating with other cell types. With spatial data, if cells are located side by side, we can determine whether these cells are expressing specific ligands or receptors and communicating with each other. Zachary created the first ligand-receptor analysis and heatmap for the manuscript. He was really surprised by how well this worked, and by how spatial single-cell data had inherent information content far exceeding single-cell RNA-seq data (since now all the spatial relationships are directly measured, versus having to be inferred in scRNA-seq). It was especially rewarding to see the PD-1/PD-L1 interaction literally “jump out” of the data in our NSCLC samples. By combining the prior knowledge of ligand-receptor pairs with the actual spatially resolved single-cell data, hundreds of ligand-receptor pairs can be simultaneously measured in an incredibly high-throughput manner in a single experiment!
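For readers who like to see the idea in code, here is a toy Python sketch of a neighbor-based ligand-receptor score. It is an illustration only, not the analysis used in the manuscript. The cells, coordinates, counts, the 20-unit radius, and the function names `neighbor_pairs` and `ligand_receptor_scores` are all made up for the example, and the score (the mean product of ligand and receptor counts over adjacent cell pairs) is a deliberately simple stand-in. CD274 and PDCD1 are the genes that encode PD-L1 and PD-1.

```python
from collections import defaultdict
from math import dist

# Hypothetical toy cells: (x, y) position, cell type, and counts per gene.
cells = [
    {"xy": (0.0, 0.0), "type": "tumor", "counts": {"CD274": 5, "PDCD1": 0}},
    {"xy": (8.0, 0.0), "type": "T cell", "counts": {"CD274": 0, "PDCD1": 3}},
    {"xy": (200.0, 0.0), "type": "T cell", "counts": {"CD274": 0, "PDCD1": 4}},
]


def neighbor_pairs(cells, radius):
    """Yield ordered (sender, receiver) pairs of distinct cells within `radius`."""
    for i, a in enumerate(cells):
        for j, b in enumerate(cells):
            if i != j and dist(a["xy"], b["xy"]) <= radius:
                yield a, b


def ligand_receptor_scores(cells, ligand, receptor, radius=20.0):
    """Mean ligand x receptor count product over adjacent pairs,
    grouped by (sender cell type, receiver cell type)."""
    totals = defaultdict(float)
    n_pairs = defaultdict(int)
    for sender, receiver in neighbor_pairs(cells, radius):
        key = (sender["type"], receiver["type"])
        totals[key] += sender["counts"].get(ligand, 0) * receiver["counts"].get(receptor, 0)
        n_pairs[key] += 1
    return {key: totals[key] / n_pairs[key] for key in totals}


scores = ligand_receptor_scores(cells, ligand="CD274", receptor="PDCD1")
for (sender_type, receiver_type), score in scores.items():
    print(f"{sender_type} -> {receiver_type}: {score}")
```

In this toy data only the tumor cell and the nearby T cell are neighbors, so the distant T cell contributes nothing even though it expresses the receptor. That is the extra information spatial data provides: adjacency is measured directly rather than inferred. Repeating the same loop over a curated list of ligand-receptor pairs would give one row of a heatmap per pair.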
Joseph M Beechem – Concluding Summary
In my opinion, the most difficult and challenging “endeavor” to accomplish in science is a de novo, first-of-its-kind Platform development. It literally takes ~500 full-time devoted Research and Development staff, often spanning multiple geographic locations across multiple continents and multiple disciplines. Most importantly, every Platform “Function” (e.g., fluidics, optics, electronics, control software, analysis software, protein and RNA consumables, buffers, sample-prep reagents, field-service engineers, trainers, manual writers, etc.) has to work together in complete harmony and with absolute robustness, and all of this has to be accomplished within absolutely “crazy” timeline constraints (while maintaining this high level of intense work for multiple years). However, what makes platform launches most difficult is also exactly the same thing that makes them so REWARDING! Few things in science will demand as much from each individual, all working together as a completely integrated team, as a Platform Launch. By the time launch is accomplished, one almost always says (internally) “Whew… I don’t think I’m ever going to do that again!”. But it seems like by one or two weeks post-launch, you are absolutely “itching” to do it all over again!!!
In this blog, only a very small fraction of the team members who were involved in the development of the CosMx Spatial Molecular Imaging platform are referenced (by necessity; otherwise this wouldn’t be a blog, but rather an encyclopedia!). Hopefully, this blog gives you a “feeling” of what it is like to be “inside” a Platform Launch team. In order to accomplish something so difficult, you really have to enjoy what you are doing, have fun with your teammates, and find imaginative ways to celebrate the intermediate milestone “wins” and mourn together the milestone “misses” (that can happen!).
I have had the pleasure of leading Platform Development teams in the Industry for over 20 years now, and there is no more rewarding feeling than seeing “platforms in crates” heading to major research centers around the world. Each of them will become a part of the Spatial Life Science Revolution, destined to potentially help develop the new drug, new diagnostic, and new fundamental discoveries that may save the life of your son or daughter, mom or dad, cousin, next-door neighbor, people you have never met, or people who haven’t even been born yet! That is why all of my team works so hard. They know that what they are doing can change the future of the world in some small way and certainly make the world of the future a better place to live in for everyone.