Behind the Paper

Storing data in molecules - High data storage capacity by dual-sequence-definition

The Passerini three-component reaction allows for independent variation of side chain and backbone

Published in Chemistry

May 21, 2020

Katharina Wetzel

Postdoc, Karlsruhe Institute of Technology

Sequence-defined macromolecules are a relatively young area of polymer chemistry. These macromolecules of uniform size and constitution are particularly interesting because they approach the precision of natural macromolecules such as DNA or peptides. They therefore hold great potential for new applications, ranging from enzyme mimicry and anti-counterfeiting tags to data storage. Now that research groups all over the world have developed many innovative routes to sequence-defined macromolecules, the focus of research is starting to shift towards finding applications for them, and data storage seems especially promising. In this context, the number of permutations, i.e. the number of distinct structures a system can in principle produce, is an important benchmark: it reflects the chemical diversity of a system and thus its data storage capacity. DNA, which carries the genetic code, is the natural example and can be considered a prototype for artificial data storage systems: four nucleobases are arranged in long sequences, and their order defines the genetic information.

Very long sequences are synthetically demanding, and so far sequence-defined macromolecules have only been obtained with lengths in the oligomer range. In this work, we therefore focused on increasing the degree of definition per repeat unit, and thus the data storage capacity gained per synthesis step, rather than aiming for longer sequences. To do so, we chose a very powerful tool that is well established in our group: the Passerini three-component reaction (P-3CR). Using a monomer that contains a protected acid and an isocyanide group, sequence-defined oligomers can be built stepwise in a two-step iterative cycle consisting of the P-3CR and a subsequent deprotection step. It is well known that this concept gives sequence-defined macromolecules with defined side chains. In this work, however, we not only defined the side chain by varying the aldehyde component but also established a set of nine different monomers, which allowed us to define the backbone of the macromolecules as well. We were thus able to vary the side chain and the backbone independently, which drastically increases the structural variety and hence the data storage capacity.

The read-out of the sequences was the important second part of our work, since one can only claim data storage if the read-out is demonstrated unambiguously. With the oligomers in hand, we performed fragmentation experiments by tandem mass spectrometry to read the sequences. Interestingly, we found two characteristic fragmentation patterns, which simplify the analysis of complex sequences and offer possibilities for error correction. Finally, we compared oligomers with different degrees of definition (side chain, backbone, or both) with DNA as the natural prototype and with the commonly used binary system. This comparison clearly shows the advantage of our system: it achieves a significantly higher data storage capacity (33 bits for a pentamer) than the systems known so far.
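To make the capacity comparison concrete, here is a minimal sketch of the underlying arithmetic. It assumes the usual information-theoretic measure: a sequence of n positions with m possible building blocks per position has m^n permutations and can store n · log2(m) bits. The nine backbone monomers are taken from the text above; the number of side chains (`N_SIDE_CHAINS = 11`) is not stated in this post and is only an illustrative assumption, as are all function and variable names.

```python
import math


def capacity_bits(choices_per_position: int, length: int) -> float:
    """Bits storable in a sequence of `length` positions,
    each taking one of `choices_per_position` building blocks."""
    return length * math.log2(choices_per_position)


LENGTH = 5          # pentamer, as in the comparison above
N_BACKBONES = 9     # nine backbone-defining monomers (from the post)
N_SIDE_CHAINS = 11  # ASSUMPTION for illustration; not stated in the post

# Building blocks available per position in each system.
systems = {
    "binary": 2,
    "DNA (4 nucleobases)": 4,
    "backbone only": N_BACKBONES,
    "side chain only": N_SIDE_CHAINS,
    "dual (backbone x side chain)": N_BACKBONES * N_SIDE_CHAINS,
}

for name, choices in systems.items():
    permutations = choices ** LENGTH
    bits = capacity_bits(choices, LENGTH)
    print(f"{name:30s} {permutations:>14,d} permutations  {bits:5.1f} bits")
```

For a pentamer, the binary system gives 5 bits and DNA gives 10 bits. Because backbone and side chain are varied independently, their choices multiply per repeat unit; with the assumed eleven side chains the dual system comes out at about 33.1 bits, in line with the 33 bits quoted above.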

If you want to learn more about our work, please check out our paper: https://www.nature.com/article...

Communications Chemistry