DNA hard disk repair with enzymes

Introducing the enzymatic repair service for DNA hard drives: using nature’s repair mechanisms to save digital data stored in DNA from complete data loss.

Data corruption in storage media is ubiquitous in today’s digital world, from failing hard drives in data centres to data loss in personal computers. While the initial reaction to a corrupted hard drive usually involves panic and doubts about one’s backup strategy, repair and subsequent data recovery are sometimes possible for hard drives with physical or logical damage. In the last few years, interest in storing digital data in DNA has steadily increased, with a wide range of parties – from academic research groups to manufacturers of hard drives – now envisioning a future in which our archival data is stored in DNA. Of course, data stored in DNA also experiences data corruption, so we set out to develop an enzymatic repair service for DNA hard drives based on nature’s mechanisms for genome repair.

While DNA’s high information density on the exabyte-per-gram scale and its ubiquity as nature’s storage medium of choice work towards the vision of DNA-based archival storage, its stability is not infinite. As with hard drives, time, temperature, and humidity are the critical parameters affecting the durability of DNA data storage media. DNA’s major failure mode is the loss of sequences caused by hydrolysis, its main decay mechanism. The resulting breakage of the phosphate backbone of a given sequence, referred to as nicking, prevents amplification by polymerase chain reaction, and thus renders the data saved in this sequence unreadable.
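To make the failure mode concrete, here is a minimal sketch of how strand loss could be modelled, assuming each backbone position breaks independently at a constant first-order rate. This is a common textbook simplification and not the model used in our study; the function names, the strand length of 150 nucleotides and the rate constant are hypothetical, illustrative values only.

```python
import math


def intact_fraction(length_nt: int, rate_per_site_per_year: float, years: float) -> float:
    """Fraction of strands with no backbone break after `years`.

    Assumes independent first-order cleavage at each of `length_nt` sites,
    so a strand survives only if every site survives.
    """
    return math.exp(-rate_per_site_per_year * length_nt * years)


def strand_half_life(length_nt: int, rate_per_site_per_year: float) -> float:
    """Time in years after which half of the strands carry at least one break."""
    return math.log(2) / (rate_per_site_per_year * length_nt)


if __name__ == "__main__":
    # Hypothetical numbers, chosen only for illustration.
    length_nt = 150
    rate = 1e-5  # breaks per site per year (assumed)
    for years in (10, 100, 1000):
        print(years, round(intact_fraction(length_nt, rate, years), 4))
    print("half-life (years):", round(strand_half_life(length_nt, rate), 1))
```

In this simplified picture, a higher temperature or humidity corresponds to a larger rate constant, and longer sequences are lost faster because a single break anywhere along the strand is enough to block amplification.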

Our goal was to present a simple method to rescue the data from a file stored in DNA, essentially the DNA-based equivalent of a hard drive recovery service. The result: a mix of three enzymes – borrowed from nature’s repair mechanisms for genomic DNA – capable of reversing the hydrolytic damage and restoring the unreadable sequences. To validate the approach, we performed experiments using heavily decayed DNA – aged at 30°C for more than a month – representing the worst-case scenario. Sequencing showed that the repair enzymes successfully reversed some DNA decay by increasing the amount of amplifiable, full-length DNA and enabling error-free data recovery.

Our “enzymatic repair service” has two implications for DNA data storage: it extends the storage horizon for DNA data storage applications, and it further reduces the minimum number of sequence copies required for durable storage. Considering the timescales for archival storage in DNA, our repair process would allow data recovery from archival media left in storage several hundred years longer than intended. This is hard to imagine with today’s hard drives, but there is still a wide gap between today’s research on DNA data storage and today’s scale of conventional storage systems. Nonetheless, enzymatic repair for DNA data storage brings us one step closer to realizing the potential for DNA-based archival storage.
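The link between repair and the number of copies can be illustrated with a short sketch. It assumes that each physical copy of a sequence is intact independently with the same probability and that a sequence is readable as long as at least one copy survives; it ignores any error-correcting coding across sequences. The function name and the probabilities below are hypothetical and are not results from our experiments.

```python
import math


def min_copies(p_intact: float, target: float) -> int:
    """Smallest number of copies c such that 1 - (1 - p_intact)**c >= target.

    p_intact: probability that a single copy is still amplifiable (assumed).
    target:   desired probability that at least one copy survives.
    """
    if not 0.0 < p_intact <= 1.0:
        raise ValueError("p_intact must be in (0, 1]")
    if not 0.0 < target < 1.0:
        raise ValueError("target must be in (0, 1)")
    if p_intact == 1.0:
        return 1
    return max(1, math.ceil(math.log(1.0 - target) / math.log(1.0 - p_intact)))


if __name__ == "__main__":
    # Hypothetical intact fractions before and after repair.
    for label, p in (("before repair", 0.1), ("after repair", 0.3)):
        print(label, min_copies(p, target=0.99))
```

Under these assumptions, any increase in the fraction of amplifiable strands lowers the number of copies needed for the same recovery probability, which is the intuition behind the second implication above.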

Cover: iStock.com/Kirillm
