Next Revolution in Data Storage


In the age of big data, we are quickly producing far more digital information than we can possibly store.

Last year, $20 billion was spent on new data centers in the U.S. alone, doubling the capital expenditure on data center infrastructure from 2016.

And even with skyrocketing investment in data storage, corporations and the public sector are falling behind.

But there’s hope.

With a nascent technology leveraging DNA for data storage, this may soon become a problem of the past. By encoding bits of data into tiny molecules of DNA, researchers and companies like Microsoft hope to fit entire data centers in a few flasks of DNA by the end of the decade.

But let’s back up. (Pun intended.)

Backdrop

After the 20th century, we graduated from magnetic tape, floppy disks and CDs to sophisticated semiconductor memory chips capable of holding data in countless tiny transistors.

In keeping with Moore’s Law, we’ve seen an exponential increase in the storage capacity of silicon chips.

At the same time, however, the rate at which humanity produces new digital information is exploding (as seen in the graph below).

 The size of the global datasphere is increasing exponentially, predicted to reach 160 zettabytes (160 trillion gigabytes) by 2025.

As of 2016, digital users produced over 44 billion gigabytes of data per day. By 2025, the International Data Corporation (IDC) estimates this figure will surpass 460 billion.
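To get a feel for how steep that curve is, here is a back-of-the-envelope sketch using only the two figures quoted above. It assumes smooth compound growth between 2016 and 2025, which is a simplification of mine and not something the IDC estimate states; the variable names are illustrative.

```python
# Figures quoted in the article (gigabytes produced per day).
gb_per_day_2016 = 44e9
gb_per_day_2025 = 460e9
years = 2025 - 2016

# Assumption: smooth compound growth between the two data points.
growth_factor = gb_per_day_2025 / gb_per_day_2016
annual_rate = growth_factor ** (1 / years) - 1

print(f"Overall growth: {growth_factor:.1f}x over {years} years")
print(f"Implied compound annual growth: {annual_rate:.0%}")
```

Under that assumption, daily data production grows by roughly 30 percent every year.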

And with private sector efforts to improve global connectivity—such as OneWeb and Google’s Project Loon—we’re about to see an influx of data from 5 billion new minds.

By 2020, 3 billion new minds are predicted to join the web. With private sector efforts, this number could reach 5 billion.

While companies and services are profiting enormously from this influx, it’s extremely costly to build data centers at the rate needed.

At present, about $50 million worth of new data center construction is required just to keep up, not to mention millions in furnishings, equipment, power and cooling.

Moreover, memory-grade silicon is rarely found pure in nature, and researchers predict it will run out by 2040.

Take DNA, on the other hand. At its theoretical limit, we could fit 215 million gigabytes of data in a single gram of DNA.

But how?

Crash Course

DNA is built from a double helix chain of four nucleotide bases—A, T, C and G. Once formed, these chains fold tightly to form extremely dense, space-saving data stores.

To encode data files into these bases, we can use various algorithms that convert binary to base nucleotides—0s and 1s into A, T, C and G. ‘00’ might be encoded as A, ‘01’ as G, ‘10’ as C, and ‘11’ as T, for instance.
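As a minimal sketch of that idea, the snippet below applies the example mapping from the paragraph above (00 to A, 01 to G, 10 to C, 11 to T) to raw bytes and then reverses it. The function names are hypothetical, and this is a toy illustration only: practical schemes typically add redundancy for error correction and avoid long runs of the same base, which this sketch does not attempt.

```python
# Mapping taken from the example in the text: 00->A, 01->G, 10->C, 11->T.
BITS_TO_BASE = {"00": "A", "01": "G", "10": "C", "11": "T"}
BASE_TO_BITS = {base: bits for bits, base in BITS_TO_BASE.items()}


def encode_to_dna(data: bytes) -> str:
    """Turn raw bytes into a base string, two bits per nucleotide."""
    bits = "".join(f"{byte:08b}" for byte in data)
    return "".join(BITS_TO_BASE[bits[i:i + 2]] for i in range(0, len(bits), 2))


def decode_from_dna(strand: str) -> bytes:
    """Invert encode_to_dna: bases back to bits, bits back to bytes."""
    bits = "".join(BASE_TO_BITS[base] for base in strand)
    return bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8))


strand = encode_to_dna(b"Hi")
print(strand)                   # GACAGCCG
print(decode_from_dna(strand))  # b'Hi'
```

Each byte becomes four bases, so the two-character message "Hi" turns into an eight-base strand.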

Once encoded, information is then stored by synthesizing DNA with specific base patterns, and the final encoded sequences are stored in vials with an extraordinary shelf-life.

To retrieve data, encoded DNA can then be read using any number of sequencing technologies, such as Oxford Nanopore’s portable MinION.

Still in its deceptive growth phase, DNA data storage—or NAM (nucleic acid memory)—is only beginning to approach the knee of its exponential growth curve. But while the process remains costly and slow, several players are beginning to crack its greatest challenge: retrieval.

Just as you might click on a specific file and filter a search term on your desktop, random-access across large data stores has become a top priority for scientists at Microsoft Research and the University of Washington.

Storing over 400 DNA-encoded megabytes of data, U Washington’s DNA storage system now offers random access across all its data with no bit errors.

Applications

Even before we guarantee random access for data retrieval, DNA data storage has immediate market applications.

As seen in the graph below, a huge proportion of enterprise data goes straight to an archive.

Over time, the majority of stored data becomes only potentially critical, making it less of a target for immediate retrieval.

Particularly for storing past legal documents, medical records and other archive data, why waste precious computing power, infrastructure and overhead?

Data-encoded DNA can last ten thousand years—guaranteed—in cold, dark and dry conditions at a fraction of the storage cost.

Now that we can easily use natural enzymes to replicate DNA, corporations and SMEs have tons to gain (literally) by using DNA as a backup system—duplicating files for later retrieval and risk mitigation.

And as retrieval algorithms and biochemical technologies improve, random access across data-encoded DNA may become as easy as clicking a file on your desktop.

As you scroll, researchers are already investigating the potential of molecular computing, completely devoid of silicon and electronics.

Harvard professor George Church and his lab, for instance, envision capturing data directly into DNA. As Church has stated, “I’m interested in making biological cameras that don’t have any electronic or mechanical components,” whereby information “goes straight into DNA.”

According to Church, DNA recorders would capture audiovisual data automatically. “You could paint it up on walls, and if anything interesting happens, just scrape a little bit off and read it—it’s not that far off.”

One day, we may even be able to record biological events in the body. In pursuit of this end, Church’s lab is working to develop an in vivo DNA recorder of neural activity, skipping electrodes entirely.

Arguably the most compact, long-lasting and universal storage mechanism at our fingertips, DNA offers us unprecedented applications in data storage, and perhaps even in computing.

Potential

As the cost of DNA data storage plummets and its speed rises, commercial user interfaces will become both critical and wildly profitable.

Once corporations, startups and people alike can easily save files, images or even neural activity to DNA, opportunities for disruption abound.

Imagine uploading files to the cloud, which travel to encrypted DNA vials, as opposed to massive and inefficient silicon-enabled data centers.

Corporations could have their own warehouses, and local data networks could allow for heightened cybersecurity, particularly for archives.

And since DNA lasts millennia without maintenance, forget the need to copy databases and power digital archives. As long as we're human, regardless of technological advances and changes, DNA will remain relevant and readable for generations to come.

But perhaps the most exciting potential of DNA is its portability.

If we were to send a single exabyte of data (one billion gigabytes) to Mars using silicon binary media, it would take 5 Falcon Heavies and cost $486 million in freight alone.

With DNA, we would need 5 cubic centimeters.
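As a rough cross-check, the sketch below uses only numbers quoted in this article: the theoretical limit of 215 million gigabytes per gram and one exabyte as one billion gigabytes. It works out a mass, not a volume; turning grams into cubic centimeters would need a density figure that the article does not give, so that step is left out. The function name is hypothetical.

```python
# Figures quoted in the article.
GB_PER_GRAM_THEORETICAL = 215e6  # theoretical limit, gigabytes per gram of DNA
GB_PER_EXABYTE = 1e9             # one exabyte expressed in gigabytes


def grams_of_dna(gigabytes: float, gb_per_gram: float = GB_PER_GRAM_THEORETICAL) -> float:
    """Mass of DNA needed to hold the given amount of data at the stated density."""
    return gigabytes / gb_per_gram


grams = grams_of_dna(GB_PER_EXABYTE)
print(f"{grams:.2f} g of DNA for one exabyte at the theoretical limit")
```

At the theoretical limit, that is under five grams of DNA for the whole exabyte.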

At scale, DNA has the true potential to dematerialize entire space colonies' worth of data.

Throughout evolution, DNA has unlocked extraordinary possibilities—from humans to bacteria.

Soon hosting limitless data in almost zero space, it may one day unlock many more. 


Stephen Sharples

Product Owner, Cloud Data Platforms Front Office

6 years ago

Very interesting article. Will there be real-world applications? Time will tell. In a similar vein to quantum computing. Also wondering how this fits with GDPR. How do you wipe one person's data from a DNA molecule? I think there is a bigger issue at hand than storage: how do we extract useful information from all this data? The needle is getting harder and harder to find in the ever-growing data haystack. And worse still, everyone is looking for different-coloured needles.

Ivan Dorna

Senior IT Technology Specialist. Senior IT Technology Consultant. (20+ years of real experience in IT Infrastructure and cybersecurity). Founder, CEO @Anthilla: IT Cloud Infrastructure and Data experts.

6 years ago

We are working in R&D on DNA-inspired data storage. The methods behind the DNA logic, and more so the mitochondrial mechanisms (assembly and disassembly) and the data transfer performed by mRNA, make it possible to obtain a truly scalable and fault-tolerant model, but not always to "save space" (the experiments made to save space use a limited, pre-structured information asset), because replication checks are faulty for data without a predefined schema ("known key ref: known values set"). Other algorithms can solve this problem by making a coherent data structure check. AI in this case does not help a lot; it helps only in algorithm R&D, not in usage, where usage has to be near real time or work at the filesystem level. In fact, those who have worked to store data as DNA have not worked on live data, but only on offline storage, using a pre-process (X-time)/post-process (Y-time) flow to save and read data. We are using a different approach to this.

Tim Wessels

Principal Technical Consultant @ Tim Wessels & Associates | Network Security

6 years ago

Well, it is easy to go from laboratory experiments to broad, overreaching expectations for the research. Last year IBM researchers were able to store one bit of data on a single atom and read it under highly controlled laboratory conditions. This was a significant experiment because currently it takes 100,000 atoms to store one bit of data and read it. Research can be fascinating, that's why people do it, but reducing research to practice takes a lot of time, and fatal flaws can lurk in the shadows.
