Shotgun Storage Solution
Will history be preserved with synthetic polymers decoded using shotgun sequencing?
In a new approach to molecular data storage, researchers have developed a sequencing method that enables targeted retrieval of information stored in synthetic polymers, overcoming traditional mass spectrometry (MS) constraints on polymer length and access. The study demonstrates that polymers could serve as high-density, durable alternatives to traditional digital storage media for archival data, with the ability to retrieve specific “bits” without reading the entire data sequence.
Synthetic polymers have shown potential for data storage due to their robustness and space-efficiency, but they have long been limited by the size constraints of MS. MS can accurately analyze polymers only up to a certain molecular weight before resolution and precision suffer, which restricts the length of chain that can be decoded in a single pass. Moreover, standard MS methods decode data sequentially, so reading a particular section of stored information requires processing the entire chain from start to finish.
To address this, researchers at Seoul National University, led by Kyoung Taek Kim, introduced shotgun sequencing to polymer data storage, a strategy allowing both extended storage lengths and random access to specific data segments within the chain.
In the study, researchers encoded a 512-bit binary sequence representing ASCII text into a polymer chain, using lactic acid for “1” and phenyllactic acid for “0.” To allow selective data retrieval, they incorporated fragmentation points along the polymer chain marked by mandelic acid monomers, which cause the chain to break into MS-compatible fragments. This fragmentation enabled the team to bypass MS length constraints, allowing each fragment to be independently analyzed while retaining sequence integrity.
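As a rough illustration of the encoding idea, the Python sketch below turns ASCII text into bits, maps each bit to a monomer label, and inserts a break marker at regular intervals. The single-letter labels (`L`, `P`, `M`), the function names, and the `fragment_bits` spacing are assumptions made for this example; the article does not describe how the researchers spaced the mandelic acid units or laid out the fragments.

```python
# Illustrative sketch only; labels and fragment spacing are hypothetical.
MONOMER_FOR_BIT = {"1": "L", "0": "P"}  # L = lactic acid, P = phenyllactic acid
BREAK_MARKER = "M"                      # M = mandelic acid (fragmentation point)


def text_to_bits(text):
    """Convert ASCII text to a string of 0s and 1s, 8 bits per character."""
    return "".join(f"{byte:08b}" for byte in text.encode("ascii"))


def bits_to_polymer(bits, fragment_bits=8):
    """Map bits to monomer labels, placing a break marker between chunks.

    fragment_bits is an assumed spacing, not a value from the study.
    """
    chunks = [bits[i:i + fragment_bits] for i in range(0, len(bits), fragment_bits)]
    return BREAK_MARKER.join(
        "".join(MONOMER_FOR_BIT[b] for b in chunk) for chunk in chunks
    )


def polymer_to_bits(polymer):
    """Recover the bit string by dropping break markers and inverting the map."""
    bit_for_monomer = {monomer: bit for bit, monomer in MONOMER_FOR_BIT.items()}
    return "".join(bit_for_monomer[m] for m in polymer if m != BREAK_MARKER)


def bits_to_text(bits):
    """Convert a bit string (length divisible by 8) back to ASCII text."""
    return bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8)).decode("ascii")
```

For example, `bits_to_polymer(text_to_bits("A"), fragment_bits=4)` yields two four-monomer chunks joined by one marker, and `bits_to_text(polymer_to_bits(...))` round-trips the original text.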
The decoding process involved activating these fragmentation points, yielding 18 separate fragments. Each fragment was identified, and custom software mapped the fragments by calculating mass differences. To ensure accuracy, the decoded sequence was verified against a cyclic redundancy check (CRC) error-detection code embedded in the data.
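The mass-difference idea can be sketched in Python as follows. The example assumes the instrument yields a "ladder" of masses in which each step adds one monomer, so the gap between neighbors identifies that monomer. The residue masses are approximate monoisotopic values for lactic acid (about 72.02 Da) and phenyllactic acid (about 148.05 Da) units; the tolerance, the function names, and the choice of CRC-32 from the standard library are assumptions, since the article does not say which CRC variant or software design the team used.

```python
import zlib

# Approximate monoisotopic residue masses in Da (acid minus water).
RESIDUE_MASS = {
    "1": 72.02,   # lactic acid unit
    "0": 148.05,  # phenyllactic acid unit
}


def bits_from_ladder(masses, tol=0.05):
    """Read bits from successive mass differences in an ascending ladder.

    tol is a hypothetical matching tolerance in Da.
    """
    bits = []
    for lighter, heavier in zip(masses, masses[1:]):
        diff = heavier - lighter
        matches = [bit for bit, mass in RESIDUE_MASS.items() if abs(diff - mass) <= tol]
        if len(matches) != 1:
            raise ValueError(f"unassignable mass difference: {diff:.3f}")
        bits.append(matches[0])
    return "".join(bits)


def append_crc(payload):
    """Append a 4-byte CRC-32 checksum (a stand-in for the study's CRC)."""
    return payload + zlib.crc32(payload).to_bytes(4, "big")


def verify_crc(message):
    """Return True if the trailing 4 bytes match the CRC-32 of the payload."""
    payload, checksum = message[:-4], message[-4:]
    return zlib.crc32(payload).to_bytes(4, "big") == checksum
```

A decoder built this way would reject a ladder whose gaps match neither monomer, and the checksum would then catch any remaining misreads in the assembled sequence.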
The technique also enables random access within the polymer. By mapping specific fragments through their encoded addresses, the researchers isolated sections of interest, such as the word “chemistry,” without reading the entire chain. This capability brings synthetic polymers closer to the flexible retrieval systems of traditional digital media, allowing selective data access and potentially reducing retrieval time.
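One way to picture address-based random access is the Python sketch below, in which every fragment starts with a short binary address followed by its payload bits. The layout, the `ADDRESS_BITS` and `payload_bits` values, and the function names are all hypothetical; the article says only that addresses are encoded, not how.

```python
ADDRESS_BITS = 4  # hypothetical address width, allowing up to 16 fragments


def make_fragments(bits, payload_bits=16):
    """Split a bit string into fragments, each prefixed with its binary address."""
    starts = range(0, len(bits), payload_bits)
    if len(starts) > 2 ** ADDRESS_BITS:
        raise ValueError("too many fragments for the address width")
    return [
        f"{index:0{ADDRESS_BITS}b}" + bits[start:start + payload_bits]
        for index, start in enumerate(starts)
    ]


def read_range(fragments, first_byte, n_bytes, payload_bits=16):
    """Decode n_bytes of ASCII starting at first_byte.

    Only the fragments whose addresses cover the requested range are read.
    Returns the decoded text and the list of addresses that were needed.
    """
    by_address = {int(f[:ADDRESS_BITS], 2): f[ADDRESS_BITS:] for f in fragments}
    start_bit = first_byte * 8
    end_bit = start_bit + n_bytes * 8
    first_addr = start_bit // payload_bits
    last_addr = (end_bit - 1) // payload_bits
    needed = list(range(first_addr, last_addr + 1))
    bits = "".join(by_address[address] for address in needed)
    offset = start_bit - first_addr * payload_bits
    selected = bits[offset:offset + n_bytes * 8]
    text = bytes(
        int(selected[i:i + 8], 2) for i in range(0, len(selected), 8)
    ).decode("ascii")
    return text, needed
```

Because each fragment carries its own address, the fragments can arrive in any order, and a request for one word touches only the fragments that contain it.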
Beyond data capacity improvements, shotgun sequencing could transform molecular storage by enabling high-volume encoding while reducing physical storage needs, making it particularly suitable for archival applications in secure and long-term data storage, according to the authors.