Storing Images in DNA via base128 Encoding

Bioinformatics

  • Kun Wang

    The Key Laboratory of Advanced Design and Intelligent Computing, Ministry of Education, School of Software Engineering, Dalian University, Dalian 116622, China

  • Ben Cao

    School of Computer Science and Technology, Dalian University of Technology, Dalian 116024, China

  • Tao Ma

    Brain Function Research Section, China Medical University, Shenyang 110001, China

  • Yunzhu Zhao

    The Key Laboratory of Advanced Design and Intelligent Computing, Ministry of Education, School of Software Engineering, Dalian University, Dalian 116622, China

  • Yanfen Zheng

    School of Computer Science and Technology, Dalian University of Technology, Dalian 116024, China

  • Bin Wang*

    The Key Laboratory of Advanced Design and Intelligent Computing, Ministry of Education, School of Software Engineering, Dalian University, Dalian 116622, China

    *Email: [email protected]

  • Shihua Zhou*

    The Key Laboratory of Advanced Design and Intelligent Computing, Ministry of Education, School of Software Engineering, Dalian University, Dalian 116622, China

    *Email: [email protected]

  • Qiang Zhang

    The Key Laboratory of Advanced Design and Intelligent Computing, Ministry of Education, School of Software Engineering, Dalian University, Dalian 116622, China

Journal of Chemical Information and Modeling

Cite this: J. Chem. Inf. Model. 2024, 64, 5

Published February 22, 2024

Copyright © 2024 American Chemical Society

Abstract

Current DNA storage schemes lack flexibility and consistency when processing highly redundant and correlated image data, which results in low sequence stability and low image reconstruction rates. To address this, and based on the characteristics of image storage, this paper proposes storing images in DNA via base128 encoding (DNA-base128). In the data writing stage, the data are segmented and block probabilities are computed; encoding is then achieved by associating data block frequencies with a constraint encoding set. When the image is recovered, DNA-base128 performs internal error correction through threshold setting and drift comparison. Compared with representative work, DNA-base128 reduced undesired motifs by 71.2–90.7% and reduced the local guanine-cytosine (GC) content variance threefold, indicating that DNA-base128 can store images more stably. In addition, the structural similarity index (SSIM) and multiscale structural similarity (MS-SSIM) of images reconstructed using DNA-base128 improved by 19–102% and 6.6–20.3%, respectively. In summary, DNA-base128 provides image encoding with internal error correction and offers a potential solution for DNA image storage. The data and code are available at the GitHub repository: https://github.com/123456wk/DNA_base128.
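The writing stage described in the abstract (segment the data, count block frequencies, then pair blocks with a constraint-satisfying codeword set) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the codeword length of 5 nt, the constraints (2 or 3 G/C bases per word, no run of three identical bases, no doubled base at either end of a word), the `adjacent_repeats` quality score, and every function name below are assumptions chosen for illustration. The threshold and drift-based error correction is not covered.

```python
from collections import Counter
from itertools import product

WORD_LEN = 5  # assumed codeword length; the abstract does not specify one


def adjacent_repeats(word):
    """Hypothetical quality score: number of adjacent identical base pairs."""
    return sum(a == b for a, b in zip(word, word[1:]))


def is_allowed(word):
    """Assumed constraints: balanced GC, no run of 3, no doubled base at either end."""
    gc_count = sum(base in "GC" for base in word)
    no_run = not any(base * 3 in word for base in "ACGT")
    clean_ends = word[0] != word[1] and word[-1] != word[-2]
    return gc_count in (2, 3) and no_run and clean_ends


def build_codeword_set(size=128):
    """Enumerate all 5-mers, keep the allowed ones, best quality score first."""
    words = ["".join(t) for t in product("ACGT", repeat=WORD_LEN)]
    words = sorted(filter(is_allowed, words), key=lambda w: (adjacent_repeats(w), w))
    if len(words) < size:
        raise ValueError("constraints leave too few codewords")
    return words[:size]


def to_7bit_blocks(data):
    """Segment bytes into 7-bit blocks (128 possible values), zero-padding the tail."""
    bits = "".join(f"{byte:08b}" for byte in data)
    bits += "0" * (-len(bits) % 7)
    return [int(bits[i:i + 7], 2) for i in range(0, len(bits), 7)]


def from_7bit_blocks(blocks, n_bytes):
    """Inverse of to_7bit_blocks; n_bytes tells us where the padding starts."""
    bits = "".join(f"{value:07b}" for value in blocks)
    return bytes(int(bits[i:i + 8], 2) for i in range(0, n_bytes * 8, 8))


def encode(data):
    """Give the most frequent blocks the best-scoring codewords."""
    blocks = to_7bit_blocks(data)
    freq = Counter(blocks)
    ranked = sorted(range(128), key=lambda v: (-freq[v], v))
    table = dict(zip(ranked, build_codeword_set()))
    return "".join(table[v] for v in blocks), table


def decode(dna, table, n_bytes):
    """Look each codeword up in the inverted table and rebuild the bytes."""
    inverse = {word: value for value, word in table.items()}
    blocks = [inverse[dna[i:i + WORD_LEN]] for i in range(0, len(dna), WORD_LEN)]
    return from_7bit_blocks(blocks, n_bytes)
```

Because no codeword in this sketch begins or ends with a doubled base, concatenating codewords never produces a run of three identical bases, and every 5 nt window aligned to a codeword has a GC fraction of 0.4 or 0.6. The mapping table depends on the input, so it must be stored alongside the sequence for decoding.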

Cited By

This article is cited by 9 publications.

  1. Jiadong Wang, Bin Wang, Shihua Zhou, Ben Cao, Wei Li, Pan Zheng. DNACSE: Enhancing Genomic LLMs with Contrastive Learning for DNA Barcode Identification. Journal of Chemical Information and Modeling 2026, 66 (2) , 976-993. https://doi.org/10.1021/acs.jcim.5c02747
  2. Ying Zhou, Kun Bi, Qi Xu, Quanjun Liu, Xiangwei Zhao, Qinyu Ge, Zuhong Lu. Ultrafast and Accurate DNA Storage and Reading Integrated System Via Microfluidic Magnetic Beads Polymerase Chain Reaction. ACS Nano 2025, 19 (7) , 7306-7316. https://doi.org/10.1021/acsnano.4c17817
  3. Xiang Liu, Yanfen Zheng, Xue Li, Bin Wang, Shihua Zhou, Ben Cao, Pan Zheng. An end-to-end DNA storage coding method based on a low-complexity multiple biological constraints loss and RL-inspired differentiable solver. Expert Systems with Applications 2026, 315 , 131726. https://doi.org/10.1016/j.eswa.2026.131726
  4. Xue Li, Yanfen Zheng, Qi Shao, Jiadong Wang, Wei Li, Bin Wang, Shihua Zhou, Ben Cao, Pan Zheng. Highly biased DNA sequence reconstruction in DNA storage with multi-scale attention mechanism and contrast learning. Synthetic and Systems Biotechnology 2026, 12 , 422-432. https://doi.org/10.1016/j.synbio.2026.01.028
  5. Qi Xu, Ying Zhou, Qingjiang Sun, Xiangwei Zhao, Zuhong Lu, Kun Bi. DNA-CTMF: Reconstruct high quality image from lossy DNA storage via Pixel-Base codebook and median filter. Synthetic and Systems Biotechnology 2025, 10 (3) , 925-935. https://doi.org/10.1016/j.synbio.2025.04.015
  6. Qi Xu, Yitong Ma, Zuhong Lu, Kun Bi. DP-ID: Interleaving and Denoising to Improve the Quality of DNA Storage Image. Interdisciplinary Sciences: Computational Life Sciences 2025, 17 (2) , 306-320. https://doi.org/10.1007/s12539-024-00671-6
  7. Guanjin Qu, Zihui Yan, Xin Chen, Huaming Wu. DNA data storage for biomedical images using HELIX. Nature Computational Science 2025, 5 (5) , 397-404. https://doi.org/10.1038/s43588-025-00793-x
  8. Qi Xu, Zuhong Lu, Kun Bi. DNA-LSIED: DNA lossy storage for images by encryption and corrective denoising method. Signal, Image and Video Processing 2025, 19 (1) https://doi.org/10.1007/s11760-024-03587-2
  9. Jianxia Zhang. Levy Sooty Tern Optimization Algorithm Builds DNA Storage Coding Sets for Random Access. Entropy 2024, 26 (9) , 778. https://doi.org/10.3390/e26090778
