DeepSME: De Novo Nanopore Basecalling of Motif-insensitive DNA Methylation and Alignment-free Digital Information Decryptions at Single-Molecule Level

PRODUCTS USED

Genes

ABSTRACT

Biomolecular encryption based on chemical modifications enables secure approaches to information storage and communication. However, constructing high-information-density pathways for rapid synthesis and readout that guarantee confidentiality, integrity, and availability (CIA) remains a challenge. Here we develop a nanopore-sequencing-based protocol, demonstrated by complete substitution with 5-hydroxymethylcytosine (5hmC), that relies on individual nucleotide recognition rather than sequential interactions. Such motif-insensitive methylation at the single-molecule level does not exist naturally; it causes severe ion-current disruption and a 67.2%-100% readout failure, which ensures that the data encoded in the DNA are encrypted. We further propose and establish DeepSME, an alignment-free basecaller built on a deep-learning platform that is independent of prior models and prior knowledge. DeepSME uses a three-stage training pipeline that starts from a tolerable error rate of 11.55%, expands its neighboring k-mer dictionary model size from 4^6 to 4^9, and mitigates the errors using only three microbial genomes, giving rise to 92% precision with 92% recall. Fully 5hmC-encrypted digital information was deciphered by DeepSME within 16× coverage depth. The versatile and transparent DeepSME pipeline and its F1-score of 86.4%, which surpasses all state-of-the-art basecallers, support its great potential for meeting the rapidly increasing CIA demands of DNA-based secure communications.
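To give a sense of scale for the k-mer dictionary expansion from 4^6 to 4^9, the following is a minimal Python sketch, not the DeepSME implementation. It assumes a four-letter alphabet in which 5hmC takes the place of C, and the helper names (`kmer_dictionary_size`, `kmer_index`, `kmers`) are hypothetical and chosen only for illustration.

```python
# Illustrative sketch only: the abstract does not describe how DeepSME
# stores its k-mer dictionary. All names here are hypothetical.

BASES = "ACGT"  # assumption: four symbols, with C standing in for 5hmC


def kmer_dictionary_size(k: int) -> int:
    """Number of distinct k-mers over a four-letter alphabet (4^k)."""
    return len(BASES) ** k


def kmer_index(kmer: str) -> int:
    """Map a k-mer to its position in a dictionary ordered as base-4 numbers."""
    index = 0
    for base in kmer:
        index = index * len(BASES) + BASES.index(base)
    return index


def kmers(sequence: str, k: int) -> list[str]:
    """List every overlapping k-mer in a sequence, left to right."""
    return [sequence[i:i + k] for i in range(len(sequence) - k + 1)]


if __name__ == "__main__":
    for k in (6, 9):
        print(f"k={k}: {kmer_dictionary_size(k)} entries")
    print(kmers("ACGTAC", 3))
```

Moving from 6-mers to 9-mers multiplies the number of dictionary entries by 4^3 = 64, so each entry has to be learned from correspondingly more signal context.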
