DNA comprises molecular information stored in genetic and epigenetic bases, both of which are vital to our understanding of biology. In human genomes, epigenetic modification at the fifth carbon of cytosine bases is a fundamental pathway by which genes can be silenced or activated. Widely used methods for detecting epigenetic modifications at cytosine bases rely either on deamination of unmodified cytosines so that they read as thymine, or on borane reduction of oxidised modified cytosine bases so that they read as thymine. As a result, such methods fail to capture common C-to-T mutations and, importantly, also fail to distinguish 5-methylcytosine (5mC, mainly found in silenced parts of the genome) from 5-hydroxymethylcytosine (5hmC, enriched in active gene bodies and enhancers). Hence, existing methods are unable to read the complete information stored in our genomes in a single workflow.
Here, we build on our newly reported duet multiomics solution +modC, which has the unique ability to read both genetic bases and epigenetic cytosine modifications (modC) in a single workflow. By introducing a high-fidelity methyltransferase during the workflow, we are able to construct an expanded information table that can be used to deconvolute A, T, C, G, 5mC and 5hmC simultaneously in a single read within a DNA molecule. Using synthetic controls, we demonstrate the performance of this technology and report the error rates associated with the method. Finally, we use the mouse embryonic stem cell line ES-E14TG2A to produce, for the first time, a simultaneous reading of the genome and epigenome at high depth, and show how these epigenetic modifications are segregated across the genome.
In summary, using our modified approach we demonstrate simultaneous, phased reading of all six genetic and epigenetic bases. This tool provides a more complete picture of the information stored in genomes and has applications throughout biology and medicine.