More information from limited DNA: simultaneous measurement of genetics, 5hmC and 5mC in cell-free DNA

Credits

  • Fabio Puddu
  • Tom Charlesworth
  • Robert Crawford
  • Nick Harding
  • Riccha Sethi
  • Jamie Scotcher
  • Annelie Johansson
  • Ermira Leslie
  • Aurelie Modat
  • Michael S Wilson
  • Páidí Creed

1. Introduction

Liquid biopsy for profiling of cell-free DNA (cfDNA) in blood holds huge promise to transform how we experience and manage cancer, through early detection and through identification of residual disease and subtype. However, a standard blood draw yields an average of only 10 ng of cfDNA, of which DNA derived from the tumour is a small minority.

Genetic and methylation data together have been shown to be more powerful for the detection of early cancer than either alone. Constrained to measuring four states of information, existing NGS-based technologies sacrifice genetic information for methylation calling.

duet multiomics solution evoC is a new sequencing technology that simultaneously derives all four genetic bases, without ambiguity in C or T calls, and distinguishes 5‑methylcytosine from 5‑hydroxymethylcytosine (6-base data) in a single read from a single DNA molecule. The technology consists of a pre-sequencing library prep and a post-sequencing analysis pipeline, providing single-base resolution of genetics and epigenetics at high accuracy.

State   Standard sequencing protocol   Protocol with C→T deamination
1       A                              A
2       C/mC/hmC                       mC/hmC
3       G                              G
4       T                              C/T
How biomodal duet multiomics works
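
The four-state constraint in the table above can be made concrete with a small sketch. It is a toy model only: the state labels and the two readout dictionaries are transcribed from the table, and the function name is introduced here for illustration.

```python
# Toy model of the table above: which true base states collapse onto the
# same reported letter under each protocol. Labels are illustrative only.
TRUE_STATES = ["A", "C", "mC", "hmC", "G", "T"]

# Standard sequencing: all three cytosine forms are reported as C.
STANDARD_READOUT = {"A": "A", "C": "C", "mC": "C", "hmC": "C", "G": "G", "T": "T"}

# C->T deamination: unmodified C is reported as T; modified C stays C.
DEAMINATION_READOUT = {"A": "A", "C": "T", "mC": "C", "hmC": "C", "G": "G", "T": "T"}


def ambiguity_groups(readout):
    """Group the true states by the letter the sequencer reports."""
    groups = {}
    for state in TRUE_STATES:
        groups.setdefault(readout[state], []).append(state)
    return groups


standard = ambiguity_groups(STANDARD_READOUT)
deaminated = ambiguity_groups(DEAMINATION_READOUT)

for name, groups in (("standard", standard), ("C->T deamination", deaminated)):
    for letter, states in sorted(groups.items()):
        print(f"{name:>18}: reads {letter} <- {'/'.join(states)}")
```

Both protocols report only four letters for six underlying states. The deamination protocol gains modification calls at the cost of confounding unmodified C with T, which is the loss of genetic information described above.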

2. duet multiomics solution evoC

  1. Strand synthesis: creates a single molecule with a direct copy of the original information tethered together with a hairpin. The copy strand is without cytosine modifications initially, but importantly, utilises a high fidelity methyltransferase to copy over only 5mC from the original to the copy strand.
  2. Sequencing paired-end read: generates sequence information after protection of cytosine modifications followed by deamination of all remaining cytosines (read as thymine in NGS).
  3. Read resolution: aligns original and copy strands to correctly call all 4 canonical bases along with 5mC and 5hmC.
  4. Tagging: aligned (4-base) reads are tagged with their 5mC and 5hmC calls (6-base information).
The duet evoC workflow for 6-base sequencing

duet multiomics solution evoC is a 6-base calling technology that reads all four canonical bases plus 5mC and 5hmC.
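
The read-resolution idea in the workflow above can be sketched conceptually as follows. This is a simplified toy and not biomodal's decoding table: the per-position encoding (the letter read from the original strand, the genetic base implied by the copy strand, and a flag for whether 5mC was copied across) and both function names are assumptions made for illustration.

```python
# Conceptual toy of step 3 (read resolution). Assumptions, not the real
# pipeline: each position is summarised as
#   original_letter - letter read from the original strand after protection
#                     and deamination (unmodified C reads as T)
#   copy_base       - genetic base implied by the copy strand
#   copy_has_5mc    - True if the methyltransferase copied a 5mC mark across
CYTOSINE_STATES = ("C", "mC", "hmC")


def simulate(true_state):
    """Return (original_letter, copy_base, copy_has_5mc) for one true state."""
    genetic = "C" if true_state in CYTOSINE_STATES else true_state
    protected = true_state in ("mC", "hmC")
    original_letter = "T" if (genetic == "C" and not protected) else genetic
    copy_has_5mc = true_state == "mC"  # only 5mC is copied to the copy strand
    return original_letter, genetic, copy_has_5mc


def resolve(original_letter, copy_base, copy_has_5mc):
    """Combine the two strands into one of six calls."""
    if copy_base != "C":
        if original_letter != copy_base:
            raise ValueError("strands disagree; position cannot be resolved")
        return copy_base
    if original_letter == "T":
        return "C"  # deaminated on the original strand, so unmodified
    if original_letter == "C":
        return "mC" if copy_has_5mc else "hmC"
    raise ValueError("strands disagree; position cannot be resolved")


for state in ["A", "C", "mC", "hmC", "G", "T"]:
    print(state, "->", simulate(state), "->", resolve(*simulate(state)))
```

In this toy, the copy strand removes the C/T ambiguity and the copied 5mC mark separates 5mC from 5hmC, so every one of the six states is recovered from a single tethered molecule.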

3. Accurate genetic and epigenetic data

duet evoC provides complete, accurate genome and methylome information. Genomic information is provided in full, at accuracy equal to or higher than that of other genetic and epigenetic methods (A).

SNP calling accuracy achieved with biomodal duet vs other technologies
Epigenetic calling accuracy achieved with biomodal duet evoC for 5mC and 5hmC

Simultaneously, full methylome information is provided at high accuracy for both 5mC and 5hmC (B). The data in (A) were generated on the Genome in a Bottle reference materials.

4. Accurate detection of genetic variants in a CRC patient cohort

cfDNA is thought to enter the bloodstream through apoptosis or necrosis, with cfDNA from healthy and cancer tissues released into the blood of cancer patients. To assess the ability of duet evoC to generate multimodal information from liquid biopsy samples, we obtained and sequenced cfDNA from 87 individuals, ranging from healthy volunteers to stage IV colorectal cancer (CRC) patients. (A) Somatic variants were called on duet evoC data using Mutect2 in tumour-only mode, and the presence of pathogenic or likely pathogenic variants associated with CRC is shown. An increase in the prevalence of these variants from stage I to stage IV patients mirrors an increase in the amount of ctDNA (B), as estimated by ichorCNA.

Somatic variants associated with CRC detected in healthy and stage I-IV CRC cfDNA samples using duet evoC

5. Accurate epigenetic information for CRC detection

DMRs detected in stage IV CRC cfDNA samples using duet evoC
Differential methylation and hydroxymethylation in stage IV CRC cfDNA samples detected using duet evoC
Schematic for using modality to analyse 6-base readouts from duet evoC
ROC curves demonstrate combined 5mC and 5hmC information better separates healthy and stage I CRC cfDNA samples than 5mC or 5hmC alone
(E) Candidate features identified using LOOCV 5mC and 5hmC model ranked by the number of times they were selected. (F) Analysing 5hmC improves the ability to distinguish between CRC stages in genomic regions with subtle 5mC differences.
  1. DMRs identified from stage IV CRC tissue in the TCGA-COAD cohort using Infinium Human Methylation 450K arrays were reproduced from 5mC levels in cfDNA obtained from stage IV CRC patients using duet evoC (12 CRC; 24 healthy volunteers). Cell-free DMRs where 5mC was greater in stage IV were defined as hypermethylated, and vice versa (p < 0.05; t-test).
  2. Within these DMRs from the TCGA-COAD cohort, duet evoC captured differential (hydroxy)methylation in cfDNA from stage IV CRC patients.
  3. Analysing cfDNA with duet evoC produces 6-base readouts that can be summarised across regions with ease using modality, part of the duet analysis suite.
  4. ROC curves demonstrating that combining 5mC and 5hmC features in cfDNA improved separation of stage I CRC patients from healthy volunteers, when compared with using 5mC or 5hmC alone. Candidate features were defined as 5mC and 5hmC fractions of regions defined as stage IV tissue DMRs in TCGA-COAD, calculated from cfDNA from 24 stage I CRC patients and 25 healthy volunteers. Using glmnet, generalised linear models were trained on either 5mC or 5hmC features, or both (5mC + 5hmC), and evaluated using a leave-one-out cross-validation (LOOCV) approach.
  5. Candidate features ranked by the number of times they were selected in each train/test split during LOOCV of the (5mC + 5hmC) model.
  6. Analysing 5hmC improved the ability to distinguish between CRC stages in genomic regions with subtle 5mC differences. 5hmC (but not 5mC) fractions of regions overlapping KIFC3 and CDH4 were selected as features in every LOOCV split. Averages were taken across the stage I CRC patients and healthy volunteers from (D), and cfDNA from 12 additional stage IV CRC patients.
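
The leave-one-out evaluation described in items 4 and 5 can be sketched as below. This is a minimal illustration under stated assumptions: a nearest-centroid scorer stands in for the glmnet generalised linear models used on the poster, the feature values are synthetic and invented for the example, and all function names are hypothetical.

```python
import random


def auc(pos_scores, neg_scores):
    """Rank-based AUC: chance that a positive outscores a negative (ties = 0.5)."""
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            wins += 1.0 if p > n else 0.5 if p == n else 0.0
    return wins / (len(pos_scores) * len(neg_scores))


def centroid(rows):
    """Column-wise mean of a list of feature vectors."""
    return [sum(col) / len(col) for col in zip(*rows)]


def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def loocv_scores(features, labels):
    """Score each sample with a model fitted on all the other samples."""
    scores = []
    for i, held_out in enumerate(features):
        train = [(f, y) for j, (f, y) in enumerate(zip(features, labels)) if j != i]
        crc = centroid([f for f, y in train if y == 1])
        healthy = centroid([f for f, y in train if y == 0])
        # Stand-in scorer (NOT glmnet): higher = closer to the CRC centroid.
        scores.append(sq_dist(held_out, healthy) - sq_dist(held_out, crc))
    return scores


def loocv_auc(features, labels):
    scores = loocv_scores(features, labels)
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    return auc(pos, neg)


# Synthetic data only: 24 "stage I" and 25 "healthy" samples, one invented
# 5mC fraction and one invented 5hmC fraction per sample.
rng = random.Random(0)
labels = [1] * 24 + [0] * 25
mc = [[rng.gauss(0.55 if y else 0.50, 0.05)] for y in labels]
hmc = [[rng.gauss(0.08 if y else 0.10, 0.02)] for y in labels]
both = [a + b for a, b in zip(mc, hmc)]

for name, feats in (("5mC", mc), ("5hmC", hmc), ("5mC + 5hmC", both)):
    print(name, round(loocv_auc(feats, labels), 3))
```

Each held-out sample is scored by a model that never saw it, so the AUC is computed only from out-of-sample scores. Counting how often a feature is retained across the per-split fits, as in item 5, would need a sparse model such as the penalised regression used on the poster.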

6. Copy Number Variation

Copy number variation in healthy and stage I-IV CRC cfDNA samples generated using duet evoC.

Copy number information extracted with CNVkit from 10 cfDNA samples for each stage shows an increasing prevalence of chromosomal aberrations in later stages of CRC. Note that some chromosomes appear to be consistently rearranged in several samples (blue arrows).

7. Fragmentomics information

Nucleosomes partially protect cfDNA from degradation as is evident from the mononucleosomal distribution of fragment-length profiles of cfDNA. This metric is used in liquid biopsy because circulating tumour DNA (ctDNA) fragments are shorter than regular cfDNA fragments.

Fragment length profiles for healthy and stage I-IV CRC cfDNA samples generated using duet evoC
Proportion of healthy and stage I-IV CRC cfDNA fragments under 140bp generated using duet evoC
Correlation between end motif frequencies obtained from duet evoC and Illumina WGS.
  1. Fragment length distributions obtained from the CRC cohort using duet evoC and 250bp-long reads are in line with the expected increase in the abundance of shorter fragments in late-stage CRC patients.
  2. This can also be observed as an increase in the proportion of fragments shorter than 140bp, a metric that is accessible with the more common 150bp-long reads.
  3. High correlation between end motif frequencies obtained from duet evoC reads and regular Illumina reads for the same sample indicates that duet evoC can also be used to accurately measure this cfDNA feature. A sample for each of the indicated stages is shown.
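
The fragment-level metrics above are simple to compute once fragment lengths and sequences are available. The sketch below is illustrative: the 140bp cutoff comes from the text, while the 4-base motif length, the function names, and the input values are assumptions.

```python
from collections import Counter
from math import sqrt


def short_fragment_fraction(lengths, cutoff=140):
    """Fraction of fragments strictly shorter than `cutoff` bp."""
    if not lengths:
        raise ValueError("no fragment lengths supplied")
    return sum(1 for n in lengths if n < cutoff) / len(lengths)


def end_motif_frequencies(fragments, k=4):
    """Relative frequency of the first k bases (5' end motif) of each fragment.

    k=4 is an assumption; the motif length is not stated in the text.
    """
    counts = Counter(seq[:k].upper() for seq in fragments if len(seq) >= k)
    total = sum(counts.values())
    return {motif: c / total for motif, c in counts.items()}


def pearson(freq_a, freq_b):
    """Pearson correlation between two motif-frequency dictionaries."""
    motifs = sorted(set(freq_a) | set(freq_b))
    xs = [freq_a.get(m, 0.0) for m in motifs]
    ys = [freq_b.get(m, 0.0) for m in motifs]
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Invented inputs, for illustration only.
lengths = [100, 139, 140, 167]
fragments = ["CCCAGT", "CCCATT", "ACGTAA", "CCCAAA"]
freqs = end_motif_frequencies(fragments)
print(short_fragment_fraction(lengths), freqs)
```

Comparing motif frequencies from two library types for the same sample with `pearson` mirrors the comparison in item 3, although the poster does not state which correlation statistic was used.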

8. Variant-associated methylation

Simultaneous sequencing of genetic and epigenetic information allows phasing of the methylation/hydroxymethylation signal with heterozygous variants, and detection of variant-associated methylation across approximately 300 bp around the variant of interest.

In this panel an example from the NDUFA9 gene is shown, where the G allele is associated with near complete methylation, while the A allele displays

Variant-associated methylation identified using duet evoC
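
Because each read carries both the variant allele and the modification calls, phasing reduces to grouping reads by allele and tallying calls within each group. The sketch below assumes a hypothetical input format (an allele label plus a list of per-CpG calls per read) and uses invented values; it is not the output format of the duet pipeline.

```python
from collections import defaultdict


def methylation_by_allele(reads):
    """Summarise 5mC and 5hmC fractions per allele.

    `reads` is an iterable of (allele, calls), where `calls` lists the state
    ('C', 'mC' or 'hmC') of each CpG cytosine covered by that read.
    This input format is an assumption made for illustration.
    """
    tallies = defaultdict(lambda: {"C": 0, "mC": 0, "hmC": 0})
    for allele, calls in reads:
        for call in calls:
            tallies[allele][call] += 1

    summary = {}
    for allele, t in tallies.items():
        total = sum(t.values())
        if total:
            summary[allele] = {"5mC": t["mC"] / total, "5hmC": t["hmC"] / total}
    return summary


# Invented reads for two hypothetical alleles at a heterozygous site.
example_reads = [
    ("ref", ["mC", "mC"]),
    ("ref", ["mC", "hmC"]),
    ("alt", ["C", "C"]),
    ("alt", ["C", "mC"]),
]
phased = methylation_by_allele(example_reads)
for allele, fractions in sorted(phased.items()):
    print(allele, fractions)
```

A large difference in the per-allele fractions at a heterozygous site is the kind of signal described here as variant-associated methylation.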

9. Conclusion

We have presented data illustrating the potential of duet evoC for liquid biopsy. With duet evoC it is possible to obtain multimodal information, including SNPs, methylation, hydroxymethylation, fragmentomics, copy number variation and novel two-dimensional biomarkers, from a single low-input sample of cfDNA. Further, we have demonstrated improved ability to differentiate between healthy and stage IV CRC using combined methylation and hydroxymethylation features.

duet evoC, which provides 6-base sequencing, is available to order now as a product from biomodal.

Cambridge Epigenetix is now biomodal