Liquid biopsy profiling of cell-free DNA (cfDNA) in blood holds great promise to transform how cancer is experienced and managed, through early detection and through identification of residual disease and cancer subtype. While early work in liquid biopsy focused on identifying actionable somatic variants at specific loci, the past decade has seen an expansion into epigenetic features, notably methylation. 5-methylcytosine (5-mC) profiles differ between cancer and non-cancer at many more loci and so provide a stronger signal. Moreover, recent research has suggested that 5-hydroxymethylcytosine (5-hmC) profiles in cfDNA can be a marker for early cancer. However, a standard blood draw yields on average only 10 ng of cfDNA, which raises the question of how to obtain maximum information from a limited sample.
We will present the application of a technology that sequences input DNA fragments at single-base resolution, reading the complete genetic sequence together with the modification status (unmodified, 5-mC or 5-hmC) of each CpG, from low-nanogram quantities of cfDNA. Using this technology, we generated whole-genome 6-base data (A, C, G, T, 5-mC and 5-hmC) on cfDNA extracted from the plasma of healthy volunteers and of patients with colorectal cancer at stages I-IV. We demonstrate how the technology can be used to extract as much information as possible from a cfDNA sample.
We compare differentially methylated (5-mC) and hydroxymethylated (5-hmC) regions, identify copy number variation and clinically relevant germline and somatic single nucleotide polymorphisms, and profile multiple fragmentomic features, including fragment size, end motif and nucleosome position, across different stages of colorectal cancer (CRC). We demonstrate that the AUC for distinguishing Stage I CRC from healthy volunteers is 0.69 or 0.72 when a classifier is trained using only 5-mC or only 5-hmC features, respectively, and that it increases to 0.93 when the classifier is trained on the combination of 5-mC and 5-hmC features. We highlight individual CRC-relevant features where resolving 5-mC from 5-hmC distinguishes between healthy, Stage I and Stage IV samples, and show that these differences would be invisible using 5-mC information alone. We propose that the ability to measure 5-mC and 5-hmC at high accuracy and single-base resolution, alongside genomic and fragmentomic profiles, from a limited quantity of DNA will give greater insight into the circulating tumor DNA (ctDNA) in plasma and enable the development of more accurate liquid-biopsy-based disease detection.
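For readers less familiar with the AUC figures quoted above, the following is a minimal sketch of how an AUC can be computed from per-sample scores and how combining two score sets can raise it. It is an illustration only: the `auc` helper, the toy scores and the simple sum used as a combination rule are all assumptions introduced here, and none of them are the classifier, features or data of this study.

```python
def auc(scores, labels):
    """Area under the ROC curve via the rank (Mann-Whitney) formulation:
    the fraction of case/control pairs in which the case scores higher,
    counting ties as half a win."""
    cases = [s for s, y in zip(scores, labels) if y == 1]
    controls = [s for s, y in zip(scores, labels) if y == 0]
    if not cases or not controls:
        raise ValueError("need at least one case and one control")
    wins = 0.0
    for c in cases:
        for h in controls:
            if c > h:
                wins += 1.0
            elif c == h:
                wins += 0.5
    return wins / (len(cases) * len(controls))


# Invented toy scores (NOT data from this study): one 5-mC-derived and one
# 5-hmC-derived score per sample; label 1 = cancer, 0 = healthy.
labels = [1, 1, 1, 1, 0, 0, 0, 0]
mc_score = [0.9, 0.2, 0.8, 0.3, 0.5, 0.1, 0.4, 0.3]
hmc_score = [0.2, 0.9, 0.3, 0.8, 0.1, 0.5, 0.4, 0.3]

# Hypothetical combination rule: a plain sum, standing in for a trained
# classifier that sees both feature types.
combined = [m + h for m, h in zip(mc_score, hmc_score)]

print("5-mC only :", auc(mc_score, labels))
print("5-hmC only:", auc(hmc_score, labels))
print("combined  :", auc(combined, labels))
```

In this toy set each score alone separates only some of the cases from the controls, while the summed score separates all of them. This mirrors the kind of complementarity the abstract reports, but says nothing about its magnitude in real cfDNA data.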