Analyse multimodal data at scale

Providing insight and guidance for future experiments in a single workflow

The duet multiomics solution

It combines the duet assay and the duet software package to investigate genetic sequence, 5‑methylcytosine (5mC) and 5‑hydroxymethylcytosine (5hmC) simultaneously.

Our comprehensive software and data analysis package allows you to easily identify methylation profiles, genetic variants, and variant-associated methylation in a single sample or multi-sample cohort.

Explore the duet software solution

1. Core processing

Takes output data directly from your sequencing run and processes it into a combined genetic and epigenetic format ready for further analysis.

Steps

Read resolution: pairs the original and copy strand sequences from each DNA fragment, and combines them into a single resolved sequence encoding the genetic and epigenetic status at each base.
Adapter trimming: removes any hairpin or sequencing adapter bases using cutadapt
Quality filtering

Outputs

Resolved FASTQ files ready for further analysis

2. Genome alignment

Aligns the resolved sequence data against a reference genome to report on variants, and against the methylation controls to assess the quality of genetic and epigenetic calling.

Steps

Align against your defined reference genome and the spike-in controls using BWA-MEM
Duplicates are removed using Picard MarkDuplicates
Epigenetic quantification – methylation status is counted at each CpG site by default, and at CHG and CHH sites if you request them
Accuracy of genetic and epigenetic calls is measured using the spike-in controls
Variant calling – germline variants are called using GATK HaplotypeCaller, and you can optionally call somatic variants using Mutect2
Epigenetic calls are conserved from FASTQ to BAM files using MM tags
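
The CpG, CHG and CHH contexts mentioned in the steps above follow the standard definition, where H is any base other than G. The following is a minimal Python sketch of that classification on a forward-strand sequence. It is an illustration only, not the duet pipeline's code, and the function names `cytosine_context` and `count_contexts` are hypothetical.

```python
from collections import Counter
from typing import Optional


def cytosine_context(seq: str, pos: int) -> Optional[str]:
    """Classify the cytosine at seq[pos] as 'CpG', 'CHG' or 'CHH'.

    Forward strand only. Returns None if the base is not a C or if there
    is not enough unambiguous downstream sequence to decide.
    """
    seq = seq.upper()
    if seq[pos] != "C":
        return None
    downstream = seq[pos + 1 : pos + 3]
    if downstream[:1] == "G":
        return "CpG"  # C followed directly by G
    if len(downstream) < 2 or "N" in downstream:
        return None  # context cannot be determined
    # H = A, C or T; the first downstream base is already known not to be G
    return "CHG" if downstream[1] == "G" else "CHH"


def count_contexts(seq: str) -> Counter:
    """Count how many cytosines fall into each context."""
    contexts = (cytosine_context(seq, i) for i in range(len(seq)))
    return Counter(c for c in contexts if c is not None)
```

For example, `count_contexts("CCGCAGCTT")` finds one CpG, two CHG and one CHH cytosine.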

Outputs

BAM files including MM tags
Variant Call Format (VCF) files
duet Cytosine report for epigenetic quantification
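
As a rough illustration of how MM tags in the BAM output could be read downstream, the sketch below decodes an MM string into read positions. It assumes the tags follow the MM layout of the SAM optional-fields specification (for example `C+m,0,1;C+h,1;`, where each number is how many unmodified candidate bases to skip), that `m` denotes 5mC and `h` denotes 5hmC, and that the read is on the forward strand. The helper name `parse_mm_tag` is hypothetical, and the exact tags that duet writes should be checked against its documentation.

```python
from typing import Dict, List


def parse_mm_tag(seq: str, mm: str) -> Dict[str, List[int]]:
    """Map each modification code to 0-based read positions.

    Simplifications (assumptions of this sketch): forward-strand reads only,
    the strand character in the tag is ignored, and a combined code such as
    'mh' is kept as a single key.
    """
    seq = seq.upper()
    calls: Dict[str, List[int]] = {}
    for entry in filter(None, mm.split(";")):
        header, *deltas = entry.split(",")
        base = header[0]                 # canonical base, e.g. 'C'
        code = header[2:].rstrip(".?")   # modification code, e.g. 'm'
        # Positions of every candidate base in the read ('N' means any base)
        candidates = [i for i, b in enumerate(seq) if base == "N" or b == base]
        idx = -1
        for delta in deltas:
            idx += int(delta) + 1        # skip `delta` candidates, take the next
            calls.setdefault(code, []).append(candidates[idx])
    return calls


# Cytosines sit at positions 1, 4 and 5 of this toy read
calls = parse_mm_tag("ACGTCCGA", "C+m,0,1;C+h,1;")
```

Here `calls["m"]` holds positions 1 and 5, and `calls["h"]` holds position 4. In practice a BAM library would supply the read sequence and tag, and reverse-strand reads need the sequence reverse-complemented first.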

3. Functional analysis

Explore your data further with comprehensive analysis tools. Investigate the correlation between genetic and epigenetic data, compare across genomic regions or contexts that you define, and perform cohort analysis with multiple data sets.

Steps

Variant-associated methylation (VAM): see how genetic and epigenetic information are correlated in your samples.
Allele-specific methylation (ASM) – resolved reads are separated into alleles at heterozygous SNV sites. Methylation status is quantified at CpGs associated with each allele, and labelled as ASM if methylation levels differ by more than 30% between alleles.
Perform exploratory analysis:
- Use genomic windows to summarise metrics in regions of interest, e.g. known regions of open chromatin
- Use contextual annotations to summarise metrics at genetic regions such as gene bodies or areas correlated with high expression
- Combine outputs into feature sets and plot summaries of those feature sets
Perform cohort analysis:
- Compare across multiple experiments using genomic windows defined by you or taken from an external definition.
- Identify differentially methylated regions (DMRs) and differentially hydroxymethylated regions (DhMRs)
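
The allele-specific methylation rule described above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the duet implementation: it reads "differ by more than 30%" as an absolute difference of more than 30 percentage points between the two alleles' methylation levels, and the names `AlleleCounts`, `is_asm` and the `min_reads` parameter are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class AlleleCounts:
    """Methylated and total read counts at one CpG for one allele."""
    methylated: int
    total: int

    @property
    def level(self) -> float:
        # Fraction of reads on this allele that are methylated at the CpG
        return self.methylated / self.total


def is_asm(ref: AlleleCounts, alt: AlleleCounts,
           threshold: float = 0.30, min_reads: int = 1) -> bool:
    """Label a CpG as ASM if the alleles' methylation levels differ by more
    than `threshold` (assumed here to be an absolute difference)."""
    if ref.total < min_reads or alt.total < min_reads:
        return False  # too little coverage on one allele to compare
    return abs(ref.level - alt.level) > threshold


# 90% methylated on the reference allele versus 20% on the alternate allele
strong_difference = is_asm(AlleleCounts(9, 10), AlleleCounts(2, 10))
# 50% versus 40% falls below the threshold
weak_difference = is_asm(AlleleCounts(5, 10), AlleleCounts(4, 10))
```

A real analysis would typically also require a minimum read depth per allele, which is why `min_reads` is exposed as a parameter.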

Outputs

ASM files
Zarr files to enable efficient processing of multidimensional cohort data
Comprehensive summary plots

Let's collaborate

Find out how our solution can help you gain a deeper understanding of biology

FAQs

What are the recommended requirements for running the duet pipeline software?

The duet pipeline performs read resolution, alignment, variant calling and summarisation of methylation status, and produces reports and QC information. Here is what you can expect in terms of processing times:

Running in a cloud tenancy, allocating exactly the compute resources required at each stage: ~30 hours
Running on a single dedicated host with 32 cores and 64 GB RAM: 11-12 days
Using a Slurm cluster of 4 hosts, each with 32 cores and 64 GB RAM: 3-4 days

These estimates are highly dependent on your environment. The timings assume 8 samples at approximately 30X coverage.

Have questions or need assistance with your system setup? We’re here to help! Feel free to reach out to us for further discussions.
