Target-enrichment

(back to main documentation)

4.1 Overview

The biomodal duet pipeline includes a mode for processing data generated using targeted panels. Targeted libraries follow a slightly different analysis path: the pipeline calculates targeted metrics and uses additional reference files describing the bait and target regions of the panel being used.

4.2 Processing data using a supported target panel

Relevant reference files for the following target panels are available by default in pipeline version 1.2.0:

  • Twist Alliance Pan-cancer Methylation Panel
  • Twist Human Methylome Panel

To run a targeted analysis for either of these panels, pass the following parameters to the biomodal analyse command.

Parameter        Description
targeted         Set to true
targeted-panel   Set to twist_cancer for the Twist Alliance Pan-cancer Methylation Panel, or to twist_methylome for the Twist Human Methylome Panel

Here is an example biomodal analyse command with these additional parameters passed in:

biomodal analyse \
  --input-path [path_for_input_files] \
  --output-path [path_for_output_files] \
  --meta-file [meta_file_name] \
  --run-name [name_for_run] \
  --tag [date_time_or_other_tag] \
  --additional-profile deep_seq \
  --targeted true \
  --targeted-panel twist_cancer

A targeted analysis is still possible for any other target panel, but the relevant reference files must be generated first. The remaining instructions describe how to generate the necessary reference files for an alternative target panel and how to launch the pipeline in these circumstances.

4.3 Processing data using alternative target panels

Three additional files are required when running the duet pipeline in targeted mode with alternative target panels. These are the bait and target interval files and a BED file containing the target coordinates. The target and bait intervals are used by gatk CollectHsMetrics to calculate target metrics including target coverage and target enrichment. The BED file is used to filter methylation and variant calls down to the targeted regions of interest.

4.4 Generating bait and target interval files

Bait and target interval files are generated from BED files containing bait and target coordinates with the tool picard BedToIntervalList as shown below:

# Make the probe (bait) intervals
picard BedToIntervalList I=probes.bed \
  O=reference_probe.intervals \
  SD=reference.dict

# Make the target intervals
picard BedToIntervalList I=targets.bed \
  O=reference_targets.intervals \
  SD=reference.dict

Note that this command should be run independently of the biomodal duet multiomics solution CLI in an environment with picard available.
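Before converting, it can help to sanity-check the input BED files, since malformed lines are easier to diagnose up front than from a failed conversion. The following is a minimal sketch only: check_bed is a hypothetical helper, not part of picard or the biomodal CLI, and it assumes tab-separated BED files with integer start and end columns. Treating empty intervals (start equal to end) as errors is a choice made for this sketch.

```shell
# check_bed FILE
# Hypothetical helper (not part of picard or the biomodal CLI).
# Prints one message per suspicious line and returns non-zero if any were found.
check_bed() {
  awk -F '\t' '
    # Skip comment, track and browser lines that some BED files carry
    /^(#|track|browser)/ { next }
    # A BED interval needs at least chrom, start and end
    NF < 3 {
      printf "line %d: fewer than 3 tab-separated columns\n", NR
      bad = 1; next
    }
    # Start and end must be non-negative integers
    $2 !~ /^[0-9]+$/ || $3 !~ /^[0-9]+$/ {
      printf "line %d: start or end is not an integer\n", NR
      bad = 1; next
    }
    # Flag empty or reversed intervals
    ($2 + 0) >= ($3 + 0) {
      printf "line %d: start is not less than end\n", NR
      bad = 1
    }
    END { exit (bad ? 1 : 0) }
  ' "$1"
}
```

For example, check_bed probes.bed && check_bed targets.bed returns successfully only if both files pass these checks.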

BedToIntervalList requires a sequence dictionary (dict) file, which is created from the FASTA file of your reference genome of interest:

picard CreateSequenceDictionary \
  R=reference.fasta \
  O=reference.dict

If you are working with the reference files provided with the pipeline, then you can obtain a genome dict file from the following locations in the reference file directory:

Human: duet/duet-ref-0.1.0/reference_genomes/GRCh38Decoy/GRCh38Decoy.dict

Mouse: duet/duet-ref-0.1.0/reference_genomes/GRCm38p6/GRCm38p6.dict
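A common reason for interval conversion to fail is a mismatch between the contig names in the BED file and those in the dict file (for example 1 versus chr1). The sketch below lists BED contigs that have no matching entry in the dictionary. It is an illustration only: missing_contigs is a hypothetical helper, and it assumes the dict file is a tab-separated SAM-style header in which each @SQ line carries an SN: tag.

```shell
# missing_contigs BED DICT
# Hypothetical helper (not part of picard or the biomodal CLI).
# Prints each contig name used in BED that has no @SQ entry in DICT.
missing_contigs() {
  awk -F '\t' '
    # First file (DICT): record the SN: tag of every @SQ line
    NR == FNR {
      if ($1 == "@SQ")
        for (i = 2; i <= NF; i++)
          if ($i ~ /^SN:/) known[substr($i, 4)] = 1
      next
    }
    # Second file (BED): skip comment, track and browser lines
    /^(#|track|browser)/ { next }
    # Report each unknown contig name once
    !($1 in known) && !($1 in reported) { print $1; reported[$1] = 1 }
  ' "$2" "$1"
}
```

Running missing_contigs targets.bed reference.dict prints nothing when every contig in the BED file is present in the dictionary.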

4.5 Launching the duet pipeline in targeted mode

Launch the duet pipeline in targeted mode with the same profiles and commands that you would use in non-targeted mode. The only difference when using an alternative target panel is that additional parameters and file paths must be supplied as key-value pairs using the --additional-params option.

Parameter           Description
targeted            Set to true
targeted_bait       The path to the bait interval file generated using the steps described above
targeted_target     The path to the target interval file generated using the steps described above
targeted_bait_bed   The path to the target BED file

Supply these additional parameters to the biomodal analyse command as comma-separated key-value pairs using the --additional-params option, as in the following example:

biomodal analyse \
  --input-path [path_for_input_files] \
  --output-path [path_for_output_files] \
  --meta-file [meta_file_name] \
  --run-name [name_for_run] \
  --tag [date_time_or_other_tag] \
  --additional-profile deep_seq \
  --targeted true \
  --additional-params targeted_bait=[path_to_file],targeted_target=[path_to_file],targeted_bait_bed=[path_to_file]
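Because the three paths are passed as a single comma-separated string, a typo in any one of them can be hard to spot. The sketch below assembles the value for --additional-params from three file paths and checks that each file is readable first. It is an illustration under stated assumptions: build_targeted_params is a hypothetical helper, the paths are assumed to be on the local filesystem, and paths containing commas are rejected because they would break the key-value list.

```shell
# build_targeted_params BAIT_INTERVALS TARGET_INTERVALS TARGET_BED
# Hypothetical helper (not part of the biomodal CLI).
# Prints the comma-separated key-value string for --additional-params.
build_targeted_params() {
  if [ "$#" -ne 3 ]; then
    echo "usage: build_targeted_params BAIT_INTERVALS TARGET_INTERVALS TARGET_BED" >&2
    return 2
  fi
  for f in "$1" "$2" "$3"; do
    # A comma inside a path would be read as a key-value separator
    case "$f" in
      *,*) echo "path contains a comma: $f" >&2; return 1 ;;
    esac
    # Assumes local files; remote paths would need a different check
    if [ ! -r "$f" ]; then
      echo "cannot read: $f" >&2
      return 1
    fi
  done
  printf 'targeted_bait=%s,targeted_target=%s,targeted_bait_bed=%s\n' "$1" "$2" "$3"
}
```

The output can then be passed to the command shown above, for example --additional-params "$(build_targeted_params reference_probe.intervals reference_targets.intervals targets.bed)".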

Cambridge Epigenetix is now biomodal