nf-core/sarek

An open-source analysis pipeline to detect germline or somatic variants from whole genome or targeted sequencing

Introduction¶

nf-core/sarek is a workflow designed to detect variants in whole genome or targeted sequencing data. Initially designed for human and mouse, it can work on any species with a reference genome. Sarek can also handle tumour/normal pairs and can include additional relapse samples.

The pipeline is built using Nextflow, a workflow tool to run tasks across multiple compute infrastructures in a very portable manner. It uses Docker/Singularity containers making installation trivial and results highly reproducible. The Nextflow DSL2 implementation of this pipeline uses one container per process which makes it much easier to maintain and update software dependencies. Where possible, these processes have been submitted to and installed from nf-core/modules in order to make them available to all nf-core pipelines, and to everyone within the Nextflow community!

On release, automated continuous integration tests run the pipeline on a full-sized dataset on the AWS cloud infrastructure. This ensures that the pipeline runs on AWS, has sensible resource allocation defaults set to run on real-world datasets, and permits the persistent storage of results to benchmark between pipeline releases and other analysis sources. The results obtained from the full-sized test can be viewed on the nf-core website.

It's listed on Elixir - Tools and Data Services Registry and Dockstore.

Sarek Workflow

Pipeline summary¶

Depending on the options and samples provided, the pipeline can currently perform the following:

Form consensus reads from UMI sequences (fgbio)
Sequencing quality control and trimming (enabled by --trim_fastq) (FastQC, fastp)
Contamination removal (BBSplit, enabled by --tools bbsplit)
Map Reads to Reference (BWA-mem, BWA-mem2, dragmap or Sentieon BWA-mem)
Process BAM file (GATK MarkDuplicates, GATK BaseRecalibrator and GATK ApplyBQSR or Sentieon LocusCollector and Sentieon Dedup)
Experimental Feature: Use GPU-accelerated parabricks implementation as alternative to "Map Reads to Reference" + "Process BAM file" (--aligner parabricks)
Summarise alignment statistics (samtools stats, mosdepth)
Variant calling (enabled by --tools, see compatibility):
- ASCAT
- CNVkit
- Control-FREEC
- DeepVariant
- freebayes
- GATK HaplotypeCaller
- GATK Mutect2
- indexcov
- Lofreq
- Manta
- mpileup
- MSIsensor2
- MSIsensor-pro
- MuSE
- Sentieon Haplotyper
- Strelka
- TIDDIT
Post-variant calling options, one of:
- Filtering (bcftools view (default: filter by PASS,.)), normalisation (bcftools norm) and consensus calling (bcftools isec, default: called by at least 2 tools -n+2) on all vcfs and/or bcftools concat for germline vcfs
- Varlociraptor for all vcfs
Variant filtering and annotation (SnpEff, Ensembl VEP, BCFtools annotate)
Summarise and represent QC (MultiQC)

Usage¶

Note

If you are new to Nextflow and nf-core, please refer to this page on how to set up Nextflow. Make sure to test your setup with -profile test before running the workflow on actual data.

First, prepare a samplesheet with your input data that looks as follows:

samplesheet.csv:

patient,sample,lane,fastq_1,fastq_2
ID1,S1,L002,ID1_S1_L002_R1_001.fastq.gz,ID1_S1_L002_R2_001.fastq.gz

Each row represents a pair of fastq files (paired end).
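
As a minimal sketch, a samplesheet in the layout shown above could be generated with Python's standard csv module. The column names and the example row are taken from the samplesheet above; the build_samplesheet helper is hypothetical and not part of Sarek.

```python
import csv
import io

# Column order copied from the example samplesheet above.
COLUMNS = ["patient", "sample", "lane", "fastq_1", "fastq_2"]


def build_samplesheet(rows):
    """Return CSV text with one line per paired-end FASTQ pair (hypothetical helper)."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=COLUMNS, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


rows = [
    {
        "patient": "ID1",
        "sample": "S1",
        "lane": "L002",
        "fastq_1": "ID1_S1_L002_R1_001.fastq.gz",
        "fastq_2": "ID1_S1_L002_R2_001.fastq.gz",
    }
]
samplesheet_text = build_samplesheet(rows)
print(samplesheet_text, end="")
```

Writing the text to a file named samplesheet.csv would give a file that can be passed to --input.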

Now, you can run the pipeline using:

nextflow run nf-core/sarek \
   -profile <docker/singularity/.../institute> \
   --input samplesheet.csv \
   --outdir <OUTDIR>

Warning

Please provide pipeline parameters via the CLI or Nextflow -params-file option. Custom config files including those provided by the -c Nextflow option can be used to provide any configuration except for parameters; see docs.

For more details and further functionality, please refer to the usage documentation and the parameter documentation.

Pipeline output¶

To see the results of an example test run with a full size dataset refer to the results tab on the nf-core website pipeline page. For more details about the output files and reports, please refer to the output documentation.

Benchmarking¶

On each release, the pipeline is run on 3 full size tests:

test_full runs tumor-normal data for one patient from the SEQC2 consortium
test_full_germline runs a WGS 30X Genome-in-a-Bottle (NA12878) dataset
test_full_germline_ncbench_agilent runs two WES samples with 75M and 200M reads (data available here). The results are uploaded to Zenodo, evaluated against a truth dataset, and results are made available via the NCBench dashboard.

Credits¶

Sarek was originally written by Maxime U Garcia and Szilveszter Juhos at the National Genomics Infrastructure and National Bioinformatics Infrastructure Sweden, which are both platforms at SciLifeLab, with the support of The Swedish Childhood Tumor Biobank (Barntumörbanken). Friederike Hanssen and Gisela Gabernet at QBiC later joined and helped with further development.

The Nextflow DSL2 conversion of the pipeline was led by Friederike Hanssen and Maxime U Garcia.

Maintenance is now led by Friederike Hanssen and Maxime U Garcia (now at Seqera).

Contributions & Support¶

If you would like to contribute to this pipeline, please see the contributing guidelines.

For further information or help, don't hesitate to get in touch on the Slack #sarek channel (you can join with this invite), or contact us: Maxime U Garcia, Friederike Hanssen

Citations¶

If you use nf-core/sarek for your analysis, please cite the Sarek article as follows:

Friederike Hanssen, Maxime U Garcia, Lasse Folkersen, Anders Sune Pedersen, Francesco Lescai, Susanne Jodoin, Edmund Miller, Oskar Wacker, Nicholas Smith, nf-core community, Gisela Gabernet, Sven Nahnsen. Scalable and efficient DNA sequencing analysis on different compute infrastructures aiding variant discovery. NAR Genomics and Bioinformatics, Volume 6, Issue 2, June 2024, lqae031, doi: 10.1093/nargab/lqae031.

Garcia M, Juhos S, Larsson M et al. Sarek: A portable workflow for whole-genome sequencing analysis of germline and somatic variants [version 2; peer review: 2 approved]. F1000Research 2020, 9:63, doi: 10.12688/f1000research.16665.2.

You can cite the sarek zenodo record for a specific version using the following doi: 10.5281/zenodo.3476425

An extensive list of references for the tools used by the pipeline can be found in the CITATIONS.md file.

You can cite the nf-core publication as follows:

The nf-core framework for community-curated bioinformatics pipelines.

Philip Ewels, Alexander Peltzer, Sven Fillinger, Harshil Patel, Johannes Alneberg, Andreas Wilm, Maxime Ulysse Garcia, Paolo Di Tommaso & Sven Nahnsen.

Nat Biotechnol. 2020 Feb 13. doi: 10.1038/s41587-020-0439-x.

CHANGELOG¶

CHANGELOG

Pipeline Inputs

This page documents all input parameters for the pipeline.

Input/output options ¶

--input ¶

string file-path Optional

Path to comma-separated file containing information about the samples in the experiment.

A design file with information about the samples in your experiment. Use this parameter to specify the location of the input files. It has to be a comma-separated file with a header row. See usage docs.

If no input file is specified, Sarek will attempt to locate one in the {outdir} directory. If no input should be supplied, i.e. when --step or --build_only_index is supplied, then set --input false.

--input_restart ¶

string file-path Optional

Automatic retrieval for restart

--step ¶

string Required

Starting step

The pipeline starts from this step and then runs through the possible subsequent steps.

Default: mapping

Allowed values: mapping , markduplicates , prepare_recalibration , recalibrate , variant_calling , annotate

--outdir ¶

string directory-path Required

The output directory where the results will be saved. You have to use absolute paths to storage on Cloud infrastructure.

Main options ¶

--split_fastq ¶

integer Optional

Specify how many reads each split of a FastQ file contains. Set to 0 to turn off splitting entirely.

Use the tool FastP to split FASTQ files by number of reads. This parallelizes across FASTQ file shards, speeding up mapping. Note that although the minimum value is 250 reads, a single FASTQ shard will still be created if you have fewer than 250 reads.

Default: 50000000
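
To give a feel for the default, the number of shards for one FASTQ file can be estimated by dividing its read count by the --split_fastq value and rounding up. This is only a back-of-the-envelope sketch: estimate_shards is a hypothetical helper, the read counts are made up, and counting a partially filled final shard as one shard is an assumption.

```python
import math


def estimate_shards(total_reads, split_fastq=50_000_000):
    """Rough shard count for one FASTQ file (hypothetical helper, not Sarek code)."""
    if split_fastq == 0:
        # 0 turns splitting off, so the file stays in one piece.
        return 1
    # Assumption: a final, partially filled shard still counts as one shard.
    return max(1, math.ceil(total_reads / split_fastq))


print(estimate_shards(600_000_000))  # 12 shards with the default of 50,000,000 reads
print(estimate_shards(100))          # 1: a single shard is still created
```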

--nucleotides_per_second ¶

integer Optional

Estimate interval size.

Intervals are parts of the chopped up genome used to speed up preprocessing and variant calling. See --intervals for more info.

Changing this parameter changes the number of intervals that are grouped and processed together. BED files from targeted sequencing can contain thousands of small intervals. Spinning up a new process for each can be quite resource intensive. Instead, it can be desirable to process small intervals together on larger nodes. In order to make use of this parameter, no runtime estimate can be present in the BED file (column 5).

Default: 200000
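
The grouping idea can be illustrated with a small sketch: an interval without a runtime estimate is assigned a duration of its length divided by the nucleotides-per-second value, and consecutive short intervals are packed into one job. The group_intervals function and its max_seconds budget are hypothetical and only approximate what the pipeline does internally; the intervals are invented.

```python
def estimated_seconds(start, end, nucleotides_per_second=200_000):
    """Estimated runtime of a 0-based, half-open BED interval."""
    return (end - start) / nucleotides_per_second


def group_intervals(intervals, max_seconds, nucleotides_per_second=200_000):
    """Pack consecutive intervals into groups of at most max_seconds each.

    Hypothetical illustration; an interval longer than the budget gets its own group.
    """
    groups, current, current_time = [], [], 0.0
    for chrom, start, end in intervals:
        duration = estimated_seconds(start, end, nucleotides_per_second)
        if current and current_time + duration > max_seconds:
            groups.append(current)
            current, current_time = [], 0.0
        current.append((chrom, start, end))
        current_time += duration
    if current:
        groups.append(current)
    return groups


targets = [
    ("chr1", 0, 100_000),        # 0.5 s at 200,000 nt/s
    ("chr1", 150_000, 250_000),  # 0.5 s
    ("chr1", 300_000, 500_000),  # 1.0 s
]
print(len(group_intervals(targets, max_seconds=1.0)))  # 2 groups
print(len(group_intervals(targets, max_seconds=1.0, nucleotides_per_second=400_000)))  # 1 group
```

In this sketch a higher nucleotides-per-second value shortens every estimate, so more intervals fit into a single group and fewer jobs are spawned.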

--intervals ¶

string file-path Optional

Path to target bed file in case of whole exome or targeted sequencing or intervals file.

To speed up preprocessing and variant calling processes, the execution is parallelized across a reference chopped into smaller pieces.

Parts of preprocessing and variant calling are done by these intervals, the different resulting files are then merged. This can parallelize processes, and push down wall clock time significantly.

Reads are aligned to the whole genome, and Base Quality Score Recalibration and variant calling are then run on the supplied regions.

Whole Genome Sequencing:

The (provided) intervals are chromosomes cut at their centromeres (so each chromosome arm is processed separately), plus additional unassigned contigs.

We are ignoring the hs37d5 contig that contains concatenated decoy sequences.

The calling intervals can be defined using a .list or a BED file. A .list file contains one interval per line in the format chromosome:start-end (1-based coordinates). A BED file must be a tab-separated text file with one interval per line. There must be at least three columns: chromosome, start, and end (0-based coordinates). Additionally, the score column of the BED file (the fifth column) can be used to provide an estimate of how many seconds it will take to call variants on that interval. The fourth column remains unused.

|chr1|10000|207666|NA|47.3|

This indicates that variant calling on the interval chr1:10001-207666 takes approximately 47.3 seconds.
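
The coordinate conversion and the runtime column can be made concrete with a short sketch. It assumes a tab-separated BED line as described above; parse_bed_line is a hypothetical helper, and treating "NA", "." or an empty fifth column as "no estimate" is an assumption of this example, not documented pipeline behaviour.

```python
def parse_bed_line(line, nucleotides_per_second=200_000):
    """Parse one tab-separated BED line into a 1-based region string and a runtime in seconds."""
    fields = line.rstrip("\n").split("\t")
    chrom, start, end = fields[0], int(fields[1]), int(fields[2])
    # BED starts are 0-based, so the 1-based start is start + 1; the end is unchanged.
    region = f"{chrom}:{start + 1}-{end}"
    if len(fields) >= 5 and fields[4] not in ("", "NA", "."):
        seconds = float(fields[4])  # column 5 holds the runtime estimate
    else:
        # No estimate given: fall back to interval length / nucleotides per second.
        seconds = (end - start) / nucleotides_per_second
    return region, seconds


print(parse_bed_line("chr1\t10000\t207666\tNA\t47.3"))  # ('chr1:10001-207666', 47.3)
print(parse_bed_line("chr1\t10000\t207666"))            # falls back to the length-based estimate
```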

The runtime estimate is used in two different ways. First, when there are multiple consecutive intervals in the file that take little time to compute, they are processed as a single job, thus reducing the number of processes that need to be spawned. Second, the jobs with the largest processing time are started first, which reduces wall-clock time. If no runtime is given, a rate of 200000 nucleotides per second is assumed. See --nucleotides_per_second on how to customize this. Actual figures vary from 2 nucleotides/second to 30000 nucleotides/second.

NB If none provided, will be generated automatically from the FASTA reference.

NB Use --no_intervals to disable automatic generation.

Targeted Sequencing:

The recommended flow for targeted sequencing data is to use the workflow as it is, but also provide a BED file containing targets for all steps using the --intervals option. In addition, the parameter --wes should be set. It is advised to pad the variant calling regions (exons or target) to some extent before submitting to the workflow.

The procedure is similar to whole genome sequencing, except that only BED files are accepted. See above for formatting description. Adding every exon as an interval in case of WES can generate >200K processes or jobs, many more forks, and a similar number of directories in the Nextflow work directory. These are appropriately grouped together to reduce the number of processes run in parallel (see above and --nucleotides_per_second for details). Furthermore, primers and/or baits are not 100% specific (certainly not for MHC and KIR, etc.), so there are quite likely going to be reads mapping to multiple locations. If you are certain that the target is unique for your genome (all the reads will certainly map to only one location), and aligning to the whole genome is overkill, it is actually better to change the reference itself.

--no_intervals ¶

boolean Optional

Disable usage of intervals.

Intervals are parts of the chopped up genome used to speed up preprocessing and variant calling. See --intervals for more info.

If --no_intervals is set, no intervals will be taken into account for speed-up or data processing.

--wes ¶

boolean Optional

Enable when exome or panel data is provided.

With this parameter flags in various tools are set for targeted sequencing data. It is recommended to enable for whole-exome and panel data analysis.

--tools ¶

string Optional

Tools to use for contamination removal, duplicate marking, variant calling and/or for annotation.

Multiple tools separated with commas.

Variant Calling:

Germline variant calling can currently be performed with the following variant callers:

SNPs/Indels: DeepVariant, FreeBayes, GATK HaplotypeCaller, mpileup, Sentieon Haplotyper
Structural Variants: indexcov, Manta, TIDDIT
Copy-number: CNVkit

Tumor-only somatic variant calling can currently be performed with the following variant callers:

SNPs/Indels: FreeBayes, Lofreq, mpileup, Mutect2, Sentieon TNScope, Strelka
Structural Variants: Manta, Sentieon TNScope, TIDDIT
Copy-number: CNVkit, Control-FREEC

Somatic variant calling can currently only be performed with the following variant callers:

SNPs/Indels: FreeBayes, Mutect2, Sentieon TNScope, Strelka2
Structural variants: Manta, TIDDIT
Copy-Number: ASCAT, CNVkit, Control-FREEC, Sentieon TNScope
Microsatellite Instability: MSIsensor2, MSIsensor-pro

NB Mutect2 for somatic variant calling cannot be combined with --no_intervals

Annotation:

snpEff, VEP, merge (both consecutively), and bcftools annotate (needs --bcftools_annotations).

NB As Sarek will use bgzip and tabix to compress and index the annotated VCF files, it expects VCF files to be sorted when starting from --step annotate.

--skip_tools ¶

string Optional

Disable specified tools.

Multiple tools can be specified, separated by commas.

NB --skip_tools baserecalibrator_report only disables saving the reports.

NB --skip_tools markduplicates_report does not skip MarkDuplicates but prevents the collection of duplicate metrics, which slows down performance.

FASTQ Preprocessing ¶

--trim_fastq ¶

boolean Optional

Run FastP for read trimming

Use this to perform adapter trimming. Adapters are detected automatically by using the FastP flag --detect_adapter_for_pe. For more info see FastP.

--clip_r1 ¶

integer Optional

Remove bp from the 5' end of read 1

This may be useful if the qualities were very poor, or if there is some sort of unwanted bias at the 5' end. Corresponds to the FastP flag --trim_front1.

Default: 0

--clip_r2 ¶

integer Optional

Remove bp from the 5' end of read 2

This may be useful if the qualities were very poor, or if there is some sort of unwanted bias at the 5' end. Corresponds to the FastP flag --trim_front2.

Default: 0

--three_prime_clip_r1 ¶

integer Optional

Remove bp from the 3' end of read 1

This may remove some unwanted bias from the 3'. Corresponds to the FastP flag --trim_tail1.

Default: 0

--three_prime_clip_r2 ¶

integer Optional

Remove bp from the 3' end of read 2

This may remove some unwanted bias from the 3' end. Corresponds to the FastP flag --trim_tail2.

Default: 0

--trim_nextseq ¶

boolean Optional

Removing poly-G tails.

Detects polyG in read tails and trims them. Corresponds to the FastP flag --trim_poly_g.

--length_required ¶

integer Optional

Minimum length of reads to keep

This is the minimum length of reads to keep after trimming. Corresponds to the FastP flag --length_required (default in FastP is 15bp).

Default: 15

--save_trimmed ¶

boolean Optional

Save trimmed FastQ file intermediates.

--save_split_fastqs ¶

boolean Optional

If set, publishes split FASTQ files. Intended for testing purposes.

Unique Molecular Identifiers ¶

--umi_read_structure ¶

string Optional

Specify UMI read structure for fgbio UMI consensus read generation

One structure if the UMI is present on one end (e.g. '+T 2M11S+T'), or two structures separated by a blank space if UMIs are present on both ends (e.g. '2M11S+T 2M11S+T'); please note, this does not handle duplex-UMIs.

For more info on UMI usage in the pipeline, also check docs here.
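
To show how such a read structure string breaks down, here is a minimal parsing sketch. It assumes the fgbio convention that each segment is a length (or '+' for all remaining bases) followed by a one-letter operator, with M read as molecular barcode (UMI), S as skipped bases and T as template; parse_read_structure is a hypothetical helper, not an fgbio or Sarek function.

```python
import re

# One segment: a positive length or '+', followed by a single operator letter.
SEGMENT = re.compile(r"(\d+|\+)([A-Z])")


def parse_read_structure(structure):
    """Split a read structure such as '2M11S+T' into (length, operator) pairs.

    A length of None stands for '+', i.e. all remaining bases.
    """
    segments = []
    position = 0
    for match in SEGMENT.finditer(structure):
        if match.start() != position:
            raise ValueError(f"unexpected text in read structure: {structure!r}")
        length, operator = match.groups()
        segments.append((None if length == "+" else int(length), operator))
        position = match.end()
    if position != len(structure) or not segments:
        raise ValueError(f"invalid read structure: {structure!r}")
    return segments


# Two structures separated by a blank space, one per read.
for per_read in "2M11S+T 2M11S+T".split():
    print(parse_read_structure(per_read))  # [(2, 'M'), (11, 'S'), (None, 'T')]
```

Under the stated assumption, '2M11S+T' reads as 2 UMI bases, 11 skipped bases, and the remainder of the read as template.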

--group_by_umi_strategy ¶

string Optional

Default strategy for fgbio UMI-based consensus read generation

Default: Adjacency

Allowed values: Identity , Edit , Adjacency , Paired

--umi_in_read_header ¶

boolean Optional

Move UMIs from fastq read headers to a tag prior to deduplication.

Set to true if UMIs are already present in the header of the read, for instance from using OverrideCycles in bclconvert or umi_tools/extract.

--umi_location ¶

string Optional

Location of the UMI(s) to be extracted with fastp.

Use if UMIs are not present in the read header, but in a specific location within the reads/fastq header index. This will be used to extract UMIs from reads or index in the fastq header and store them in the RX tag.

Allowed values: read1 , read2 , per_read , index1 , index2 , per_index

--umi_length ¶

integer Optional

Length of the UMI(s) in the read.

If UMIs are being extracted using fastp, specify the length of the UMI here. This will be used to extract UMIs from reads and store them in the RX tag.

--umi_base_skip ¶

integer Optional

Number of bases to skip after the UMI(s) in the read when extracting with fastp.

If UMIs are being extracted using fastp, specify the number of bases to skip after the UMI here. This will trim some bases after the UMI.

--umi_tag ¶

string Optional

Tag detailing where UMIs are present inside the bam/cram file (e.g. RX).

If UMIs are already present in the cram/bam file, this details the tag which will be used in GATK MarkDuplicates and Sentieon dedup. This should be set to RX if restarting from bam files where the UMIs have been extracted by the umi_in_read_header or umi_length options. Note this is not compatible with MarkDuplicates Spark.

--bbsplit_fasta_list ¶

string file-path Optional

Path to comma-separated file containing a list of reference genomes to filter reads against with BBSplit. You have to also explicitly set --tools bbsplit if you want to use BBSplit.

The file should contain 2 columns: short name and full path to reference genome(s) e.g.

mm10,/path/to/mm10.fa
ecoli,/path/to/ecoli.fa

--bbsplit_index ¶

string path Optional

Path to directory or tar.gz archive for pre-built BBSplit index.

The BBSplit index will have to be built at least once with this pipeline (see --save_reference to save index). It can then be provided via --bbsplit_index for future runs.

--save_bbsplit_reads ¶

boolean Optional

If this option is specified, FastQ files split by reference will be saved in the results directory.

Preprocessing ¶

--aligner ¶

string Optional

Specify aligner to be used to map reads to reference genome.

Sarek will build missing indices automatically if not provided. Set --bwa false if indices should be (re-)built. If DragMap is selected as aligner, it is recommended to skip baserecalibration with --skip_tools baserecalibrator. For more info see here.

Default: bwa-mem

Allowed values: bwa-mem , bwa-mem2 , dragmap , sentieon-bwamem , parabricks

--save_mapped ¶

boolean Optional

Save mapped files.

If the parameter --split_fastq is used, the sharded BAM files are merged and converted to CRAM before saving them.

--save_output_as_bam ¶

boolean Optional

Saves output from mapping (if --save_mapped), Markduplicates & Baserecalibration as BAM file instead of CRAM

--use_gatk_spark ¶

string Optional

Enable usage of GATK Spark implementation for duplicate marking and/or base quality score recalibration

Multiple values can be specified, separated by commas.

The GATK4 Base Quality Score recalibration tools Baserecalibrator and ApplyBQSR are currently available as Beta release. Please be aware that --use_gatk_spark is not compatible with --save_output_as_bam --save_mapped. Use with caution!

--markduplicates_pixel_distance ¶

integer Optional

Pixel distance for GATK MarkDuplicates.

--sentieon_consensus ¶

boolean Optional

Generate consensus reads with Sentieon dedup rather than choosing one best read.

If set, the Sentieon dedup output will combine duplicate reads into single consensus read. This is only relevant if --tools contains sentieon_dedup.

Variant Calling ¶

--only_paired_variant_calling ¶

boolean Optional

If true, skips germline variant calling for matched normal to tumor sample. Normal samples without matched tumor will still be processed through germline variant calling tools.

This can speed up computation for somatic variant calling with matched normal samples. If false, all normal samples are processed as well through the germline variant calling tools. If true, only somatic variant calling is done.

--ascat_min_base_qual ¶

integer Optional

Overwrite ASCAT min base quality required for a read to be counted.

For more details, see here.

Default: 20

--ascat_min_counts ¶

integer Optional

Overwrite ASCAT minimum depth required in the normal for a SNP to be considered.

For more details, see here.

Default: 10

--ascat_min_map_qual ¶

integer Optional

Overwrite ASCAT min mapping quality required for a read to be counted.

For more details, see here.

Default: 35

--ascat_ploidy ¶

number Optional

Overwrite ASCAT ploidy.

ASCAT: optional argument to override ASCAT optimization and supply the psi parameter (expert parameter, do not adapt unless you know what you are doing). See here.

--ascat_purity ¶

number Optional

Overwrite ASCAT purity.

Overwrites ASCAT's rho_manual parameter. Expert use only, see here for details. Requires that --ascat_ploidy is set.

--cf_chrom_len ¶

string file-path Optional

Specify a custom chromosome length file.

Control-FREEC requires a file containing all chromosome lengths. By default the fasta.fai is used. If the fasta.fai file contains chromosomes not present in the intervals, it fails (see: https://github.com/BoevaLab/FREEC/issues/106).

In this case, a custom chromosome length can be specified. It must be of the same format as the fai, but only contain the relevant chromosomes.

--cf_coeff ¶

number Optional

Overwrite Control-FREEC coefficientOfVariation

Details, see ControlFREEC manual.

Default: 0.05

--cf_contamination_adjustment ¶

boolean Optional

Overwrite Control-FREEC contaminationAdjustment

Details, see ControlFREEC manual.

--cf_contamination ¶

integer Optional

Specify known contamination value for Control-FREEC

Details, see ControlFREEC manual.

Default: 0

--cf_minqual ¶

integer Optional

Minimal sequencing quality for a position to be considered in BAF analysis.

Details, see ControlFREEC manual.

Default: 0

--cf_mincov ¶

integer Optional

Minimal read coverage for a position to be considered in BAF analysis.

Details, see ControlFREEC manual.

Default: 0

--cf_ploidy ¶

string Optional

Genome ploidy used by ControlFREEC

In case of doubt, you can set different values and Control-FREEC will select the one that explains the most observed CNAs. Examples: ploidy=2 or ploidy=2,3,4. For more details, see the manual.

Default: 2

--cf_window ¶

number Optional

Overwrite Control-FREEC window size.

Details, see ControlFREEC manual.

--cnvkit_reference ¶

string file-path Optional

Copy-number reference for CNVkit

https://cnvkit.readthedocs.io/en/stable/pipeline.html?highlight=reference.cnn#batch

--freebayes_filter ¶

string Optional

Filtering expression for vcflib/vcffilter

FreeBayes offers a QUAL score for each called variant. The QUAL estimate provides the phred-scaled probability that the locus is not polymorphic, given the data and the model. This is reasonably well calibrated, so you can specify that you want things where we expect error rates of no more than 1/100 (QUAL > 20) or 1/1000 (QUAL > 30). The default setting for Sarek is QUAL > 30.

Default: 30
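
The relationship between a phred-scaled QUAL value and the corresponding probability is QUAL = -10 * log10(P). The short sketch below only illustrates that standard conversion for the thresholds mentioned above; the function names are made up for this example.

```python
import math


def qual_to_error_probability(qual):
    """Convert a phred-scaled QUAL value to a probability: P = 10^(-QUAL / 10)."""
    return 10 ** (-qual / 10)


def error_probability_to_qual(probability):
    """Inverse conversion: QUAL = -10 * log10(P)."""
    return -10 * math.log10(probability)


for qual in (20, 30):
    # QUAL 20 corresponds to about 1/100, QUAL 30 to about 1/1000.
    print(qual, qual_to_error_probability(qual))
```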

--joint_germline ¶

boolean Optional

Turn on the joint germline variant calling for GATK haplotypecaller

Uses all normal germline samples (as designated by status in the input csv) in the joint germline variant calling process.

--joint_mutect2 ¶

boolean Optional

Runs Mutect2 in joint (multi-sample) mode for better concordance among variant calls of tumor samples from the same patient. Mutect2 outputs will be stored in a subfolder named with patient ID under variant_calling/mutect2/ folder. Only a single normal sample per patient is allowed. Tumor-only mode is also supported.

--ignore_soft_clipped_bases ¶

boolean Optional

Do not analyze soft clipped bases in the reads for GATK Mutect2.

Uses the --dont-use-soft-clipped-bases parameter with GATK Mutect2.

--pon ¶

string file-path Optional

Panel-of-normals VCF (bgzipped) for GATK Mutect2

Without a PON, there will be no calls with PASS in the FILTER field; only an unfiltered VCF is written. It is highly recommended to make your own PON, as it depends on sequencer and library preparation.

The pipeline is shipped with a panel-of-normals for --genome GATK.GRCh38 provided by GATK.

See PON documentation

NB PON file should be bgzipped.

--pon_tbi ¶

string file-path Optional

Index of PON panel-of-normals VCF.

If none provided, will be generated automatically from the PON bgzipped VCF file.

--sentieon_haplotyper_emit_mode ¶

string Optional

Option for selecting output and emit-mode of Sentieon's Haplotyper.

The option --sentieon_haplotyper_emit_mode can be set to the same string values as the Haplotyper's --emit_mode. To output both a vcf and a gvcf, specify both a vcf-option (currently, all, confident and variant) and gvcf. For example, to obtain a vcf and gvcf one could set --sentieon_haplotyper_emit_mode to variant, gvcf.

Default: variant

--sentieon_dnascope_emit_mode ¶

string Optional

Option for selecting output and emit-mode of Sentieon's Dnascope.

The option --sentieon_dnascope_emit_mode can be set to the same string values as the Dnascope's --emit_mode. To output both a vcf and a gvcf, specify both a vcf-option (currently, all, confident and variant) and gvcf. For example, to obtain a vcf and gvcf one could set --sentieon_dnascope_emit_mode to variant, gvcf.

Default: variant

--sentieon_dnascope_pcr_indel_model ¶

string Optional

Option for selecting the PCR indel model used by Sentieon Dnascope.

PCR indel model used to weed out false positive indels more or less aggressively. The possible MODELs are: NONE (used for PCR free samples), and HOSTILE, AGGRESSIVE and CONSERVATIVE, in order of decreasing aggressiveness. The default value is CONSERVATIVE.

Default: CONSERVATIVE

--gatk_pcr_indel_model ¶

string Optional

Option for selecting the PCR indel model used by GATK HaplotypeCaller.

Default: CONSERVATIVE

Post variant calling ¶

--filter_vcfs ¶

boolean Optional

Enable filtering of VCFs with bcftools view

Filters all VCF files from each applied variant caller using bcftools view, applying the filtering criteria specified in --bcftools_filter_criteria.

--bcftools_filter_criteria ¶

string Optional

Filter criteria. Uses bcftools view filter options. To customize, follow instructions here: https://samtools.github.io/bcftools/bcftools.html#view

Default: -f PASS,.

--normalize_vcfs ¶

boolean Optional

Option for normalization of vcf-files.

Normalization of all vcf-files from each applied variant-caller using bcftools norm.

--snv_consensus_calling ¶

boolean Optional

Enable consensus calling of multiple VCF files from one sample

Intersects multiple VCF files from one sample using bcftools isec. As consensus criterion, -n+${params.consensus_min_count} is used, meaning a variant must be found in this many or more files (see --consensus_min_count). For details, visit: https://samtools.github.io/bcftools/bcftools.html#isec

--consensus_min_count ¶

integer Optional

Minimum number of variant callers calling a variant for consensus results

Determines the minimum number of variant callers a variant must be called in to be included in the consensus results. As consensus criterion, -n+${params.consensus_min_count} is used, meaning a variant must be found in this many or more files. For details, visit: https://samtools.github.io/bcftools/bcftools.html#isec

Default: 2
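
The "found in this many or more files" rule can be sketched in a few lines. This is a simplified illustration using exact (chromosome, position, ref, alt) matches on made-up calls; it does not reproduce how bcftools isec compares records, and consensus_variants and the caller names are hypothetical.

```python
from collections import Counter


def consensus_variants(calls_per_tool, min_count=2):
    """Return variants reported by at least min_count callers (hypothetical helper).

    calls_per_tool maps a caller name to a set of (chrom, pos, ref, alt) tuples.
    """
    counts = Counter()
    for variants in calls_per_tool.values():
        counts.update(set(variants))  # each caller counts at most once per variant
    return sorted(v for v, n in counts.items() if n >= min_count)


calls = {
    "caller_a": {("chr1", 100, "A", "G"), ("chr1", 200, "C", "T")},
    "caller_b": {("chr1", 100, "A", "G"), ("chr2", 50, "G", "A")},
    "caller_c": {("chr1", 100, "A", "G"), ("chr1", 200, "C", "T")},
}
print(consensus_variants(calls, min_count=2))  # chr1:100 and chr1:200
print(consensus_variants(calls, min_count=3))  # only chr1:100
```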

--concatenate_vcfs ¶

boolean Optional

Option for concatenating germline vcf-files.

Enable concatenation of germline vcf-files from each applied variant-caller into one vcf-file using bcftools concat.

--varlociraptor_chunk_size ¶

integer Optional

Number of chunks to split the vcf-files into for Varlociraptor. The minimum, 1, indicates no splitting.

Default: 15

--varlociraptor_scenario_tumor_only ¶

string Optional

Yte compatible scenario file for tumor only samples. Defaults to assets/varlociraptor_tumor_only.yte.yaml

--varlociraptor_scenario_somatic ¶

string Optional

Yte compatible scenario file for somatic samples. Defaults to assets/varlociraptor_somatic.yte.yaml

--varlociraptor_scenario_germline ¶

string Optional

Yte compatible scenario file for germline samples. Defaults to assets/varlociraptor_germline.yte.yaml

Annotation ¶

--vep_include_fasta ¶

boolean Optional

Allow usage of fasta file for annotation with VEP

By pointing VEP to a FASTA file, it is possible to retrieve reference sequence locally. This enables VEP to retrieve HGVS notations (--hgvs), check the reference sequence given in input data, and construct transcript models from a GFF or GTF file without accessing a database.

For details, see here.

--vep_dbnsfp ¶

boolean Optional

Enable the use of the VEP dbNSFP plugin.

For details, see here.

--dbnsfp ¶

string file-path Optional

Path to dbNSFP processed file.

Will not work without a provided dbnsfp_tbi. To be used with --vep_dbnsfp. dbNSFP files and more information are available at https://www.ensembl.org/info/docs/tools/vep/script/vep_plugins.html#dbnsfp and https://sites.google.com/site/jpopgen/dbNSFP/

--dbnsfp_tbi ¶

string file-path Optional

Path to dbNSFP tabix indexed file.

To be used with --vep_dbnsfp.

--dbnsfp_consequence ¶

string Optional

Consequence to annotate with

To be used with --vep_dbnsfp. This parameter is used to filter/limit outputs to a specific effect of the variant. The set of consequence terms is defined by the Sequence Ontology (an overview of those used in VEP can be found here: https://www.ensembl.org/info/genome/variation/prediction/predicted_data.html). To filter using several consequences, separate them with '&' (e.g. 'consequence=3_prime_UTR_variant&intron_variant').

--dbnsfp_fields ¶

string Optional

Fields to annotate with

To be used with --vep_dbnsfp. This parameter can be used to retrieve individual values from the dbNSFP file. The values correspond to the names of the columns in the dbNSFP file and are separated by commas. The column names might differ between the different dbNSFP versions. Please check the Readme.txt file, which is provided with the dbNSFP file, to obtain the correct column names. The Readme file also contains a short description of the provided values and the version of the tools used to generate them.

The default values are explained below:

rs_dbSNP - rs number from dbSNP
HGVSc_VEP - HGVS coding variant presentation from VEP. Multiple entries separated by ';', corresponds to Ensembl_transcriptid
HGVSp_VEP - HGVS protein variant presentation from VEP. Multiple entries separated by ';', corresponds to Ensembl_proteinid
1000Gp3_EAS_AF - Alternative allele frequency in the 1000Gp3 East Asian descendent samples
1000Gp3_AMR_AF - Alternative allele frequency in the 1000Gp3 American descendent samples
LRT_score - Original LRT two-sided p-value (LRTori), ranges from 0 to 1
GERP++_RS - Conservation score. The larger the score, the more conserved the site, ranges from -12.3 to 6.17
gnomAD_exomes_AF - Alternative allele frequency in the whole gnomAD exome samples

Default: rs_dbSNP,HGVSc_VEP,HGVSp_VEP,1000Gp3_EAS_AF,1000Gp3_AMR_AF,LRT_score,GERP++_RS,gnomAD_exomes_AF
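Because the column names can differ between dbNSFP versions, it may help to compare the requested fields with the header of the dbNSFP file before a run. The sketch below assumes the header is a single tab-separated line that may start with '#'; the function name `missing_fields` and the sample header in the usage line are hypothetical.

```python
# Default value of --dbnsfp_fields as documented above.
DEFAULT_FIELDS = (
    "rs_dbSNP,HGVSc_VEP,HGVSp_VEP,1000Gp3_EAS_AF,"
    "1000Gp3_AMR_AF,LRT_score,GERP++_RS,gnomAD_exomes_AF"
)


def missing_fields(requested, header_line):
    """Return the requested fields that are absent from a dbNSFP header line.

    Assumption: the header is tab-separated and may carry a leading '#'.
    """
    available = set(header_line.lstrip("#").rstrip("\n").split("\t"))
    return [field for field in requested.split(",") if field not in available]


if __name__ == "__main__":
    # Hypothetical, heavily shortened header for illustration only.
    header = "#chr\tpos(1-based)\trs_dbSNP\tLRT_score"
    print(missing_fields(DEFAULT_FIELDS, header))
```

An empty list means every requested field was found in the header.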

--vep_loftee ¶

boolean Optional

Enable the use of the VEP LOFTEE plugin.

For details, see here.

--vep_spliceai ¶

boolean Optional

Enable the use of the VEP SpliceAI plugin.

For details, see here.

--spliceai_snv ¶

string file-path Optional

Path to SpliceAI raw scores SNV file.

To be used with --vep_spliceai.

--spliceai_snv_tbi ¶

string file-path Optional

Path to SpliceAI raw scores SNV tabix-indexed file.

To be used with --vep_spliceai.

--spliceai_indel ¶

string file-path Optional

Path to SpliceAI raw scores indel file.

To be used with --vep_spliceai.

--spliceai_indel_tbi ¶

string file-path Optional

Path to SpliceAI raw scores indel tabix-indexed file.

To be used with --vep_spliceai.
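The four SpliceAI file parameters above are all meant to be used with --vep_spliceai. A small pre-flight check could look like the sketch below. It assumes, as a simplification not stated in the documentation, that all four paths should be set whenever the plugin is enabled; the function name is hypothetical.

```python
# Parameter names as documented above.
SPLICEAI_FILE_PARAMS = (
    "spliceai_snv",
    "spliceai_snv_tbi",
    "spliceai_indel",
    "spliceai_indel_tbi",
)


def unset_spliceai_params(params):
    """List SpliceAI file parameters left unset although vep_spliceai is on.

    Assumption: all four files are expected when the plugin is enabled.
    """
    if not params.get("vep_spliceai"):
        return []
    return [name for name in SPLICEAI_FILE_PARAMS if not params.get(name)]


if __name__ == "__main__":
    # Hypothetical parameter set with only the SNV file configured.
    print(unset_spliceai_params({"vep_spliceai": True, "spliceai_snv": "scores.snv.vcf.gz"}))
```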

--vep_spliceregion ¶

boolean Optional

Enable the use of the VEP SpliceRegion plugin.

For details, see here and here.

--vep_custom_args ¶

string Optional

Add extra custom arguments to VEP.

Use this parameter to pass additional custom arguments to VEP.

Default: --everything --filter_common --per_gene --total_length --offline --format vcf
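If setting --vep_custom_args replaces the default string rather than appending to it (an assumption based on how parameter overrides usually behave, not something this page states), one way to add a flag is to repeat the default and append the extra argument. The sketch below does that with the standard `shlex` module; `--pick` is only an illustrative VEP flag and the helper name is hypothetical.

```python
import shlex

# Default value of --vep_custom_args as documented above.
DEFAULT_VEP_ARGS = (
    "--everything --filter_common --per_gene --total_length --offline --format vcf"
)


def extend_vep_args(extra, base=DEFAULT_VEP_ARGS):
    """Return the base argument string with extra arguments appended."""
    tokens = shlex.split(base) + shlex.split(extra)
    return shlex.join(tokens)


if __name__ == "__main__":
    print(extend_vep_args("--pick"))
```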

--vep_version ¶

string Optional

Should reflect the VEP version used in the container.

Used by the LOFTEE plugin, which needs the full path.

Default: 111.0-0

--outdir_cache ¶

string directory-path Optional

The output directory where the cache will be saved. You have to use absolute paths to storage on Cloud infrastructure.

--vep_out_format ¶

string Optional

VEP output-file format.

Sets the format of the output-file from VEP. Available formats: json, tab and vcf.

Default: vcf

Allowed values: json, tab, vcf

--bcftools_annotations ¶

string file-path Optional

A vcf file containing custom annotations to be used with bcftools annotate. Needs to be bgzipped.

--bcftools_annotations_tbi ¶

string file-path Optional

Index file for bcftools_annotations

--bcftools_columns ¶

string Optional

Optional text file with list of columns to use from bcftools_annotations, one name per row

--bcftools_header_lines ¶

string Optional

Text file with the header lines of bcftools_annotations

General reference genome options ¶

--igenomes_base ¶

string directory-path Optional

The base path to the igenomes reference files

Default: s3://ngi-igenomes/igenomes/

--igenomes_ignore ¶

boolean Optional

Do not load the iGenomes reference config.

Do not load igenomes.config when running the pipeline. You may choose this option if you observe clashes between custom parameters and those supplied in igenomes.config. NB You can then run Sarek by specifying at least a FASTA genome file.

--save_reference ¶

boolean Optional

Save built references.

Set this parameter if you wish to save all computed reference files. This is useful to avoid re-computation on future runs.

--build_only_index ¶

boolean Optional

Only build references.

Set this parameter if you wish to compute and save all reference files. No alignment or any other downstream steps will be performed.

--download_cache ¶

boolean Optional

Download annotation cache.

Set this parameter if you wish to download the annotation cache. Using this parameter will download the cache even if --snpeff_cache and --vep_cache are provided.

Reference genome options ¶

--genome ¶

string Optional

Name of iGenomes reference.

If using a reference genome configured in the pipeline using iGenomes, use this parameter to give the ID for the reference. This is then used to build the full paths for all required reference genome files e.g. --genome GRCh38.

See the nf-core website docs for more details.

Default: GATK.GRCh38

--ascat_genome ¶

string Optional

ASCAT genome.

Must be set to run ASCAT, either hg19 or hg38.

If you use AWS iGenomes, this has already been set for you appropriately.

Allowed values: hg19, hg38

--ascat_alleles ¶

string file-path Optional

Path to ASCAT allele zip file.

If you use AWS iGenomes, this has already been set for you appropriately.

--ascat_loci ¶

string file-path Optional

Path to ASCAT loci zip file.

If you use AWS iGenomes, this has already been set for you appropriately.

--ascat_loci_gc ¶

string file-path Optional

Path to ASCAT GC content correction file.

If you use AWS iGenomes, this has already been set for you appropriately.

--ascat_loci_rt ¶

string file-path Optional

Path to ASCAT RT (replication timing) correction file.

If you use AWS iGenomes, this has already been set for you appropriately.

--bwa ¶

string directory-path Optional

Path to BWA mem indices.

If you wish to recompute indices available on igenomes, set --bwa false.

NB If none provided, will be generated automatically from the FASTA reference. Combine with --save_reference to save for future runs.

If you use AWS iGenomes, this has already been set for you appropriately.
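Settings such as `--bwa false` combined with `--save_reference` can also be kept in a parameters file, since Nextflow accepts JSON or YAML through its `-params-file` option. The sketch below writes such a JSON document; the choice of settings is hypothetical, and it assumes that a JSON `false` is treated like `--bwa false` on the command line.

```python
import json

# Hypothetical choice of settings: recompute the BWA indices and keep them.
params = {
    "bwa": False,            # mirrors `--bwa false` from the text above
    "save_reference": True,  # mirrors `--save_reference`
}


def to_params_json(values):
    """Serialise a parameter mapping to JSON text for a params file."""
    return json.dumps(values, indent=2, sort_keys=True)


if __name__ == "__main__":
    print(to_params_json(params))
```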

--bwamem2 ¶

string directory-path Optional

Path to bwa-mem2 mem indices.

If you use AWS iGenomes, this has already been set for you appropriately.

If you wish to recompute indices available on igenomes, set --bwamem2 false.

NB If none provided, will be generated automatically from the FASTA reference, if --aligner bwa-mem2 is specified. Combine with --save_reference to save for future runs.

--chr_dir ¶

string path Optional

Path to chromosomes folder used with ControLFREEC.

If you use AWS iGenomes, this has already been set for you appropriately.

--dbsnp ¶

string file-path Optional

Path to dbsnp file.

If you use AWS iGenomes, this has already been set for you appropriately.

--dbsnp_tbi ¶

string file-path Optional

Path to dbsnp index.

NB If none provided, will be generated automatically from the dbsnp file. Combine with --save_reference to save for future runs.

If you use AWS iGenomes, this has already been set for you appropriately.

--dbsnp_vqsr ¶

string Optional

Label string for VariantRecalibration (haplotypecaller joint variant calling).

If you use AWS iGenomes, this has already been set for you appropriately.

--dict ¶

string file-path Optional

Path to FASTA dictionary file.

NB If none provided, will be generated automatically from the FASTA reference. Combine with --save_reference to save for future runs.

If you use AWS iGenomes, this has already been set for you appropriately.

--dragmap ¶

string directory-path Optional

Path to dragmap indices.

If you wish to recompute indices available on igenomes, set --dragmap false.

NB If none provided, will be generated automatically from the FASTA reference, if --aligner dragmap is specified. Combine with --save_reference to save for future runs.

If you use AWS iGenomes, this has already been set for you appropriately.

--fasta ¶

string file-path Optional

Path to FASTA genome file.

This parameter is mandatory if --genome is not specified.

If you use AWS iGenomes, this has already been set for you appropriately.

--fasta_fai ¶

string file-path Optional

Path to FASTA reference index.

NB If none provided, will be generated automatically from the FASTA reference. Combine with --save_reference to save for future runs.

If you use AWS iGenomes, this has already been set for you appropriately.

--germline_resource ¶

string file-path Optional

Path to GATK Mutect2 Germline Resource File.

The germline resource VCF file (bgzipped and tabixed) needed by GATK4 Mutect2 is a collection of calls that are likely present in the sample, with allele frequencies. The AF info field must be present. You can find a smaller, stripped gnomAD VCF file (most of the annotation is removed and only calls signed by PASS are stored) in the AWS iGenomes Annotation/GermlineResource folder.

If you use AWS iGenomes, this has already been set for you appropriately.
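Since the AF INFO field must be present in the germline resource, a quick look at the VCF header can catch an unsuitable file early. The sketch below only checks that the header declares an INFO field with ID `AF`; it does not verify that every record carries a value. The function name and the sample header lines are hypothetical.

```python
import re

# Matches a header declaration for the INFO field named exactly "AF".
AF_INFO = re.compile(r"^##INFO=<ID=AF[,>]")


def declares_af(header_lines):
    """Return True if any VCF header line declares an AF INFO field."""
    return any(AF_INFO.match(line) for line in header_lines)


if __name__ == "__main__":
    header = [
        "##fileformat=VCFv4.2",
        '##INFO=<ID=AF,Number=A,Type=Float,Description="Allele frequency">',
    ]
    print(declares_af(header))
```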

--germline_resource_tbi ¶

string file-path Optional

Path to GATK Mutect2 Germline Resource Index.

NB If none provided, will be generated automatically from the Germline Resource file, if provided. Combine with --save_reference to save for future runs.

If you use AWS iGenomes, this has already been set for you appropriately.

--known_indels ¶

string file-path-pattern Optional

Path to known indels file.

If you use AWS iGenomes, this has already been set for you appropriately.

--known_indels_tbi ¶

string file-path-pattern Optional

Path to known indels file index.

NB If none provided, will be generated automatically from the known indels file, if provided. Combine with --save_reference to save for future runs.

If you use AWS iGenomes, this has already been set for you appropriately.

--known_indels_vqsr ¶

string Optional

Label string for VariantRecalibration (haplotypecaller joint variant calling). If you use AWS iGenomes, this has already been set for you appropriately.

--known_snps ¶

string file-path Optional

Path to known snps file.

If you use AWS iGenomes, this has already been set for you appropriately.

--known_snps_tbi ¶

string file-path Optional

Path to known snps file index.

NB If none provided, will be generated automatically from the known snps file, if provided. Combine with --save_reference to save for future runs.

If you use AWS iGenomes, this has already been set for you appropriately.

--known_snps_vqsr ¶

string Optional

Label string for VariantRecalibration (haplotypecaller joint variant calling). If you use AWS iGenomes, this has already been set for you appropriately.

--mappability ¶

string file-path Optional

Path to Control-FREEC mappability file.

If you use AWS iGenomes, this has already been set for you appropriately.

--msisensor2_models ¶

string path Optional

Path to models folder used with MSIsensor2.

If you use AWS iGenomes, this has already been set for you appropriately.

--msisensor2_scan ¶

string path Optional

Path to scan file used with MSIsensor2.

If you use AWS iGenomes, this has already been set for you appropriately.

--msisensorpro_scan ¶

string path Optional

Path to scan file used with MSIsensorPro.

If you use AWS iGenomes, this has already been set for you appropriately.

--ngscheckmate_bed ¶

string file-path Optional

Path to SNP bed file for sample checking with NGSCheckMate

If you use AWS iGenomes, this has already been set for you appropriately.

--sentieon_dnascope_model ¶

string file-path Optional

Machine learning model for Sentieon DNAscope.

It is recommended to use DNAscope with a machine learning model to perform variant calling with higher accuracy by improving the candidate detection and filtering. Sentieon can provide you with a model trained using a subset of the data from the GiAB truth-set found in https://github.com/genome-in-a-bottle. In addition, Sentieon can assist you in the creation of models using your own data, which will calibrate the specifics of your sequencing and bio-informatics processing.

If you use AWS iGenomes, this has already been set for you appropriately.

--snpeff_cache ¶

string directory-path Optional

Path to snpEff cache.

Path to snpEff cache which should contain the relevant genome and build directory in the path ${snpeff_species}.${snpeff_version}

If you use AWS iGenomes, this has already been set for you appropriately.

Default: s3://annotation-cache/snpeff_cache/

--snpeff_db ¶

string Optional

snpEff DB version.

This is used to specify the database to annotate with. Alternatively, database names can be listed with the snpEff databases command.

If you use AWS iGenomes, this has already been set for you appropriately.

--vep_cache ¶

string directory-path Optional

Path to VEP cache.

Path to VEP cache which should contain the relevant species, genome and build directories at the path ${vep_species}/${vep_genome}_${vep_cache_version}

If you use AWS iGenomes, this has already been set for you appropriately.

Default: s3://annotation-cache/vep_cache/
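The directory layouts described for --snpeff_cache and --vep_cache can be written out as small path builders, which is handy when checking a local or cloud cache before a run. This is a sketch of the documented patterns only; the function names are hypothetical, and the species, genome and version values in the usage lines are illustrative rather than pipeline defaults.

```python
import posixpath


def snpeff_cache_subdir(cache, species, version):
    """Build <cache>/<snpeff_species>.<snpeff_version> as described above."""
    return posixpath.join(cache, f"{species}.{version}")


def vep_cache_subdir(cache, species, genome, cache_version):
    """Build <cache>/<vep_species>/<vep_genome>_<vep_cache_version>."""
    return posixpath.join(cache, species, f"{genome}_{cache_version}")


if __name__ == "__main__":
    # Illustrative values only.
    print(snpeff_cache_subdir("s3://annotation-cache/snpeff_cache/", "GRCh38", "105"))
    print(vep_cache_subdir("s3://annotation-cache/vep_cache/", "homo_sapiens", "GRCh38", "111"))
```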

--vep_cache_version ¶

string Optional

VEP cache version.

Alternative cache version can be used to specify the correct Ensembl Genomes version number as these differ from the concurrent Ensembl/VEP version numbers.

If you use AWS iGenomes, this has already been set for you appropriately.

--vep_genome ¶

string Optional

VEP genome.

This is used to specify the genome when looking for local cache, or cloud based cache.

If you use AWS iGenomes, this has already been set for you appropriately.

--vep_species ¶

string Optional

VEP species.

Alternatively species listed in Ensembl Genomes caches can be used.

If you use AWS iGenomes, this has already been set for you appropriately.

Institutional config options ¶

--custom_config_version ¶

string Optional

Git commit id for Institutional configs.

Default: master

--custom_config_base ¶

string Optional

Base directory for Institutional configs.

If you're running offline, Nextflow will not be able to fetch the institutional config files from the internet. If you don't need them, then this is not a problem. If you do need them, you should download the files from the repo and tell Nextflow where to find them with this parameter.

Default: https://raw.githubusercontent.com/nf-core/configs/master

--config_profile_name ¶

string Optional

Institutional config name.

--config_profile_description ¶

string Optional

Institutional config description.

--config_profile_contact ¶

string Optional

Institutional config contact information.

--config_profile_url ¶

string Optional

Institutional config URL link.

--test_data_base ¶

string Optional

Base path / URL for data used in the test profiles

Warning: The -profile test samplesheet file itself contains remote paths. Setting this parameter does not alter the contents of that file.

Default: https://raw.githubusercontent.com/nf-core/test-datasets/sarek3

--modules_testdata_base_path ¶

string Optional

Base path / URL for data used in the modules

--seq_center ¶

string Optional

Sequencing center information to be added to read group (CN field).

--seq_platform ¶

string Optional

Sequencing platform information to be added to read group (PL field).

Will be used to create a proper header for further GATK4 downstream analysis.

Default: ILLUMINA
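To illustrate where --seq_center and --seq_platform end up, the sketch below assembles a SAM `@RG` header line with the CN and PL tags. It is not the code Sarek uses to build read groups; the helper name and the ID, sample and centre values are hypothetical.

```python
def read_group_line(rg_id, sample, seq_center=None, seq_platform="ILLUMINA"):
    """Assemble a tab-separated SAM @RG header line (illustrative only)."""
    fields = ["@RG", f"ID:{rg_id}", f"SM:{sample}"]
    if seq_center:
        # CN is only written when a sequencing centre is given.
        fields.append(f"CN:{seq_center}")
    fields.append(f"PL:{seq_platform}")
    return "\t".join(fields)


if __name__ == "__main__":
    # Hypothetical identifiers.
    print(read_group_line("lane1", "sample1", seq_center="ExampleCenter"))
```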

Generic options ¶

--version ¶

boolean Optional

Display version and exit.

--publish_dir_mode ¶

string Optional

Method used to save pipeline results to output directory.

The Nextflow publishDir option specifies which intermediate files should be saved to the output directory. This option tells the pipeline what method should be used to move these files. See Nextflow docs for details.

Default: copy

Allowed values: symlink, rellink, link, copy, copyNoFollow, move
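A wrapper script that launches the pipeline could reject an unsupported publishing mode before starting a run. The sketch below checks a value against the allowed list given above; the function name is hypothetical, and the pipeline performs its own parameter validation independently of this.

```python
# Allowed values and default of --publish_dir_mode as documented above.
ALLOWED_PUBLISH_MODES = ("symlink", "rellink", "link", "copy", "copyNoFollow", "move")


def check_publish_dir_mode(mode="copy"):
    """Return the mode unchanged, or raise ValueError if it is not allowed."""
    if mode not in ALLOWED_PUBLISH_MODES:
        allowed = ", ".join(ALLOWED_PUBLISH_MODES)
        raise ValueError(f"unsupported publish_dir_mode {mode!r}; expected one of: {allowed}")
    return mode


if __name__ == "__main__":
    print(check_publish_dir_mode("symlink"))
```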

--email ¶

string Optional

Email address for completion summary.

Set this parameter to your e-mail address to get a summary e-mail with details of the run sent to you when the workflow exits. If set in your user config file (~/.nextflow/config) then you don't need to specify this on the command line for every run.

--email_on_fail ¶

string Optional

Email address for completion summary, only when pipeline fails.

An email address to send a summary email to when the pipeline is completed - ONLY sent if the pipeline does not exit successfully.

--plaintext_email ¶

boolean Optional

Send plain-text email instead of HTML.

--max_multiqc_email_size ¶

string Optional

File size limit when attaching MultiQC reports to summary emails.

Default: 25.MB
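The default `25.MB` uses Nextflow's size notation. For tooling that needs the limit as a number, the sketch below converts such strings to bytes. It assumes binary multiples (1 KB = 1024 bytes) and handles only the simple forms shown in the usage lines; the function name is hypothetical.

```python
import re

# Assumption: binary multiples, i.e. 1 KB = 1024 bytes.
_UNITS = {"B": 1, "KB": 1024, "MB": 1024**2, "GB": 1024**3, "TB": 1024**4}
_SIZE = re.compile(r"^\s*(\d+(?:\.\d+)?)\s*\.?\s*(B|KB|MB|GB|TB)\s*$", re.IGNORECASE)


def size_to_bytes(text):
    """Convert strings such as '25.MB' or '25 MB' to a number of bytes."""
    match = _SIZE.match(text)
    if match is None:
        raise ValueError(f"unrecognised size: {text!r}")
    number, unit = match.groups()
    return int(float(number) * _UNITS[unit.upper()])


if __name__ == "__main__":
    print(size_to_bytes("25.MB"))
    print(size_to_bytes("25 MB"))
```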

--monochrome_logs ¶

boolean Optional

Do not use coloured log outputs.

--hook_url ¶

string Optional

Incoming hook URL for messaging service

Currently, MS Teams and Slack are supported.

--multiqc_title ¶

string Optional

MultiQC report title. Printed as page header, used for filename if not otherwise specified.

--multiqc_config ¶

string file-path Optional

Custom config file to supply to MultiQC.

--multiqc_logo ¶

string Optional

Custom logo file to supply to MultiQC. File name must also be set in the MultiQC config file

--multiqc_methods_description ¶

string Optional

Custom MultiQC yaml file containing HTML including a methods description.

--validate_params ¶

boolean Optional

Boolean whether to validate parameters against the schema at runtime

Default: True

--pipelines_testdata_base_path ¶

string Optional

Base URL or local path to location of pipeline test dataset files

Default: https://raw.githubusercontent.com/nf-core/test-datasets/

--trace_report_suffix ¶

string Optional

Suffix to add to the trace report filename. Default is the date and time in the format yyyy-MM-dd_HH-mm-ss.
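The pattern `yyyy-MM-dd_HH-mm-ss` corresponds to `%Y-%m-%d_%H-%M-%S` in Python's `strftime` notation. The sketch below produces a suffix in that shape, for example to predict or match report filenames in a wrapper script; the function name is hypothetical.

```python
from datetime import datetime


def trace_suffix(now=None):
    """Format a timestamp as yyyy-MM-dd_HH-mm-ss (current time by default)."""
    now = now or datetime.now()
    return now.strftime("%Y-%m-%d_%H-%M-%S")


if __name__ == "__main__":
    print(trace_suffix(datetime(2024, 3, 5, 14, 7, 9)))
```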

--help ¶

boolean Optional

Display the help message.

--help_full ¶

boolean Optional

Display the full detailed help message.

--show_hidden ¶

boolean Optional

Display hidden parameters in the help message (only works when --help or --help_full are provided).

Workflows

This page documents all workflows in the pipeline.

workflow <entry> Entry Point [source] ¶

Defined in main.nf:298

workflow ANNOTATION_CACHE_INITIALISATION [source] ¶

Defined in subworkflows/local/annotation_cache_initialisation/main.nf:11

Inputs (take)

Name	Description
`snpeff_enabled`	-
`snpeff_cache`	-
`snpeff_db`	-
`vep_enabled`	-
`vep_cache`	-
`vep_species`	-
`vep_cache_version`	-
`vep_genome`	-
`vep_custom_args`	-
`help_message`	-

Outputs (emit)

Name	Description
`?`	-
`?`	-

workflow BAM_APPLYBQSR [source] ¶

Defined in subworkflows/local/bam_applybqsr/main.nf:11

Inputs (take)

Name	Description
`cram`	-
`dict`	-
`fasta`	-
`fasta_fai`	-
`intervals`	-

Outputs (emit)

Name	Description
`bam`	-
`cram`	-
`?`	-

workflow BAM_APPLYBQSR_SPARK [source] ¶

Defined in subworkflows/local/bam_applybqsr_spark/main.nf:11

Inputs (take)

Name	Description
`cram`	-
`dict`	-
`fasta`	-
`fasta_fai`	-
`intervals`	-

Outputs (emit)

Name	Description
`bam`	-
`cram`	-
`?`	-

workflow BAM_BASERECALIBRATOR [source] ¶

Defined in subworkflows/local/bam_baserecalibrator/main.nf:10

Inputs (take)

Name	Description
`cram`	-
`dict`	-
`fasta`	-
`fasta_fai`	-
`intervals`	-
`known_sites`	-
`known_sites_tbi`	-

Outputs (emit)

Name	Description
`?`	-
`?`	-

workflow BAM_BASERECALIBRATOR_SPARK [source] ¶

Defined in subworkflows/local/bam_baserecalibrator_spark/main.nf:10

Inputs (take)

Name	Description
`cram`	-
`dict`	-
`fasta`	-
`fasta_fai`	-
`intervals`	-
`known_sites`	-
`known_sites_tbi`	-

Outputs (emit)

Name	Description
`?`	-
`?`	-

workflow BAM_CONVERT_SAMTOOLS [source] ¶

Defined in subworkflows/local/bam_convert_samtools/main.nf:14

Inputs (take)

Name	Description
`input`	-
`fasta`	-
`fasta_fai`	-
`interleaved`	-

Outputs (emit)

Name	Description
`?`	-
`?`	-

workflow BAM_JOINT_CALLING_GERMLINE_GATK [source] ¶

Defined in subworkflows/local/bam_joint_calling_germline_gatk/main.nf:17

Inputs (take)

Name	Description
`input`	-
`fasta`	-
`fai`	-
`dict`	-
`dbsnp`	-
`dbsnp_tbi`	-
`dbsnp_vqsr`	-
`resource_indels_vcf`	-
`resource_indels_tbi`	-
`known_indels_vqsr`	-
`resource_snps_vcf`	-
`resource_snps_tbi`	-
`known_snps_vqsr`	-

Outputs (emit)

Name	Description
`?`	-
`?`	-
`?`	-

workflow BAM_JOINT_CALLING_GERMLINE_SENTIEON [source] ¶

Defined in subworkflows/local/bam_joint_calling_germline_sentieon/main.nf:15

Inputs (take)

Name	Description
`input`	-
`fasta`	-
`fai`	-
`dict`	-
`dbsnp`	-
`dbsnp_tbi`	-
`dbsnp_vqsr`	-
`resource_indels_vcf`	-
`resource_indels_tbi`	-
`known_indels_vqsr`	-
`resource_snps_vcf`	-
`resource_snps_tbi`	-
`known_snps_vqsr`	-
`variant_caller`	-

Outputs (emit)

Name	Description
`?`	-
`?`	-
`?`	-

workflow BAM_MARKDUPLICATES [source] ¶

Defined in subworkflows/local/bam_markduplicates/main.nf:10

Inputs (take)

Name	Description
`bam`	-
`fasta`	-
`fasta_fai`	-
`intervals_bed_combined`	-

Outputs (emit)

Name	Description
`?`	-
`?`	-
`?`	-

workflow BAM_MARKDUPLICATES_SPARK [source] ¶

Defined in subworkflows/local/bam_markduplicates_spark/main.nf:12

Inputs (take)

Name	Description
`bam`	-
`dict`	-
`fasta`	-
`fasta_fai`	-
`intervals_bed_combined`	-

Outputs (emit)

Name	Description
`?`	-
`?`	-
`?`	-

workflow BAM_MERGE_INDEX_SAMTOOLS [source] ¶

Defined in subworkflows/local/bam_merge_index_samtools/main.nf:10

Inputs (take)

Name	Description
`bam`	-

Outputs (emit)

Name	Description
`?`	-
`?`	-

workflow BAM_NGSCHECKMATE [source] ¶

Defined in subworkflows/nf-core/bam_ngscheckmate/main.nf:4

ngscheckmate qc bam snp

Take a set of bam files and run NGSCheckMate to determine whether samples match with each other, using a set of SNPs.

Components

bcftools/mpileup ngscheckmate/ncm

Inputs (take)

Name	Description
`meta1`	Groovy Map containing sample information e.g. [ id:'test' ]
`bam`	BAM files for each sample
`meta2`	Groovy Map containing bed file information e.g. [ id:'sarscov2' ]
`snp_bed`	BED file containing the SNPs to analyse. NGSCheckMate provides some default ones for hg19/hg38.
`meta3`	Groovy Map containing reference genome meta information e.g. [ id:'sarscov2' ]
`fasta`	fasta file for the genome

Outputs (emit)

Name	Description
`pdf`	A pdf containing a dendrogram showing how the samples match up
`corr_matrix`	A text file containing the correlation matrix between each sample
`matched`	A txt file containing only the samples that match with each other
`all`	A txt file containing all the sample comparisons, whether they match or not
`vcf`	vcf files for each sample giving the SNP calls
`versions`	File containing software versions

Authors: @SPPearce Maintainers: @SPPearce

workflow BAM_SENTIEON_DEDUP [source] ¶

Defined in subworkflows/local/bam_sentieon_dedup/main.nf:7

Inputs (take)

Name	Description
`bam`	-
`bai`	-
`fasta`	-
`fasta_fai`	-
`intervals_bed_combined`	-

Outputs (emit)

Name	Description
`?`	-
`?`	-
`?`	-

workflow BAM_VARIANT_CALLING_CNVKIT [source] ¶

Defined in subworkflows/local/bam_variant_calling_cnvkit/main.nf:12

Inputs (take)

Name	Description
`cram`	-
`fasta`	-
`fasta_fai`	-
`targets`	-
`reference`	-

Outputs (emit)

Name	Description
`cnv_calls_raw`	-
`cnv_calls_export`	-
`?`	-

workflow BAM_VARIANT_CALLING_DEEPVARIANT [source] ¶

Defined in subworkflows/local/bam_variant_calling_deepvariant/main.nf:12

Inputs (take)

Name	Description
`cram`	-
`dict`	-
`fasta`	-
`fasta_fai`	-
`intervals`	-

Outputs (emit)

Name	Description
`?`	-
`?`	-
`?`	-
`?`	-

workflow BAM_VARIANT_CALLING_FREEBAYES [source] ¶

Defined in subworkflows/local/bam_variant_calling_freebayes/main.nf:14

Inputs (take)

Name	Description
`ch_cram`	-
`ch_dict`	-
`ch_fasta`	-
`ch_fasta_fai`	-
`ch_intervals`	-

Outputs (emit)

Name	Description
`vcf_unfiltered`	-
`vcf`	-
`tbi`	-
`?`	-

workflow BAM_VARIANT_CALLING_GERMLINE_ALL [source] ¶

Defined in subworkflows/local/bam_variant_calling_germline_all/main.nf:22

Inputs (take)

Name	Description
`tools`	-
`skip_tools`	-
`bam`	-
`cram`	-
`bwa`	-
`cnvkit_reference`	-
`dbsnp`	-
`dbsnp_tbi`	-
`dbsnp_vqsr`	-
`dict`	-
`fasta`	-
`fasta_fai`	-
`intervals`	-
`intervals_bed_combined`	-
`intervals_bed_gz_tbi_combined`	-
`intervals_bed_combined_haplotypec`	-
`intervals_bed_gz_tbi`	-
`known_indels_vqsr`	-
`known_sites_indels`	-
`known_sites_indels_tbi`	-
`known_sites_snps`	-
`known_sites_snps_tbi`	-
`known_snps_vqsr`	-
`joint_germline`	-
`skip_haplotypecaller_filter`	-
`sentieon_haplotyper_emit_mode`	-
`sentieon_dnascope_emit_mode`	-
`sentieon_dnascope_pcr_indel_model`	-
`sentieon_dnascope_model`	-

Outputs (emit)

Name	Description
`?`	-
`?`	-
`?`	-
`?`	-
`?`	-
`?`	-
`?`	-
`?`	-
`?`	-
`?`	-
`?`	-
`?`	-
`?`	-
`?`	-
`?`	-
`?`	-
`?`	-
`?`	-
`?`	-
`?`	-
`?`	-
`?`	-
`?`	-

workflow BAM_VARIANT_CALLING_GERMLINE_MANTA [source] ¶

Defined in subworkflows/local/bam_variant_calling_germline_manta/main.nf:10

Inputs (take)

Name	Description
`cram`	-
`fasta`	-
`fasta_fai`	-
`intervals`	-

Outputs (emit)

Name	Description
`?`	-
`?`	-
`?`	-

workflow BAM_VARIANT_CALLING_HAPLOTYPECALLER [source] ¶

Defined in subworkflows/local/bam_variant_calling_haplotypecaller/main.nf:11

Inputs (take)

Name	Description
`cram`	-
`fasta`	-
`fasta_fai`	-
`dict`	-
`dbsnp`	-
`dbsnp_tbi`	-
`intervals`	-

Outputs (emit)

Name	Description
`?`	-
`?`	-
`?`	-
`?`	-
`?`	-

workflow BAM_VARIANT_CALLING_INDEXCOV [source] ¶

Defined in subworkflows/local/bam_variant_calling_indexcov/main.nf:11

Inputs (take)

Name	Description
`cram`	-
`fasta`	-
`fasta_fai`	-

Outputs (emit)

Name	Description
`out_indexcov`	-
`?`	-

workflow BAM_VARIANT_CALLING_MPILEUP [source] ¶

Defined in subworkflows/local/bam_variant_calling_mpileup/main.nf:12

Inputs (take)

Name	Description
`cram`	-
`dict`	-
`fasta`	-
`intervals`	-

Outputs (emit)

Name	Description
`?`	-
`?`	-
`?`	-
`?`	-

workflow BAM_VARIANT_CALLING_SENTIEON_DNASCOPE [source] ¶

Defined in subworkflows/local/bam_variant_calling_sentieon_dnascope/main.nf:11

Inputs (take)

Name	Description
`cram`	-
`fasta`	-
`fasta_fai`	-
`dict`	-
`dbsnp`	-
`dbsnp_tbi`	-
`dbsnp_vqsr`	-
`intervals`	-
`joint_germline`	-
`sentieon_dnascope_emit_mode`	-
`sentieon_dnascope_pcr_indel_model`	-
`sentieon_dnascope_model`	-

Outputs (emit)

Name	Description
`?`	-
`?`	-
`?`	-
`?`	-
`?`	-
`?`	-

workflow BAM_VARIANT_CALLING_SENTIEON_HAPLOTYPER [source] ¶

Defined in subworkflows/local/bam_variant_calling_sentieon_haplotyper/main.nf:11

Inputs (take)

Name	Description
`cram`	-
`fasta`	-
`fasta_fai`	-
`dict`	-
`dbsnp`	-
`dbsnp_tbi`	-
`dbsnp_vqsr`	-
`intervals`	-
`joint_germline`	-
`sentieon_haplotyper_emit_mode`	-

Outputs (emit)

Name	Description
`?`	-
`?`	-
`?`	-
`?`	-
`?`	-
`?`	-

workflow BAM_VARIANT_CALLING_SINGLE_STRELKA [source] ¶

Defined in subworkflows/local/bam_variant_calling_single_strelka/main.nf:11

Inputs (take)

Name	Description
`cram`	-
`dict`	-
`fasta`	-
`fasta_fai`	-
`intervals`	-

Outputs (emit)

Name	Description
`?`	-
`?`	-
`?`	-

workflow BAM_VARIANT_CALLING_SINGLE_TIDDIT [source] ¶

Defined in subworkflows/local/bam_variant_calling_single_tiddit/main.nf:10

Inputs (take)

Name	Description
`cram`	-
`fasta`	-
`bwa`	-

Outputs (emit)

Name	Description
`?`	-
`?`	-
`?`	-
`?`	-

workflow BAM_VARIANT_CALLING_SOMATIC_ALL [source] ¶

Defined in subworkflows/local/bam_variant_calling_somatic_all/main.nf:21

Inputs (take)

Name	Description
`tools`	-
`bam`	-
`cram`	-
`bwa`	-
`cf_chrom_len`	-
`chr_files`	-
`dbsnp`	-
`dbsnp_tbi`	-
`dict`	-
`fasta`	-
`fasta_fai`	-
`germline_resource`	-
`germline_resource_tbi`	-
`intervals`	-
`intervals_bed_gz_tbi`	-
`intervals_bed_combined`	-
`intervals_bed_gz_tbi_combined`	-
`mappability`	-
`msisensorpro_scan`	-
`panel_of_normals`	-
`panel_of_normals_tbi`	-
`allele_files`	-
`loci_files`	-
`gc_file`	-
`rt_file`	-
`joint_mutect2`	-
`wes`	-

Outputs (emit)

Name	Description
`?`	-
`?`	-
`?`	-
`?`	-
`?`	-
`?`	-
`?`	-
`?`	-
`?`	-
`?`	-
`?`	-
`?`	-
`?`	-
`?`	-
`?`	-
`?`	-
`?`	-
`?`	-
`?`	-

workflow BAM_VARIANT_CALLING_SOMATIC_ASCAT [source] ¶

Defined in subworkflows/local/bam_variant_calling_somatic_ascat/main.nf:9

Inputs (take)

Name	Description
`cram_pair`	-
`allele_files`	-
`loci_files`	-
`intervals_bed`	-
`fasta`	-
`gc_file`	-
`rt_file`	-

Outputs (emit)

Name	Description
`versions`	-

workflow BAM_VARIANT_CALLING_SOMATIC_CONTROLFREEC [source] ¶

Defined in subworkflows/local/bam_variant_calling_somatic_controlfreec/main.nf:13

Inputs (take)

Name	Description
`controlfreec_input`	-
`fasta`	-
`fasta_fai`	-
`dbsnp`	-
`dbsnp_tbi`	-
`chr_files`	-
`mappability`	-
`intervals_bed`	-

Outputs (emit)

Name	Description
`versions`	-

workflow BAM_VARIANT_CALLING_SOMATIC_MANTA [source] ¶

Defined in subworkflows/local/bam_variant_calling_somatic_manta/main.nf:9

Inputs (take)

Name	Description
`cram`	-
`fasta`	-
`fasta_fai`	-
`intervals`	-

Outputs (emit)

Name	Description
`?`	-
`?`	-
`?`	-
`?`	-
`?`	-

workflow BAM_VARIANT_CALLING_SOMATIC_MUSE [source] ¶

Defined in subworkflows/local/bam_variant_calling_somatic_muse/main.nf:11

Inputs (take)

Name	Description
`bam_normal`	-
`bam_tumor`	-
`fasta`	-
`dbsnp`	-

Outputs (emit)

Name	Description
`?`	-
`?`	-
`?`	-

workflow BAM_VARIANT_CALLING_SOMATIC_MUTECT2 [source] ¶

Defined in subworkflows/local/bam_variant_calling_somatic_mutect2/main.nf:17

gatk4 mutect2 learnreadorientationmodel getpileupsummaries calculatecontamination filtermutectcalls variant_calling tumor_only filtered_vcf

Perform variant calling on a paired tumor normal set of samples using mutect2 tumor normal mode. The f1r2 output of mutect2 is run through learnreadorientationmodel to get the artifact priors. Run the input bam files through getpileupsummaries and then calculatecontamination to get the contamination and segmentation tables. Filter the mutect2 output vcf using filtermutectcalls, artifact priors and the contamination & segmentation tables for additional filtering.

Components

gatk4/mutect2 gatk4/learnreadorientationmodel gatk4/getpileupsummaries gatk4/calculatecontamination gatk4/filtermutectcalls

Inputs (take)

Name	Description
`meta`	Groovy Map containing sample information e.g. [ id:'test' ]
`input`	list containing the tumor and normal BAM files, in that order, also able to take CRAM as an input
`input_index`	list containing the tumor and normal BAM file indexes, in that order, also able to take CRAM index as an input
`which_norm`	optional list of sample headers contained in the normal sample input file.
`fasta`	The reference fasta file
`fai`	Index of reference fasta file
`dict`	GATK sequence dictionary
`germline_resource`	Population vcf of germline sequencing, containing allele fractions.
`germline_resource_tbi`	Index file for the germline resource.
`panel_of_normals`	vcf file to be used as a panel of normals.
`panel_of_normals_tbi`	Index for the panel of normals.
`interval_file`	File containing intervals.

Outputs (emit)

Name	Description
`versions`	File containing software versions
`mutect2_vcf`	Compressed vcf file to be used for variant_calling.
`mutect2_tbi`	Indexes of the mutect2_vcf file
`mutect2_stats`	Stats files for the mutect2 vcf
`mutect2_f1r2`	file containing information to be passed to LearnReadOrientationModel.
`artifact_priors`	file containing artifact-priors to be used by filtermutectcalls.
`pileup_table_tumor`	File containing the tumor pileup summary table, kept separate as calculatecontamination needs them individually specified.
`pileup_table_normal`	File containing the normal pileup summary table, kept separate as calculatecontamination needs them individually specified.
`contamination_table`	File containing the contamination table.
`segmentation_table`	Output table containing segmentation of tumor minor allele fractions.
`filtered_vcf`	file containing filtered mutect2 calls.
`filtered_tbi`	tbi file that pairs with filtered vcf.
`filtered_stats`	file containing statistics of the filtermutectcalls run.

Authors: @GCJMackenzie
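The step order described for BAM_VARIANT_CALLING_SOMATIC_MUTECT2 can be read as a small dependency graph. The sketch below encodes that reading with Python's standard `graphlib` module (Python 3.9+) and derives one valid execution order. The dependency map is an interpretation of the description above, not code taken from the subworkflow.

```python
from graphlib import TopologicalSorter

# Each step maps to the steps whose output it consumes.
STEP_DEPENDENCIES = {
    "mutect2": set(),
    "learnreadorientationmodel": {"mutect2"},       # uses the f1r2 output
    "getpileupsummaries": set(),                    # runs on the input BAM/CRAM files
    "calculatecontamination": {"getpileupsummaries"},
    "filtermutectcalls": {
        "mutect2",
        "learnreadorientationmodel",
        "calculatecontamination",
    },
}


def execution_order(dependencies=STEP_DEPENDENCIES):
    """Return one order in which every step follows its prerequisites."""
    return list(TopologicalSorter(dependencies).static_order())


if __name__ == "__main__":
    print(execution_order())
```

Steps without a dependency on each other, such as mutect2 and getpileupsummaries, may appear in either order.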

workflow BAM_VARIANT_CALLING_SOMATIC_STRELKA [source] ¶

Defined in subworkflows/local/bam_variant_calling_somatic_strelka/main.nf:11

Inputs (take)

Name	Description
`cram`	-
`dict`	-
`fasta`	-
`fasta_fai`	-
`intervals`	-

Outputs (emit)

Name	Description
`?`	-
`?`	-
`?`	-

workflow BAM_VARIANT_CALLING_SOMATIC_TIDDIT [source] ¶

Defined in subworkflows/local/bam_variant_calling_somatic_tiddit/main.nf:11

Inputs (take)

Name	Description
`cram_normal`	-
`cram_tumor`	-
`fasta`	-
`bwa`	-

Outputs (emit)

Name	Description
`?`	-
`?`	-
`?`	-

workflow BAM_VARIANT_CALLING_SOMATIC_TNSCOPE [source] ¶

Defined in subworkflows/local/bam_variant_calling_somatic_tnscope/main.nf:9

Inputs (take)

Name	Description
`input`	-
`fasta`	-
`fai`	-
`dict`	-
`germline_resource`	-
`germline_resource_tbi`	-
`panel_of_normals`	-
`panel_of_normals_tbi`	-
`intervals`	-

Outputs (emit)

Name	Description
`?`	-
`?`	-
`?`	-

workflow BAM_VARIANT_CALLING_TUMOR_ONLY_ALL [source] ¶

Defined in subworkflows/local/bam_variant_calling_tumor_only_all/main.nf:17

Inputs (take)

Name	Description
`tools`	-
`bam`	-
`cram`	-
`bwa`	-
`cf_chrom_len`	-
`chr_files`	-
`cnvkit_reference`	-
`dbsnp`	-
`dbsnp_tbi`	-
`dict`	-
`fasta`	-
`fasta_fai`	-
`germline_resource`	-
`germline_resource_tbi`	-
`intervals`	-
`intervals_bed_gz_tbi`	-
`intervals_bed_combined`	-
`intervals_bed_gz_tbi_combined`	-
`mappability`	-
`msisensor2_models`	-
`panel_of_normals`	-
`panel_of_normals_tbi`	-
`joint_mutect2`	-
`wes`	-

Outputs (emit)

Name	Description
`?`	-
`?`	-
`?`	-
`?`	-
`?`	-
`?`	-
`?`	-
`?`	-
`?`	-
`?`	-
`?`	-
`?`	-
`?`	-
`?`	-
`?`	-
`?`	-
`?`	-
`?`	-

workflow BAM_VARIANT_CALLING_TUMOR_ONLY_CONTROLFREEC [source] ¶

Defined in subworkflows/local/bam_variant_calling_tumor_only_controlfreec/main.nf:13

Inputs (take)

Name	Description
`controlfreec_input`	-
`fasta`	-
`fasta_fai`	-
`dbsnp`	-
`dbsnp_tbi`	-
`chr_files`	-
`mappability`	-
`intervals_bed`	-

Outputs (emit)

Name	Description
`versions`	-

workflow BAM_VARIANT_CALLING_TUMOR_ONLY_LOFREQ [source] ¶

Defined in subworkflows/local/bam_variant_calling_tumor_only_lofreq/main.nf:4

Inputs (take)

Name	Description
`input`	-
`fasta`	-
`fai`	-
`intervals`	-
`dict`	-

Outputs (emit)

Name	Description
`?`	-
`?`	-
`?`	-

workflow BAM_VARIANT_CALLING_TUMOR_ONLY_MANTA [source] ¶

Defined in subworkflows/local/bam_variant_calling_tumor_only_manta/main.nf:10

Inputs (take)

Name	Description
`cram`	-
`fasta`	-
`fasta_fai`	-
`intervals`	-

Outputs (emit)

Name	Description
`?`	-
`?`	-
`?`	-

workflow BAM_VARIANT_CALLING_TUMOR_ONLY_MUTECT2 [source] ¶

Defined in subworkflows/local/bam_variant_calling_tumor_only_mutect2/main.nf:16

gatk4 mutect2 getpileupsummaries calculatecontamination filtermutectcalls variant_calling tumor_only filtered_vcf

Perform variant calling on a single tumor sample using Mutect2 in tumor-only mode. Run the input BAM file through GetPileupSummaries and then CalculateContamination to get the contamination and segmentation tables. Filter the Mutect2 output VCF using FilterMutectCalls, with the contamination and segmentation tables providing additional filtering.

Components

gatk4/mutect2 gatk4/getpileupsummaries gatk4/calculatecontamination gatk4/filtermutectcalls

Inputs (take)

Name	Description
`meta`	Groovy Map containing sample information e.g. [ id:'test' ]
`input`	list containing one BAM file, also able to take CRAM as an input
`input_index`	list containing one BAM file index, also able to take CRAM index as an input
`fasta`	The reference fasta file
`fai`	Index of reference fasta file
`dict`	GATK sequence dictionary
`germline_resource`	Population vcf of germline sequencing, containing allele fractions.
`germline_resource_tbi`	Index file for the germline resource.
`panel_of_normals`	vcf file to be used as a panel of normals.
`panel_of_normals_tbi`	Index for the panel of normals.
`interval_file`	File containing intervals.

Outputs (emit)

Name	Description
`versions`	File containing software versions
`mutect2_vcf`	Compressed vcf file to be used for variant_calling.
`mutect2_tbi`	Indexes of the mutect2_vcf file
`mutect2_stats`	Stats files for the mutect2 vcf
`pileup_table`	File containing the pileup summary table.
`contamination_table`	File containing the contamination table.
`segmentation_table`	Output table containing segmentation of tumor minor allele fractions.
`filtered_vcf`	File containing filtered Mutect2 calls.
`filtered_tbi`	Tabix index (tbi) file that pairs with the filtered VCF.
`filtered_stats`	File containing statistics of the FilterMutectCalls run.

Authors: @GCJMackenzie
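
The four steps described for this subworkflow can be sketched as standalone GATK commands. This is a dry-run sketch only: the function name `mutect2_tumor_only_plan`, the output file names, and the choice of the germline resource as the variant and interval source for GetPileupSummaries are assumptions for illustration, not the exact arguments the subworkflow passes. The function prints the commands rather than running them.

```shell
# Hypothetical helper: print (do not run) the GATK commands for a
# tumor-only Mutect2 call followed by contamination-aware filtering.
# Arguments: tumor BAM/CRAM, reference FASTA, germline resource VCF,
# panel-of-normals VCF, output prefix.
mutect2_tumor_only_plan() {
    tumor="$1"; ref="$2"; germline="$3"; pon="$4"; prefix="$5"
    # Paths are printed unquoted; quote them before executing if they
    # may contain spaces.
    cat <<EOF
gatk Mutect2 -R ${ref} -I ${tumor} --germline-resource ${germline} --panel-of-normals ${pon} -O ${prefix}.mutect2.vcf.gz
gatk GetPileupSummaries -I ${tumor} -V ${germline} -L ${germline} -O ${prefix}.pileups.table
gatk CalculateContamination -I ${prefix}.pileups.table --tumor-segmentation ${prefix}.segmentation.table -O ${prefix}.contamination.table
gatk FilterMutectCalls -R ${ref} -V ${prefix}.mutect2.vcf.gz --contamination-table ${prefix}.contamination.table --tumor-segmentation ${prefix}.segmentation.table -O ${prefix}.filtered.vcf.gz
EOF
}

mutect2_tumor_only_plan tumor.cram ref.fa germline.vcf.gz pon.vcf.gz sample1
```

Each printed command corresponds to one of the listed components, and the intermediate file names line up with the `pileup_table`, `contamination_table`, `segmentation_table` and `filtered_vcf` emits.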

workflow BAM_VARIANT_CALLING_TUMOR_ONLY_TNSCOPE [source] ¶

Defined in subworkflows/local/bam_variant_calling_tumor_only_tnscope/main.nf:9

gatk4 mutect2 getpileupsummaries calculatecontamination filtermutectcalls variant_calling tumor_only filtered_vcf

Components

gatk4/mutect2 gatk4/getpileupsummaries gatk4/calculatecontamination gatk4/filtermutectcalls

Inputs (take)

Name	Description
`meta`	Groovy Map containing sample information e.g. [ id:'test' ]
`input`	list containing one BAM file, also able to take CRAM as an input
`input_index`	list containing one BAM file index, also able to take CRAM index as an input
`fasta`	The reference fasta file
`fai`	Index of reference fasta file
`dict`	GATK sequence dictionary
`germline_resource`	Population vcf of germline sequencing, containing allele fractions.
`germline_resource_tbi`	Index file for the germline resource.
`panel_of_normals`	vcf file to be used as a panel of normals.
`panel_of_normals_tbi`	Index for the panel of normals.
`interval_file`	File containing intervals.

Outputs (emit)

Name	Description
`versions`	File containing software versions
`mutect2_vcf`	Compressed vcf file to be used for variant_calling.
`mutect2_tbi`	Indexes of the mutect2_vcf file
`mutect2_stats`	Stats files for the mutect2 vcf
`pileup_table`	File containing the pileup summary table.
`contamination_table`	File containing the contamination table.
`segmentation_table`	Output table containing segmentation of tumor minor allele fractions.
`filtered_vcf`	file containing filtered mutect2 calls.
`filtered_tbi`	tbi file that pairs with filtered vcf.
`filtered_stats`	file containing statistics of the filtermutectcalls run.

Authors: @GCJMackenzie

workflow CHANNEL_ALIGN_CREATE_CSV [source] ¶

Defined in subworkflows/local/channel_align_create_csv/main.nf:5

Inputs (take)

Name	Description
`bam_indexed`	-
`outdir`	-
`save_output_as_bam`	-

Outputs (emit)

Name	Description
	-

workflow CHANNEL_APPLYBQSR_CREATE_CSV [source] ¶

Defined in subworkflows/local/channel_applybqsr_create_csv/main.nf:5

Inputs (take)

Name	Description
`cram_recalibrated_index`	-
`outdir`	-
`save_output_as_bam`	-

Outputs (emit)

Name	Description
	-

workflow CHANNEL_BASERECALIBRATOR_CREATE_CSV [source] ¶

Defined in subworkflows/local/channel_baserecalibrator_create_csv/main.nf:5

Inputs (take)

Name	Description
`cram_table_bqsr`	-
`tools`	-
`skip_tools`	-
`outdir`	-
`save_output_as_bam`	-

Outputs (emit)

Name	Description
	-

workflow CHANNEL_MARKDUPLICATES_CREATE_CSV [source] ¶

Defined in subworkflows/local/channel_markduplicates_create_csv/main.nf:5

Inputs (take)

Name	Description
`cram_markduplicates`	-
`csv_subfolder`	-
`outdir`	-
`save_output_as_bam`	-

Outputs (emit)

Name	Description
	-

workflow CHANNEL_VARIANT_CALLING_CREATE_CSV [source] ¶

Defined in subworkflows/local/channel_variant_calling_create_csv/main.nf:5

Inputs (take)

Name	Description
`vcf_to_annotate`	-
`outdir`	-

Outputs (emit)

Name	Description
	-

workflow CONCATENATE_GERMLINE_VCFS [source] ¶

Defined in subworkflows/local/vcf_concatenate_germline/main.nf:12

Inputs (take)

Name	Description
`vcfs`	-

Outputs (emit)

Name	Description
`vcfs`	-
`tbis`	-
`?`	-

workflow CONSENSUS [source] ¶

Defined in subworkflows/local/vcf_consensus/main.nf:8

Inputs (take)

Name	Description
`vcfs`	-

Outputs (emit)

Name	Description
`versions`	-
`vcfs`	-
`tbis`	-

workflow CRAM_MERGE_INDEX_SAMTOOLS [source] ¶

Defined in subworkflows/local/cram_merge_index_samtools/main.nf:10

Inputs (take)

Name	Description
`cram`	-
`fasta`	-
`fasta_fai`	-

Outputs (emit)

Name	Description
`?`	-
`?`	-

workflow CRAM_QC_MOSDEPTH_SAMTOOLS [source] ¶

Defined in subworkflows/local/cram_qc_mosdepth_samtools/main.nf:10

Inputs (take)

Name	Description
`cram`	-
`fasta`	-
`intervals`	-

Outputs (emit)

Name	Description
`?`	-
`?`	-

workflow CRAM_SAMPLEQC [source] ¶

Defined in subworkflows/local/cram_sampleqc/main.nf:4

Inputs (take)

Name	Description
`cram`	-
`ngscheckmate_bed`	-
`fasta`	-
`skip_baserecalibration`	-
`intervals_for_preprocessing`	-

Outputs (emit)

Name	Description
`corr_matrix`	-
`matched`	-
`all`	-
`vcf`	-
`pdf`	-
`?`	-
`?`	-

workflow DOWNLOAD_CACHE_SNPEFF_VEP [source] ¶

Defined in subworkflows/local/download_cache_snpeff_vep/main.nf:14

Inputs (take)

Name	Description
`ensemblvep_info`	-
`snpeff_info`	-

Outputs (emit)

Name	Description
`ensemblvep_cache`	-
`snpeff_cache`	-
`?`	-

workflow FASTQ_ALIGN [source] ¶

Defined in subworkflows/local/fastq_align/main.nf:12

Inputs (take)

Name	Description
`reads`	-
`index`	-
`sort`	-
`fasta`	-
`fasta_fai`	-

Outputs (emit)

Name	Description
`?`	-
`?`	-
`?`	-
`?`	-

workflow FASTQ_CREATE_UMI_CONSENSUS_FGBIO [source] ¶

Defined in subworkflows/local/fastq_create_umi_consensus_fgbio/main.nf:16

Inputs (take)

Name	Description
`reads`	-
`fasta`	-
`fai`	-
`map_index`	-
`groupreadsbyumi_strategy`	-

Outputs (emit)

Name	Description
`umibam`	-
`groupbam`	-
`consensusbam`	-
`versions`	-

workflow FASTQ_PREPROCESS_GATK [source] ¶

Defined in subworkflows/local/fastq_preprocess_gatk/main.nf:52

Inputs (take)

Name	Description
`input_fastq`	-
`input_sample`	-
`dict`	-
`fasta`	-
`fasta_fai`	-
`index_alignment`	-
`intervals_and_num_intervals`	-
`intervals_for_preprocessing`	-
`known_sites_indels`	-
`known_sites_indels_tbi`	-
`bbsplit_index`	-

Outputs (emit)

Name	Description
`?`	-
`?`	-
`?`	-

workflow FASTQ_PREPROCESS_PARABRICKS [source] ¶

Defined in subworkflows/local/fastq_preprocess_parabricks/main.nf:4

Inputs (take)

Name	Description
`ch_reads`	-
`ch_fasta`	-
`ch_index`	-
`ch_interval_file`	-
`ch_known_sites`	-
`val_output_fmt`	-

Outputs (emit)

Name	Description
`cram`	-
`versions`	-
`reports`	-

workflow NFCORE_SAREK [source] ¶

Defined in main.nf:86

Inputs (take)

Name	Description
`samplesheet`	-

Outputs (emit)

Name	Description
`multiqc_report`	-

workflow NORMALIZE_VCFS [source] ¶

Defined in subworkflows/local/vcf_normalization/main.nf:10

Inputs (take)

Name	Description
`vcfs`	-
`fasta`	-

Outputs (emit)

Name	Description
`vcfs`	-
`tbis`	-
`?`	-

workflow PIPELINE_COMPLETION [source] ¶

Defined in subworkflows/local/utils_nfcore_sarek_pipeline/main.nf:203

Inputs (take)

Name	Description
`email`	-
`email_on_fail`	-
`plaintext_email`	-
`outdir`	-
`monochrome_logs`	-
`hook_url`	-
`multiqc_report`	-

Outputs (emit)

Name	Description
	-

workflow PIPELINE_INITIALISATION [source] ¶

Defined in subworkflows/local/utils_nfcore_sarek_pipeline/main.nf:26

Inputs (take)

Name	Description
`version`	-
`validate_params`	-
`nextflow_cli_args`	-
`outdir`	-
`input`	-
`help`	-
`help_full`	-
`show_hidden`	-

Outputs (emit)

Name	Description
`samplesheet`	-
`?`	-

workflow POST_VARIANTCALLING [source] ¶

Defined in subworkflows/local/post_variantcalling/main.nf:12

Inputs (take)

Name	Description
`tools`	-
`cram_germline`	-
`germline_vcfs`	-
`germline_tbis`	-
`cram_tumor_only`	-
`tumor_only_vcfs`	-
`tumor_only_tbis`	-
`cram_somatic`	-
`somatic_vcfs`	-
`somatic_tbis`	-
`fasta`	-
`fai`	-
`concatenate_vcfs`	-
`filter_vcfs`	-
`snv_consensus_calling`	-
`normalize_vcfs`	-
`varlociraptor_chunk_size`	-
`varlociraptor_scenario_germline`	-
`varlociraptor_scenario_somatic`	-
`varlociraptor_scenario_tumor_only`	-

Outputs (emit)

Name	Description
`?`	-
`?`	-
`?`	-

workflow PREPARE_GENOME [source] ¶

Defined in subworkflows/local/prepare_genome/main.nf:22

Inputs (take)

Name	Description
`ascat_alleles_in`	-
`ascat_loci_in`	-
`ascat_loci_gc_in`	-
`ascat_loci_rt_in`	-
`bbsplit_fasta_list_in`	-
`bbsplit_index_in`	-
`bcftools_annotations_in`	-
`bcftools_annotations_tbi_in`	-
`bwa_in`	-
`bwamem2_in`	-
`chr_dir_in`	-
`dbsnp_in`	-
`dbsnp_tbi_in`	-
`dict_in`	-
`dragmap_in`	-
`fasta_in`	-
`fasta_fai_in`	-
`germline_resource_in`	-
`germline_resource_tbi_in`	-
`known_indels_in`	-
`known_indels_tbi_in`	-
`known_snps_in`	-
`known_snps_tbi_in`	-
`msisensor2_models_in`	-
`msisensorpro_scan_in`	-
`pon_in`	-
`pon_tbi_in`	-
`aligner`	-
`step`	-
`tools`	-
`vep_include_fasta`	-

Outputs (emit)

Name	Description
`?`	-
`?`	-
`?`	-
`?`	-
`?`	-
`?`	-
`?`	-
`?`	-
`?`	-
`?`	-
`?`	-
`?`	-
`?`	-
`?`	-
`?`	-
`?`	-
`?`	-
`?`	-
`?`	-
`?`	-
`?`	-
`?`	-
`?`	-
`?`	-
`?`	-
`?`	-
`?`	-
`?`	-
`?`	-
`?`	-

workflow PREPARE_REFERENCE_CNVKIT [source] ¶

Defined in subworkflows/local/prepare_reference_cnvkit/main.nf:4

Inputs (take)

Name	Description
`fasta`	-
`intervals_bed_combined`	-

Outputs (emit)

Name	Description
`cnvkit_reference`	-
`?`	-

workflow SAMPLESHEET_TO_CHANNEL [source] ¶

Defined in subworkflows/local/samplesheet_to_channel/main.nf:3

Inputs (take)

Name	Description
`ch_from_samplesheet`	-
`aligner`	-
`ascat_alleles`	-
`ascat_loci`	-
`ascat_loci_gc`	-
`ascat_loci_rt`	-
`bcftools_annotations`	-
`bcftools_annotations_tbi`	-
`bcftools_columns`	-
`bcftools_header_lines`	-
`build_only_index`	-
`dbsnp`	-
`fasta`	-
`germline_resource`	-
`intervals`	-
`joint_germline`	-
`joint_mutect2`	-
`known_indels`	-
`known_snps`	-
`no_intervals`	-
`pon`	-
`sentieon_dnascope_emit_mode`	-
`sentieon_haplotyper_emit_mode`	-
`seq_center`	-
`seq_platform`	-
`skip_tools`	-
`snpeff_cache`	-
`snpeff_db`	-
`step`	-
`tools`	-
`umi_length`	-
`umi_location`	-
`umi_in_read_header`	-
`umi_read_structure`	-
`wes`	-

Outputs (emit)

Name	Description
`?`	-

workflow SAREK [source] ¶

Defined in workflows/sarek/main.nf:63

Inputs (take)

Name	Description
`input_sample`	-
`aligner`	-
`skip_tools`	-
`step`	-
`tools`	-
`ascat_alleles`	-
`ascat_loci`	-
`ascat_loci_gc`	-
`ascat_loci_rt`	-
`bbsplit_index`	-
`bcftools_annotations`	-
`bcftools_annotations_tbi`	-
`bcftools_columns`	-
`bcftools_header_lines`	-
`cf_chrom_len`	-
`chr_files`	-
`cnvkit_reference`	-
`dbsnp`	-
`dbsnp_tbi`	-
`dbsnp_vqsr`	-
`dict`	-
`fasta`	-
`fasta_fai`	-
`germline_resource`	-
`germline_resource_tbi`	-
`index_alignment`	-
`intervals_and_num_intervals`	-
`intervals_bed_combined`	-
`intervals_bed_combined_for_variant_calling`	-
`intervals_bed_gz_tbi_and_num_intervals`	-
`intervals_bed_gz_tbi_combined`	-
`intervals_for_preprocessing`	-
`known_indels_vqsr`	-
`known_sites_indels`	-
`known_sites_indels_tbi`	-
`known_sites_snps`	-
`known_sites_snps_tbi`	-
`known_snps_vqsr`	-
`mappability`	-
`msisensor2_models`	-
`msisensorpro_scan`	-
`ngscheckmate_bed`	-
`pon`	-
`pon_tbi`	-
`sentieon_dnascope_model`	-
`varlociraptor_scenario_germline`	-
`varlociraptor_scenario_somatic`	-
`varlociraptor_scenario_tumor_only`	-
`snpeff_cache`	-
`snpeff_db`	-
`vep_cache`	-
`vep_cache_version`	-
`vep_extra_files`	-
`vep_fasta`	-
`vep_genome`	-
`vep_species`	-
`versions`	-

Outputs (emit)

Name	Description
`?`	-
`?`	-

workflow UTILS_NEXTFLOW_PIPELINE [source] ¶

Defined in subworkflows/nf-core/utils_nextflow_pipeline/main.nf:11

utility pipeline initialise version

Subworkflow with functionality that may be useful for any Nextflow pipeline

Inputs (take)

Name	Description
`print_version`	Print the version of the pipeline and exit
`dump_parameters`	Dump the parameters of the pipeline to a JSON file
`output_directory`	Path to output dir to write JSON file to.
`check_conda_channel`	Check if the conda channel priority is correct.

Outputs (emit)

Name	Description
`dummy_emit`	Dummy emit to make nf-core subworkflows lint happy

Authors: @adamrtalbot, @drpatelh Maintainers: @adamrtalbot, @drpatelh, @maxulysse

workflow UTILS_NFCORE_PIPELINE [source] ¶

Defined in subworkflows/nf-core/utils_nfcore_pipeline/main.nf:11

utility pipeline initialise version

Subworkflow with utility functions specific to the nf-core pipeline template

Inputs (take)

Name	Description
`nextflow_cli_args`	Nextflow CLI positional arguments

Outputs (emit)

Name	Description
`success`	Dummy output to indicate success

Authors: @adamrtalbot Maintainers: @adamrtalbot, @maxulysse

workflow UTILS_NFSCHEMA_PLUGIN [source] ¶

Defined in subworkflows/nf-core/utils_nfschema_plugin/main.nf:9

validation JSON schema plugin parameters summary

Run nf-schema to validate parameters and create a summary of changed parameters

Inputs (take)

Name	Description
`input_workflow`	The workflow object of the used pipeline. This object contains meta data used to create the params summary log
`validate_params`	Validate the parameters and error if invalid.
`parameters_schema`	Path to the parameters JSON schema. This has to be the same as the schema given to the `validation.parametersSchema` config option. When this input is empty it will automatically use the configured schema or "${projectDir}/nextflow_schema.json" as default. The schema should not be given in this way for meta pipelines.

Outputs (emit)

Name	Description
`dummy_emit`	Dummy emit to make nf-core subworkflows lint happy

Authors: @nvnieuwk Maintainers: @nvnieuwk

workflow VCF_ANNOTATE_ALL [source] ¶

Defined in subworkflows/local/vcf_annotate_all/main.nf:10

Inputs (take)

Name	Description
`vcf`	-
`fasta`	-
`tools`	-
`snpeff_db`	-
`snpeff_cache`	-
`vep_genome`	-
`vep_species`	-
`vep_cache_version`	-
`vep_cache`	-
`vep_extra_files`	-
`bcftools_annotations`	-
`bcftools_annotations_index`	-
`bcftools_columns`	-
`bcftools_header_lines`	-

Outputs (emit)

Name	Description
`?`	-
`?`	-
`?`	-
`?`	-
`?`	-

workflow VCF_ANNOTATE_ENSEMBLVEP [source] ¶

Defined in subworkflows/nf-core/vcf_annotate_ensemblvep/main.nf:8

vcf annotation ensemblvep

Perform annotation with ensemblvep and bgzip + tabix index the resulting VCF file

Components

ensemblvep/vep tabix/tabix

Inputs (take)

Name	Description
`ch_vcf`	vcf file to annotate Structure: [ val(meta), path(vcf), [path(custom_file1), path(custom_file2)... (optional)] ]
`ch_fasta`	Reference genome fasta file (optional) Structure: [ val(meta2), path(fasta) ]
`val_genome`	genome to use
`val_species`	species to use
`val_cache_version`	cache version to use
`ch_cache`	the root cache folder for ensemblvep (optional) Structure: [ val(meta3), path(cache) ]
`ch_extra_files`	any extra files needed by plugins for ensemblvep (optional) Structure: [ path(file1), path(file2)... ]

Outputs (emit)

Name	Description
`vcf_tbi`	Compressed vcf file + tabix index Structure: [ val(meta), path(vcf), path(tbi) ]
`json`	json file Structure: [ val(meta), path(json) ]
`tab`	tab file Structure: [ val(meta), path(tab) ]
`reports`	html reports
`versions`	File containing software versions

Authors: @maxulysse, @matthdsm, @nvnieuwk Maintainers: @maxulysse, @matthdsm, @nvnieuwk

workflow VCF_ANNOTATE_SNPEFF [source] ¶

Defined in subworkflows/nf-core/vcf_annotate_snpeff/main.nf:8

vcf annotation snpeff

Perform annotation with snpEff and bgzip + tabix index the resulting VCF file

Components

snpeff snpeff/snpeff tabix/bgziptabix

Inputs (take)

Name	Description
`ch_vcf`	vcf file Structure: [ val(meta), path(vcf) ]
`val_snpeff_db`	db version to use
`ch_snpeff_cache`	path to root cache folder for snpEff (optional) Structure: [ path(cache) ]

Outputs (emit)

Name	Description
`vcf_tbi`	Compressed vcf file + tabix index Structure: [ val(meta), path(vcf), path(tbi) ]
`reports`	html reports Structure: [ path(html) ]
`summary`	summary file in CSV format Structure: [ path(csv) ]
`genes_txt`	genes file in text format Structure: [ path(txt) ]
`versions`	Files containing software versions Structure: [ path(versions.yml) ]

Authors: @maxulysse Maintainers: @maxulysse

workflow VCF_QC_BCFTOOLS_VCFTOOLS [source] ¶

Defined in subworkflows/local/vcf_qc_bcftools_vcftools/main.nf:6

Inputs (take)

Name	Description
`vcf`	-
`target_bed`	-

Outputs (emit)

Name	Description
`bcftools_stats`	-
`vcftools_tstv_counts`	-
`vcftools_tstv_qual`	-
`vcftools_filter_summary`	-
`?`	-

workflow VCF_VARIANT_FILTERING_GATK [source] ¶

Defined in subworkflows/local/vcf_variant_filtering_gatk/main.nf:4

Inputs (take)

Name	Description
`vcf`	-
`fasta`	-
`fasta_fai`	-
`dict`	-
`intervals_bed_combined`	-
`known_sites`	-
`known_sites_tbi`	-

Outputs (emit)

Name	Description
`?`	-
`?`	-
`?`	-

workflow VCF_VARLOCIRAPTOR_SINGLE [source] ¶

Defined in subworkflows/local/vcf_varlociraptor_single/main.nf:9

Inputs (take)

Name	Description
`ch_cram`	-
`ch_fasta`	-
`ch_fasta_fai`	-
`ch_scenario`	-
`ch_vcf`	-
`val_num_chunks`	-
`val_sampletype`	-

Outputs (emit)

Name	Description
`vcf`	-
`tbi`	-
`versions`	-

workflow VCF_VARLOCIRAPTOR_SOMATIC [source] ¶

Defined in subworkflows/local/vcf_varlociraptor_somatic/main.nf:15

Inputs (take)

Name	Description
`ch_cram`	-
`ch_fasta`	-
`ch_fasta_fai`	-
`ch_scenario`	-
`ch_somatic_vcf`	-
`ch_germline_vcf`	-
`val_num_chunks`	-

Outputs (emit)

Name	Description
`vcf`	-
`tbi`	-
`versions`	-

Processes

This page documents all processes in the pipeline.

process ADD_INFO_TO_VCF [source] ¶

Defined in modules/local/add_info_to_vcf/main.nf:1

Inputs

Name	Type	Description
`val(meta), path(vcf_gz)`	`tuple`	-

Outputs

Name	Type	Emit	Description
`val(meta), path("*.added_info.vcf")`	`tuple`	`vcf`	-
`versions.yml`	`path`	`versions`	-

process ASCAT [source] ¶

Defined in modules/nf-core/ascat/main.nf:1

bam copy number cram

Derive copy number profiles of tumour cells.

Tools

ascat

ASCAT is a method to derive copy number profiles of tumour cells, accounting for normal cell admixture and tumour aneuploidy. ASCAT infers tumour purity (the fraction of tumour cells) and ploidy (the amount of DNA per tumour cell), expressed as multiples of haploid genomes from SNP array or massively parallel sequencing data, and calculates whole-genome allele-specific copy number profiles (the number of copies of both parental alleles for all SNP loci across the genome).

Documentation biotools:ascat License: GPL v3

Inputs

Name	Type	Description
`meta`	`map`	Groovy Map containing sample information e.g. [ id:'test', single_end:false ]
`input_normal`	`file`	BAM/CRAM file, must adhere to chr1, chr2, ...chrX notation. For modifying chromosome notation in BAM files, see https://josephcckuo.wordpress.com/2016/11/17/modify-chromosome-notation-in-bam-file/.
`index_normal`	`file`	index for normal_bam/cram
`input_tumor`	`file`	BAM/CRAM file, must adhere to chr1, chr2, ...chrX notation
`index_tumor`	`file`	index for tumor_bam/cram
`allele_files`	`file`	allele files for ASCAT WGS. Can be downloaded here https://github.com/VanLoo-lab/ascat/tree/master/ReferenceFiles/WGS
`loci_files`	`file`	Loci files for ASCAT WGS. Loci files without chromosome notation can be downloaded from https://github.com/VanLoo-lab/ascat/tree/master/ReferenceFiles/WGS. Make sure the chromosome notation matches the BAM/CRAM input files. To add the chromosome notation to the loci files (hg19/hg38) if necessary, you can run this command: `if [[ $(samtools view <your_bam_file.bam> | head -n1 | cut -f3) == chr* ]]; then for i in {1..22} X; do sed -i 's/^/chr/' G1000_loci_hg19_chr_${i}.txt; done; fi`
`bed_file`	`file`	Bed file for ASCAT WES (optional, but recommended for WES)
`fasta`	`file`	Reference fasta file (optional)
`gc_file`	`file`	GC correction file (optional) - Used to do logR correction of the tumour sample(s) with genomic GC content
`rt_file`	`file`	replication timing correction file (optional, provide only in combination with gc_file)

Outputs

Name	Type	Emit	Description
`val(meta), path("alleleFrequencies_chr.txt")`	`tuple`	`allelefreqs`	-
`val(meta), path("*BAF.txt")`	`tuple`	`bafs`	-
`val(meta), path("*cnvs.txt")`	`tuple`	`cnvs`	-
`val(meta), path("*LogR.txt")`	`tuple`	`logrs`	-
`val(meta), path("*metrics.txt")`	`tuple`	`metrics`	-
`val(meta), path("*png")`	`tuple`	`png`	-
`val(meta), path("*purityploidy.txt")`	`tuple`	`purityploidy`	-
`val(meta), path("*segments.txt")`	`tuple`	`segments`	-
`versions.yml`	`path`	`versions`	-

Authors: @aasNGC, @lassefolkersen, @FriederikeHanssen, @maxulysse, @SusiJo Maintainers: @aasNGC, @lassefolkersen, @FriederikeHanssen, @maxulysse, @SusiJo
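
The requirement that loci files use the same chromosome notation as the BAM/CRAM input can be checked and fixed with a small sketch like the one below. The function names `uses_chr_notation` and `add_chr_prefix` are hypothetical, and the sketch assumes the chromosome name is the first column of the loci file. Both functions read from standard input, so they can be tried without samtools installed.

```shell
# Hypothetical helper: report whether the first SAM record on stdin uses
# chr-prefixed reference names (column 3 of a SAM record is RNAME).
uses_chr_notation() {
    rname=$(head -n1 | cut -f3)
    case "$rname" in
        chr*) echo yes ;;
        *)    echo no ;;
    esac
}

# Hypothetical helper: make sure every line on stdin starts with "chr".
# Stripping an existing prefix first keeps the operation idempotent.
add_chr_prefix() {
    sed -e 's/^chr//' -e 's/^/chr/'
}

# Example with inline data instead of real files.
printf 'read1\t0\tchr1\t100\n' | uses_chr_notation
printf '1\t12345\nX\t67890\n' | add_chr_prefix
```

With real data the check would typically be fed from `samtools view <your_bam_file.bam>`, and the prefix step applied to each loci file in turn.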

process BBMAP_BBSPLIT [source] ¶

Defined in modules/nf-core/bbmap/bbsplit/main.nf:1

align map fastq genome reference

Split sequencing reads by mapping them to multiple references simultaneously

Tools

bbmap

BBMap is a short read aligner, as well as various other bioinformatic tools.

Homepage Documentation biotools:bbmap License: UC-LBL license (see package)

Inputs

Name	Type	Description
`meta`	`map`	Groovy Map containing sample information e.g. [ id:'test', single_end:false ]
`reads`	`file`	List of input FastQ files of size 1 and 2 for single-end and paired-end data, respectively.
`other_ref_names`	`list`	List of other reference ids apart from the primary
`other_ref_paths`	`list`	Paths to the other references, in the same order as "other_ref_names"

Outputs

Name	Type	Pattern	Description
`index`	`-`	-	-
`primary_fastq`	`file`	`*primary*fastq.gz`	Output reads that map to the primary reference
`all_fastq`	`file`	`*fastq.gz`	All reads mapping to any of the references
`stats`	`file`	`*.txt`	Tab-delimited text file containing mapping statistics
`log`	`file`	`*.log`	Log file

Authors: @joseespinosa, @drpatelh, @pinin4fjords Maintainers: @joseespinosa, @drpatelh, @pinin4fjords
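
Because `other_ref_names` and `other_ref_paths` are parallel lists, each name has to be paired with the path at the same position. The sketch below shows one way to do that pairing and to emit BBSplit-style `ref_<name>=<path>` arguments. The function name `build_bbsplit_refs` and the comma-separated input format are assumptions for illustration, not how the module itself receives its inputs.

```shell
# Hypothetical helper: pair two comma-separated lists position by position
# and print one "ref_<name>=<path>" argument per pair.
build_bbsplit_refs() {
    names="$1"
    paths="$2"
    while [ -n "$names" ] && [ -n "$paths" ]; do
        name=${names%%,*}    # first remaining name
        path=${paths%%,*}    # first remaining path
        printf 'ref_%s=%s\n' "$name" "$path"
        # Drop the consumed item; the list becomes empty after the last one.
        case "$names" in *,*) names=${names#*,} ;; *) names= ;; esac
        case "$paths" in *,*) paths=${paths#*,} ;; *) paths= ;; esac
    done
    # Anything left over means the two lists had different lengths.
    if [ -n "$names" ] || [ -n "$paths" ]; then
        echo "error: name and path lists differ in length" >&2
        return 1
    fi
}

build_bbsplit_refs human,mouse /refs/human.fa,/refs/mouse.fa
```

The length check matters because a silent mismatch would attach a reference name to the wrong genome.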

process BCFTOOLS_ANNOTATE [source] ¶

Defined in modules/nf-core/bcftools/annotate/main.nf:1

bcftools annotate vcf remove add

Add or remove annotations.

Tools

annotate

Add or remove annotations.

Homepage Documentation biotools:bcftools License: MIT

Inputs

Name	Type	Description
`meta`	`map`	Groovy Map containing sample information e.g. [ id:'test', single_end:false ]
`input`	`file`	Query VCF or BCF file, can be either uncompressed or compressed
`index`	`file`	Index of the query VCF or BCF file
`annotations`	`file`	Bgzip-compressed file with annotations
`annotations_index`	`file`	Index of the annotations file

Outputs

Name	Type	Pattern	Description
`vcf`	`file`	`*{vcf,vcf.gz,bcf,bcf.gz}`	Compressed annotated VCF file
`tbi`	`file`	`*.tbi`	Alternative VCF file index
`csi`	`file`	`*.csi`	Default VCF file index

Authors: @projectoriented, @ramprasadn Maintainers: @projectoriented, @ramprasadn

process BCFTOOLS_CONCAT [source] ¶

Defined in modules/nf-core/bcftools/concat/main.nf:1

variant calling concat bcftools VCF

Concatenate VCF files

Tools

concat

Concatenate VCF files.

Homepage Documentation biotools:bcftools License: MIT

Inputs

Name	Type	Description
`meta`	`map`	Groovy Map containing sample information e.g. [ id:'test', single_end:false ]
`vcfs`	`list`	List containing 2 or more vcf files e.g. [ 'file1.vcf', 'file2.vcf' ]
`tbi`	`list`	List containing 2 or more index files (optional) e.g. [ 'file1.tbi', 'file2.tbi' ]

Outputs

Name	Type	Emit	Description
`val(meta)`	`tuple`	-	-

Authors: @abhi18av, @nvnieuwk Maintainers: @abhi18av, @nvnieuwk

process BCFTOOLS_ISEC [source] ¶

Defined in modules/nf-core/bcftools/isec/main.nf:1

variant calling intersect union complement VCF BCF

Apply set operations to VCF files

Tools

isec

Computes intersections, unions and complements of VCF files.

Homepage Documentation biotools:bcftools License: MIT

Inputs

Name	Type	Description
`meta`	`map`	Groovy Map containing sample information e.g. [ id:'test', single_end:false ]
`vcfs`	`list`	List containing 2 or more vcf/bcf files. These must be compressed and have an associated index. e.g. [ 'file1.vcf.gz', 'file2.vcf.gz' ]
`tbis`	`list`	List containing the tbi index files corresponding to the vcf/bcf input files e.g. [ 'file1.vcf.tbi', 'file2.vcf.tbi' ]

Outputs

Name	Type	Pattern	Description
`results`	`directory`	`${prefix}`	Folder containing the results of the set operations performed on the VCF files

Authors: @joseespinosa, @drpatelh Maintainers: @joseespinosa, @drpatelh

process BCFTOOLS_MERGE [source] ¶

Defined in modules/nf-core/bcftools/merge/main.nf:1

variant calling merge VCF

Merge VCF files

Tools

merge

Merge VCF files.

Homepage Documentation biotools:bcftools License: MIT

Inputs

Name	Type	Description
`meta`	`map`	Groovy Map containing sample information e.g. [ id:'test', single_end:false ]
`vcfs`	`file`	List containing 2 or more vcf files e.g. [ 'file1.vcf', 'file2.vcf' ]
`tbis`	`file`	List containing the tbi index files corresponding to the vcfs input files e.g. [ 'file1.vcf.tbi', 'file2.vcf.tbi' ]
`meta2`	`map`	Groovy Map containing reference information e.g. [ id:'genome' ]
`fasta`	`file`	(Optional) The fasta reference file (only necessary for the `--gvcf FILE` parameter)
`meta3`	`map`	Groovy Map containing reference information e.g. [ id:'genome' ]
`fai`	`file`	(Optional) The fasta reference file index (only necessary for the `--gvcf FILE` parameter)
`meta4`	`map`	Groovy Map containing bed information e.g. [ id:'genome' ]
`bed`	`file`	(Optional) The bed regions to merge on

Outputs

Name	Type	Pattern	Description
`vcf`	`file`	`*.{vcf,vcf.gz,bcf,bcf.gz}`	merged output file
`index`	`file`	`*.{csi,tbi}`	index of merged output

Authors: @joseespinosa, @drpatelh, @nvnieuwk, @ramprasadn Maintainers: @joseespinosa, @drpatelh, @nvnieuwk, @ramprasadn

process BCFTOOLS_MPILEUP [source] ¶

Defined in modules/nf-core/bcftools/mpileup/main.nf:1

variant calling mpileup VCF

Generates genotype likelihoods from alignment files and writes them as compressed VCF files

Tools

mpileup

Generates genotype likelihoods at each genomic position with coverage.

Homepage Documentation biotools:bcftools License: MIT

Inputs

Name	Type	Description
`meta`	`map`	Groovy Map containing sample information e.g. [ id:'test', single_end:false ]
`bam`	`file`	Input BAM file
`intervals`	`file`	Input intervals file. A file (commonly '.bed') containing regions to subset
`meta2`	`map`	Groovy Map containing sample information e.g. [ id:'test', single_end:false ]
`fasta`	`file`	FASTA reference file
`save_mpileup`	`boolean`	Save mpileup file generated by bcftools mpileup

Outputs

Name	Type	Emit	Description
`val(meta), path("*vcf.gz")`	`tuple`	`vcf`	-
`val(meta), path("*vcf.gz.tbi")`	`tuple`	`tbi`	-
`val(meta), path("*stats.txt")`	`tuple`	`stats`	-
`val(meta), path("*.mpileup.gz")`	`tuple`	`mpileup`	-
`versions.yml`	`path`	`versions`	-

Authors: @joseespinosa, @drpatelh Maintainers: @joseespinosa, @drpatelh

process BCFTOOLS_NORM [source] ¶

Defined in modules/nf-core/bcftools/norm/main.nf:1

normalize norm variant calling VCF

Normalize VCF file

Tools

norm

Normalize VCF files.

Homepage Documentation biotools:bcftools License: MIT

Inputs

Name	Type	Description
`meta`	`map`	Groovy Map containing sample information e.g. [ id:'test', single_end:false ]
`vcf`	`file`	The vcf file to be normalized e.g. 'file1.vcf'
`tbi`	`file`	An optional index of the VCF file (for when the VCF is compressed)
`meta2`	`map`	Groovy Map containing reference information e.g. [ id:'genome' ]
`fasta`	`file`	FASTA reference file

Outputs

Name	Type	Emit	Description
`val(meta)`	`tuple`	-	-

Authors: @abhi18av, @ramprasadn Maintainers: @abhi18av, @ramprasadn

process BCFTOOLS_SORT [source] ¶

Defined in modules/nf-core/bcftools/sort/main.nf:1

sorting VCF variant calling

Sorts VCF files

Tools

sort

Sort VCF files by coordinates.

Homepage Documentation biotools:bcftools License: MIT

Inputs

Name	Type	Description
`meta`	`map`	Groovy Map containing sample information e.g. [ id:'test', single_end:false ]
`vcf`	`file`	The VCF/BCF file to be sorted

Outputs

Name	Type	Emit	Description
`val(meta)`	`tuple`	-	-

Authors: @Gwennid Maintainers: @Gwennid

process BCFTOOLS_STATS [source] ¶

Defined in modules/nf-core/bcftools/stats/main.nf:1

variant calling stats VCF

Generates stats from VCF files

Tools

stats

Parses VCF or BCF and produces text file stats which is suitable for machine processing and can be plotted using plot-vcfstats.

Homepage Documentation biotools:bcftools License: MIT

Inputs

Name	Type	Description
`meta`	`map`	Groovy Map containing sample information e.g. [ id:'test', single_end:false ]
`vcf`	`file`	VCF input file
`tbi`	`file`	The tab index for the VCF file to be inspected. Optional: only required when parameter regions is chosen.
`meta2`	`map`	Groovy Map containing sample information e.g. [ id:'test', single_end:false ]
`regions`	`file`	Optionally, restrict the operation to regions listed in this file. (VCF, BED or tab-delimited)
`meta3`	`map`	Groovy Map containing sample information e.g. [ id:'test', single_end:false ]
`targets`	`file`	Optionally, restrict the operation to regions listed in this file (doesn't rely upon tbi index files)
`meta4`	`map`	Groovy Map containing sample information e.g. [ id:'test', single_end:false ]
`samples`	`file`	Optional, file of sample names to be included or excluded. e.g. 'file.tsv'
`meta5`	`map`	Groovy Map containing sample information e.g. [ id:'test', single_end:false ]
`exons`	`file`	Tab-delimited file with exons for indel frameshifts (chr,beg,end; 1-based, inclusive, optionally bgzip compressed). e.g. 'exons.tsv.gz'
`meta6`	`map`	Groovy Map containing sample information e.g. [ id:'test', single_end:false ]
`fasta`	`file`	Faidx indexed reference sequence file to determine INDEL context. e.g. 'reference.fa'

Outputs

Name	Type	Emit	Description
`val(meta), path("*stats.txt")`	`tuple`	`stats`	-
`versions.yml`	`path`	`versions`	-

Authors: @joseespinosa, @drpatelh, @SusiJo, @TCLamnidis Maintainers: @joseespinosa, @drpatelh, @SusiJo, @TCLamnidis
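
The `exons` input expects 1-based, inclusive coordinates, whereas BED files are 0-based and half-open. A minimal sketch of the conversion follows, assuming a three-column BED on standard input. The function name `bed_to_exons` is hypothetical. Only the start coordinate changes, because a half-open 0-based end equals the inclusive 1-based end.

```shell
# Hypothetical helper: convert BED intervals (0-based, half-open) on stdin
# into chr, beg, end lines that are 1-based and inclusive.
bed_to_exons() {
    awk 'BEGIN { FS = OFS = "\t" }
         /^(#|track|browser)/ { next }   # skip BED header and comment lines
         { print $1, $2 + 1, $3 }'       # shift start by one, keep end
}

printf 'chr1\t0\t100\nchr1\t199\t300\n' | bed_to_exons
```

The result can then be bgzip-compressed if desired, since the input is described as optionally compressed.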

process BCFTOOLS_VIEW [source] ¶

Defined in modules/nf-core/bcftools/view/main.nf:1

variant calling view bcftools VCF

View, subset and filter VCF or BCF files by position and filtering expression. Convert between VCF and BCF

Tools

view

View, subset and filter VCF or BCF files by position and filtering expression. Convert between VCF and BCF

Homepage Documentation biotools:bcftools License: MIT

Inputs

Name	Type	Description
`meta`	`map`	Groovy Map containing sample information e.g. [ id:'test', single_end:false ]
`vcf`	`file`	The vcf file to be inspected. e.g. 'file.vcf'
`index`	`file`	The tab index for the VCF file to be inspected. e.g. 'file.tbi'

Outputs

Name	Type	Pattern	Description
`vcf`	`file`	`*.{vcf,vcf.gz,bcf,bcf.gz}`	VCF or BCF output file (viewed, subset or filtered)
`tbi`	`file`	`*.tbi`	Alternative VCF file index
`csi`	`file`	`*.csi`	Default VCF file index

Authors: @abhi18av Maintainers: @abhi18av

process BWA_INDEX [source] ¶

Defined in modules/nf-core/bwa/index/main.nf:1

index fasta genome reference

Create BWA index for reference genome

Tools

bwa

BWA is a software package for mapping DNA sequences against a large reference genome, such as the human genome.

Homepage Documentation biotools:bwa License: GPL-3.0-or-later

Inputs

Name	Type	Description
`meta`	`map`	Groovy Map containing reference information. e.g. [ id:'test', single_end:false ]
`fasta`	`file`	Input genome fasta file

Outputs

Name	Type	Pattern	Description
`index`	`file`	`*.{amb,ann,bwt,pac,sa}`	BWA genome index files

Authors: @drpatelh, @maxulysse Maintainers: @drpatelh, @maxulysse, @gallvp

process BWA_MEM [source] ¶

Defined in modules/nf-core/bwa/mem/main.nf:1

mem bwa alignment map fastq bam sam

Performs fastq alignment to a fasta reference using BWA

Tools

bwa

BWA is a software package for mapping DNA sequences against a large reference genome, such as the human genome.

Homepage Documentation biotools:bwa License: GPL-3.0-or-later

Inputs

Name	Type	Description
`meta`	`map`	Groovy Map containing sample information e.g. [ id:'test', single_end:false ]
`reads`	`file`	List of input FastQ files of size 1 and 2 for single-end and paired-end data, respectively.
`meta2`	`map`	Groovy Map containing reference information. e.g. [ id:'test', single_end:false ]
`index`	`file`	BWA genome index files
`meta3`	`map`	Groovy Map containing reference information e.g. [ id:'test', single_end:false ]
`fasta`	`file`	Reference genome in FASTA format
`sort_bam`	`boolean`	use samtools sort (true) or samtools view (false)

Outputs

Name	Type	Emit	Description
`val(meta), path("*.bam")`	`tuple`	`bam`	-
`val(meta), path("*.cram")`	`tuple`	`cram`	-
`val(meta), path("*.csi")`	`tuple`	`csi`	-
`val(meta), path("*.crai")`	`tuple`	`crai`	-
`versions.yml`	`path`	`versions`	-

Authors: @drpatelh, @jeremy1805, @matthdsm Maintainers: @drpatelh, @jeremy1805, @matthdsm
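
The interplay between `meta.single_end`, the number of `reads` files, and `sort_bam` can be sketched as follows. This is a hedged illustration of the inputs described above, not the module's actual script: `build_alignment_command` is a hypothetical helper, the thread count and output name are assumptions, and a Python dict stands in for the Groovy `meta` map.

```python
import shlex


def build_alignment_command(meta, reads, index_prefix, sort_bam, threads=4):
    """Assemble a 'bwa mem | samtools' pipeline string for one sample."""
    # One FastQ file for single-end data, two for paired-end data.
    expected = 1 if meta["single_end"] else 2
    if len(reads) != expected:
        raise ValueError(f"expected {expected} FastQ file(s), got {len(reads)}")

    reads_part = " ".join(shlex.quote(r) for r in reads)
    bwa = f"bwa mem -t {threads} {shlex.quote(index_prefix)} {reads_part}"

    out = shlex.quote(f"{meta['id']}.bam")
    if sort_bam:
        # sort_bam = true: coordinate-sort the alignments.
        samtools = f"samtools sort -@ {threads} -o {out} -"
    else:
        # sort_bam = false: only convert the SAM stream to BAM.
        samtools = f"samtools view -b -o {out} -"
    return f"{bwa} | {samtools}"


cmd = build_alignment_command(
    {"id": "test", "single_end": False},
    ["test_1.fastq.gz", "test_2.fastq.gz"],
    "genome.fasta",
    sort_bam=True,
)
print(cmd)
```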

process BWAMEM2_INDEX [source] ¶

Defined in modules/nf-core/bwamem2/index/main.nf:1

index fasta genome reference

Create BWA-mem2 index for reference genome

Tools

bwamem2

BWA-mem2 is a software package for mapping DNA sequences against a large reference genome, such as the human genome.

Homepage Documentation biotools:bwa-mem2 License: MIT

Inputs

Name	Type	Description
`meta`	`map`	Groovy Map containing sample information e.g. [ id:'test', single_end:false ]
`fasta`	`file`	Input genome fasta file

Outputs

Name	Type	Pattern	Description
`index`	`file`	`*.{0123,amb,ann,bwt.2bit.64,pac}`	BWA genome index files

Authors: @maxulysse Maintainers: @maxulysse

process BWAMEM2_MEM [source] ¶

Defined in modules/nf-core/bwamem2/mem/main.nf:1

mem bwa alignment map fastq bam sam

Performs fastq alignment to a fasta reference using BWA-mem2

Tools

bwa

BWA-mem2 is a software package for mapping DNA sequences against a large reference genome, such as the human genome.

Homepage Documentation biotools:bwa-mem2 License: MIT

Inputs

Name	Type	Description
`meta`	`map`	Groovy Map containing sample information e.g. [ id:'test', single_end:false ]
`reads`	`file`	List of input FastQ files of size 1 and 2 for single-end and paired-end data, respectively.
`meta2`	`map`	Groovy Map containing reference/index information e.g. [ id:'test' ]
`index`	`file`	BWA genome index files
`meta3`	`map`	Groovy Map containing reference information e.g. [ id:'genome' ]
`fasta`	`file`	Reference genome in FASTA format
`sort_bam`	`boolean`	use samtools sort (true) or samtools view (false)

Outputs

Name	Type	Emit	Description
`val(meta), path("*.sam")`	`tuple`	`sam`	-
`val(meta), path("*.bam")`	`tuple`	`bam`	-
`val(meta), path("*.cram")`	`tuple`	`cram`	-
`val(meta), path("*.crai")`	`tuple`	`crai`	-
`val(meta), path("*.csi")`	`tuple`	`csi`	-
`versions.yml`	`path`	`versions`	-

Authors: @maxulysse, @matthdsm Maintainers: @maxulysse, @matthdsm

process CAT_CAT [source] ¶

Defined in modules/nf-core/cat/cat/main.nf:1

concatenate gzip cat

A module for concatenation of gzipped or uncompressed files

Tools

cat

Just concatenation

Documentation License: GPL-3.0-or-later

Inputs

Name	Type	Description
`meta`	`map`	Groovy Map containing sample information e.g. [ id:'test', single_end:false ]
`files_in`	`file`	List of compressed / uncompressed files

Outputs

Name	Type	Emit	Description
`val(meta)`	`tuple`	-	-

Authors: @erikrikarddaniel, @FriederikeHanssen Maintainers: @erikrikarddaniel, @FriederikeHanssen

process CAT_FASTQ [source] ¶

Defined in modules/nf-core/cat/fastq/main.nf:1

cat fastq concatenate

Concatenates fastq files

Tools

cat

The cat utility reads files sequentially, writing them to the standard output.

Documentation License: GPL-3.0-or-later

Inputs

Name	Type	Description
`meta`	`map`	Groovy Map containing sample information e.g. [ id:'test', single_end:false ]
`reads`	`file`	List of input FastQ files to be concatenated.

Outputs

Name	Type	Emit	Description
`val(meta), path("*.merged.fastq.gz")`	`tuple`	`reads`	-
`versions.yml`	`path`	`versions`	-

Authors: @joseespinosa, @drpatelh Maintainers: @joseespinosa, @drpatelh
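
Plain concatenation works on gzipped FastQ files because a gzip file may consist of several compressed members back to back. The short sketch below demonstrates that property with in-memory data; the read names and sequences are made up, and it illustrates the principle rather than the module's own script.

```python
import gzip

# Two independently compressed "lanes" of the same sample (made-up reads).
lane1 = gzip.compress(b"@read1\nACGT\n+\nIIII\n")
lane2 = gzip.compress(b"@read2\nTTGA\n+\nIIII\n")

# Byte-wise concatenation, which is what `cat lane1.fastq.gz lane2.fastq.gz` does.
merged = lane1 + lane2

# The result is still a valid gzip stream holding both records.
records = gzip.decompress(merged).decode().splitlines()
print(len(records) // 4, "FastQ records")
```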

process CNVKIT_ANTITARGET [source] ¶

Defined in modules/nf-core/cnvkit/antitarget/main.nf:1

cnvkit antitarget cnv copy number

Derive off-target (“antitarget”) bins from target regions.

Tools

cnvkit

CNVkit is a Python library and command-line software toolkit to infer and visualize copy number from high-throughput DNA sequencing data. It is designed for use with hybrid capture, including both whole-exome and custom target panels, and short-read sequencing platforms such as Illumina and Ion Torrent.

Homepage Documentation biotools:cnvkit License: Apache-2.0

Inputs

Name	Type	Description
`meta`	`map`	Groovy Map containing sample information e.g. [ id:'test', single_end:false ]
`targets`	`file`	File containing genomic regions

Outputs

Name	Type	Emit	Description
`val(meta), path("*.bed")`	`tuple`	`bed`	-
`versions.yml`	`path`	`versions`	-

Authors: @adamrtalbot, @priesgo, @SusiJo Maintainers: @adamrtalbot, @priesgo, @SusiJo
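
Conceptually, off-target regions are what remains of a chromosome once the target intervals are removed. The sketch below shows only that complement step on a single chromosome, assuming BED-style 0-based half-open coordinates; `off_target_regions` is a hypothetical helper, and any padding around targets or splitting of the gaps into bins that CNVkit performs is not modelled.

```python
def off_target_regions(chrom_length, targets):
    """Return the gaps left on one chromosome after removing target intervals.

    targets: iterable of (start, end) pairs, 0-based half-open as in BED.
    """
    gaps = []
    cursor = 0  # end of the region covered so far
    for start, end in sorted(targets):
        if start > cursor:
            gaps.append((cursor, start))
        # max() handles overlapping or nested targets.
        cursor = max(cursor, end)
    if cursor < chrom_length:
        gaps.append((cursor, chrom_length))
    return gaps


print(off_target_regions(500, [(300, 400), (100, 200), (150, 250)]))
```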

process CNVKIT_BATCH [source] ¶

Defined in modules/nf-core/cnvkit/batch/main.nf:1

cnvkit bam fasta copy number

Copy number variant detection from high-throughput sequencing data

Tools

cnvkit

Homepage Documentation biotools:cnvkit License: Apache-2.0

Inputs

Name	Type	Description
`meta`	`map`	Groovy Map containing sample information e.g. [ id:'test', single_end:false ]
`tumor`	`file`	Input tumour sample bam file (or cram)
`normal`	`file`	Input normal sample bam file (or cram)
`meta2`	`map`	Groovy Map containing reference information e.g. [ id:'test' ]
`fasta`	`file`	Input reference genome fasta file (only needed for cram_input and/or when normal_samples are provided)
`meta3`	`map`	Groovy Map containing reference information e.g. [ id:'test' ]
`fasta_fai`	`file`	Input reference genome fasta index (optional, but recommended for cram_input)
`meta4`	`map`	Groovy Map containing information about target file e.g. [ id:'test' ]
`targets`	`file`	Input target bed file
`meta5`	`map`	Groovy Map containing information about reference file e.g. [ id:'test' ]
`reference`	`file`	Input reference cnn-file (only for germline and tumor-only runs)
`panel_of_normals`	`file`	Input panel of normals file

Outputs

Name	Type	Emit	Description
`val(meta), path("*.bed")`	`tuple`	`bed`	-
`val(meta), path("*.cnn")`	`tuple`	`cnn`	-
`val(meta), path("*.cnr")`	`tuple`	`cnr`	-
`val(meta), path("*.cns")`	`tuple`	`cns`	-
`val(meta), path("*.pdf")`	`tuple`	`pdf`	-
`val(meta), path("*.png")`	`tuple`	`png`	-
`versions.yml`	`path`	`versions`	-

Authors: @adamrtalbot, @drpatelh, @fbdtemme, @kaurravneet4123, @KevinMenden, @lassefolkersen, @MaxUlysse, @priesgo, @SusiJo Maintainers: @adamrtalbot, @drpatelh, @fbdtemme, @kaurravneet4123, @KevinMenden, @lassefolkersen, @MaxUlysse, @priesgo, @SusiJo

process CNVKIT_CALL [source] ¶

Defined in modules/nf-core/cnvkit/call/main.nf:1

cnvkit bam fasta copy number

Given segmented log2 ratio estimates (.cns), derive each segment’s absolute integer copy number

Tools

cnvkit

Homepage Documentation biotools:cnvkit License: Apache-2.0

Inputs

Name	Type	Description
`meta`	`map`	Groovy Map containing sample information e.g. [ id:'test', single_end:false ]
`cns`	`file`	CNVKit CNS file.
`vcf`	`file`	Germline VCF file for BAF.

Outputs

Name	Type	Emit	Description
`val(meta), path("*.cns")`	`tuple`	`cns`	-
`versions.yml`	`path`	`versions`	-

Authors: @adamrtalbot, @priesgo Maintainers: @adamrtalbot, @priesgo
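
The relationship between a segment's log2 ratio and an integer copy number can be illustrated with simple arithmetic. The sketch below assumes a pure sample and a reference ploidy of 2, and rounds `ploidy * 2^log2` to the nearest integer; it is a simplified stand-in, not CNVkit's own calling logic, and `integer_copy_number` is a hypothetical helper.

```python
def integer_copy_number(log2_ratio, ploidy=2):
    """Convert a segment log2 ratio to an absolute integer copy number.

    Assumes 100% sample purity: copies = ploidy * 2 ** log2_ratio, rounded.
    """
    return max(0, round(ploidy * 2 ** log2_ratio))


# Made-up segment log2 ratios: neutral, single-copy loss, single-copy gain.
for log2 in (0.0, -1.0, 0.585):
    print(log2, integer_copy_number(log2))
```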

process CNVKIT_EXPORT [source] ¶

Defined in modules/nf-core/cnvkit/export/main.nf:1

cnvkit copy number export

Convert copy number ratio tables (.cnr files) or segments (.cns) to another format.

Tools

cnvkit

Homepage Documentation biotools:cnvkit License: Apache-2.0

Inputs

Name	Type	Description
`meta`	`map`	Groovy Map containing sample information e.g. [ id:'test', single_end:false ]
`cns`	`file`	CNVKit CNS file.

Outputs

Name	Type	Emit	Description
`val(meta)`	`tuple`	-	-

Authors: @adamrtalbot, @priesgo Maintainers: @adamrtalbot, @priesgo

process CNVKIT_GENEMETRICS [source] ¶

Defined in modules/nf-core/cnvkit/genemetrics/main.nf:1

cnvkit bam fasta copy number

Copy number variant detection from high-throughput sequencing data

Tools

cnvkit

Homepage Documentation biotools:cnvkit License: Apache-2.0

Inputs

Name	Type	Description
`meta`	`map`	Groovy Map containing sample information e.g. [ id:'test', single_end:false ]
`cnr`	`file`	CNR file
`cns`	`file`	CNS file [Optional]

Outputs

Name	Type	Emit	Description
`val(meta), path("*.tsv")`	`tuple`	`tsv`	-
`versions.yml`	`path`	`versions`	-

Authors: @adamrtalbot, @marrip, @priesgo Maintainers: @adamrtalbot, @marrip, @priesgo

process CNVKIT_REFERENCE [source] ¶

Defined in modules/nf-core/cnvkit/reference/main.nf:1

cnvkit reference cnv copy number

Compile a coverage reference from the given files (normal samples).

Tools

cnvkit

Homepage Documentation biotools:cnvkit License: Apache-2.0

Inputs

Name	Type	Description
`fasta`	`file`	File containing reference genome
`targets`	`file`	File containing genomic regions
`antitargets`	`file`	File containing off-target genomic regions

Outputs

Name	Type	Emit	Description
`*.cnn`	`path`	`cnn`	-
`versions.yml`	`path`	`versions`	-

Authors: @adamrtalbot, @priesgo, @SusiJo Maintainers: @adamrtalbot, @priesgo, @SusiJo

process CONTROLFREEC_ASSESSSIGNIFICANCE [source] ¶

Defined in modules/nf-core/controlfreec/assesssignificance/main.nf:1

cna cnv somatic single tumor-only

Add both Wilcoxon test and Kolmogorov-Smirnov test p-values to each CNV output of FREEC

Tools

controlfreec/assesssignificance

Copy number and genotype annotation from whole genome and whole exome sequencing data.

Homepage Documentation License: GPL >=2

Inputs

Name	Type	Description
`meta`	`map`	Groovy Map containing sample information e.g. [ id:'test', single_end:false ]
`cnvs`	`file`	_CNVs file generated by FREEC
`ratio`	`file`	ratio file generated by FREEC

Outputs

Name	Type	Pattern	Description
`p_value_txt`	`file`	`*.p.value.txt`	CNV file containing p_values for each call

Authors: @FriederikeHanssen Maintainers: @FriederikeHanssen

process CONTROLFREEC_FREEC [source] ¶

Defined in modules/nf-core/controlfreec/freec/main.nf:1

cna cnv somatic single tumor-only

Copy number and genotype annotation from whole genome and whole exome sequencing data

Tools

controlfreec/freec

Copy number and genotype annotation from whole genome and whole exome sequencing data.

Homepage Documentation License: GPL >=2

Inputs

Name	Type	Description
`meta`	`map`	Groovy Map containing sample information e.g. [ id:'test', single_end:false ]
`mpileup_normal`	`file`	miniPileup file
`mpileup_tumor`	`file`	miniPileup file
`cpn_normal`	`file`	Raw copy number profiles (optional)
`cpn_tumor`	`file`	Raw copy number profiles (optional)
`minipileup_normal`	`file`	miniPileup file from previous run (optional)
`minipileup_tumor`	`file`	miniPileup file from previous run (optional)

Outputs

Name	Type	Pattern	Description
`bedgraph`	`file`	`.bedgraph`	Bedgraph format for the UCSC genome browser
`control_cpn`	`file`	`*_control.cpn`	files with raw copy number profiles
`sample_cpn`	`file`	`*_sample.cpn`	files with raw copy number profiles
`gcprofile_cpn`	`file`	`GC_profile.*.cpn`	file with GC-content profile.
`BAF`	`file`	`*_BAF.txt`	file with B-allele frequencies for each possibly heterozygous SNP position
`CNV`	`file`	`*_CNVs`	file with coordinates of predicted copy number alterations.
`info`	`file`	`*_info.txt`	parsable file with information about FREEC run
`ratio`	`file`	`*_ratio.txt`	file with ratios and predicted copy number alterations for each window
`config`	`file`	`config.txt`	Config file used to run Control-FREEC

Authors: @FriederikeHanssen Maintainers: @FriederikeHanssen

process CONTROLFREEC_FREEC2BED [source] ¶

Defined in modules/nf-core/controlfreec/freec2bed/main.nf:1

cna cnv somatic single tumor-only

Convert Freec ratio output to BED format

Tools

controlfreec

Copy number and genotype annotation from whole genome and whole exome sequencing data.

Homepage Documentation License: GPL >=2

Inputs

Name	Type	Description
`meta`	`map`	Groovy Map containing sample information e.g. [ id:'test', single_end:false ]
`ratio`	`file`	ratio file generated by FREEC

Outputs

Name	Type	Pattern	Description
`bed`	`file`	`*.bed`	Bed file

Authors: @FriederikeHanssen Maintainers: @FriederikeHanssen

process CONTROLFREEC_FREEC2CIRCOS [source] ¶

Defined in modules/nf-core/controlfreec/freec2circos/main.nf:1

cna cnv somatic single tumor-only

Format Freec output to circos input format

Tools

controlfreec

Copy number and genotype annotation from whole genome and whole exome sequencing data.

Homepage Documentation License: GPL >=2

Inputs

Name	Type	Description
`meta`	`map`	Groovy Map containing sample information e.g. [ id:'test', single_end:false ]
`ratio`	`file`	ratio file generated by FREEC

Outputs

Name	Type	Pattern	Description
`circos`	`file`	`*.circos.txt`	Txt file

Authors: @FriederikeHanssen Maintainers: @FriederikeHanssen

process CONTROLFREEC_MAKEGRAPH2 [source] ¶

Defined in modules/nf-core/controlfreec/makegraph2/main.nf:1

cna cnv somatic single tumor-only

Plot Freec output

Tools

controlfreec

Copy number and genotype annotation from whole genome and whole exome sequencing data.

Homepage Documentation License: GPL >=2

Inputs

Name	Type	Description
`meta`	`map`	Groovy Map containing sample information e.g. [ id:'test', single_end:false ]
`ratio`	`file`	ratio file generated by FREEC
`baf`	`file`	.BAF file generated by FREEC

Outputs

Name	Type	Pattern	Description
`png_baf`	`file`	`*_BAF.png`	Image of BAF plot
`png_ratio_log2`	`file`	`*_ratio.log2.png`	Image of ratio log2 plot
`png_ratio`	`file`	`*_ratio.png`	Image of ratio plot

Authors: @FriederikeHanssen

process CREATE_INTERVALS_BED [source] ¶

Defined in modules/local/create_intervals_bed/main.nf:1

Outputs

Name	Type	Emit	Description
`*.bed`	`path`	`bed`	-
`versions.yml`	`path`	`versions`	-

process DEEPVARIANT_RUNDEEPVARIANT [source] ¶

Defined in modules/nf-core/deepvariant/rundeepvariant/main.nf:1

variant calling machine learning neural network

DeepVariant is an analysis pipeline that uses a deep neural network to call genetic variants from next-generation DNA sequencing data

Tools

deepvariant

DeepVariant is an analysis pipeline that uses a deep neural network to call genetic variants from next-generation DNA sequencing data

Homepage Documentation biotools:deepvariant License: BSD-3-clause

Inputs

Name	Type	Description
`meta`	`map`	Groovy Map containing sample information e.g. [ id:'test', single_end:false ]
`input`	`file`	BAM/CRAM file
`index`	`file`	Index of BAM/CRAM file
`intervals`	`file`	file containing intervals
`meta2`	`map`	Groovy Map containing reference information e.g. [ id:'genome' ]
`fasta`	`file`	The reference fasta file
`meta3`	`map`	Groovy Map containing reference information e.g. [ id:'genome' ]
`fai`	`file`	Index of reference fasta file
`meta4`	`map`	Groovy Map containing reference information e.g. [ id:'genome' ]
`gzi`	`file`	GZI index of reference fasta file
`meta5`	`map`	Groovy Map containing reference information e.g. [ id:'genome' ]
`par_bed`	`file`	BED file containing PAR regions

Outputs

Name	Type	Pattern	Description
`vcf`	`file`	`*.vcf.gz`	Compressed VCF file
`vcf_index`	`file`	`*.vcf.gz.{tbi,csi}`	Tabix index file of compressed VCF
`gvcf`	`file`	`*.g.vcf.gz`	Compressed GVCF file
`gvcf_index`	`file`	`*.g.vcf.gz.{tbi,csi}`	Tabix index file of compressed GVCF

Authors: @abhi18av, @ramprasadn Maintainers: @abhi18av, @ramprasadn

process DRAGMAP_ALIGN [source] ¶

Defined in modules/nf-core/dragmap/align/main.nf:1

alignment map fastq bam sam

Performs fastq alignment to a reference using DRAGMAP

Tools

dragmap

Dragmap is the Dragen mapper/aligner Open Source Software.

Homepage Documentation License: GPL v3

Inputs

Name	Type	Description
`meta`	`map`	Groovy Map containing sample information e.g. [ id:'test', single_end:false ]
`reads`	`file`	List of input FastQ files of size 1 and 2 for single-end and paired-end data, respectively.
`meta2`	`map`	Groovy Map containing reference information e.g. [ id:'test', single_end:false ]
`hashmap`	`file`	DRAGMAP hash table
`meta3`	`map`	Groovy Map containing reference information e.g. [ id:'genome']
`fasta`	`file`	Genome fasta reference files
`sort_bam`	`boolean`	Sort the BAM file

Outputs

Name	Type	Emit	Description
`val(meta), path("*.sam")`	`tuple`	`sam`	-
`val(meta), path("*.bam")`	`tuple`	`bam`	-
`val(meta), path("*.cram")`	`tuple`	`cram`	-
`val(meta), path("*.crai")`	`tuple`	`crai`	-
`val(meta), path("*.csi")`	`tuple`	`csi`	-
`val(meta), path('*.log')`	`tuple`	`log`	-
`versions.yml`	`path`	`versions`	-

Authors: @edmundmiller Maintainers: @edmundmiller

process DRAGMAP_HASHTABLE [source] ¶

Defined in modules/nf-core/dragmap/hashtable/main.nf:1

index fasta genome reference

Create DRAGEN hashtable for reference genome

Tools

dragmap

Dragmap is the Dragen mapper/aligner Open Source Software.

Homepage Documentation License: GPL v3

Inputs

Name	Type	Description
`meta`	`map`	Groovy Map containing reference information e.g. [ id:'test', single_end:false ]
`fasta`	`file`	Input genome fasta file

Outputs

Name	Type	Pattern	Description
`hashmap`	`file`	`*.{cmp,bin,txt}`	DRAGMAP hash table

Authors: @edmundmiller Maintainers: @edmundmiller

process ENSEMBLVEP_DOWNLOAD [source] ¶

Defined in modules/nf-core/ensemblvep/download/main.nf:1

annotation cache download

Ensembl Variant Effect Predictor (VEP). The cache downloading options are controlled through task.ext.args.

Tools

ensemblvep

VEP determines the effect of your variants (SNPs, insertions, deletions, CNVs or structural variants) on genes, transcripts, and protein sequence, as well as regulatory regions.

Homepage Documentation License: Apache-2.0

Inputs

Name	Type	Description
`meta`	`map`	Groovy Map containing sample information e.g. [ id:'test', single_end:false ]
`assembly`	`string`	Genome assembly
`species`	`string`	Species
`cache_version`	`string`	cache version

Outputs

Name	Type	Pattern	Description
`cache`	`file`	`*`	cache

Authors: @maxulysse Maintainers: @maxulysse

process ENSEMBLVEP_VEP [source] ¶

Defined in modules/nf-core/ensemblvep/vep/main.nf:1

annotation vcf json tab

Ensembl Variant Effect Predictor (VEP). The output-file-format is controlled through task.ext.args.

Tools

ensemblvep

VEP determines the effect of your variants (SNPs, insertions, deletions, CNVs or structural variants) on genes, transcripts, and protein sequence, as well as regulatory regions.

Homepage Documentation License: Apache-2.0

Inputs

Name	Type	Description
`meta`	`map`	Groovy Map containing sample information e.g. [ id:'test', single_end:false ]
`vcf`	`file`	vcf to annotate
`custom_extra_files`	`file`	extra sample-specific files to be used with the `--custom` flag to be configured with ext.args (optional)
`meta2`	`map`	Groovy Map containing fasta reference information e.g. [ id:'test' ]
`fasta`	`file`	reference FASTA file (optional)

Outputs

Name	Type	Pattern	Description
`vcf`	`file`	`*.vcf.gz`	annotated vcf (optional)
`tbi`	`file`	`*.vcf.gz.tbi`	annotated vcf index (optional)
`tab`	`file`	`*.ann.tab.gz`	tab file with annotated variants (optional)
`json`	`file`	`*.ann.json.gz`	json file with annotated variants (optional)
`report`	`-`	-	-

Authors: @maxulysse, @matthdsm, @nvnieuwk Maintainers: @maxulysse, @matthdsm, @nvnieuwk
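
Since the output format follows from the arguments passed via `task.ext.args`, the mapping from arguments to output file type can be sketched as below. This is an illustration under stated assumptions, not the module's code: `output_extension` is a hypothetical helper, it looks only for the VEP flags `--vcf`, `--json` and `--tab`, and falling back to `vcf` when none is present is an assumption of this sketch.

```python
def output_extension(ext_args):
    """Guess the output file type from a VEP argument string."""
    flags = ext_args.split()
    for flag, extension in (("--vcf", "vcf"), ("--json", "json"), ("--tab", "tab")):
        if flag in flags:
            return extension
    # Assumed default when no format flag is given.
    return "vcf"


print(output_extension("--everything --json"))
```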

process FASTP [source] ¶

Defined in modules/nf-core/fastp/main.nf:1

trimming quality control fastq

Perform adapter/quality trimming on sequencing reads

Tools

fastp

A tool designed to provide fast all-in-one preprocessing for FastQ files. This tool is developed in C++ with multithreading supported to afford high performance.

Documentation biotools:fastp License: MIT

Inputs

Name	Type	Description
`meta`	`map`	Groovy Map containing sample information. Use 'single_end: true' to specify single ended or interleaved FASTQs. Use 'single_end: false' for paired-end reads. e.g. [ id:'test', single_end:false ]
`reads`	`file`	List of input FastQ files of size 1 and 2 for single-end and paired-end data, respectively. If you wish to run interleaved paired-end data, supply as single-end data but with `--interleaved_in` in your `modules.conf`'s `ext.args` for the module.
`adapter_fasta`	`file`	File in FASTA format containing possible adapters to remove.
`discard_trimmed_pass`	`boolean`	Specify true to not write any reads that pass trimming thresholds. This allows fastp to be run for its output report only.
`save_trimmed_fail`	`boolean`	Specify true to save files that failed to pass trimming thresholds ending in `*.fail.fastq.gz`
`save_merged`	`boolean`	Specify true to save all merged reads to a file ending in `*.merged.fastq.gz`

Outputs

Name	Type	Emit	Description
`val(meta), path('*.fastp.fastq.gz')`	`tuple`	`reads`	-
`val(meta), path('*.json')`	`tuple`	`json`	-
`val(meta), path('*.html')`	`tuple`	`html`	-
`val(meta), path('*.log')`	`tuple`	`log`	-
`val(meta), path('*.fail.fastq.gz')`	`tuple`	`reads_fail`	-
`val(meta), path('*.merged.fastq.gz')`	`tuple`	`reads_merged`	-
`versions.yml`	`path`	`versions`	-

Authors: @drpatelh, @kevinmenden Maintainers: @drpatelh, @kevinmenden

process FASTQC [source] ¶

Defined in modules/nf-core/fastqc/main.nf:1

quality control qc adapters fastq

Run FastQC on sequenced reads

Tools

fastqc

FastQC gives general quality metrics about your reads. It provides information about the quality score distribution across your reads, the per base sequence content (%A/C/G/T).

You get information about adapter contamination and other overrepresented sequences.

Homepage Documentation biotools:fastqc License: GPL-2.0-only

Inputs

Name	Type	Description
`meta`	`map`	Groovy Map containing sample information e.g. [ id:'test', single_end:false ]
`reads`	`file`	List of input FastQ files of size 1 and 2 for single-end and paired-end data, respectively.

Outputs

Name	Type	Pattern	Description
`html`	`file`	`*_{fastqc.html}`	FastQC report
`zip`	`file`	`*_{fastqc.zip}`	FastQC report archive

Authors: @drpatelh, @grst, @ewels, @FelixKrueger Maintainers: @drpatelh, @grst, @ewels, @FelixKrueger

process FGBIO_CALLMOLECULARCONSENSUSREADS [source] ¶

Defined in modules/nf-core/fgbio/callmolecularconsensusreads/main.nf:1

UMIs consensus sequence bam

Calls consensus sequences from reads with the same unique molecular tag.

Tools

fgbio

Tools for working with genomic and high throughput sequencing data.

Homepage Documentation biotools:fgbio License: MIT

Inputs

Name	Type	Description
`meta`	`map`	Groovy Map containing sample information e.g. [ id:'test', single_end:false, collapse:false ]
`grouped_bam`	`file`	The input SAM or BAM file, grouped by UMIs
`min_reads`	`integer`	Minimum number of original reads to build each consensus read.
`min_baseq`	`integer`	Ignore bases in raw reads that have Q below this value.

Outputs

Name	Type	Emit	Description
`val(meta), path("*.bam")`	`tuple`	`bam`	-
`versions.yml`	`path`	`versions`	-

Authors: @sruthipsuresh Maintainers: @sruthipsuresh
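
The roles of `min_reads` and `min_baseq` can be illustrated with a toy consensus caller. The sketch below uses a per-position majority vote, which is a deliberate simplification and not fgbio's actual consensus model; `call_consensus` is a hypothetical helper, and the reads are made-up sequences of equal length from one UMI group.

```python
from collections import Counter


def call_consensus(reads, min_reads=1, min_baseq=10):
    """Majority-vote consensus for one UMI group.

    reads: list of (sequence, qualities) pairs of equal length,
    with qualities given as Phred-scaled integers.
    Returns None when the group has fewer than min_reads reads.
    """
    if len(reads) < min_reads:
        return None
    consensus = []
    for i in range(len(reads[0][0])):
        # Ignore bases whose quality is below min_baseq.
        votes = Counter(seq[i] for seq, quals in reads if quals[i] >= min_baseq)
        consensus.append(votes.most_common(1)[0][0] if votes else "N")
    return "".join(consensus)


group = [
    ("ACGT", [30, 30, 30, 30]),
    ("ACGA", [30, 30, 30, 5]),  # low-quality mismatch at the last base
    ("ACGT", [30, 30, 30, 30]),
]
print(call_consensus(group, min_reads=2, min_baseq=10))
```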

process FGBIO_COPYUMIFROMREADNAME [source] ¶

Defined in modules/nf-core/fgbio/copyumifromreadname/main.nf:1

sort example genomics

Copies the UMI at the end of each read name in a BAM file to the RX tag.

Tools

fgbio

A set of tools for working with genomic and high throughput sequencing data, including UMIs

Homepage Documentation biotools:fgbio License: MIT

Inputs

Name	Type	Description
`meta`	`map`	Groovy Map containing sample information e.g. `[ id:'sample1' ]`
`bam`	`file`	Sorted BAM/CRAM/SAM file
`bai`	`file`	Index for bam file

Outputs

Name	Type	Pattern	Description
`bam`	`file`	`*.{bam}`	Sorted BAM file
`bai`	`file`	`*.{bai}`	Index for bam file

Authors: @sppearce Maintainers: @sppearce
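
The transformation can be pictured as taking the last field of the read name and writing it as an `RX` tag. The sketch below assumes a colon-delimited read name carrying a single UMI made of the bases A, C, G, T or N; `rx_tag_from_read_name` is a hypothetical helper and the read name is made up.

```python
def rx_tag_from_read_name(read_name, delimiter=":"):
    """Return the SAM RX tag for the UMI stored at the end of a read name."""
    umi = read_name.rsplit(delimiter, 1)[-1]
    # Reject anything that does not look like a UMI sequence.
    if not umi or set(umi) - set("ACGTN"):
        raise ValueError(f"no valid UMI at the end of {read_name!r}")
    return f"RX:Z:{umi}"


print(rx_tag_from_read_name("M01:23:FLOWCELL:1:1101:1000:2000:ACGTACGT"))
```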

process FGBIO_FASTQTOBAM [source] ¶

Defined in modules/nf-core/fgbio/fastqtobam/main.nf:1

unaligned bam cram

Uses fgbio to convert sequenced FASTQ files into unaligned BAM or CRAM files, optionally moving the UMI barcode into the RX field of the reads

Tools

fgbio

A set of tools for working with genomic and high throughput sequencing data, including UMIs

Homepage Documentation biotools:fgbio License: MIT

Inputs

Name	Type	Description
`meta`	`map`	Groovy Map containing sample information e.g. [ id:'test', single_end:false ]
`reads`	`file`	pair of reads to be converted into BAM file

Outputs

Name	Type	Emit	Description
`val(meta), path("*.bam")`	`tuple`	`bam`	-
`val(meta), path("*.cram")`	`tuple`	`cram`	-
`versions.yml`	`path`	`versions`	-

Authors: @lescai, @matthdsm, @nvnieuwk Maintainers: @lescai, @matthdsm, @nvnieuwk

process FGBIO_GROUPREADSBYUMI [source] ¶

Defined in modules/nf-core/fgbio/groupreadsbyumi/main.nf:1

UMI groupreads fgbio

Groups reads together that appear to have come from the same original molecule. Reads are grouped by template, and templates are then sorted by the 5’ mapping positions of their reads, from earliest mapping position to latest. Reads that have the same end positions are then sub-grouped by UMI sequence. Note: the MQ tag is required on reads with mapped mates. It can be added using samblaster with the optional argument --addMateTags.

Tools

fgbio

A set of tools for working with genomic and high throughput sequencing data, including UMIs

Homepage Documentation biotools:fgbio License: MIT

Inputs

Name	Type	Description
`meta`	`map`	Groovy Map containing sample information e.g. [ id:'test', single_end:false ]
`bam`	`file`	BAM file. Note: the MQ tag is required on reads with mapped mates.
`strategy`	`string`	Required argument: defines the UMI assignment strategy. Must be chosen among: Identity, Edit, Adjacency, Paired.

Outputs

Name	Type	Emit	Description
`val(meta), path("*.bam")`	`tuple`	`bam`	-
`val(meta), path("*histogram.txt")`	`tuple`	`histogram`	-
`versions.yml`	`path`	`versions`	-

Authors: @lescai Maintainers: @lescai
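
The grouping described above, position first and UMI second, can be sketched for the simplest case of exact UMI matches. The example below is a toy version that corresponds in spirit to the Identity strategy only; the Edit, Adjacency and Paired strategies are not modelled, `group_by_umi_identity` is a hypothetical helper, and the reads are made up.

```python
from collections import defaultdict


def group_by_umi_identity(reads):
    """Group read names by (start, end, UMI), requiring identical UMIs.

    reads: list of dicts with the keys 'name', 'start', 'end' and 'umi'.
    """
    groups = defaultdict(list)
    # Walk reads from earliest to latest mapping position.
    for read in sorted(reads, key=lambda r: (r["start"], r["end"])):
        groups[(read["start"], read["end"], read["umi"])].append(read["name"])
    return dict(groups)


reads = [
    {"name": "r1", "start": 100, "end": 250, "umi": "ACGT"},
    {"name": "r2", "start": 100, "end": 250, "umi": "ACGT"},
    {"name": "r3", "start": 100, "end": 250, "umi": "TTGA"},
    {"name": "r4", "start": 400, "end": 550, "umi": "ACGT"},
]
for key, names in group_by_umi_identity(reads).items():
    print(key, names)
```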

process FREEBAYES [source] ¶

Defined in modules/nf-core/freebayes/main.nf:1

variant caller SNP genotyping somatic variant calling germline variant calling bacterial variant calling bayesian

A haplotype-based variant detector

Tools

freebayes

Bayesian haplotype-based polymorphism discovery and genotyping

Homepage Documentation biotools:freebayes License: MIT

Inputs

Name	Type	Description
`meta`	`map`	Groovy Map containing sample information e.g. [ id:'test', single_end:false ]
`input_1`	`file`	BAM/CRAM/SAM file
`input_1_index`	`file`	BAM/CRAM/SAM index file
`input_2`	`file`	BAM/CRAM/SAM file
`input_2_index`	`file`	BAM/CRAM/SAM index file
`target_bed`	`file`	Optional - Limit analysis to targets listed in this BED-format FILE.
`meta2`	`map`	Groovy Map containing reference information. e.g. [ id:'test_reference' ]
`fasta`	`file`	reference fasta file
`meta3`	`map`	Groovy Map containing reference information. e.g. [ id:'test_reference' ]
`fasta_fai`	`file`	reference fasta file index
`meta4`	`map`	Groovy Map containing meta information for the samples file. e.g. [ id:'test_samples' ]
`samples`	`file`	Optional - Limit analysis to samples listed (one per line) in the FILE.
`meta5`	`map`	Groovy Map containing meta information for the populations file. e.g. [ id:'test_populations' ]
`populations`	`file`	Optional - Each line of FILE should list a sample and a population which it is part of.
`meta6`	`map`	Groovy Map containing meta information for the cnv file. e.g. [ id:'test_cnv' ]
`cnv`	`file`	A copy number map BED file, which has either a sample-level ploidy: sample_name copy_number or a region-specific format: seq_name start end sample_name copy_number

Outputs

Name	Type	Pattern	Description
`vcf`	`file`	`*.vcf.gz`	Compressed VCF file

Authors: @maxibor, @FriederikeHanssen, @maxulysse Maintainers: @maxibor, @FriederikeHanssen, @maxulysse
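
The two line formats accepted in the `cnv` file, sample-level and region-specific, can be told apart by their number of columns. The sketch below is a hypothetical reader written only to illustrate those two layouts; it assumes whitespace-separated fields, and the sample and sequence names are made up.

```python
def parse_cnv_map(lines):
    """Split copy number map lines into sample-level and region-level entries."""
    sample_ploidy = {}
    region_ploidy = []
    for line in lines:
        fields = line.split()
        if not fields or line.startswith("#"):
            continue
        if len(fields) == 2:
            # sample_name copy_number
            sample_ploidy[fields[0]] = int(fields[1])
        elif len(fields) == 5:
            # seq_name start end sample_name copy_number
            seq, start, end, sample, copies = fields
            region_ploidy.append((seq, int(start), int(end), sample, int(copies)))
        else:
            raise ValueError(f"unexpected number of columns: {line!r}")
    return sample_ploidy, region_ploidy


by_sample, by_region = parse_cnv_map([
    "sampleA\t2",
    "chrX\t0\t1000000\tsampleA\t1",
])
print(by_sample, by_region)
```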

process GATK4_APPLYBQSR [source] ¶

Defined in modules/nf-core/gatk4/applybqsr/main.nf:1

bam base quality score recalibration bqsr cram gatk4

Apply base quality score recalibration (BQSR) to a bam file

Tools

gatk4

Developed in the Data Sciences Platform at the Broad Institute, the toolkit offers a wide variety of tools with a primary focus on variant discovery and genotyping. Its powerful processing engine and high-performance computing features make it capable of taking on projects of any size.

Homepage Documentation License: Apache-2.0

Inputs

Name	Type	Description
`meta`	`map`	Groovy Map containing sample information e.g. [ id:'test', single_end:false ]
`input`	`file`	BAM/CRAM file from alignment
`input_index`	`file`	BAI/CRAI file from alignment
`bqsr_table`	`file`	Recalibration table from gatk4_baserecalibrator
`intervals`	`file`	Bed file with the genomic regions included in the library (optional)

Outputs

Name	Type	Pattern	Description
`bam`	`file`	`${prefix}.bam`	Recalibrated BAM file
`bai`	`file`	`${prefix}*bai`	Recalibrated BAM index file
`cram`	`file`	`${prefix}.cram`	Recalibrated CRAM file

Authors: @yocra3, @FriederikeHanssen Maintainers: @yocra3, @FriederikeHanssen

process GATK4_APPLYVQSR [source] ¶

Defined in modules/nf-core/gatk4/applyvqsr/main.nf:1

gatk4 variant quality score recalibration vcf vqsr

Apply a score cutoff to filter variants based on a recalibration table. ApplyVQSR performs the second pass in a two-stage process called Variant Quality Score Recalibration (VQSR). Specifically, it applies filtering to the input variants based on the recalibration table produced in the first step by VariantRecalibrator and a target sensitivity value.

Tools

gatk4

Homepage Documentation License: Apache-2.0

Inputs

Name	Type	Description
`meta`	`map`	Groovy Map containing sample information e.g. [ id:'test']
`vcf`	`file`	VCF file to be recalibrated, this should be the same file as used for the first stage VariantRecalibrator.
`vcf_tbi`	`file`	tabix index for the input vcf file.
`recal`	`file`	Recalibration file produced when the input vcf was run through VariantRecalibrator in stage 1.
`recal_index`	`file`	Index file for the recalibration file.
`tranches`	`file`	Tranches file produced when the input vcf was run through VariantRecalibrator in stage 1.
`fasta`	`file`	The reference fasta file
`fai`	`file`	Index of reference fasta file
`dict`	`file`	GATK sequence dictionary

Outputs

Name	Type	Emit	Description
`val(meta), path("*.vcf.gz")`	`tuple`	`vcf`	-
`val(meta), path("*.tbi")`	`tuple`	`tbi`	-
`versions.yml`	`path`	`versions`	-

Authors: @GCJMackenzie Maintainers: @GCJMackenzie

process GATK4_BASERECALIBRATOR [source] ¶

Defined in modules/nf-core/gatk4/baserecalibrator/main.nf:1

base quality score recalibration table bqsr gatk4 sort

Generate recalibration table for Base Quality Score Recalibration (BQSR)

Tools

gatk4

Homepage Documentation License: Apache-2.0

Inputs

Name	Type	Description
`meta`	`map`	Groovy Map containing sample information e.g. [ id:'test', single_end:false ]
`input`	`file`	BAM/CRAM file from alignment
`input_index`	`file`	BAI/CRAI file from alignment
`intervals`	`file`	Bed file with the genomic regions included in the library (optional)
`meta2`	`map`	Groovy Map containing reference information e.g. [ id:'genome']
`fasta`	`file`	The reference fasta file
`meta3`	`map`	Groovy Map containing reference information e.g. [ id:'genome']
`fai`	`file`	Index of reference fasta file
`meta4`	`map`	Groovy Map containing reference information e.g. [ id:'genome']
`dict`	`file`	GATK sequence dictionary
`meta5`	`map`	Groovy Map containing reference information e.g. [ id:'genome']
`known_sites`	`file`	VCF files with known sites for indels / snps (optional)
`meta6`	`map`	Groovy Map containing reference information e.g. [ id:'genome']
`known_sites_tbi`	`file`	Tabix index of the known_sites (optional)

Outputs

Name	Type	Emit	Description
`val(meta), path("*.table")`	`tuple`	`table`	-
`versions.yml`	`path`	`versions`	-

Authors: @yocra3, @FriederikeHanssen, @maxulysse Maintainers: @yocra3, @FriederikeHanssen, @maxulysse

process GATK4_CALCULATECONTAMINATION [source] ¶

Defined in modules/nf-core/gatk4/calculatecontamination/main.nf:1

gatk4 calculatecontamination cross-samplecontamination getpileupsummaries filtermutectcalls

Calculates the fraction of reads from cross-sample contamination based on summary tables from getpileupsummaries. Output to be used with filtermutectcalls.

Tools

gatk4

Homepage Documentation License: Apache-2.0

Inputs

Name	Type	Description
`meta`	`map`	Groovy Map containing sample information e.g. [ id:'test' ]
`pileup`	`file`	File containing the pileups summary table of a tumor sample to be used to calculate contamination.
`matched`	`file`	File containing the pileups summary table of a normal sample that matches with the tumor sample specified in pileup argument. This is an optional input.

Outputs

Name	Type	Emit	Description
`val(meta), path('*.contamination.table')`	`tuple`	`contamination`	-
`val(meta), path('*.segmentation.table')`	`tuple`	`segmentation`	-
`versions.yml`	`path`	`versions`	-

Authors: @GCJMackenzie, @maxulysse Maintainers: @GCJMackenzie, @maxulysse

process GATK4_CNNSCOREVARIANTS [source] ¶

Defined in modules/nf-core/gatk4/cnnscorevariants/main.nf:1

cnnscorevariants gatk4 variants

Apply a Convolutional Neural Net to filter annotated variants

Tools

gatk4

Homepage Documentation License: Apache-2.0

Inputs

Name	Type	Description
`meta`	`map`	Groovy Map containing sample information e.g. [ id:'test', single_end:false ]
`vcf`	`file`	VCF file
`tbi`	`file`	VCF index file
`aligned_input`	`file`	BAM/CRAM file from alignment (optional)
`intervals`	`file`	Bed file with the genomic regions included in the library (optional)
`fasta`	`file`	The reference fasta file
`fai`	`file`	Index of reference fasta file
`dict`	`file`	GATK sequence dictionary
`architecture`	`file`	Neural Net architecture configuration json file (optional)
`weights`	`file`	Keras model HD5 file with neural net weights. (optional)

Outputs

Name	Type	Emit	Description
`val(meta), path("*cnn.vcf.gz")`	`tuple`	`vcf`	-
`val(meta), path("*cnn.vcf.gz.tbi")`	`tuple`	`tbi`	-
`versions.yml`	`path`	`versions`	-

Authors: @FriederikeHanssen Maintainers: @FriederikeHanssen

process GATK4_CREATESEQUENCEDICTIONARY [source] ¶

Defined in modules/nf-core/gatk4/createsequencedictionary/main.nf:1

createsequencedictionary dictionary fasta gatk4

Creates a sequence dictionary for a reference sequence

Tools

gatk

Homepage Documentation License: Apache-2.0

Inputs

Name	Type	Description
`meta`	`map`	Groovy Map containing reference information e.g. [ id:'genome' ]
`fasta`	`file`	Input fasta file

Outputs

Name	Type	Pattern	Description
`dict`	`file`	`*.{dict}`	gatk dictionary file

Authors: @maxulysse, @ramprasadn Maintainers: @maxulysse, @ramprasadn

process GATK4_ESTIMATELIBRARYCOMPLEXITY [source] ¶

Defined in modules/nf-core/gatk4/estimatelibrarycomplexity/main.nf:1

duplication metrics estimatelibrarycomplexity gatk4 reporting

Estimates the numbers of unique molecules in a sequencing library.

Tools

gatk4

Genome Analysis Toolkit (GATK4)

Homepage Documentation License: Apache-2.0

Inputs

Name	Type	Description
`meta`	`map`	Groovy Map containing sample information e.g. [ id:'test', single_end:false ]
`input`	`file`	BAM/CRAM/SAM file
`fasta`	`file`	The reference fasta file
`fai`	`file`	Index of reference fasta file
`dict`	`file`	GATK sequence dictionary

Outputs

Name	Type	Emit	Description
`val(meta), path('*.metrics')`	`tuple`	`metrics`	-
`versions.yml`	`path`	`versions`	-

Authors: @FriederikeHanssen, @maxulysse Maintainers: @FriederikeHanssen, @maxulysse

process GATK4_FILTERMUTECTCALLS [source] ¶

Defined in modules/nf-core/gatk4/filtermutectcalls/main.nf:1

filtermutectcalls filter gatk4 mutect2 vcf

Filters the raw output of mutect2. Can optionally use the outputs of calculatecontamination and learnreadorientationmodel to improve filtering.

Tools

gatk4

Homepage Documentation

Inputs

Name	Type	Description
`meta`	`map`	Groovy Map containing sample information e.g. [ id:'test' ]
`vcf`	`file`	Compressed vcf file of mutect2 calls
`vcf_tbi`	`file`	Tabix index of vcf file
`stats`	`file`	Stats file that pairs with output vcf file
`orientationbias`	`file`	files containing artifact priors for input vcf. Optional input.
`segmentation`	`file`	tables containing segmentation information for input vcf. Optional input.
`table`	`file`	table(s) containing contamination data for input vcf. Optional input, takes priority over estimate.
`estimate`	`float`	estimation of contamination value as a double. Optional input, will only be used if table is not specified.
`meta2`	`map`	Groovy Map containing reference information e.g. [ id:'genome' ]
`fasta`	`file`	The reference fasta file
`meta3`	`map`	Groovy Map containing reference information e.g. [ id:'genome' ]
`fai`	`file`	Index of reference fasta file
`meta4`	`map`	Groovy Map containing reference information e.g. [ id:'genome' ]
`dict`	`file`	GATK sequence dictionary
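
The precedence between `table` and `estimate` described in the inputs above can be sketched as follows. This is an illustrative Python helper, not the module's code: `contamination_args` is a hypothetical name, and it assumes the values reach GATK FilterMutectCalls through its `--contamination-table` and `--contamination-estimate` arguments.

```python
def contamination_args(table=None, estimate=None):
    """Build the contamination-related arguments.

    `table` is a list of contamination table paths and takes priority;
    `estimate` is only used when no table is given.
    """
    if table:
        args = []
        for path in table:
            args += ["--contamination-table", str(path)]
        return args
    if estimate is not None:
        return ["--contamination-estimate", str(estimate)]
    # Neither input supplied: no contamination argument is added.
    return []


if __name__ == "__main__":
    print(contamination_args(table=["tumor.contamination.table"], estimate=0.02))
    print(contamination_args(estimate=0.02))
```

When both inputs are supplied, the estimate is silently ignored in this sketch, which mirrors the "takes priority" wording of the `table` description.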

Outputs

Name	Type	Emit	Description
`val(meta), path("*.vcf.gz")`	`tuple`	`vcf`	-
`val(meta), path("*.vcf.gz.tbi")`	`tuple`	`tbi`	-
`val(meta), path("*.filteringStats.tsv")`	`tuple`	`stats`	-
`versions.yml`	`path`	`versions`	-

Authors: @GCJMackenzie, @maxulysse, @ramprasadn Maintainers: @GCJMackenzie, @maxulysse, @ramprasadn

process GATK4_FILTERVARIANTTRANCHES [source] ¶

Defined in modules/nf-core/gatk4/filtervarianttranches/main.nf:1

filtervarianttranches gatk4 tranche filtering

Apply tranche filtering

Tools

gatk4

Homepage Documentation License: Apache-2.0

Inputs

Name	Type	Description
`meta`	`map`	Groovy Map containing sample information e.g. [ id:'test', single_end:false ]
`vcf`	`file`	a VCF file containing variants, must have info key:CNN_2D
`tbi`	`file`	tbi index file matching the input vcf
`intervals`	`file`	Intervals
`resources`	`list`	Resource VCF(s) containing known SNP and/or INDEL sites. Can be supplied as many times as necessary
`resources_index`	`list`	Index of the resource VCF(s) containing known SNP and/or INDEL sites. Can be supplied as many times as necessary
`fasta`	`file`	The reference fasta file
`fai`	`file`	Index of reference fasta file
`dict`	`file`	GATK sequence dictionary

Outputs

Name	Type	Emit	Description
`val(meta), path("*.vcf.gz")`	`tuple`	`vcf`	-
`val(meta), path("*.vcf.gz.tbi")`	`tuple`	`tbi`	-
`versions.yml`	`path`	`versions`	-

Authors: @FriederikeHanssen Maintainers: @FriederikeHanssen

process GATK4_GATHERBQSRREPORTS [source] ¶

Defined in modules/nf-core/gatk4/gatherbqsrreports/main.nf:1

base quality score recalibration bqsr gatherbqsrreports gatk4

Gathers scattered BQSR recalibration reports into a single file

Tools

gatk4

Genome Analysis Toolkit (GATK4)

Homepage Documentation License: BSD-3-clause

Inputs

Name	Type	Description
`meta`	`map`	Groovy Map containing sample information e.g. [ id:'test', single_end:false ]
`table`	`file`	File(s) containing BQSR table(s)

Outputs

Name	Type	Emit	Description
`val(meta), path("*.table")`	`tuple`	`table`	-
`versions.yml`	`path`	`versions`	-

Authors: @FriederikeHanssen Maintainers: @FriederikeHanssen

process GATK4_GATHERPILEUPSUMMARIES [source] ¶

Defined in modules/nf-core/gatk4/gatherpileupsummaries/main.nf:1

gatk4 mpileup sort

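Gathers the pileup summary tables produced by gatk4/getpileupsummaries into a single table, ordered according to the sequence dictionary.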

Tools

gatk4

Genome Analysis Toolkit (GATK4)

Homepage Documentation License: BSD-3-clause

Inputs

Name	Type	Description
`meta`	`map`	Groovy Map containing sample information e.g. [ id:'test', single_end:false ]
`pileup`	`file`	Pileup files from gatk4/getpileupsummaries
`dict`	`file`	dictionary

Outputs

Name	Type	Emit	Description
`val(meta), path("*.pileups.table")`	`tuple`	`table`	-
`versions.yml`	`path`	`versions`	-

Authors: @FriederikeHanssen, @maxulysse Maintainers: @FriederikeHanssen, @maxulysse

process GATK4_GENOMICSDBIMPORT [source] ¶

Defined in modules/nf-core/gatk4/genomicsdbimport/main.nf:1

gatk4 genomicsdb genomicsdbimport jointgenotyping panelofnormalscreation

Merge GVCFs from multiple samples. For use in joint genotyping or somatic panel of normal creation.

Tools

gatk4

Homepage Documentation

Inputs

Name	Type	Description
`meta`	`map`	Groovy Map containing sample information e.g. [ id:'test']
`vcf`	`list`	either a list of vcf files to be used to create or update a genomicsdb, or a file that contains a map to vcf files to be used.
`tbi`	`list`	list of tbi files that match with the input vcf files
`interval_file`	`file`	file containing the intervals to be used when creating the genomicsdb
`interval_value`	`string`	if an intervals file has not been specified, the value entered here will be used as an interval via the "-L" argument
`wspace`	`file`	path to an existing genomicsdb to be used in update db mode or get intervals mode. This WILL NOT specify name of a new genomicsdb in create db mode.
`run_intlist`	`boolean`	Specify whether to run get interval list mode. This option cannot be specified at the same time as run_updatewspace.
`run_updatewspace`	`boolean`	Specify whether to run update genomicsdb mode. This option takes priority over run_intlist.
`input_map`	`boolean`	Specify whether the vcf input is providing a list of vcf file(s) or a single file containing a map of paths to vcf files to be used to create or update a genomicsdb.
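
The way the boolean inputs select a mode, and the way the interval and vcf inputs are interpreted, can be summarised in a short Python sketch. It only restates the descriptions above and is not the module's code. The helper names are hypothetical; the `--variant`, `--sample-name-map` and `-L` arguments are assumed to be the GATK GenomicsDBImport arguments that receive these values.

```python
def select_mode(run_intlist=False, run_updatewspace=False):
    """Pick the running mode; update mode wins if both flags are set."""
    if run_updatewspace:
        return "update"
    if run_intlist:
        return "get_intervals"
    return "create"


def input_args(vcf, input_map=False):
    """Interpret `vcf` as a sample map file or as a list of vcf files."""
    if input_map:
        # A single file that maps sample names to vcf paths.
        return ["--sample-name-map", vcf[0]]
    args = []
    for path in vcf:
        args += ["--variant", path]
    return args


def interval_args(interval_file=None, interval_value=None):
    """Use the intervals file when given, otherwise fall back to the value."""
    chosen = interval_file if interval_file else interval_value
    return ["-L", chosen] if chosen else []


if __name__ == "__main__":
    print(select_mode(run_intlist=True, run_updatewspace=True))
    print(input_args(["a.g.vcf.gz", "b.g.vcf.gz"]))
    print(interval_args(interval_value="chr1"))
```

In create mode `wspace` plays no part; per its description it is only consulted in the update and get-intervals modes.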

Outputs

Name	Type	Emit	Description
`val(meta)`	`tuple`	-	-

Authors: @GCJMackenzie Maintainers: @GCJMackenzie

process GATK4_GENOTYPEGVCFS [source] ¶

Defined in modules/nf-core/gatk4/genotypegvcfs/main.nf:1

gatk4 genotype gvcf joint genotyping

Perform joint genotyping on one or more samples pre-called with HaplotypeCaller.

Tools

gatk4

Genome Analysis Toolkit (GATK4)

Homepage Documentation License: BSD-3-clause

Inputs

Name	Type	Description
`meta`	`map`	Groovy Map containing sample information e.g. [ id:'test', single_end:false ]
`input`	`file`	gVCF(.gz) file or a GenomicsDB
`gvcf_index`	`file`	index of gvcf file, or empty when providing GenomicsDB
`intervals`	`file`	Interval file with the genomic regions included in the library (optional)
`intervals_index`	`file`	Interval index file (optional)
`meta2`	`map`	Groovy Map containing fasta information e.g. [ id:'test' ]
`fasta`	`file`	Reference fasta file
`meta3`	`map`	Groovy Map containing fai information e.g. [ id:'test' ]
`fai`	`file`	Reference fasta index file
`meta4`	`map`	Groovy Map containing dict information e.g. [ id:'test' ]
`dict`	`file`	Reference fasta sequence dict file
`meta5`	`map`	Groovy Map containing dbsnp information e.g. [ id:'test' ]
`dbsnp`	`file`	dbSNP VCF file
`meta6`	`map`	Groovy Map containing dbsnp tbi information e.g. [ id:'test' ]
`dbsnp_tbi`	`file`	dbSNP VCF index file

Outputs

Name	Type	Emit	Description
`val(meta), path("*.vcf.gz")`	`tuple`	`vcf`	-
`val(meta), path("*.tbi")`	`tuple`	`tbi`	-
`versions.yml`	`path`	`versions`	-

Authors: @santiagorevale, @maxulysse Maintainers: @santiagorevale, @maxulysse

process GATK4_GETPILEUPSUMMARIES [source] ¶

Defined in modules/nf-core/gatk4/getpileupsummaries/main.nf:1

gatk4 germlinevariantsites getpileupsummaries readcountssummary

Summarizes counts of reads that support reference, alternate and other alleles for given sites. Results can be used with CalculateContamination. Requires a common germline variant sites file, such as from gnomAD.

Tools

gatk4

Homepage Documentation License: Apache-2.0

Inputs

Name	Type	Description
`meta`	`map`	Groovy Map containing sample information e.g. [ id:'test' ]
`input`	`file`	BAM/CRAM file to be summarised.
`index`	`file`	Index file for the input BAM/CRAM file.
`intervals`	`file`	File containing specified sites to be used for the summary. If this option is not specified, the variants file is automatically used instead.
`meta2`	`map`	Groovy Map containing reference information e.g. [ id:'genome' ]
`fasta`	`file`	The reference fasta file
`meta3`	`map`	Groovy Map containing reference information e.g. [ id:'genome' ]
`fai`	`file`	Index of reference fasta file
`meta4`	`map`	Groovy Map containing reference information e.g. [ id:'genome' ]
`dict`	`file`	GATK sequence dictionary
`variants`	`file`	Population vcf of germline sequencing, containing allele fractions. Is also used as sites file if no separate sites file is specified.
`variants_tbi`	`file`	Index file for the germline resource.
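
The fallback from `intervals` to `variants` can be sketched as below. This is an illustrative Python helper rather than the module's code: the function names and the `sample` prefix are hypothetical, and the `-I`, `-V`, `-L` and `-O` arguments are assumed to be the GATK GetPileupSummaries arguments that receive these inputs.

```python
def sites_file(variants, intervals=None):
    """Return the file used to restrict the summary to specific sites."""
    # Without a separate intervals file, the variants file doubles as sites file.
    return intervals if intervals else variants


def getpileupsummaries_args(bam, variants, intervals=None, prefix="sample"):
    """Assemble an argument list for one sample (hypothetical helper)."""
    return [
        "gatk", "GetPileupSummaries",
        "-I", bam,
        "-V", variants,
        "-L", sites_file(variants, intervals),
        "-O", f"{prefix}.pileups.table",
    ]


if __name__ == "__main__":
    print(" ".join(getpileupsummaries_args("tumor.cram", "gnomad.vcf.gz")))
```

The output name follows the `*.pileups.table` pattern listed under Outputs.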

Outputs

Name	Type	Emit	Description
`val(meta), path('*.pileups.table')`	`tuple`	`table`	-
`versions.yml`	`path`	`versions`	-

Authors: @GCJMackenzie Maintainers: @GCJMackenzie

process GATK4_HAPLOTYPECALLER [source] ¶

Defined in modules/nf-core/gatk4/haplotypecaller/main.nf:1

gatk4 haplotype haplotypecaller

Call germline SNPs and indels via local re-assembly of haplotypes

Tools

gatk4

Homepage Documentation License: Apache-2.0

Inputs

Name	Type	Description
`meta`	`map`	Groovy Map containing sample information e.g. [ id:'test', single_end:false ]
`input`	`file`	BAM/CRAM file from alignment
`input_index`	`file`	BAI/CRAI file from alignment
`intervals`	`file`	Bed file with the genomic regions included in the library (optional)
`dragstr_model`	`file`	Text file containing the DragSTR model of the used BAM/CRAM file (optional)
`meta2`	`map`	Groovy Map containing reference information e.g. [ id:'test_reference' ]
`fasta`	`file`	The reference fasta file
`meta3`	`map`	Groovy Map containing reference information e.g. [ id:'test_reference' ]
`fai`	`file`	Index of reference fasta file
`meta4`	`map`	Groovy Map containing reference information e.g. [ id:'test_reference' ]
`dict`	`file`	GATK sequence dictionary
`meta5`	`map`	Groovy Map containing dbsnp information e.g. [ id:'test_dbsnp' ]
`dbsnp`	`file`	VCF file containing known sites (optional)
`meta6`	`map`	Groovy Map containing dbsnp information e.g. [ id:'test_dbsnp' ]
`dbsnp_tbi`	`file`	VCF index of dbsnp (optional)

Outputs

Name	Type	Emit	Description
`val(meta), path("*.vcf.gz")`	`tuple`	`vcf`	-
`val(meta), path("*.tbi")`	`tuple`	`tbi`	-
`val(meta), path("*.realigned.bam")`	`tuple`	`bam`	-
`versions.yml`	`path`	`versions`	-

Authors: @suzannejin, @FriederikeHanssen Maintainers: @suzannejin, @FriederikeHanssen

process GATK4_INTERVALLISTTOBED [source] ¶

Defined in modules/nf-core/gatk4/intervallisttobed/main.nf:1

bed conversion gatk4 interval

Converts a Picard IntervalList file to a BED file.

Tools

gatk4

Genome Analysis Toolkit (GATK4)

Homepage Documentation License: BSD-3-clause

Inputs

Name	Type	Description
`meta`	`map`	Groovy Map containing sample information e.g. [ id:'test', single_end:false ]
`intervals`	`file`	IntervalList file

Outputs

Name	Type	Emit	Description
`val(meta)`	`tuple`	-	-

Authors: @FriederikeHanssen Maintainers: @FriederikeHanssen

process GATK4_LEARNREADORIENTATIONMODEL [source] ¶

Defined in modules/nf-core/gatk4/learnreadorientationmodel/main.nf:1

gatk4 learnreadorientationmodel mutect2 readorientationartifacts

Uses f1r2 counts collected during mutect2 to learn the prior probability of read orientation artifacts

Tools

gatk4

Homepage Documentation License: Apache-2.0

Inputs

Name	Type	Description
`meta`	`map`	Groovy Map containing sample information e.g. [ id:'test' ]
`f1r2`	`list`	list of f1r2 files to be used as input.

Outputs

Name	Type	Emit	Description
`val(meta), path("*.tar.gz")`	`tuple`	`artifactprior`	-
`versions.yml`	`path`	`versions`	-

Authors: @GCJMackenzie Maintainers: @GCJMackenzie

process GATK4_MARKDUPLICATES [source] ¶

Defined in modules/nf-core/gatk4/markduplicates/main.nf:1

bam gatk4 markduplicates sort

This tool locates and tags duplicate reads in a BAM or SAM file, where duplicate reads are defined as originating from a single fragment of DNA.

Tools

gatk4

Homepage Documentation License: MIT

Inputs

Name	Type	Description
`meta`	`map`	Groovy Map containing sample information e.g. [ id:'test', single_end:false ]
`bam`	`file`	Sorted BAM file
`fasta`	`file`	Fasta file
`fasta_fai`	`file`	Fasta index file

Outputs

Name	Type	Emit	Description
`val(meta), path("*cram")`	`tuple`	`cram`	-
`val(meta), path("*bam")`	`tuple`	`bam`	-
`val(meta), path("*.crai")`	`tuple`	`crai`	-
`val(meta), path("*.bai")`	`tuple`	`bai`	-
`val(meta), path("*.metrics")`	`tuple`	`metrics`	-
`versions.yml`	`path`	`versions`	-

Authors: @ajodeh-juma, @FriederikeHanssen, @maxulysse Maintainers: @ajodeh-juma, @FriederikeHanssen, @maxulysse

process GATK4_MERGEMUTECTSTATS [source] ¶

Defined in modules/nf-core/gatk4/mergemutectstats/main.nf:1

gatk4 merge mutect2 mutectstats

Merges mutect2 stats generated on different intervals/regions

Tools

gatk4

Genome Analysis Toolkit (GATK4)

Homepage Documentation License: BSD-3-clause

Inputs

Name	Type	Description
`meta`	`map`	Groovy Map containing sample information e.g. [ id:'test', single_end:false ]
`stats`	`file`	Stats file

Outputs

Name	Type	Emit	Description
`val(meta), path("*.vcf.gz.stats")`	`tuple`	`stats`	-
`versions.yml`	`path`	`versions`	-

Authors: @FriederikeHanssen Maintainers: @FriederikeHanssen

process GATK4_MERGEVCFS [source] ¶

Defined in modules/nf-core/gatk4/mergevcfs/main.nf:1

gatk4 merge vcf

Merges several vcf files

Tools

gatk4

Homepage Documentation License: Apache-2.0

Inputs

Name	Type	Description
`meta`	`map`	Groovy Map containing sample information e.g. [ id:'test']
`vcf`	`list`	Two or more VCF files
`meta2`	`map`	Groovy Map containing reference information e.g. [ id:'genome']
`dict`	`file`	Optional Sequence Dictionary as input

Outputs

Name	Type	Emit	Description
`val(meta), path('*.vcf.gz')`	`tuple`	`vcf`	-
`val(meta), path("*.tbi")`	`tuple`	`tbi`	-
`versions.yml`	`path`	`versions`	-

Authors: @kevinmenden Maintainers: @kevinmenden

process GATK4_MUTECT2 [source] ¶

Defined in modules/nf-core/gatk4/mutect2/main.nf:1

gatk4 haplotype indels mutect2 snvs somatic

Call somatic SNVs and indels via local assembly of haplotypes.

Tools

gatk4

Homepage Documentation License: Apache-2.0

Inputs

Name	Type	Description
`meta`	`map`	Groovy Map containing sample information e.g. [ id:'test']
`input`	`list`	list of BAM files, also able to take CRAM as an input
`input_index`	`list`	list of BAM file indexes, also able to take CRAM indexes as an input
`intervals`	`file`	Specify the region the tool is run on.
`meta2`	`map`	Groovy Map containing reference information e.g. [ id:'genome' ]
`fasta`	`file`	The reference fasta file
`meta3`	`map`	Groovy Map containing reference information e.g. [ id:'genome' ]
`fai`	`file`	Index of reference fasta file
`meta4`	`map`	Groovy Map containing reference information e.g. [ id:'genome' ]
`dict`	`file`	GATK sequence dictionary
`germline_resource`	`file`	Population vcf of germline sequencing, containing allele fractions.
`germline_resource_tbi`	`file`	Index file for the germline resource.
`panel_of_normals`	`file`	vcf file to be used as a panel of normals.
`panel_of_normals_tbi`	`file`	Index for the panel of normals.

Outputs

Name	Type	Emit	Description
`val(meta), path("*.vcf.gz")`	`tuple`	`vcf`	-
`val(meta), path("*.tbi")`	`tuple`	`tbi`	-
`val(meta), path("*.stats")`	`tuple`	`stats`	-
`val(meta), path("*.f1r2.tar.gz")`	`tuple`	`f1r2`	-
`versions.yml`	`path`	`versions`	-

Authors: @GCJMackenzie, @ramprasadn Maintainers: @GCJMackenzie, @ramprasadn

process GATK4_VARIANTRECALIBRATOR [source] ¶

Defined in modules/nf-core/gatk4/variantrecalibrator/main.nf:1

gatk4 recalibration model variantrecalibrator

Build a recalibration model to score variant quality for filtering purposes. It is highly recommended to follow GATK best practices when using this module: the Gaussian mixture model requires a large number of samples to produce optimal results (for example, 30 samples for exome data). For more details see https://gatk.broadinstitute.org/hc/en-us/articles/4402736812443-Which-training-sets-arguments-should-I-use-for-running-VQSR-

Tools

gatk4

Homepage Documentation

Inputs

Name	Type	Description
`meta`	`map`	Groovy Map containing sample information e.g. [ id:'test' ]
`vcf`	`file`	input vcf file containing the variants to be recalibrated
`tbi`	`file`	tbi index file matching the input vcf
`resource_vcf`	`file`	all resource vcf files that are used with the corresponding '--resource' label
`resource_tbi`	`file`	all resource tbi files that are used with the corresponding '--resource' label
`labels`	`string`	necessary arguments for GATK VariantRecalibrator. Specified to directly match the resources provided. More information can be found at https://gatk.broadinstitute.org/hc/en-us/articles/5358906115227-VariantRecalibrator
`fasta`	`file`	The reference fasta file
`fai`	`file`	Index of reference fasta file
`dict`	`file`	GATK sequence dictionary
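
As an illustration of what a `labels` entry can look like, the Python sketch below formats one `--resource` argument per resource file. The helper name and the example values (resource name, prior, file name) are assumptions for illustration only; the appropriate resources and priors depend on the data and should be taken from the GATK documentation linked above.

```python
def resource_label(name, vcf, known, training, truth, prior):
    """Format one '--resource:...' label followed by its vcf file."""
    flags = ",".join(
        f"{key}={str(value).lower()}"
        for key, value in (("known", known), ("training", training), ("truth", truth))
    )
    return f"--resource:{name},{flags},prior={prior} {vcf}"


if __name__ == "__main__":
    # Example values are hypothetical.
    print(resource_label("hapmap", "hapmap.vcf.gz",
                         known=False, training=True, truth=True, prior=15.0))
```

Each label has to name a file that is also supplied through `resource_vcf` (with its index in `resource_tbi`), since the labels are specified to directly match the resources provided.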

Outputs

Name	Type	Emit	Description
`val(meta), path("*.recal")`	`tuple`	`recal`	-
`val(meta), path("*.idx")`	`tuple`	`idx`	-
`val(meta), path("*.tranches")`	`tuple`	`tranches`	-
`val(meta), path("*plots.R")`	`tuple`	`plots`	-
`versions.yml`	`path`	`versions`	-

Authors: @GCJMackenzie, @nickhsmith Maintainers: @GCJMackenzie, @nickhsmith

process GATK4SPARK_APPLYBQSR [source] ¶

Defined in modules/nf-core/gatk4spark/applybqsr/main.nf:1

bam base quality score recalibration bqsr cram gatk4spark

Apply base quality score recalibration (BQSR) to a bam file

Tools

gatk4

Homepage Documentation License: Apache-2.0

Inputs

Name	Type	Description
`meta`	`map`	Groovy Map containing sample information e.g. [ id:'test', single_end:false ]
`input`	`file`	BAM/CRAM file from alignment
`input_index`	`file`	BAI/CRAI file from alignment
`bqsr_table`	`file`	Recalibration table from gatk4_baserecalibrator
`intervals`	`file`	Bed file with the genomic regions included in the library (optional)

Outputs

Name	Type	Pattern	Description
`bam`	`file`	`${prefix}.bam`	Recalibrated BAM file
`bai`	`file`	`${prefix}*bai`	Recalibrated BAM index file
`cram`	`file`	`${prefix}.cram`	Recalibrated CRAM file

Authors: @yocra3, @FriederikeHanssen, @maxulysse Maintainers: @yocra3, @FriederikeHanssen, @maxulysse

process GATK4SPARK_BASERECALIBRATOR [source] ¶

Defined in modules/nf-core/gatk4spark/baserecalibrator/main.nf:1

base quality score recalibration table bqsr gatk4spark sort

Generate recalibration table for Base Quality Score Recalibration (BQSR)

Tools

gatk4

Homepage Documentation License: Apache-2.0

Inputs

Name	Type	Description
`meta`	`map`	Groovy Map containing sample information e.g. [ id:'test', single_end:false ]
`input`	`file`	BAM/CRAM file from alignment
`input_index`	`file`	BAI/CRAI file from alignment
`intervals`	`file`	Bed file with the genomic regions included in the library (optional)
`fasta`	`file`	The reference fasta file
`fai`	`file`	Index of reference fasta file
`dict`	`file`	GATK sequence dictionary
`known_sites`	`file`	VCF files with known sites for indels / snps (optional)
`known_sites_tbi`	`file`	Tabix index of the known_sites (optional)

Outputs

Name	Type	Emit	Description
`val(meta), path("*.table")`	`tuple`	`table`	-
`versions.yml`	`path`	`versions`	-

Authors: @yocra3, @FriederikeHanssen, @maxulysse Maintainers: @yocra3, @FriederikeHanssen, @maxulysse

process GATK4SPARK_MARKDUPLICATES [source] ¶

Defined in modules/nf-core/gatk4spark/markduplicates/main.nf:1

bam gatk4spark markduplicates sort

This tool locates and tags duplicate reads in a BAM or SAM file, where duplicate reads are defined as originating from a single fragment of DNA.

Tools

gatk4

Homepage Documentation License: MIT

Inputs

Name	Type	Description
`meta`	`map`	Groovy Map containing sample information e.g. [ id:'test', single_end:false ]
`bam`	`file`	Sorted BAM file
`fasta`	`file`	The reference fasta file
`fasta_fai`	`file`	Index of reference fasta file
`dict`	`file`	GATK sequence dictionary

Outputs

Name	Type	Emit	Description
`val(meta)`	`tuple`	-	-

Authors: @ajodeh-juma, @FriederikeHanssen, @maxulysse, @SusiJo Maintainers: @ajodeh-juma, @FriederikeHanssen, @maxulysse, @SusiJo

process GAWK [source] ¶

Defined in modules/nf-core/gawk/main.nf:1

gawk awk txt text file parsing

If you are like many computer users, you would frequently like to make changes in various text files wherever certain patterns appear, or extract data from parts of certain lines while discarding the rest. The job is easy with awk, especially the GNU implementation gawk.

Tools

gawk

GNU awk

Homepage Documentation License: GPL v3

Inputs

Name	Type	Description
`meta`	`map`	Groovy Map containing sample information e.g. [ id:'test', single_end:false ]
`input`	`file`	The input file. Specify the logic that needs to be executed on this file in `ext.args2` or in the program file. If the files have a `.gz` extension, they will be unzipped using `zcat`.
`program_file`	`file`	Optional file containing logic for awk to execute. If you don't wish to use a file, you can use `ext.args2` to specify the logic.
`disable_redirect_output`	`boolean`	Disable the redirection of awk output to a given file. This is useful if you want to use awk's built-in redirect to write files instead of the shell's redirect.

Outputs

Name	Type	Emit	Description
`val(meta)`	`tuple`	-	-

Authors: @nvnieuwk Maintainers: @nvnieuwk

process GOLEFT_INDEXCOV [source] ¶

Defined in modules/nf-core/goleft/indexcov/main.nf:1

coverage cnv genomics depth

Quickly estimate coverage from a whole-genome bam or cram index. A bam index has 16KB resolution so that's what this gives, but it provides what appears to be a high-quality coverage estimate in seconds per genome.
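
To give a feel for what a 16KB resolution means, the Python sketch below counts the 16,384-base tiles that cover a chromosome and scales a per-tile signal by its median so that typical tiles sit near 1. It assumes, as a simplification, that the amount of indexed data per tile serves as a proxy for depth. The function names and toy numbers are hypothetical, and this is not goleft's code.

```python
import math
import statistics

TILE_SIZE = 16_384  # bases per tile (16KB resolution)


def tile_count(chrom_length):
    """Number of tiles needed to cover a chromosome of the given length."""
    return math.ceil(chrom_length / TILE_SIZE)


def scaled_depth(signal_per_tile):
    """Divide each tile's signal by the median so typical tiles are about 1."""
    median = statistics.median(signal_per_tile)
    if median == 0:
        raise ValueError("median signal is zero; cannot scale")
    return [value / median for value in signal_per_tile]


if __name__ == "__main__":
    print(tile_count(1_000_000))
    print(scaled_depth([100, 100, 200, 100]))
```

A tile whose scaled value is close to 2 or 0.5 hints at a gain or loss at that resolution, which is why the output is useful for a quick coverage or CNV check rather than base-level analysis.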

Tools

goleft

goleft is a collection of bioinformatics tools distributed under MIT license in a single static binary

Homepage Documentation License: MIT

Inputs

Name	Type	Description
`meta`	`map`	Groovy Map containing sample information e.g. [ id:'test', single_end:false]
`bams`	`file`	Sorted BAM/CRAM/SAM files
`indexes`	`file`	BAI/CRAI files
`meta2`	`map`	Groovy Map containing sample information e.g. [ id:'test', single_end:false]
`fai`	`file`	FASTA index

Outputs

Name	Type	Emit	Description
`val(meta)`	`tuple`	-	-

Authors: @lindenb Maintainers: @lindenb

process GUNZIP [source] ¶

Defined in modules/nf-core/gunzip/main.nf:1

gunzip compression decompression

Compresses and decompresses files.

Tools

gunzip

gzip is a file format and a software application used for file compression and decompression.

Documentation License: GPL-3.0-or-later

Inputs

Name	Type	Description
`meta`	`map`	Optional groovy Map containing meta information e.g. [ id:'test', single_end:false ]
`archive`	`file`	File to be compressed/uncompressed

Outputs

Name	Type	Emit	Description
`val(meta)`	`tuple`	-	-

Authors: @joseespinosa, @drpatelh, @jfy133 Maintainers: @joseespinosa, @drpatelh, @jfy133, @gallvp

process LOFREQ_CALLPARALLEL [source] ¶

Defined in modules/nf-core/lofreq/callparallel/main.nf:1

variant calling low frequency variant calling call variants

It predicts variants using multiple processors

Tools

lofreq

Lofreq is a fast and sensitive variant-caller for inferring SNVs and indels from next-generation sequencing data. Its call-parallel programme predicts variants using multiple processors.

Homepage Documentation biotools:lofreq License: MIT

Inputs

Name	Type	Description
`meta`	`map`	Groovy Map containing sample information e.g. [ id:'test' ]
`bam`	`file`	Tumor sample sorted BAM file
`bai`	`file`	BAM index file
`intervals`	`file`	BED file containing target regions for variant calling
`meta2`	`map`	Groovy Map containing sample information about the reference fasta e.g. [ id:'reference' ]
`fasta`	`file`	Reference genome FASTA file
`meta3`	`map`	Groovy Map containing sample information about the reference fasta fai e.g. [ id:'reference' ]
`fai`	`file`	Reference genome FASTA index file

Outputs

Name	Type	Emit	Description
`val(meta), path("*.vcf.gz")`	`tuple`	`vcf`	-
`val(meta), path("*.vcf.gz.tbi")`	`tuple`	`tbi`	-
`versions.yml`	`path`	`versions`	-

Authors: @kaurravneet4123, @bjohnnyd Maintainers: @kaurravneet4123, @bjohnnyd, @nevinwu, @AitorPeseta

process MANTA_GERMLINE [source] ¶

Defined in modules/nf-core/manta/germline/main.nf:1

somatic wgs wxs panel vcf structural variants small indels

Manta calls structural variants (SVs) and indels from mapped paired-end sequencing reads. It is optimized for analysis of germline variation in small sets of individuals and somatic variation in tumor/normal sample pairs.

Tools

manta

Structural variant and indel caller for mapped sequencing data

Homepage Documentation biotools:manta_sv License: GPL v3

Inputs

Name	Type	Description
`meta`	`map`	Groovy Map containing sample information e.g. [ id:'test', single_end:false ]
`input`	`file`	BAM/CRAM/SAM file. For joint calling use a list of files.
`index`	`file`	BAM/CRAM/SAM index file. For joint calling use a list of files.
`target_bed`	`file`	BED file containing target regions for variant calling
`target_bed_tbi`	`file`	Index for BED file containing target regions for variant calling
`meta2`	`map`	Groovy Map containing reference information e.g. [ id:'genome' ]
`fasta`	`file`	Genome reference FASTA file
`meta3`	`map`	Groovy Map containing reference information e.g. [ id:'genome' ]
`fai`	`file`	Genome reference FASTA index file

Outputs

Name	Type	Pattern	Description
`candidate_small_indels_vcf`	`file`	`*.{vcf.gz}`	Gzipped VCF file containing variants
`candidate_small_indels_vcf_tbi`	`file`	`*.{vcf.gz.tbi}`	Index for gzipped VCF file containing variants
`candidate_sv_vcf`	`file`	`*.{vcf.gz}`	Gzipped VCF file containing variants
`candidate_sv_vcf_tbi`	`file`	`*.{vcf.gz.tbi}`	Index for gzipped VCF file containing variants
`diploid_sv_vcf`	`file`	`*.{vcf.gz}`	Gzipped VCF file containing variants
`diploid_sv_vcf_tbi`	`file`	`*.{vcf.gz.tbi}`	Index for gzipped VCF file containing variants

Authors: @maxulysse, @ramprasadn, @nvnieuwk Maintainers: @maxulysse, @ramprasadn, @nvnieuwk

process MANTA_SOMATIC [source] ¶

Defined in modules/nf-core/manta/somatic/main.nf:1

somatic wgs wxs panel vcf structural variants small indels

Tools

manta

Structural variant and indel caller for mapped sequencing data

Homepage Documentation biotools:manta_sv License: GPL v3

Inputs

Name	Type	Description
`meta`	`map`	Groovy Map containing sample information e.g. [ id:'test', single_end:false ]
`input_normal`	`file`	BAM/CRAM/SAM file
`input_index_normal`	`file`	BAM/CRAM/SAM index file
`input_tumor`	`file`	BAM/CRAM/SAM file
`input_index_tumor`	`file`	BAM/CRAM/SAM index file
`target_bed`	`file`	BED file containing target regions for variant calling
`target_bed_tbi`	`file`	Index for BED file containing target regions for variant calling
`meta2`	`map`	Groovy Map containing reference information e.g. [ id:'genome' ]
`fasta`	`file`	Genome reference FASTA file
`meta3`	`map`	Groovy Map containing reference information e.g. [ id:'genome' ]
`fai`	`file`	Genome reference FASTA index file

Outputs

Name	Type	Pattern	Description
`candidate_small_indels_vcf`	`file`	`*.{vcf.gz}`	Gzipped VCF file containing candidate small indels
`candidate_small_indels_vcf_tbi`	`file`	`*.{vcf.gz.tbi}`	Index for the gzipped VCF file containing candidate small indels
`candidate_sv_vcf`	`file`	`*.{vcf.gz}`	Gzipped VCF file containing candidate structural variants
`candidate_sv_vcf_tbi`	`file`	`*.{vcf.gz.tbi}`	Index for the gzipped VCF file containing candidate structural variants
`diploid_sv_vcf`	`file`	`*.{vcf.gz}`	Gzipped VCF file containing diploid structural variant calls
`diploid_sv_vcf_tbi`	`file`	`*.{vcf.gz.tbi}`	Index for the gzipped VCF file containing diploid structural variant calls
`somatic_sv_vcf`	`file`	`*.{vcf.gz}`	Gzipped VCF file containing somatic structural variant calls
`somatic_sv_vcf_tbi`	`file`	`*.{vcf.gz.tbi}`	Index for the gzipped VCF file containing somatic structural variant calls

Authors: @FriederikeHanssen, @nvnieuwk Maintainers: @FriederikeHanssen, @nvnieuwk

process MANTA_TUMORONLY [source] ¶

Defined in modules/nf-core/manta/tumoronly/main.nf:1

somatic wgs wxs panel vcf structural variants small indels

Tools

manta

Structural variant and indel caller for mapped sequencing data

Homepage Documentation biotools:manta_sv License: GPL v3

Inputs

Name	Type	Description
`meta`	`map`	Groovy Map containing sample information e.g. [ id:'test', single_end:false ]
`input`	`file`	BAM/CRAM/SAM file
`input_index`	`file`	BAM/CRAM/SAM index file
`target_bed`	`file`	BED file containing target regions for variant calling
`target_bed_tbi`	`file`	Index for BED file containing target regions for variant calling
`meta2`	`map`	Groovy Map containing reference information e.g. [ id:'genome' ]
`fasta`	`file`	Genome reference FASTA file
`meta3`	`map`	Groovy Map containing reference information e.g. [ id:'genome' ]
`fai`	`file`	Genome reference FASTA index file

Outputs

Name	Type	Pattern	Description
`candidate_small_indels_vcf`	`file`	`*.{vcf.gz}`	Gzipped VCF file containing candidate small indels
`candidate_small_indels_vcf_tbi`	`file`	`*.{vcf.gz.tbi}`	Index for the gzipped VCF file containing candidate small indels
`candidate_sv_vcf`	`file`	`*.{vcf.gz}`	Gzipped VCF file containing candidate structural variants
`candidate_sv_vcf_tbi`	`file`	`*.{vcf.gz.tbi}`	Index for the gzipped VCF file containing candidate structural variants
`tumor_sv_vcf`	`file`	`*.{vcf.gz}`	Gzipped VCF file containing tumor structural variant calls
`tumor_sv_vcf_tbi`	`file`	`*.{vcf.gz.tbi}`	Index for the gzipped VCF file containing tumor structural variant calls

Authors: @maxulysse, @nvnieuwk Maintainers: @maxulysse, @nvnieuwk

process MOSDEPTH [source] ¶

Defined in modules/nf-core/mosdepth/main.nf:1

mosdepth bam cram coverage

Calculates genome-wide sequencing coverage.

Tools

mosdepth

Fast BAM/CRAM depth calculation for WGS, exome, or targeted sequencing.

Documentation biotools:mosdepth License: MIT

Inputs

Name	Type	Description
`meta`	`map`	Groovy Map containing sample information e.g. [ id:'test', single_end:false ]
`bam`	`file`	Input BAM/CRAM file
`bai`	`file`	Index for BAM/CRAM file
`bed`	`file`	BED file with intersected intervals
`meta2`	`map`	Groovy Map containing bed information e.g. [ id:'test' ]
`fasta`	`file`	Reference genome FASTA file

Outputs

Name	Type	Emit	Description
`val(meta), path('*.global.dist.txt')`	`tuple`	`global_txt`	-
`val(meta), path('*.summary.txt')`	`tuple`	`summary_txt`	-
`val(meta), path('*.region.dist.txt')`	`tuple`	`regions_txt`	-
`val(meta), path('*.per-base.d4')`	`tuple`	`per_base_d4`	-
`val(meta), path('*.per-base.bed.gz')`	`tuple`	`per_base_bed`	-
`val(meta), path('*.per-base.bed.gz.csi')`	`tuple`	`per_base_csi`	-
`val(meta), path('*.regions.bed.gz')`	`tuple`	`regions_bed`	-
`val(meta), path('*.regions.bed.gz.csi')`	`tuple`	`regions_csi`	-
`val(meta), path('*.quantized.bed.gz')`	`tuple`	`quantized_bed`	-
`val(meta), path('*.quantized.bed.gz.csi')`	`tuple`	`quantized_csi`	-
`val(meta), path('*.thresholds.bed.gz')`	`tuple`	`thresholds_bed`	-
`val(meta), path('*.thresholds.bed.gz.csi')`	`tuple`	`thresholds_csi`	-
`versions.yml`	`path`	`versions`	-

Authors: @joseespinosa, @drpatelh, @ramprasadn, @matthdsm Maintainers: @joseespinosa, @drpatelh, @ramprasadn, @matthdsm
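
The `summary_txt` output of MOSDEPTH can be inspected with a few lines of Python. The sketch below assumes the tab-separated layout that mosdepth normally writes to `*.summary.txt` (columns `chrom`, `length`, `bases`, `mean`, `min`, `max`, with a final `total` row). The sample text and the function name `mean_coverage_by_chrom` are made up for illustration and are not part of the pipeline.

```python
import csv
import io

# Illustrative content only; real values come from a *.summary.txt file.
SAMPLE_SUMMARY = (
    "chrom\tlength\tbases\tmean\tmin\tmax\n"
    "chr1\t1000\t30000\t30.00\t0\t85\n"
    "chr2\t500\t10000\t20.00\t0\t60\n"
    "total\t1500\t40000\t26.67\t0\t85\n"
)


def mean_coverage_by_chrom(handle):
    """Return {chrom: mean depth} from a mosdepth-style summary table."""
    reader = csv.DictReader(handle, delimiter="\t")
    return {row["chrom"]: float(row["mean"]) for row in reader}


# In practice, pass an open file handle instead of the in-memory sample.
coverage = mean_coverage_by_chrom(io.StringIO(SAMPLE_SUMMARY))
```

For a real run, replace the in-memory sample with `open("<sample>.mosdepth.summary.txt")`, adjusting the file name to whatever prefix the pipeline used.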

process MSISENSOR2_MSI [source] ¶

Defined in modules/nf-core/msisensor2/msi/main.nf:1

msi microsatellite microsatellite instability tumor cfDNA

msisensor2 detection of MSI regions.

Tools

msisensor2

MSIsensor2 is a novel machine-learning-based algorithm, featuring a large upgrade in microsatellite instability (MSI) detection for tumor-only sequencing data, including cell-free DNA (cfDNA), formalin-fixed paraffin-embedded (FFPE) and other sample types. The original MSIsensor is specially designed for tumor/normal paired sequencing data.

Homepage Documentation

Inputs

Name	Type	Description
`meta`	`map`	Groovy Map containing sample information e.g. [ id:'test', single_end:false ]
`tumor_bam`	`file`	BAM/CRAM/SAM file
`tumor_bam_index`	`file`	BAM/CRAM/SAM index file
`meta2`	`map`	Groovy Map containing sample information e.g. [ id:'test', single_end:false ]
`models`	`file`	Folder of MSIsensor2 models (available from GitHub or as a product of msisensor2/scan)

Outputs

Name	Type	Pattern	Description
`msi`	`file`	-	MSI classifications as a text file
`distribution`	`file`	-	Read count distributions of MSI regions
`somatic`	`file`	-	Somatic MSI regions detected.

Authors: @adamrtalbot Maintainers: @adamrtalbot

process MSISENSORPRO_MSISOMATIC [source] ¶

Defined in modules/nf-core/msisensorpro/msisomatic/main.nf:1

micro-satellite-scan msisensor-pro msi somatic

MSIsensor-pro evaluates microsatellite instability (MSI) for cancer patients using next-generation sequencing data. It accepts whole-genome sequencing, whole-exome sequencing and target region (panel) sequencing data as input.

Tools

msisensorpro

Microsatellite Instability (MSI) detection using high-throughput sequencing data.

Homepage Documentation License: Custom Licence

Inputs

Name	Type	Description
`meta`	`map`	Groovy Map containing sample information e.g. [ id:'test', single_end:false ]
`normal`	`file`	BAM/CRAM/SAM file
`normal_index`	`file`	BAM/CRAM/SAM index file
`tumor`	`file`	BAM/CRAM/SAM file
`tumor_index`	`file`	BAM/CRAM/SAM index file
`intervals`	`file`	BED file containing interval information (optional)
`meta2`	`map`	Groovy Map containing genome information e.g. [ id:'genome' ]
`fasta`	`file`	Reference genome

Outputs

Name	Type	Pattern	Description
`output_report`	`file`	-	File containing final report with all detected microsatellites, unstable somatic microsatellites, msi score
`output_dis`	`file`	-	File containing distribution results
`output_germline`	`file`	-	File containing germline results
`output_somatic`	`file`	-	File containing somatic results

Authors: @FriederikeHanssen Maintainers: @FriederikeHanssen

process MSISENSORPRO_SCAN [source] ¶

Defined in modules/nf-core/msisensorpro/scan/main.nf:1

micro-satellite-scan msisensor-pro scan

Tools

msisensorpro

Microsatellite Instability (MSI) detection using high-throughput sequencing data.

Homepage Documentation License: Custom Licence

Inputs

Name	Type	Description
`meta`	`map`	Groovy Map containing sample information e.g. [ id:'test', single_end:false ]
`fasta`	`file`	Reference genome

Outputs

Name	Type	Pattern	Description
`list`	`file`	`*.{list}`	File containing microsatellite list

Authors: @FriederikeHanssen Maintainers: @FriederikeHanssen

process MULTIQC [source] ¶

Defined in modules/nf-core/multiqc/main.nf:1

QC bioinformatics tools Beautiful stand-alone HTML report

Aggregate results from bioinformatics analyses across many samples into a single report

Tools

multiqc

MultiQC searches a given directory for analysis logs and compiles an HTML report. It's a general-use tool, perfect for summarising the output from numerous bioinformatics tools.

Homepage Documentation biotools:multiqc License: GPL-3.0-or-later

Outputs

Name	Type	Pattern	Description
`report`	`-`	-	-
`data`	`-`	-	-
`plots`	`-`	-	-

Authors: @abhi18av, @bunop, @drpatelh, @jfy133 Maintainers: @abhi18av, @bunop, @drpatelh, @jfy133

process MUSE_CALL [source] ¶

Defined in modules/nf-core/muse/call/main.nf:1

variant calling somatic wgs wxs vcf

Pre-filters and calculates position-specific summary statistics using the Markov substitution model

Tools

MuSE

Somatic point mutation caller based on Markov substitution model for molecular evolution

Homepage Documentation License: https://github.com/danielfan/MuSE/blob/master/LICENSE

Inputs

Name	Type	Description
`meta`	`map`	Groovy Map containing sample information e.g. `[ id:'sample1' ]`
`tumor_bam`	`file`	Sorted tumor BAM file
`tumor_bai`	`file`	Index file for the tumor BAM file
`normal_bam`	`file`	Sorted matched normal BAM file
`normal_bai`	`file`	Index file for the normal BAM file
`meta2`	`map`	Groovy Map containing reference information. e.g. `[ id:'test' ]`
`reference`	`file`	reference genome file

Outputs

Name	Type	Pattern	Description
`txt`	`file`	`*.MuSE.txt`	position-specific summary statistics

Authors: @famosab Maintainers: @famosab

process MUSE_SUMP [source] ¶

Defined in modules/nf-core/muse/sump/main.nf:1

variant calling somatic wgs wxs vcf

Computes tier-based cutoffs from a sample-specific error model which is generated by muse/call and reports the finalized variants

Tools

MuSE

Somatic point mutation caller based on Markov substitution model for molecular evolution

Homepage Documentation License: https://github.com/danielfan/MuSE/blob/master/LICENSE

Inputs

Name	Type	Description
`meta`	`map`	Groovy Map containing sample information e.g. `[ id:'sample1', single_end:false ]`
`muse_call_txt`	`file`	single input file generated by 'MuSE call'
`meta2`	`map`	Groovy Map containing reference information. e.g. `[ id:'test' ]`
`ref_vcf`	`file`	dbSNP vcf file that should be bgzip compressed, tabix indexed and based on the same reference genome used in 'MuSE call'
`ref_vcf_tbi`	`file`	Tabix index for the dbSNP vcf file

Outputs

Name	Type	Pattern	Description
`vcf`	`file`	`*.vcf.gz`	bgzipped VCF file with called variants
`tbi`	`file`	`*.vcf.gz.tbi`	tabix index of the bgzipped VCF file with called variants

Authors: @famosab Maintainers: @famosab
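
MUSE_CALL and MUSE_SUMP form a two-step chain: the `*.MuSE.txt` file emitted by the first step is the `muse_call_txt` input of the second. The sketch below only builds the two command lines as argument lists to make that hand-off explicit. The flags follow the MuSE README as commonly documented (`-f`, `-O`, `-I`, `-D`, and `-G`/`-E` for WGS/WES) and should be treated as assumptions. The nf-core modules may pass different or additional arguments, and the function name `muse_commands` is hypothetical.

```python
def muse_commands(prefix, tumor_bam, normal_bam, reference, dbsnp_vcf, wgs=True):
    """Build the two MuSE command lines as argument lists (nothing is executed)."""
    # Step 1 writes <prefix>.MuSE.txt, matching the `*.MuSE.txt` output pattern.
    call_txt = f"{prefix}.MuSE.txt"
    call = ["MuSE", "call", "-f", reference, "-O", prefix, tumor_bam, normal_bam]

    # Step 2 reads that file; -G is assumed for WGS data and -E for exome data.
    data_type = "-G" if wgs else "-E"
    sump = [
        "MuSE", "sump",
        "-I", call_txt,
        data_type,
        "-O", f"{prefix}.vcf",
        "-D", dbsnp_vcf,
    ]
    return call, sump


call_cmd, sump_cmd = muse_commands(
    "sample1", "tumor.bam", "normal.bam", "genome.fa", "dbsnp.vcf.gz"
)
```

Note that the dbSNP file must be bgzip-compressed and tabix-indexed, as stated in the `ref_vcf` description above.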

process NGSCHECKMATE_NCM [source] ¶

Defined in modules/nf-core/ngscheckmate/ncm/main.nf:1

ngscheckmate matching snp

Determines whether sequencing data come from the same individual by using SNP matching. Designed for human data in VCF or BAM files.

Tools

ngscheckmate

NGSCheckMate is a software package for identifying next generation sequencing (NGS) data files from the same individual, including matching between DNA and RNA.

Homepage Documentation biotools:ngscheckmate License: MIT

Inputs

Name	Type	Description
`meta`	`map`	Groovy Map containing sample information e.g. [ id:'test']
`files`	`file`	VCF or BAM files for each sample, in a merged channel (possibly gzipped). BAM files require an index too.
`meta2`	`map`	Groovy Map containing SNP information e.g. [ id:'test' ]
`snp_bed`	`file`	BED file containing the SNPs to analyse
`meta3`	`map`	Groovy Map containing reference fasta index information e.g. [ id:'test' ]
`fasta`	`file`	fasta file for the genome, only used in the bam mode

Outputs

Name	Type	Emit	Description
`val(meta), path("*_corr_matrix.txt")`	`tuple`	`corr_matrix`	-
`val(meta), path("*_matched.txt")`	`tuple`	`matched`	-
`val(meta), path("*_all.txt")`	`tuple`	`all`	-
`val(meta), path("*.pdf")`	`tuple`	`pdf`	-
`val(meta), path("*.vcf")`	`tuple`	`vcf`	-
`versions.yml`	`path`	`versions`	-

Authors: @sppearce Maintainers: @sppearce

process PARABRICKS_FQ2BAM [source] ¶

Defined in modules/nf-core/parabricks/fq2bam/main.nf:1

align sort bqsr duplicates

NVIDIA Clara Parabricks GPU-accelerated alignment, sorting, BQSR calculation, and duplicate marking. Note this nf-core module requires files to be copied into the working directory and not symlinked.

Tools

parabricks

NVIDIA Clara Parabricks GPU-accelerated genomics tools

Homepage Documentation License: custom

Inputs

Name	Type	Description
`meta`	`map`	Groovy Map containing sample information e.g. [ id:'test', single_end:false ]
`reads`	`file`	fastq.gz files
`meta2`	`map`	Groovy Map containing fasta information
`fasta`	`file`	reference fasta file - must be unzipped
`meta3`	`map`	Groovy Map containing index information
`index`	`file`	reference BWA index
`meta4`	`map`	Groovy Map containing index information
`interval_file`	`file`	(optional) file(s) containing genomic intervals for use in base quality score recalibration (BQSR)
`meta5`	`map`	Groovy Map containing known sites information
`known_sites`	`file`	(optional) known sites file(s) for calculating BQSR. markdups must be true to perform BQSR.

Outputs

Name	Type	Pattern	Description
`bam`	`file`	`*.bam`	Sorted BAM file
`bai`	`file`	`*.bai`	index corresponding to sorted BAM file
`cram`	`file`	`*.cram`	Sorted CRAM file
`crai`	`file`	`*.crai`	index corresponding to sorted CRAM file
`bqsr_table`	`file`	`*.table`	(optional) table from base quality score recalibration calculation, to be used with parabricks/applybqsr
`qc_metrics`	`directory`	`*_qc_metrics`	(optional) directory of QC metrics
`duplicate_metrics`	`file`	`*.duplicate-metrics.txt`	(optional) metrics calculated from marking duplicates in the bam file
`compatible_versions`	`-`	-	-

Authors: @bsiranosian, @adamrtalbot Maintainers: @bsiranosian, @adamrtalbot, @gallvp, @famosab

process RBT_VCFSPLIT [source] ¶

Defined in modules/nf-core/rbt/vcfsplit/main.nf:1

genomics splitting VCF BCF variants

A tool for splitting VCF/BCF files into N equal chunks, including BND support

Tools

rust-bio-tools

A growing collection of fast and secure command line utilities for dealing with NGS data implemented on top of Rust-Bio.

Homepage Documentation License: MIT

Inputs

Name	Type	Description
`meta`	`map`	Groovy Map containing sample information e.g. `[ id:'sample1' ]`
`vcf`	`file`	VCF file with variants to be split

Outputs

Name	Type	Pattern	Description
`bcfchunks`	`file`	`*.bcf`	Chunks of the input VCF file, split into `numchunks` equal parts.

Authors: @famosab Maintainers: @famosab
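
The idea of splitting into near-equal chunks with BND support can be illustrated as follows. This is a simplified sketch, not rust-bio-tools' actual algorithm: it assumes each record carries an optional event identifier shared by breakend mates, and it greedily assigns whole groups to the currently smallest chunk so that mates are never separated. The record tuples and the function name `split_records` are hypothetical.

```python
def split_records(records, numchunks):
    """Split records into `numchunks` near-equal chunks, keeping mates together.

    Each record is a (record_id, event_id) pair. Records sharing a non-None
    event_id (for example two breakend mates) always land in the same chunk.
    """
    groups = {}
    order = []
    for record_id, event_id in records:
        # Records without an event form a group of their own.
        key = event_id if event_id is not None else ("single", record_id)
        if key not in groups:
            groups[key] = []
            order.append(key)
        groups[key].append(record_id)

    chunks = [[] for _ in range(numchunks)]
    for key in order:
        # Greedy assignment: place the next group into the smallest chunk.
        smallest = min(range(numchunks), key=lambda i: len(chunks[i]))
        chunks[smallest].extend(groups[key])
    return chunks


records = [
    ("snv1", None),
    ("bnd_a", "ev1"),
    ("snv2", None),
    ("bnd_b", "ev1"),
    ("snv3", None),
    ("snv4", None),
]
chunks = split_records(records, 2)
```

Because groups are kept whole, chunk sizes can differ slightly when a large event does not divide evenly across chunks.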

process SAMTOOLS_BAM2FQ [source] ¶

Defined in modules/nf-core/samtools/bam2fq/main.nf:1

bam2fq samtools fastq

The module uses bam2fq method from samtools to convert a SAM, BAM or CRAM file to FASTQ format

Tools

samtools

Tools for dealing with SAM, BAM and CRAM files

Documentation biotools:samtools License: MIT

Inputs

Name	Type	Description
`meta`	`map`	Groovy Map containing sample information e.g. [ id:'test', single_end:false ]
`inputbam`	`file`	BAM/CRAM/SAM file
`split`	`boolean`	TRUE/FALSE value indicating whether reads should be separated into /1 and /2 files plus, if present, other and singleton reads. Note: TRUE generates four files; FALSE produces a single file, which is interleaved if the input contains paired reads.

Outputs

Name	Type	Emit	Description
`val(meta), path("*.fq.gz")`	`tuple`	`reads`	-
`versions.yml`	`path`	`versions`	-

Authors: @lescai Maintainers: @lescai
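
The effect of the `split` input of SAMTOOLS_BAM2FQ on the number of emitted files can be summarised in a small helper. The file-name suffixes below are hypothetical, chosen only to match the `*.fq.gz` output pattern. The module's real naming may differ.

```python
def bam2fq_outputs(prefix, split):
    """Return the FASTQ files expected for a given `split` setting."""
    if split:
        # Four files: read 1, read 2, other reads and singletons.
        parts = ("1", "2", "other", "singleton")
        return [f"{prefix}_{part}.fq.gz" for part in parts]
    # One file, interleaved when the input contains paired reads.
    return [f"{prefix}_interleaved.fq.gz"]


split_files = bam2fq_outputs("test", split=True)
single_file = bam2fq_outputs("test", split=False)
```

Downstream steps that expect paired FASTQ files should therefore check which mode was used before indexing into the `reads` channel.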

process SAMTOOLS_COLLATEFASTQ [source] ¶

Defined in modules/nf-core/samtools/collatefastq/main.nf:1

bam2fq samtools fastq

The module uses collate and then fastq methods from samtools to convert a SAM, BAM or CRAM file to FASTQ format

Tools

samtools

Tools for dealing with SAM, BAM and CRAM files

Documentation biotools:samtools License: MIT

Inputs

Name	Type	Description
`meta`	`map`	Groovy Map containing sample information e.g. [ id:'test', single_end:false ]
`input`	`file`	BAM/CRAM/SAM file
`meta2`	`map`	Groovy Map containing reference information e.g. [ id:'test' ]
`fasta`	`file`	Reference genome fasta file
`interleave`	`boolean`	If true, the output is a single interleaved paired-end FASTQ. If false, the output is split paired-end FASTQ.

Outputs

Name	Type	Emit	Description
`val(meta)`	`tuple`	-	-

Authors: @lescai, @maxulysse, @matthdsm Maintainers: @lescai, @maxulysse, @matthdsm

process SAMTOOLS_CONVERT [source] ¶

Defined in modules/nf-core/samtools/convert/main.nf:1

view index bam cram

Converts a CRAM file to BAM, or a BAM file to CRAM, and then indexes the result

Tools

samtools

SAMtools is a set of utilities for interacting with and post-processing short DNA sequence read alignments in the SAM, BAM and CRAM formats, written by Heng Li. These files are generated as output by short read aligners like BWA.

Homepage Documentation biotools:samtools License: MIT

Inputs

Name	Type	Description
`meta`	`map`	Groovy Map containing sample information e.g. [ id:'test', single_end:false ]
`input`	`file`	BAM/CRAM file
`index`	`file`	BAM/CRAM index file
`meta2`	`map`	Groovy Map containing sample information e.g. [ id:'test', single_end:false ]
`fasta`	`file`	Reference file to create the CRAM file
`meta3`	`map`	Groovy Map containing sample information e.g. [ id:'test', single_end:false ]
`fai`	`file`	Reference index file to create the CRAM file

Outputs

Name	Type	Emit	Description
`val(meta), path("*.bam")`	`tuple`	`bam`	-
`val(meta), path("*.cram")`	`tuple`	`cram`	-
`val(meta), path("*.bai")`	`tuple`	`bai`	-
`val(meta), path("*.crai")`	`tuple`	`crai`	-
`versions.yml`	`path`	`versions`	-

Authors: @FriederikeHanssen, @maxulysse Maintainers: @FriederikeHanssen, @maxulysse, @matthdsm

process SAMTOOLS_FAIDX [source] ¶

Defined in modules/nf-core/samtools/faidx/main.nf:1

index fasta faidx chromosome

Index FASTA file, and optionally generate a file of chromosome sizes

Tools

samtools

Homepage Documentation biotools:samtools License: MIT

Inputs

Name	Type	Description
`meta`	`map`	Groovy Map containing reference information e.g. [ id:'test' ]
`fasta`	`file`	FASTA file
`meta2`	`map`	Groovy Map containing reference information e.g. [ id:'test' ]
`fai`	`file`	FASTA index file

Outputs

Name	Type	Pattern	Description
`fa`	`file`	`*.{fa}`	FASTA file
`sizes`	`file`	`*.{sizes}`	File containing chromosome lengths
`fai`	`file`	`*.{fai}`	FASTA index file
`gzi`	`file`	`*.gzi`	Optional gzip index file for compressed inputs

Authors: @drpatelh, @ewels, @phue Maintainers: @maxulysse, @phue

process SAMTOOLS_INDEX [source] ¶

Defined in modules/nf-core/samtools/index/main.nf:1

index bam sam cram

Index SAM/BAM/CRAM file

Tools

samtools

Homepage Documentation biotools:samtools License: MIT

Inputs

Name	Type	Description
`meta`	`map`	Groovy Map containing sample information e.g. [ id:'test', single_end:false ]
`input`	`file`	input file

Outputs

Name	Type	Emit	Description
`val(meta), path("*.bai")`	`tuple`	`bai`	-
`val(meta), path("*.csi")`	`tuple`	`csi`	-
`val(meta), path("*.crai")`	`tuple`	`crai`	-
`versions.yml`	`path`	`versions`	-

Authors: @drpatelh, @ewels, @maxulysse Maintainers: @drpatelh, @ewels, @maxulysse

process SAMTOOLS_MERGE [source] ¶

Defined in modules/nf-core/samtools/merge/main.nf:1

merge bam sam cram

Merge BAM or CRAM file

Tools

samtools

Homepage Documentation biotools:samtools License: MIT

Inputs

Name	Type	Description
`meta`	`map`	Groovy Map containing sample information e.g. [ id:'test', single_end:false ]
`input_files`	`file`	BAM/CRAM file
`meta2`	`map`	Groovy Map containing reference information e.g. [ id:'genome' ]
`fasta`	`file`	Reference file the CRAM was created with (optional)
`meta3`	`map`	Groovy Map containing reference information e.g. [ id:'genome' ]
`fai`	`file`	Index of the reference file the CRAM was created with (optional)

Outputs

Name	Type	Emit	Description
`val(meta)`	`tuple`	-	-

Authors: @drpatelh, @yuukiiwa , @maxulysse, @FriederikeHanssen, @ramprasadn Maintainers: @drpatelh, @yuukiiwa , @maxulysse, @FriederikeHanssen, @ramprasadn

process SAMTOOLS_MPILEUP [source] ¶

Defined in modules/nf-core/samtools/mpileup/main.nf:1

mpileup bam sam cram

Generates a pileup from a BAM/CRAM/SAM file

Tools

samtools

Homepage Documentation biotools:samtools License: MIT

Inputs

Name	Type	Description
`meta`	`map`	Groovy Map containing sample information e.g. [ id:'test' ]
`input`	`file`	BAM/CRAM/SAM file
`intervals`	`file`	Interval file
`meta2`	`map`	Groovy Map containing sample information e.g. [ id:'test' ]
`fasta`	`file`	FASTA reference file

Outputs

Name	Type	Emit	Description
`val(meta), path("*.mpileup.gz")`	`tuple`	`mpileup`	-
`versions.yml`	`path`	`versions`	-

Authors: @drpatelh, @joseespinosa Maintainers: @drpatelh, @joseespinosa

process SAMTOOLS_REINDEX_BAM [source] ¶

Defined in modules/local/samtools/reindex_bam/main.nf:5

The aim of this process is to re-index the BAM file without duplicate, supplementary, unmapped and similar reads, for goleft/indexcov. It creates a BAM containing only a header (so indexcov can get the sample name) and a BAM index from which low-quality reads, supplementary reads, etc. have been removed.

Inputs

Name	Type	Description
`val(meta), path(input), path(input_index)`	`tuple`	-
`val(meta2), path(fasta)`	`tuple`	-
`val(meta3), path(fai)`	`tuple`	-

Outputs

Name	Type	Emit	Description
`val(meta)`	`tuple`	-	-
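
The read filtering described for SAMTOOLS_REINDEX_BAM can be expressed with SAM FLAG bits. The sketch below assumes an exclusion mask covering unmapped, secondary, QC-failed, duplicate and supplementary reads, plus a minimum mapping quality of 30. Both the mask and the threshold are assumptions for illustration, and the module's actual settings may differ.

```python
# Standard SAM FLAG bits, as defined in the SAM specification.
UNMAPPED = 0x4
SECONDARY = 0x100
QC_FAIL = 0x200
DUPLICATE = 0x400
SUPPLEMENTARY = 0x800

# Combined mask of read categories to drop (an assumed choice).
EXCLUDE_MASK = UNMAPPED | SECONDARY | QC_FAIL | DUPLICATE | SUPPLEMENTARY


def keep_read(flag, mapq, min_mapq=30):
    """Keep a read only if no excluded bit is set and MAPQ is high enough."""
    return (flag & EXCLUDE_MASK) == 0 and mapq >= min_mapq
```

The combined mask equals 3844 in decimal, which is the form usually passed to an exclude-flags option of an alignment filter.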

process SAMTOOLS_STATS [source] ¶

Defined in modules/nf-core/samtools/stats/main.nf:1

statistics counts bam sam cram

Produces comprehensive statistics from SAM/BAM/CRAM file

Tools

samtools

Homepage Documentation biotools:samtools License: MIT

Inputs

Name	Type	Description
`meta`	`map`	Groovy Map containing sample information e.g. [ id:'test', single_end:false ]
`input`	`file`	BAM/CRAM file from alignment
`input_index`	`file`	BAI/CRAI file from alignment
`meta2`	`map`	Groovy Map containing reference information e.g. [ id:'genome' ]
`fasta`	`file`	Reference file the CRAM was created with (optional)

Outputs

Name	Type	Emit	Description
`val(meta), path("*.stats")`	`tuple`	`stats`	-
`versions.yml`	`path`	`versions`	-

Authors: @drpatelh, @FriederikeHanssen, @ramprasadn Maintainers: @drpatelh, @FriederikeHanssen, @ramprasadn

process SAMTOOLS_VIEW [source] ¶

Defined in modules/nf-core/samtools/view/main.nf:1

view bam sam cram

filter/convert SAM/BAM/CRAM file

Tools

samtools

Homepage Documentation biotools:samtools License: MIT

Inputs

Name	Type	Description
`meta`	`map`	Groovy Map containing sample information e.g. [ id:'test', single_end:false ]
`input`	`file`	BAM/CRAM/SAM file
`index`	`file`	BAM.BAI/BAM.CSI/CRAM.CRAI file (optional)
`meta2`	`map`	Groovy Map containing reference information e.g. [ id:'test' ]
`fasta`	`file`	Reference file the CRAM was created with (optional)
`qname`	`file`	Optional file with read names to output only select alignments
`index_format`	`string`	Index format, used together with ext.args = '--write-index'

Outputs

Name	Type	Emit	Description
`val(meta)`	`tuple`	-	-

Authors: @drpatelh, @joseespinosa, @FriederikeHanssen, @priyanka-surana Maintainers: @drpatelh, @joseespinosa, @FriederikeHanssen, @priyanka-surana

process SENTIEON_APPLYVARCAL [source] ¶

Defined in modules/nf-core/sentieon/applyvarcal/main.nf:1

sentieon applyvarcal varcal VQSR

Apply a score cutoff to filter variants based on a recalibration table. Sentieon's ApplyVarCal performs the second pass in a two-stage process called Variant Quality Score Recalibration (VQSR). Specifically, it applies filtering to the input variants based on the recalibration table produced in the previous step (VarCal) and a target sensitivity value. https://support.sentieon.com/manual/usages/general/#applyvarcal-algorithm

Tools

sentieon

Sentieon® provides complete solutions for secondary DNA/RNA analysis for a variety of sequencing platforms, including short and long reads. Our software improves upon BWA, STAR, Minimap2, GATK, HaplotypeCaller, Mutect, and Mutect2 based pipelines and is deployable on any generic-CPU-based computing system.

Homepage Documentation

Inputs

Name	Type	Description
`meta`	`map`	Groovy Map containing sample information e.g. [ id:'test']
`vcf`	`file`	VCF file to be recalibrated, this should be the same file as used for the first stage VariantRecalibrator.
`vcf_tbi`	`file`	tabix index for the input vcf file.
`recal`	`file`	Recalibration file produced when the input vcf was run through VariantRecalibrator in stage 1.
`recal_index`	`file`	Index file for the recalibration file.
`tranches`	`file`	Tranches file produced when the input vcf was run through VariantRecalibrator in stage 1.
`meta2`	`map`	Groovy Map containing sample information e.g. [ id:'test']
`fasta`	`file`	The reference fasta file
`meta3`	`map`	Groovy Map containing sample information e.g. [ id:'test']
`fai`	`file`	Index of reference fasta file

Outputs

Name	Type	Pattern	Description
`vcf`	`file`	`*.vcf.gz`	compressed vcf file containing the recalibrated variants.
`tbi`	`file`	`*vcf.gz.tbi`	Index of recalibrated vcf file.

Authors: @asp8200 Maintainers: @asp8200

process SENTIEON_BWAMEM [source] ¶

Defined in modules/nf-core/sentieon/bwamem/main.nf:1

mem bwa alignment map fastq bam sentieon

Performs fastq alignment to a fasta reference using Sentieon's BWA MEM

Tools

sentieon

Homepage Documentation

Inputs

Name	Type	Description
`meta`	`map`	Groovy Map containing sample information. e.g. [ id:'test', single_end:false ]
`reads`	`file`	Genome fastq files (single-end or paired-end)
`meta2`	`map`	Groovy Map containing reference information. e.g. [ id:'test', single_end:false ]
`index`	`file`	BWA genome index files
`meta3`	`map`	Groovy Map containing reference information. e.g. [ id:'test', single_end:false ]
`fasta`	`file`	Genome fasta file
`meta4`	`map`	Groovy Map containing reference information. e.g. [ id:'test', single_end:false ]
`fasta_fai`	`file`	The index of the FASTA reference.

Outputs

Name	Type	Pattern	Description
`bam_and_bai`	`file`	`*.{bam,bai}`	BAM file with corresponding index.

Authors: @asp8200 Maintainers: @asp8200, @DonFreed

process SENTIEON_DEDUP [source] ¶

Defined in modules/nf-core/sentieon/dedup/main.nf:1

mem dedup map bam cram sentieon

Runs the sentieon tool LocusCollector followed by Dedup. LocusCollector collects read information that is used by Dedup which in turn marks or removes duplicate reads.

Tools

sentieon

Homepage Documentation

Inputs

Name	Type	Description
`meta`	`map`	Groovy Map containing sample information. e.g. [ id:'test', single_end:false ]
`bam`	`file`	BAM file.
`bai`	`file`	BAI file
`meta2`	`map`	Groovy Map containing reference information. e.g. [ id:'test', single_end:false ]
`fasta`	`file`	Genome fasta file
`meta3`	`map`	Groovy Map containing reference information. e.g. [ id:'test', single_end:false ]
`fasta_fai`	`file`	The index of the FASTA reference.

Outputs

Name	Type	Pattern	Description
`cram`	`file`	`*.cram`	CRAM file
`crai`	`file`	`*.crai`	CRAM index file
`bam`	`file`	`*.bam`	BAM file.
`bai`	`file`	`*.bai`	BAI file
`score`	`file`	`*.score`	The score file indicates which reads LocusCollector finds are likely duplicates.
`metrics`	`file`	`*.metrics`	Output file containing Dedup metrics incl. histogram data.
`metrics_multiqc_tsv`	`file`	`*.metrics.multiqc.tsv`	Output tsv-file containing Dedup metrics excl. histogram data.

Authors: @asp8200 Maintainers: @asp8200

process SENTIEON_DNAMODELAPPLY [source] ¶

Defined in modules/nf-core/sentieon/dnamodelapply/main.nf:1

dnamodelapply vcf filter sentieon

Modifies the input VCF file by adding the MLrejected FILTER to the variants

Tools

sentieon

Homepage Documentation

Inputs

Name	Type	Description
`meta`	`map`	Groovy Map containing sample information e.g. `[ id:'test', single_end:false ]`
`vcf`	`file`	INPUT VCF file
`idx`	`file`	Index of the input VCF file
`meta2`	`map`	Groovy Map containing reference information e.g. `[ id:'test' ]`
`fasta`	`file`	Genome fasta file
`meta3`	`map`	Groovy Map containing reference information e.g. `[ id:'test' ]`
`fai`	`file`	Index of the genome fasta file
`meta4`	`map`	Groovy Map containing reference information e.g. `[ id:'test' ]`
`ml_model`	`file`	machine learning model file

Outputs

Name	Type	Pattern	Description
`vcf`	`file`	`*.{vcf,vcf.gz}`	Output VCF file with the MLrejected FILTER added
`tbi`	`file`	`*.{tbi}`	Index of the output VCF file

Authors: @ramprasadn Maintainers: @ramprasadn

process SENTIEON_DNASCOPE [source] ¶

Defined in modules/nf-core/sentieon/dnascope/main.nf:1

dnascope sentieon variant_calling

The DNAscope algorithm performs an improved version of haplotype variant calling.

Tools

sentieon

Homepage Documentation

Inputs

Name	Type	Description
`meta`	`map`	Groovy Map containing sample information. e.g. [ id:'test', single_end:false ]
`bam`	`file`	BAM file.
`bai`	`file`	BAI file
`intervals`	`file`	BED or interval_list file containing the intervals in the reference that will be used in the analysis
`meta2`	`map`	Groovy Map containing meta information for fasta.
`fasta`	`file`	Genome fasta file
`meta3`	`map`	Groovy Map containing meta information for fasta index.
`fai`	`file`	Index of the genome fasta file
`meta4`	`map`	Groovy Map containing meta information for dbsnp.
`dbsnp`	`file`	Single Nucleotide Polymorphism database (dbSNP) file
`meta5`	`map`	Groovy Map containing meta information for dbsnp_tbi.
`dbsnp_tbi`	`file`	Index of the Single Nucleotide Polymorphism database (dbSNP) file
`meta6`	`map`	Groovy Map containing meta information for machine learning model for Dnascope.
`ml_model`	`file`	machine learning model file

Outputs

Name	Type	Pattern	Description
`vcf`	`file`	`*.unfiltered.vcf.gz`	Compressed VCF file
`vcf_tbi`	`file`	`*.unfiltered.vcf.gz.tbi`	Index of VCF file
`gvcf`	`file`	`*.g.vcf.gz`	Compressed GVCF file
`gvcf_tbi`	`file`	`*.g.vcf.gz.tbi`	Index of GVCF file

Authors: @ramprasadn Maintainers: @ramprasadn

process SENTIEON_GVCFTYPER [source] ¶

Defined in modules/nf-core/sentieon/gvcftyper/main.nf:1

joint genotyping genotype gvcf

Perform joint genotyping on one or more samples pre-called with Sentieon's Haplotyper.

Tools

sentieon

Homepage Documentation

Inputs

Name	Type	Description
`meta`	`map`	Groovy Map containing sample information e.g. [ id:'test', single_end:false ]
`gvcfs`	`file`	gVCF(.gz) file
`tbis`	`file`	index of gvcf file
`intervals`	`file`	Interval file with the genomic regions included in the library (optional)
`meta1`	`map`	Groovy Map containing sample information e.g. [ id:'test', single_end:false ]
`fasta`	`file`	Reference fasta file
`meta2`	`map`	Groovy Map containing sample information e.g. [ id:'test', single_end:false ]
`fai`	`file`	Reference fasta index file
`meta3`	`map`	Groovy Map containing sample information e.g. [ id:'test', single_end:false ]
`dbsnp`	`file`	dbSNP VCF file
`meta4`	`map`	Groovy Map containing sample information e.g. [ id:'test', single_end:false ]
`dbsnp_tbi`	`file`	dbSNP VCF index file

Outputs

Name	Type	Pattern	Description
`vcf_gz`	`file`	`*.vcf.gz`	VCF file
`vcf_gz_tbi`	`file`	`*.vcf.gz.tbi`	VCF index file

Authors: @asp8200 Maintainers: @asp8200

process SENTIEON_HAPLOTYPER [source] ¶

Defined in modules/nf-core/sentieon/haplotyper/main.nf:1

sentieon haplotypecaller haplotype

Runs Sentieon's haplotyper for germline variant calling.

Tools

sentieon

Homepage Documentation

Inputs

Name	Type	Description
`meta`	`map`	Groovy Map containing sample information. e.g. [ id:'test', single_end:false ]
`input`	`file`	BAM/CRAM file from alignment
`input_index`	`file`	BAI/CRAI file from alignment
`intervals`	`file`	Bed file with the genomic regions included in the library (optional)
`recal_table`	`file`	Recalibration table from sentieon/qualcal (optional)
`meta2`	`map`	Groovy Map containing sample information e.g. [ id:'test', single_end:false ]
`fasta`	`file`	Genome fasta file
`meta3`	`map`	Groovy Map containing sample information e.g. [ id:'test', single_end:false ]
`fai`	`file`	The index of the FASTA reference.
`meta4`	`map`	Groovy Map containing sample information e.g. [ id:'test', single_end:false ]
`dbsnp`	`file`	VCF file containing known sites (optional)
`meta5`	`map`	Groovy Map containing sample information e.g. [ id:'test', single_end:false ]
`dbsnp_tbi`	`file`	VCF index of dbsnp (optional)

Outputs

Name	Type	Pattern	Description
`vcf`	`file`	`*.unfiltered.vcf.gz`	Compressed VCF file
`vcf_tbi`	`file`	`*.unfiltered.vcf.gz.tbi`	Index of VCF file
`gvcf`	`file`	`*.g.vcf.gz`	Compressed GVCF file
`gvcf_tbi`	`file`	`*.g.vcf.gz.tbi`	Index of GVCF file

Authors: @asp8200 Maintainers: @asp8200

process SENTIEON_TNSCOPE [source] ¶

Defined in modules/nf-core/sentieon/tnscope/main.nf:1

tnscope sentieon variant_calling

The TNscope algorithm performs somatic variant calling on tumor-normal matched pairs or on tumor-only data, using a haplotyper algorithm.

Tools

sentieon

Homepage Documentation

Inputs

Name	Type	Description
`meta`	`map`	Groovy Map containing sample information. e.g. [ id:'test' ]
`input`	`file`	One or more BAM or CRAM files.
`input_index`	`file`	Indices for the input files
`intervals`	`file`	BED or interval_list file containing the intervals in the reference to use in the analysis. Only recommended for large WGS data; otherwise the overhead may not be worth the additional parallelisation.
`meta2`	`map`	Groovy Map containing reference information. e.g. [ id:'test' ]
`fasta`	`file`	Genome fasta file
`meta3`	`map`	Groovy Map containing reference information. e.g. [ id:'test' ]
`fai`	`file`	Index of the genome fasta file
`meta4`	`map`	Groovy Map containing reference information. e.g. [ id:'test' ]
`dbsnp`	`file`	Single Nucleotide Polymorphism database (dbSNP) file
`meta5`	`map`	Groovy Map containing reference information. e.g. [ id:'test' ]
`dbsnp_tbi`	`file`	Index of the Single Nucleotide Polymorphism database (dbSNP) file
`meta6`	`map`	Groovy Map containing reference information. e.g. [ id:'test' ]
`pon`	`file`	Panel of normals (PoN) file
`meta7`	`map`	Groovy Map containing reference information. e.g. [ id:'test' ]
`pon_tbi`	`file`	Index of the panel of normals (PoN) file
`meta8`	`map`	Groovy Map containing reference information. e.g. [ id:'test' ]
`cosmic`	`file`	Catalogue of Somatic Mutations in Cancer (COSMIC) VCF file.
`meta9`	`map`	Groovy Map containing reference information. e.g. [ id:'test' ]
`cosmic_tbi`	`file`	Index of the Catalogue of Somatic Mutations in Cancer (COSMIC) VCF file.

Outputs

Name	Type	Pattern	Description
`vcf`	`file`	`*.{vcf.gz}`	VCF file
`index`	`file`	`*.vcf.gz.tbi`	Index of the VCF file

Authors: @ramprasadn Maintainers: @ramprasadn

process SENTIEON_VARCAL [source] ¶

Defined in modules/nf-core/sentieon/varcal/main.nf:1

sentieon varcal variant recalibration

Module for Sentieon's VarCal. The VarCal algorithm performs the model-building step of Variant Quality Score Recalibration (VQSR): it builds a recalibration model for scoring variant quality. https://support.sentieon.com/manual/usages/general/#varcal-algorithm

Tools

sentieon

Homepage Documentation

Inputs

Name	Type	Description
`meta`	`map`	Groovy Map containing sample information e.g. [ id:'test' ]
`vcf`	`file`	Input VCF file containing the variants to be recalibrated
`tbi`	`file`	tbi index file matching the input VCF

Outputs

Name	Type	Pattern	Description
`recal`	`file`	`*.recal`	Output recal file used by ApplyVQSR
`idx`	`file`	`*.idx`	Index file for the recal output file
`tranches`	`file`	`*.tranches`	Output tranches file used by ApplyVQSR
`plots`	`file`	`*plots.R`	Optional output rscript file to aid in visualization of the input data and learned model.

Authors: @asp8200 Maintainers: @asp8200

process SNPEFF_DOWNLOAD [source] ¶

Defined in modules/nf-core/snpeff/download/main.nf:1

annotation effect prediction snpeff variant vcf

Genetic variant annotation and functional effect prediction toolbox

Tools

snpeff

SnpEff is a variant annotation and effect prediction tool. It annotates and predicts the effects of genetic variants on genes and proteins (such as amino acid changes).

Homepage Documentation biotools:snpeff License: MIT

Inputs

Name	Type	Description
`meta`	`map`	Groovy Map containing sample information e.g. [ id:'test', single_end:false ]
`snpeff_db`	`string`	SnpEff database name

Outputs

Name	Type	Pattern	Description
`cache`	`file`	-	snpEff cache

Authors: @maxulysse Maintainers: @maxulysse

process SNPEFF_SNPEFF [source] ¶

Defined in modules/nf-core/snpeff/snpeff/main.nf:1

annotation effect prediction snpeff variant vcf

Genetic variant annotation and functional effect prediction toolbox

Tools

snpeff

SnpEff is a variant annotation and effect prediction tool. It annotates and predicts the effects of genetic variants on genes and proteins (such as amino acid changes).

Homepage Documentation biotools:snpeff License: MIT

Inputs

Name	Type	Description
`meta`	`map`	Groovy Map containing sample information e.g. [ id:'test', single_end:false ]
`vcf`	`file`	vcf to annotate
`meta2`	`map`	Groovy Map containing sample information e.g. [ id:'test', single_end:false ]
`cache`	`file`	path to snpEff cache (optional)

Outputs

Name	Type	Pattern	Description
`vcf`	`file`	`*.ann.vcf`	annotated vcf
`report`	`file`	`*.csv`	snpEff report csv file
`summary_html`	`file`	`*.html`	snpEff summary statistics in html file
`genes_txt`	`file`	`*.genes.txt`	txt (tab separated) file having counts of the number of variants affecting each transcript and gene

Authors: @maxulysse Maintainers: @maxulysse

process SPRING_DECOMPRESS [source] ¶

Defined in modules/nf-core/spring/decompress/main.nf:1

FASTQ decompression lossless

Fast, efficient, lossless decompression of FASTQ files.

Tools

spring

SPRING is a compression tool for FASTQ files (containing up to 4.29 billion reads).

Homepage Documentation biotools:spring License: Free for non-commercial use

Inputs

Name	Type	Description
`meta`	`map`	Groovy Map containing sample information e.g. [ id:'test', single_end:false ]
`spring`	`file`	Spring file to decompress.
`write_one_fastq_gz`	`boolean`	Controls whether spring should write one fastq.gz file with reads from both directions or two fastq.gz files with reads from distinct directions

Outputs

Name	Type	Emit	Description
`val(meta), path("*.fastq.gz")`	`tuple`	`fastq`	-
`versions.yml`	`path`	`versions`	-

Authors: @xec-cm Maintainers: @xec-cm

process STRELKA_GERMLINE [source] ¶

Defined in modules/nf-core/strelka/germline/main.nf:1

variantcalling germline wgs vcf variants

Strelka2 is a fast and accurate small variant caller optimized for analysis of germline variation

Tools

strelka

Strelka calls somatic and germline small variants from mapped sequencing reads

Homepage Documentation biotools:strelka License: GPL v3

Inputs

Name	Type	Description
`meta`	`map`	Groovy Map containing sample information e.g. [ id:'test']
`input`	`file`	BAM/CRAM file
`input_index`	`file`	BAI/CRAI index file
`target_bed`	`file`	BED file containing target regions for variant calling
`target_bed_index`	`file`	Index for BED file containing target regions for variant calling

Outputs

Name	Type	Pattern	Description
`vcf`	`file`	`*.{vcf.gz}`	gzipped germline variant file
`vcf_tbi`	`file`	`*.vcf.gz.tbi`	index file for the vcf file
`genome_vcf`	`file`	`*_genome.vcf.gz`	variant records and compressed non-variant blocks
`genome_vcf_tbi`	`file`	`*_genome.vcf.gz.tbi`	index file for the genome_vcf file

Authors: @arontommi Maintainers: @arontommi

process STRELKA_SOMATIC [source] ¶

Defined in modules/nf-core/strelka/somatic/main.nf:1

variant calling germline wgs vcf variants

Strelka2 is a fast and accurate small variant caller optimized for analysis of germline variation in small cohorts and somatic variation in tumor/normal sample pairs

Tools

strelka

Strelka calls somatic and germline small variants from mapped sequencing reads

Homepage Documentation biotools:strelka License: GPL v3

Inputs

Name	Type	Description
`meta`	`map`	Groovy Map containing sample information e.g. [ id:'test', single_end:false ]
`input_normal`	`file`	BAM/CRAM/SAM file
`input_index_normal`	`file`	BAM/CRAM/SAM index file
`input_tumor`	`file`	BAM/CRAM/SAM file
`input_index_tumor`	`file`	BAM/CRAM/SAM index file
`manta_candidate_small_indels`	`file`	VCF.gz file
`manta_candidate_small_indels_tbi`	`file`	VCF.gz index file
`target_bed`	`file`	BED file containing target regions for variant calling
`target_bed_index`	`file`	Index for BED file containing target regions for variant calling

Outputs

Name	Type	Pattern	Description
`vcf_indels`	`file`	`*.{vcf.gz}`	Gzipped VCF file containing indel variants
`vcf_indels_tbi`	`file`	`*.{vcf.gz.tbi}`	Index for gzipped VCF file containing indel variants
`vcf_snvs`	`file`	`*.{vcf.gz}`	Gzipped VCF file containing SNV variants
`vcf_snvs_tbi`	`file`	`*.{vcf.gz.tbi}`	Index for gzipped VCF file containing SNV variants

Authors: @drpatelh Maintainers: @drpatelh

process SVDB_MERGE [source] ¶

Defined in modules/nf-core/svdb/merge/main.nf:1

structural variants vcf merge

The merge module merges structural variants within one or more vcf files.

Tools

svdb

structural variant database software

Homepage Documentation License: MIT

Inputs

Name	Type	Description
`meta`	`map`	Groovy Map containing sample information e.g. [ id:'test' ]
`vcfs`	`list`	One or more VCF files. The order and number of files should correspond to the order and number of tags in the `input_priority` input.
`input_priority`	`list`	Prioritize the input VCF files according to this list, e.g. ['tiddit','cnvnator']. The order and number of tags should correspond to the order and number of VCFs in the `vcfs` input.
`sort_inputs`	`boolean`	Whether the input files should be sorted by name. Each priority tag is sorted together with its corresponding VCF file.
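
The pairing rule described for `vcfs`, `input_priority` and `sort_inputs` can be sketched in Python. This is only an illustration of the documented behaviour, not the module's actual implementation; the helper name `pair_and_sort` is hypothetical.

```python
def pair_and_sort(vcfs, input_priority, sort_inputs):
    """Pair each VCF with its priority tag, optionally sorting pairs by VCF name.

    Hypothetical helper: it only mirrors the rule stated in the inputs table.
    """
    # The table requires the same number of VCFs and tags, in the same order.
    if len(vcfs) != len(input_priority):
        raise ValueError("vcfs and input_priority must have the same length")

    pairs = list(zip(vcfs, input_priority))
    if sort_inputs:
        # Sort by file name; each tag travels with its own VCF.
        pairs.sort(key=lambda pair: pair[0])
    return pairs


pairs = pair_and_sort(
    ["tiddit.vcf", "cnvnator.vcf"],
    ["tiddit", "cnvnator"],
    sort_inputs=True,
)
print(pairs)
```

Because each tag is zipped to its file before sorting, reordering the files never detaches a tag from the VCF it describes.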

Outputs

Name	Type	Emit	Description
`val(meta)`	`tuple`	-	-

Authors: @ramprasadn Maintainers: @ramprasadn, @fellen31

process TABIX_BGZIPTABIX [source] ¶

Defined in modules/nf-core/tabix/bgziptabix/main.nf:1

bgzip compress index tabix vcf

bgzip a sorted tab-delimited genome file and then create tabix index

Tools

tabix

Generic indexer for TAB-delimited genome position files.

Homepage Documentation biotools:tabix License: MIT

Inputs

Name	Type	Description
`meta`	`map`	Groovy Map containing sample information e.g. [ id:'test', single_end:false ]
`input`	`file`	Sorted tab-delimited genome file

Outputs

Name	Type	Pattern	Description
`gz_tbi`	`file`	`.gz, .tbi`	bgzipped tab-delimited genome file and its tabix index file
`gz_csi`	`file`	`.gz, .csi`	bgzipped tab-delimited genome file and its csi index file

Authors: @maxulysse, @DLBPointon Maintainers: @maxulysse, @DLBPointon

process TABIX_TABIX [source] ¶

Defined in modules/nf-core/tabix/tabix/main.nf:1

index tabix vcf

create tabix index from a sorted bgzip tab-delimited genome file

Tools

tabix

Generic indexer for TAB-delimited genome position files.

Homepage Documentation biotools:tabix License: MIT

Inputs

Name	Type	Description
`meta`	`map`	Groovy Map containing sample information e.g. [ id:'test', single_end:false ]
`tab`	`file`	TAB-delimited genome position file compressed with bgzip

Outputs

Name	Type	Pattern	Description
`tbi`	`file`	`*.{tbi}`	tabix index file
`csi`	`file`	`*.{csi}`	coordinate sorted index file

Authors: @joseespinosa, @drpatelh, @maxulysse Maintainers: @joseespinosa, @drpatelh, @maxulysse

process TIDDIT_SV [source] ¶

Defined in modules/nf-core/tiddit/sv/main.nf:1

structural variants vcf

Identify chromosomal rearrangements.

Tools

Search for structural variants.

Homepage Documentation biotools:tiddit License: GPL-3.0-or-later

Inputs

Name	Type	Description
`meta`	`map`	Groovy Map containing sample information e.g. [ id:'test', single_end:false ]
`input`	`file`	BAM/CRAM file
`input_index`	`file`	BAM/CRAM index file
`meta2`	`map`	Groovy Map containing sample information e.g. `[ id:'test_fasta']`
`fasta`	`file`	Input FASTA file
`meta3`	`map`	Groovy Map containing sample information from bwa index e.g. `[ id:'test_bwa-index' ]`
`bwa_index`	`file`	BWA genome index files

Outputs

Name	Type	Emit	Description
`val(meta), path("*.vcf")`	`tuple`	`vcf`	-
`val(meta), path("*.ploidies.tab")`	`tuple`	`ploidy`	-
`versions.yml`	`path`	`versions`	-

Authors: @maxulysse Maintainers: @maxulysse

process UNTAR [source] ¶

Defined in modules/nf-core/untar/main.nf:1

untar uncompress extract

Extract files.

Tools

untar

Extract tar.gz files.

Documentation License: GPL-3.0-or-later

Inputs

Name	Type	Description
`meta`	`map`	Groovy Map containing sample information e.g. [ id:'test', single_end:false ]
`archive`	`file`	Archive file to be extracted

Outputs

Name	Type	Pattern	Description
`untar`	`map`	`*/`	Groovy Map containing sample information e.g. [ id:'test', single_end:false ]

Authors: @joseespinosa, @drpatelh, @matthdsm, @jfy133 Maintainers: @joseespinosa, @drpatelh, @matthdsm, @jfy133

process UNZIP [source] ¶

Defined in modules/nf-core/unzip/main.nf:1

unzip decompression zip archiving

Unzip ZIP archive files

Tools

unzip

p7zip is a quick port of 7z.exe and 7za.exe (command line version of 7zip, see www.7-zip.org) for Unix.

Homepage Documentation License: LGPL-2.1-or-later

Inputs

Name	Type	Description
`meta`	`map`	Groovy Map containing sample information e.g. [ id:'test', single_end:false ]
`archive`	`file`	ZIP file

Outputs

Name	Type	Pattern	Description
`unzipped_archive`	`directory`	`${archive.baseName}/`	Directory contents of the unzipped archive
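
The output pattern `${archive.baseName}/` names the output directory after the archive with its final extension removed. A rough Python analogue, assuming plain file names; the function `unzipped_dir` is hypothetical and only illustrates the naming:

```python
from pathlib import PurePosixPath


def unzipped_dir(archive):
    """Return the directory name implied by `${archive.baseName}/`.

    Hypothetical helper: `stem` drops only the last extension, which is
    how a file's base name is derived here.
    """
    return PurePosixPath(archive).stem + "/"


print(unzipped_dir("data/genome.zip"))
print(unzipped_dir("cache.v2.zip"))
```

Only the last suffix is stripped, so an archive named `cache.v2.zip` would map to `cache.v2/`, not `cache/`.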

Authors: @jfy133 Maintainers: @jfy133

process VARLOCIRAPTOR_CALLVARIANTS [source] ¶

Defined in modules/nf-core/varlociraptor/callvariants/main.nf:1

observations variants calling

Call variants for a given scenario specified with the varlociraptor calling grammar, preprocessed by varlociraptor preprocessing

Tools

varlociraptor

Flexible, uncertainty-aware variant calling with parameter free filtration via FDR control.

Homepage Documentation biotools:varlociraptor License: GPL v3

Inputs

Name	Type	Description
`meta`	`map`	Groovy Map containing sample information e.g. [ id:'test', single_end:false ]
`vcfs`	`file`	Sorted VCF/BCF file containing sample observations. Can also be a list of files.

Outputs

Name	Type	Pattern	Description
`bcf`	`file`	`*.bcf`	BCF file containing sample observations

Authors: @FriederikeHanssen Maintainers: @FriederikeHanssen, @famosab

process VARLOCIRAPTOR_ESTIMATEALIGNMENTPROPERTIES [source] ¶

Defined in modules/nf-core/varlociraptor/estimatealignmentproperties/main.nf:1

estimation alignment variants

To assess candidate indels and structural variants, Varlociraptor needs to know certain properties of the underlying sequencing experiment in combination with the read aligner used.

Tools

varlociraptor

Flexible, uncertainty-aware variant calling with parameter free filtration via FDR control.

Homepage Documentation biotools:varlociraptor License: GPL v3

Inputs

Name	Type	Description
`meta`	`map`	Groovy Map containing sample information e.g. [ id:'test', single_end:false ]
`bam`	`file`	Sorted BAM/CRAM/SAM file
`bai`	`file`	Index of sorted BAM/CRAM/SAM file
`meta2`	`map`	Groovy Map containing reference information e.g. [ id:'test', single_end:false ]
`fasta`	`file`	Reference fasta file
`meta3`	`map`	Groovy Map containing reference index information e.g. [ id:'test', single_end:false ]
`fai`	`file`	Index for reference fasta file (must be created with samtools faidx)

Outputs

Name	Type	Pattern	Description
`alignment_properties_json`	`file`	`*.alignment-properties.json`	File containing alignment properties

Authors: @FriederikeHanssen Maintainers: @FriederikeHanssen, @famosab

process VARLOCIRAPTOR_PREPROCESS [source] ¶

Defined in modules/nf-core/varlociraptor/preprocess/main.nf:1

observations variants preprocessing

Obtains per-sample observations for the actual calling process with varlociraptor calls

Tools

varlociraptor

Flexible, uncertainty-aware variant calling with parameter free filtration via FDR control.

Homepage Documentation biotools:varlociraptor License: GPL v3

Inputs

Name	Type	Description
`meta`	`map`	Groovy Map containing sample information e.g. [ id:'test', single_end:false ]
`bam`	`file`	Sorted BAM/CRAM/SAM file
`bai`	`file`	Index of the BAM/CRAM/SAM file
`candidates`	`file`	Sorted BCF/VCF file
`alignment_json`	`file`	File containing alignment properties obtained with varlociraptor/estimatealignmentproperties
`meta2`	`map`	Groovy Map containing reference information e.g. [ id:'test', single_end:false ]
`fasta`	`file`	Reference fasta file
`meta3`	`map`	Groovy Map containing reference index information e.g. [ id:'test', single_end:false ]
`fai`	`file`	Index for reference fasta file (must be created with samtools faidx)

Outputs

Name	Type	Pattern	Description
`bcf`	`file`	`*.bcf`	BCF file containing sample observations

Authors: @FriederikeHanssen Maintainers: @FriederikeHanssen, @famosab

process VCFLIB_VCFFILTER [source] ¶

Defined in modules/nf-core/vcflib/vcffilter/main.nf:1

filter variant vcf quality

Command line tools for parsing and manipulating VCF files.

Tools

vcflib

Command line tools for parsing and manipulating VCF files.

Homepage Documentation biotools:vcflib License: MIT

Inputs

Name	Type	Description
`meta`	`map`	Groovy Map containing sample information e.g. [ id:'test_sample_1' ]
`vcf`	`file`	VCF file
`tbi`	`file`	Index file

Outputs

Name	Type	Pattern	Description
`vcf`	`file`	`*.{vcf.gz}`	Filtered VCF file

Authors: @zachary-foster Maintainers: @zachary-foster

process VCFTOOLS [source] ¶

Defined in modules/nf-core/vcftools/main.nf:1

VCFtools VCF sort

A set of tools written in Perl and C++ for working with VCF files

Tools

vcftools

A set of tools written in Perl and C++ for working with VCF files. This package only contains the C++ libraries whereas the package perl-vcftools-vcf contains the perl libraries

Homepage Documentation biotools:vcftools License: LGPL

Inputs

Name	Type	Description
`meta`	`map`	Groovy Map containing sample information e.g. [ id:'test', single_end:false ]
`variant_file`	`file`	variant input file which can be vcf, vcf.gz, or bcf format.
`bed`	`file`	bed file which can be used with different arguments in vcftools (optional)
`diff_variant_file`	`file`	secondary variant file which can be used with the 'diff' suite of tools (optional)

Outputs

Name	Type	Emit	Description
`val(meta), path("*.vcf")`	`tuple`	`vcf`	-
`val(meta), path("*.bcf")`	`tuple`	`bcf`	-
`val(meta), path("*.frq")`	`tuple`	`frq`	-
`val(meta), path("*.frq.count")`	`tuple`	`frq_count`	-
`val(meta), path("*.idepth")`	`tuple`	`idepth`	-
`val(meta), path("*.ldepth")`	`tuple`	`ldepth`	-
`val(meta), path("*.ldepth.mean")`	`tuple`	`ldepth_mean`	-
`val(meta), path("*.gdepth")`	`tuple`	`gdepth`	-
`val(meta), path("*.hap.ld")`	`tuple`	`hap_ld`	-
`val(meta), path("*.geno.ld")`	`tuple`	`geno_ld`	-
`val(meta), path("*.geno.chisq")`	`tuple`	`geno_chisq`	-
`val(meta), path("*.list.hap.ld")`	`tuple`	`list_hap_ld`	-
`val(meta), path("*.list.geno.ld")`	`tuple`	`list_geno_ld`	-
`val(meta), path("*.interchrom.hap.ld")`	`tuple`	`interchrom_hap_ld`	-
`val(meta), path("*.interchrom.geno.ld")`	`tuple`	`interchrom_geno_ld`	-
`val(meta), path("*.TsTv")`	`tuple`	`tstv`	-
`val(meta), path("*.TsTv.summary")`	`tuple`	`tstv_summary`	-
`val(meta), path("*.TsTv.count")`	`tuple`	`tstv_count`	-
`val(meta), path("*.TsTv.qual")`	`tuple`	`tstv_qual`	-
`val(meta), path("*.FILTER.summary")`	`tuple`	`filter_summary`	-
`val(meta), path("*.sites.pi")`	`tuple`	`sites_pi`	-
`val(meta), path("*.windowed.pi")`	`tuple`	`windowed_pi`	-
`val(meta), path("*.weir.fst")`	`tuple`	`weir_fst`	-
`val(meta), path("*.het")`	`tuple`	`heterozygosity`	-
`val(meta), path("*.hwe")`	`tuple`	`hwe`	-
`val(meta), path("*.Tajima.D")`	`tuple`	`tajima_d`	-
`val(meta), path("*.ifreqburden")`	`tuple`	`freq_burden`	-
`val(meta), path("*.LROH")`	`tuple`	`lroh`	-
`val(meta), path("*.relatedness")`	`tuple`	`relatedness`	-
`val(meta), path("*.relatedness2")`	`tuple`	`relatedness2`	-
`val(meta), path("*.lqual")`	`tuple`	`lqual`	-
`val(meta), path("*.imiss")`	`tuple`	`missing_individual`	-
`val(meta), path("*.lmiss")`	`tuple`	`missing_site`	-
`val(meta), path("*.snpden")`	`tuple`	`snp_density`	-
`val(meta), path("*.kept.sites")`	`tuple`	`kept_sites`	-
`val(meta), path("*.removed.sites")`	`tuple`	`removed_sites`	-
`val(meta), path("*.singletons")`	`tuple`	`singeltons`	-
`val(meta), path("*.indel.hist")`	`tuple`	`indel_hist`	-
`val(meta), path("*.hapcount")`	`tuple`	`hapcount`	-
`val(meta), path("*.mendel")`	`tuple`	`mendel`	-
`val(meta), path("*.FORMAT")`	`tuple`	`format`	-
`val(meta), path("*.INFO")`	`tuple`	`info`	-
`val(meta), path("*.012")`	`tuple`	`genotypes_matrix`	-
`val(meta), path("*.012.indv")`	`tuple`	`genotypes_matrix_individual`	-
`val(meta), path("*.012.pos")`	`tuple`	`genotypes_matrix_position`	-
`val(meta), path("*.impute.hap")`	`tuple`	`impute_hap`	-
`val(meta), path("*.impute.hap.legend")`	`tuple`	`impute_hap_legend`	-
`val(meta), path("*.impute.hap.indv")`	`tuple`	`impute_hap_indv`	-
`val(meta), path("*.ldhat.sites")`	`tuple`	`ldhat_sites`	-
`val(meta), path("*.ldhat.locs")`	`tuple`	`ldhat_locs`	-
`val(meta), path("*.BEAGLE.GL")`	`tuple`	`beagle_gl`	-
`val(meta), path("*.BEAGLE.PL")`	`tuple`	`beagle_pl`	-
`val(meta), path("*.ped")`	`tuple`	`ped`	-
`val(meta), path("*.map")`	`tuple`	`map_`	-
`val(meta), path("*.tped")`	`tuple`	`tped`	-
`val(meta), path("*.tfam")`	`tuple`	`tfam`	-
`val(meta), path("*.diff.sites_in_files")`	`tuple`	`diff_sites_in_files`	-
`val(meta), path("*.diff.indv_in_files")`	`tuple`	`diff_indv_in_files`	-
`val(meta), path("*.diff.sites")`	`tuple`	`diff_sites`	-
`val(meta), path("*.diff.indv")`	`tuple`	`diff_indv`	-
`val(meta), path("*.diff.discordance.matrix")`	`tuple`	`diff_discd_matrix`	-
`val(meta), path("*.diff.switch")`	`tuple`	`diff_switch_error`	-
`versions.yml`	`path`	`versions`	-
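
Each emit in the table above is selected by a glob pattern on the file name. The sketch below approximates that matching with Python's `fnmatch` for a handful of the listed patterns. It assumes plain file names without directories, and `emits_for` is a hypothetical helper, not part of the module.

```python
from fnmatch import fnmatchcase

# A subset of the emit names and patterns listed in the outputs table.
PATTERNS = {
    "vcf": "*.vcf",
    "frq": "*.frq",
    "frq_count": "*.frq.count",
    "tstv": "*.TsTv",
    "tstv_summary": "*.TsTv.summary",
    "heterozygosity": "*.het",
}


def emits_for(filename):
    """Return the emit names whose pattern matches the given file name."""
    return [
        emit
        for emit, pattern in PATTERNS.items()
        if fnmatchcase(filename, pattern)  # case-sensitive, like the patterns
    ]


for name in ["sample.frq", "sample.frq.count", "sample.TsTv.summary", "sample.log"]:
    print(name, emits_for(name))
```

A pattern such as `*.frq` only matches names that end in `.frq`, so `sample.frq.count` lands in `frq_count` alone rather than in both emits.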

Authors: @Mark-S-Hill Maintainers: @Mark-S-Hill

process YTE [source] ¶

Defined in modules/nf-core/yte/main.nf:1

yaml template python

A YAML template engine with Python expressions

Tools

yte

A YAML template engine with Python expressions

Homepage Documentation License: MIT

Inputs

Name	Type	Description
`meta`	`map`	Groovy Map containing sample information e.g. `[ id:'template1' ]`
`template`	`file`	YTE template
`map_file`	`file`	YAML file containing a map to be used in the template
`map`	`map`	Groovy Map containing mapping information to be used in the template e.g. `[ key: value ]` with key being a wildcard in the template

Outputs

Name	Type	Pattern	Description
`rendered`	`file`	`*.yaml`	Rendered YAML file

Authors: @famosab Maintainers: @famosab

Functions

This page documents helper functions defined in the pipeline.

def addReadgroupToMeta(meta, files) [source] ¶

Defined in workflows/sarek/main.nf:631

Parameters

Name	Description	Default
`meta`	-	-
`files`	-	-

def checkCondaChannels() [source] ¶

Defined in subworkflows/nf-core/utils_nextflow_pipeline/main.nf:87

def checkConfigProvided() [source] ¶

Defined in subworkflows/nf-core/utils_nfcore_pipeline/main.nf:32

def checkProfileProvided(nextflow_cli_args) [source] ¶

Defined in subworkflows/nf-core/utils_nfcore_pipeline/main.nf:46

Parameters

Name	Description	Default
`nextflow_cli_args`	-	-

def completionEmail(summary_params, email, email_on_fail, plaintext_email, outdir, monochrome_logs, multiqc_report) [source] ¶

Defined in subworkflows/nf-core/utils_nfcore_pipeline/main.nf:229

Parameters

Name	Description	Default
`summary_params`	-	-
`email`	-	-
`email_on_fail`	-	-
`plaintext_email`	-	-
`outdir`	-	-
`monochrome_logs`	-	-
`multiqc_report`	-	-

def completionSummary(monochrome_logs) [source] ¶

Defined in subworkflows/nf-core/utils_nfcore_pipeline/main.nf:342

Parameters

Name	Description	Default
`monochrome_logs`	-	-

def dumpParametersToJSON(outdir) [source] ¶

Defined in subworkflows/nf-core/utils_nextflow_pipeline/main.nf:73

Parameters

Name	Description	Default
`outdir`	-	-

def flowcellLaneFromFastq(path) [source] ¶

Defined in workflows/sarek/main.nf:653

Parameters

Name	Description	Default
`path`	-	-

def genomeExistsError() [source] ¶

Defined in subworkflows/local/utils_nfcore_sarek_pipeline/main.nf:254

def getFileSuffix(filename) [source] ¶

Defined in modules/nf-core/cat/cat/main.nf:75

Parameters

Name	Description	Default
`filename`	-	-

def getGenomeAttribute(attribute) [source] ¶

Defined in main.nf:342

Parameters

Name	Description	Default
`attribute`	-	-

def getSingleReport(multiqc_reports) [source] ¶

Defined in subworkflows/nf-core/utils_nfcore_pipeline/main.nf:208

Parameters

Name	Description	Default
`multiqc_reports`	-	-

def getWorkflowVersion() [source] ¶

Defined in subworkflows/nf-core/utils_nfcore_pipeline/main.nf:62

def imNotification(summary_params, hook_url) [source] ¶

Defined in subworkflows/nf-core/utils_nfcore_pipeline/main.nf:360

Parameters

Name	Description	Default
`summary_params`	-	-
`hook_url`	-	-

def isCloudUrl(cache_url) [source] ¶

Defined in subworkflows/local/annotation_cache_initialisation/main.nf:70

Parameters

Name	Description	Default
`cache_url`	-	-

def logColours(monochrome_logs) [source] ¶

Defined in subworkflows/nf-core/utils_nfcore_pipeline/main.nf:141

Parameters

Name	Description	Default
`monochrome_logs`	-	-

def methodsDescriptionText(mqc_methods_yaml) [source] ¶

Defined in subworkflows/local/utils_nfcore_sarek_pipeline/main.nf:297

Parameters

Name	Description	Default
`mqc_methods_yaml`	-	-

def paramsSummaryMultiqc(summary_params) [source] ¶

Defined in subworkflows/nf-core/utils_nfcore_pipeline/main.nf:107

Parameters

Name	Description	Default
`summary_params`	-	-

def processVersionsFromYAML(yaml_file) [source] ¶

Defined in subworkflows/nf-core/utils_nfcore_pipeline/main.nf:80

Parameters

Name	Description	Default
`yaml_file`	-	-

def readFirstLineOfFastq(path) [source] ¶

Defined in workflows/sarek/main.nf:681

Parameters

Name	Description	Default
`path`	-	-

def retrieveInput(need_input, step, outdir) [source] ¶

Defined in subworkflows/local/utils_nfcore_sarek_pipeline/main.nf:338

Parameters

Name	Description	Default
`need_input`	-	-
`step`	-	-
`outdir`	-	-

def softwareVersionsToYAML(ch_versions) [source] ¶

Defined in subworkflows/nf-core/utils_nfcore_pipeline/main.nf:100

Parameters

Name	Description	Default
`ch_versions`	-	-

def sparkAndBam() [source] ¶

Defined in subworkflows/local/utils_nfcore_sarek_pipeline/main.nf:262

def toolBibliographyText() [source] ¶

Defined in subworkflows/local/utils_nfcore_sarek_pipeline/main.nf:285

def toolCitationText() [source] ¶

Defined in subworkflows/local/utils_nfcore_sarek_pipeline/main.nf:271

def validateInputParameters() [source] ¶

Defined in subworkflows/local/utils_nfcore_sarek_pipeline/main.nf:248

def workflowVersionToYAML() [source] ¶

Defined in subworkflows/nf-core/utils_nfcore_pipeline/main.nf:89
