Processes

This page documents all processes in the pipeline.

MULTIQC
UNTAR
MOSDEPTH
VCFTOOLS
UNZIP
YTE
GAWK
FASTQC
ASCAT
FREEBAYES
FASTP
GUNZIP
SPRING_DECOMPRESS
LOFREQ_CALLPARALLEL
GATK4_INTERVALLISTTOBED
GATK4_CALCULATECONTAMINATION
GATK4_FILTERMUTECTCALLS
GATK4_APPLYVQSR
GATK4_GENOMICSDBIMPORT
GATK4_LEARNREADORIENTATIONMODEL
GATK4_VARIANTRECALIBRATOR
GATK4_GATHERBQSRREPORTS
GATK4_GETPILEUPSUMMARIES
GATK4_GENOTYPEGVCFS
GATK4_CREATESEQUENCEDICTIONARY
GATK4_GATHERPILEUPSUMMARIES
GATK4_ESTIMATELIBRARYCOMPLEXITY
GATK4_HAPLOTYPECALLER
GATK4_CNNSCOREVARIANTS
GATK4_BASERECALIBRATOR
GATK4_APPLYBQSR
GATK4_MERGEVCFS
GATK4_FILTERVARIANTTRANCHES
GATK4_MARKDUPLICATES
GATK4_MUTECT2
GATK4_MERGEMUTECTSTATS
GATK4SPARK_BASERECALIBRATOR
GATK4SPARK_APPLYBQSR
GATK4SPARK_MARKDUPLICATES
DEEPVARIANT_RUNDEEPVARIANT
SENTIEON_GVCFTYPER
SENTIEON_TNSCOPE
SENTIEON_DNAMODELAPPLY
SENTIEON_BWAMEM
SENTIEON_APPLYVARCAL
SENTIEON_VARCAL
SENTIEON_DNASCOPE
SENTIEON_HAPLOTYPER
SENTIEON_DEDUP
SAMTOOLS_BAM2FQ
SAMTOOLS_MERGE
SAMTOOLS_MPILEUP
SAMTOOLS_FAIDX
SAMTOOLS_VIEW
SAMTOOLS_INDEX
SAMTOOLS_COLLATEFASTQ
SAMTOOLS_STATS
SAMTOOLS_CONVERT
VCFLIB_VCFFILTER
BBMAP_BBSPLIT
MSISENSOR2_MSI
CONTROLFREEC_FREEC2BED
CONTROLFREEC_MAKEGRAPH2
CONTROLFREEC_FREEC2CIRCOS
CONTROLFREEC_FREEC
CONTROLFREEC_ASSESSSIGNIFICANCE
GOLEFT_INDEXCOV
DRAGMAP_ALIGN
DRAGMAP_HASHTABLE
STRELKA_GERMLINE
STRELKA_SOMATIC
BWA_INDEX
BWA_MEM
SNPEFF_SNPEFF
SNPEFF_DOWNLOAD
NGSCHECKMATE_NCM
ENSEMBLVEP_VEP
ENSEMBLVEP_DOWNLOAD
VARLOCIRAPTOR_CALLVARIANTS
VARLOCIRAPTOR_PREPROCESS
VARLOCIRAPTOR_ESTIMATEALIGNMENTPROPERTIES
CNVKIT_GENEMETRICS
CNVKIT_CALL
CNVKIT_BATCH
CNVKIT_ANTITARGET
CNVKIT_EXPORT
CNVKIT_REFERENCE
RBT_VCFSPLIT
FGBIO_FASTQTOBAM
FGBIO_COPYUMIFROMREADNAME
FGBIO_CALLMOLECULARCONSENSUSREADS
FGBIO_GROUPREADSBYUMI
BWAMEM2_INDEX
BWAMEM2_MEM
MANTA_TUMORONLY
MANTA_GERMLINE
MANTA_SOMATIC
BCFTOOLS_CONCAT
BCFTOOLS_SORT
BCFTOOLS_MERGE
BCFTOOLS_MPILEUP
BCFTOOLS_ANNOTATE
BCFTOOLS_NORM
BCFTOOLS_VIEW
BCFTOOLS_STATS
BCFTOOLS_ISEC
MSISENSORPRO_SCAN
MSISENSORPRO_MSISOMATIC
MUSE_SUMP
MUSE_CALL
PARABRICKS_FQ2BAM
SVDB_MERGE
TIDDIT_SV
TABIX_TABIX
TABIX_BGZIPTABIX
CAT_CAT
CAT_FASTQ
CREATE_INTERVALS_BED
ADD_INFO_TO_VCF
SAMTOOLS_REINDEX_BAM

MULTIQC

Defined in modules/nf-core/multiqc/main.nf:1

Keywords: QC, bioinformatics tools, Beautiful stand-alone HTML report

Aggregate results from bioinformatics analyses across many samples into a single report

Tools

multiqc

MultiQC searches a given directory for analysis logs and compiles an HTML report. It's a general-use tool, perfect for summarising the output from numerous bioinformatics tools.

Homepage | Documentation | biotools:multiqc | License: GPL-3.0-or-later

Outputs

Name	Type	Pattern	Description
`report`	`-`	`-`	-
`data`	`-`	`-`	-
`plots`	`-`	`-`	-

Authors: @abhi18av, @bunop, @drpatelh, @jfy133 Maintainers: @abhi18av, @bunop, @drpatelh, @jfy133

UNTAR

Defined in modules/nf-core/untar/main.nf:1

Keywords: untar, uncompress, extract

Extract files.

Tools

untar

Extract tar.gz files.

Documentation | License: GPL-3.0-or-later

Inputs

Name	Type	Description
`meta`	`map`	Groovy Map containing sample information e.g. [ id:'test', single_end:false ]
`archive`	`file`	File to be untarred

Outputs

Name	Type	Pattern	Description
`untar`	`map`	`*/`	Groovy Map containing sample information e.g. [ id:'test', single_end:false ]

Authors: @joseespinosa, @drpatelh, @matthdsm, @jfy133 Maintainers: @joseespinosa, @drpatelh, @matthdsm, @jfy133

MOSDEPTH

Defined in modules/nf-core/mosdepth/main.nf:1

Keywords: mosdepth, bam, cram, coverage

Calculates genome-wide sequencing coverage.

Tools

mosdepth

Fast BAM/CRAM depth calculation for WGS, exome, or targeted sequencing.

Documentation | biotools:mosdepth | License: MIT

Inputs

Name	Type	Description
`meta`	`map`	Groovy Map containing sample information e.g. [ id:'test', single_end:false ]
`bam`	`file`	Input BAM/CRAM file
`bai`	`file`	Index for BAM/CRAM file
`bed`	`file`	BED file with intersected intervals
`meta2`	`map`	Groovy Map containing bed information e.g. [ id:'test' ]
`fasta`	`file`	Reference genome FASTA file

Outputs

Name	Type	Emit	Description
`val(meta), path('*.global.dist.txt')`	`tuple`	`global_txt`	-
`val(meta), path('*.summary.txt')`	`tuple`	`summary_txt`	-
`val(meta), path('*.region.dist.txt')`	`tuple`	`regions_txt`	-
`val(meta), path('*.per-base.d4')`	`tuple`	`per_base_d4`	-
`val(meta), path('*.per-base.bed.gz')`	`tuple`	`per_base_bed`	-
`val(meta), path('*.per-base.bed.gz.csi')`	`tuple`	`per_base_csi`	-
`val(meta), path('*.regions.bed.gz')`	`tuple`	`regions_bed`	-
`val(meta), path('*.regions.bed.gz.csi')`	`tuple`	`regions_csi`	-
`val(meta), path('*.quantized.bed.gz')`	`tuple`	`quantized_bed`	-
`val(meta), path('*.quantized.bed.gz.csi')`	`tuple`	`quantized_csi`	-
`val(meta), path('*.thresholds.bed.gz')`	`tuple`	`thresholds_bed`	-
`val(meta), path('*.thresholds.bed.gz.csi')`	`tuple`	`thresholds_csi`	-
`versions.yml`	`path`	`versions`	-
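
The glob patterns in the table above overlap in their prefixes (for example `*.regions.bed.gz` and `*.regions.bed.gz.csi`), so it can help to see how a file name maps to exactly one emit channel. The following is a minimal Python sketch using the patterns as listed; the helper name `emit_for` and the sample file names are hypothetical and not part of the module.

```python
from fnmatch import fnmatchcase

# Emit name -> glob pattern, copied from the MOSDEPTH outputs table.
PATTERNS = {
    "global_txt": "*.global.dist.txt",
    "summary_txt": "*.summary.txt",
    "regions_txt": "*.region.dist.txt",
    "per_base_d4": "*.per-base.d4",
    "per_base_bed": "*.per-base.bed.gz",
    "per_base_csi": "*.per-base.bed.gz.csi",
    "regions_bed": "*.regions.bed.gz",
    "regions_csi": "*.regions.bed.gz.csi",
    "quantized_bed": "*.quantized.bed.gz",
    "quantized_csi": "*.quantized.bed.gz.csi",
    "thresholds_bed": "*.thresholds.bed.gz",
    "thresholds_csi": "*.thresholds.bed.gz.csi",
}


def emit_for(filename):
    """Return the emit names whose pattern matches the file name."""
    return [name for name, pattern in PATTERNS.items()
            if fnmatchcase(filename, pattern)]


# Hypothetical file names, used only for illustration.
for name in ["s1.regions.bed.gz", "s1.regions.bed.gz.csi", "s1.mosdepth.summary.txt"]:
    print(name, "->", emit_for(name))
```

Because each pattern is anchored at the end of the name, an index file such as `s1.regions.bed.gz.csi` matches only `regions_csi` and not `regions_bed`.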

Authors: @joseespinosa, @drpatelh, @ramprasadn, @matthdsm Maintainers: @joseespinosa, @drpatelh, @ramprasadn, @matthdsm

VCFTOOLS

Defined in modules/nf-core/vcftools/main.nf:1

Keywords: VCFtools, VCF, sort

A set of tools written in Perl and C++ for working with VCF files

Tools

vcftools

A set of tools written in Perl and C++ for working with VCF files. This package only contains the C++ libraries, whereas the package perl-vcftools-vcf contains the Perl libraries.

Homepage | Documentation | biotools:vcftools | License: LGPL

Inputs

Name	Type	Description
`meta`	`map`	Groovy Map containing sample information e.g. [ id:'test', single_end:false ]
`variant_file`	`file`	variant input file which can be vcf, vcf.gz, or bcf format.
`bed`	`file`	bed file which can be used with different arguments in vcftools (optional)
`diff_variant_file`	`file`	secondary variant file which can be used with the 'diff' suite of tools (optional)

Outputs

Name	Type	Emit	Description
`val(meta), path("*.vcf")`	`tuple`	`vcf`	-
`val(meta), path("*.bcf")`	`tuple`	`bcf`	-
`val(meta), path("*.frq")`	`tuple`	`frq`	-
`val(meta), path("*.frq.count")`	`tuple`	`frq_count`	-
`val(meta), path("*.idepth")`	`tuple`	`idepth`	-
`val(meta), path("*.ldepth")`	`tuple`	`ldepth`	-
`val(meta), path("*.ldepth.mean")`	`tuple`	`ldepth_mean`	-
`val(meta), path("*.gdepth")`	`tuple`	`gdepth`	-
`val(meta), path("*.hap.ld")`	`tuple`	`hap_ld`	-
`val(meta), path("*.geno.ld")`	`tuple`	`geno_ld`	-
`val(meta), path("*.geno.chisq")`	`tuple`	`geno_chisq`	-
`val(meta), path("*.list.hap.ld")`	`tuple`	`list_hap_ld`	-
`val(meta), path("*.list.geno.ld")`	`tuple`	`list_geno_ld`	-
`val(meta), path("*.interchrom.hap.ld")`	`tuple`	`interchrom_hap_ld`	-
`val(meta), path("*.interchrom.geno.ld")`	`tuple`	`interchrom_geno_ld`	-
`val(meta), path("*.TsTv")`	`tuple`	`tstv`	-
`val(meta), path("*.TsTv.summary")`	`tuple`	`tstv_summary`	-
`val(meta), path("*.TsTv.count")`	`tuple`	`tstv_count`	-
`val(meta), path("*.TsTv.qual")`	`tuple`	`tstv_qual`	-
`val(meta), path("*.FILTER.summary")`	`tuple`	`filter_summary`	-
`val(meta), path("*.sites.pi")`	`tuple`	`sites_pi`	-
`val(meta), path("*.windowed.pi")`	`tuple`	`windowed_pi`	-
`val(meta), path("*.weir.fst")`	`tuple`	`weir_fst`	-
`val(meta), path("*.het")`	`tuple`	`heterozygosity`	-
`val(meta), path("*.hwe")`	`tuple`	`hwe`	-
`val(meta), path("*.Tajima.D")`	`tuple`	`tajima_d`	-
`val(meta), path("*.ifreqburden")`	`tuple`	`freq_burden`	-
`val(meta), path("*.LROH")`	`tuple`	`lroh`	-
`val(meta), path("*.relatedness")`	`tuple`	`relatedness`	-
`val(meta), path("*.relatedness2")`	`tuple`	`relatedness2`	-
`val(meta), path("*.lqual")`	`tuple`	`lqual`	-
`val(meta), path("*.imiss")`	`tuple`	`missing_individual`	-
`val(meta), path("*.lmiss")`	`tuple`	`missing_site`	-
`val(meta), path("*.snpden")`	`tuple`	`snp_density`	-
`val(meta), path("*.kept.sites")`	`tuple`	`kept_sites`	-
`val(meta), path("*.removed.sites")`	`tuple`	`removed_sites`	-
`val(meta), path("*.singletons")`	`tuple`	`singeltons`	-
`val(meta), path("*.indel.hist")`	`tuple`	`indel_hist`	-
`val(meta), path("*.hapcount")`	`tuple`	`hapcount`	-
`val(meta), path("*.mendel")`	`tuple`	`mendel`	-
`val(meta), path("*.FORMAT")`	`tuple`	`format`	-
`val(meta), path("*.INFO")`	`tuple`	`info`	-
`val(meta), path("*.012")`	`tuple`	`genotypes_matrix`	-
`val(meta), path("*.012.indv")`	`tuple`	`genotypes_matrix_individual`	-
`val(meta), path("*.012.pos")`	`tuple`	`genotypes_matrix_position`	-
`val(meta), path("*.impute.hap")`	`tuple`	`impute_hap`	-
`val(meta), path("*.impute.hap.legend")`	`tuple`	`impute_hap_legend`	-
`val(meta), path("*.impute.hap.indv")`	`tuple`	`impute_hap_indv`	-
`val(meta), path("*.ldhat.sites")`	`tuple`	`ldhat_sites`	-
`val(meta), path("*.ldhat.locs")`	`tuple`	`ldhat_locs`	-
`val(meta), path("*.BEAGLE.GL")`	`tuple`	`beagle_gl`	-
`val(meta), path("*.BEAGLE.PL")`	`tuple`	`beagle_pl`	-
`val(meta), path("*.ped")`	`tuple`	`ped`	-
`val(meta), path("*.map")`	`tuple`	`map_`	-
`val(meta), path("*.tped")`	`tuple`	`tped`	-
`val(meta), path("*.tfam")`	`tuple`	`tfam`	-
`val(meta), path("*.diff.sites_in_files")`	`tuple`	`diff_sites_in_files`	-
`val(meta), path("*.diff.indv_in_files")`	`tuple`	`diff_indv_in_files`	-
`val(meta), path("*.diff.sites")`	`tuple`	`diff_sites`	-
`val(meta), path("*.diff.indv")`	`tuple`	`diff_indv`	-
`val(meta), path("*.diff.discordance.matrix")`	`tuple`	`diff_discd_matrix`	-
`val(meta), path("*.diff.switch")`	`tuple`	`diff_switch_error`	-
`versions.yml`	`path`	`versions`	-

Authors: @Mark-S-Hill Maintainers: @Mark-S-Hill

UNZIP

Defined in modules/nf-core/unzip/main.nf:1

Keywords: unzip, decompression, zip, archiving

Unzip ZIP archive files

Tools

unzip

p7zip is a quick port of 7z.exe and 7za.exe (command line version of 7zip, see www.7-zip.org) for Unix.

Homepage | Documentation | License: LGPL-2.1-or-later

Inputs

Name	Type	Description
`meta`	`map`	Groovy Map containing sample information e.g. [ id:'test', single_end:false ]
`archive`	`file`	ZIP file

Outputs

Name	Type	Pattern	Description
`unzipped_archive`	`directory`	`${archive.baseName}/`	Directory contents of the unzipped archive
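
The output pattern `${archive.baseName}/` means the output directory is named after the archive with its last extension removed. As a minimal sketch of that naming rule in Python (the helper `unzipped_dir` and the example paths are hypothetical):

```python
from pathlib import PurePosixPath


def unzipped_dir(archive):
    """Directory name following the `${archive.baseName}/` output pattern."""
    # stem drops only the final extension, e.g. genome.zip -> genome
    return PurePosixPath(archive).stem + "/"


print(unzipped_dir("reference/genome.zip"))  # genome/
print(unzipped_dir("bundle.v2.zip"))         # bundle.v2/
```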

Authors: @jfy133 Maintainers: @jfy133

YTE

Defined in modules/nf-core/yte/main.nf:1

Keywords: yaml, template, python

A YAML template engine with Python expressions

Tools

yte

A YAML template engine with Python expressions

Homepage | Documentation | License: MIT

Inputs

Name	Type	Description
`meta`	`map`	Groovy Map containing sample information e.g. `[ id:'template1' ]`
`template`	`file`	YTE template
`map_file`	`file`	YAML file containing a map to be used in the template
`map`	`map`	Groovy Map containing mapping information to be used in the template e.g. `[ key: value ]` with key being a wildcard in the template

Outputs

Name	Type	Pattern	Description
`rendered`	`file`	`*.yaml`	Rendered YAML file

Authors: @famosab Maintainers: @famosab

GAWK

Defined in modules/nf-core/gawk/main.nf:1

Keywords: gawk, awk, txt, text, file parsing

If you are like many computer users, you would frequently like to make changes in various text files wherever certain patterns appear, or extract data from parts of certain lines while discarding the rest. The job is easy with awk, especially the GNU implementation gawk.

Tools

gawk

GNU awk

Homepage | Documentation | License: GPL v3

Inputs

Name	Type	Description
`meta`	`map`	Groovy Map containing sample information e.g. [ id:'test', single_end:false ]
`input`	`file`	The input file. Specify the logic to be executed on this file in `ext.args2` or in the program file. If the files have a `.gz` extension, they will be unzipped using `zcat`.
`program_file`	`file`	Optional file containing logic for awk to execute. If you don't wish to use a file, you can use `ext.args2` to specify the logic.
`disable_redirect_output`	`boolean`	Disable the redirection of awk output to a given file. This is useful if you want to use awk's built-in redirect to write files instead of the shell's redirect.

Outputs

Name	Type	Emit	Description
`val(meta)`	`tuple`	-	-

Authors: @nvnieuwk Maintainers: @nvnieuwk

FASTQC

Defined in modules/nf-core/fastqc/main.nf:1

Keywords: quality control, qc, adapters, fastq

Run FastQC on sequenced reads

Tools

fastqc

FastQC gives general quality metrics about your reads. It provides information about the quality score distribution across your reads, the per base sequence content (%A/C/G/T).

You get information about adapter contamination and other overrepresented sequences.

Homepage | Documentation | biotools:fastqc | License: GPL-2.0-only

Inputs

Name	Type	Description
`meta`	`map`	Groovy Map containing sample information e.g. [ id:'test', single_end:false ]
`reads`	`file`	List of input FastQ files of size 1 and 2 for single-end and paired-end data, respectively.
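
The `reads` input is expected to hold one file for single-end data and two for paired-end data, as signalled by `single_end` in `meta`. A minimal Python sketch of that consistency check follows; the function name `check_reads` is hypothetical, and the module itself may not perform such a check.

```python
def check_reads(meta, reads):
    """Return the reads list unchanged if its size matches the layout in meta."""
    expected = 1 if meta.get("single_end", False) else 2
    if len(reads) != expected:
        raise ValueError(
            f"{meta.get('id')}: expected {expected} FastQ file(s), got {len(reads)}"
        )
    return reads


# Example values mirror the meta map shown in the inputs table.
meta = {"id": "test", "single_end": False}
print(check_reads(meta, ["test_1.fastq.gz", "test_2.fastq.gz"]))
```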

Outputs

Name	Type	Pattern	Description
`html`	`file`	`*_{fastqc.html}`	FastQC report
`zip`	`file`	`*_{fastqc.zip}`	FastQC report archive

Authors: @drpatelh, @grst, @ewels, @FelixKrueger Maintainers: @drpatelh, @grst, @ewels, @FelixKrueger

ASCAT

Defined in modules/nf-core/ascat/main.nf:1

Keywords: bam, copy number, cram

Derives copy number profiles of tumour cells.

Tools

ascat

ASCAT is a method to derive copy number profiles of tumour cells, accounting for normal cell admixture and tumour aneuploidy. ASCAT infers tumour purity (the fraction of tumour cells) and ploidy (the amount of DNA per tumour cell), expressed as multiples of haploid genomes from SNP array or massively parallel sequencing data, and calculates whole-genome allele-specific copy number profiles (the number of copies of both parental alleles for all SNP loci across the genome).

Documentation | biotools:ascat | License: GPL v3

Inputs

Name	Type	Description
`meta`	`map`	Groovy Map containing sample information e.g. [ id:'test', single_end:false ]
`input_normal`	`file`	BAM/CRAM file, must adhere to chr1, chr2, ...chrX notation. For modifying chromosome notation in BAM files please follow https://josephcckuo.wordpress.com/2016/11/17/modify-chromosome-notation-in-bam-file/.
`index_normal`	`file`	index for normal_bam/cram
`input_tumor`	`file`	BAM/CRAM file, must adhere to chr1, chr2, ...chrX notation
`index_tumor`	`file`	index for tumor_bam/cram
`allele_files`	`file`	allele files for ASCAT WGS. Can be downloaded here https://github.com/VanLoo-lab/ascat/tree/master/ReferenceFiles/WGS
`loci_files`	`file`	loci files for ASCAT WGS. Loci files without chromosome notation can be downloaded here https://github.com/VanLoo-lab/ascat/tree/master/ReferenceFiles/WGS Make sure the chromosome notation matches the bam/cram input files. To add the chromosome notation to loci files (hg19/hg38) if necessary, you can run this command: `if [[ "$(samtools view <your_bam_file.bam> | head -n1 | cut -f3)" == *"chr"* ]]; then for i in {1..22} X; do sed -i 's/^/chr/' G1000_loci_hg19_chr_${i}.txt; done; fi`
`bed_file`	`file`	Bed file for ASCAT WES (optional, but recommended for WES)
`fasta`	`file`	Reference fasta file (optional)
`gc_file`	`file`	GC correction file (optional) - Used to do logR correction of the tumour sample(s) with genomic GC content
`rt_file`	`file`	replication timing correction file (optional, provide only in combination with gc_file)
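
The `sed` command in the `loci_files` description prepends `chr` to every line of a loci file. The same transformation can be sketched in Python as below, assuming each line starts with the chromosome name. The helper `add_chr_prefix` is hypothetical, and unlike the `sed` call it leaves lines that already start with `chr` untouched, so it is safe to run twice.

```python
def add_chr_prefix(lines):
    """Prepend 'chr' to each non-empty line that does not already start with it."""
    fixed = []
    for line in lines:
        if line and not line.startswith("chr"):
            fixed.append("chr" + line)
        else:
            fixed.append(line)
    return fixed


# Made-up loci lines (chromosome, position) for illustration only.
loci = ["1\t10583", "X\t60425", "chr2\t10133"]
print(add_chr_prefix(loci))
```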

Outputs

Name	Type	Emit	Description
`val(meta), path("alleleFrequencies_chr.txt")`	`tuple`	`allelefreqs`	-
`val(meta), path("*BAF.txt")`	`tuple`	`bafs`	-
`val(meta), path("*cnvs.txt")`	`tuple`	`cnvs`	-
`val(meta), path("*LogR.txt")`	`tuple`	`logrs`	-
`val(meta), path("*metrics.txt")`	`tuple`	`metrics`	-
`val(meta), path("*png")`	`tuple`	`png`	-
`val(meta), path("*purityploidy.txt")`	`tuple`	`purityploidy`	-
`val(meta), path("*segments.txt")`	`tuple`	`segments`	-
`versions.yml`	`path`	`versions`	-

Authors: @aasNGC, @lassefolkersen, @FriederikeHanssen, @maxulysse, @SusiJo Maintainers: @aasNGC, @lassefolkersen, @FriederikeHanssen, @maxulysse, @SusiJo

FREEBAYES

Defined in modules/nf-core/freebayes/main.nf:1

Keywords: variant caller, SNP, genotyping, somatic variant calling, germline variant calling, bacterial variant calling, bayesian

A haplotype-based variant detector

Tools

freebayes

Bayesian haplotype-based polymorphism discovery and genotyping

Homepage | Documentation | biotools:freebayes | License: MIT

Inputs

Name	Type	Description
`meta`	`map`	Groovy Map containing sample information e.g. [ id:'test', single_end:false ]
`input_1`	`file`	BAM/CRAM/SAM file
`input_1_index`	`file`	BAM/CRAM/SAM index file
`input_2`	`file`	BAM/CRAM/SAM file
`input_2_index`	`file`	BAM/CRAM/SAM index file
`target_bed`	`file`	Optional - Limit analysis to targets listed in this BED-format FILE.
`meta2`	`map`	Groovy Map containing reference information. e.g. [ id:'test_reference' ]
`fasta`	`file`	reference fasta file
`meta3`	`map`	Groovy Map containing reference information. e.g. [ id:'test_reference' ]
`fasta_fai`	`file`	reference fasta file index
`meta4`	`map`	Groovy Map containing meta information for the samples file. e.g. [ id:'test_samples' ]
`samples`	`file`	Optional - Limit analysis to samples listed (one per line) in the FILE.
`meta5`	`map`	Groovy Map containing meta information for the populations file. e.g. [ id:'test_populations' ]
`populations`	`file`	Optional - Each line of FILE should list a sample and a population which it is part of.
`meta6`	`map`	Groovy Map containing meta information for the cnv file. e.g. [ id:'test_cnv' ]
`cnv`	`file`	A copy number map BED file, which has either a sample-level ploidy: sample_name copy_number or a region-specific format: seq_name start end sample_name copy_number
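
The `cnv` description allows two line layouts: a sample-level ploidy (`sample_name copy_number`) or a region-specific entry (`seq_name start end sample_name copy_number`). As a minimal sketch of how the two can be told apart, the Python parser below assumes whitespace-separated fields and integer values; `parse_cnv_line` is a hypothetical helper, not part of FreeBayes or the module.

```python
def parse_cnv_line(line):
    """Parse one line of a copy number map into a dict.

    Accepted layouts (whitespace-separated):
      sample_name copy_number
      seq_name start end sample_name copy_number
    """
    fields = line.split()
    if len(fields) == 2:
        sample, copies = fields
        return {"sample": sample, "copy_number": int(copies)}
    if len(fields) == 5:
        seq, start, end, sample, copies = fields
        return {
            "seq": seq,
            "start": int(start),
            "end": int(end),
            "sample": sample,
            "copy_number": int(copies),
        }
    raise ValueError(f"unexpected number of fields: {len(fields)}")


# Made-up example lines.
print(parse_cnv_line("sampleA 2"))
print(parse_cnv_line("chrX 0 155000000 sampleA 1"))
```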

Outputs

Name	Type	Pattern	Description
`vcf`	`file`	`*.vcf.gz`	Compressed VCF file

Authors: @maxibor, @FriederikeHanssen, @maxulysse Maintainers: @maxibor, @FriederikeHanssen, @maxulysse

FASTP

Defined in modules/nf-core/fastp/main.nf:1

Keywords: trimming, quality control, fastq

Perform adapter/quality trimming on sequencing reads

Tools

fastp

A tool designed to provide fast all-in-one preprocessing for FastQ files. This tool is developed in C++ with multithreading supported to afford high performance.

Documentation | biotools:fastp | License: MIT

Inputs

Name	Type	Description
`meta`	`map`	Groovy Map containing sample information. Use 'single_end: true' to specify single ended or interleaved FASTQs. Use 'single_end: false' for paired-end reads. e.g. [ id:'test', single_end:false ]
`reads`	`file`	List of input FastQ files of size 1 and 2 for single-end and paired-end data, respectively. If you wish to run interleaved paired-end data, supply as single-end data but with `--interleaved_in` in your `modules.conf`'s `ext.args` for the module.
`adapter_fasta`	`file`	File in FASTA format containing possible adapters to remove.
`discard_trimmed_pass`	`boolean`	Specify true to not write any reads that pass trimming thresholds. This can be used to use fastp for the output report only.
`save_trimmed_fail`	`boolean`	Specify true to save files that failed to pass trimming thresholds ending in `*.fail.fastq.gz`
`save_merged`	`boolean`	Specify true to save all merged reads to a file ending in `*.merged.fastq.gz`

Outputs

Name	Type	Emit	Description
`val(meta), path('*.fastp.fastq.gz')`	`tuple`	`reads`	-
`val(meta), path('*.json')`	`tuple`	`json`	-
`val(meta), path('*.html')`	`tuple`	`html`	-
`val(meta), path('*.log')`	`tuple`	`log`	-
`val(meta), path('*.fail.fastq.gz')`	`tuple`	`reads_fail`	-
`val(meta), path('*.merged.fastq.gz')`	`tuple`	`reads_merged`	-
`versions.yml`	`path`	`versions`	-
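
The three boolean inputs determine which of the read outputs listed above are written. The Python sketch below maps the flags to the expected output patterns, based only on the input descriptions. It assumes the JSON, HTML and log reports are always produced, and the function name `expected_patterns` is hypothetical.

```python
def expected_patterns(discard_trimmed_pass=False, save_trimmed_fail=False,
                      save_merged=False):
    """List the output glob patterns expected for a given flag combination."""
    patterns = ["*.json", "*.html", "*.log"]  # assumed to be always written
    if not discard_trimmed_pass:
        patterns.append("*.fastp.fastq.gz")   # reads that passed trimming
    if save_trimmed_fail:
        patterns.append("*.fail.fastq.gz")    # reads that failed trimming
    if save_merged:
        patterns.append("*.merged.fastq.gz")  # merged reads
    return patterns


# Report-only run: no passing reads are written.
print(expected_patterns(discard_trimmed_pass=True))
```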

Authors: @drpatelh, @kevinmenden Maintainers: @drpatelh, @kevinmenden

GUNZIP

Defined in modules/nf-core/gunzip/main.nf:1

Keywords: gunzip, compression, decompression

Compresses and decompresses files.

Tools

gunzip

gzip is a file format and a software application used for file compression and decompression.

Documentation | License: GPL-3.0-or-later

Inputs

Name	Type	Description
`meta`	`map`	Optional Groovy Map containing meta information e.g. [ id:'test', single_end:false ]
`archive`	`file`	File to be compressed/uncompressed

Outputs

Name	Type	Emit	Description
`val(meta)`	`tuple`	-	-

Authors: @joseespinosa, @drpatelh, @jfy133 Maintainers: @joseespinosa, @drpatelh, @jfy133, @gallvp

SPRING_DECOMPRESS

Defined in modules/nf-core/spring/decompress/main.nf:1

Keywords: FASTQ, decompression, lossless

Fast, efficient, lossless decompression of FASTQ files.

Tools

spring

SPRING is a compression tool for FASTQ files (containing up to 4.29 billion reads)

Homepage | Documentation | biotools:spring | License: Free for non-commercial use

Inputs

Name	Type	Description
`meta`	`map`	Groovy Map containing sample information e.g. [ id:'test', single_end:false ]
`spring`	`file`	Spring file to decompress.
`write_one_fastq_gz`	`boolean`	Controls whether spring should write one fastq.gz file with reads from both directions or two fastq.gz files with reads from distinct directions

Outputs

Name	Type	Emit	Description
`val(meta), path("*.fastq.gz")`	`tuple`	`fastq`	-
`versions.yml`	`path`	`versions`	-

Authors: @xec-cm Maintainers: @xec-cm

LOFREQ_CALLPARALLEL

Defined in modules/nf-core/lofreq/callparallel/main.nf:1

Keywords: variant calling, low frequency variant calling, call, variants

It predicts variants using multiple processors

Tools

lofreq

Lofreq is a fast and sensitive variant-caller for inferring SNVs and indels from next-generation sequencing data. Its call-parallel programme predicts variants using multiple processors.

Homepage | Documentation | biotools:lofreq | License: MIT

Inputs

Name	Type	Description
`meta`	`map`	Groovy Map containing sample information e.g. [ id:'test' ]
`bam`	`file`	Tumor sample sorted BAM file
`bai`	`file`	BAM index file
`intervals`	`file`	BED file containing target regions for variant calling
`meta2`	`map`	Groovy Map containing sample information about the reference fasta e.g. [ id:'reference' ]
`fasta`	`file`	Reference genome FASTA file
`meta3`	`map`	Groovy Map containing sample information about the reference fasta fai e.g. [ id:'reference' ]
`fai`	`file`	Reference genome FASTA index file

Outputs

Name	Type	Emit	Description
`val(meta), path("*.vcf.gz")`	`tuple`	`vcf`	-
`val(meta), path("*.vcf.gz.tbi")`	`tuple`	`tbi`	-
`versions.yml`	`path`	`versions`	-

Authors: @kaurravneet4123, @bjohnnyd Maintainers: @kaurravneet4123, @bjohnnyd, @nevinwu, @AitorPeseta

GATK4_INTERVALLISTTOBED

Defined in modules/nf-core/gatk4/intervallisttobed/main.nf:1

Keywords: bed, conversion, gatk4, interval

Converts a Picard IntervalList file to a BED file.

Tools

gatk4

Genome Analysis Toolkit (GATK4)

Homepage | Documentation | License: BSD-3-clause

Inputs

Name	Type	Description
`meta`	`map`	Groovy Map containing sample information e.g. [ id:'test', single_end:false ]
`intervals`	`file`	IntervalList file

Outputs

Name	Type	Emit	Description
`val(meta)`	`tuple`	-	-

Authors: @FriederikeHanssen Maintainers: @FriederikeHanssen

GATK4_CALCULATECONTAMINATION

Defined in modules/nf-core/gatk4/calculatecontamination/main.nf:1

Keywords: gatk4, calculatecontamination, cross-samplecontamination, getpileupsummaries, filtermutectcalls

Calculates the fraction of reads from cross-sample contamination based on summary tables from getpileupsummaries. Output to be used with filtermutectcalls.

Tools

gatk4

Developed in the Data Sciences Platform at the Broad Institute, the toolkit offers a wide variety of tools with a primary focus on variant discovery and genotyping. Its powerful processing engine and high-performance computing features make it capable of taking on projects of any size.

Homepage | Documentation | License: Apache-2.0

Inputs

Name	Type	Description
`meta`	`map`	Groovy Map containing sample information e.g. [ id:'test' ]
`pileup`	`file`	File containing the pileups summary table of a tumor sample to be used to calculate contamination.
`matched`	`file`	File containing the pileups summary table of a normal sample that matches with the tumor sample specified in pileup argument. This is an optional input.

Outputs

Name	Type	Emit	Description
`val(meta), path('*.contamination.table')`	`tuple`	`contamination`	-
`val(meta), path('*.segmentation.table')`	`tuple`	`segmentation`	-
`versions.yml`	`path`	`versions`	-

Authors: @GCJMackenzie, @maxulysse Maintainers: @GCJMackenzie, @maxulysse

GATK4_FILTERMUTECTCALLS

Defined in modules/nf-core/gatk4/filtermutectcalls/main.nf:1

Keywords: filtermutectcalls, filter, gatk4, mutect2, vcf

Filters the raw output of mutect2. Can optionally use the outputs of calculatecontamination and learnreadorientationmodel to improve filtering.

Tools

gatk4

Developed in the Data Sciences Platform at the Broad Institute, the toolkit offers a wide variety of tools with a primary focus on variant discovery and genotyping. Its powerful processing engine and high-performance computing features make it capable of taking on projects of any size.

Homepage | Documentation

Inputs

Name	Type	Description
`meta`	`map`	Groovy Map containing sample information e.g. [ id:'test' ]
`vcf`	`file`	compressed vcf file of mutect2 calls
`vcf_tbi`	`file`	Tabix index of vcf file
`stats`	`file`	Stats file that pairs with output vcf file
`orientationbias`	`file`	files containing artifact priors for input vcf. Optional input.
`segmentation`	`file`	tables containing segmentation information for input vcf. Optional input.
`table`	`file`	table(s) containing contamination data for input vcf. Optional input, takes priority over estimate.
`estimate`	`float`	estimation of contamination value as a double. Optional input, will only be used if table is not specified.
`meta2`	`map`	Groovy Map containing reference information e.g. [ id:'genome' ]
`fasta`	`file`	The reference fasta file
`meta3`	`map`	Groovy Map containing reference information e.g. [ id:'genome' ]
`fai`	`file`	Index of reference fasta file
`meta4`	`map`	Groovy Map containing reference information e.g. [ id:'genome' ]
`dict`	`file`	GATK sequence dictionary
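
The `table` and `estimate` inputs are alternatives: a contamination table takes priority, and the numeric estimate is used only when no table is given. A minimal Python sketch of that precedence follows. It uses the GATK FilterMutectCalls options `--contamination-table` and `--contamination-estimate`; the function name `contamination_args` is hypothetical and the module's actual command construction may differ.

```python
def contamination_args(table=None, estimate=None):
    """Build contamination arguments; a table wins over an estimate."""
    if table:
        # Accept a single path or a list of paths.
        tables = table if isinstance(table, (list, tuple)) else [table]
        args = []
        for path in tables:
            args += ["--contamination-table", str(path)]
        return args
    if estimate is not None:
        return ["--contamination-estimate", str(estimate)]
    return []


# Both supplied: only the table is used.
print(contamination_args(table="tumor.contamination.table", estimate=0.02))
# No table: fall back to the estimate.
print(contamination_args(estimate=0.02))
```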

Outputs

Name	Type	Emit	Description
`val(meta), path("*.vcf.gz")`	`tuple`	`vcf`	-
`val(meta), path("*.vcf.gz.tbi")`	`tuple`	`tbi`	-
`val(meta), path("*.filteringStats.tsv")`	`tuple`	`stats`	-
`versions.yml`	`path`	`versions`	-

Authors: @GCJMackenzie, @maxulysse, @ramprasadn Maintainers: @GCJMackenzie, @maxulysse, @ramprasadn

GATK4_APPLYVQSR

Defined in modules/nf-core/gatk4/applyvqsr/main.nf:1

Keywords: gatk4, variant quality score recalibration, vcf, vqsr

Apply a score cutoff to filter variants based on a recalibration table. ApplyVQSR performs the second pass in a two-stage process called Variant Quality Score Recalibration (VQSR). Specifically, it applies filtering to the input variants based on the recalibration table produced in the first step by VariantRecalibrator and a target sensitivity value.

Tools

gatk4

Developed in the Data Sciences Platform at the Broad Institute, the toolkit offers a wide variety of tools with a primary focus on variant discovery and genotyping. Its powerful processing engine and high-performance computing features make it capable of taking on projects of any size.

Homepage | Documentation | License: Apache-2.0

Inputs

Name	Type	Description
`meta`	`map`	Groovy Map containing sample information e.g. [ id:'test']
`vcf`	`file`	VCF file to be recalibrated, this should be the same file as used for the first stage VariantRecalibrator.
`vcf_tbi`	`file`	tabix index for the input vcf file.
`recal`	`file`	Recalibration file produced when the input vcf was run through VariantRecalibrator in stage 1.
`recal_index`	`file`	Index file for the recalibration file.
`tranches`	`file`	Tranches file produced when the input vcf was run through VariantRecalibrator in stage 1.
`fasta`	`file`	The reference fasta file
`fai`	`file`	Index of reference fasta file
`dict`	`file`	GATK sequence dictionary

Outputs

Name	Type	Emit	Description
`val(meta), path("*.vcf.gz")`	`tuple`	`vcf`	-
`val(meta), path("*.tbi")`	`tuple`	`tbi`	-
`versions.yml`	`path`	`versions`	-

Authors: @GCJMackenzie Maintainers: @GCJMackenzie

GATK4_GENOMICSDBIMPORT

Defined in modules/nf-core/gatk4/genomicsdbimport/main.nf:1

Keywords: gatk4, genomicsdb, genomicsdbimport, jointgenotyping, panelofnormalscreation

Merge GVCFs from multiple samples. For use in joint genotyping or somatic panel of normal creation.

Tools

gatk4

Developed in the Data Sciences Platform at the Broad Institute, the toolkit offers a wide variety of tools with a primary focus on variant discovery and genotyping. Its powerful processing engine and high-performance computing features make it capable of taking on projects of any size.

Homepage | Documentation

Inputs

Name	Type	Description
`meta`	`map`	Groovy Map containing sample information e.g. [ id:'test']
`vcf`	`list`	either a list of vcf files to be used to create or update a genomicsdb, or a file that contains a map to vcf files to be used.
`tbi`	`list`	list of tbi files that match with the input vcf files
`interval_file`	`file`	file containing the intervals to be used when creating the genomicsdb
`interval_value`	`string`	if an intervals file has not been specified, the value entered here will be used as an interval via the "-L" argument
`wspace`	`file`	path to an existing genomicsdb to be used in update db mode or get intervals mode. This WILL NOT specify name of a new genomicsdb in create db mode.
`run_intlist`	`boolean`	Specify whether to run get interval list mode, this option cannot be specified at the same time as run_updatewspace.
`run_updatewspace`	`boolean`	Specify whether to run update genomicsdb mode, this option takes priority over run_intlist.
`input_map`	`boolean`	Specify whether the vcf input is providing a list of vcf file(s) or a single file containing a map of paths to vcf files to be used to create or update a genomicsdb.
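
The boolean inputs select one of three modes (create a database, update an existing one, or get the interval list), and the interval can come from either a file or a literal value. The Python sketch below illustrates the precedence described in the table: `run_updatewspace` wins over `run_intlist`, and `interval_file` wins over `interval_value`. The function names and mode labels are hypothetical, and only the `-L` argument is taken from the description.

```python
def select_mode(run_intlist=False, run_updatewspace=False):
    """Pick one of three modes; the update mode wins if both flags are set."""
    if run_updatewspace:
        return "update"
    if run_intlist:
        return "intervals"
    return "create"


def interval_argument(interval_file=None, interval_value=None):
    """Prefer the intervals file; fall back to the literal interval value."""
    chosen = interval_file or interval_value
    return ["-L", str(chosen)] if chosen else []


print(select_mode(run_intlist=True, run_updatewspace=True))
print(interval_argument(interval_value="chr20:1-1000000"))
```

Since the table also states that the two flags should not be set together, a stricter variant could raise an error in that case instead of silently preferring the update mode.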

Outputs

Name	Type	Emit	Description
`val(meta)`	`tuple`	-	-

Authors: @GCJMackenzie Maintainers: @GCJMackenzie

GATK4_LEARNREADORIENTATIONMODEL

Defined in modules/nf-core/gatk4/learnreadorientationmodel/main.nf:1

Keywords: gatk4, learnreadorientationmodel, mutect2, readorientationartifacts

Uses f1r2 counts collected during mutect2 to learn the prior probability of read orientation artifacts

Tools

gatk4

Developed in the Data Sciences Platform at the Broad Institute, the toolkit offers a wide variety of tools with a primary focus on variant discovery and genotyping. Its powerful processing engine and high-performance computing features make it capable of taking on projects of any size.

Homepage | Documentation | License: Apache-2.0

Inputs

Name	Type	Description
`meta`	`map`	Groovy Map containing sample information e.g. [ id:'test' ]
`f1r2`	`list`	list of f1r2 files to be used as input.

Outputs

Name	Type	Emit	Description
`val(meta), path("*.tar.gz")`	`tuple`	`artifactprior`	-
`versions.yml`	`path`	`versions`	-

Authors: @GCJMackenzie Maintainers: @GCJMackenzie

GATK4_VARIANTRECALIBRATOR

Defined in modules/nf-core/gatk4/variantrecalibrator/main.nf:1

Keywords: gatk4, recalibration model, variantrecalibrator

Build a recalibration model to score variant quality for filtering purposes. It is highly recommended to follow GATK best practices when using this module: the Gaussian mixture model requires a large number of samples to produce optimal results (for example, 30 samples for exome data). For more details see https://gatk.broadinstitute.org/hc/en-us/articles/4402736812443-Which-training-sets-arguments-should-I-use-for-running-VQSR-

Tools

gatk4

Developed in the Data Sciences Platform at the Broad Institute, the toolkit offers a wide variety of tools with a primary focus on variant discovery and genotyping. Its powerful processing engine and high-performance computing features make it capable of taking on projects of any size.

Homepage | Documentation

Inputs

Name	Type	Description
`meta`	`map`	Groovy Map containing sample information e.g. [ id:'test' ]
`vcf`	`file`	input vcf file containing the variants to be recalibrated
`tbi`	`file`	Tabix index (tbi) matching the input vcf
`resource_vcf`	`file`	all resource vcf files that are used with the corresponding '--resource' label
`resource_tbi`	`file`	all resource tbi files that are used with the corresponding '--resource' label
`labels`	`string`	necessary arguments for GATK VariantRecalibrator. Specified to directly match the resources provided. More information can be found at https://gatk.broadinstitute.org/hc/en-us/articles/5358906115227-VariantRecalibrator
`fasta`	`file`	The reference fasta file
`fai`	`file`	Index of reference fasta file
`dict`	`file`	GATK sequence dictionary
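
Each entry in `labels` has to line up with one of the resource VCFs. The Python sketch below shows one way such strings could be composed in the `--resource:<name>,known=...,training=...,truth=...,prior=... <file>` form that VariantRecalibrator accepts. The helper name, the file names and the prior values are illustrative assumptions, not values prescribed by this module.

```python
def resource_label(name: str, filename: str, known: bool,
                   training: bool, truth: bool, prior: float) -> str:
    """Compose one VariantRecalibrator resource argument (illustrative only)."""
    # GATK expects lowercase true/false in the resource tags.
    flags = ",".join([
        f"known={str(known).lower()}",
        f"training={str(training).lower()}",
        f"truth={str(truth).lower()}",
        f"prior={prior}",
    ])
    return f"--resource:{name},{flags} {filename}"


# Hypothetical resources; choose real ones following the GATK guidance linked above.
labels = [
    resource_label("hapmap", "hapmap.vcf.gz", False, True, True, 15.0),
    resource_label("dbsnp", "dbsnp.vcf.gz", True, False, False, 2.0),
]
for label in labels:
    print(label)
```

Keeping the file name inside the label string makes the pairing between `labels` and `resource_vcf` explicit, which is the matching the `labels` description asks for.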

Outputs

Name	Type	Emit	Description
`val(meta), path("*.recal")`	`tuple`	`recal`	-
`val(meta), path("*.idx")`	`tuple`	`idx`	-
`val(meta), path("*.tranches")`	`tuple`	`tranches`	-
`val(meta), path("*plots.R")`	`tuple`	`plots`	-
`versions.yml`	`path`	`versions`	-

Authors: @GCJMackenzie, @nickhsmith Maintainers: @GCJMackenzie, @nickhsmith

GATK4_GATHERBQSRREPORTS

Defined in modules/nf-core/gatk4/gatherbqsrreports/main.nf:1

Keywords: base quality score recalibration, bqsr, gatherbqsrreports, gatk4

Gathers scattered BQSR recalibration reports into a single file

Tools

gatk4

Genome Analysis Toolkit (GATK4)

Homepage | Documentation | License: BSD-3-clause

Inputs

Name	Type	Description
`meta`	`map`	Groovy Map containing sample information e.g. [ id:'test', single_end:false ]
`table`	`file`	File(s) containing BQSR table(s)

Outputs

Name	Type	Emit	Description
`val(meta), path("*.table")`	`tuple`	`table`	-
`versions.yml`	`path`	`versions`	-

Authors: @FriederikeHanssen Maintainers: @FriederikeHanssen

GATK4_GETPILEUPSUMMARIES

Defined in modules/nf-core/gatk4/getpileupsummaries/main.nf:1

Keywords: gatk4, germlinevariantsites, getpileupsummaries, readcountssummary

Summarizes counts of reads that support reference, alternate and other alleles for given sites. Results can be used with CalculateContamination. Requires a common germline variant sites file, such as from gnomAD.

Tools

gatk4

Developed in the Data Sciences Platform at the Broad Institute, the toolkit offers a wide variety of tools with a primary focus on variant discovery and genotyping. Its powerful processing engine and high-performance computing features make it capable of taking on projects of any size.

Homepage | Documentation | License: Apache-2.0

Inputs

Name	Type	Description
`meta`	`map`	Groovy Map containing sample information e.g. [ id:'test' ]
`input`	`file`	BAM/CRAM file to be summarised.
`index`	`file`	Index file for the input BAM/CRAM file.
`intervals`	`file`	File containing specified sites to be used for the summary. If this option is not specified, variants file is used instead automatically.
`meta2`	`map`	Groovy Map containing reference information e.g. [ id:'genome' ]
`fasta`	`file`	The reference fasta file
`meta3`	`map`	Groovy Map containing reference information e.g. [ id:'genome' ]
`fai`	`file`	Index of reference fasta file
`meta4`	`map`	Groovy Map containing reference information e.g. [ id:'genome' ]
`dict`	`file`	GATK sequence dictionary
`variants`	`file`	Population vcf of germline sequencing, containing allele fractions. Is also used as sites file if no separate sites file is specified.
`variants_tbi`	`file`	Index file for the germline resource.

Outputs

Name	Type	Emit	Description
`val(meta), path('*.pileups.table')`	`tuple`	`table`	-
`versions.yml`	`path`	`versions`	-
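
The fallback described for `intervals` (use the `variants` file as the sites file when no separate sites file is given) can be sketched in Python as follows. The function name and file names are hypothetical, and the command line is only an assumption about how the inputs map onto `gatk GetPileupSummaries`.

```python
from typing import List, Optional


def get_pileup_summaries_cmd(bam: str, variants: str, prefix: str,
                             intervals: Optional[str] = None) -> List[str]:
    """Assemble a GetPileupSummaries command line (illustrative only)."""
    # Fall back to the variants file as the sites file when no intervals are given.
    sites = intervals if intervals else variants
    return [
        "gatk", "GetPileupSummaries",
        "-I", bam,
        "-V", variants,
        "-L", sites,
        "-O", f"{prefix}.pileups.table",
    ]


print(" ".join(get_pileup_summaries_cmd("tumor.cram", "gnomad.vcf.gz", "tumor")))
```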

Authors: @GCJMackenzie Maintainers: @GCJMackenzie

GATK4_GENOTYPEGVCFS

Defined in modules/nf-core/gatk4/genotypegvcfs/main.nf:1

Keywords: gatk4, genotype, gvcf, joint genotyping

Perform joint genotyping on one or more samples pre-called with HaplotypeCaller.

Tools

gatk4

Genome Analysis Toolkit (GATK4)

Homepage | Documentation | License: BSD-3-clause

Inputs

Name	Type	Description
`meta`	`map`	Groovy Map containing sample information e.g. [ id:'test', single_end:false ]
`input`	`file`	gVCF(.gz) file or a GenomicsDB
`gvcf_index`	`file`	index of gvcf file, or empty when providing GenomicsDB
`intervals`	`file`	Interval file with the genomic regions included in the library (optional)
`intervals_index`	`file`	Interval index file (optional)
`meta2`	`map`	Groovy Map containing fasta information e.g. [ id:'test' ]
`fasta`	`file`	Reference fasta file
`meta3`	`map`	Groovy Map containing fai information e.g. [ id:'test' ]
`fai`	`file`	Reference fasta index file
`meta4`	`map`	Groovy Map containing dict information e.g. [ id:'test' ]
`dict`	`file`	Reference fasta sequence dict file
`meta5`	`map`	Groovy Map containing dbsnp information e.g. [ id:'test' ]
`dbsnp`	`file`	dbSNP VCF file
`meta6`	`map`	Groovy Map containing dbsnp tbi information e.g. [ id:'test' ]
`dbsnp_tbi`	`file`	dbSNP VCF index file

Outputs

Name	Type	Emit	Description
`val(meta), path("*.vcf.gz")`	`tuple`	`vcf`	-
`val(meta), path("*.tbi")`	`tuple`	`tbi`	-
`versions.yml`	`path`	`versions`	-
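
Because `input` may be either a gVCF or a GenomicsDB workspace, the value passed to `-V` differs between the two cases: GATK addresses a GenomicsDB workspace with a `gendb://` prefix. The Python sketch below illustrates that distinction. Detecting the input type by file extension is an assumption made for this example, and the function names and paths are hypothetical.

```python
from typing import List


def variant_argument(input_path: str) -> str:
    """Return the value for -V: a plain path for (g)VCFs, gendb:// otherwise."""
    is_vcf = input_path.endswith((".vcf", ".vcf.gz"))
    return input_path if is_vcf else f"gendb://{input_path}"


def genotype_gvcfs_cmd(input_path: str, fasta: str, prefix: str) -> List[str]:
    """Assemble a GenotypeGVCFs command line (illustrative only)."""
    return [
        "gatk", "GenotypeGVCFs",
        "-V", variant_argument(input_path),
        "-R", fasta,
        "-O", f"{prefix}.vcf.gz",
    ]


print(" ".join(genotype_gvcfs_cmd("test.g.vcf.gz", "genome.fasta", "test")))
print(" ".join(genotype_gvcfs_cmd("test_genomicsdb", "genome.fasta", "test")))
```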

Authors: @santiagorevale, @maxulysse Maintainers: @santiagorevale, @maxulysse

GATK4_CREATESEQUENCEDICTIONARY

Defined in modules/nf-core/gatk4/createsequencedictionary/main.nf:1

Keywords: createsequencedictionary, dictionary, fasta, gatk4

Creates a sequence dictionary for a reference sequence

Tools

gatk

Developed in the Data Sciences Platform at the Broad Institute, the toolkit offers a wide variety of tools with a primary focus on variant discovery and genotyping. Its powerful processing engine and high-performance computing features make it capable of taking on projects of any size.

Homepage | Documentation | License: Apache-2.0

Inputs

Name	Type	Description
`meta`	`map`	Groovy Map containing reference information e.g. [ id:'genome' ]
`fasta`	`file`	Input fasta file

Outputs

Name	Type	Pattern	Description
`dict`	`file`	`*.{dict}`	gatk dictionary file

Authors: @maxulysse, @ramprasadn Maintainers: @maxulysse, @ramprasadn

GATK4_GATHERPILEUPSUMMARIES

Defined in modules/nf-core/gatk4/gatherpileupsummaries/main.nf:1

Keywords: gatk4, mpileup, sort

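Gathers the pileup summary tables produced by gatk4/getpileupsummaries into a single table, ordered according to the sequence dictionary.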

Tools

gatk4

Genome Analysis Toolkit (GATK4)

Homepage | Documentation | License: BSD-3-clause

Inputs

Name	Type	Description
`meta`	`map`	Groovy Map containing sample information e.g. [ id:'test', single_end:false ]
`pileup`	`file`	Pileup files from gatk4/getpileupsummaries
`dict`	`file`	dictionary

Outputs

Name	Type	Emit	Description
`val(meta), path("*.pileups.table")`	`tuple`	`table`	-
`versions.yml`	`path`	`versions`	-

Authors: @FriederikeHanssen, @maxulysse Maintainers: @FriederikeHanssen, @maxulysse

GATK4_ESTIMATELIBRARYCOMPLEXITY

Defined in modules/nf-core/gatk4/estimatelibrarycomplexity/main.nf:1

Keywords: duplication metrics, estimatelibrarycomplexity, gatk4, reporting

Estimates the numbers of unique molecules in a sequencing library.

Tools

gatk4

Genome Analysis Toolkit (GATK4)

Homepage | Documentation | License: Apache-2.0

Inputs

Name	Type	Description
`meta`	`map`	Groovy Map containing sample information e.g. [ id:'test', single_end:false ]
`input`	`file`	BAM/CRAM/SAM file
`fasta`	`file`	The reference fasta file
`fai`	`file`	Index of reference fasta file
`dict`	`file`	GATK sequence dictionary

Outputs

Name	Type	Emit	Description
`val(meta), path('*.metrics')`	`tuple`	`metrics`	-
`versions.yml`	`path`	`versions`	-

Authors: @FriederikeHanssen, @maxulysse Maintainers: @FriederikeHanssen, @maxulysse

GATK4_HAPLOTYPECALLER

Defined in modules/nf-core/gatk4/haplotypecaller/main.nf:1

Keywords: gatk4, haplotype, haplotypecaller

Call germline SNPs and indels via local re-assembly of haplotypes

Tools

gatk4

Developed in the Data Sciences Platform at the Broad Institute, the toolkit offers a wide variety of tools with a primary focus on variant discovery and genotyping. Its powerful processing engine and high-performance computing features make it capable of taking on projects of any size.

Homepage | Documentation | License: Apache-2.0

Inputs

Name	Type	Description
`meta`	`map`	Groovy Map containing sample information e.g. [ id:'test', single_end:false ]
`input`	`file`	BAM/CRAM file from alignment
`input_index`	`file`	BAI/CRAI file from alignment
`intervals`	`file`	Bed file with the genomic regions included in the library (optional)
`dragstr_model`	`file`	Text file containing the DragSTR model of the used BAM/CRAM file (optional)
`meta2`	`map`	Groovy Map containing reference information e.g. [ id:'test_reference' ]
`fasta`	`file`	The reference fasta file
`meta3`	`map`	Groovy Map containing reference information e.g. [ id:'test_reference' ]
`fai`	`file`	Index of reference fasta file
`meta4`	`map`	Groovy Map containing reference information e.g. [ id:'test_reference' ]
`dict`	`file`	GATK sequence dictionary
`meta5`	`map`	Groovy Map containing dbsnp information e.g. [ id:'test_dbsnp' ]
`dbsnp`	`file`	VCF file containing known sites (optional)
`meta6`	`map`	Groovy Map containing dbsnp information e.g. [ id:'test_dbsnp' ]
`dbsnp_tbi`	`file`	VCF index of dbsnp (optional)

Outputs

Name	Type	Emit	Description
`val(meta), path("*.vcf.gz")`	`tuple`	`vcf`	-
`val(meta), path("*.tbi")`	`tuple`	`tbi`	-
`val(meta), path("*.realigned.bam")`	`tuple`	`bam`	-
`versions.yml`	`path`	`versions`	-

Authors: @suzannejin, @FriederikeHanssen Maintainers: @suzannejin, @FriederikeHanssen

GATK4_CNNSCOREVARIANTS

Defined in modules/nf-core/gatk4/cnnscorevariants/main.nf:1

Keywords: cnnscorevariants, gatk4, variants

Apply a Convolutional Neural Net to filter annotated variants

Tools

gatk4

Developed in the Data Sciences Platform at the Broad Institute, the toolkit offers a wide variety of tools with a primary focus on variant discovery and genotyping. Its powerful processing engine and high-performance computing features make it capable of taking on projects of any size.

Homepage | Documentation | License: Apache-2.0

Inputs

Name	Type	Description
`meta`	`map`	Groovy Map containing sample information e.g. [ id:'test', single_end:false ]
`vcf`	`file`	VCF file
`tbi`	`file`	VCF index file
`aligned_input`	`file`	BAM/CRAM file from alignment (optional)
`intervals`	`file`	Bed file with the genomic regions included in the library (optional)
`fasta`	`file`	The reference fasta file
`fai`	`file`	Index of reference fasta file
`dict`	`file`	GATK sequence dictionary
`architecture`	`file`	Neural Net architecture configuration json file (optional)
`weights`	`file`	Keras model HD5 file with neural net weights. (optional)

Outputs

Name	Type	Emit	Description
`val(meta), path("*cnn.vcf.gz")`	`tuple`	`vcf`	-
`val(meta), path("*cnn.vcf.gz.tbi")`	`tuple`	`tbi`	-
`versions.yml`	`path`	`versions`	-

Authors: @FriederikeHanssen Maintainers: @FriederikeHanssen

GATK4_BASERECALIBRATOR

Defined in modules/nf-core/gatk4/baserecalibrator/main.nf:1

Keywords: base quality score recalibration, table, bqsr, gatk4, sort

Generate recalibration table for Base Quality Score Recalibration (BQSR)

Tools

gatk4

Developed in the Data Sciences Platform at the Broad Institute, the toolkit offers a wide variety of tools with a primary focus on variant discovery and genotyping. Its powerful processing engine and high-performance computing features make it capable of taking on projects of any size.

Homepage | Documentation | License: Apache-2.0

Inputs

Name	Type	Description
`meta`	`map`	Groovy Map containing sample information e.g. [ id:'test', single_end:false ]
`input`	`file`	BAM/CRAM file from alignment
`input_index`	`file`	BAI/CRAI file from alignment
`intervals`	`file`	Bed file with the genomic regions included in the library (optional)
`meta2`	`map`	Groovy Map containing reference information e.g. [ id:'genome']
`fasta`	`file`	The reference fasta file
`meta3`	`map`	Groovy Map containing reference information e.g. [ id:'genome']
`fai`	`file`	Index of reference fasta file
`meta4`	`map`	Groovy Map containing reference information e.g. [ id:'genome']
`dict`	`file`	GATK sequence dictionary
`meta5`	`map`	Groovy Map containing reference information e.g. [ id:'genome']
`known_sites`	`file`	VCF files with known sites for indels / snps (optional)
`meta6`	`map`	Groovy Map containing reference information e.g. [ id:'genome']
`known_sites_tbi`	`file`	Tabix index of the known_sites (optional)

Outputs

Name	Type	Emit	Description
`val(meta), path("*.table")`	`tuple`	`table`	-
`versions.yml`	`path`	`versions`	-
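
Since `known_sites` can hold several VCF files and `intervals` is optional, a command for this step has a variable shape. The Python sketch below shows one plausible assembly, with one `--known-sites` per VCF and `-L` added only when intervals are supplied. The function name, `prefix` and file names are hypothetical, and the module's real script may differ.

```python
from typing import List, Optional


def base_recalibrator_cmd(bam: str, fasta: str, known_sites: List[str],
                          prefix: str, intervals: Optional[str] = None) -> List[str]:
    """Assemble a BaseRecalibrator command line (illustrative only)."""
    cmd = ["gatk", "BaseRecalibrator", "-I", bam, "-R", fasta]
    # Repeat --known-sites once per known-sites VCF.
    for vcf in known_sites:
        cmd += ["--known-sites", vcf]
    # Restrict to the given regions only when an interval file is provided.
    if intervals:
        cmd += ["-L", intervals]
    cmd += ["-O", f"{prefix}.table"]
    return cmd


print(" ".join(base_recalibrator_cmd(
    "test.bam", "genome.fasta", ["dbsnp.vcf.gz", "known_indels.vcf.gz"], "test")))
```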

Authors: @yocra3, @FriederikeHanssen, @maxulysse Maintainers: @yocra3, @FriederikeHanssen, @maxulysse

GATK4_APPLYBQSR

Defined in modules/nf-core/gatk4/applybqsr/main.nf:1

Keywords: bam, base quality score recalibration, bqsr, cram, gatk4

Apply base quality score recalibration (BQSR) to a bam file

Tools

gatk4

Developed in the Data Sciences Platform at the Broad Institute, the toolkit offers a wide variety of tools with a primary focus on variant discovery and genotyping. Its powerful processing engine and high-performance computing features make it capable of taking on projects of any size.

Homepage | Documentation | License: Apache-2.0

Inputs

Name	Type	Description
`meta`	`map`	Groovy Map containing sample information e.g. [ id:'test', single_end:false ]
`input`	`file`	BAM/CRAM file from alignment
`input_index`	`file`	BAI/CRAI file from alignment
`bqsr_table`	`file`	Recalibration table from gatk4_baserecalibrator
`intervals`	`file`	Bed file with the genomic regions included in the library (optional)

Outputs

Name	Type	Pattern	Description
`bam`	`file`	`${prefix}.bam`	Recalibrated BAM file
`bai`	`file`	`${prefix}*bai`	Recalibrated BAM index file
`cram`	`file`	`${prefix}.cram`	Recalibrated CRAM file

Authors: @yocra3, @FriederikeHanssen Maintainers: @yocra3, @FriederikeHanssen

GATK4_MERGEVCFS

Defined in modules/nf-core/gatk4/mergevcfs/main.nf:1

Keywords: gatk4, merge, vcf

Merges several vcf files

Tools

gatk4

Developed in the Data Sciences Platform at the Broad Institute, the toolkit offers a wide variety of tools with a primary focus on variant discovery and genotyping. Its powerful processing engine and high-performance computing features make it capable of taking on projects of any size.

Homepage | Documentation | License: Apache-2.0

Inputs

Name	Type	Description
`meta`	`map`	Groovy Map containing sample information e.g. [ id:'test']
`vcf`	`list`	Two or more VCF files
`meta2`	`map`	Groovy Map containing reference information e.g. [ id:'genome']
`dict`	`file`	Optional Sequence Dictionary as input

Outputs

Name	Type	Emit	Description
`val(meta), path('*.vcf.gz')`	`tuple`	`vcf`	-
`val(meta), path("*.tbi")`	`tuple`	`tbi`	-
`versions.yml`	`path`	`versions`	-

Authors: @kevinmenden Maintainers: @kevinmenden

GATK4_FILTERVARIANTTRANCHES

Defined in modules/nf-core/gatk4/filtervarianttranches/main.nf:1

Keywords: filtervarianttranches, gatk4, tranche filtering

Apply tranche filtering

Tools

gatk4

Developed in the Data Sciences Platform at the Broad Institute, the toolkit offers a wide variety of tools with a primary focus on variant discovery and genotyping. Its powerful processing engine and high-performance computing features make it capable of taking on projects of any size.

Homepage | Documentation | License: Apache-2.0

Inputs

Name	Type	Description
`meta`	`map`	Groovy Map containing sample information e.g. [ id:'test', single_end:false ]
`vcf`	`file`	a VCF file containing variants, must have info key:CNN_2D
`tbi`	`file`	Tabix index (tbi) matching the input vcf
`intervals`	`file`	Intervals
`resources`	`list`	Resource VCF(s) containing known SNP and/or INDEL sites. Can be supplied as many times as necessary.
`resources_index`	`list`	Index of each resource VCF containing known SNP and/or INDEL sites. Can be supplied as many times as necessary.
`fasta`	`file`	The reference fasta file
`fai`	`file`	Index of reference fasta file
`dict`	`file`	GATK sequence dictionary

Outputs

Name	Type	Emit	Description
`val(meta), path("*.vcf.gz")`	`tuple`	`vcf`	-
`val(meta), path("*.vcf.gz.tbi")`	`tuple`	`tbi`	-
`versions.yml`	`path`	`versions`	-

Authors: @FriederikeHanssen Maintainers: @FriederikeHanssen

GATK4_MARKDUPLICATES

Defined in modules/nf-core/gatk4/markduplicates/main.nf:1

Keywords: bam, gatk4, markduplicates, sort

This tool locates and tags duplicate reads in a BAM or SAM file, where duplicate reads are defined as originating from a single fragment of DNA.

Tools

gatk4

Developed in the Data Sciences Platform at the Broad Institute, the toolkit offers a wide variety of tools with a primary focus on variant discovery and genotyping. Its powerful processing engine and high-performance computing features make it capable of taking on projects of any size.

Homepage | Documentation | License: MIT

Inputs

Name	Type	Description
`meta`	`map`	Groovy Map containing sample information e.g. [ id:'test', single_end:false ]
`bam`	`file`	Sorted BAM file
`fasta`	`file`	Fasta file
`fasta_fai`	`file`	Fasta index file

Outputs

Name	Type	Emit	Description
`val(meta), path("*cram")`	`tuple`	`cram`	-
`val(meta), path("*bam")`	`tuple`	`bam`	-
`val(meta), path("*.crai")`	`tuple`	`crai`	-
`val(meta), path("*.bai")`	`tuple`	`bai`	-
`val(meta), path("*.metrics")`	`tuple`	`metrics`	-
`versions.yml`	`path`	`versions`	-
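
The output globs above are short and easy to misread, so the following Python sketch checks which emit each of a few hypothetical file names would fall under. It uses `fnmatch.fnmatchcase`, which behaves like Nextflow's globbing for simple `*` patterns on bare file names; that equivalence, and the example file names, are assumptions of this sketch.

```python
from fnmatch import fnmatchcase
from typing import List

# Emit name -> glob, copied from the Outputs table above.
EMIT_PATTERNS = {
    "cram": "*cram",
    "bam": "*bam",
    "crai": "*.crai",
    "bai": "*.bai",
    "metrics": "*.metrics",
}


def emits_for(filename: str) -> List[str]:
    """Return every emit whose glob matches the given file name."""
    return [emit for emit, pattern in EMIT_PATTERNS.items()
            if fnmatchcase(filename, pattern)]


for name in ["test.md.bam", "test.md.bai", "test.md.cram",
             "test.md.cram.crai", "test.md.metrics"]:
    print(name, "->", emits_for(name))
```

In this sketch an index such as `test.md.cram.crai` matches only `*.crai`, because `*cram` requires the name to end in `cram`.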

Authors: @ajodeh-juma, @FriederikeHanssen, @maxulysse Maintainers: @ajodeh-juma, @FriederikeHanssen, @maxulysse

GATK4_MUTECT2

Defined in modules/nf-core/gatk4/mutect2/main.nf:1

Keywords: gatk4, haplotype, indels, mutect2, snvs, somatic

Call somatic SNVs and indels via local assembly of haplotypes.

Tools

gatk4

Developed in the Data Sciences Platform at the Broad Institute, the toolkit offers a wide variety of tools with a primary focus on variant discovery and genotyping. Its powerful processing engine and high-performance computing features make it capable of taking on projects of any size.

Homepage | Documentation | License: Apache-2.0

Inputs

Name	Type	Description
`meta`	`map`	Groovy Map containing sample information e.g. [ id:'test']
`input`	`list`	list of BAM files, also able to take CRAM as an input
`input_index`	`list`	list of BAM file indexes, also able to take CRAM indexes as an input
`intervals`	`file`	Specify the region the tool is run on.
`meta2`	`map`	Groovy Map containing reference information e.g. [ id:'genome' ]
`fasta`	`file`	The reference fasta file
`meta3`	`map`	Groovy Map containing reference information e.g. [ id:'genome' ]
`fai`	`file`	Index of reference fasta file
`meta4`	`map`	Groovy Map containing reference information e.g. [ id:'genome' ]
`dict`	`file`	GATK sequence dictionary
`germline_resource`	`file`	Population vcf of germline sequencing, containing allele fractions.
`germline_resource_tbi`	`file`	Index file for the germline resource.
`panel_of_normals`	`file`	vcf file to be used as a panel of normals.
`panel_of_normals_tbi`	`file`	Index for the panel of normals.

Outputs

Name	Type	Emit	Description
`val(meta), path("*.vcf.gz")`	`tuple`	`vcf`	-
`val(meta), path("*.tbi")`	`tuple`	`tbi`	-
`val(meta), path("*.stats")`	`tuple`	`stats`	-
`val(meta), path("*.f1r2.tar.gz")`	`tuple`	`f1r2`	-
`versions.yml`	`path`	`versions`	-

Authors: @GCJMackenzie, @ramprasadn Maintainers: @GCJMackenzie, @ramprasadn

GATK4_MERGEMUTECTSTATS

Defined in modules/nf-core/gatk4/mergemutectstats/main.nf:1

Keywords: gatk4, merge, mutect2, mutectstats

Merges mutect2 stats generated on different intervals/regions

Tools

gatk4

Genome Analysis Toolkit (GATK4)

Homepage | Documentation | License: BSD-3-clause

Inputs

Name	Type	Description
`meta`	`map`	Groovy Map containing sample information e.g. [ id:'test', single_end:false ]
`stats`	`file`	Stats file

Outputs

Name	Type	Emit	Description
`val(meta), path("*.vcf.gz.stats")`	`tuple`	`stats`	-
`versions.yml`	`path`	`versions`	-
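
When Mutect2 is run per interval, each run yields its own stats file, and these are combined here. A minimal Python sketch of the corresponding `gatk MergeMutectStats` call, with one `-stats` flag per input, is given below. The function name, `prefix` and file names are hypothetical.

```python
from typing import List


def merge_mutect_stats_cmd(stats_files: List[str], prefix: str) -> List[str]:
    """Assemble a MergeMutectStats command line (illustrative only)."""
    if not stats_files:
        raise ValueError("at least one stats file is required")
    cmd = ["gatk", "MergeMutectStats"]
    # One -stats flag per per-interval Mutect2 stats file.
    for path in stats_files:
        cmd += ["-stats", path]
    # Output name chosen to match the *.vcf.gz.stats pattern listed above.
    cmd += ["-O", f"{prefix}.vcf.gz.stats"]
    return cmd


print(" ".join(merge_mutect_stats_cmd(
    ["chr1.vcf.gz.stats", "chr2.vcf.gz.stats"], "test")))
```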

Authors: @FriederikeHanssen Maintainers: @FriederikeHanssen

GATK4SPARK_BASERECALIBRATOR

Defined in modules/nf-core/gatk4spark/baserecalibrator/main.nf:1

Keywords: base quality score recalibration, table, bqsr, gatk4spark, sort

Generate recalibration table for Base Quality Score Recalibration (BQSR)

Tools

gatk4

Developed in the Data Sciences Platform at the Broad Institute, the toolkit offers a wide variety of tools with a primary focus on variant discovery and genotyping. Its powerful processing engine and high-performance computing features make it capable of taking on projects of any size.

Homepage | Documentation | License: Apache-2.0

Inputs

Name	Type	Description
`meta`	`map`	Groovy Map containing sample information e.g. [ id:'test', single_end:false ]
`input`	`file`	BAM/CRAM file from alignment
`input_index`	`file`	BAI/CRAI file from alignment
`intervals`	`file`	Bed file with the genomic regions included in the library (optional)
`fasta`	`file`	The reference fasta file
`fai`	`file`	Index of reference fasta file
`dict`	`file`	GATK sequence dictionary
`known_sites`	`file`	VCF files with known sites for indels / snps (optional)
`known_sites_tbi`	`file`	Tabix index of the known_sites (optional)

Outputs

Name	Type	Emit	Description
`val(meta), path("*.table")`	`tuple`	`table`	-
`versions.yml`	`path`	`versions`	-

Authors: @yocra3, @FriederikeHanssen, @maxulysse Maintainers: @yocra3, @FriederikeHanssen, @maxulysse

GATK4SPARK_APPLYBQSR

Defined in modules/nf-core/gatk4spark/applybqsr/main.nf:1

Keywords: bam, base quality score recalibration, bqsr, cram, gatk4spark

Apply base quality score recalibration (BQSR) to a bam file

Tools

gatk4

Developed in the Data Sciences Platform at the Broad Institute, the toolkit offers a wide variety of tools with a primary focus on variant discovery and genotyping. Its powerful processing engine and high-performance computing features make it capable of taking on projects of any size.

Homepage | Documentation | License: Apache-2.0

Inputs

Name	Type	Description
`meta`	`map`	Groovy Map containing sample information e.g. [ id:'test', single_end:false ]
`input`	`file`	BAM/CRAM file from alignment
`input_index`	`file`	BAI/CRAI file from alignment
`bqsr_table`	`file`	Recalibration table from gatk4_baserecalibrator
`intervals`	`file`	Bed file with the genomic regions included in the library (optional)

Outputs

Name	Type	Pattern	Description
`bam`	`file`	`${prefix}.bam`	Recalibrated BAM file
`bai`	`file`	`${prefix}*bai`	Recalibrated BAM index file
`cram`	`file`	`${prefix}.cram`	Recalibrated CRAM file

Authors: @yocra3, @FriederikeHanssen, @maxulysse Maintainers: @yocra3, @FriederikeHanssen, @maxulysse

GATK4SPARK_MARKDUPLICATES

Defined in modules/nf-core/gatk4spark/markduplicates/main.nf:1

Keywords: bam, gatk4spark, markduplicates, sort

This tool locates and tags duplicate reads in a BAM or SAM file, where duplicate reads are defined as originating from a single fragment of DNA.

Tools

gatk4

Developed in the Data Sciences Platform at the Broad Institute, the toolkit offers a wide variety of tools with a primary focus on variant discovery and genotyping. Its powerful processing engine and high-performance computing features make it capable of taking on projects of any size.

Homepage | Documentation | License: MIT

Inputs

Name	Type	Description
`meta`	`map`	Groovy Map containing sample information e.g. [ id:'test', single_end:false ]
`bam`	`file`	Sorted BAM file
`fasta`	`file`	The reference fasta file
`fasta_fai`	`file`	Index of reference fasta file
`dict`	`file`	GATK sequence dictionary

Outputs

Name	Type	Emit	Description
`val(meta)`	`tuple`	-	-

Authors: @ajodeh-juma, @FriederikeHanssen, @maxulysse, @SusiJo Maintainers: @ajodeh-juma, @FriederikeHanssen, @maxulysse, @SusiJo

DEEPVARIANT_RUNDEEPVARIANT

Defined in modules/nf-core/deepvariant/rundeepvariant/main.nf:1

Keywords: variant calling, machine learning, neural network

DeepVariant is an analysis pipeline that uses a deep neural network to call genetic variants from next-generation DNA sequencing data

Tools

deepvariant

DeepVariant is an analysis pipeline that uses a deep neural network to call genetic variants from next-generation DNA sequencing data

Homepage | Documentation | biotools:deepvariant | License: BSD-3-clause

Inputs

Name	Type	Description
`meta`	`map`	Groovy Map containing sample information e.g. [ id:'test', single_end:false ]
`input`	`file`	BAM/CRAM file
`index`	`file`	Index of BAM/CRAM file
`intervals`	`file`	file containing intervals
`meta2`	`map`	Groovy Map containing reference information e.g. [ id:'genome' ]
`fasta`	`file`	The reference fasta file
`meta3`	`map`	Groovy Map containing reference information e.g. [ id:'genome' ]
`fai`	`file`	Index of reference fasta file
`meta4`	`map`	Groovy Map containing reference information e.g. [ id:'genome' ]
`gzi`	`file`	GZI index of reference fasta file
`meta5`	`map`	Groovy Map containing reference information e.g. [ id:'genome' ]
`par_bed`	`file`	BED file containing PAR regions

Outputs

Name	Type	Pattern	Description
`vcf`	`file`	`*.vcf.gz`	Compressed VCF file
`vcf_index`	`file`	`*.vcf.gz.{tbi,csi}`	Tabix index file of compressed VCF
`gvcf`	`file`	`*.g.vcf.gz`	Compressed GVCF file
`gvcf_index`	`file`	`*.g.vcf.gz.{tbi,csi}`	Tabix index file of compressed GVCF
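
Note that a name such as `test.g.vcf.gz` also satisfies the broader `*.vcf.gz` pattern, so downstream code that sorts published files by suffix needs to test the more specific patterns first. The Python sketch below illustrates this for hypothetical file names; it is not how the module itself separates its outputs.

```python
from fnmatch import fnmatchcase


def classify_variant_file(filename: str) -> str:
    """Map a file name to an output name, checking specific suffixes first."""
    if filename.endswith((".g.vcf.gz.tbi", ".g.vcf.gz.csi")):
        return "gvcf_index"
    if filename.endswith(".g.vcf.gz"):
        return "gvcf"
    if filename.endswith((".vcf.gz.tbi", ".vcf.gz.csi")):
        return "vcf_index"
    if filename.endswith(".vcf.gz"):
        return "vcf"
    return "other"


# The gVCF name matches the plain VCF glob as well, hence the ordering above.
overlaps = fnmatchcase("test.g.vcf.gz", "*.vcf.gz")
print("gVCF also matches *.vcf.gz:", overlaps)

for name in ["test.vcf.gz", "test.vcf.gz.tbi", "test.g.vcf.gz", "test.g.vcf.gz.csi"]:
    print(name, "->", classify_variant_file(name))
```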

Authors: @abhi18av, @ramprasadn Maintainers: @abhi18av, @ramprasadn

SENTIEON_GVCFTYPER

Defined in modules/nf-core/sentieon/gvcftyper/main.nf:1

Keywords: joint genotyping, genotype, gvcf

Perform joint genotyping on one or more samples pre-called with Sentieon's Haplotyper.

Tools

sentieon

Sentieon® provides complete solutions for secondary DNA/RNA analysis for a variety of sequencing platforms, including short and long reads. Our software improves upon BWA, STAR, Minimap2, GATK, HaplotypeCaller, Mutect, and Mutect2 based pipelines and is deployable on any generic-CPU-based computing system.

Homepage | Documentation

Inputs

Name	Type	Description
`meta`	`map`	Groovy Map containing sample information e.g. [ id:'test', single_end:false ]
`gvcfs`	`file`	gVCF(.gz) file
`tbis`	`file`	index of gvcf file
`intervals`	`file`	Interval file with the genomic regions included in the library (optional)
`meta1`	`map`	Groovy Map containing sample information e.g. [ id:'test', single_end:false ]
`fasta`	`file`	Reference fasta file
`meta2`	`map`	Groovy Map containing sample information e.g. [ id:'test', single_end:false ]
`fai`	`file`	Reference fasta index file
`meta3`	`map`	Groovy Map containing sample information e.g. [ id:'test', single_end:false ]
`dbsnp`	`file`	dbSNP VCF file
`meta4`	`map`	Groovy Map containing sample information e.g. [ id:'test', single_end:false ]
`dbsnp_tbi`	`file`	dbSNP VCF index file

Outputs

Name	Type	Pattern	Description
`vcf_gz`	`file`	`*.vcf.gz`	VCF file
`vcf_gz_tbi`	`file`	`*.vcf.gz.tbi`	VCF index file

Authors: @asp8200 Maintainers: @asp8200

SENTIEON_TNSCOPE

Defined in modules/nf-core/sentieon/tnscope/main.nf:1

Keywords: tnscope, sentieon, variant_calling

The TNscope algorithm performs somatic variant calling on tumor-normal matched pairs or tumor-only data, using a Haplotyper algorithm.

Tools

sentieon

Sentieon® provides complete solutions for secondary DNA/RNA analysis for a variety of sequencing platforms, including short and long reads. Our software improves upon BWA, STAR, Minimap2, GATK, HaplotypeCaller, Mutect, and Mutect2 based pipelines and is deployable on any generic-CPU-based computing system.

Homepage | Documentation

Inputs

Name	Type	Description
`meta`	`map`	Groovy Map containing sample information. e.g. [ id:'test' ]
`input`	`file`	One or more BAM or CRAM files.
`input_index`	`file`	Indices for the input files
`intervals`	`file`	bed or interval_list file containing interval in the reference that will be used in the analysis. Only recommended for large WGS data, else the overhead may not be worth the additional parallelisation.
`meta2`	`map`	Groovy Map containing reference information. e.g. [ id:'test' ]
`fasta`	`file`	Genome fasta file
`meta3`	`map`	Groovy Map containing reference information. e.g. [ id:'test' ]
`fai`	`file`	Index of the genome fasta file
`meta4`	`map`	Groovy Map containing reference information. e.g. [ id:'test' ]
`dbsnp`	`file`	Single Nucleotide Polymorphism database (dbSNP) file
`meta5`	`map`	Groovy Map containing reference information. e.g. [ id:'test' ]
`dbsnp_tbi`	`file`	Index of the Single Nucleotide Polymorphism database (dbSNP) file
`meta6`	`map`	Groovy Map containing reference information. e.g. [ id:'test' ]
`pon`	`file`	Panel of normals (PoN) VCF file
`meta7`	`map`	Groovy Map containing reference information. e.g. [ id:'test' ]
`pon_tbi`	`file`	Index of the panel of normals (PoN) VCF file
`meta8`	`map`	Groovy Map containing reference information. e.g. [ id:'test' ]
`cosmic`	`file`	Catalogue of Somatic Mutations in Cancer (COSMIC) VCF file.
`meta9`	`map`	Groovy Map containing reference information. e.g. [ id:'test' ]
`cosmic_tbi`	`file`	Index of the Catalogue of Somatic Mutations in Cancer (COSMIC) VCF file.

Outputs

Name	Type	Pattern	Description
`vcf`	`file`	`*.{vcf.gz}`	VCF file
`index`	`file`	`*.vcf.gz.tbi`	Index of the VCF file

Authors: @ramprasadn Maintainers: @ramprasadn

SENTIEON_DNAMODELAPPLY

Defined in modules/nf-core/sentieon/dnamodelapply/main.nf:1

Keywords: dnamodelapply, vcf, filter, sentieon

Modifies the input VCF file by adding the MLrejected FILTER to the variants.

Tools

sentieon

Sentieon® provides complete solutions for secondary DNA/RNA analysis for a variety of sequencing platforms, including short and long reads. Our software improves upon BWA, STAR, Minimap2, GATK, HaplotypeCaller, Mutect, and Mutect2 based pipelines and is deployable on any generic-CPU-based computing system.

Homepage | Documentation

Inputs

Name	Type	Description
`meta`	`map`	Groovy Map containing sample information e.g. `[ id:'test', single_end:false ]`
`vcf`	`file`	INPUT VCF file
`idx`	`file`	Index of the input VCF file
`meta2`	`map`	Groovy Map containing reference information e.g. `[ id:'test' ]`
`fasta`	`file`	Genome fasta file
`meta3`	`map`	Groovy Map containing reference information e.g. `[ id:'test' ]`
`fai`	`file`	Index of the genome fasta file
`meta4`	`map`	Groovy Map containing reference information e.g. `[ id:'test' ]`
`ml_model`	`file`	machine learning model file

Outputs

Name	Type	Pattern	Description
`vcf`	`file`	`*.{vcf,vcf.gz}`	Output VCF file with the MLrejected FILTER added
`tbi`	`file`	`*.{tbi}`	Index of the output VCF file

Authors: @ramprasadn Maintainers: @ramprasadn

SENTIEON_BWAMEM

Defined in modules/nf-core/sentieon/bwamem/main.nf:1

Keywords: mem, bwa, alignment, map, fastq, bam, sentieon

Performs fastq alignment to a fasta reference using Sentieon's BWA MEM

Tools

sentieon

Sentieon® provides complete solutions for secondary DNA/RNA analysis for a variety of sequencing platforms, including short and long reads. Our software improves upon BWA, STAR, Minimap2, GATK, HaplotypeCaller, Mutect, and Mutect2 based pipelines and is deployable on any generic-CPU-based computing system.

Homepage | Documentation

Inputs

Name	Type	Description
`meta`	`map`	Groovy Map containing sample information. e.g. [ id:'test', single_end:false ]
`reads`	`file`	Genome fastq files (single-end or paired-end)
`meta2`	`map`	Groovy Map containing reference information. e.g. [ id:'test', single_end:false ]
`index`	`file`	BWA genome index files
`meta3`	`map`	Groovy Map containing reference information. e.g. [ id:'test', single_end:false ]
`fasta`	`file`	Genome fasta file
`meta4`	`map`	Groovy Map containing reference information. e.g. [ id:'test', single_end:false ]
`fasta_fai`	`file`	The index of the FASTA reference.

Outputs

Name	Type	Pattern	Description
`bam_and_bai`	`file`	`.{bam,bai}`	BAM file with corresponding index.

Authors: @asp8200 Maintainers: @asp8200, @DonFreed

SENTIEON_APPLYVARCAL

Defined in modules/nf-core/sentieon/applyvarcal/main.nf:1

Keywords: sentieon, applyvarcal, varcal, VQSR

Apply a score cutoff to filter variants based on a recalibration table. Sentieon's ApplyVarCal performs the second pass in a two-stage process called Variant Quality Score Recalibration (VQSR). Specifically, it applies filtering to the input variants based on the recalibration table produced in the previous step (VarCal) and a target sensitivity value. See https://support.sentieon.com/manual/usages/general/#applyvarcal-algorithm

Tools

sentieon

Sentieon® provides complete solutions for secondary DNA/RNA analysis for a variety of sequencing platforms, including short and long reads. Our software improves upon BWA, STAR, Minimap2, GATK, HaplotypeCaller, Mutect, and Mutect2 based pipelines and is deployable on any generic-CPU-based computing system.

Homepage | Documentation

Inputs

Name	Type	Description
`meta`	`map`	Groovy Map containing sample information e.g. [ id:'test']
`vcf`	`file`	VCF file to be recalibrated; this should be the same file as used for the first stage, VariantRecalibrator.
`vcf_tbi`	`file`	tabix index for the input vcf file.
`recal`	`file`	Recalibration file produced when the input vcf was run through VariantRecalibrator in stage 1.
`recal_index`	`file`	Index file for the recalibration file.
`tranches`	`file`	Tranches file produced when the input vcf was run through VariantRecalibrator in stage 1.
`meta2`	`map`	Groovy Map containing sample information e.g. [ id:'test']
`fasta`	`file`	The reference fasta file
`meta3`	`map`	Groovy Map containing sample information e.g. [ id:'test']
`fai`	`file`	Index of reference fasta file

Outputs

Name	Type	Pattern	Description
`vcf`	`file`	`*.vcf.gz`	compressed vcf file containing the recalibrated variants.
`tbi`	`file`	`*vcf.gz.tbi`	Index of recalibrated vcf file.

Authors: @asp8200 Maintainers: @asp8200

SENTIEON_VARCAL

Defined in modules/nf-core/sentieon/varcal/main.nf:1

Keywords: sentieon, varcal, variant recalibration

Module for Sentieon's VarCal. The VarCal algorithm performs the first stage of Variant Quality Score Recalibration (VQSR): it builds a recalibration model for scoring variant quality. https://support.sentieon.com/manual/usages/general/#varcal-algorithm

Tools

sentieon

Sentieon® provides complete solutions for secondary DNA/RNA analysis for a variety of sequencing platforms, including short and long reads. Our software improves upon BWA, STAR, Minimap2, GATK, HaplotypeCaller, Mutect, and Mutect2 based pipelines and is deployable on any generic-CPU-based computing system.

Homepage | Documentation

Inputs

Name	Type	Description
`meta`	`map`	Groovy Map containing sample information e.g. [ id:'test' ]
`vcf`	`file`	input vcf file containing the variants to be recalibrated
`tbi`	`file`	tbi index file matching the input vcf

Outputs

Name	Type	Pattern	Description
`recal`	`file`	`*.recal`	Output recal file used by ApplyVQSR
`idx`	`file`	`*.idx`	Index file for the recal output file
`tranches`	`file`	`*.tranches`	Output tranches file used by ApplyVQSR
`plots`	`file`	`*plots.R`	Optional output rscript file to aid in visualization of the input data and learned model.

Authors: @asp8200 Maintainers: @asp8200

SENTIEON_DNASCOPE

Defined in modules/nf-core/sentieon/dnascope/main.nf:1

Keywords: dnascope, sentieon, variant_calling

The DNAscope algorithm performs an improved version of haplotype-based variant calling.

Tools

sentieon

Sentieon® provides complete solutions for secondary DNA/RNA analysis for a variety of sequencing platforms, including short and long reads. Our software improves upon BWA, STAR, Minimap2, GATK, HaplotypeCaller, Mutect, and Mutect2 based pipelines and is deployable on any generic-CPU-based computing system.

Homepage | Documentation

Inputs

Name	Type	Description
`meta`	`map`	Groovy Map containing sample information. e.g. [ id:'test', single_end:false ]
`bam`	`file`	BAM file.
`bai`	`file`	BAI file
`intervals`	`file`	bed or interval_list file containing interval in the reference that will be used in the analysis
`meta2`	`map`	Groovy Map containing meta information for fasta.
`fasta`	`file`	Genome fasta file
`meta3`	`map`	Groovy Map containing meta information for fasta index.
`fai`	`file`	Index of the genome fasta file
`meta4`	`map`	Groovy Map containing meta information for dbsnp.
`dbsnp`	`file`	Single Nucleotide Polymorphism database (dbSNP) file
`meta5`	`map`	Groovy Map containing meta information for dbsnp_tbi.
`dbsnp_tbi`	`file`	Index of the Single Nucleotide Polymorphism database (dbSNP) file
`meta6`	`map`	Groovy Map containing meta information for the machine learning model for DNAscope.
`ml_model`	`file`	machine learning model file

Outputs

Name	Type	Pattern	Description
`vcf`	`file`	`*.unfiltered.vcf.gz`	Compressed VCF file
`vcf_tbi`	`file`	`*.unfiltered.vcf.gz.tbi`	Index of VCF file
`gvcf`	`file`	`*.g.vcf.gz`	Compressed GVCF file
`gvcf_tbi`	`file`	`*.g.vcf.gz.tbi`	Index of GVCF file

Authors: @ramprasadn Maintainers: @ramprasadn

SENTIEON_HAPLOTYPER

Defined in modules/nf-core/sentieon/haplotyper/main.nf:1

Keywords: sentieon, haplotypecaller, haplotype

Runs Sentieon's haplotyper for germline variant calling.

Tools

sentieon

Sentieon® provides complete solutions for secondary DNA/RNA analysis for a variety of sequencing platforms, including short and long reads. Our software improves upon BWA, STAR, Minimap2, GATK, HaplotypeCaller, Mutect, and Mutect2 based pipelines and is deployable on any generic-CPU-based computing system.

Homepage | Documentation

Inputs

Name	Type	Description
`meta`	`map`	Groovy Map containing reference information. e.g. [ id:'test', single_end:false ]
`input`	`file`	BAM/CRAM file from alignment
`input_index`	`file`	BAI/CRAI file from alignment
`intervals`	`file`	Bed file with the genomic regions included in the library (optional)
`recal_table`	`file`	Recalibration table from sentieon/qualcal (optional)
`meta2`	`map`	Groovy Map containing sample information e.g. [ id:'test', single_end:false ]
`fasta`	`file`	Genome fasta file
`meta3`	`map`	Groovy Map containing sample information e.g. [ id:'test', single_end:false ]
`fai`	`file`	The index of the FASTA reference.
`meta4`	`map`	Groovy Map containing sample information e.g. [ id:'test', single_end:false ]
`dbsnp`	`file`	VCF file containing known sites (optional)
`meta5`	`map`	Groovy Map containing sample information e.g. [ id:'test', single_end:false ]
`dbsnp_tbi`	`file`	VCF index of dbsnp (optional)

Outputs

Name	Type	Pattern	Description
`vcf`	`file`	`*.unfiltered.vcf.gz`	Compressed VCF file
`vcf_tbi`	`file`	`*.unfiltered.vcf.gz.tbi`	Index of VCF file
`gvcf`	`file`	`*.g.vcf.gz`	Compressed GVCF file
`gvcf_tbi`	`file`	`*.g.vcf.gz.tbi`	Index of GVCF file

Authors: @asp8200 Maintainers: @asp8200

SENTIEON_DEDUP

Defined in modules/nf-core/sentieon/dedup/main.nf:1

Keywords: mem, dedup, map, bam, cram, sentieon

Runs the Sentieon tool LocusCollector followed by Dedup. LocusCollector collects read information that is used by Dedup, which in turn marks or removes duplicate reads.
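
The collect-then-mark pattern described above can be illustrated with a generic Python sketch. It is not Sentieon's algorithm: grouping reads by chromosome, position and strand and keeping the highest-scoring read per group is a common duplicate-marking heuristic, used here as an assumption, and all field names are hypothetical.

```python
from collections import defaultdict


def collect_loci(reads):
    """First pass: group reads by (chrom, pos, strand)."""
    groups = defaultdict(list)
    for read in reads:
        groups[(read["chrom"], read["pos"], read["strand"])].append(read)
    return groups


def mark_duplicates(groups):
    """Second pass: keep the best-scoring read per locus, mark the rest."""
    duplicates = set()
    for group in groups.values():
        best = max(group, key=lambda read: read["score"])
        duplicates.update(read["name"] for read in group if read is not best)
    return duplicates


reads = [
    {"name": "r1", "chrom": "chr1", "pos": 100, "strand": "+", "score": 60},
    {"name": "r2", "chrom": "chr1", "pos": 100, "strand": "+", "score": 40},
    {"name": "r3", "chrom": "chr1", "pos": 200, "strand": "+", "score": 50},
]
duplicate_names = mark_duplicates(collect_loci(reads))
```

Whether marked reads are only flagged or removed entirely is a separate decision, mirroring the "marks or removes" wording above.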

Tools

sentieon

Sentieon® provides complete solutions for secondary DNA/RNA analysis for a variety of sequencing platforms, including short and long reads. Our software improves upon BWA, STAR, Minimap2, GATK, HaplotypeCaller, Mutect, and Mutect2 based pipelines and is deployable on any generic-CPU-based computing system.

Homepage | Documentation

Inputs

Name	Type	Description
`meta`	`map`	Groovy Map containing reference information. e.g. [ id:'test', single_end:false ]
`bam`	`file`	BAM file.
`bai`	`file`	BAI file
`meta2`	`map`	Groovy Map containing reference information. e.g. [ id:'test', single_end:false ]
`fasta`	`file`	Genome fasta file
`meta3`	`map`	Groovy Map containing reference information. e.g. [ id:'test', single_end:false ]
`fasta_fai`	`file`	The index of the FASTA reference.

Outputs

Name	Type	Pattern	Description
`cram`	`file`	`*.cram`	CRAM file
`crai`	`file`	`*.crai`	CRAM index file
`bam`	`file`	`*.bam`	BAM file.
`bai`	`file`	`*.bai`	BAI file
`score`	`file`	`*.score`	The score file indicates which reads LocusCollector finds are likely duplicates.
`metrics`	`file`	`*.metrics`	Output file containing Dedup metrics incl. histogram data.
`metrics_multiqc_tsv`	`file`	`*.metrics.multiqc.tsv`	Output tsv-file containing Dedup metrics excl. histogram data.

Authors: @asp8200 Maintainers: @asp8200

SAMTOOLS_BAM2FQ

Defined in modules/nf-core/samtools/bam2fq/main.nf:1

Keywords: bam2fq, samtools, fastq

The module uses bam2fq method from samtools to convert a SAM, BAM or CRAM file to FASTQ format

Tools

samtools

Tools for dealing with SAM, BAM and CRAM files

Documentation | biotools:samtools | License: MIT

Inputs

Name	Type	Description
`meta`	`map`	Groovy Map containing sample information e.g. [ id:'test', single_end:false ]
`inputbam`	`file`	BAM/CRAM/SAM file
`split`	`boolean`	TRUE/FALSE value to indicate if reads should be separated into /1, /2 and if present other, or singleton. Note: choosing TRUE will generate 4 different files. Choosing FALSE will produce a single file, which will be interleaved in case the input contains paired reads.
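
As a rough illustration of what splitting means, the sketch below routes a read by its SAM FLAG bits. The bit values 0x40 (first in pair) and 0x80 (second in pair) come from the SAM specification; the bucket names are hypothetical, and singleton detection (which needs mate lookup) is deliberately left out.

```python
READ1 = 0x40  # SAM FLAG bit: first segment in the template
READ2 = 0x80  # SAM FLAG bit: last segment in the template


def fastq_bucket(flag):
    """Return which output a read would belong to when splitting:
    '1', '2', or 'other' (both or neither of the pair bits set)."""
    first = bool(flag & READ1)
    second = bool(flag & READ2)
    if first and not second:
        return "1"
    if second and not first:
        return "2"
    return "other"


buckets = [fastq_bucket(flag) for flag in (99, 147, 0)]
```

With `split` set to FALSE no routing of this kind is needed, since all reads go to a single (possibly interleaved) file.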

Outputs

Name	Type	Emit	Description
`val(meta), path("*.fq.gz")`	`tuple`	`reads`	-
`versions.yml`	`path`	`versions`	-

Authors: @lescai Maintainers: @lescai

SAMTOOLS_MERGE

Defined in modules/nf-core/samtools/merge/main.nf:1

Keywords: merge, bam, sam, cram

Merge BAM or CRAM files

Tools

samtools

SAMtools is a set of utilities for interacting with and post-processing short DNA sequence read alignments in the SAM, BAM and CRAM formats, written by Heng Li. These files are generated as output by short read aligners like BWA.

Homepage | Documentation | biotools:samtools | License: MIT

Inputs

Name	Type	Description
`meta`	`map`	Groovy Map containing sample information e.g. [ id:'test', single_end:false ]
`input_files`	`file`	BAM/CRAM file
`meta2`	`map`	Groovy Map containing reference information e.g. [ id:'genome' ]
`fasta`	`file`	Reference file the CRAM was created with (optional)
`meta3`	`map`	Groovy Map containing reference information e.g. [ id:'genome' ]
`fai`	`file`	Index of the reference file the CRAM was created with (optional)

Outputs

Name	Type	Emit	Description
`val(meta)`	`tuple`	-	-

Authors: @drpatelh, @yuukiiwa, @maxulysse, @FriederikeHanssen, @ramprasadn Maintainers: @drpatelh, @yuukiiwa, @maxulysse, @FriederikeHanssen, @ramprasadn

SAMTOOLS_MPILEUP

Defined in modules/nf-core/samtools/mpileup/main.nf:1

Keywords: mpileup, bam, sam, cram

Generate a compressed pileup (`*.mpileup.gz`) from a BAM/CRAM/SAM file

Tools

samtools

SAMtools is a set of utilities for interacting with and post-processing short DNA sequence read alignments in the SAM, BAM and CRAM formats, written by Heng Li. These files are generated as output by short read aligners like BWA.

Homepage | Documentation | biotools:samtools | License: MIT

Inputs

Name	Type	Description
`meta`	`map`	Groovy Map containing sample information e.g. [ id:'test' ]
`input`	`file`	BAM/CRAM/SAM file
`intervals`	`file`	Interval file
`meta2`	`map`	Groovy Map containing sample information e.g. [ id:'test' ]
`fasta`	`file`	FASTA reference file

Outputs

Name	Type	Emit	Description
`val(meta), path("*.mpileup.gz")`	`tuple`	`mpileup`	-
`versions.yml`	`path`	`versions`	-

Authors: @drpatelh, @joseespinosa Maintainers: @drpatelh, @joseespinosa

SAMTOOLS_FAIDX

Defined in modules/nf-core/samtools/faidx/main.nf:1

Keywords: index, fasta, faidx, chromosome

Index FASTA file, and optionally generate a file of chromosome sizes

Tools

samtools

SAMtools is a set of utilities for interacting with and post-processing short DNA sequence read alignments in the SAM, BAM and CRAM formats, written by Heng Li. These files are generated as output by short read aligners like BWA.

Homepage | Documentation | biotools:samtools | License: MIT

Inputs

Name	Type	Description
`meta`	`map`	Groovy Map containing reference information e.g. [ id:'test' ]
`fasta`	`file`	FASTA file
`meta2`	`map`	Groovy Map containing reference information e.g. [ id:'test' ]
`fai`	`file`	FASTA index file

Outputs

Name	Type	Pattern	Description
`fa`	`file`	`*.{fa}`	FASTA file
`sizes`	`file`	`*.{sizes}`	File containing chromosome lengths
`fai`	`file`	`*.{fai}`	FASTA index file
`gzi`	`file`	`*.gzi`	Optional gzip index file for compressed inputs
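
The relationship between the `fai` and `sizes` outputs can be sketched as follows, assuming the sizes file is simply the first two columns (sequence name and length) of the `.fai` index. The function name is hypothetical.

```python
def chrom_sizes_from_fai(fai_text):
    """Extract (name, length) pairs from the text of a .fai index.

    A .fai line has five tab-separated columns:
    NAME, LENGTH, OFFSET, LINEBASES, LINEWIDTH.
    """
    sizes = []
    for line in fai_text.splitlines():
        if not line.strip():
            continue  # tolerate blank lines
        fields = line.split("\t")
        sizes.append((fields[0], int(fields[1])))
    return sizes


example_fai = "chr1\t248956422\t112\t70\t71\nchr2\t242193529\t252513167\t70\t71\n"
sizes = chrom_sizes_from_fai(example_fai)
```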

Authors: @drpatelh, @ewels, @phue Maintainers: @maxulysse, @phue

SAMTOOLS_VIEW

Defined in modules/nf-core/samtools/view/main.nf:1

Keywords: view, bam, sam, cram

Filter/convert SAM/BAM/CRAM file

Tools

samtools

SAMtools is a set of utilities for interacting with and post-processing short DNA sequence read alignments in the SAM, BAM and CRAM formats, written by Heng Li. These files are generated as output by short read aligners like BWA.

Homepage | Documentation | biotools:samtools | License: MIT

Inputs

Name	Type	Description
`meta`	`map`	Groovy Map containing sample information e.g. [ id:'test', single_end:false ]
`input`	`file`	BAM/CRAM/SAM file
`index`	`file`	BAM.BAI/BAM.CSI/CRAM.CRAI file (optional)
`meta2`	`map`	Groovy Map containing reference information e.g. [ id:'test' ]
`fasta`	`file`	Reference file the CRAM was created with (optional)
`qname`	`file`	Optional file with read names to output only select alignments
`index_format`	`string`	Index format, used together with ext.args = '--write-index'

Outputs

Name	Type	Emit	Description
`val(meta)`	`tuple`	-	-

Authors: @drpatelh, @joseespinosa, @FriederikeHanssen, @priyanka-surana Maintainers: @drpatelh, @joseespinosa, @FriederikeHanssen, @priyanka-surana

SAMTOOLS_INDEX

Defined in modules/nf-core/samtools/index/main.nf:1

Keywords: index, bam, sam, cram

Index SAM/BAM/CRAM file

Tools

samtools

SAMtools is a set of utilities for interacting with and post-processing short DNA sequence read alignments in the SAM, BAM and CRAM formats, written by Heng Li. These files are generated as output by short read aligners like BWA.

Homepage | Documentation | biotools:samtools | License: MIT

Inputs

Name	Type	Description
`meta`	`map`	Groovy Map containing sample information e.g. [ id:'test', single_end:false ]
`input`	`file`	input file

Outputs

Name	Type	Emit	Description
`val(meta), path("*.bai")`	`tuple`	`bai`	-
`val(meta), path("*.csi")`	`tuple`	`csi`	-
`val(meta), path("*.crai")`	`tuple`	`crai`	-
`versions.yml`	`path`	`versions`	-

Authors: @drpatelh, @ewels, @maxulysse Maintainers: @drpatelh, @ewels, @maxulysse

SAMTOOLS_COLLATEFASTQ

Defined in modules/nf-core/samtools/collatefastq/main.nf:1

Keywords: bam2fq, samtools, fastq

The module uses collate and then fastq methods from samtools to convert a SAM, BAM or CRAM file to FASTQ format

Tools

samtools

Tools for dealing with SAM, BAM and CRAM files

Documentation | biotools:samtools | License: MIT

Inputs

Name	Type	Description
`meta`	`map`	Groovy Map containing sample information e.g. [ id:'test', single_end:false ]
`input`	`file`	BAM/CRAM/SAM file
`meta2`	`map`	Groovy Map containing reference information e.g. [ id:'test' ]
`fasta`	`file`	Reference genome fasta file
`interleave`	`boolean`	If true, the output is a single interleaved paired-end FASTQ. If false, the output is split paired-end FASTQ.
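
To make the difference concrete, the sketch below shows what interleaving means for two already-paired lists of FASTQ records. It illustrates the output layout only, not how samtools produces it; the function name and the record representation (one string per record) are assumptions.

```python
def interleave(read1_records, read2_records):
    """Alternate records from R1 and R2 so that mates are adjacent."""
    if len(read1_records) != len(read2_records):
        raise ValueError("R1 and R2 must contain the same number of records")
    interleaved = []
    for first, second in zip(read1_records, read2_records):
        interleaved.append(first)
        interleaved.append(second)
    return interleaved


r1 = ["@pairA/1\nACGT\n+\nIIII", "@pairB/1\nTTGA\n+\nIIII"]
r2 = ["@pairA/2\nTGCA\n+\nIIII", "@pairB/2\nCCAT\n+\nIIII"]
single_file = interleave(r1, r2)
```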

Outputs

Name	Type	Emit	Description
`val(meta)`	`tuple`	-	-

Authors: @lescai, @maxulysse, @matthdsm Maintainers: @lescai, @maxulysse, @matthdsm

SAMTOOLS_STATS

Defined in modules/nf-core/samtools/stats/main.nf:1

Keywords: statistics, counts, bam, sam, cram

Produces comprehensive statistics from SAM/BAM/CRAM file

Tools

samtools

SAMtools is a set of utilities for interacting with and post-processing short DNA sequence read alignments in the SAM, BAM and CRAM formats, written by Heng Li. These files are generated as output by short read aligners like BWA.

Homepage | Documentation | biotools:samtools | License: MIT

Inputs

Name	Type	Description
`meta`	`map`	Groovy Map containing sample information e.g. [ id:'test', single_end:false ]
`input`	`file`	BAM/CRAM file from alignment
`input_index`	`file`	BAI/CRAI file from alignment
`meta2`	`map`	Groovy Map containing reference information e.g. [ id:'genome' ]
`fasta`	`file`	Reference file the CRAM was created with (optional)

Outputs

Name	Type	Emit	Description
`val(meta), path("*.stats")`	`tuple`	`stats`	-
`versions.yml`	`path`	`versions`	-

Authors: @drpatelh, @FriederikeHanssen, @ramprasadn Maintainers: @drpatelh, @FriederikeHanssen, @ramprasadn

SAMTOOLS_CONVERT

Defined in modules/nf-core/samtools/convert/main.nf:1

Keywords: view, index, bam, cram

Convert and then index a CRAM -> BAM or BAM -> CRAM file

Tools

samtools

SAMtools is a set of utilities for interacting with and post-processing short DNA sequence read alignments in the SAM, BAM and CRAM formats, written by Heng Li. These files are generated as output by short read aligners like BWA.

Homepage | Documentation | biotools:samtools | License: MIT

Inputs

Name	Type	Description
`meta`	`map`	Groovy Map containing sample information e.g. [ id:'test', single_end:false ]
`input`	`file`	BAM/CRAM file
`index`	`file`	BAM/CRAM index file
`meta2`	`map`	Groovy Map containing sample information e.g. [ id:'test', single_end:false ]
`fasta`	`file`	Reference file to create the CRAM file
`meta3`	`map`	Groovy Map containing sample information e.g. [ id:'test', single_end:false ]
`fai`	`file`	Reference index file to create the CRAM file

Outputs

Name	Type	Emit	Description
`val(meta), path("*.bam")`	`tuple`	`bam`	-
`val(meta), path("*.cram")`	`tuple`	`cram`	-
`val(meta), path("*.bai")`	`tuple`	`bai`	-
`val(meta), path("*.crai")`	`tuple`	`crai`	-
`versions.yml`	`path`	`versions`	-

Authors: @FriederikeHanssen, @maxulysse Maintainers: @FriederikeHanssen, @maxulysse, @matthdsm

VCFLIB_VCFFILTER

Defined in modules/nf-core/vcflib/vcffilter/main.nf:1

Keywords: filter, variant, vcf, quality

Command line tools for parsing and manipulating VCF files.

Tools

vcflib

Command line tools for parsing and manipulating VCF files.

Homepage | Documentation | biotools:vcflib | License: MIT

Inputs

Name	Type	Description
`meta`	`map`	Groovy Map containing sample information e.g. [ id:'test_sample_1' ]
`vcf`	`file`	VCF file
`tbi`	`file`	Index file

Outputs

Name	Type	Pattern	Description
`vcf`	`file`	`*.{vcf.gz}`	Filtered VCF file

Authors: @zachary-foster Maintainers: @zachary-foster

BBMAP_BBSPLIT

Defined in modules/nf-core/bbmap/bbsplit/main.nf:1

Keywords: align, map, fastq, genome, reference

Split sequencing reads by mapping them to multiple references simultaneously

Tools

bbmap

BBMap is a short read aligner, as well as various other bioinformatic tools.

Homepage | Documentation | biotools:bbmap | License: UC-LBL license (see package)

Inputs

Name	Type	Description
`meta`	`map`	Groovy Map containing sample information e.g. [ id:'test', single_end:false ]
`reads`	`file`	List of input FastQ files of size 1 and 2 for single-end and paired-end data, respectively.
`other_ref_names`	`list`	List of other reference ids apart from the primary
`other_ref_paths`	`list`	Paths to the other references, corresponding to "other_ref_names"

Outputs

Name	Type	Pattern	Description
`index`	`-`	`-`	-
`primary_fastq`	`file`	`primaryfastq.gz`	Output reads that map to the primary reference
`all_fastq`	`file`	`*fastq.gz`	All reads mapping to any of the references
`stats`	`file`	`*.txt`	Tab-delimited text file containing mapping statistics
`log`	`file`	`*.log`	Log file

Authors: @joseespinosa, @drpatelh, @pinin4fjords Maintainers: @joseespinosa, @drpatelh, @pinin4fjords

MSISENSOR2_MSI

Defined in modules/nf-core/msisensor2/msi/main.nf:1

Keywords: msi, microsatellite, microsatellite instability, tumor, cfDNA

msisensor2 detection of MSI regions.

Tools

msisensor2

MSIsensor2 is a novel machine-learning-based algorithm, featuring a large upgrade in microsatellite instability (MSI) detection for tumor-only sequencing data, including Cell-Free DNA (cfDNA), Formalin-Fixed Paraffin-Embedded (FFPE) and other sample types. The original MSIsensor is specially designed for tumor/normal paired sequencing data.

Homepage | Documentation

Inputs

Name	Type	Description
`meta`	`map`	Groovy Map containing sample information e.g. [ id:'test', single_end:false ]
`tumor_bam`	`file`	BAM/CRAM/SAM file
`tumor_bam_index`	`file`	BAM/CRAM/SAM index file
`meta2`	`map`	Groovy Map containing sample information e.g. [ id:'test', single_end:false ]
`models`	`file`	Folder of MSISensor2 models (available from Github or as a product of msisensor2/scan)

Outputs

Name	Type	Pattern	Description
`msi`	`file`	`-`	MSI classifications as a text file
`distribution`	`file`	`-`	Read count distributions of MSI regions
`somatic`	`file`	`-`	Somatic MSI regions detected.

Authors: @adamrtalbot Maintainers: @adamrtalbot

CONTROLFREEC_FREEC2BED

Defined in modules/nf-core/controlfreec/freec2bed/main.nf:1

Keywords: cna, cnv, somatic, single, tumor-only

Convert Freec output to BED format

Tools

controlfreec

Copy number and genotype annotation from whole genome and whole exome sequencing data.

Homepage | Documentation | License: GPL >=2

Inputs

Name	Type	Description
`meta`	`map`	Groovy Map containing sample information e.g. [ id:'test', single_end:false ]
`ratio`	`file`	ratio file generated by FREEC

Outputs

Name	Type	Pattern	Description
`bed`	`file`	`*.bed`	Bed file

Authors: @FriederikeHanssen Maintainers: @FriederikeHanssen

CONTROLFREEC_MAKEGRAPH2

Defined in modules/nf-core/controlfreec/makegraph2/main.nf:1

Keywords: cna, cnv, somatic, single, tumor-only

Plot Freec output

Tools

controlfreec

Copy number and genotype annotation from whole genome and whole exome sequencing data.

Homepage | Documentation | License: GPL >=2

Inputs

Name	Type	Description
`meta`	`map`	Groovy Map containing sample information e.g. [ id:'test', single_end:false ]
`ratio`	`file`	ratio file generated by FREEC
`baf`	`file`	.BAF file generated by FREEC

Outputs

Name	Type	Pattern	Description
`png_baf`	`file`	`*_BAF.png`	Image of BAF plot
`png_ratio_log2`	`file`	`*_ratio.log2.png`	Image of ratio log2 plot
`png_ratio`	`file`	`*_ratio.png`	Image of ratio plot

Authors: @FriederikeHanssen

CONTROLFREEC_FREEC2CIRCOS

Defined in modules/nf-core/controlfreec/freec2circos/main.nf:1

Keywords: cna, cnv, somatic, single, tumor-only

Format Freec output to circos input format

Tools

controlfreec

Copy number and genotype annotation from whole genome and whole exome sequencing data.

Homepage | Documentation | License: GPL >=2

Inputs

Name	Type	Description
`meta`	`map`	Groovy Map containing sample information e.g. [ id:'test', single_end:false ]
`ratio`	`file`	ratio file generated by FREEC

Outputs

Name	Type	Pattern	Description
`circos`	`file`	`*.circos.txt`	Txt file

Authors: @FriederikeHanssen Maintainers: @FriederikeHanssen

CONTROLFREEC_FREEC

Defined in modules/nf-core/controlfreec/freec/main.nf:1

Keywords: cna, cnv, somatic, single, tumor-only

Copy number and genotype annotation from whole genome and whole exome sequencing data

Tools

controlfreec/freec

Copy number and genotype annotation from whole genome and whole exome sequencing data.

Homepage | Documentation | License: GPL >=2

Inputs

Name	Type	Description
`meta`	`map`	Groovy Map containing sample information e.g. [ id:'test', single_end:false ]
`mpileup_normal`	`file`	miniPileup file
`mpileup_tumor`	`file`	miniPileup file
`cpn_normal`	`file`	Raw copy number profiles (optional)
`cpn_tumor`	`file`	Raw copy number profiles (optional)
`minipileup_normal`	`file`	miniPileup file from previous run (optional)
`minipileup_tumor`	`file`	miniPileup file from previous run (optional)

Outputs

Name	Type	Pattern	Description
`bedgraph`	`file`	`.bedgraph`	Bedgraph format for the UCSC genome browser
`control_cpn`	`file`	`*_control.cpn`	files with raw copy number profiles
`sample_cpn`	`file`	`*_sample.cpn`	files with raw copy number profiles
`gcprofile_cpn`	`file`	`GC_profile.*.cpn`	file with GC-content profile.
`BAF`	`file`	`*_BAF.txt`	file with B-allele frequencies for each possibly heterozygous SNP position
`CNV`	`file`	`*_CNVs`	file with coordinates of predicted copy number alterations.
`info`	`file`	`*_info.txt`	parsable file with information about FREEC run
`ratio`	`file`	`*_ratio.txt`	file with ratios and predicted copy number alterations for each window
`config`	`file`	`config.txt`	Config file used to run Control-FREEC

Authors: @FriederikeHanssen Maintainers: @FriederikeHanssen

CONTROLFREEC_ASSESSSIGNIFICANCE

Defined in modules/nf-core/controlfreec/assesssignificance/main.nf:1

Keywords: cna, cnv, somatic, single, tumor-only

Add both Wilcoxon test and Kolmogorov-Smirnov test p-values to each CNV output of FREEC

Tools

controlfreec/assesssignificance

Copy number and genotype annotation from whole genome and whole exome sequencing data.

Homepage | Documentation | License: GPL >=2

Inputs

Name	Type	Description
`meta`	`map`	Groovy Map containing sample information e.g. [ id:'test', single_end:false ]
`cnvs`	`file`	_CNVs file generated by FREEC
`ratio`	`file`	ratio file generated by FREEC

Outputs

Name	Type	Pattern	Description
`p_value_txt`	`file`	`*.p.value.txt`	CNV file containing p_values for each call

Authors: @FriederikeHanssen Maintainers: @FriederikeHanssen

GOLEFT_INDEXCOV

Defined in modules/nf-core/goleft/indexcov/main.nf:1

Keywords: coverage, cnv, genomics, depth

Quickly estimate coverage from a whole-genome bam or cram index. A bam index has 16KB resolution so that's what this gives, but it provides what appears to be a high-quality coverage estimate in seconds per genome.
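
The resolution figure translates directly into the number of coverage bins per chromosome. The small sketch below assumes that "16KB" means 16,384 bases per bin (the window size of the BAM linear index); the function name is hypothetical.

```python
BIN_SIZE = 16384  # assumed bin width in bases (2**14)


def index_bins(chrom_length):
    """Number of 16,384-base bins needed to cover a chromosome."""
    if chrom_length < 0:
        raise ValueError("chromosome length must be non-negative")
    # Ceiling division using integers only.
    return -(-chrom_length // BIN_SIZE)


bins_chr1 = index_bins(248956422)  # a chromosome of about 249 Mb
```

A few thousand to a few tens of thousands of bins per chromosome is why the estimate is fast but too coarse for small events.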

Tools

goleft

goleft is a collection of bioinformatics tools distributed under MIT license in a single static binary

Homepage | Documentation | License: MIT

Inputs

Name	Type	Description
`meta`	`map`	Groovy Map containing sample information e.g. [ id:'test', single_end:false]
`bams`	`file`	Sorted BAM/CRAM/SAM files
`indexes`	`file`	BAI/CRAI files
`meta2`	`map`	Groovy Map containing sample information e.g. [ id:'test', single_end:false]
`fai`	`file`	FASTA index

Outputs

Name	Type	Emit	Description
`val(meta)`	`tuple`	-	-

Authors: @lindenb Maintainers: @lindenb

DRAGMAP_ALIGN

Defined in modules/nf-core/dragmap/align/main.nf:1

Keywords: alignment, map, fastq, bam, sam

Performs fastq alignment to a reference using DRAGMAP

Tools

dragmap

Dragmap is the Dragen mapper/aligner Open Source Software.

Homepage | Documentation | License: GPL v3

Inputs

Name	Type	Description
`meta`	`map`	Groovy Map containing sample information e.g. [ id:'test', single_end:false ]
`reads`	`file`	List of input FastQ files of size 1 and 2 for single-end and paired-end data, respectively.
`meta2`	`map`	Groovy Map containing reference information e.g. [ id:'test', single_end:false ]
`hashmap`	`file`	DRAGMAP hash table
`meta3`	`map`	Groovy Map containing reference information e.g. [ id:'genome']
`fasta`	`file`	Genome fasta reference files
`sort_bam`	`boolean`	Sort the BAM file

Outputs

Name	Type	Emit	Description
`val(meta), path("*.sam")`	`tuple`	`sam`	-
`val(meta), path("*.bam")`	`tuple`	`bam`	-
`val(meta), path("*.cram")`	`tuple`	`cram`	-
`val(meta), path("*.crai")`	`tuple`	`crai`	-
`val(meta), path("*.csi")`	`tuple`	`csi`	-
`val(meta), path('*.log')`	`tuple`	`log`	-
`versions.yml`	`path`	`versions`	-

Authors: @edmundmiller Maintainers: @edmundmiller

DRAGMAP_HASHTABLE

Defined in modules/nf-core/dragmap/hashtable/main.nf:1

Keywords: index, fasta, genome, reference

Create DRAGEN hashtable for reference genome

Tools

dragmap

Dragmap is the Dragen mapper/aligner Open Source Software.

Homepage | Documentation | License: GPL v3

Inputs

Name	Type	Description
`meta`	`map`	Groovy Map containing reference information e.g. [ id:'test', single_end:false ]
`fasta`	`file`	Input genome fasta file

Outputs

Name	Type	Pattern	Description
`hashmap`	`file`	`*.{cmp,bin,txt}`	DRAGMAP hash table

Authors: @edmundmiller Maintainers: @edmundmiller

STRELKA_GERMLINE

Defined in modules/nf-core/strelka/germline/main.nf:1

Keywords: variantcalling, germline, wgs, vcf, variants

Strelka2 is a fast and accurate small variant caller optimized for analysis of germline variation

Tools

strelka

Strelka calls somatic and germline small variants from mapped sequencing reads

Homepage | Documentation | biotools:strelka | License: GPL v3

Inputs

Name	Type	Description
`meta`	`map`	Groovy Map containing sample information e.g. [ id:'test']
`input`	`file`	BAM/CRAM file
`input_index`	`file`	BAI/CRAI index file
`target_bed`	`file`	BED file containing target regions for variant calling
`target_bed_index`	`file`	Index for BED file containing target regions for variant calling

Outputs

Name	Type	Pattern	Description
`vcf`	`file`	`*.{vcf.gz}`	gzipped germline variant file
`vcf_tbi`	`file`	`*.vcf.gz.tbi`	index file for the vcf file
`genome_vcf`	`file`	`*_genome.vcf.gz`	variant records and compressed non-variant blocks
`genome_vcf_tbi`	`file`	`*_genome.vcf.gz.tbi`	index file for the genome_vcf file

Authors: @arontommi Maintainers: @arontommi

STRELKA_SOMATIC

Defined in modules/nf-core/strelka/somatic/main.nf:1

Keywords: variant calling, germline, wgs, vcf, variants

Strelka2 is a fast and accurate small variant caller optimized for analysis of germline variation in small cohorts and somatic variation in tumor/normal sample pairs

Tools

strelka

Strelka calls somatic and germline small variants from mapped sequencing reads

Homepage | Documentation | biotools:strelka | License: GPL v3

Inputs

Name	Type	Description
`meta`	`map`	Groovy Map containing sample information e.g. [ id:'test', single_end:false ]
`input_normal`	`file`	BAM/CRAM/SAM file
`input_index_normal`	`file`	BAM/CRAM/SAM index file
`input_tumor`	`file`	BAM/CRAM/SAM file
`input_index_tumor`	`file`	BAM/CRAM/SAM index file
`manta_candidate_small_indels`	`file`	VCF.gz file
`manta_candidate_small_indels_tbi`	`file`	VCF.gz index file
`target_bed`	`file`	BED file containing target regions for variant calling
`target_bed_index`	`file`	Index for BED file containing target regions for variant calling

Outputs

Name	Type	Pattern	Description
`vcf_indels`	`file`	`*.{vcf.gz}`	Gzipped VCF file containing variants
`vcf_indels_tbi`	`file`	`*.{vcf.gz.tbi}`	Index for gzipped VCF file containing variants
`vcf_snvs`	`file`	`*.{vcf.gz}`	Gzipped VCF file containing variants
`vcf_snvs_tbi`	`file`	`*.{vcf.gz.tbi}`	Index for gzipped VCF file containing variants

Authors: @drpatelh Maintainers: @drpatelh

BWA_INDEX

Defined in modules/nf-core/bwa/index/main.nf:1

Keywords: index, fasta, genome, reference

Create BWA index for reference genome

Tools

bwa

BWA is a software package for mapping DNA sequences against a large reference genome, such as the human genome.

Homepage | Documentation | biotools:bwa | License: GPL-3.0-or-later

Inputs

Name	Type	Description
`meta`	`map`	Groovy Map containing reference information. e.g. [ id:'test', single_end:false ]
`fasta`	`file`	Input genome fasta file

Outputs

Name	Type	Pattern	Description
`index`	`file`	`*.{amb,ann,bwt,pac,sa}`	BWA genome index files
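
The output pattern lists five index suffixes. A quick completeness check can be sketched as below, assuming the index files share a common prefix followed by one of those suffixes; the helper names and the example prefix are hypothetical.

```python
BWA_INDEX_SUFFIXES = ("amb", "ann", "bwt", "pac", "sa")


def expected_index_files(prefix):
    """File names a complete BWA index should contain for a given prefix."""
    return [f"{prefix}.{suffix}" for suffix in BWA_INDEX_SUFFIXES]


def missing_index_files(prefix, present):
    """Return expected index files that are absent from `present`."""
    present = set(present)
    return [name for name in expected_index_files(prefix) if name not in present]


missing = missing_index_files(
    "genome", ["genome.amb", "genome.ann", "genome.bwt", "genome.pac"]
)
```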

Authors: @drpatelh, @maxulysse Maintainers: @drpatelh, @maxulysse, @gallvp

BWA_MEM

Defined in modules/nf-core/bwa/mem/main.nf:1

Keywords: mem, bwa, alignment, map, fastq, bam, sam

Performs fastq alignment to a fasta reference using BWA

Tools

bwa

BWA is a software package for mapping DNA sequences against a large reference genome, such as the human genome.

Homepage | Documentation | biotools:bwa | License: GPL-3.0-or-later

Inputs

Name	Type	Description
`meta`	`map`	Groovy Map containing sample information e.g. [ id:'test', single_end:false ]
`reads`	`file`	List of input FastQ files of size 1 and 2 for single-end and paired-end data, respectively.
`meta2`	`map`	Groovy Map containing reference information. e.g. [ id:'test', single_end:false ]
`index`	`file`	BWA genome index files
`meta3`	`map`	Groovy Map containing sample information e.g. [ id:'test', single_end:false ]
`fasta`	`file`	Reference genome in FASTA format
`sort_bam`	`boolean`	use samtools sort (true) or samtools view (false)
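
The `meta` and `reads` inputs are linked by the rule stated above: one FastQ file for single-end data, two for paired-end. A minimal consistency check might look like the following sketch, where a Python dict stands in for the Groovy map and the function name is hypothetical.

```python
def check_reads(meta, reads):
    """Verify that the number of FastQ files matches meta['single_end']."""
    expected = 1 if meta.get("single_end") else 2
    if len(reads) != expected:
        raise ValueError(
            f"sample {meta.get('id')!r}: expected {expected} FastQ file(s), "
            f"got {len(reads)}"
        )
    return True


ok = check_reads(
    {"id": "test", "single_end": False},
    ["test_1.fastq.gz", "test_2.fastq.gz"],
)
```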

Outputs

Name	Type	Emit	Description
`val(meta), path("*.bam")`	`tuple`	`bam`	-
`val(meta), path("*.cram")`	`tuple`	`cram`	-
`val(meta), path("*.csi")`	`tuple`	`csi`	-
`val(meta), path("*.crai")`	`tuple`	`crai`	-
`versions.yml`	`path`	`versions`	-

Authors: @drpatelh, @jeremy1805, @matthdsm Maintainers: @drpatelh, @jeremy1805, @matthdsm

SNPEFF_SNPEFF

Defined in modules/nf-core/snpeff/snpeff/main.nf:1

Keywords: annotation, effect prediction, snpeff, variant, vcf

Genetic variant annotation and functional effect prediction toolbox

Tools

snpeff

SnpEff is a variant annotation and effect prediction tool. It annotates and predicts the effects of genetic variants on genes and proteins (such as amino acid changes).

Homepage | Documentation | biotools:snpeff | License: MIT

Inputs

Name	Type	Description
`meta`	`map`	Groovy Map containing sample information e.g. [ id:'test', single_end:false ]
`vcf`	`file`	vcf to annotate
`meta2`	`map`	Groovy Map containing sample information e.g. [ id:'test', single_end:false ]
`cache`	`file`	path to snpEff cache (optional)

Outputs

Name	Type	Pattern	Description
`vcf`	`file`	`*.ann.vcf`	annotated vcf
`report`	`file`	`*.csv`	snpEff report csv file
`summary_html`	`file`	`*.html`	snpEff summary statistics in html file
`genes_txt`	`file`	`*.genes.txt`	txt (tab separated) file having counts of the number of variants affecting each transcript and gene

Authors: @maxulysse Maintainers: @maxulysse

SNPEFF_DOWNLOAD

Defined in modules/nf-core/snpeff/download/main.nf:1

Keywords: annotation, effect prediction, snpeff, variant, vcf

Genetic variant annotation and functional effect prediction toolbox

Tools

snpeff

SnpEff is a variant annotation and effect prediction tool. It annotates and predicts the effects of genetic variants on genes and proteins (such as amino acid changes).

Homepage | Documentation | biotools:snpeff | License: MIT

Inputs

Name	Type	Description
`meta`	`map`	Groovy Map containing sample information e.g. [ id:'test', single_end:false ]
`snpeff_db`	`string`	SnpEff database name

Outputs

Name	Type	Pattern	Description
`cache`	`file`	`-`	snpEff cache

Authors: @maxulysse Maintainers: @maxulysse

NGSCHECKMATE_NCM

Defined in modules/nf-core/ngscheckmate/ncm/main.nf:1

Keywords: ngscheckmate, matching, snp

Determines whether sequencing data come from the same individual by using SNP matching. Designed for human data in VCF or BAM files.

Tools

ngscheckmate

NGSCheckMate is a software package for identifying next generation sequencing (NGS) data files from the same individual, including matching between DNA and RNA.

Homepage | Documentation | biotools:ngscheckmate | License: MIT

Inputs

Name	Type	Description
`meta`	`map`	Groovy Map containing sample information e.g. [ id:'test']
`files`	`file`	VCF or BAM files for each sample, in a merged channel (possibly gzipped). BAM files require an index too.
`meta2`	`map`	Groovy Map containing SNP information e.g. [ id:'test' ]
`snp_bed`	`file`	BED file containing the SNPs to analyse
`meta3`	`map`	Groovy Map containing reference fasta index information e.g. [ id:'test' ]
`fasta`	`file`	fasta file for the genome, only used in the bam mode

Outputs

Name	Type	Emit	Description
`val(meta), path("*_corr_matrix.txt")`	`tuple`	`corr_matrix`	-
`val(meta), path("*_matched.txt")`	`tuple`	`matched`	-
`val(meta), path("*_all.txt")`	`tuple`	`all`	-
`val(meta), path("*.pdf")`	`tuple`	`pdf`	-
`val(meta), path("*.vcf")`	`tuple`	`vcf`	-
`versions.yml`	`path`	`versions`	-

Authors: @sppearce Maintainers: @sppearce
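
NGSCHECKMATE_NCM emits a correlation matrix (`*_corr_matrix.txt`). The sketch below illustrates the general idea of SNP matching under a simplifying assumption: each sample is summarised as a vector of variant allele fractions at the same SNP positions, and two samples are compared by Pearson correlation. The function name and the example values are illustrative only, and this is not NGSCheckMate's actual decision model.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equally long lists of allele fractions."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two vectors of equal length >= 2")
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


if __name__ == "__main__":
    # Hypothetical allele fractions at five SNP positions.
    sample_a = [0.0, 0.5, 1.0, 0.5, 0.0]
    sample_b = [0.02, 0.48, 0.97, 0.52, 0.01]  # resembles sample_a
    sample_c = [1.0, 0.0, 0.5, 0.0, 1.0]       # different genotype pattern
    print(pearson(sample_a, sample_b), pearson(sample_a, sample_c))
```

A high correlation suggests that two files share genotypes at the tested SNPs; where to draw the line between matched and unmatched pairs is left open here.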

ENSEMBLVEP_VEP

Defined in modules/nf-core/ensemblvep/vep/main.nf:1

Keywords: annotation, vcf, json, tab

Ensembl Variant Effect Predictor (VEP). The output-file-format is controlled through task.ext.args.

Tools

ensemblvep

VEP determines the effect of your variants (SNPs, insertions, deletions, CNVs or structural variants) on genes, transcripts, and protein sequence, as well as regulatory regions.

Homepage | Documentation | License: Apache-2.0

Inputs

Name	Type	Description
`meta`	`map`	Groovy Map containing sample information e.g. [ id:'test', single_end:false ]
`vcf`	`file`	vcf to annotate
`custom_extra_files`	`file`	extra sample-specific files to be used with the `--custom` flag to be configured with ext.args (optional)
`meta2`	`map`	Groovy Map containing fasta reference information e.g. [ id:'test' ]
`fasta`	`file`	reference FASTA file (optional)

Outputs

Name	Type	Pattern	Description
`vcf`	`file`	`*.vcf.gz`	annotated vcf (optional)
`tbi`	`file`	`*.vcf.gz.tbi`	annotated vcf index (optional)
`tab`	`file`	`*.ann.tab.gz`	tab file with annotated variants (optional)
`json`	`file`	`*.ann.json.gz`	json file with annotated variants (optional)
`report`	`-`	`-`	-

Authors: @maxulysse, @matthdsm, @nvnieuwk Maintainers: @maxulysse, @matthdsm, @nvnieuwk

ENSEMBLVEP_DOWNLOAD

Defined in modules/nf-core/ensemblvep/download/main.nf:1

Keywords: annotation, cache, download

Ensembl Variant Effect Predictor (VEP). The cache downloading options are controlled through task.ext.args.

Tools

ensemblvep

VEP determines the effect of your variants (SNPs, insertions, deletions, CNVs or structural variants) on genes, transcripts, and protein sequence, as well as regulatory regions.

Homepage | Documentation | License: Apache-2.0

Inputs

Name	Type	Description
`meta`	`map`	Groovy Map containing sample information e.g. [ id:'test', single_end:false ]
`assembly`	`string`	Genome assembly
`species`	`string`	Species
`cache_version`	`string`	cache version

Outputs

Name	Type	Pattern	Description
`cache`	`file`	`*`	cache

Authors: @maxulysse Maintainers: @maxulysse

VARLOCIRAPTOR_CALLVARIANTS

Defined in modules/nf-core/varlociraptor/callvariants/main.nf:1

Keywords: observations, variants, calling

Call variants for a given scenario specified with the varlociraptor calling grammar, preprocessed by varlociraptor preprocessing

Tools

varlociraptor

Flexible, uncertainty-aware variant calling with parameter-free filtration via FDR control.

Homepage | Documentation | biotools:varlociraptor | License: GPL v3

Inputs

Name	Type	Description
`meta`	`map`	Groovy Map containing sample information e.g. [ id:'test', single_end:false ]
`vcfs`	`file`	Sorted VCF/BCF file containing sample observations. Can also be a list of files.

Outputs

Name	Type	Pattern	Description
`bcf`	`file`	`*.bcf`	BCF file containing sample observations

Authors: @FriederikeHanssen Maintainers: @FriederikeHanssen, @famosab

VARLOCIRAPTOR_PREPROCESS

Defined in modules/nf-core/varlociraptor/preprocess/main.nf:1

Keywords: observations, variants, preprocessing

Obtains per-sample observations for the actual calling process with varlociraptor calls

Tools

varlociraptor

Flexible, uncertainty-aware variant calling with parameter-free filtration via FDR control.

Homepage | Documentation | biotools:varlociraptor | License: GPL v3

Inputs

Name	Type	Description
`meta`	`map`	Groovy Map containing sample information e.g. [ id:'test', single_end:false ]
`bam`	`file`	Sorted BAM/CRAM/SAM file
`bai`	`file`	Index of the BAM/CRAM/SAM file
`candidates`	`file`	Sorted BCF/VCF file
`alignment_json`	`file`	File containing alignment properties obtained with varlociraptor/estimatealignmentproperties
`meta2`	`map`	Groovy Map containing reference information e.g. [ id:'test', single_end:false ]
`fasta`	`file`	Reference fasta file
`meta3`	`map`	Groovy Map containing reference index information e.g. [ id:'test', single_end:false ]
`fai`	`file`	Index for reference fasta file (must be generated with samtools faidx)

Outputs

Name	Type	Pattern	Description
`bcf`	`file`	`*.bcf`	BCF file containing sample observations

Authors: @FriederikeHanssen Maintainers: @FriederikeHanssen, @famosab

VARLOCIRAPTOR_ESTIMATEALIGNMENTPROPERTIES

Defined in modules/nf-core/varlociraptor/estimatealignmentproperties/main.nf:1

Keywords: estimation, alignment, variants

To assess candidate indels and structural variants, Varlociraptor needs to know certain properties of the underlying sequencing experiment in combination with the read aligner used.

Tools

varlociraptor

Flexible, uncertainty-aware variant calling with parameter-free filtration via FDR control.

Homepage | Documentation | biotools:varlociraptor | License: GPL v3

Inputs

Name	Type	Description
`meta`	`map`	Groovy Map containing sample information e.g. [ id:'test', single_end:false ]
`bam`	`file`	Sorted BAM/CRAM/SAM file
`bai`	`file`	Index of sorted BAM/CRAM/SAM file
`meta2`	`map`	Groovy Map containing reference information e.g. [ id:'test', single_end:false ]
`fasta`	`file`	Reference fasta file
`meta3`	`map`	Groovy Map containing reference index information e.g. [ id:'test', single_end:false ]
`fai`	`file`	Index for reference fasta file (must be generated with samtools faidx)

Outputs

Name	Type	Pattern	Description
`alignment_properties_json`	`file`	`*.alignment-properties.json`	File containing alignment properties

Authors: @FriederikeHanssen Maintainers: @FriederikeHanssen, @famosab

CNVKIT_GENEMETRICS

Defined in modules/nf-core/cnvkit/genemetrics/main.nf:1

Keywords: cnvkit, bam, fasta, copy number

Copy number variant detection from high-throughput sequencing data

Tools

cnvkit

CNVkit is a Python library and command-line software toolkit to infer and visualize copy number from high-throughput DNA sequencing data. It is designed for use with hybrid capture, including both whole-exome and custom target panels, and short-read sequencing platforms such as Illumina and Ion Torrent.

Homepage | Documentation | biotools:cnvkit | License: Apache-2.0

Inputs

Name	Type	Description
`meta`	`map`	Groovy Map containing sample information e.g. [ id:'test', single_end:false ]
`cnr`	`file`	CNR file
`cns`	`file`	CNS file [Optional]

Outputs

Name	Type	Emit	Description
`val(meta), path("*.tsv")`	`tuple`	`tsv`	-
`versions.yml`	`path`	`versions`	-

Authors: @adamrtalbot, @marrip, @priesgo Maintainers: @adamrtalbot, @marrip, @priesgo

CNVKIT_CALL

Defined in modules/nf-core/cnvkit/call/main.nf:1

Keywords: cnvkit, bam, fasta, copy number

Given segmented log2 ratio estimates (.cns), derive each segment’s absolute integer copy number

Tools

cnvkit

CNVkit is a Python library and command-line software toolkit to infer and visualize copy number from high-throughput DNA sequencing data. It is designed for use with hybrid capture, including both whole-exome and custom target panels, and short-read sequencing platforms such as Illumina and Ion Torrent.

Homepage | Documentation | biotools:cnvkit | License: Apache-2.0

Inputs

Name	Type	Description
`meta`	`map`	Groovy Map containing sample information e.g. [ id:'test', single_end:false ]
`cns`	`file`	CNVKit CNS file.
`vcf`	`file`	Germline VCF file for BAF.

Outputs

Name	Type	Emit	Description
`val(meta), path("*.cns")`	`tuple`	`cns`	-
`versions.yml`	`path`	`versions`	-

Authors: @adamrtalbot, @priesgo Maintainers: @adamrtalbot, @priesgo
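
CNVKIT_CALL derives an integer copy number for each segment from its log2 ratio. A minimal sketch of that conversion, assuming a pure sample with a neutral copy number equal to `ploidy` and plain rounding; the function name is hypothetical, and CNVkit itself may apply other calling methods and purity corrections that are not modelled here.

```python
def absolute_copy_number(log2_ratio, ploidy=2):
    """Round a segment's log2 ratio to a non-negative integer copy number.

    Assumes a pure sample whose neutral copy number equals `ploidy`.
    """
    estimate = ploidy * 2 ** log2_ratio
    return max(0, round(estimate))


if __name__ == "__main__":
    # Hypothetical segments: (label, log2 ratio).
    for label, log2_ratio in [("neutral", 0.0), ("loss", -1.0), ("gain", 0.585)]:
        print(label, absolute_copy_number(log2_ratio))
```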

CNVKIT_BATCH

Defined in modules/nf-core/cnvkit/batch/main.nf:1

Keywords: cnvkit, bam, fasta, copy number

Copy number variant detection from high-throughput sequencing data

Tools

cnvkit

CNVkit is a Python library and command-line software toolkit to infer and visualize copy number from high-throughput DNA sequencing data. It is designed for use with hybrid capture, including both whole-exome and custom target panels, and short-read sequencing platforms such as Illumina and Ion Torrent.

Homepage | Documentation | biotools:cnvkit | License: Apache-2.0

Inputs

Name	Type	Description
`meta`	`map`	Groovy Map containing sample information e.g. [ id:'test', single_end:false ]
`tumor`	`file`	Input tumour sample bam file (or cram)
`normal`	`file`	Input normal sample bam file (or cram)
`meta2`	`map`	Groovy Map containing reference information e.g. [ id:'test' ]
`fasta`	`file`	Input reference genome fasta file (only needed for cram_input and/or when normal_samples are provided)
`meta3`	`map`	Groovy Map containing reference information e.g. [ id:'test' ]
`fasta_fai`	`file`	Input reference genome fasta index (optional, but recommended for cram_input)
`meta4`	`map`	Groovy Map containing information about target file e.g. [ id:'test' ]
`targets`	`file`	Input target bed file
`meta5`	`map`	Groovy Map containing information about reference file e.g. [ id:'test' ]
`reference`	`file`	Input reference cnn-file (only for germline and tumor-only runs)
`panel_of_normals`	`file`	Input panel of normals file

Outputs

Name	Type	Emit	Description
`val(meta), path("*.bed")`	`tuple`	`bed`	-
`val(meta), path("*.cnn")`	`tuple`	`cnn`	-
`val(meta), path("*.cnr")`	`tuple`	`cnr`	-
`val(meta), path("*.cns")`	`tuple`	`cns`	-
`val(meta), path("*.pdf")`	`tuple`	`pdf`	-
`val(meta), path("*.png")`	`tuple`	`png`	-
`versions.yml`	`path`	`versions`	-

Authors: @adamrtalbot, @drpatelh, @fbdtemme, @kaurravneet4123, @KevinMenden, @lassefolkersen, @MaxUlysse, @priesgo, @SusiJo Maintainers: @adamrtalbot, @drpatelh, @fbdtemme, @kaurravneet4123, @KevinMenden, @lassefolkersen, @MaxUlysse, @priesgo, @SusiJo

CNVKIT_ANTITARGET

Defined in modules/nf-core/cnvkit/antitarget/main.nf:1

Keywords: cnvkit, antitarget, cnv, copy number

Derive off-target (“antitarget”) bins from target regions.

Tools

cnvkit

CNVkit is a Python library and command-line software toolkit to infer and visualize copy number from high-throughput DNA sequencing data. It is designed for use with hybrid capture, including both whole-exome and custom target panels, and short-read sequencing platforms such as Illumina and Ion Torrent.

Homepage | Documentation | biotools:cnvkit | License: Apache-2.0

Inputs

Name	Type	Description
`meta`	`map`	Groovy Map containing sample information e.g. [ id:'test', single_end:false ]
`targets`	`file`	File containing genomic regions

Outputs

Name	Type	Emit	Description
`val(meta), path("*.bed")`	`tuple`	`bed`	-
`versions.yml`	`path`	`versions`	-

Authors: @adamrtalbot, @priesgo, @SusiJo Maintainers: @adamrtalbot, @priesgo, @SusiJo
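
CNVKIT_ANTITARGET derives off-target regions from the target regions. The sketch below shows the core idea on a single chromosome, assuming half-open, 0-based intervals as in BED files: off-target regions are the gaps between (possibly overlapping) targets. `off_target_regions` is a hypothetical helper; padding around targets, binning, and minimum-size filtering are not modelled.

```python
def off_target_regions(chrom_length, targets):
    """Return the (start, end) gaps of one chromosome not covered by targets."""
    regions = []
    cursor = 0  # end of the rightmost target seen so far
    for start, end in sorted(targets):
        if start > cursor:
            regions.append((cursor, start))
        cursor = max(cursor, end)
    if cursor < chrom_length:
        regions.append((cursor, chrom_length))
    return regions


if __name__ == "__main__":
    targets = [(100, 200), (150, 300), (600, 700)]
    print(off_target_regions(1000, targets))
```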

CNVKIT_EXPORT

Defined in modules/nf-core/cnvkit/export/main.nf:1

Keywords: cnvkit, copy number, export

Convert copy number ratio tables (.cnr files) or segments (.cns) to another format.

Tools

cnvkit

CNVkit is a Python library and command-line software toolkit to infer and visualize copy number from high-throughput DNA sequencing data. It is designed for use with hybrid capture, including both whole-exome and custom target panels, and short-read sequencing platforms such as Illumina and Ion Torrent.

Homepage | Documentation | biotools:cnvkit | License: Apache-2.0

Inputs

Name	Type	Description
`meta`	`map`	Groovy Map containing sample information e.g. [ id:'test', single_end:false ]
`cns`	`file`	CNVKit CNS file.

Outputs

Name	Type	Emit	Description
`val(meta)`	`tuple`	-	-

Authors: @adamrtalbot, @priesgo Maintainers: @adamrtalbot, @priesgo

CNVKIT_REFERENCE

Defined in modules/nf-core/cnvkit/reference/main.nf:1

Keywords: cnvkit, reference, cnv, copy number

Compile a coverage reference from the given files (normal samples).

Tools

cnvkit

CNVkit is a Python library and command-line software toolkit to infer and visualize copy number from high-throughput DNA sequencing data. It is designed for use with hybrid capture, including both whole-exome and custom target panels, and short-read sequencing platforms such as Illumina and Ion Torrent.

Homepage | Documentation | biotools:cnvkit | License: Apache-2.0

Inputs

Name	Type	Description
`fasta`	`file`	File containing reference genome
`targets`	`file`	File containing genomic regions
`antitargets`	`file`	File containing off-target genomic regions

Outputs

Name	Type	Emit	Description
`*.cnn`	`path`	`cnn`	-
`versions.yml`	`path`	`versions`	-

Authors: @adamrtalbot, @priesgo, @SusiJo Maintainers: @adamrtalbot, @priesgo, @SusiJo

RBT_VCFSPLIT

Defined in modules/nf-core/rbt/vcfsplit/main.nf:1

Keywords: genomics, splitting, VCF, BCF, variants

A tool for splitting VCF/BCF files into N equal chunks, including BND support

Tools

rust-bio-tools

A growing collection of fast and secure command line utilities for dealing with NGS data implemented on top of Rust-Bio.

Homepage | Documentation | License: MIT

Inputs

Name	Type	Description
`meta`	`map`	Groovy Map containing sample information e.g. `[ id:'sample1' ]`
`vcf`	`file`	VCF file with variants to be split

Outputs

Name	Type	Pattern	Description
`bcfchunks`	`file`	`*.bcf`	Chunks of the input VCF file, split into `numchunks` equal parts.

Authors: @famosab Maintainers: @famosab
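
RBT_VCFSPLIT splits a VCF/BCF file into `numchunks` equal parts. A rough sketch of an even split, assuming records are kept in order and chunk sizes may differ by at most one record; `split_into_chunks` is a hypothetical helper, and the tool's BND handling (keeping breakend mates together) is not modelled.

```python
def split_into_chunks(records, num_chunks):
    """Split records into num_chunks contiguous chunks of near-equal size."""
    if num_chunks < 1:
        raise ValueError("num_chunks must be at least 1")
    base, extra = divmod(len(records), num_chunks)
    chunks = []
    start = 0
    for index in range(num_chunks):
        # The first `extra` chunks take one additional record each.
        size = base + (1 if index < extra else 0)
        chunks.append(records[start:start + size])
        start += size
    return chunks


if __name__ == "__main__":
    print(split_into_chunks(list(range(10)), 3))
```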

FGBIO_FASTQTOBAM

Defined in modules/nf-core/fgbio/fastqtobam/main.nf:1

Keywords: unaligned, bam, cram

Uses fgbio to convert FASTQ files into unaligned BAM or CRAM files, optionally moving the UMI barcode into the RX field of the reads

Tools

fgbio

A set of tools for working with genomic and high throughput sequencing data, including UMIs

Homepage | Documentation | biotools:fgbio | License: MIT

Inputs

Name	Type	Description
`meta`	`map`	Groovy Map containing sample information e.g. [ id:'test', single_end:false ]
`reads`	`file`	pair of reads to be converted into BAM file

Outputs

Name	Type	Emit	Description
`val(meta), path("*.bam")`	`tuple`	`bam`	-
`val(meta), path("*.cram")`	`tuple`	`cram`	-
`versions.yml`	`path`	`versions`	-

Authors: @lescai, @matthdsm, @nvnieuwk Maintainers: @lescai, @matthdsm, @nvnieuwk

FGBIO_COPYUMIFROMREADNAME

Defined in modules/nf-core/fgbio/copyumifromreadname/main.nf:1

Keywords: sort, example, genomics

Copies the UMI at the end of each read name in a BAM file to the RX tag.

Tools

fgbio

A set of tools for working with genomic and high throughput sequencing data, including UMIs

Homepage | Documentation | biotools:fgbio | License: MIT

Inputs

Name	Type	Description
`meta`	`map`	Groovy Map containing sample information e.g. `[ id:'sample1' ]`
`bam`	`file`	Sorted BAM/CRAM/SAM file
`bai`	`file`	Index for bam file

Outputs

Name	Type	Pattern	Description
`bam`	`file`	`*.{bam}`	Sorted BAM file
`bai`	`file`	`*.{bai}`	Index for bam file

Authors: @sppearce Maintainers: @sppearce
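
FGBIO_COPYUMIFROMREADNAME moves the UMI from the end of the read name into the RX tag. A minimal sketch of that step on plain strings, assuming the UMI is the last `:`-separated field of the read name; the delimiter, the helper names, and the dict used for tags are assumptions made for illustration.

```python
def umi_from_read_name(read_name, delimiter=":"):
    """Return the last delimiter-separated field of the read name."""
    if delimiter not in read_name:
        raise ValueError(f"no {delimiter!r} in read name {read_name!r}")
    return read_name.rsplit(delimiter, 1)[1]


def copy_umi_to_tags(read_name, tags):
    """Return a copy of the tag dict with RX set to the UMI from the read name."""
    updated = dict(tags)
    updated["RX"] = umi_from_read_name(read_name)
    return updated


if __name__ == "__main__":
    # Hypothetical read name ending in an 8-base UMI.
    name = "A00123:45:HXXXX:1:1101:1000:2000:ACGTACGT"
    print(copy_umi_to_tags(name, {"RG": "sample1"}))
```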

FGBIO_CALLMOLECULARCONSENSUSREADS

Defined in modules/nf-core/fgbio/callmolecularconsensusreads/main.nf:1

Keywords: UMIs, consensus sequence, bam

Calls consensus sequences from reads with the same unique molecular tag.

Tools

fgbio

Tools for working with genomic and high throughput sequencing data.

Homepage | Documentation | biotools:fgbio | License: MIT

Inputs

Name	Type	Description
`meta`	`map`	Groovy Map containing sample information e.g. [ id:'test', single_end:false, collapse:false ]
`grouped_bam`	`file`	The input SAM or BAM file, grouped by UMIs
`min_reads`	`integer`	Minimum number of original reads to build each consensus read.
`min_baseq`	`integer`	Ignore bases in raw reads that have Q below this value.

Outputs

Name	Type	Emit	Description
`val(meta), path("*.bam")`	`tuple`	`bam`	-
`versions.yml`	`path`	`versions`	-

Authors: @sruthipsuresh Maintainers: @sruthipsuresh
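
The `min_reads` and `min_baseq` inputs of FGBIO_CALLMOLECULARCONSENSUSREADS can be pictured with a simplified consensus caller. The sketch below uses a plain per-position majority vote, which is an assumption made for illustration and not fgbio's actual consensus model; reads are hypothetical `(sequence, qualities)` tuples of equal length that belong to one UMI group.

```python
from collections import Counter


def call_consensus(reads, min_reads=1, min_baseq=10):
    """Majority-vote consensus for one UMI group, or None if the group is too small.

    Bases with quality below min_baseq are ignored; a position with no
    usable base is reported as N.
    """
    if len(reads) < min_reads:
        return None
    length = len(reads[0][0])
    consensus = []
    for position in range(length):
        votes = Counter(
            seq[position] for seq, quals in reads if quals[position] >= min_baseq
        )
        consensus.append(votes.most_common(1)[0][0] if votes else "N")
    return "".join(consensus)


if __name__ == "__main__":
    group = [
        ("ACGT", [30, 30, 30, 30]),
        ("ACGA", [30, 30, 30, 5]),   # last base is below min_baseq
        ("ACTT", [30, 30, 30, 30]),
    ]
    print(call_consensus(group, min_reads=2, min_baseq=10))
```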

FGBIO_GROUPREADSBYUMI

Defined in modules/nf-core/fgbio/groupreadsbyumi/main.nf:1

Keywords: UMI, groupreads, fgbio

Groups reads together that appear to have come from the same original molecule. Reads are grouped by template, and templates are then sorted by the 5’ mapping positions of their reads, from earliest to latest mapping position. Reads that have the same end positions are then sub-grouped by UMI sequence. Note: the MQ tag is required on reads with mapped mates. It can be added using samblaster with the optional argument --addMateTags.

Tools

fgbio

A set of tools for working with genomic and high throughput sequencing data, including UMIs

Homepage | Documentation | biotools:fgbio | License: MIT

Inputs

Name	Type	Description
`meta`	`map`	Groovy Map containing sample information e.g. [ id:'test', single_end:false ]
`bam`	`file`	BAM file. Note: the MQ tag is required on reads with mapped mates.
`strategy`	`string`	Required argument: defines the UMI assignment strategy. Must be chosen among: Identity, Edit, Adjacency, Paired.

Outputs

Name	Type	Emit	Description
`val(meta), path("*.bam")`	`tuple`	`bam`	-
`val(meta), path("*histogram.txt")`	`tuple`	`histogram`	-
`versions.yml`	`path`	`versions`	-

Authors: @lescai Maintainers: @lescai
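
The grouping described for FGBIO_GROUPREADSBYUMI can be sketched for the simplest case, where reads with the same mapping position are sub-grouped by exactly matching UMI sequence. The record layout `(name, chrom, pos, umi)` and the function name are assumptions for illustration; strategies that tolerate UMI mismatches, and the handling of both template ends, are not modelled.

```python
from collections import defaultdict


def group_by_position_and_umi(reads):
    """Group read names by (chrom, pos, umi), requiring exact UMI identity."""
    groups = defaultdict(list)
    for name, chrom, pos, umi in reads:
        groups[(chrom, pos, umi)].append(name)
    return dict(groups)


if __name__ == "__main__":
    reads = [
        ("r1", "chr1", 100, "ACGT"),
        ("r2", "chr1", 100, "ACGT"),  # same position and UMI as r1
        ("r3", "chr1", 100, "TTTT"),  # same position, different UMI
        ("r4", "chr1", 250, "ACGT"),  # same UMI, different position
    ]
    for key, names in group_by_position_and_umi(reads).items():
        print(key, names)
```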

BWAMEM2_INDEX

Defined in modules/nf-core/bwamem2/index/main.nf:1

Keywords: index, fasta, genome, reference

Create BWA-mem2 index for reference genome

Tools

bwamem2

BWA-mem2 is a software package for mapping DNA sequences against a large reference genome, such as the human genome.

Homepage | Documentation | biotools:bwa-mem2 | License: MIT

Inputs

Name	Type	Description
`meta`	`map`	Groovy Map containing sample information e.g. [ id:'test', single_end:false ]
`fasta`	`file`	Input genome fasta file

Outputs

Name	Type	Pattern	Description
`index`	`file`	`*.{0123,amb,ann,bwt.2bit.64,pac}`	BWA genome index files

Authors: @maxulysse Maintainers: @maxulysse

BWAMEM2_MEM

Defined in modules/nf-core/bwamem2/mem/main.nf:1

Keywords: mem, bwa, alignment, map, fastq, bam, sam

Performs fastq alignment to a fasta reference using BWA-mem2

Tools

bwamem2

BWA-mem2 is a software package for mapping DNA sequences against a large reference genome, such as the human genome.

Homepage | Documentation | biotools:bwa-mem2 | License: MIT

Inputs

Name	Type	Description
`meta`	`map`	Groovy Map containing sample information e.g. [ id:'test', single_end:false ]
`reads`	`file`	List of input FastQ files of size 1 and 2 for single-end and paired-end data, respectively.
`meta2`	`map`	Groovy Map containing reference/index information e.g. [ id:'test' ]
`index`	`file`	BWA genome index files
`meta3`	`map`	Groovy Map containing reference information e.g. [ id:'genome' ]
`fasta`	`file`	Reference genome in FASTA format
`sort_bam`	`boolean`	use samtools sort (true) or samtools view (false)

Outputs

Name	Type	Emit	Description
`val(meta), path("*.sam")`	`tuple`	`sam`	-
`val(meta), path("*.bam")`	`tuple`	`bam`	-
`val(meta), path("*.cram")`	`tuple`	`cram`	-
`val(meta), path("*.crai")`	`tuple`	`crai`	-
`val(meta), path("*.csi")`	`tuple`	`csi`	-
`versions.yml`	`path`	`versions`	-

Authors: @maxulysse, @matthdsm Maintainers: @maxulysse, @matthdsm

MANTA_TUMORONLY

Defined in modules/nf-core/manta/tumoronly/main.nf:1

Keywords: somatic, wgs, wxs, panel, vcf, structural variants, small indels

Manta calls structural variants (SVs) and indels from mapped paired-end sequencing reads. It is optimized for analysis of germline variation in small sets of individuals and somatic variation in tumor/normal sample pairs.

Tools

manta

Structural variant and indel caller for mapped sequencing data

Homepage | Documentation | biotools:manta_sv | License: GPL v3

Inputs

Name	Type	Description
`meta`	`map`	Groovy Map containing sample information e.g. [ id:'test', single_end:false ]
`input`	`file`	BAM/CRAM/SAM file
`input_index`	`file`	BAM/CRAM/SAM index file
`target_bed`	`file`	BED file containing target regions for variant calling
`target_bed_tbi`	`file`	Index for BED file containing target regions for variant calling
`meta2`	`map`	Groovy Map containing reference information e.g. [ id:'genome' ]
`fasta`	`file`	Genome reference FASTA file
`meta3`	`map`	Groovy Map containing reference information e.g. [ id:'genome' ]
`fai`	`file`	Genome reference FASTA index file

Outputs

Name	Type	Pattern	Description
`candidate_small_indels_vcf`	`file`	`*.{vcf.gz}`	Gzipped VCF file containing variants
`candidate_small_indels_vcf_tbi`	`file`	`*.{vcf.gz.tbi}`	Index for gzipped VCF file containing variants
`candidate_sv_vcf`	`file`	`*.{vcf.gz}`	Gzipped VCF file containing variants
`candidate_sv_vcf_tbi`	`file`	`*.{vcf.gz.tbi}`	Index for gzipped VCF file containing variants
`tumor_sv_vcf`	`file`	`*.{vcf.gz}`	Gzipped VCF file containing variants
`tumor_sv_vcf_tbi`	`file`	`*.{vcf.gz.tbi}`	Index for gzipped VCF file containing variants

Authors: @maxulysse, @nvnieuwk Maintainers: @maxulysse, @nvnieuwk

MANTA_GERMLINE

Defined in modules/nf-core/manta/germline/main.nf:1

Keywords: somatic, wgs, wxs, panel, vcf, structural variants, small indels

Manta calls structural variants (SVs) and indels from mapped paired-end sequencing reads. It is optimized for analysis of germline variation in small sets of individuals and somatic variation in tumor/normal sample pairs.

Tools

manta

Structural variant and indel caller for mapped sequencing data

Homepage | Documentation | biotools:manta_sv | License: GPL v3

Inputs

Name	Type	Description
`meta`	`map`	Groovy Map containing sample information e.g. [ id:'test', single_end:false ]
`input`	`file`	BAM/CRAM/SAM file. For joint calling use a list of files.
`index`	`file`	BAM/CRAM/SAM index file. For joint calling use a list of files.
`target_bed`	`file`	BED file containing target regions for variant calling
`target_bed_tbi`	`file`	Index for BED file containing target regions for variant calling
`meta2`	`map`	Groovy Map containing reference information e.g. [ id:'genome' ]
`fasta`	`file`	Genome reference FASTA file
`meta3`	`map`	Groovy Map containing reference information e.g. [ id:'genome' ]
`fai`	`file`	Genome reference FASTA index file

Outputs

Name	Type	Pattern	Description
`candidate_small_indels_vcf`	`file`	`*.{vcf.gz}`	Gzipped VCF file containing variants
`candidate_small_indels_vcf_tbi`	`file`	`*.{vcf.gz.tbi}`	Index for gzipped VCF file containing variants
`candidate_sv_vcf`	`file`	`*.{vcf.gz}`	Gzipped VCF file containing variants
`candidate_sv_vcf_tbi`	`file`	`*.{vcf.gz.tbi}`	Index for gzipped VCF file containing variants
`diploid_sv_vcf`	`file`	`*.{vcf.gz}`	Gzipped VCF file containing variants
`diploid_sv_vcf_tbi`	`file`	`*.{vcf.gz.tbi}`	Index for gzipped VCF file containing variants

Authors: @maxulysse, @ramprasadn, @nvnieuwk Maintainers: @maxulysse, @ramprasadn, @nvnieuwk

MANTA_SOMATIC

Defined in modules/nf-core/manta/somatic/main.nf:1

Keywords: somatic, wgs, wxs, panel, vcf, structural variants, small indels

Manta calls structural variants (SVs) and indels from mapped paired-end sequencing reads. It is optimized for analysis of germline variation in small sets of individuals and somatic variation in tumor/normal sample pairs.

Tools

manta

Structural variant and indel caller for mapped sequencing data

Homepage | Documentation | biotools:manta_sv | License: GPL v3

Inputs

Name	Type	Description
`meta`	`map`	Groovy Map containing sample information e.g. [ id:'test', single_end:false ]
`input_normal`	`file`	BAM/CRAM/SAM file
`input_index_normal`	`file`	BAM/CRAM/SAM index file
`input_tumor`	`file`	BAM/CRAM/SAM file
`input_index_tumor`	`file`	BAM/CRAM/SAM index file
`target_bed`	`file`	BED file containing target regions for variant calling
`target_bed_tbi`	`file`	Index for BED file containing target regions for variant calling
`meta2`	`map`	Groovy Map containing reference information e.g. [ id:'genome' ]
`fasta`	`file`	Genome reference FASTA file
`meta3`	`map`	Groovy Map containing reference information e.g. [ id:'genome' ]
`fai`	`file`	Genome reference FASTA index file

Outputs

Name	Type	Pattern	Description
`candidate_small_indels_vcf`	`file`	`*.{vcf.gz}`	Gzipped VCF file containing variants
`candidate_small_indels_vcf_tbi`	`file`	`*.{vcf.gz.tbi}`	Index for gzipped VCF file containing variants
`candidate_sv_vcf`	`file`	`*.{vcf.gz}`	Gzipped VCF file containing variants
`candidate_sv_vcf_tbi`	`file`	`*.{vcf.gz.tbi}`	Index for gzipped VCF file containing variants
`diploid_sv_vcf`	`file`	`*.{vcf.gz}`	Gzipped VCF file containing variants
`diploid_sv_vcf_tbi`	`file`	`*.{vcf.gz.tbi}`	Index for gzipped VCF file containing variants
`somatic_sv_vcf`	`file`	`*.{vcf.gz}`	Gzipped VCF file containing variants
`somatic_sv_vcf_tbi`	`file`	`*.{vcf.gz.tbi}`	Index for gzipped VCF file containing variants

Authors: @FriederikeHanssen, @nvnieuwk Maintainers: @FriederikeHanssen, @nvnieuwk

BCFTOOLS_CONCAT

Defined in modules/nf-core/bcftools/concat/main.nf:1

Keywords: variant calling, concat, bcftools, VCF

Concatenate VCF files

Tools

concat

Concatenate VCF files.

Homepage | Documentation | biotools:bcftools | License: MIT

Inputs

Name	Type	Description
`meta`	`map`	Groovy Map containing sample information e.g. [ id:'test', single_end:false ]
`vcfs`	`list`	List containing 2 or more vcf files e.g. [ 'file1.vcf', 'file2.vcf' ]
`tbi`	`list`	List containing 2 or more index files (optional) e.g. [ 'file1.tbi', 'file2.tbi' ]

Outputs

Name	Type	Emit	Description
`val(meta)`	`tuple`	-	-

Authors: @abhi18av, @nvnieuwk Maintainers: @abhi18av, @nvnieuwk

BCFTOOLS_SORT

Defined in modules/nf-core/bcftools/sort/main.nf:1

Keywords: sorting, VCF, variant calling

Sorts VCF files

Tools

sort

Sort VCF files by coordinates.

Homepage | Documentation | biotools:bcftools | License: MIT

Inputs

Name	Type	Description
`meta`	`map`	Groovy Map containing sample information e.g. [ id:'test', single_end:false ]
`vcf`	`file`	The VCF/BCF file to be sorted

Outputs

Name	Type	Emit	Description
`val(meta)`	`tuple`	-	-

Authors: @Gwennid Maintainers: @Gwennid

BCFTOOLS_MERGE

Defined in modules/nf-core/bcftools/merge/main.nf:1

Keywords: variant calling, merge, VCF

Merge VCF files

Tools

merge

Merge VCF files.

Homepage | Documentation | biotools:bcftools | License: MIT

Inputs

Name	Type	Description
`meta`	`map`	Groovy Map containing sample information e.g. [ id:'test', single_end:false ]
`vcfs`	`file`	List containing 2 or more vcf files e.g. [ 'file1.vcf', 'file2.vcf' ]
`tbis`	`file`	List containing the tbi index files corresponding to the vcfs input files e.g. [ 'file1.vcf.tbi', 'file2.vcf.tbi' ]
`meta2`	`map`	Groovy Map containing reference information e.g. [ id:'genome' ]
`fasta`	`file`	(Optional) The fasta reference file (only necessary for the `--gvcf FILE` parameter)
`meta3`	`map`	Groovy Map containing reference information e.g. [ id:'genome' ]
`fai`	`file`	(Optional) The fasta reference file index (only necessary for the `--gvcf FILE` parameter)
`meta4`	`map`	Groovy Map containing bed information e.g. [ id:'genome' ]
`bed`	`file`	(Optional) The bed regions to merge on

Outputs

Name	Type	Pattern	Description
`vcf`	`file`	`*.{vcf,vcf.gz,bcf,bcf.gz}`	merged output file
`index`	`file`	`*.{csi,tbi}`	index of merged output

Authors: @joseespinosa, @drpatelh, @nvnieuwk, @ramprasadn Maintainers: @joseespinosa, @drpatelh, @nvnieuwk, @ramprasadn

BCFTOOLS_MPILEUP

Defined in modules/nf-core/bcftools/mpileup/main.nf:1

Keywords: variant calling, mpileup, VCF

Generates genotype likelihoods at each genomic position with coverage and writes the result as a compressed VCF file

Tools

mpileup

Generates genotype likelihoods at each genomic position with coverage.

Homepage | Documentation | biotools:bcftools | License: MIT

Inputs

Name	Type	Description
`meta`	`map`	Groovy Map containing sample information e.g. [ id:'test', single_end:false ]
`bam`	`file`	Input BAM file
`intervals`	`file`	Input intervals file. A file (commonly '.bed') containing regions to subset
`meta2`	`map`	Groovy Map containing sample information e.g. [ id:'test', single_end:false ]
`fasta`	`file`	FASTA reference file
`save_mpileup`	`boolean`	Save mpileup file generated by bcftools mpileup

Outputs

Name	Type	Emit	Description
`val(meta), path("*vcf.gz")`	`tuple`	`vcf`	-
`val(meta), path("*vcf.gz.tbi")`	`tuple`	`tbi`	-
`val(meta), path("*stats.txt")`	`tuple`	`stats`	-
`val(meta), path("*.mpileup.gz")`	`tuple`	`mpileup`	-
`versions.yml`	`path`	`versions`	-

Authors: @joseespinosa, @drpatelh Maintainers: @joseespinosa, @drpatelh

BCFTOOLS_ANNOTATE

Defined in modules/nf-core/bcftools/annotate/main.nf:1

Keywords: bcftools, annotate, vcf, remove, add

Add or remove annotations.

Tools

annotate

Add or remove annotations.

Homepage | Documentation | biotools:bcftools | License: MIT

Inputs

Name	Type	Description
`meta`	`map`	Groovy Map containing sample information e.g. [ id:'test', single_end:false ]
`input`	`file`	Query VCF or BCF file, can be either uncompressed or compressed
`index`	`file`	Index of the query VCF or BCF file
`annotations`	`file`	Bgzip-compressed file with annotations
`annotations_index`	`file`	Index of the annotations file

Outputs

Name	Type	Pattern	Description
`vcf`	`file`	`*{vcf,vcf.gz,bcf,bcf.gz}`	Compressed annotated VCF file
`tbi`	`file`	`*.tbi`	Alternative VCF file index
`csi`	`file`	`*.csi`	Default VCF file index

Authors: @projectoriented, @ramprasadn Maintainers: @projectoriented, @ramprasadn

BCFTOOLS_NORM

Defined in modules/nf-core/bcftools/norm/main.nf:1

Keywords: normalize, norm, variant calling, VCF

Normalize VCF file

Tools

norm

Normalize VCF files.

Homepage | Documentation | biotools:bcftools | License: MIT

Inputs

Name	Type	Description
`meta`	`map`	Groovy Map containing sample information e.g. [ id:'test', single_end:false ]
`vcf`	`file`	The vcf file to be normalized e.g. 'file1.vcf'
`tbi`	`file`	An optional index of the VCF file (for when the VCF is compressed)
`meta2`	`map`	Groovy Map containing reference information e.g. [ id:'genome' ]
`fasta`	`file`	FASTA reference file

Outputs

Name	Type	Emit	Description
`val(meta)`	`tuple`	-	-

Authors: @abhi18av, @ramprasadn Maintainers: @abhi18av, @ramprasadn

BCFTOOLS_VIEW

Defined in modules/nf-core/bcftools/view/main.nf:1

Keywords: variant calling, view, bcftools, VCF

View, subset and filter VCF or BCF files by position and filtering expression. Convert between VCF and BCF

Tools

view

View, subset and filter VCF or BCF files by position and filtering expression. Convert between VCF and BCF

Homepage | Documentation | biotools:bcftools | License: MIT

Inputs

Name	Type	Description
`meta`	`map`	Groovy Map containing sample information e.g. [ id:'test', single_end:false ]
`vcf`	`file`	The vcf file to be inspected. e.g. 'file.vcf'
`index`	`file`	The tab index for the VCF file to be inspected. e.g. 'file.tbi'

Outputs

Name	Type	Pattern	Description
`vcf`	`file`	`*.{vcf,vcf.gz,bcf,bcf.gz}`	Viewed, subset or filtered VCF/BCF output file
`tbi`	`file`	`*.tbi`	Alternative VCF file index
`csi`	`file`	`*.csi`	Default VCF file index

Authors: @abhi18av Maintainers: @abhi18av
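
The output patterns in the table above use brace alternation. As a rough illustration of which file names such a pattern would select, here is a minimal Python sketch. It only approximates glob matching with `fnmatch` and a hand-written brace expansion; the helper names `expand_braces` and `matches` are hypothetical and are not part of Nextflow or the module.

```python
import fnmatch
import re


def expand_braces(pattern: str) -> list:
    """Expand the first {a,b,c} group recursively into plain glob patterns."""
    group = re.search(r"\{([^{}]*)\}", pattern)
    if group is None:
        return [pattern]
    head, tail = pattern[:group.start()], pattern[group.end():]
    expanded = []
    for alternative in group.group(1).split(","):
        expanded.extend(expand_braces(head + alternative + tail))
    return expanded


def matches(filename: str, pattern: str) -> bool:
    """Return True if the file name matches any expansion of the pattern."""
    return any(fnmatch.fnmatchcase(filename, p) for p in expand_braces(pattern))


VCF_PATTERN = "*.{vcf,vcf.gz,bcf,bcf.gz}"
candidates = ["sample.vcf.gz", "sample.bcf", "sample.vcf.gz.tbi", "sample.txt"]
selected = [name for name in candidates if matches(name, VCF_PATTERN)]
```

In this sketch the index file `sample.vcf.gz.tbi` is not selected by the VCF pattern, which is why the `tbi` and `csi` outputs have patterns of their own.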

BCFTOOLS_STATS

Defined in modules/nf-core/bcftools/stats/main.nf:1

Keywords: variant calling, stats, VCF

Generates stats from VCF files

Tools

stats

Parses VCF or BCF and produces text file stats which is suitable for machine processing and can be plotted using plot-vcfstats.

Homepage | Documentation | biotools:bcftools | License: MIT

Inputs

Name	Type	Description
`meta`	`map`	Groovy Map containing sample information e.g. [ id:'test', single_end:false ]
`vcf`	`file`	VCF input file
`tbi`	`file`	The tab index for the VCF file to be inspected. Optional: only required when parameter regions is chosen.
`meta2`	`map`	Groovy Map containing sample information e.g. [ id:'test', single_end:false ]
`regions`	`file`	Optionally, restrict the operation to regions listed in this file. (VCF, BED or tab-delimited)
`meta3`	`map`	Groovy Map containing sample information e.g. [ id:'test', single_end:false ]
`targets`	`file`	Optionally, restrict the operation to regions listed in this file (doesn't rely upon tbi index files)
`meta4`	`map`	Groovy Map containing sample information e.g. [ id:'test', single_end:false ]
`samples`	`file`	Optional, file of sample names to be included or excluded. e.g. 'file.tsv'
`meta5`	`map`	Groovy Map containing sample information e.g. [ id:'test', single_end:false ]
`exons`	`file`	Tab-delimited file with exons for indel frameshifts (chr,beg,end; 1-based, inclusive, optionally bgzip compressed). e.g. 'exons.tsv.gz'
`meta6`	`map`	Groovy Map containing sample information e.g. [ id:'test', single_end:false ]
`fasta`	`file`	Faidx indexed reference sequence file to determine INDEL context. e.g. 'reference.fa'

Outputs

Name	Type	Emit	Description
`val(meta), path("*stats.txt")`	`tuple`	`stats`	-
`versions.yml`	`path`	`versions`	-

Authors: @joseespinosa, @drpatelh, @SusiJo, @TCLamnidis Maintainers: @joseespinosa, @drpatelh, @SusiJo, @TCLamnidis

BCFTOOLS_ISEC

Defined in modules/nf-core/bcftools/isec/main.nf:1

Keywords: variant calling, intersect, union, complement, VCF, BCF

Apply set operations to VCF files

Tools

isec

Computes intersections, unions and complements of VCF files.

Homepage | Documentation | biotools:bcftools | License: MIT

Inputs

Name	Type	Description
`meta`	`map`	Groovy Map containing sample information e.g. [ id:'test', single_end:false ]
`vcfs`	`list`	List containing 2 or more vcf/bcf files. These must be compressed and have an associated index. e.g. [ 'file1.vcf.gz', 'file2.vcf.gz' ]
`tbis`	`list`	List containing the tbi index files corresponding to the vcf/bcf input files e.g. [ 'file1.vcf.gz.tbi', 'file2.vcf.gz.tbi' ]

Outputs

Name	Type	Pattern	Description
`results`	`directory`	`${prefix}`	Folder containing the results of the set operations performed on the vcf files

Authors: @joseespinosa, @drpatelh Maintainers: @joseespinosa, @drpatelh
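
To make "intersections, unions and complements" concrete, the following Python sketch applies the same set operations to two small collections of variants. It is a conceptual illustration only, not how bcftools works internally: matching records on an exact `(chrom, pos, ref, alt)` key is an assumption, and the `Variant` type and `set_operations` function are hypothetical names.

```python
from __future__ import annotations

from typing import NamedTuple


class Variant(NamedTuple):
    """Simplified variant key; real VCF records carry many more fields."""
    chrom: str
    pos: int
    ref: str
    alt: str


def set_operations(a: set[Variant], b: set[Variant]) -> dict[str, set[Variant]]:
    """Compute the complements, intersection and union of two variant sets."""
    return {
        "private_to_a": a - b,  # complement: records only in the first file
        "private_to_b": b - a,  # complement: records only in the second file
        "shared": a & b,        # intersection
        "union": a | b,         # union
    }


file1 = {Variant("chr1", 100, "A", "G"), Variant("chr1", 200, "C", "T")}
file2 = {Variant("chr1", 200, "C", "T"), Variant("chr2", 50, "G", "A")}
result = set_operations(file1, file2)
```

Here `result["shared"]` holds the single variant at `chr1:200`, while each `private_to_*` set holds the one record unique to that input.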

MSISENSORPRO_SCAN

Defined in modules/nf-core/msisensorpro/scan/main.nf:1

Keywords: micro-satellite-scan, msisensor-pro, scan

MSIsensor-pro evaluates Microsatellite Instability (MSI) for cancer patients with next generation sequencing data. It accepts whole genome sequencing, whole exome sequencing and target region (panel) sequencing data as input.

Tools

msisensorpro

Microsatellite Instability (MSI) detection using high-throughput sequencing data.

Homepage | Documentation | License: Custom Licence

Inputs

Name	Type	Description
`meta`	`map`	Groovy Map containing sample information e.g. [ id:'test', single_end:false ]
`fasta`	`file`	Reference genome

Outputs

Name	Type	Pattern	Description
`list`	`file`	`*.{list}`	File containing microsatellite list

Authors: @FriederikeHanssen Maintainers: @FriederikeHanssen

MSISENSORPRO_MSISOMATIC

Defined in modules/nf-core/msisensorpro/msisomatic/main.nf:1

Keywords: micro-satellite-scan, msisensor-pro, msi, somatic

MSIsensor-pro evaluates Microsatellite Instability (MSI) for cancer patients with next generation sequencing data. It accepts whole genome sequencing, whole exome sequencing and target region (panel) sequencing data as input.

Tools

msisensorpro

Microsatellite Instability (MSI) detection using high-throughput sequencing data.

Homepage | Documentation | License: Custom Licence

Inputs

Name	Type	Description
`meta`	`map`	Groovy Map containing sample information e.g. [ id:'test', single_end:false ]
`normal`	`file`	BAM/CRAM/SAM file
`normal_index`	`file`	BAM/CRAM/SAM index file
`tumor`	`file`	BAM/CRAM/SAM file
`tumor_index`	`file`	BAM/CRAM/SAM index file
`intervals`	`file`	bed file containing interval information, optional
`meta2`	`map`	Groovy Map containing genome information e.g. [ id:'genome' ]
`fasta`	`file`	Reference genome

Outputs

Name	Type	Pattern	Description
`output_report`	`file`	`-`	File containing final report with all detected microsatellites, unstable somatic microsatellites, msi score
`output_dis`	`file`	`-`	File containing distribution results
`output_germline`	`file`	`-`	File containing germline results
`output_somatic`	`file`	`-`	File containing somatic results

Authors: @FriederikeHanssen Maintainers: @FriederikeHanssen

MUSE_SUMP

Defined in modules/nf-core/muse/sump/main.nf:1

Keywords: variant calling, somatic, wgs, wxs, vcf

Computes tier-based cutoffs from a sample-specific error model which is generated by muse/call and reports the finalized variants

Tools

MuSE

Somatic point mutation caller based on Markov substitution model for molecular evolution

Homepage | Documentation | License: https://github.com/danielfan/MuSE/blob/master/LICENSE

Inputs

Name	Type	Description
`meta`	`map`	Groovy Map containing sample information e.g. `[ id:'sample1', single_end:false ]`
`muse_call_txt`	`file`	single input file generated by 'MuSE call'
`meta2`	`map`	Groovy Map containing reference information. e.g. `[ id:'test' ]`
`ref_vcf`	`file`	dbSNP vcf file that should be bgzip compressed, tabix indexed and based on the same reference genome used in 'MuSE call'
`ref_vcf_tbi`	`file`	Tabix index for the dbSNP vcf file

Outputs

Name	Type	Pattern	Description
`vcf`	`file`	`*.vcf.gz`	bgzipped vcf file with called variants
`tbi`	`file`	`*.vcf.gz.tbi`	tabix index of bgzipped vcf file with called variants

Authors: @famosab Maintainers: @famosab

MUSE_CALL

Defined in modules/nf-core/muse/call/main.nf:1

Keywords: variant calling, somatic, wgs, wxs, vcf

Pre-filters and calculates position-specific summary statistics using the Markov substitution model

Tools

MuSE

Somatic point mutation caller based on Markov substitution model for molecular evolution

Homepage | Documentation | License: https://github.com/danielfan/MuSE/blob/master/LICENSE

Inputs

Name	Type	Description
`meta`	`map`	Groovy Map containing sample information e.g. `[ id:'sample1' ]`
`tumor_bam`	`file`	Sorted tumor BAM file
`tumor_bai`	`file`	Index file for the tumor BAM file
`normal_bam`	`file`	Sorted matched normal BAM file
`normal_bai`	`file`	Index file for the normal BAM file
`meta2`	`map`	Groovy Map containing reference information. e.g. `[ id:'test' ]`
`reference`	`file`	reference genome file

Outputs

Name	Type	Pattern	Description
`txt`	`file`	`*.MuSE.txt`	position-specific summary statistics

Authors: @famosab Maintainers: @famosab

PARABRICKS_FQ2BAM

Defined in modules/nf-core/parabricks/fq2bam/main.nf:1

Keywords: align, sort, bqsr, duplicates

NVIDIA Clara Parabricks GPU-accelerated alignment, sorting, BQSR calculation, and duplicate marking. Note this nf-core module requires files to be copied into the working directory and not symlinked.

Tools

parabricks

NVIDIA Clara Parabricks GPU-accelerated genomics tools

Homepage | Documentation | License: custom

Inputs

Name	Type	Description
`meta`	`map`	Groovy Map containing sample information e.g. [ id:'test', single_end:false ]
`reads`	`file`	fastq.gz files
`meta2`	`map`	Groovy Map containing fasta information
`fasta`	`file`	reference fasta file - must be unzipped
`meta3`	`map`	Groovy Map containing index information
`index`	`file`	reference BWA index
`meta4`	`map`	Groovy Map containing index information
`interval_file`	`file`	(optional) file(s) containing genomic intervals for use in base quality score recalibration (BQSR)
`meta5`	`map`	Groovy Map containing known sites information
`known_sites`	`file`	(optional) known sites file(s) for calculating BQSR. markdups must be true to perform BQSR.

Outputs

Name	Type	Pattern	Description
`bam`	`file`	`*.bam`	Sorted BAM file
`bai`	`file`	`*.bai`	index corresponding to sorted BAM file
`cram`	`file`	`*.cram`	Sorted CRAM file
`crai`	`file`	`*.crai`	index corresponding to sorted CRAM file
`bqsr_table`	`file`	`*.table`	(optional) table from base quality score recalibration calculation, to be used with parabricks/applybqsr
`qc_metrics`	`directory`	`*_qc_metrics`	(optional) directory of qc metrics
`duplicate_metrics`	`file`	`*.duplicate-metrics.txt`	(optional) metrics calculated from marking duplicates in the bam file
`compatible_versions`	`-`	`-`	-

Authors: @bsiranosian, @adamrtalbot Maintainers: @bsiranosian, @adamrtalbot, @gallvp, @famosab

SVDB_MERGE

Defined in modules/nf-core/svdb/merge/main.nf:1

Keywords: structural variants, vcf, merge

The merge module merges structural variants within one or more vcf files.

Tools

svdb

structural variant database software

Homepage | Documentation | License: MIT

Inputs

Name	Type	Description
`meta`	`map`	Groovy Map containing sample information e.g. [ id:'test' ]
`vcfs`	`list`	One or more VCF files. The order and number of files should correspond to the order and number of tags in the `input_priority` input channel.
`input_priority`	`list`	Prioritize the input VCF files according to this list, e.g. ['tiddit','cnvnator']. The order and number of tags should correspond to the order and number of VCFs in the `vcfs` input channel.
`sort_inputs`	`boolean`	Whether the input files should be sorted by name. Each priority tag is sorted together with its corresponding VCF file.

Outputs

Name	Type	Emit	Description
`val(meta)`	`tuple`	-	-

Authors: @ramprasadn Maintainers: @ramprasadn, @fellen31
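
The inputs above require the VCF list and the priority tags to stay aligned, including when the files are sorted by name. The Python sketch below shows one way that pairing could be kept intact. It is an assumption about the behaviour described in the table, not the module's actual code, and `sort_with_priority` is a hypothetical helper.

```python
from pathlib import PurePath


def sort_with_priority(vcfs, priorities):
    """Sort VCF paths by file name, moving each priority tag with its file."""
    if len(vcfs) != len(priorities):
        raise ValueError("vcfs and priorities must have the same length")
    # Pair each file with its tag so the two lists cannot drift apart.
    pairs = sorted(zip(vcfs, priorities), key=lambda pair: PurePath(pair[0]).name)
    sorted_vcfs = [vcf for vcf, _ in pairs]
    sorted_tags = [tag for _, tag in pairs]
    return sorted_vcfs, sorted_tags


vcfs = ["sample.tiddit.vcf", "sample.cnvnator.vcf"]
tags = ["tiddit", "cnvnator"]
sorted_vcfs, sorted_tags = sort_with_priority(vcfs, tags)
```

After sorting, `sample.cnvnator.vcf` comes first and its tag `cnvnator` moves with it, so position `i` in one list still describes position `i` in the other.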

TIDDIT_SV

Defined in modules/nf-core/tiddit/sv/main.nf:1

Keywords: structural, variants, vcf

Identify chromosomal rearrangements.

Tools

sv

Search for structural variants.

Homepage | Documentation | biotools:tiddit | License: GPL-3.0-or-later

Inputs

Name	Type	Description
`meta`	`map`	Groovy Map containing sample information e.g. [ id:'test', single_end:false ]
`input`	`file`	BAM/CRAM file
`input_index`	`file`	BAM/CRAM index file
`meta2`	`map`	Groovy Map containing sample information e.g. `[ id:'test_fasta']`
`fasta`	`file`	Input FASTA file
`meta3`	`map`	Groovy Map containing sample information from bwa index e.g. `[ id:'test_bwa-index' ]`
`bwa_index`	`file`	BWA genome index files

Outputs

Name	Type	Emit	Description
`val(meta), path("*.vcf")`	`tuple`	`vcf`	-
`val(meta), path("*.ploidies.tab")`	`tuple`	`ploidy`	-
`versions.yml`	`path`	`versions`	-

Authors: @maxulysse Maintainers: @maxulysse

TABIX_TABIX

Defined in modules/nf-core/tabix/tabix/main.nf:1

Keywords: index, tabix, vcf

Create a tabix index from a sorted, bgzip-compressed, tab-delimited genome file

Tools

tabix

Generic indexer for TAB-delimited genome position files.

Homepage | Documentation | biotools:tabix | License: MIT

Inputs

Name	Type	Description
`meta`	`map`	Groovy Map containing sample information e.g. [ id:'test', single_end:false ]
`tab`	`file`	TAB-delimited genome position file compressed with bgzip

Outputs

Name	Type	Pattern	Description
`tbi`	`file`	`*.{tbi}`	tabix index file
`csi`	`file`	`*.{csi}`	coordinate sorted index file

Authors: @joseespinosa, @drpatelh, @maxulysse Maintainers: @joseespinosa, @drpatelh, @maxulysse
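
The module expects its input to be sorted before indexing. As a hedged illustration of what "sorted" means for a position-based index, the Python sketch below checks that rows for each sequence are contiguous and that start positions do not decrease within a sequence. The function `is_position_sorted` is hypothetical and operates on already parsed `(chrom, start)` pairs rather than on a real bgzipped file.

```python
def is_position_sorted(rows):
    """Check that sequences are grouped and starts are non-decreasing per sequence."""
    finished = set()   # sequences whose block of rows has already ended
    current = None     # sequence name of the previous row
    last_start = None  # start position of the previous row
    for chrom, start in rows:
        if chrom != current:
            if chrom in finished:
                return False  # the sequence reappears after another one
            if current is not None:
                finished.add(current)
            current, last_start = chrom, start
        elif start < last_start:
            return False      # positions go backwards within a sequence
        else:
            last_start = start
    return True


sorted_rows = [("chr1", 100), ("chr1", 250), ("chr2", 10)]
unsorted_rows = [("chr1", 250), ("chr1", 100)]
interleaved_rows = [("chr1", 100), ("chr2", 10), ("chr1", 300)]
checks = [is_position_sorted(r) for r in (sorted_rows, unsorted_rows, interleaved_rows)]
```

Only the first example passes; the second has positions out of order and the third splits `chr1` into two separate blocks.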

TABIX_BGZIPTABIX

Defined in modules/nf-core/tabix/bgziptabix/main.nf:1

Keywords: bgzip, compress, index, tabix, vcf

Bgzip a sorted tab-delimited genome file and then create a tabix index

Tools

tabix

Generic indexer for TAB-delimited genome position files.

Homepage | Documentation | biotools:tabix | License: MIT

Inputs

Name	Type	Description
`meta`	`map`	Groovy Map containing sample information e.g. [ id:'test', single_end:false ]
`input`	`file`	Sorted tab-delimited genome file

Outputs

Name	Type	Pattern	Description
`gz_tbi`	`file`	`.gz, .tbi`	bgzipped tab-delimited genome file and its tabix index file
`gz_csi`	`file`	`.gz, .csi`	bgzipped tab-delimited genome file and its csi index file

Authors: @maxulysse, @DLBPointon Maintainers: @maxulysse, @DLBPointon

CAT_CAT

Defined in modules/nf-core/cat/cat/main.nf:1

Keywords: concatenate, gzip, cat

A module for concatenation of gzipped or uncompressed files

Tools

cat

Just concatenation

Documentation | License: GPL-3.0-or-later

Inputs

Name	Type	Description
`meta`	`map`	Groovy Map containing sample information e.g. [ id:'test', single_end:false ]
`files_in`	`file`	List of compressed / uncompressed files

Outputs

Name	Type	Emit	Description
`val(meta)`	`tuple`	-	-

Authors: @erikrikarddaniel, @FriederikeHanssen Maintainers: @erikrikarddaniel, @FriederikeHanssen
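
Concatenating gzipped files works at the byte level because the gzip format allows several compressed members in one stream. The Python sketch below demonstrates that property with the standard `gzip` module. Whether the module itself joins its inputs this way is an assumption here; the example data is made up.

```python
import gzip

# Two independently compressed chunks, standing in for two input files.
part1 = gzip.compress(b"chr1\t100\n")
part2 = gzip.compress(b"chr1\t200\n")

# Plain byte concatenation, which is what `cat a.gz b.gz` would produce.
combined = part1 + part2

# A multi-member gzip stream decompresses to the concatenated contents.
text = gzip.decompress(combined)
```

The decompressed result contains both lines in input order, so no intermediate decompression step is needed when inputs and output are all gzipped.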

CAT_FASTQ

Defined in modules/nf-core/cat/fastq/main.nf:1

Keywords: cat, fastq, concatenate

Concatenates fastq files

Tools

cat

The cat utility reads files sequentially, writing them to the standard output.

Documentation | License: GPL-3.0-or-later

Inputs

Name	Type	Description
`meta`	`map`	Groovy Map containing sample information e.g. [ id:'test', single_end:false ]
`reads`	`file`	List of input FastQ files to be concatenated.

Outputs

Name	Type	Emit	Description
`val(meta), path("*.merged.fastq.gz")`	`tuple`	`reads`	-
`versions.yml`	`path`	`versions`	-

Authors: @joseespinosa, @drpatelh Maintainers: @joseespinosa, @drpatelh

CREATE_INTERVALS_BED

Defined in modules/local/create_intervals_bed/main.nf:1

Outputs

Name	Type	Emit	Description
`*.bed`	`path`	`bed`	-
`versions.yml`	`path`	`versions`	-

ADD_INFO_TO_VCF

Defined in modules/local/add_info_to_vcf/main.nf:1

Inputs

Name	Type	Description
`val(meta), path(vcf_gz)`	`tuple`	-

Outputs

Name	Type	Emit	Description
`val(meta), path("*.added_info.vcf")`	`tuple`	`vcf`	-
`versions.yml`	`path`	`versions`	-

SAMTOOLS_REINDEX_BAM

Defined in modules/local/samtools/reindex_bam/main.nf:5

The aim of this process is to re-index the BAM file without duplicate, supplementary, unmapped and similar reads, for use with goleft/indexcov. It creates a BAM containing only a header (so indexcov can get the sample name) and a BAM index from which low quality reads, supplementary reads and similar have been removed.

Inputs

Name	Type	Description
`val(meta), path(input), path(input_index)`	`tuple`	-
`val(meta2), path(fasta)`	`tuple`	-
`val(meta3), path(fai)`	`tuple`	-

Outputs

Name	Type	Emit	Description
`val(meta)`	`tuple`	-	-

This pipeline was built with Nextflow. Documentation generated by nf-docs v0.1.0 on 2026-01-23 17:27:10 UTC.
