Processes

This page documents all processes in the pipeline.

MULTIQC
UNTAR
MOSDEPTH
SEQ2HLA
FASTQC
GFFREAD
GUNZIP
GATK4_COMBINEGVCFS
GATK4_INDEXFEATUREFILE
GATK4_VARIANTFILTRATION
GATK4_CREATESEQUENCEDICTIONARY
GATK4_SPLITNCIGARREADS
GATK4_HAPLOTYPECALLER
GATK4_INTERVALLISTTOOLS
GATK4_BASERECALIBRATOR
GATK4_APPLYBQSR
GATK4_BEDTOINTERVALLIST
GATK4_MERGEVCFS
UMITOOLS_EXTRACT
SAMTOOLS_SORT
SAMTOOLS_MERGE
SAMTOOLS_IDXSTATS
SAMTOOLS_FAIDX
SAMTOOLS_INDEX
SAMTOOLS_FLAGSTAT
SAMTOOLS_STATS
SAMTOOLS_CONVERT
BEDTOOLS_SORT
BEDTOOLS_MERGE
STAR_GENOMEGENERATE
STAR_ALIGN
STAR_INDEXVERSION
SNPEFF_SNPEFF
SNPEFF_DOWNLOAD
ENSEMBLVEP_VEP
ENSEMBLVEP_DOWNLOAD
BCFTOOLS_ANNOTATE
PICARD_MARKDUPLICATES
TABIX_TABIX
TABIX_BGZIPTABIX
CAT_FASTQ
REMOVE_UNKNOWN_REGIONS
GTF2BED

MULTIQC

Defined in modules/nf-core/multiqc/main.nf:21

Keywords: QC, bioinformatics tools, Beautiful stand-alone HTML report

Aggregate results from bioinformatics analyses across many samples into a single report

Code Documentation

Aggregate results from multiple analysis tools into a single report. MultiQC searches a given directory for analysis logs and compiles them into a single HTML report. It supports output from many common bioinformatics tools including FastQC, STAR, Picard, GATK, and more. The report provides:

Summary statistics across all samples
Interactive plots for QC metrics
Data tables for detailed metrics
Export functionality for plots and data

Tools

multiqc

MultiQC searches a given directory for analysis logs and compiles an HTML report. It's a general-use tool, well suited to summarising the output from numerous bioinformatics tools.

Homepage | Documentation | biotools:multiqc | License: GPL-3.0-or-later

Outputs

Name	Type	Pattern	Description
`report`	`-`	`-`	-
`data`	`-`	`-`	-
`plots`	`-`	`-`	-

Authors: @abhi18av, @bunop, @drpatelh, @jfy133 Maintainers: @abhi18av, @bunop, @drpatelh, @jfy133

UNTAR

Defined in modules/nf-core/untar/main.nf:1

Keywords: untar, uncompress, extract

Extract files from tar, tar.gz, tar.bz2, tar.xz archives
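
As an illustration of the extraction step (not the module's own script), here is a minimal Python sketch using the standard-library `tarfile` module. The function name `untar` and its arguments are chosen for this example only.

```python
import tarfile
from pathlib import Path


def untar(archive: str, dest: str) -> list[str]:
    """Extract `archive` into `dest` and return the archive member names."""
    Path(dest).mkdir(parents=True, exist_ok=True)
    # Mode "r:*" lets tarfile detect gzip, bzip2 or xz compression itself.
    with tarfile.open(archive, mode="r:*") as tar:
        names = tar.getnames()
        if hasattr(tarfile, "data_filter"):
            # Newer Python versions provide extraction filters that reject
            # unsafe members such as absolute paths.
            tar.extractall(dest, filter="data")
        else:
            tar.extractall(dest)
    return names
```

The same call handles plain `.tar` files as well as the compressed variants listed above, because the compression is detected from the file contents rather than from the extension.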

Tools

untar

Extract tar, tar.gz, tar.bz2, tar.xz files.

Documentation | License: GPL-3.0-or-later

Inputs

Name	Type	Description
`meta`	`map`	Groovy Map containing sample information e.g. [ id:'test', single_end:false ]
`archive`	`file`	File to be untarred

Outputs

Name	Type	Pattern	Description
`untar`	`map`	`*/`	Groovy Map containing sample information e.g. [ id:'test', single_end:false ]

Authors: @joseespinosa, @drpatelh, @matthdsm, @jfy133 Maintainers: @joseespinosa, @drpatelh, @matthdsm, @jfy133

MOSDEPTH

Defined in modules/nf-core/mosdepth/main.nf:1

Keywords: mosdepth, bam, cram, coverage

Calculates genome-wide sequencing coverage.

Tools

mosdepth

Fast BAM/CRAM depth calculation for WGS, exome, or targeted sequencing.

Documentation | biotools:mosdepth | License: MIT

Inputs

Name	Type	Description
`meta`	`map`	Groovy Map containing sample information e.g. [ id:'test', single_end:false ]
`bam`	`file`	Input BAM/CRAM file
`bai`	`file`	Index for BAM/CRAM file
`bed`	`file`	BED file with intersected intervals
`meta2`	`map`	Groovy Map containing bed information e.g. [ id:'test' ]
`fasta`	`file`	Reference genome FASTA file

Outputs

Name	Type	Pattern	Description
`global_txt`	`file`	`*.{global.dist.txt}`	Text file with global cumulative coverage distribution
`summary_txt`	`file`	`*.{summary.txt}`	Text file with summary mean depths per chromosome and regions
`regions_txt`	`file`	`*.{region.dist.txt}`	Text file with region cumulative coverage distribution
`per_base_d4`	`file`	`*.{per-base.d4}`	D4 file with per-base coverage
`per_base_bed`	`file`	`*.{per-base.bed.gz}`	BED file with per-base coverage
`per_base_csi`	`file`	`*.{per-base.bed.gz.csi}`	Index file for BED file with per-base coverage
`regions_bed`	`file`	`*.{regions.bed.gz}`	BED file with per-region coverage
`regions_csi`	`file`	`*.{regions.bed.gz.csi}`	Index file for BED file with per-region coverage
`quantized_bed`	`file`	`*.{quantized.bed.gz}`	BED file with binned coverage
`quantized_csi`	`file`	`*.{quantized.bed.gz.csi}`	Index file for BED file with binned coverage
`thresholds_bed`	`file`	`*.{thresholds.bed.gz}`	BED file with the number of bases in each region that are covered at or above each threshold
`thresholds_csi`	`file`	`*.{thresholds.bed.gz.csi}`	Index file for BED file with threshold coverage

Authors: @joseespinosa, @drpatelh, @ramprasadn, @matthdsm Maintainers: @joseespinosa, @ramprasadn, @matthdsm

SEQ2HLA

Defined in modules/nf-core/seq2hla/main.nf:20

Keywords: hla, typing, rna-seq, genomics, immunogenetics

Precision HLA typing and expression from RNA-seq data using seq2HLA

Code Documentation

Perform HLA typing from RNA-seq data using seq2HLA. seq2HLA determines HLA class I and class II genotypes from RNA-seq reads by mapping to a reference database of HLA alleles. It provides:

2-digit resolution typing (e.g., HLA-A*02)
4-digit resolution typing (e.g., HLA-A*02:01)
Expression levels of HLA alleles
Ambiguity reports when alleles cannot be distinguished

Supports both classical HLA genes (HLA-A, -B, -C, -DRB1, -DQB1, -DQA1) and non-classical genes. Requires paired-end RNA-seq reads as input.
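
To make the difference between the two resolutions concrete, the following Python sketch trims an allele name to 2-digit or 4-digit resolution, following the examples given above (`HLA-A*02` and `HLA-A*02:01`). The helper `hla_resolution` is hypothetical and is not part of seq2HLA.

```python
def hla_resolution(allele: str, digits: int) -> str:
    """Trim an HLA allele name such as 'HLA-A*02:01' to 2 or 4 digits."""
    if digits not in (2, 4):
        raise ValueError("digits must be 2 or 4")
    gene, star, fields = allele.partition("*")
    if not star or not fields:
        raise ValueError(f"not an allele name: {allele!r}")
    parts = fields.split(":")
    keep = digits // 2  # each colon-separated field contributes two digits
    if len(parts) < keep:
        raise ValueError(f"{allele!r} has fewer than {digits} digits")
    return gene + "*" + ":".join(parts[:keep])
```

For instance, `hla_resolution("HLA-A*02:01", 2)` yields the 2-digit name used in the first bullet, while passing `4` keeps both fields.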

Tools

seq2hla

Precision HLA typing and expression from next-generation RNA sequencing data

Homepage | Documentation | biotools:seq2HLA | License: MIT

Inputs

Name	Type	Description
`meta`	`map`	Groovy Map containing sample information e.g. `[ id:'sample1', single_end:false ]`
`reads`	`file`	Paired-end FASTQ files for RNA-seq data

Outputs

Name	Type	Pattern	Description
`class1_genotype_2d`	`file`	`*ClassI-class.HLAgenotype2digits`	HLA Class I 2-digit genotype results
`class2_genotype_2d`	`file`	`*ClassII.HLAgenotype2digits`	HLA Class II 2-digit genotype results
`class1_genotype_4d`	`file`	`*ClassI-class.HLAgenotype4digits`	HLA Class I 4-digit genotype results
`class2_genotype_4d`	`file`	`*ClassII.HLAgenotype4digits`	HLA Class II 4-digit genotype results
`class1_bowtielog`	`file`	`*ClassI-class.bowtielog`	HLA Class I Bowtie alignment log
`class2_bowtielog`	`file`	`*ClassII.bowtielog`	HLA Class II Bowtie alignment log
`class1_expression`	`file`	`*ClassI-class.expression`	HLA Class I expression results
`class2_expression`	`file`	`*ClassII.expression`	HLA Class II expression results
`class1_nonclass_genotype_2d`	`file`	`*ClassI-nonclass.HLAgenotype2digits`	HLA Class I non-classical 2-digit genotype results
`ambiguity`	`file`	`*.ambiguity`	HLA typing ambiguity results
`class1_nonclass_genotype_4d`	`file`	`*ClassI-nonclass.HLAgenotype4digits`	HLA Class I non-classical 4-digit genotype results
`class1_nonclass_bowtielog`	`file`	`*ClassI-nonclass.bowtielog`	HLA Class I non-classical Bowtie alignment log
`class1_nonclass_expression`	`file`	`*ClassI-nonclass.expression`	HLA Class I non-classical expression results

Authors: @FriederikeHanssen Maintainers: @FriederikeHanssen

FASTQC

Defined in modules/nf-core/fastqc/main.nf:19

Keywords: quality control, qc, adapters, fastq

Run FastQC on sequenced reads

Code Documentation

Run FastQC quality control on sequencing reads. FastQC provides a comprehensive quality control report for high-throughput sequencing data. It generates an HTML report and a ZIP archive containing detailed metrics including:

Basic statistics (total sequences, sequence length, GC content)
Per-base sequence quality scores
Per-sequence quality scores
Per-base sequence content
Sequence duplication levels
Overrepresented sequences
Adapter content
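
As a rough illustration of what the "Basic statistics" entry covers, the sketch below computes the total number of sequences, the sequence length range and the overall GC content from FASTQ text. It assumes simple four-line records and is not FastQC's implementation; the function name and the returned keys are made up for this example.

```python
def basic_statistics(fastq_text: str) -> dict:
    """Summarise FASTQ text that uses exactly four lines per record."""
    lines = [line for line in fastq_text.splitlines() if line.strip()]
    if not lines or len(lines) % 4 != 0:
        raise ValueError("expected a non-empty input with four lines per record")
    # The sequence is the second line of every four-line record.
    sequences = [line.strip().upper() for line in lines[1::4]]
    total_bases = sum(len(seq) for seq in sequences)
    gc_bases = sum(seq.count("G") + seq.count("C") for seq in sequences)
    return {
        "total_sequences": len(sequences),
        "min_length": min(len(seq) for seq in sequences),
        "max_length": max(len(seq) for seq in sequences),
        "gc_percent": 100.0 * gc_bases / total_bases if total_bases else 0.0,
    }
```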

Tools

fastqc

FastQC gives general quality metrics about your reads. It provides information about the quality score distribution across your reads, the per base sequence content (%A/C/G/T).

You get information about adapter contamination and other overrepresented sequences.

Homepage | Documentation | biotools:fastqc | License: GPL-2.0-only

Inputs

Name	Type	Description
`meta`	`map`	Groovy Map containing sample information e.g. [ id:'test', single_end:false ]
`reads`	`file`	List of input FastQ files of size 1 and 2 for single-end and paired-end data, respectively.

Outputs

Name	Type	Pattern	Description
`html`	`file`	`*_{fastqc.html}`	FastQC report
`zip`	`file`	`*_{fastqc.zip}`	FastQC report archive

Authors: @drpatelh, @grst, @ewels, @FelixKrueger Maintainers: @drpatelh, @grst, @ewels, @FelixKrueger

GFFREAD

Defined in modules/nf-core/gffread/main.nf:1

Keywords: gff, conversion, validation

Validate, filter, convert and perform various other operations on GFF files

Tools

gffread

GFF/GTF utility providing format conversions, region filtering, FASTA sequence extraction and more.

Homepage | Documentation | biotools:gffread | License: MIT

Inputs

Name	Type	Description
`meta`	`map`	Groovy Map containing meta data e.g. [ id:'test' ]
`gff`	`file`	A reference file in either the GFF3, GFF2 or GTF format.

Outputs

Name	Type	Pattern	Description
`gtf`	`file`	`*.{gtf}`	GTF file resulting from the conversion of the GFF input file if '-T' argument is present
`gffread_gff`	`file`	`*.gff3`	GFF3 file resulting from the conversion of the GFF input file if '-T' argument is absent
`gffread_fasta`	`file`	`*.fasta`	Fasta file produced when either of '-w', '-x', '-y' parameters is present

Authors: @edmundmiller Maintainers: @edmundmiller, @gallvp

GUNZIP

Defined in modules/nf-core/gunzip/main.nf:16

Keywords: gunzip, compression, decompression

Compresses and decompresses files.
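
For the decompression direction, a minimal Python sketch with the standard-library `gzip` module could look as follows. The function name `gunzip` and its arguments are illustrative only and do not mirror the module's script.

```python
import gzip
import shutil


def gunzip(src: str, dest: str) -> None:
    """Decompress the gzip file `src` into `dest`."""
    # Stream in chunks so that large files do not need to fit in memory.
    with gzip.open(src, "rb") as fin, open(dest, "wb") as fout:
        shutil.copyfileobj(fin, fout)
```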

Tools

gunzip

gzip is a file format and a software application used for file compression and decompression.

Documentation | License: GPL-3.0-or-later

Inputs

Name	Type	Description
`meta`	`map`	Optional groovy Map containing meta information e.g. [ id:'test', single_end:false ]
`archive`	`file`	File to be compressed/uncompressed

Outputs

Name	Type	Pattern	Description
`gunzip`	`file`	`.`	Compressed/uncompressed file

Authors: @joseespinosa, @drpatelh, @jfy133 Maintainers: @joseespinosa, @drpatelh, @jfy133, @gallvp

GATK4_COMBINEGVCFS

Defined in modules/nf-core/gatk4/combinegvcfs/main.nf:1

Keywords: gvcf, gatk4, vcf, combinegvcfs, short variant discovery

Combine per-sample gVCF files produced by HaplotypeCaller into a multi-sample gVCF file

Tools

gatk4

Genome Analysis Toolkit (GATK4). Developed in the Data Sciences Platform at the Broad Institute, the toolkit offers a wide variety of tools with a primary focus on variant discovery and genotyping. Its powerful processing engine and high-performance computing features make it capable of taking on projects of any size.

Homepage | Documentation | License: Apache-2.0

Inputs

Name	Type	Description
`meta`	`map`	Groovy Map containing sample information e.g. [ id:'test' ]
`vcf`	`file`	Compressed VCF files
`vcf_idx`	`file`	VCF Index file

Outputs

Name	Type	Pattern	Description
`combined_gvcf`	`file`	`*.combined.g.vcf.gz`	Compressed Combined GVCF file

Authors: @sateeshperi, @mjcipriano, @hseabolt, @maxulysse Maintainers: @sateeshperi, @mjcipriano, @hseabolt, @maxulysse

GATK4_INDEXFEATUREFILE

Defined in modules/nf-core/gatk4/indexfeaturefile/main.nf:1

Keywords: feature, gatk4, index, indexfeaturefile

Creates an index for a feature file, e.g. VCF or BED file.

Tools

gatk4

Genome Analysis Toolkit (GATK4)

Homepage | Documentation | License: BSD-3-clause

Inputs

Name	Type	Description
`meta`	`map`	Groovy Map containing sample information e.g. [ id:'test', single_end:false ]
`feature_file`	`file`	VCF/BED file

Outputs

Name	Type	Pattern	Description
`index`	`file`	`*.{tbi,idx}`	Index for VCF/BED file
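
The Pattern column on this page uses glob patterns with braces, such as `*.{tbi,idx}`. Assuming the braces denote comma-separated alternatives, as in common shell-style globs, the following Python sketch shows how a file name could be checked against such a pattern. The helpers `expand_braces` and `matches` are hypothetical, and Python's `fnmatch` may differ from the pipeline's own matcher in edge cases (for example, its `*` also matches path separators).

```python
import fnmatch
import re


def expand_braces(pattern: str) -> list[str]:
    """Expand the first {a,b,...} group recursively into plain patterns."""
    match = re.search(r"\{([^{}]*)\}", pattern)
    if match is None:
        return [pattern]
    head, tail = pattern[:match.start()], pattern[match.end():]
    expanded = []
    for alternative in match.group(1).split(","):
        expanded.extend(expand_braces(head + alternative + tail))
    return expanded


def matches(filename: str, pattern: str) -> bool:
    """Return True if `filename` matches any expansion of `pattern`."""
    return any(fnmatch.fnmatchcase(filename, p) for p in expand_braces(pattern))
```

Under this reading, a single-alternative pattern such as `*.{dict}` behaves the same as `*.dict`.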

Authors: @santiagorevale Maintainers: @santiagorevale

GATK4_VARIANTFILTRATION

Defined in modules/nf-core/gatk4/variantfiltration/main.nf:1

Keywords: filter, gatk4, variantfiltration, vcf

Filter variants

Tools

gatk4

Developed in the Data Sciences Platform at the Broad Institute, the toolkit offers a wide variety of tools with a primary focus on variant discovery and genotyping. Its powerful processing engine and high-performance computing features make it capable of taking on projects of any size.

Homepage | Documentation | License: Apache-2.0

Inputs

Name	Type	Description
`meta`	`map`	Groovy Map containing sample information e.g. [ id:'test']
`vcf`	`list`	List of VCF(.gz) files
`tbi`	`list`	List of VCF file indexes
`meta2`	`map`	Groovy Map containing reference information e.g. [ id:'genome' ]
`fasta`	`file`	Fasta file of reference genome
`meta3`	`map`	Groovy Map containing reference information e.g. [ id:'genome' ]
`fai`	`file`	Index of fasta file
`meta4`	`map`	Groovy Map containing reference information e.g. [ id:'genome' ]
`dict`	`file`	Sequence dictionary of fasta file
`meta5`	`map`	Groovy Map containing reference information e.g. [ id:'genome' ]
`gzi`	`file`	Genome index file only needed when the genome file was compressed with the BGZF algorithm.

Outputs

Name	Type	Pattern	Description
`vcf`	`file`	`*.vcf.gz`	Compressed VCF file
`tbi`	`file`	`*.vcf.gz.tbi`	Index of VCF file

Authors: @kevinmenden, @ramprasadn Maintainers: @kevinmenden, @ramprasadn

GATK4_CREATESEQUENCEDICTIONARY

Defined in modules/nf-core/gatk4/createsequencedictionary/main.nf:1

Keywords: createsequencedictionary, dictionary, fasta, gatk4

Creates a sequence dictionary for a reference sequence

Tools

gatk

Developed in the Data Sciences Platform at the Broad Institute, the toolkit offers a wide variety of tools with a primary focus on variant discovery and genotyping. Its powerful processing engine and high-performance computing features make it capable of taking on projects of any size.

Homepage | Documentation | License: Apache-2.0

Inputs

Name	Type	Description
`meta`	`map`	Groovy Map containing reference information e.g. [ id:'genome' ]
`fasta`	`file`	Input fasta file

Outputs

Name	Type	Pattern	Description
`dict`	`file`	`*.{dict}`	gatk dictionary file

Authors: @maxulysse, @ramprasadn Maintainers: @maxulysse, @ramprasadn

GATK4_SPLITNCIGARREADS

Defined in modules/nf-core/gatk4/splitncigarreads/main.nf:1

Keywords: gatk4, merge, vcf

Splits reads that contain Ns in their cigar string
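
The idea of splitting at `N` operations can be sketched in Python as below. This is a simplified illustration, not GATK's algorithm: it only cuts the CIGAR string at each `N` and tracks the reference start of every piece, and it ignores how the real tool clips the remaining bases. The function name `split_on_n` is chosen for this example.

```python
import re

CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")
REF_CONSUMING = set("MDN=X")  # operations that advance the reference position


def split_on_n(pos: int, cigar: str) -> list[tuple[int, str]]:
    """Split an alignment at N operations.

    Returns (reference start, CIGAR) for every resulting segment.
    """
    segments = []
    current = []
    segment_start = pos
    ref = pos
    for length, op in CIGAR_OP.findall(cigar):
        n = int(length)
        if op == "N":
            # Close the current segment and skip over the spliced region.
            if current:
                segments.append((segment_start, "".join(current)))
            ref += n
            segment_start = ref
            current = []
        else:
            current.append(f"{n}{op}")
            if op in REF_CONSUMING:
                ref += n
    if current:
        segments.append((segment_start, "".join(current)))
    return segments
```

A read without any `N` operation comes back as a single segment with its original position and CIGAR string.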

Tools

gatk4

Developed in the Data Sciences Platform at the Broad Institute, the toolkit offers a wide variety of tools with a primary focus on variant discovery and genotyping. Its powerful processing engine and high-performance computing features make it capable of taking on projects of any size.

Homepage | Documentation | License: Apache-2.0

Inputs

Name	Type	Description
`meta`	`map`	Groovy Map containing sample information e.g. [ id:'test']
`bam`	`list`	BAM/SAM/CRAM file containing reads
`bai`	`list`	BAI/SAI/CRAI index file (optional)
`intervals`	`file`	Bed file with the genomic regions included in the library (optional)
`meta2`	`map`	Groovy Map containing reference information e.g. [ id:'reference' ]
`fasta`	`file`	The reference fasta file
`meta3`	`map`	Groovy Map containing reference information e.g. [ id:'reference' ]
`fai`	`file`	Index of reference fasta file
`meta4`	`map`	Groovy Map containing reference information e.g. [ id:'reference' ]
`dict`	`file`	GATK sequence dictionary

Outputs

Name	Type	Pattern	Description
`bam`	`file`	`*.{bam,sam,cram}`	Output file with split reads (BAM/SAM/CRAM)

Authors: @kevinmenden Maintainers: @kevinmenden

GATK4_HAPLOTYPECALLER

Defined in modules/nf-core/gatk4/haplotypecaller/main.nf:25

Keywords: gatk4, haplotype, haplotypecaller

Call germline SNPs and indels via local re-assembly of haplotypes

Code Documentation

Call germline SNPs and indels using GATK HaplotypeCaller. HaplotypeCaller is GATK's flagship variant caller, performing local de-novo assembly of haplotypes in regions showing variation. It can produce either standard VCF output or GVCF output for joint calling. Key features:

Local re-assembly for accurate indel calling
Population-aware calling using dbSNP
Support for GVCF output mode for cohort analysis
DRAGstr model support for improved STR calling

For RNA-seq data, this should be run after SplitNCigarReads processing.

Tools

gatk4

Developed in the Data Sciences Platform at the Broad Institute, the toolkit offers a wide variety of tools with a primary focus on variant discovery and genotyping. Its powerful processing engine and high-performance computing features make it capable of taking on projects of any size.

Homepage | Documentation | License: Apache-2.0

Inputs

Name	Type	Description
`meta`	`map`	Groovy Map containing sample information e.g. [ id:'test', single_end:false ]
`input`	`file`	BAM/CRAM file from alignment
`input_index`	`file`	BAI/CRAI file from alignment
`intervals`	`file`	Bed file with the genomic regions included in the library (optional)
`dragstr_model`	`file`	Text file containing the DragSTR model of the used BAM/CRAM file (optional)
`meta2`	`map`	Groovy Map containing reference information e.g. [ id:'test_reference' ]
`fasta`	`file`	The reference fasta file
`meta3`	`map`	Groovy Map containing reference information e.g. [ id:'test_reference' ]
`fai`	`file`	Index of reference fasta file
`meta4`	`map`	Groovy Map containing reference information e.g. [ id:'test_reference' ]
`dict`	`file`	GATK sequence dictionary
`meta5`	`map`	Groovy Map containing dbsnp information e.g. [ id:'test_dbsnp' ]
`dbsnp`	`file`	VCF file containing known sites (optional)
`meta6`	`map`	Groovy Map containing dbsnp information e.g. [ id:'test_dbsnp' ]
`dbsnp_tbi`	`file`	VCF index of dbsnp (optional)

Outputs

Name	Type	Pattern	Description
`vcf`	`file`	`*.vcf.gz`	Compressed VCF file
`tbi`	`file`	`*.vcf.gz.tbi`	Index of VCF file
`bam`	`file`	`*.realigned.bam`	Assembled haplotypes and locally realigned reads

Authors: @suzannejin, @FriederikeHanssen Maintainers: @suzannejin, @FriederikeHanssen

GATK4_INTERVALLISTTOOLS

Defined in modules/nf-core/gatk4/intervallisttools/main.nf:1

Keywords: bed, gatk4, interval_list, sort

Splits the interval list file into unique, equally-sized interval files and places them under a directory

Tools

gatk4

Developed in the Data Sciences Platform at the Broad Institute, the toolkit offers a wide variety of tools with a primary focus on variant discovery and genotyping. Its powerful processing engine and high-performance computing features make it capable of taking on projects of any size.

Homepage | Documentation | License: Apache-2.0

Inputs

Name	Type	Description
`meta`	`map`	Groovy Map containing sample information e.g. [ id:'test', single_end:false ]
`intervals`	`file`	Interval file

Outputs

Name	Type	Pattern	Description
`interval_list`	`file`	`*.interval_list`	Interval list files

Authors: @praveenraj2018 Maintainers: @praveenraj2018

GATK4_BASERECALIBRATOR

Defined in modules/nf-core/gatk4/baserecalibrator/main.nf:1

Keywords: base quality score recalibration, table, bqsr, gatk4, sort

Generate recalibration table for Base Quality Score Recalibration (BQSR)

Tools

gatk4

Developed in the Data Sciences Platform at the Broad Institute, the toolkit offers a wide variety of tools with a primary focus on variant discovery and genotyping. Its powerful processing engine and high-performance computing features make it capable of taking on projects of any size.

Homepage | Documentation | License: Apache-2.0

Inputs

Name	Type	Description
`meta`	`map`	Groovy Map containing sample information e.g. [ id:'test', single_end:false ]
`input`	`file`	BAM/CRAM file from alignment
`input_index`	`file`	BAI/CRAI file from alignment
`intervals`	`file`	Bed file with the genomic regions included in the library (optional)
`meta2`	`map`	Groovy Map containing reference information e.g. [ id:'genome']
`fasta`	`file`	The reference fasta file
`meta3`	`map`	Groovy Map containing reference information e.g. [ id:'genome']
`fai`	`file`	Index of reference fasta file
`meta4`	`map`	Groovy Map containing reference information e.g. [ id:'genome']
`dict`	`file`	GATK sequence dictionary
`meta5`	`map`	Groovy Map containing reference information e.g. [ id:'genome']
`known_sites`	`file`	VCF files with known sites for indels / snps
`meta6`	`map`	Groovy Map containing reference information e.g. [ id:'genome']
`known_sites_tbi`	`file`	Tabix index of the known_sites

Outputs

Name	Type	Pattern	Description
`table`	`file`	`*.{table}`	Recalibration table from BaseRecalibrator

Authors: @yocra3, @FriederikeHanssen, @maxulysse Maintainers: @yocra3, @FriederikeHanssen, @maxulysse

GATK4_APPLYBQSR

Defined in modules/nf-core/gatk4/applybqsr/main.nf:1

Keywords: bam, base quality score recalibration, bqsr, cram, gatk4

Apply base quality score recalibration (BQSR) to a bam file

Tools

gatk4

Developed in the Data Sciences Platform at the Broad Institute, the toolkit offers a wide variety of tools with a primary focus on variant discovery and genotyping. Its powerful processing engine and high-performance computing features make it capable of taking on projects of any size.

Homepage | Documentation | License: Apache-2.0

Inputs

Name	Type	Description
`meta`	`map`	Groovy Map containing sample information e.g. [ id:'test', single_end:false ]
`input`	`file`	BAM/CRAM file from alignment
`input_index`	`file`	BAI/CRAI file from alignment
`bqsr_table`	`file`	Recalibration table from gatk4_baserecalibrator
`intervals`	`file`	Bed file with the genomic regions included in the library (optional)

Outputs

Name	Type	Pattern	Description
`bam`	`file`	`${prefix}.bam`	Recalibrated BAM file
`bai`	`file`	`${prefix}*bai`	Recalibrated BAM index file
`cram`	`file`	`${prefix}.cram`	Recalibrated CRAM file

Authors: @yocra3, @FriederikeHanssen Maintainers: @yocra3, @FriederikeHanssen

GATK4_BEDTOINTERVALLIST

Defined in modules/nf-core/gatk4/bedtointervallist/main.nf:1

Keywords: bed, bedtointervallist, gatk4, interval list

Creates an interval list from a bed file and a reference dict
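
The key detail in this conversion is the coordinate system: BED intervals are 0-based and half-open, while interval lists are 1-based and closed. The Python sketch below converts a single BED line under that convention. It leaves out the sequence-dictionary header that a real interval list carries, and the `+` strand and `.` name placeholders for missing columns are assumptions of this example rather than documented tool behaviour.

```python
def bed_to_interval(line: str) -> str:
    """Convert one tab-separated BED line to an interval-list style line."""
    fields = line.rstrip("\n").split("\t")
    chrom, start, end = fields[0], int(fields[1]), int(fields[2])
    name = fields[3] if len(fields) > 3 else "."    # placeholder (assumption)
    strand = fields[5] if len(fields) > 5 else "+"  # placeholder (assumption)
    # 0-based half-open [start, end) becomes 1-based closed [start + 1, end].
    return "\t".join([chrom, str(start + 1), str(end), strand, name])
```

Only the start coordinate shifts by one; the end coordinate stays the same because the half-open end and the closed end refer to the same last base.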

Tools

gatk4

Developed in the Data Sciences Platform at the Broad Institute, the toolkit offers a wide variety of tools with a primary focus on variant discovery and genotyping. Its powerful processing engine and high-performance computing features make it capable of taking on projects of any size.

Homepage | Documentation | License: Apache-2.0

Inputs

Name	Type	Description
`meta`	`map`	Groovy Map containing sample information e.g. [ id:'test']
`bed`	`file`	Input bed file
`meta2`	`map`	Groovy Map containing reference information e.g. [ id:'genome' ]
`dict`	`file`	Sequence dictionary

Outputs

Name	Type	Pattern	Description
`interval_list`	`file`	`*.interval_list`	gatk interval list file

Authors: @kevinmenden, @ramprasadn Maintainers: @kevinmenden, @ramprasadn

GATK4_MERGEVCFS

Defined in modules/nf-core/gatk4/mergevcfs/main.nf:1

Keywords: gatk4, merge, vcf

Merges several vcf files

Tools

gatk4

Developed in the Data Sciences Platform at the Broad Institute, the toolkit offers a wide variety of tools with a primary focus on variant discovery and genotyping. Its powerful processing engine and high-performance computing features make it capable of taking on projects of any size.

Homepage | Documentation | License: Apache-2.0

Inputs

Name	Type	Description
`meta`	`map`	Groovy Map containing sample information e.g. [ id:'test']
`vcf`	`list`	Two or more VCF files
`meta2`	`map`	Groovy Map containing reference information e.g. [ id:'genome']
`dict`	`file`	Optional Sequence Dictionary as input

Outputs

Name	Type	Pattern	Description
`vcf`	`file`	`*.vcf.gz`	merged vcf file
`tbi`	`file`	`*.tbi`	index files for the merged vcf files

Authors: @kevinmenden Maintainers: @kevinmenden

UMITOOLS_EXTRACT

Defined in modules/nf-core/umitools/extract/main.nf:1

Keywords: UMI, barcode, extract, umitools

Extracts the UMI barcode from a read and adds it to the read name, leaving any sample barcode in place
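
A simplified Python sketch of this operation is shown below. It assumes the UMI is the first `umi_len` bases of the read and that it is appended to the read identifier after an underscore. The function `extract_umi` is hypothetical and does not cover the pattern syntax or the sample-barcode handling of UMI-tools.

```python
def extract_umi(name: str, seq: str, qual: str, umi_len: int) -> tuple[str, str, str]:
    """Move the leading `umi_len` bases of a read into its name."""
    if len(seq) < umi_len:
        raise ValueError("read is shorter than the UMI")
    umi = seq[:umi_len]
    # Keep any comment after the first space in the header untouched.
    read_id, sep, comment = name.partition(" ")
    new_name = f"{read_id}_{umi}{sep}{comment}"
    # Trim the UMI bases and their quality scores from the read.
    return new_name, seq[umi_len:], qual[umi_len:]
```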

Tools

umi_tools

UMI-tools contains tools for dealing with Unique Molecular Identifiers (UMIs)/Random Molecular Tags (RMTs) and single cell RNA-Seq cell barcodes

Documentation

Inputs

Name	Type	Description
`meta`	`map`	Groovy Map containing sample information e.g. [ id:'test', single_end:false ]
`reads`	`list`	List of input FASTQ files whose UMIs will be extracted.

Outputs

Name	Type	Pattern	Description
`reads`	`file`	`*.{fastq.gz}`	Extracted FASTQ files. For single-end reads, pattern is ${prefix}.umi_extract.fastq.gz. For paired-end reads, pattern is ${prefix}.umiextract.fastq.gz.
`log`	`file`	`*.{log}`	Logfile for umi_tools

Authors: @drpatelh, @grst Maintainers: @drpatelh, @grst

SAMTOOLS_SORT

Defined in modules/nf-core/samtools/sort/main.nf:1

Keywords: sort, bam, sam, cram

Sort SAM/BAM/CRAM file

Tools

samtools

SAMtools is a set of utilities for interacting with and post-processing short DNA sequence read alignments in the SAM, BAM and CRAM formats, written by Heng Li. These files are generated as output by short read aligners like BWA.

Homepage | Documentation | biotools:samtools | License: MIT

Inputs

Name	Type	Description
`meta`	`map`	Groovy Map containing sample information e.g. [ id:'test', single_end:false ]
`bam`	`file`	BAM/CRAM/SAM file(s)
`meta2`	`map`	Groovy Map containing reference information e.g. [ id:'genome' ]
`fasta`	`file`	Reference genome FASTA file

Outputs

Name	Type	Pattern	Description
`bam`	`file`	`*.{bam}`	Sorted BAM file
`cram`	`file`	`*.{cram}`	Sorted CRAM file
`sam`	`file`	`*.{sam}`	Sorted SAM file
`crai`	`file`	`*.crai`	CRAM index file (optional)
`csi`	`file`	`*.csi`	BAM index file (optional)
`bai`	`file`	`*.bai`	BAM index file (optional)

Authors: @drpatelh, @ewels, @matthdsm Maintainers: @drpatelh, @ewels, @matthdsm

SAMTOOLS_MERGE

Defined in modules/nf-core/samtools/merge/main.nf:1

Keywords: merge, bam, sam, cram

Merge BAM or CRAM files

Tools

samtools

SAMtools is a set of utilities for interacting with and post-processing short DNA sequence read alignments in the SAM, BAM and CRAM formats, written by Heng Li. These files are generated as output by short read aligners like BWA.

Homepage | Documentation | biotools:samtools | License: MIT

Inputs

Name	Type	Description
`meta`	`map`	Groovy Map containing sample information e.g. [ id:'test', single_end:false ]
`input_files`	`file`	BAM/CRAM file
`meta2`	`map`	Groovy Map containing reference information e.g. [ id:'genome' ]
`fasta`	`file`	Reference file the CRAM was created with (optional)
`meta3`	`map`	Groovy Map containing reference information e.g. [ id:'genome' ]
`fai`	`file`	Index of the reference file the CRAM was created with (optional)
`meta4`	`map`	Groovy Map containing reference information e.g. [ id:'genome' ]
`gzi`	`file`	Index of the compressed reference file the CRAM was created with (optional)

Outputs

Name	Type	Pattern	Description
`bam`	`file`	`*.{bam}`	BAM file
`cram`	`file`	`*.{cram}`	CRAM file
`csi`	`file`	`*.csi`	BAM index file (optional)
`crai`	`file`	`*.crai`	CRAM index file (optional)

Authors: @yuukiiwa, @maxulysse, @FriederikeHanssen, @ramprasadn Maintainers: @yuukiiwa, @maxulysse, @FriederikeHanssen, @ramprasadn

SAMTOOLS_IDXSTATS

Defined in modules/nf-core/samtools/idxstats/main.nf:1

Keywords: stats, mapping, counts, chromosome, bam, sam, cram

Reports alignment summary statistics for a BAM/CRAM/SAM file

Tools

samtools

SAMtools is a set of utilities for interacting with and post-processing short DNA sequence read alignments in the SAM, BAM and CRAM formats, written by Heng Li. These files are generated as output by short read aligners like BWA.

Homepage | Documentation | biotools:samtools | License: MIT

Inputs

Name	Type	Description
`meta`	`map`	Groovy Map containing sample information e.g. [ id:'test', single_end:false ]
`bam`	`file`	BAM/CRAM/SAM file
`bai`	`file`	Index for BAM/CRAM/SAM file

Outputs

Name	Type	Pattern	Description
`idxstats`	`file`	`*.{idxstats}`	File containing samtools idxstats output

Authors: @drpatelh Maintainers: @drpatelh

SAMTOOLS_FAIDX

Defined in modules/nf-core/samtools/faidx/main.nf:1

Keywords: index, fasta, faidx, chromosome

Index FASTA file, and optionally generate a file of chromosome sizes
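
The chromosome sizes file is closely related to the index: the first two tab-separated columns of a `.fai` file are the sequence name and its length. The Python sketch below derives name and length pairs from index text. The function name `fai_to_sizes` is made up for this example.

```python
def fai_to_sizes(fai_text: str) -> list[tuple[str, int]]:
    """Return (sequence name, length) pairs from the text of a .fai index."""
    sizes = []
    for line in fai_text.splitlines():
        if not line.strip():
            continue
        # Columns after the first two (offsets, line widths) are not needed.
        name, length = line.split("\t")[:2]
        sizes.append((name, int(length)))
    return sizes
```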

Tools

samtools

SAMtools is a set of utilities for interacting with and post-processing short DNA sequence read alignments in the SAM, BAM and CRAM formats, written by Heng Li. These files are generated as output by short read aligners like BWA.

Homepage | Documentation | biotools:samtools | License: MIT

Inputs

Name	Type	Description
`meta`	`map`	Groovy Map containing reference information e.g. [ id:'test' ]
`fasta`	`file`	FASTA file
`meta2`	`map`	Groovy Map containing reference information e.g. [ id:'test' ]
`fai`	`file`	FASTA index file

Outputs

Name	Type	Pattern	Description
`fa`	`file`	`*.{fa}`	FASTA file
`sizes`	`file`	`*.{sizes}`	File containing chromosome lengths
`fai`	`file`	`*.{fai}`	FASTA index file
`gzi`	`file`	`*.gzi`	Optional gzip index file for compressed inputs

Authors: @drpatelh, @ewels, @phue Maintainers: @maxulysse, @phue

SAMTOOLS_INDEX

Defined in modules/nf-core/samtools/index/main.nf:1

Keywords: index, bam, sam, cram

Index SAM/BAM/CRAM file

Tools

samtools

SAMtools is a set of utilities for interacting with and post-processing short DNA sequence read alignments in the SAM, BAM and CRAM formats, written by Heng Li. These files are generated as output by short read aligners like BWA.

Homepage | Documentation | biotools:samtools | License: MIT

Inputs

Name	Type	Description
`meta`	`map`	Groovy Map containing sample information e.g. [ id:'test', single_end:false ]
`input`	`file`	input file

Outputs

Name	Type	Pattern	Description
`bai`	`file`	`*.{bai,crai,sai}`	BAM/CRAM/SAM index file
`csi`	`file`	`*.{csi}`	CSI index file
`crai`	`file`	`*.{bai,crai,sai}`	BAM/CRAM/SAM index file

Authors: @drpatelh, @ewels, @maxulysse Maintainers: @drpatelh, @ewels, @maxulysse

SAMTOOLS_FLAGSTAT

Defined in modules/nf-core/samtools/flagstat/main.nf:1

Keywords: stats, mapping, counts, bam, sam, cram

Counts the number of alignments in a BAM/CRAM/SAM file for each FLAG type
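
Each alignment record carries a bitwise FLAG field, and counting per flag type amounts to testing individual bits. The Python sketch below tallies a few of the standard SAM flag bits. It is an illustration of the principle, not a reimplementation of `samtools flagstat`, whose report contains further derived categories.

```python
from collections import Counter

# A subset of the flag bits defined in the SAM specification.
FLAG_BITS = {
    "paired": 0x1,
    "proper_pair": 0x2,
    "unmapped": 0x4,
    "secondary": 0x100,
    "duplicate": 0x400,
    "supplementary": 0x800,
}


def count_flags(flags) -> Counter:
    """Count how many FLAG values have each bit in FLAG_BITS set."""
    counts = Counter(total=0)
    for flag in flags:
        counts["total"] += 1
        for name, bit in FLAG_BITS.items():
            if flag & bit:
                counts[name] += 1
    return counts
```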

Tools

samtools

SAMtools is a set of utilities for interacting with and post-processing short DNA sequence read alignments in the SAM, BAM and CRAM formats, written by Heng Li. These files are generated as output by short read aligners like BWA.

Homepage | Documentation | biotools:samtools | License: MIT

Inputs

Name	Type	Description
`meta`	`map`	Groovy Map containing sample information e.g. [ id:'test', single_end:false ]
`bam`	`file`	BAM/CRAM/SAM file
`bai`	`file`	Index for BAM/CRAM/SAM file

Outputs

Name	Type	Pattern	Description
`flagstat`	`file`	`*.{flagstat}`	File containing samtools flagstat output

Authors: @drpatelh Maintainers: @drpatelh

SAMTOOLS_STATS

Defined in modules/nf-core/samtools/stats/main.nf:1

Keywords: statistics, counts, bam, sam, cram

Produces comprehensive statistics from a SAM/BAM/CRAM file

Tools

samtools

SAMtools is a set of utilities for interacting with and post-processing short DNA sequence read alignments in the SAM, BAM and CRAM formats, written by Heng Li. These files are generated as output by short read aligners like BWA.

Homepage | Documentation | biotools:samtools | License: MIT

Inputs

Name	Type	Description
`meta`	`map`	Groovy Map containing sample information e.g. [ id:'test', single_end:false ]
`input`	`file`	BAM/CRAM file from alignment
`input_index`	`file`	BAI/CRAI file from alignment
`meta2`	`map`	Groovy Map containing reference information e.g. [ id:'genome' ]
`fasta`	`file`	Reference file the CRAM was created with (optional)

Outputs

Name	Type	Pattern	Description
`stats`	`file`	`*.{stats}`	File containing samtools stats output

Authors: @drpatelh, @FriederikeHanssen, @ramprasadn Maintainers: @drpatelh, @FriederikeHanssen, @ramprasadn

SAMTOOLS_CONVERT

Defined in modules/nf-core/samtools/convert/main.nf:1

Keywords: view, index, bam, cram

Convert and then index a CRAM -> BAM or BAM -> CRAM file

Tools

samtools

SAMtools is a set of utilities for interacting with and post-processing short DNA sequence read alignments in the SAM, BAM and CRAM formats, written by Heng Li. These files are generated as output by short read aligners like BWA.

Homepage | Documentation | biotools:samtools | License: MIT

Inputs

Name	Type	Description
`meta`	`map`	Groovy Map containing sample information e.g. [ id:'test', single_end:false ]
`input`	`file`	BAM/CRAM file
`index`	`file`	BAM/CRAM index file
`meta2`	`map`	Groovy Map containing reference information e.g. [ id:'genome' ]
`fasta`	`file`	Reference file to create the CRAM file
`meta3`	`map`	Groovy Map containing reference information e.g. [ id:'genome' ]
`fai`	`file`	Reference index file to create the CRAM file

Outputs

Name	Type	Pattern	Description
`bam`	`file`	`*{.bam}`	filtered/converted BAM file
`cram`	`file`	`*{cram}`	filtered/converted CRAM file
`bai`	`file`	`*{.bai}`	filtered/converted BAM index
`crai`	`file`	`*{.crai}`	filtered/converted CRAM index

Authors: @FriederikeHanssen, @maxulysse Maintainers: @FriederikeHanssen, @maxulysse, @matthdsm

BEDTOOLS_SORT

Defined in modules/nf-core/bedtools/sort/main.nf:1

Keywords: bed, sort, bedtools, chromosome

Sorts a feature file by chromosome and other criteria.
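As a rough illustration of ordering by chromosome and then by start position, here is a minimal Python sketch. It is not how bedtools is implemented; the function name `sort_bed_records` and the three-field tuple layout are assumptions made for this example, and chromosome names are compared as plain strings (so `chr10` sorts before `chr2`).

```python
from typing import List, Tuple

# A BED record reduced to (chrom, start, end); hypothetical layout for this sketch.
BedRecord = Tuple[str, int, int]


def sort_bed_records(records: List[BedRecord]) -> List[BedRecord]:
    """Order records by chromosome name (string comparison), then by start."""
    return sorted(records, key=lambda rec: (rec[0], rec[1]))


if __name__ == "__main__":
    example = [("chr2", 50, 80), ("chr1", 300, 400), ("chr1", 100, 200)]
    for chrom, start, end in sort_bed_records(example):
        print(f"{chrom}\t{start}\t{end}")
```

Other sort criteria (for example by feature size or score) would only change the key function in this sketch.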

Tools

bedtools

A set of tools for genomic analysis tasks, specifically enabling genome arithmetic (merge, count, complement) on various file types.

Documentation | biotools:bedtools | License: MIT

Inputs

Name	Type	Description
`meta`	`map`	Groovy Map containing sample information e.g. [ id:'test', single_end:false ]
`intervals`	`file`	BED/BEDGRAPH

Outputs

Name	Type	Pattern	Description
`sorted`	`file`	`*.${extension}`	Sorted output file

Authors: @edmundmiller, @sruthipsuresh, @drpatelh, @chris-cheshire, @adamrtalbot Maintainers: @edmundmiller, @sruthipsuresh, @drpatelh, @chris-cheshire, @adamrtalbot

BEDTOOLS_MERGE

Defined in modules/nf-core/bedtools/merge/main.nf:1

Keywords: bed, merge, bedtools, overlapped bed

Combines overlapping or “book-ended” features in an interval file into a single feature that spans all of the combined features.
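The merging rule can be sketched in a few lines of Python. This is an illustrative re-implementation, not the bedtools code: the name `merge_intervals` and the `(chrom, start, end)` tuple layout are assumptions, and intervals are treated as half-open BED-style ranges, so a feature ending at 200 is "book-ended" with one starting at 200.

```python
from typing import List, Tuple

# (chrom, start, end) with BED-style half-open coordinates; hypothetical layout.
Interval = Tuple[str, int, int]


def merge_intervals(intervals: List[Interval]) -> List[Interval]:
    """Merge overlapping or book-ended intervals on the same chromosome."""
    merged: List[Interval] = []
    # Sorting by (chrom, start, end) places candidates for merging next to each other.
    for chrom, start, end in sorted(intervals):
        if merged and merged[-1][0] == chrom and start <= merged[-1][2]:
            # Overlapping (start < previous end) or book-ended (start == previous end).
            prev_chrom, prev_start, prev_end = merged[-1]
            merged[-1] = (prev_chrom, prev_start, max(prev_end, end))
        else:
            merged.append((chrom, start, end))
    return merged


if __name__ == "__main__":
    features = [("chr1", 100, 200), ("chr1", 200, 300), ("chr1", 500, 600)]
    print(merge_intervals(features))
```

The sketch sorts its input itself; bedtools merge instead expects the input to be sorted already, which is why a sorting step usually precedes it.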

Tools

bedtools

A set of tools for genomic analysis tasks, specifically enabling genome arithmetic (merge, count, complement) on various file types.

Documentation | biotools:bedtools | License: MIT

Inputs

Name	Type	Description
`meta`	`map`	Groovy Map containing sample information e.g. [ id:'test', single_end:false ]
`bed`	`file`	Input BED file

Outputs

Name	Type	Pattern	Description
`bed`	`file`	`*.{bed}`	Overlapped bed file with combined features

Authors: @edmundmiller, @sruthipsuresh, @drpatelh Maintainers: @edmundmiller, @sruthipsuresh, @drpatelh

STAR_GENOMEGENERATE

Defined in modules/nf-core/star/genomegenerate/main.nf:1

Keywords: index, fasta, genome, reference

Create index for STAR

Tools

star

STAR is a software package for mapping DNA sequences against a large reference genome, such as the human genome.

Homepage | biotools:star | License: MIT

Inputs

Name	Type	Description
`meta`	`map`	Groovy Map containing sample information e.g. [ id:'test', single_end:false ]
`fasta`	`file`	Fasta file of the reference genome
`meta2`	`map`	Groovy Map containing reference information e.g. [ id:'test' ]
`gtf`	`file`	GTF file of the reference genome

Outputs

Name	Type	Pattern	Description
`index`	`directory`	`star`	Folder containing the star index files

Authors: @kevinmenden, @drpatelh Maintainers: @kevinmenden, @drpatelh

STAR_ALIGN

Defined in modules/nf-core/star/align/main.nf:1

Keywords: align, fasta, genome, reference

Align reads to a reference genome using STAR

Tools

star

STAR is a software package for mapping DNA sequences against a large reference genome, such as the human genome.

Homepage | biotools:star | License: MIT

Inputs

Name	Type	Description
`meta`	`map`	Groovy Map containing sample information e.g. [ id:'test', single_end:false ]
`reads`	`file`	List of input FastQ files of size 1 and 2 for single-end and paired-end data, respectively.
`meta2`	`map`	Groovy Map containing reference information e.g. [ id:'test' ]
`index`	`directory`	STAR genome index
`meta3`	`map`	Groovy Map containing reference information e.g. [ id:'test' ]
`gtf`	`file`	Annotation GTF file

Outputs

Name	Type	Pattern	Description
`log_final`	`file`	`*Log.final.out`	STAR final log file
`log_out`	`file`	`*Log.out`	STAR log out file
`log_progress`	`file`	`*Log.progress.out`	STAR log progress file
`bam`	`file`	`*.{bam}`	Output BAM file containing read alignments
`bam_sorted`	`file`	`*sortedByCoord.out.bam`	Sorted BAM file of read alignments (optional)
`bam_sorted_aligned`	`file`	`*.Aligned.sortedByCoord.out.bam`	Sorted BAM file of read alignments (optional)
`bam_transcript`	`file`	`*toTranscriptome.out.bam`	Output BAM file of transcriptome alignment (optional)
`bam_unsorted`	`file`	`*Aligned.unsort.out.bam`	Unsorted BAM file of read alignments (optional)
`fastq`	`file`	`*fastq.gz`	Unmapped FastQ files (optional)
`tab`	`file`	`*.tab`	STAR output tab file(s) (optional)
`spl_junc_tab`	`file`	`*.SJ.out.tab`	STAR output splice junction tab file
`read_per_gene_tab`	`file`	`*.ReadsPerGene.out.tab`	STAR output read per gene tab file
`junction`	`file`	`*.out.junction`	STAR chimeric junction output file (optional)
`sam`	`file`	`*.out.sam`	STAR output SAM file(s) (optional)
`wig`	`file`	`*.wig`	STAR output wiggle format file(s) (optional)
`bedgraph`	`file`	`*.bg`	STAR output bedGraph format file(s) (optional)

Authors: @kevinmenden, @drpatelh, @praveenraj2018 Maintainers: @kevinmenden, @drpatelh, @praveenraj2018

STAR_INDEXVERSION

Defined in modules/nf-core/star/indexversion/main.nf:1

Keywords: index, version, rna

Get the minimal allowed index version from STAR

Tools

star

STAR is a software package for mapping DNA sequences against a large reference genome, such as the human genome.

Homepage | biotools:star | License: MIT

Outputs

Name	Type	Pattern	Description
`index_version`	`-`	`-`	-

Authors: @nvnieuwk Maintainers: @nvnieuwk

SNPEFF_SNPEFF

Defined in modules/nf-core/snpeff/snpeff/main.nf:1

Keywords: annotation, effect prediction, snpeff, variant, vcf

Genetic variant annotation and functional effect prediction toolbox

Tools

snpeff

SnpEff is a variant annotation and effect prediction tool. It annotates and predicts the effects of genetic variants on genes and proteins (such as amino acid changes).

Homepage | Documentation | biotools:snpeff | License: MIT

Inputs

Name	Type	Description
`meta`	`map`	Groovy Map containing sample information e.g. [ id:'test', single_end:false ]
`vcf`	`file`	vcf to annotate
`meta2`	`map`	Groovy Map containing sample information e.g. [ id:'test', single_end:false ]
`cache`	`file`	path to snpEff cache (optional)

Outputs

Name	Type	Pattern	Description
`vcf`	`file`	`*.ann.vcf`	annotated vcf
`report`	`string`	`*.csv`	snpEff report csv file
`summary_html`	`string`	`*.html`	snpEff summary statistics in html file
`genes_txt`	`string`	`*.genes.txt`	txt (tab separated) file having counts of the number of variants affecting each transcript and gene

Authors: @maxulysse Maintainers: @maxulysse

SNPEFF_DOWNLOAD

Defined in modules/nf-core/snpeff/download/main.nf:1

Keywords: annotation, effect prediction, snpeff, variant, vcf

Genetic variant annotation and functional effect prediction toolbox

Tools

snpeff

SnpEff is a variant annotation and effect prediction tool. It annotates and predicts the effects of genetic variants on genes and proteins (such as amino acid changes).

Homepage | Documentation | biotools:snpeff | License: MIT

Inputs

Name	Type	Description
`meta`	`map`	Groovy Map containing sample information e.g. [ id:'test', single_end:false ]
`snpeff_db`	`string`	SnpEff database name

Outputs

Name	Type	Pattern	Description
`cache`	`file`	`-`	snpEff cache

Authors: @maxulysse Maintainers: @maxulysse

ENSEMBLVEP_VEP

Defined in modules/nf-core/ensemblvep/vep/main.nf:1

Keywords: annotation, vcf, json, tab

Ensembl Variant Effect Predictor (VEP). The output-file-format is controlled through task.ext.args.

Tools

ensemblvep

VEP determines the effect of your variants (SNPs, insertions, deletions, CNVs or structural variants) on genes, transcripts, and protein sequence, as well as regulatory regions.

Homepage | Documentation | License: Apache-2.0

Inputs

Name	Type	Description
`meta`	`map`	Groovy Map containing sample information e.g. [ id:'test', single_end:false ]
`vcf`	`file`	vcf to annotate
`custom_extra_files`	`file`	extra sample-specific files to be used with the `--custom` flag to be configured with ext.args (optional)
`meta2`	`map`	Groovy Map containing fasta reference information e.g. [ id:'test' ]
`fasta`	`file`	reference FASTA file (optional)

Outputs

Name	Type	Pattern	Description
`vcf`	`file`	`*.vcf.gz`	annotated vcf (optional)
`tbi`	`file`	`*.vcf.gz.tbi`	annotated vcf index (optional)
`tab`	`file`	`*.ann.tab.gz`	tab file with annotated variants (optional)
`json`	`file`	`*.ann.json.gz`	json file with annotated variants (optional)
`report`	`string`	`*.html`	VEP report file

Authors: @maxulysse, @matthdsm, @nvnieuwk Maintainers: @maxulysse, @matthdsm, @nvnieuwk

ENSEMBLVEP_DOWNLOAD

Defined in modules/nf-core/ensemblvep/download/main.nf:1

Keywords: annotation, cache, download

Ensembl Variant Effect Predictor (VEP). The cache downloading options are controlled through task.ext.args.

Tools

ensemblvep

VEP determines the effect of your variants (SNPs, insertions, deletions, CNVs or structural variants) on genes, transcripts, and protein sequence, as well as regulatory regions.

Homepage | Documentation | License: Apache-2.0

Inputs

Name	Type	Description
`meta`	`map`	Groovy Map containing sample information e.g. [ id:'test', single_end:false ]
`assembly`	`string`	Genome assembly
`species`	`string`	Species
`cache_version`	`string`	cache version

Outputs

Name	Type	Pattern	Description
`cache`	`file`	`*`	cache

Authors: @maxulysse Maintainers: @maxulysse

BCFTOOLS_ANNOTATE

Defined in modules/nf-core/bcftools/annotate/main.nf:1

Keywords: bcftools, annotate, vcf, remove, add

Add or remove annotations.

Tools

annotate

Add or remove annotations.

Homepage | Documentation | biotools:bcftools | License: MIT

Inputs

Name	Type	Description
`meta`	`map`	Groovy Map containing sample information e.g. [ id:'test', single_end:false ]
`input`	`file`	Query VCF or BCF file, can be either uncompressed or compressed
`index`	`file`	Index of the query VCF or BCF file
`annotations`	`file`	Bgzip-compressed file with annotations
`annotations_index`	`file`	Index of the annotations file

Outputs

Name	Type	Pattern	Description
`vcf`	`file`	`*{vcf,vcf.gz,bcf,bcf.gz}`	Compressed annotated VCF file
`tbi`	`file`	`*.tbi`	Alternative VCF file index
`csi`	`file`	`*.csi`	Default VCF file index

Authors: @projectoriented, @ramprasadn Maintainers: @projectoriented, @ramprasadn

PICARD_MARKDUPLICATES

Defined in modules/nf-core/picard/markduplicates/main.nf:1

Keywords: markduplicates, pcr, duplicates, bam, sam, cram

Locate and tag duplicate reads in a BAM file

Tools

picard

A set of command line tools (in Java) for manipulating high-throughput sequencing (HTS) data and formats such as SAM/BAM/CRAM and VCF.

Homepage | Documentation | biotools:picard_tools | License: MIT

Inputs

Name	Type	Description
`meta`	`map`	Groovy Map containing sample information e.g. [ id:'test', single_end:false ]
`reads`	`file`	Sequence reads file, can be SAM/BAM/CRAM format
`meta2`	`map`	Groovy Map containing reference information e.g. [ id:'genome' ]
`fasta`	`file`	Reference genome fasta file, required for CRAM input
`meta3`	`map`	Groovy Map containing reference information e.g. [ id:'genome' ]
`fai`	`file`	Reference genome fasta index

Outputs

Name	Type	Pattern	Description
`bam`	`file`	`*.{bam}`	BAM file with duplicate reads marked/removed
`bai`	`file`	`*.{bai}`	An optional BAM index file. If desired, --CREATE_INDEX must be passed as a flag
`cram`	`file`	`*.{cram}`	Output CRAM file
`metrics`	`file`	`*.{metrics.txt}`	Duplicate metrics file generated by picard

Authors: @drpatelh, @projectoriented, @ramprasadn Maintainers: @drpatelh, @projectoriented, @ramprasadn

TABIX_TABIX

Defined in modules/nf-core/tabix/tabix/main.nf:1

Keywords: index, tabix, vcf

Creates a tabix index from a sorted, bgzip-compressed, tab-delimited genome file

Tools

tabix

Generic indexer for TAB-delimited genome position files.

Homepage | Documentation | biotools:tabix | License: MIT

Inputs

Name	Type	Description
`meta`	`map`	Groovy Map containing sample information e.g. [ id:'test', single_end:false ]
`tab`	`file`	TAB-delimited genome position file compressed with bgzip

Outputs

Name	Type	Pattern	Description
`index`	`file`	`*.{tbi,csi}`	Tabix index file (either tbi or csi)

Authors: @joseespinosa, @drpatelh, @maxulysse Maintainers: @joseespinosa, @drpatelh, @maxulysse

TABIX_BGZIPTABIX

Defined in modules/nf-core/tabix/bgziptabix/main.nf:1

Keywords: bgzip, compress, index, tabix, vcf

Compresses a sorted tab-delimited genome file with bgzip and then creates a tabix index

Tools

tabix

Generic indexer for TAB-delimited genome position files.

Homepage | Documentation | biotools:tabix | License: MIT

Inputs

Name	Type	Description
`meta`	`map`	Groovy Map containing sample information e.g. [ id:'test', single_end:false ]
`input`	`file`	Sorted tab-delimited genome file

Outputs

Name	Type	Pattern	Description
`gz_index`	`file`	`.gz, .{tbi,csi}`	bgzipped tab-delimited genome file and its Tabix index file (either tbi or csi)

Authors: @maxulysse, @DLBPointon Maintainers: @maxulysse, @DLBPointon

CAT_FASTQ

Defined in modules/nf-core/cat/fastq/main.nf:1

Keywords: cat, fastq, concatenate

Concatenates fastq files
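Concatenation here can be as simple as appending the bytes of one file to another. The Python sketch below assumes gzip-compressed inputs and relies on the fact that gzip files joined back to back still form a valid multi-member gzip stream. The function name `concatenate_files` is hypothetical and this is not the module's own script.

```python
import shutil
from pathlib import Path
from typing import Iterable


def concatenate_files(inputs: Iterable[Path], output: Path) -> None:
    """Append the raw bytes of each input file to `output`, in the given order."""
    with open(output, "wb") as out_handle:
        for path in inputs:
            with open(path, "rb") as in_handle:
                # Stream in chunks so large FastQ files are never fully in memory.
                shutil.copyfileobj(in_handle, out_handle)
```

For paired-end data the same call would be made twice, once for the read 1 files and once for the read 2 files, keeping the file order identical in both calls so that mates stay in sync.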

Tools

cat

The cat utility reads files sequentially, writing them to the standard output.

Documentation | License: GPL-3.0-or-later

Inputs

Name	Type	Description
`meta`	`map`	Groovy Map containing sample information e.g. [ id:'test', single_end:false ]
`reads`	`file`	List of input FastQ files to be concatenated.

Outputs

Name	Type	Pattern	Description
`reads`	`file`	`*.{merged.fastq.gz}`	Merged fastq file

Authors: @joseespinosa, @drpatelh Maintainers: @joseespinosa, @drpatelh

REMOVE_UNKNOWN_REGIONS

Defined in modules/local/remove_unknown_regions/main.nf:1

Inputs

Name	Type	Description
`val(meta), path(bed)`	`tuple`	-
`val(meta2), path(dict)`	`tuple`	-

Outputs

Name	Type	Emit	Description
`val(meta), path('*.bed')`	`tuple`	`bed`	-

GTF2BED

Defined in modules/local/gtf2bed/main.nf:13

Convert GTF annotation file to BED format. Extracts genomic features (exons, transcripts, or genes) from a GTF file and outputs them in BED format for use with interval-based tools. The output BED file uses 0-based coordinates (BED standard) converted from the 1-based GTF coordinates.
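The coordinate conversion described above can be sketched as follows. GTF intervals are 1-based and inclusive, BED intervals are 0-based and half-open, so only the start shifts by one. The function name `gtf_line_to_bed`, the choice of `gene_id` as the BED name, and the six-column output layout are assumptions for illustration; the module's actual column choices are not documented here.

```python
import re
from typing import Optional, Tuple

# (chrom, start, end, name, score, strand); hypothetical BED6-like layout.
BedRecord = Tuple[str, int, int, str, str, str]

GENE_ID_PATTERN = re.compile(r'gene_id "([^"]+)"')


def gtf_line_to_bed(line: str, feature_type: str = "exon") -> Optional[BedRecord]:
    """Convert one GTF line to a BED record, or return None if it is skipped."""
    if not line.strip() or line.startswith("#"):
        return None  # blank line or comment/header line
    fields = line.rstrip("\n").split("\t")
    if len(fields) != 9 or fields[2] != feature_type:
        return None  # malformed line or a feature type we are not extracting
    chrom, _source, _feature, start, end, score, strand, _frame, attributes = fields
    match = GENE_ID_PATTERN.search(attributes)
    name = match.group(1) if match else "."
    # 1-based inclusive [start, end] becomes 0-based half-open [start - 1, end).
    return (chrom, int(start) - 1, int(end), name, score, strand)
```

A GTF exon spanning 11869 to 12227 therefore becomes the BED interval 11868 to 12227, and its length (359 bases) is simply end minus start in BED coordinates.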

Inputs

Name	Type	Description
`val(meta), path(gtf)`	`tuple`	-

Outputs

Name	Type	Emit	Description
`val(meta), path('*.bed')`	`tuple`	`bed`	-

This pipeline was built with Nextflow. Documentation generated by nf-docs v0.1.0 on 2026-01-23 17:23:12 UTC.
