nf-core/rnavar

Version: 1.3.0dev · GATK4 RNA variant calling pipeline

Inputs

Input/output options

Name	Description	Type	Default	Required
`--input`	Path to comma-separated file containing information about the samples in the experiment.	`string`	n/a	yes
`--outdir`	The output directory where the results will be saved. You have to use absolute paths to storage on Cloud infrastructure.	`string`	n/a	yes
`--tools`	Specify which additional tools RNAvar should use. Values can be 'seq2hla', 'bcfann', 'snpeff', 'vep' or 'merge'. If you specify 'merge', the pipeline runs both snpeff and VEP annotation.	`string`	n/a	no
`--save_merged_fastq`	Save FastQ files after merging re-sequenced libraries in the results directory.	`boolean`	n/a	no
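
As a rough illustration of how the required options above fit together, the following sketch assembles a launch command as an argument list. The helper name `build_rnavar_command`, the example paths, and the `docker` profile are assumptions made for the example; only `--input`, `--outdir`, and `--tools` come from the table.

```python
import shlex


def build_rnavar_command(samplesheet, outdir, tools=None, profile="docker"):
    """Return the argument list for a pipeline launch (hypothetical helper)."""
    cmd = [
        "nextflow", "run", "nf-core/rnavar",
        "-profile", profile,        # assumed profile; use the one for your site
        "--input", samplesheet,     # required: comma-separated samplesheet
        "--outdir", outdir,         # required: absolute path on cloud storage
    ]
    if tools:
        # --tools takes a comma-separated list, e.g. "snpeff,vep"
        cmd += ["--tools", ",".join(tools)]
    return cmd


cmd = build_rnavar_command("samplesheet.csv", "/data/results", tools=["snpeff", "vep"])
print(shlex.join(cmd))
```

Building the command as a list keeps each value a separate argument, so paths containing spaces do not need manual quoting.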

Preprocessing of alignment

Name	Description	Type	Default	Required
`--extract_umi`	Specify whether to remove UMIs from the reads with UMI-tools extract.	`boolean`	n/a	no
`--umitools_extract_method`	UMI pattern to use. Can be either 'string' (default) or 'regex'.	`string`	`string`	no
`--umitools_bc_pattern`	The UMI barcode pattern to use e.g. 'NNNNNN' indicates that the first 6 nucleotides of the read are from the UMI.	`string`	n/a	no
`--umitools_bc_pattern2`	The UMI barcode pattern to use if the UMI is located in read 2.	`string`	n/a	no
`--umitools_umi_separator`	The character that separates the UMI in the read name. Most likely a colon if you skipped the extraction with UMI-tools and used other software.	`string`	n/a	no
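
To make the meaning of a pattern such as 'NNNNNN' concrete, here is a simplified sketch of the 'string' extraction method: the leading bases covered by the pattern are moved from the read into the read name. This is an illustration only, not the UMI-tools implementation; the function name `extract_umi` is hypothetical, the sketch handles only patterns made entirely of `N`, and the `_` separator is an assumption for the example.

```python
def extract_umi(read_name, sequence, quality, bc_pattern="NNNNNN", separator="_"):
    """Move the leading UMI bases into the read name (simplified 'string' method)."""
    if set(bc_pattern) != {"N"}:
        raise ValueError("this sketch only handles patterns made of 'N'")
    umi_len = len(bc_pattern)
    if len(sequence) < umi_len:
        raise ValueError("read is shorter than the UMI pattern")
    umi = sequence[:umi_len]
    # The UMI is appended to the read name; sequence and quality are trimmed together.
    return f"{read_name}{separator}{umi}", sequence[umi_len:], quality[umi_len:]


name, seq, qual = extract_umi("read1", "ACGTTAGGCATC", "IIIIIIIIIIII")
print(name, seq, qual)  # read1_ACGTTA GGCATC IIIIII
```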

Alignment options

Name	Description	Type	Default	Required
`--aligner`	Specifies the alignment algorithm to use.	`string`	`star`	yes
`--star_index`	Path to STAR index folder or compressed file (tar.gz)	`string`	n/a	no
`--star_twopass`	Enable STAR 2-pass mapping mode.	`boolean`	`True`	no
`--star_ignore_sjdbgtf`	Do not use GTF file during STAR index building step	`boolean`	n/a	no
`--star_max_memory_bamsort`	Option to limit RAM when sorting BAM file. Value to be specified in bytes. If 0, will be set to the genome index size.	`integer`	`0`	no
`--star_bins_bamsort`	Specifies the number of genome bins for coordinate-sorting	`integer`	`50`	no
`--star_max_collapsed_junc`	Specifies the maximum number of collapsed junctions	`integer`	`1000000`	no
`--star_max_intron_size`	Specifies the maximum intron size	`integer`	n/a	no
`--seq_center`	Sequencing center information to be added to read group of BAM files.	`string`	n/a	no
`--seq_platform`	Specify the sequencing platform used	`string`	`illumina`	yes
`--save_unaligned`	Where possible, save unaligned reads from aligner to the results directory.	`boolean`	n/a	no
`--save_align_intermeds`	Save the intermediate BAM files from the alignment step.	`boolean`	n/a	no
`--bam_csi_index`	Create a CSI index for BAM files instead of the traditional BAI index. This will be required for genomes with larger chromosome sizes.	`boolean`	n/a	no
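
The `--bam_csi_index` option matters because the BAI format cannot address coordinates beyond 2^29 bases (about 512 Mbp). The sketch below scans a FASTA index (`.fai`) for contigs above that limit; the function name `contigs_needing_csi` and the example contigs are hypothetical, and the exact boundary handling is an approximation.

```python
import io

BAI_MAX_CONTIG_LENGTH = 2**29  # 536,870,912 bp: coordinate limit of the BAI format


def contigs_needing_csi(fai_handle):
    """Return names of contigs in a .fai index that are too long for BAI."""
    too_long = []
    for line in fai_handle:
        if not line.strip():
            continue
        # .fai columns: name, length, offset, linebases, linewidth
        name, length = line.split("\t")[:2]
        if int(length) > BAI_MAX_CONTIG_LENGTH:
            too_long.append(name)
    return too_long


example_fai = io.StringIO(
    "chr1\t248956422\t112\t70\t71\n"
    "big_chr\t600000000\t252513167\t70\t71\n"
)
print(contigs_needing_csi(example_fai))  # ['big_chr']
```

If the returned list is non-empty, enabling `--bam_csi_index` is the safer choice.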

Postprocessing of alignment

Name	Description	Type	Default	Required
`--remove_duplicates`	Specify whether to remove duplicates from the BAM during Picard MarkDuplicates step.	`boolean`	n/a	no

Variant calling

Name	Description	Type	Default	Required
`--gatk_hc_call_conf`	The minimum phred-scaled confidence threshold at which variants should be called.	`integer`	`20`	no
`--generate_gvcf`	Enable generation of per-sample GVCFs in addition to the VCFs.	`boolean`	n/a	no
`--gatk_interval_scatter_count`	Number of chunks into which the gene interval list is split so that GATK HaplotypeCaller can run in parallel	`integer`	`25`	no
`--no_intervals`	Do not use gene interval file during variant calling	`boolean`	n/a	no
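
The idea behind `--gatk_interval_scatter_count` can be sketched as splitting a list of intervals into a fixed number of chunks, each of which can then be processed by its own HaplotypeCaller task. This is a simplification under stated assumptions: the function name `scatter_intervals` is hypothetical, and the sketch balances chunks by interval count, whereas the real interval-splitting tool may balance by number of bases.

```python
def scatter_intervals(intervals, scatter_count=25):
    """Split a list of intervals into at most `scatter_count` contiguous chunks."""
    if scatter_count < 1:
        raise ValueError("scatter_count must be at least 1")
    n_chunks = min(scatter_count, len(intervals))
    chunks = []
    start = 0
    for i in range(n_chunks):
        # Spread the remainder over the first chunks so sizes differ by at most one.
        size = len(intervals) // n_chunks + (1 if i < len(intervals) % n_chunks else 0)
        chunks.append(intervals[start:start + size])
        start += size
    return chunks


# 60 made-up gene intervals as (contig, start, end) tuples
genes = [("chr1", 100 * i, 100 * i + 50) for i in range(60)]
chunks = scatter_intervals(genes, scatter_count=25)
print(len(chunks), sum(len(c) for c in chunks))  # 25 60
```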

Variant filtering

Name	Description	Type	Default	Required
`--gatk_vf_qd_filter`	Value to be used for the QualByDepth (QD) filter	`number`	`2`	no
`--gatk_vf_fs_filter`	Value to be used for the FisherStrand (FS) filter	`number`	`30`	no
`--gatk_vf_window_size`	The window size (in bases) in which to evaluate clustered SNPs.	`integer`	`35`	no
`--gatk_vf_cluster_size`	The number of SNPs which make up a cluster. Must be at least 2.	`integer`	`3`	no
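
The four filtering parameters above can be read as two threshold checks plus a clustering rule. The sketch below illustrates that reading with the table defaults. It rests on assumptions not stated in the table: that variants fail when QD is below its threshold or FS is above its threshold (the usual GATK hard-filtering convention), that the window boundary is inclusive, and that all positions lie on one contig. The function names are hypothetical and this is not GATK's implementation.

```python
def hard_filter_flags(qd, fs, qd_threshold=2.0, fs_threshold=30.0):
    """Return the names of the hard filters a variant would fail (assumed directions)."""
    flags = []
    if qd < qd_threshold:   # low quality relative to depth
        flags.append("QD")
    if fs > fs_threshold:   # strong strand bias
        flags.append("FS")
    return flags


def clustered_snps(positions, window_size=35, cluster_size=3):
    """Return positions belonging to a run of `cluster_size` SNPs within the window."""
    if cluster_size < 2:
        raise ValueError("cluster_size must be at least 2")
    positions = sorted(positions)
    clustered = set()
    for i in range(len(positions) - cluster_size + 1):
        run = positions[i:i + cluster_size]
        if run[-1] - run[0] <= window_size:  # inclusive boundary is an assumption
            clustered.update(run)
    return sorted(clustered)


print(hard_filter_flags(qd=1.5, fs=42.0))         # ['QD', 'FS']
print(clustered_snps([100, 110, 130, 500, 900]))  # [100, 110, 130]
```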

Variant Annotation

Name	Description	Type	Default	Required
`--vep_cache`	Path to VEP cache.	`string`	`s3://annotation-cache/vep_cache/`	no
`--snpeff_cache`	Path to snpEff cache.	`string`	`s3://annotation-cache/snpeff_cache/`	no
`--vep_include_fasta`	Allow usage of fasta file for annotation with VEP	`boolean`	n/a	no
`--vep_dbnsfp`	Enable the use of the VEP dbNSFP plugin.	`boolean`	n/a	no
`--dbnsfp`	Path to dbNSFP processed file.	`string`	n/a	no
`--dbnsfp_tbi`	Path to dbNSFP tabix indexed file.	`string`	n/a	no
`--dbnsfp_consequence`	Consequence to annotate with	`string`	n/a	no
`--dbnsfp_fields`	Fields to annotate with	`string`	`rs_dbSNP,HGVSc_VEP,HGVSp_VEP,1000Gp3_EAS_AF,1000Gp3_AMR_AF,LRT_score,GERP++_RS,gnomAD_exomes_AF`	no
`--vep_loftee`	Enable the use of the VEP LOFTEE plugin.	`boolean`	n/a	no
`--vep_spliceai`	Enable the use of the VEP SpliceAI plugin.	`boolean`	n/a	no
`--spliceai_snv`	Path to spliceai raw scores snv file.	`string`	n/a	no
`--spliceai_snv_tbi`	Path to spliceai raw scores snv tabix indexed file.	`string`	n/a	no
`--spliceai_indel`	Path to spliceai raw scores indel file.	`string`	n/a	no
`--spliceai_indel_tbi`	Path to spliceai raw scores indel tabix indexed file.	`string`	n/a	no
`--vep_spliceregion`	Enable the use of the VEP SpliceRegion plugin.	`boolean`	n/a	no
`--vep_custom_args`	Add an extra custom argument to VEP.	`string`	`--everything --filter_common --per_gene --total_length --offline --format vcf`	no
`--outdir_cache`	The output directory where the cache will be saved. You have to use absolute paths to storage on Cloud infrastructure.	`string`	n/a	no
`--vep_out_format`	VEP output-file format.	`string`	`vcf`	no
`--bcftools_annotations`	A vcf file containing custom annotations to be used with bcftools annotate. Needs to be bgzipped.	`string`	n/a	no
`--bcftools_annotations_tbi`	Index file for `bcftools_annotations`	`string`	n/a	no
`--bcftools_columns`	Optional text file with list of columns to use from `bcftools_annotations`, one name per row	`string`	n/a	no
`--bcftools_header_lines`	Text file with the header lines of `bcftools_annotations`	`string`	n/a	no

Pipeline stage options

Name	Description	Type	Default	Required
`--skip_baserecalibration`	Skip the process of base recalibration steps i.e., GATK BaseRecalibrator and GATK ApplyBQSR.	`boolean`	n/a	no
`--skip_intervallisttools`	Skip the process of preparing interval lists for the GATK variant calling step	`boolean`	n/a	no
`--skip_variantfiltration`	Skip variant filtering of GATK	`boolean`	n/a	no
`--skip_variantannotation`	Skip variant annotation	`boolean`	n/a	no
`--skip_multiqc`	Skip MultiQC reports	`boolean`	n/a	no
`--skip_exon_bed_check`	Skip the check of the exon bed	`boolean`	n/a	no

General reference genome options

Name	Description	Type	Default	Required
`--igenomes_base`	The base path to the igenomes reference files	`string`	`s3://ngi-igenomes/igenomes/`	no
`--igenomes_ignore`	Do not load the iGenomes reference config.	`boolean`	n/a	no
`--save_reference`	Save built references.	`boolean`	n/a	no
`--download_cache`	Download annotation cache.	`boolean`	n/a	no

Reference genome options

Name	Description	Type	Default	Required
`--genome`	Name of iGenomes reference.	`string`	`GRCh38`	no
`--fasta`	Path to FASTA genome file.	`string`	n/a	no
`--dict`	Path to FASTA dictionary file.	`string`	n/a	no
`--fasta_fai`	Path to FASTA reference index.	`string`	n/a	no
`--gtf`	Path to GTF annotation file.	`string`	n/a	no
`--gff`	Path to GFF3 annotation file.	`string`	n/a	no
`--exon_bed`	Path to BED file containing exon intervals. This will be created from the GTF file if not specified.	`string`	n/a	no
`--read_length`	Read length	`number`	`150`	no
`--known_indels`	Path to known indels file.	`string`	n/a	no
`--known_indels_tbi`	Path to known indels file index.	`string`	n/a	no
`--dbsnp`	Path to dbsnp file.	`string`	n/a	no
`--dbsnp_tbi`	Path to dbsnp index.	`string`	n/a	no
`--snpeff_db`	snpEff DB version.	`string`	n/a	no
`--vep_genome`	VEP genome.	`string`	n/a	no
`--vep_species`	VEP species.	`string`	n/a	no
`--vep_cache_version`	VEP cache version.	`integer`	n/a	no
`--feature_type`	Type of feature to parse from annotation file	`string`	`exon`	no

Institutional config options

Name	Description	Type	Default	Required
`--custom_config_version`	Git commit id for Institutional configs.	`string`	`master`	no
`--custom_config_base`	Base directory for Institutional configs.	`string`	`https://raw.githubusercontent.com/nf-core/configs/master`	no
`--config_profile_name`	Institutional config name.	`string`	n/a	no
`--config_profile_description`	Institutional config description.	`string`	n/a	no
`--config_profile_contact`	Institutional config contact information.	`string`	n/a	no
`--config_profile_url`	Institutional config URL link.	`string`	n/a	no

Generic options

Name	Description	Type	Default	Required
`--version`	Display version and exit.	`boolean`	n/a	no
`--publish_dir_mode`	Method used to save pipeline results to output directory.	`string`	`copy`	no
`--email`	Email address for completion summary.	`string`	n/a	no
`--email_on_fail`	Email address for completion summary, only when pipeline fails.	`string`	n/a	no
`--plaintext_email`	Send plain-text email instead of HTML.	`boolean`	n/a	no
`--max_multiqc_email_size`	File size limit when attaching MultiQC reports to summary emails.	`string`	`25.MB`	no
`--monochrome_logs`	Do not use coloured log outputs.	`boolean`	n/a	no
`--hook_url`	Incoming hook URL for messaging service	`string`	n/a	no
`--multiqc_config`	Custom config file to supply to MultiQC.	`string`	n/a	no
`--multiqc_logo`	Custom logo file to supply to MultiQC. File name must also be set in the MultiQC config file	`string`	n/a	no
`--multiqc_methods_description`	Custom MultiQC yaml file containing HTML including a methods description.	`string`	n/a	no
`--multiqc_title`	MultiQC report title. Printed as page header, used for filename if not otherwise specified.	`string`	n/a	no
`--validate_params`	Boolean whether to validate parameters against the schema at runtime	`boolean`	`True`	no
`--modules_testdata_base_path`	Base URL or local path to location of modules test dataset files	`string`	`https://raw.githubusercontent.com/nf-core/test-datasets/modules/data/`	no
`--pipelines_testdata_base_path`	Base URL or local path to location of pipeline test dataset files	`string`	`https://raw.githubusercontent.com/nf-core/test-datasets/rnavar/data/`	no
`--trace_report_suffix`	Suffix to add to the trace report filename. Default is the date and time in the format yyyy-MM-dd_HH-mm-ss.	`string`	n/a	no
`--help`	Display the help message.	`boolean`	n/a	no
`--help_full`	Display the full detailed help message.	`boolean`	n/a	no
`--show_hidden`	Display hidden parameters in the help message (only works when --help or --help_full are provided).	`boolean`	n/a	no
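
The default for `--trace_report_suffix` is described with the pattern yyyy-MM-dd_HH-mm-ss. For readers scripting around the reports, a minimal sketch of producing a string in the same layout follows; the function name `default_trace_suffix` is hypothetical and the pipeline itself does not use this code.

```python
from datetime import datetime


def default_trace_suffix(now=None):
    """Format a timestamp as yyyy-MM-dd_HH-mm-ss (year-month-day_hour-minute-second)."""
    now = now or datetime.now()
    return now.strftime("%Y-%m-%d_%H-%M-%S")


print(default_trace_suffix(datetime(2024, 3, 5, 14, 7, 9)))  # 2024-03-05_14-07-09
```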

Workflows

Name	Description	Entry
`NFCORE_RNAVAR`	n/a	no
(entry)	n/a	yes
`RNAVAR`	Main workflow for RNA variant calling analysis. This workflow performs end-to-end RNA-seq variant calling, including: quality control with FastQC; read alignment with STAR; duplicate marking with Picard; splitting of N CIGAR reads for RNA-seq data; base quality score recalibration (BQSR); variant calling with GATK HaplotypeCaller; variant filtering; variant annotation with SnpEff and VEP; and HLA typing with seq2HLA (optional). The workflow supports multiple input types including FASTQ, BAM, CRAM, and VCF files.	no
`BAM_STATS_SAMTOOLS`	Produces comprehensive statistics from SAM/BAM/CRAM file	no
`FASTQ_ALIGN_STAR`	Align reads to a reference genome using STAR then sort with samtools	no
`VCF_ANNOTATE_SNPEFF`	Perform annotation with snpEff and bgzip + tabix index the resulting VCF file	no
`VCF_ANNOTATE_ENSEMBLVEP`	Perform annotation with ensemblvep and bgzip + tabix index the resulting VCF file	no
`BAM_MARKDUPLICATES_PICARD`	Picard MarkDuplicates, index BAM file and run samtools stats, flagstat and idxstats	no
`BAM_SORT_STATS_SAMTOOLS`	Sort SAM/BAM/CRAM file	no
`PREPARE_ALIGNMENT`	n/a	no
`SPLITNCIGAR`	Split reads that contain N CIGAR operations for RNA-seq variant calling. This subworkflow handles the GATK SplitNCigarReads step which is essential for RNA-seq variant calling. It splits reads that span introns (N in CIGAR) and reassigns mapping qualities to meet GATK requirements. The workflow processes BAM files in parallel across genomic intervals, then merges and indexes the results for efficient downstream processing.	no
`RECALIBRATE`	Apply base quality score recalibration (BQSR) to BAM files. This subworkflow applies the BQSR model generated by GATK BaseRecalibrator to adjust base quality scores in BAM files. Recalibrated quality scores improve the accuracy of variant calling by correcting systematic errors in the original quality scores assigned by the sequencing machine. Optionally generates alignment statistics using samtools stats for QC.	no
`DOWNLOAD_CACHE_SNPEFF_VEP`	n/a	no
`PIPELINE_INITIALISATION`	Initialize the nf-core/rnavar pipeline. Performs all setup tasks required before running the main workflow: display version information if requested; validate parameters against the schema; check Conda channel configuration; parse and validate the input samplesheet; generate parameter summary for logging.	no
`PIPELINE_COMPLETION`	Handle pipeline completion tasks. Executes cleanup and notification tasks when the pipeline finishes: send completion email with run summary; generate completion summary to stdout; send notifications to messaging platforms (Slack, Teams, etc.); log error messages for failed runs.	no
`ANNOTATION_CACHE_INITIALISATION`	n/a	no
`PREPARE_GENOME`	n/a	no
`VCF_ANNOTATE_ALL`	Annotate variants using multiple annotation tools. This subworkflow provides flexible variant annotation using one or more tools: SnpEff (functional annotation and effect prediction); VEP, the Ensembl Variant Effect Predictor (comprehensive variant annotation); BCFtools annotate (add custom annotations from external files); Merge (combined SnpEff + VEP annotation). The tools to use are specified via the `tools` parameter as a comma-separated list (e.g., "snpeff,vep" or "merge").	no
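
The `tools` parameter described for `VCF_ANNOTATE_ALL` is a comma-separated list, and 'merge' implies that both SnpEff and VEP run. A minimal sketch of that resolution logic follows; the function name `tools_to_run` is hypothetical, the accepted values are taken from the `--tools` description, and the real pipeline's parsing may differ in detail.

```python
KNOWN_TOOLS = {"seq2hla", "bcfann", "snpeff", "vep", "merge"}


def tools_to_run(tools_param):
    """Return the sorted list of tools implied by a comma-separated `tools` value."""
    requested = {t.strip().lower() for t in (tools_param or "").split(",") if t.strip()}
    unknown = requested - KNOWN_TOOLS
    if unknown:
        raise ValueError(f"unknown tools: {sorted(unknown)}")
    if "merge" in requested:
        # 'merge' means both annotators run; it is not a tool in its own right here.
        requested = (requested - {"merge"}) | {"snpeff", "vep"}
    return sorted(requested)


print(tools_to_run("merge"))       # ['snpeff', 'vep']
print(tools_to_run("bcfann,vep"))  # ['bcfann', 'vep']
```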

`NFCORE_RNAVAR` Inputs

Name	Description
`samplesheet`	n/a
`align`	n/a

`NFCORE_RNAVAR` Outputs

Name	Description
`?`	n/a
`?`	n/a

`RNAVAR` Inputs

Name	Description
`input`	n/a
`bcftools_annotations`	n/a
`bcftools_annotations_tbi`	n/a
`bcftools_columns`	n/a
`bcftools_header_lines`	n/a
`dbsnp`	n/a
`dbsnp_tbi`	n/a
`dict`	n/a
`exon_bed`	n/a
`fasta`	n/a
`fasta_fai`	n/a
`gtf`	n/a
`known_sites`	n/a
`known_sites_tbi`	n/a
`star_index`	n/a
`snpeff_cache`	n/a
`snpeff_db`	n/a
`vep_genome`	n/a
`vep_species`	n/a
`vep_cache_version`	n/a
`vep_include_fasta`	n/a
`vep_cache`	n/a
`vep_extra_files`	n/a
`seq_center`	n/a
`seq_platform`	n/a
`aligner`	n/a
`bam_csi_index`	n/a
`extract_umi`	n/a
`generate_gvcf`	n/a
`skip_multiqc`	n/a
`skip_baserecalibration`	n/a
`skip_intervallisttools`	n/a
`skip_variantannotation`	n/a
`skip_variantfiltration`	n/a
`star_ignore_sjdbgtf`	n/a
`tools`	n/a

`RNAVAR` Outputs

Name	Description
`?`	n/a
`?`	n/a

`BAM_STATS_SAMTOOLS` Inputs

Name	Description
`ch_bam_bai`	The input channel containing the BAM/CRAM and its index Structure: [ val(meta), path(bam), path(bai) ]
`ch_fasta`	Reference genome fasta file Structure: [ path(fasta) ]

`BAM_STATS_SAMTOOLS` Outputs

Name	Description
`stats`	File containing samtools stats output Structure: [ val(meta), path(stats) ]
`flagstat`	File containing samtools flagstat output Structure: [ val(meta), path(flagstat) ]
`idxstats`	File containing samtools idxstats output Structure: [ val(meta), path(idxstats)]
`versions`	Files containing software versions Structure: [ path(versions.yml) ]

`FASTQ_ALIGN_STAR` Inputs

Name	Description
`ch_reads`	List of input FastQ files of size 1 and 2 for single-end and paired-end data, respectively. Structure: [ val(meta), [ path(reads) ] ]
`ch_index`	STAR genome index
`ch_gtf`	GTF file used to set the splice junctions with the --sjdbGTFfile flag
`val_star_ignore_sjdbgtf`	If true, the GTF file is ignored and the --sjdbGTFfile flag is not set
`val_seq_platform`	Sequencing platform to be added to the bam header using the --outSAMattrRGline flag
`val_seq_center`	Sequencing center to be added to the bam header using the --outSAMattrRGline flag
`ch_fasta`	Reference genome fasta file
`ch_transcripts_fasta`	Optional reference genome fasta file

`FASTQ_ALIGN_STAR` Outputs

Name	Description
`orig_bam`	Output BAM file containing read alignments Structure: [ val(meta), path(bam) ]
`log_final`	STAR final log file Structure: [ val(meta), path(log_final) ]
`log_out`	STAR log out file Structure: [ val(meta), path(log_out) ]
`log_progress`	STAR log progress file Structure: [ val(meta), path(log_progress) ]
`bam_sorted`	Sorted BAM file of read alignments (optional) Structure: [ val(meta), path(bam) ]
`orig_bam_transcript`	Output BAM file of transcriptome alignment (optional) Structure: [ val(meta), path(bam) ]
`fastq`	Unmapped FastQ files (optional) Structure: [ val(meta), path(fastq) ]
`tab`	STAR output tab file(s) (optional) Structure: [ val(meta), path(tab) ]
`bam`	BAM file ordered by samtools Structure: [ val(meta), path(bam) ]
`bai`	BAI index of the ordered BAM file Structure: [ val(meta), path(bai) ]
`stats`	File containing samtools stats output Structure: [ val(meta), path(stats) ]
`flagstat`	File containing samtools flagstat output Structure: [ val(meta), path(flagstat) ]
`idxstats`	File containing samtools idxstats output Structure: [ val(meta), path(idxstats) ]
`bam_transcript`	Transcriptome-level BAM file ordered by samtools (optional) Structure: [ val(meta), path(bam) ]
`bai_transcript`	Transcriptome-level BAI index of the ordered BAM file (optional) Structure: [ val(meta), path(bai) ]
`stats_transcript`	Transcriptome-level file containing samtools stats output (optional) Structure: [ val(meta), path(stats) ]
`flagstat_transcript`	Transcriptome-level file containing samtools flagstat output (optional) Structure: [ val(meta), path(flagstat) ]
`idxstats_transcript`	Transcriptome-level file containing samtools idxstats output (optional) Structure: [ val(meta), path(idxstats) ]
`versions`	File containing software versions

`VCF_ANNOTATE_SNPEFF` Inputs

Name	Description
`ch_vcf`	vcf file Structure: [ val(meta), path(vcf) ]
`val_snpeff_db`	db version to use
`ch_snpeff_cache`	path to root cache folder for snpEff (optional) Structure: [ path(cache) ]

`VCF_ANNOTATE_SNPEFF` Outputs

Name	Description
`vcf_tbi`	Compressed vcf file + tabix index Structure: [ val(meta), path(vcf), path(tbi) ]
`reports`	html reports Structure: [ path(html) ]
`summary`	csv summary reports Structure: [ path(csv) ]
`genes_txt`	txt reports with per-gene summaries Structure: [ path(txt) ]
`versions`	Files containing software versions Structure: [ path(versions.yml) ]

`VCF_ANNOTATE_ENSEMBLVEP` Inputs

Name	Description
`ch_vcf`	vcf file to annotate Structure: [ val(meta), path(vcf), [path(custom_file1), path(custom_file2)... (optional)] ]
`ch_fasta`	Reference genome fasta file (optional) Structure: [ val(meta2), path(fasta) ]
`val_genome`	genome to use
`val_species`	species to use
`val_cache_version`	cache version to use
`ch_cache`	the root cache folder for ensemblvep (optional) Structure: [ val(meta3), path(cache) ]
`ch_extra_files`	any extra files needed by plugins for ensemblvep (optional) Structure: [ path(file1), path(file2)... ]

`VCF_ANNOTATE_ENSEMBLVEP` Outputs

Name	Description
`vcf_tbi`	Compressed vcf file + tabix index Structure: [ val(meta), path(vcf), path(tbi) ]
`json`	json file Structure: [ val(meta), path(json) ]
`tab`	tab file Structure: [ val(meta), path(tab) ]
`reports`	html reports
`versions`	File containing software versions

`BAM_MARKDUPLICATES_PICARD` Inputs

Name	Description
`ch_reads`	Sequence reads in BAM/CRAM/SAM format Structure: [ val(meta), path(reads) ]
`ch_fasta`	Reference genome fasta file required for CRAM input Structure: [ path(fasta) ]
`ch_fai`	Index of the reference genome fasta file Structure: [ path(fai) ]

`BAM_MARKDUPLICATES_PICARD` Outputs

Name	Description
`bam`	processed BAM/SAM file Structure: [ val(meta), path(bam) ]
`bai`	BAM/SAM samtools index Structure: [ val(meta), path(bai) ]
`cram`	processed CRAM file Structure: [ val(meta), path(cram) ]
`crai`	CRAM samtools index Structure: [ val(meta), path(crai) ]
`csi`	CSI samtools index Structure: [ val(meta), path(csi) ]
`stats`	File containing samtools stats output Structure: [ val(meta), path(stats) ]
`flagstat`	File containing samtools flagstat output Structure: [ val(meta), path(flagstat) ]
`idxstats`	File containing samtools idxstats output Structure: [ val(meta), path(idxstats) ]
`versions`	Files containing software versions Structure: [ path(versions.yml) ]

`BAM_SORT_STATS_SAMTOOLS` Inputs

Name	Description
`meta`	Groovy Map containing sample information e.g. [ id:'test', single_end:false ]
`bam`	BAM/CRAM/SAM file
`fasta`	Reference genome fasta file

`BAM_SORT_STATS_SAMTOOLS` Outputs

Name	Description
`meta`	Groovy Map containing sample information e.g. [ id:'test', single_end:false ]
`bam`	Sorted BAM/CRAM/SAM file
`bai`	BAM/CRAM/SAM index file
`crai`	BAM/CRAM/SAM index file
`stats`	File containing samtools stats output
`flagstat`	File containing samtools flagstat output
`idxstats`	File containing samtools idxstats output
`versions`	File containing software versions

`PREPARE_ALIGNMENT` Inputs

Name	Description
`cram`	n/a
`bam`	n/a

`PREPARE_ALIGNMENT` Outputs

Name	Description
`bam`	n/a
`versions`	n/a

`SPLITNCIGAR` Inputs

Name	Description
`bam`	n/a
`fasta`	n/a
`fai`	n/a
`dict`	n/a
`intervals`	n/a

`SPLITNCIGAR` Outputs

Name	Description
`bam_bai`	n/a
`versions`	n/a

`RECALIBRATE` Inputs

Name	Description
`skip_samtools`	n/a
`bam`	n/a
`dict`	n/a
`fai`	n/a
`fasta`	n/a

`RECALIBRATE` Outputs

Name	Description
`bam`	n/a
`qc`	n/a
`versions`	n/a

`DOWNLOAD_CACHE_SNPEFF_VEP` Inputs

Name	Description
`ensemblvep_info`	n/a
`snpeff_info`	n/a

`DOWNLOAD_CACHE_SNPEFF_VEP` Outputs

Name	Description
`ensemblvep_cache`	n/a
`snpeff_cache`	n/a

`PIPELINE_INITIALISATION` Inputs

Name	Description
`version`	n/a
`validate_params`	n/a
`nextflow_cli_args`	n/a
`outdir`	n/a
`input`	n/a
`help`	n/a
`help_full`	n/a
`show_hidden`	n/a

`PIPELINE_INITIALISATION` Outputs

Name	Description
`samplesheet`	n/a
`align`	n/a
`versions`	n/a

`PIPELINE_COMPLETION` Inputs

Name	Description
`email`	n/a
`email_on_fail`	n/a
`plaintext_email`	n/a
`outdir`	n/a
`monochrome_logs`	n/a
`hook_url`	n/a
`multiqc_report`	n/a

`PIPELINE_COMPLETION` Outputs

Name	Description
`<none>`	n/a

`ANNOTATION_CACHE_INITIALISATION` Inputs

Name	Description
`snpeff_enabled`	n/a
`snpeff_cache`	n/a
`snpeff_db`	n/a
`vep_enabled`	n/a
`vep_cache`	n/a
`vep_species`	n/a
`vep_cache_version`	n/a
`vep_genome`	n/a
`vep_custom_args`	n/a
`help_message`	n/a

`ANNOTATION_CACHE_INITIALISATION` Outputs

Name	Description
`?`	n/a
`?`	n/a

`PREPARE_GENOME` Inputs

Name	Description
`bcftools_annotations`	n/a
`bcftools_annotations_tbi`	n/a
`dbsnp`	n/a
`dbsnp_tbi`	n/a
`dict`	n/a
`exon_bed`	n/a
`fasta`	n/a
`fasta_fai`	n/a
`gff`	n/a
`gtf`	n/a
`known_indels`	n/a
`known_indels_tbi`	n/a
`star_index`	n/a
`feature_type`	n/a
`skip_exon_bed_check`	n/a
`align`	n/a

`PREPARE_GENOME` Outputs

Name	Description
`bcfann`	n/a
`bcfann_tbi`	n/a
`dbsnp`	n/a
`dbsnp_tbi`	n/a
`dict`	n/a
`exon_bed`	n/a
`fasta`	n/a
`fasta_fai`	n/a
`gtf`	n/a
`known_indels`	n/a
`known_indels_tbi`	n/a
`known_sites`	n/a
`known_sites_tbi`	n/a
`star_index`	n/a
`versions`	n/a

`VCF_ANNOTATE_ALL` Inputs

Name	Description
`vcf`	n/a
`fasta`	n/a
`tools`	n/a
`snpeff_db`	n/a
`snpeff_cache`	n/a
`vep_genome`	n/a
`vep_species`	n/a
`vep_cache_version`	n/a
`vep_cache`	n/a
`vep_extra_files`	n/a
`bcftools_annotations`	n/a
`bcftools_annotations_index`	n/a
`bcftools_columns`	n/a
`bcftools_header_lines`	n/a

`VCF_ANNOTATE_ALL` Outputs

Name	Description
`?`	n/a
`?`	n/a
`?`	n/a
`?`	n/a

Processes

Name	Description
`MULTIQC`	Aggregate results from bioinformatics analyses across many samples into a single report
`UNTAR`	Extract files from tar, tar.gz, tar.bz2, tar.xz archives
`MOSDEPTH`	Calculates genome-wide sequencing coverage.
`SEQ2HLA`	Precision HLA typing and expression from RNA-seq data using seq2HLA
`FASTQC`	Run FastQC on sequenced reads
`GFFREAD`	Validate, filter, convert and perform various other operations on GFF files
`GUNZIP`	Compresses and decompresses files.
`GATK4_COMBINEGVCFS`	Combine per-sample gVCF files produced by HaplotypeCaller into a multi-sample gVCF file
`GATK4_INDEXFEATUREFILE`	Creates an index for a feature file, e.g. VCF or BED file.
`GATK4_VARIANTFILTRATION`	Filter variants
`GATK4_CREATESEQUENCEDICTIONARY`	Creates a sequence dictionary for a reference sequence
`GATK4_SPLITNCIGARREADS`	Splits reads that contain Ns in their cigar string
`GATK4_HAPLOTYPECALLER`	Call germline SNPs and indels via local re-assembly of haplotypes
`GATK4_INTERVALLISTTOOLS`	Splits the interval list file into unique, equally-sized interval files and places them under a directory
`GATK4_BASERECALIBRATOR`	Generate recalibration table for Base Quality Score Recalibration (BQSR)
`GATK4_APPLYBQSR`	Apply base quality score recalibration (BQSR) to a bam file
`GATK4_BEDTOINTERVALLIST`	Creates an interval list from a bed file and a reference dict
`GATK4_MERGEVCFS`	Merges several vcf files
`UMITOOLS_EXTRACT`	Extracts the UMI barcode from a read and adds it to the read name, leaving any sample barcode in place
`SAMTOOLS_SORT`	Sort SAM/BAM/CRAM file
`SAMTOOLS_MERGE`	Merge BAM or CRAM file
`SAMTOOLS_IDXSTATS`	Reports alignment summary statistics for a BAM/CRAM/SAM file
`SAMTOOLS_FAIDX`	Index FASTA file, and optionally generate a file of chromosome sizes
`SAMTOOLS_INDEX`	Index SAM/BAM/CRAM file
`SAMTOOLS_FLAGSTAT`	Counts the number of alignments in a BAM/CRAM/SAM file for each FLAG type
`SAMTOOLS_STATS`	Produces comprehensive statistics from SAM/BAM/CRAM file
`SAMTOOLS_CONVERT`	convert and then index CRAM -> BAM or BAM -> CRAM file
`BEDTOOLS_SORT`	Sorts a feature file by chromosome and other criteria.
`BEDTOOLS_MERGE`	combines overlapping or “book-ended” features in an interval file into a single feature which spans all of the combined features.
`STAR_GENOMEGENERATE`	Create index for STAR
`STAR_ALIGN`	Align reads to a reference genome using STAR
`STAR_INDEXVERSION`	Get the minimal allowed index version from STAR
`SNPEFF_SNPEFF`	Genetic variant annotation and functional effect prediction toolbox
`SNPEFF_DOWNLOAD`	Genetic variant annotation and functional effect prediction toolbox
`ENSEMBLVEP_VEP`	Ensembl Variant Effect Predictor (VEP). The output-file-format is controlled through `task.ext.args`.
`ENSEMBLVEP_DOWNLOAD`	Ensembl Variant Effect Predictor (VEP). The cache downloading options are controlled through `task.ext.args`.
`BCFTOOLS_ANNOTATE`	Add or remove annotations.
`PICARD_MARKDUPLICATES`	Locate and tag duplicate reads in a BAM file
`TABIX_TABIX`	create tabix index from a sorted bgzip tab-delimited genome file
`TABIX_BGZIPTABIX`	bgzip a sorted tab-delimited genome file and then create tabix index
`CAT_FASTQ`	Concatenates fastq files
`REMOVE_UNKNOWN_REGIONS`	n/a
`GTF2BED`	Convert GTF annotation file to BED format. Extracts genomic features (exons, transcripts, or genes) from a GTF file and outputs them in BED format for use with interval-based tools. The output BED file uses 0-based coordinates (BED standard) converted from the 1-based GTF coordinates.
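
The coordinate conversion described for `GTF2BED` comes down to subtracting one from the start and leaving the end unchanged, because GTF intervals are 1-based and closed while BED intervals are 0-based and half-open. The sketch below shows that conversion for a chosen feature type; the function name `gtf_to_bed` and the example records are hypothetical, and the pipeline's own script may emit additional BED columns.

```python
def gtf_to_bed(gtf_lines, feature_type="exon"):
    """Convert GTF records of one feature type to (chrom, start, end) BED tuples."""
    bed = []
    for line in gtf_lines:
        if line.startswith("#") or not line.strip():
            continue  # skip header comments and blank lines
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 9:
            continue  # not a complete GTF record
        chrom, _source, feature, start, end = fields[:5]
        if feature != feature_type:
            continue
        # 1-based inclusive start -> 0-based start; the end value stays the same.
        bed.append((chrom, int(start) - 1, int(end)))
    return bed


gtf = [
    "#!genome-build GRCh38\n",
    'chr1\tensembl\tgene\t11869\t14409\t.\t+\t.\tgene_id "G1";\n',
    'chr1\tensembl\texon\t11869\t12227\t.\t+\t.\tgene_id "G1";\n',
]
print(gtf_to_bed(gtf))  # [('chr1', 11868, 12227)]
```

The interval length is preserved by the conversion: 12227 - 11869 + 1 bases in GTF terms equals 12227 - 11868 in BED terms.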

`MULTIQC` Outputs

Name	Type	Pattern	Description
`report`	`-`	n/a	n/a
`data`	`-`	n/a	n/a
`plots`	`-`	n/a	n/a

`UNTAR` Inputs

Name	Type	Description
`meta`	`map`	Groovy Map containing sample information e.g. [ id:'test', single_end:false ]
`archive`	`file`	File to be untarred

`UNTAR` Outputs

Name	Type	Pattern	Description
`untar`	`map`	`*/`	Groovy Map containing sample information e.g. [ id:'test', single_end:false ]

`MOSDEPTH` Inputs

Name	Type	Description
`meta`	`map`	Groovy Map containing sample information e.g. [ id:'test', single_end:false ]
`bam`	`file`	Input BAM/CRAM file
`bai`	`file`	Index for BAM/CRAM file
`bed`	`file`	BED file with intersected intervals
`meta2`	`map`	Groovy Map containing bed information e.g. [ id:'test' ]
`fasta`	`file`	Reference genome FASTA file

`MOSDEPTH` Outputs

Name	Type	Pattern	Description
`global_txt`	`file`	`*.{global.dist.txt}`	Text file with global cumulative coverage distribution
`summary_txt`	`file`	`*.{summary.txt}`	Text file with summary mean depths per chromosome and regions
`regions_txt`	`file`	`*.{region.dist.txt}`	Text file with region cumulative coverage distribution
`per_base_d4`	`file`	`*.{per-base.d4}`	D4 file with per-base coverage
`per_base_bed`	`file`	`*.{per-base.bed.gz}`	BED file with per-base coverage
`per_base_csi`	`file`	`*.{per-base.bed.gz.csi}`	Index file for BED file with per-base coverage
`regions_bed`	`file`	`*.{regions.bed.gz}`	BED file with per-region coverage
`regions_csi`	`file`	`*.{regions.bed.gz.csi}`	Index file for BED file with per-region coverage
`quantized_bed`	`file`	`*.{quantized.bed.gz}`	BED file with binned coverage
`quantized_csi`	`file`	`*.{quantized.bed.gz.csi}`	Index file for BED file with binned coverage
`thresholds_bed`	`file`	`*.{thresholds.bed.gz}`	BED file with the number of bases in each region that are covered at or above each threshold
`thresholds_csi`	`file`	`*.{thresholds.bed.gz.csi}`	Index file for BED file with threshold coverage

`SEQ2HLA` Inputs

Name	Type	Description
`meta`	`map`	Groovy Map containing sample information e.g. `[ id:'sample1', single_end:false ]`
`reads`	`file`	Paired-end FASTQ files for RNA-seq data

`SEQ2HLA` Outputs

Name	Type	Pattern	Description
`class1_genotype_2d`	`file`	`*ClassI-class.HLAgenotype2digits`	HLA Class I 2-digit genotype results
`class2_genotype_2d`	`file`	`*ClassII.HLAgenotype2digits`	HLA Class II 2-digit genotype results
`class1_genotype_4d`	`file`	`*ClassI-class.HLAgenotype4digits`	HLA Class I 4-digit genotype results
`class2_genotype_4d`	`file`	`*ClassII.HLAgenotype4digits`	HLA Class II 4-digit genotype results
`class1_bowtielog`	`file`	`*ClassI-class.bowtielog`	HLA Class I Bowtie alignment log
`class2_bowtielog`	`file`	`*ClassII.bowtielog`	HLA Class II Bowtie alignment log
`class1_expression`	`file`	`*ClassI-class.expression`	HLA Class I expression results
`class2_expression`	`file`	`*ClassII.expression`	HLA Class II expression results
`class1_nonclass_genotype_2d`	`file`	`*ClassI-nonclass.HLAgenotype2digits`	HLA Class I non-classical 2-digit genotype results
`ambiguity`	`file`	`*.ambiguity`	HLA typing ambiguity results
`class1_nonclass_genotype_4d`	`file`	`*ClassI-nonclass.HLAgenotype4digits`	HLA Class I non-classical 4-digit genotype results
`class1_nonclass_bowtielog`	`file`	`*ClassI-nonclass.bowtielog`	HLA Class I non-classical Bowtie alignment log
`class1_nonclass_expression`	`file`	`*ClassI-nonclass.expression`	HLA Class I non-classical expression results

`FASTQC` Inputs

Name	Type	Description
`meta`	`map`	Groovy Map containing sample information e.g. [ id:'test', single_end:false ]
`reads`	`file`	List of input FastQ files of size 1 and 2 for single-end and paired-end data, respectively.

`FASTQC` Outputs

Name	Type	Pattern	Description
`html`	`file`	`*_{fastqc.html}`	FastQC report
`zip`	`file`	`*_{fastqc.zip}`	FastQC report archive

`GFFREAD` Inputs

Name	Type	Description
`meta`	`map`	Groovy Map containing meta data e.g. [ id:'test' ]
`gff`	`file`	A reference file in either the GFF3, GFF2 or GTF format.

`GFFREAD` Outputs

Name	Type	Pattern	Description
`gtf`	`file`	`*.{gtf}`	GTF file resulting from the conversion of the GFF input file if '-T' argument is present
`gffread_gff`	`file`	`*.gff3`	GFF3 file resulting from the conversion of the GFF input file if '-T' argument is absent
`gffread_fasta`	`file`	`*.fasta`	Fasta file produced when either of '-w', '-x', '-y' parameters is present

`GUNZIP` Inputs

Name	Type	Description
`meta`	`map`	Optional groovy Map containing meta information e.g. [ id:'test', single_end:false ]
`archive`	`file`	File to be compressed/uncompressed

`GUNZIP` Outputs

Name	Type	Pattern	Description
`gunzip`	`file`	`.`	Compressed/uncompressed file

`GATK4_COMBINEGVCFS` Inputs

Name	Type	Description
`meta`	`map`	Groovy Map containing sample information e.g. [ id:'test' ]
`vcf`	`file`	Compressed VCF files
`vcf_idx`	`file`	VCF Index file

`GATK4_COMBINEGVCFS` Outputs

Name	Type	Pattern	Description
`combined_gvcf`	`file`	`*.combined.g.vcf.gz`	Compressed Combined GVCF file

`GATK4_INDEXFEATUREFILE` Inputs

Name	Type	Description
`meta`	`map`	Groovy Map containing sample information e.g. [ id:'test', single_end:false ]
`feature_file`	`file`	VCF/BED file

`GATK4_INDEXFEATUREFILE` Outputs

Name	Type	Pattern	Description
`index`	`file`	`*.{tbi,idx}`	Index for VCF/BED file

`GATK4_VARIANTFILTRATION` Inputs

Name	Type	Description
`meta`	`map`	Groovy Map containing sample information e.g. [ id:'test']
`vcf`	`list`	List of VCF(.gz) files
`tbi`	`list`	List of VCF file indexes
`meta2`	`map`	Groovy Map containing reference information e.g. [ id:'genome' ]
`fasta`	`file`	Fasta file of reference genome
`meta3`	`map`	Groovy Map containing reference information e.g. [ id:'genome' ]
`fai`	`file`	Index of fasta file
`meta4`	`map`	Groovy Map containing reference information e.g. [ id:'genome' ]
`dict`	`file`	Sequence dictionary of fasta file
`meta5`	`map`	Groovy Map containing reference information e.g. [ id:'genome' ]
`gzi`	`file`	Genome index file only needed when the genome file was compressed with the BGZF algorithm.

`GATK4_VARIANTFILTRATION` Outputs

Name	Type	Pattern	Description
`vcf`	`file`	`*.vcf.gz`	Compressed VCF file
`tbi`	`file`	`*.vcf.gz.tbi`	Index of VCF file

`GATK4_CREATESEQUENCEDICTIONARY` Inputs

Name	Type	Description
`meta`	`map`	Groovy Map containing reference information e.g. [ id:'genome' ]
`fasta`	`file`	Input fasta file

`GATK4_CREATESEQUENCEDICTIONARY` Outputs

Name	Type	Pattern	Description
`dict`	`file`	`*.{dict}`	gatk dictionary file

`GATK4_SPLITNCIGARREADS` Inputs

Name	Type	Description
`meta`	`map`	Groovy Map containing sample information e.g. [ id:'test']
`bam`	`list`	BAM/SAM/CRAM file containing reads
`bai`	`list`	BAI/SAI/CRAI index file (optional)
`intervals`	`file`	Bed file with the genomic regions included in the library (optional)
`meta2`	`map`	Groovy Map containing reference information e.g. [ id:'reference' ]
`fasta`	`file`	The reference fasta file
`meta3`	`map`	Groovy Map containing reference information e.g. [ id:'reference' ]
`fai`	`file`	Index of reference fasta file
`meta4`	`map`	Groovy Map containing reference information e.g. [ id:'reference' ]
`dict`	`file`	GATK sequence dictionary

`GATK4_SPLITNCIGARREADS` Outputs

Name	Type	Pattern	Description
`bam`	`file`	`*.{bam,sam,cram}`	Output file with split reads (BAM/SAM/CRAM)

`GATK4_HAPLOTYPECALLER` Inputs

Name	Type	Description
`meta`	`map`	Groovy Map containing sample information e.g. [ id:'test', single_end:false ]
`input`	`file`	BAM/CRAM file from alignment
`input_index`	`file`	BAI/CRAI file from alignment
`intervals`	`file`	Bed file with the genomic regions included in the library (optional)
`dragstr_model`	`file`	Text file containing the DragSTR model of the used BAM/CRAM file (optional)
`meta2`	`map`	Groovy Map containing reference information e.g. [ id:'test_reference' ]
`fasta`	`file`	The reference fasta file
`meta3`	`map`	Groovy Map containing reference information e.g. [ id:'test_reference' ]
`fai`	`file`	Index of reference fasta file
`meta4`	`map`	Groovy Map containing reference information e.g. [ id:'test_reference' ]
`dict`	`file`	GATK sequence dictionary
`meta5`	`map`	Groovy Map containing dbsnp information e.g. [ id:'test_dbsnp' ]
`dbsnp`	`file`	VCF file containing known sites (optional)
`meta6`	`map`	Groovy Map containing dbsnp information e.g. [ id:'test_dbsnp' ]
`dbsnp_tbi`	`file`	VCF index of dbsnp (optional)

`GATK4_HAPLOTYPECALLER` Outputs

Name	Type	Pattern	Description
`vcf`	`file`	`*.vcf.gz`	Compressed VCF file
`tbi`	`file`	`*.vcf.gz.tbi`	Index of VCF file
`bam`	`file`	`*.realigned.bam`	Assembled haplotypes and locally realigned reads

`GATK4_INTERVALLISTTOOLS` Inputs

Name	Type	Description
`meta`	`map`	Groovy Map containing sample information e.g. [ id:'test', single_end:false ]
`intervals`	`file`	Interval file

`GATK4_INTERVALLISTTOOLS` Outputs

Name	Type	Pattern	Description
`interval_list`	`file`	`*.interval_list`	Interval list files

`GATK4_BASERECALIBRATOR` Inputs

Name	Type	Description
`meta`	`map`	Groovy Map containing sample information e.g. [ id:'test', single_end:false ]
`input`	`file`	BAM/CRAM file from alignment
`input_index`	`file`	BAI/CRAI file from alignment
`intervals`	`file`	Bed file with the genomic regions included in the library (optional)
`meta2`	`map`	Groovy Map containing reference information e.g. [ id:'genome']
`fasta`	`file`	The reference fasta file
`meta3`	`map`	Groovy Map containing reference information e.g. [ id:'genome']
`fai`	`file`	Index of reference fasta file
`meta4`	`map`	Groovy Map containing reference information e.g. [ id:'genome']
`dict`	`file`	GATK sequence dictionary
`meta5`	`map`	Groovy Map containing reference information e.g. [ id:'genome']
`known_sites`	`file`	VCF files with known sites for indels / snps
`meta6`	`map`	Groovy Map containing reference information e.g. [ id:'genome']
`known_sites_tbi`	`file`	Tabix index of the known_sites

`GATK4_BASERECALIBRATOR` Outputs

Name	Type	Pattern	Description
`table`	`file`	`*.{table}`	Recalibration table from BaseRecalibrator

`GATK4_APPLYBQSR` Inputs

Name	Type	Description
`meta`	`map`	Groovy Map containing sample information e.g. [ id:'test', single_end:false ]
`input`	`file`	BAM/CRAM file from alignment
`input_index`	`file`	BAI/CRAI file from alignment
`bqsr_table`	`file`	Recalibration table from gatk4_baserecalibrator
`intervals`	`file`	Bed file with the genomic regions included in the library (optional)

`GATK4_APPLYBQSR` Outputs

Name	Type	Pattern	Description
`bam`	`file`	`${prefix}.bam`	Recalibrated BAM file
`bai`	`file`	`${prefix}*bai`	Recalibrated BAM index file
`cram`	`file`	`${prefix}.cram`	Recalibrated CRAM file

`GATK4_BEDTOINTERVALLIST` Inputs

Name	Type	Description
`meta`	`map`	Groovy Map containing sample information e.g. [ id:'test']
`bed`	`file`	Input bed file
`meta2`	`map`	Groovy Map containing reference information e.g. [ id:'genome' ]
`dict`	`file`	Sequence dictionary

`GATK4_BEDTOINTERVALLIST` Outputs

Name	Type	Pattern	Description
`interval_list`	`file`	`*.interval_list`	gatk interval list file

`GATK4_MERGEVCFS` Inputs

Name	Type	Description
`meta`	`map`	Groovy Map containing sample information e.g. [ id:'test']
`vcf`	`list`	Two or more VCF files
`meta2`	`map`	Groovy Map containing reference information e.g. [ id:'genome']
`dict`	`file`	Optional Sequence Dictionary as input

`GATK4_MERGEVCFS` Outputs

Name	Type	Pattern	Description
`vcf`	`file`	`*.vcf.gz`	merged vcf file
`tbi`	`file`	`*.tbi`	index files for the merged vcf files

`UMITOOLS_EXTRACT` Inputs

Name	Type	Description
`meta`	`map`	Groovy Map containing sample information e.g. [ id:'test', single_end:false ]
`reads`	`list`	List of input FASTQ files whose UMIs will be extracted.

`UMITOOLS_EXTRACT` Outputs

Name	Type	Pattern	Description
`reads`	`file`	`*.{fastq.gz}`	Extracted FASTQ files. For single-end reads, the pattern is `${prefix}.umi_extract.fastq.gz`. For paired-end reads, the pattern is `${prefix}.umi_extract_{1,2}.fastq.gz`.
`log`	`file`	`*.{log}`	Logfile for umi_tools
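
As an illustration of the naming rule in the `reads` row, the sketch below builds the expected output file names for a given prefix. The function name `umi_extract_outputs` is hypothetical; only the two file-name patterns come from the table.

```python
def umi_extract_outputs(prefix: str, single_end: bool) -> list[str]:
    """Return the FASTQ names expected from UMI extraction for one sample."""
    if single_end:
        # Single-end: one file without a mate suffix.
        return [f"{prefix}.umi_extract.fastq.gz"]
    # Paired-end: one file per mate, suffixed _1 and _2.
    return [f"{prefix}.umi_extract_{mate}.fastq.gz" for mate in (1, 2)]


print(umi_extract_outputs("test", single_end=False))
```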

`SAMTOOLS_SORT` Inputs

Name	Type	Description
`meta`	`map`	Groovy Map containing sample information e.g. [ id:'test', single_end:false ]
`bam`	`file`	BAM/CRAM/SAM file(s)
`meta2`	`map`	Groovy Map containing reference information e.g. [ id:'genome' ]
`fasta`	`file`	Reference genome FASTA file

`SAMTOOLS_SORT` Outputs

Name	Type	Pattern	Description
`bam`	`file`	`*.{bam}`	Sorted BAM file
`cram`	`file`	`*.{cram}`	Sorted CRAM file
`sam`	`file`	`*.{sam}`	Sorted SAM file
`crai`	`file`	`*.crai`	CRAM index file (optional)
`csi`	`file`	`*.csi`	BAM index file (optional)
`bai`	`file`	`*.bai`	BAM index file (optional)

`SAMTOOLS_MERGE` Inputs

Name	Type	Description
`meta`	`map`	Groovy Map containing sample information e.g. [ id:'test', single_end:false ]
`input_files`	`file`	BAM/CRAM file
`meta2`	`map`	Groovy Map containing reference information e.g. [ id:'genome' ]
`fasta`	`file`	Reference file the CRAM was created with (optional)
`meta3`	`map`	Groovy Map containing reference information e.g. [ id:'genome' ]
`fai`	`file`	Index of the reference file the CRAM was created with (optional)
`meta4`	`map`	Groovy Map containing reference information e.g. [ id:'genome' ]
`gzi`	`file`	Index of the compressed reference file the CRAM was created with (optional)

`SAMTOOLS_MERGE` Outputs

Name	Type	Pattern	Description
`bam`	`file`	`*.{bam}`	BAM file
`cram`	`file`	`*.{cram}`	CRAM file
`csi`	`file`	`*.csi`	BAM index file (optional)
`crai`	`file`	`*.crai`	CRAM index file (optional)

`SAMTOOLS_IDXSTATS` Inputs

Name	Type	Description
`meta`	`map`	Groovy Map containing sample information e.g. [ id:'test', single_end:false ]
`bam`	`file`	BAM/CRAM/SAM file
`bai`	`file`	Index for BAM/CRAM/SAM file

`SAMTOOLS_IDXSTATS` Outputs

Name	Type	Pattern	Description
`idxstats`	`file`	`*.{idxstats}`	File containing samtools idxstats output

`SAMTOOLS_FAIDX` Inputs

Name	Type	Description
`meta`	`map`	Groovy Map containing reference information e.g. [ id:'test' ]
`fasta`	`file`	FASTA file
`meta2`	`map`	Groovy Map containing reference information e.g. [ id:'test' ]
`fai`	`file`	FASTA index file

`SAMTOOLS_FAIDX` Outputs

Name	Type	Pattern	Description
`fa`	`file`	`*.{fa}`	FASTA file
`sizes`	`file`	`*.{sizes}`	File containing chromosome lengths
`fai`	`file`	`*.{fai}`	FASTA index file
`gzi`	`file`	`*.gzi`	Optional gzip index file for compressed inputs
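
The `sizes` and `fai` outputs are closely related: in the samtools FASTA index format, the first two tab-separated columns of each line are the sequence name and its length. The sketch below derives chromosome lengths from index text on that basis. Whether the module produces `sizes` this way is an assumption, and `fai_to_sizes` is a hypothetical helper.

```python
def fai_to_sizes(fai_text: str) -> list[tuple[str, int]]:
    """Extract (sequence name, length) pairs from FASTA index (.fai) text."""
    sizes = []
    for line in fai_text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        fields = line.split("\t")
        # Column 1 is the sequence name, column 2 its length in bases.
        sizes.append((fields[0], int(fields[1])))
    return sizes


example = "chr1\t248956422\t112\t70\t71\nchr2\t242193529\t252513167\t70\t71\n"
for name, length in fai_to_sizes(example):
    print(f"{name}\t{length}")
```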

`SAMTOOLS_INDEX` Inputs

Name	Type	Description
`meta`	`map`	Groovy Map containing sample information e.g. [ id:'test', single_end:false ]
`input`	`file`	input file

`SAMTOOLS_INDEX` Outputs

Name	Type	Pattern	Description
`bai`	`file`	`*.{bai,crai,sai}`	BAM/CRAM/SAM index file
`csi`	`file`	`*.{csi}`	CSI index file
`crai`	`file`	`*.{bai,crai,sai}`	BAM/CRAM/SAM index file

`SAMTOOLS_FLAGSTAT` Inputs

Name	Type	Description
`meta`	`map`	Groovy Map containing sample information e.g. [ id:'test', single_end:false ]
`bam`	`file`	BAM/CRAM/SAM file
`bai`	`file`	Index for BAM/CRAM/SAM file

`SAMTOOLS_FLAGSTAT` Outputs

Name	Type	Pattern	Description
`flagstat`	`file`	`*.{flagstat}`	File containing samtools flagstat output

`SAMTOOLS_STATS` Inputs

Name	Type	Description
`meta`	`map`	Groovy Map containing sample information e.g. [ id:'test', single_end:false ]
`input`	`file`	BAM/CRAM file from alignment
`input_index`	`file`	BAI/CRAI file from alignment
`meta2`	`map`	Groovy Map containing reference information e.g. [ id:'genome' ]
`fasta`	`file`	Reference file the CRAM was created with (optional)

`SAMTOOLS_STATS` Outputs

Name	Type	Pattern	Description
`stats`	`file`	`*.{stats}`	File containing samtools stats output

`SAMTOOLS_CONVERT` Inputs

Name	Type	Description
`meta`	`map`	Groovy Map containing sample information e.g. [ id:'test', single_end:false ]
`input`	`file`	BAM/CRAM file
`index`	`file`	BAM/CRAM index file
`meta2`	`map`	Groovy Map containing sample information e.g. [ id:'test', single_end:false ]
`fasta`	`file`	Reference file to create the CRAM file
`meta3`	`map`	Groovy Map containing sample information e.g. [ id:'test', single_end:false ]
`fai`	`file`	Reference index file to create the CRAM file

`SAMTOOLS_CONVERT` Outputs

Name	Type	Pattern	Description
`bam`	`file`	`*{.bam}`	filtered/converted BAM file
`cram`	`file`	`*{cram}`	filtered/converted CRAM file
`bai`	`file`	`*{.bai}`	filtered/converted BAM index
`crai`	`file`	`*{.crai}`	filtered/converted CRAM index

`BEDTOOLS_SORT` Inputs

Name	Type	Description
`meta`	`map`	Groovy Map containing sample information e.g. [ id:'test', single_end:false ]
`intervals`	`file`	BED/BEDGRAPH

`BEDTOOLS_SORT` Outputs

Name	Type	Pattern	Description
`sorted`	`file`	`*.${extension}`	Sorted output file

`BEDTOOLS_MERGE` Inputs

Name	Type	Description
`meta`	`map`	Groovy Map containing sample information e.g. [ id:'test', single_end:false ]
`bed`	`file`	Input BED file

`BEDTOOLS_MERGE` Outputs

Name	Type	Pattern	Description
`bed`	`file`	`*.{bed}`	Overlapped bed file with combined features

`STAR_GENOMEGENERATE` Inputs

Name	Type	Description
`meta`	`map`	Groovy Map containing sample information e.g. [ id:'test', single_end:false ]
`fasta`	`file`	Fasta file of the reference genome
`meta2`	`map`	Groovy Map containing reference information e.g. [ id:'test' ]
`gtf`	`file`	GTF file of the reference genome

`STAR_GENOMEGENERATE` Outputs

Name	Type	Pattern	Description
`index`	`directory`	`star`	Folder containing the star index files

`STAR_ALIGN` Inputs

Name	Type	Description
`meta`	`map`	Groovy Map containing sample information e.g. [ id:'test', single_end:false ]
`reads`	`file`	List of input FastQ files of size 1 and 2 for single-end and paired-end data, respectively.
`meta2`	`map`	Groovy Map containing reference information e.g. [ id:'test' ]
`index`	`directory`	STAR genome index
`meta3`	`map`	Groovy Map containing reference information e.g. [ id:'test' ]
`gtf`	`file`	Annotation GTF file

`STAR_ALIGN` Outputs

Name	Type	Pattern	Description
`log_final`	`file`	`*Log.final.out`	STAR final log file
`log_out`	`file`	`*Log.out`	STAR log out file
`log_progress`	`file`	`*Log.progress.out`	STAR log progress file
`bam`	`file`	`*.{bam}`	Output BAM file containing read alignments
`bam_sorted`	`file`	`*sortedByCoord.out.bam`	Sorted BAM file of read alignments (optional)
`bam_sorted_aligned`	`file`	`*.Aligned.sortedByCoord.out.bam`	Sorted BAM file of read alignments (optional)
`bam_transcript`	`file`	`*toTranscriptome.out.bam`	Output BAM file of transcriptome alignment (optional)
`bam_unsorted`	`file`	`*Aligned.unsort.out.bam`	Unsorted BAM file of read alignments (optional)
`fastq`	`file`	`*fastq.gz`	Unmapped FastQ files (optional)
`tab`	`file`	`*.tab`	STAR output tab file(s) (optional)
`spl_junc_tab`	`file`	`*.SJ.out.tab`	STAR output splice junction tab file
`read_per_gene_tab`	`file`	`*.ReadsPerGene.out.tab`	STAR output read per gene tab file
`junction`	`file`	`*.out.junction`	STAR chimeric junction output file (optional)
`sam`	`file`	`*.out.sam`	STAR output SAM file(s) (optional)
`wig`	`file`	`*.wig`	STAR output wiggle format file(s) (optional)
`bedgraph`	`file`	`*.bg`	STAR output bedGraph format file(s) (optional)

`STAR_INDEXVERSION` Outputs

Name	Type	Pattern	Description
`index_version`	`-`	n/a	n/a
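
The Functions table lists `isCompatibleStarIndex` (`index_version`, `minimal_index_version`) and `convertVersionToList` (`version`) without descriptions. One plausible reading is that the emitted `index_version` is compared against a minimum supported version. The sketch below shows such a comparison under that assumption; it is not the pipeline's implementation, and the handling of letter suffixes such as `2.7.4a` is a guess.

```python
import re


def convert_version_to_list(version: str) -> list[int]:
    """Split a version string into its numeric components, ignoring letters."""
    return [int(part) for part in re.findall(r"\d+", version)]


def is_compatible_star_index(index_version: str, minimal_index_version: str) -> bool:
    """Return True if index_version is at least minimal_index_version."""
    found = convert_version_to_list(index_version)
    required = convert_version_to_list(minimal_index_version)
    # Pad the shorter list with zeros so "2.7" compares equal to "2.7.0".
    width = max(len(found), len(required))
    found = found + [0] * (width - len(found))
    required = required + [0] * (width - len(required))
    return found >= required


print(is_compatible_star_index("2.7.10b", "2.7.4a"))
```

Comparing lists of integers rather than raw strings avoids ordering `2.7.10` before `2.7.4`.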

`SNPEFF_SNPEFF` Inputs

Name	Type	Description
`meta`	`map`	Groovy Map containing sample information e.g. [ id:'test', single_end:false ]
`vcf`	`file`	vcf to annotate
`meta2`	`map`	Groovy Map containing sample information e.g. [ id:'test', single_end:false ]
`cache`	`file`	path to snpEff cache (optional)

`SNPEFF_SNPEFF` Outputs

Name	Type	Pattern	Description
`vcf`	`file`	`*.ann.vcf`	annotated vcf
`report`	`string`	`*.csv`	snpEff report csv file
`summary_html`	`string`	`*.html`	snpEff summary statistics in html file
`genes_txt`	`string`	`*.genes.txt`	txt (tab separated) file having counts of the number of variants affecting each transcript and gene

`SNPEFF_DOWNLOAD` Inputs

Name	Type	Description
`meta`	`map`	Groovy Map containing sample information e.g. [ id:'test', single_end:false ]
`snpeff_db`	`string`	SnpEff database name

`SNPEFF_DOWNLOAD` Outputs

Name	Type	Pattern	Description
`cache`	`file`	n/a	snpEff cache

`ENSEMBLVEP_VEP` Inputs

Name	Type	Description
`meta`	`map`	Groovy Map containing sample information e.g. [ id:'test', single_end:false ]
`vcf`	`file`	vcf to annotate
`custom_extra_files`	`file`	extra sample-specific files to be used with the `--custom` flag to be configured with ext.args (optional)
`meta2`	`map`	Groovy Map containing fasta reference information e.g. [ id:'test' ]
`fasta`	`file`	reference FASTA file (optional)

`ENSEMBLVEP_VEP` Outputs

Name	Type	Pattern	Description
`vcf`	`file`	`*.vcf.gz`	annotated vcf (optional)
`tbi`	`file`	`*.vcf.gz.tbi`	annotated vcf index (optional)
`tab`	`file`	`*.ann.tab.gz`	tab file with annotated variants (optional)
`json`	`file`	`*.ann.json.gz`	json file with annotated variants (optional)
`report`	`string`	`*.html`	VEP report file

`ENSEMBLVEP_DOWNLOAD` Inputs

Name	Type	Description
`meta`	`map`	Groovy Map containing sample information e.g. [ id:'test', single_end:false ]
`assembly`	`string`	Genome assembly
`species`	`string`	Species
`cache_version`	`string`	cache version

`ENSEMBLVEP_DOWNLOAD` Outputs

Name	Type	Pattern	Description
`cache`	`file`	`*`	cache

`BCFTOOLS_ANNOTATE` Inputs

Name	Type	Description
`meta`	`map`	Groovy Map containing sample information e.g. [ id:'test', single_end:false ]
`input`	`file`	Query VCF or BCF file, can be either uncompressed or compressed
`index`	`file`	Index of the query VCF or BCF file
`annotations`	`file`	Bgzip-compressed file with annotations
`annotations_index`	`file`	Index of the annotations file

`BCFTOOLS_ANNOTATE` Outputs

Name	Type	Pattern	Description
`vcf`	`file`	`*{vcf,vcf.gz,bcf,bcf.gz}`	Compressed annotated VCF file
`tbi`	`file`	`*.tbi`	Alternative VCF file index
`csi`	`file`	`*.csi`	Default VCF file index

`PICARD_MARKDUPLICATES` Inputs

Name	Type	Description
`meta`	`map`	Groovy Map containing sample information e.g. [ id:'test', single_end:false ]
`reads`	`file`	Sequence reads file, can be SAM/BAM/CRAM format
`meta2`	`map`	Groovy Map containing reference information e.g. [ id:'genome' ]
`fasta`	`file`	Reference genome fasta file, required for CRAM input
`meta3`	`map`	Groovy Map containing reference information e.g. [ id:'genome' ]
`fai`	`file`	Reference genome fasta index

`PICARD_MARKDUPLICATES` Outputs

Name	Type	Pattern	Description
`bam`	`file`	`*.{bam}`	BAM file with duplicate reads marked/removed
`bai`	`file`	`*.{bai}`	An optional BAM index file. If desired, --CREATE_INDEX must be passed as a flag
`cram`	`file`	`*.{cram}`	Output CRAM file
`metrics`	`file`	`*.{metrics.txt}`	Duplicate metrics file generated by picard

`TABIX_TABIX` Inputs

Name	Type	Description
`meta`	`map`	Groovy Map containing sample information e.g. [ id:'test', single_end:false ]
`tab`	`file`	TAB-delimited genome position file compressed with bgzip

`TABIX_TABIX` Outputs

Name	Type	Pattern	Description
`index`	`file`	`*.{tbi,csi}`	Tabix index file (either tbi or csi)

`TABIX_BGZIPTABIX` Inputs

Name	Type	Description
`meta`	`map`	Groovy Map containing sample information e.g. [ id:'test', single_end:false ]
`input`	`file`	Sorted tab-delimited genome file

`TABIX_BGZIPTABIX` Outputs

Name	Type	Pattern	Description
`gz_index`	`file`	`.gz, .{tbi,csi}`	bgzipped tab-delimited genome file and its Tabix index file (either tbi or csi)

`CAT_FASTQ` Inputs

Name	Type	Description
`meta`	`map`	Groovy Map containing sample information e.g. [ id:'test', single_end:false ]
`reads`	`file`	List of input FastQ files to be concatenated.

`CAT_FASTQ` Outputs

Name	Type	Pattern	Description
`reads`	`file`	`*.{merged.fastq.gz}`	Merged fastq file

`REMOVE_UNKNOWN_REGIONS` Inputs

Name	Type	Description
`val(meta), path(bed)`	`tuple`	n/a
`val(meta2), path(dict)`	`tuple`	n/a

`REMOVE_UNKNOWN_REGIONS` Outputs

Name	Type	Emit	Description
`val(meta), path('*.bed')`	`tuple`	`bed`	n/a

`GTF2BED` Inputs

Name	Type	Description
`val(meta), path(gtf)`	`tuple`	n/a

`GTF2BED` Outputs

Name	Type	Emit	Description
`val(meta), path('*.bed')`	`tuple`	`bed`	n/a
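
The `GTF2BED` tables give no description, so the following only illustrates the coordinate shift any GTF-to-BED conversion has to make: GTF intervals are 1-based and inclusive, BED intervals are 0-based and half-open. Which features and columns the actual process keeps is not documented here; `gtf_line_to_bed` is a hypothetical helper.

```python
from typing import Optional


def gtf_line_to_bed(line: str) -> Optional[tuple[str, int, int]]:
    """Convert one GTF record to a (chrom, start, end) BED interval."""
    if line.startswith("#") or not line.strip():
        return None  # comment or blank line
    fields = line.rstrip("\n").split("\t")
    chrom, start, end = fields[0], int(fields[3]), int(fields[4])
    # Shift the start by one; the end stays the same because BED ends are exclusive.
    return (chrom, start - 1, end)


record = 'chr1\tHAVANA\texon\t11869\t12227\t.\t+\t.\tgene_id "G1";'
print(gtf_line_to_bed(record))
```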

Functions

Name	Parameters	Returns	Description
`paramsSummaryMultiqc`	`summary_params`	n/a	n/a
`validateInputParameters`	n/a	n/a	Validate pipeline input parameters. Checks that all required parameters are provided and valid. Currently validates that the specified genome exists in the config.
`validateInputSamplesheet`	`input`	n/a	Validate and parse input samplesheet entries. Ensures that multiple runs of the same sample have consistent sequencing type (all single-end or all paired-end).
`checkSamplesAfterGrouping`	`input`	n/a	Validate samples after grouping by sample ID. Performs consistency checks on grouped sample data: - Ensures only one BAM/CRAM file per sample - Prevents mixing of FASTQ and BAM/CRAM inputs - Validates consistent single-end/paired-end status - Properly interleaves paired-end FASTQ files
`genomeExistsError`	n/a	n/a	Check if the specified genome exists in the configuration. Throws an error with a helpful message listing available genomes if the specified genome key is not found in the config.
`toolCitationText`	n/a	n/a	n/a
`toolBibliographyText`	n/a	n/a	n/a
`methodsDescriptionText`	`mqc_methods_yaml`	n/a	n/a
`isCloudUrl`	`cache_url`	n/a	n/a
`isCompatibleStarIndex`	`index_version`, `minimal_index_version`	n/a	n/a
`convertVersionToList`	`version`	n/a	n/a
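
The checks described for `checkSamplesAfterGrouping` can be sketched as follows. This is an illustration of the listed rules, not the pipeline's code: the function name `check_sample_runs`, the record keys (`single_end`, `fastq_1`, `fastq_2`, `alignment`) and the error messages are all assumptions.

```python
def check_sample_runs(sample_id: str, runs: list[dict]) -> list[str]:
    """Validate all runs of one sample and return its input files in order."""
    alignments = [run["alignment"] for run in runs if run.get("alignment")]
    fastq_runs = [run for run in runs if run.get("fastq_1")]

    # FASTQ and BAM/CRAM inputs must not be mixed within a sample.
    if alignments and fastq_runs:
        raise ValueError(f"{sample_id}: cannot mix FASTQ and BAM/CRAM inputs")
    # Only one BAM/CRAM file is allowed per sample.
    if len(alignments) > 1:
        raise ValueError(f"{sample_id}: only one BAM/CRAM file per sample")
    if alignments:
        return alignments

    # All runs must be single-end or all must be paired-end.
    if len({run["single_end"] for run in fastq_runs}) > 1:
        raise ValueError(f"{sample_id}: mixed single-end and paired-end runs")

    # Interleave paired-end files as R1, R2, R1, R2, ...
    files = []
    for run in fastq_runs:
        files.append(run["fastq_1"])
        if not run["single_end"]:
            files.append(run["fastq_2"])
    return files


runs = [
    {"single_end": False, "fastq_1": "run1_R1.fastq.gz", "fastq_2": "run1_R2.fastq.gz"},
    {"single_end": False, "fastq_1": "run2_R1.fastq.gz", "fastq_2": "run2_R2.fastq.gz"},
]
print(check_sample_runs("sample1", runs))
```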

This pipeline was built with Nextflow. Documentation generated by nf-docs v0.2.0 on 2026-03-03 22:40:54 UTC.
