Pipeline Inputs

This page documents all input parameters for the pipeline.

Other

`--aws_image_prefix`

Type: string | Optional

`--aws_queue`

Type: string | Optional

`--monochrome_logs`

Type: boolean | Optional

`--validate_params`

Type: boolean | Optional

Default: True

`--show_hidden_params`

Type: boolean | Optional

Input Options

`--fastq`

Type: string | Optional | Format: path

FASTQ files to use in the analysis.

This accepts one of three cases: (i) the path to a single FASTQ file; (ii) the path to a top-level directory containing FASTQ files; (iii) the path to a directory containing one level of sub-directories which in turn contain FASTQ files. In the first and second cases, a sample name can be supplied with --sample. In the last case, the data is assumed to be multiplexed, with the names of the sub-directories as barcodes, and a sample sheet can be provided with --sample_sheet.
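The three accepted layouts can be told apart by inspecting the path. The following is a minimal sketch, not the workflow's own code: the function name `classify_input_layout`, the returned labels and the list of FASTQ suffixes are assumptions made for illustration.

```python
from pathlib import Path

# Suffixes treated as FASTQ in this sketch (an assumption, not the workflow's list).
FASTQ_SUFFIXES = (".fastq", ".fastq.gz", ".fq", ".fq.gz")


def is_fastq(path: Path) -> bool:
    """Return True if the path is a file with a FASTQ-like suffix."""
    return path.is_file() and path.name.endswith(FASTQ_SUFFIXES)


def classify_input_layout(path: Path) -> str:
    """Label a path as one of the three input cases described above."""
    if is_fastq(path):
        return "single_file"  # case (i)
    if path.is_dir():
        if any(is_fastq(p) for p in path.iterdir()):
            return "flat_directory"  # case (ii)
        subdirs = [p for p in path.iterdir() if p.is_dir()]
        if any(is_fastq(f) for d in subdirs for f in d.iterdir()):
            return "multiplexed"  # case (iii): sub-directory names act as barcodes
    raise ValueError(f"No FASTQ input found at {path}")
```

In this sketch, only the `multiplexed` result corresponds to the case where a sample sheet would apply; the other two results correspond to the cases where a single sample name can be given.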

`--bam`

Type: string | Optional | Format: path

BAM or unaligned BAM (uBAM) files to use in the analysis.

This accepts one of three cases: (i) the path to a single BAM file; (ii) the path to a top-level directory containing BAM files; (iii) the path to a directory containing one level of sub-directories which in turn contain BAM files. In the first and second cases, a sample name can be supplied with --sample. In the last case, the data is assumed to be multiplexed, with the names of the sub-directories as barcodes, and a sample sheet can be provided with --sample_sheet.

`--classifier`

Type: string | Optional

Kraken2 or Minimap2 workflow to be used for classification of reads.

Use Kraken2 for fast classification and Minimap2 for finer resolution; see the README for further information.

Default: kraken2

Allowed values:

kraken2
minimap2

`--analyse_unclassified`

Type: boolean | Optional

Analyse unclassified reads from the input directory. By default, the workflow does not process reads in the unclassified directory.

If selected and the input is a multiplexed directory, the workflow will also process the unclassified directory.

Default: False

`--exclude_host`

Type: string | Optional | Format: file-path

A FASTA or MMI file of the host reference. Reads that align with this reference will be excluded from the analysis.

Sample Options

`--sample_sheet`

Type: string | Optional | Format: file-path

A CSV file used to map barcodes to sample aliases. The sample sheet can be provided when the input data is a directory containing sub-directories with FASTQ files.

The sample sheet is a CSV file with, minimally, columns named barcode and alias. Extra columns are allowed. A type column is required for certain workflows and should contain one of the following values: test_sample, positive_control, negative_control, no_template_control.
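The column rules above can be checked before launching a run. This is a minimal sketch using only the Python standard library; the function name `read_sample_sheet` and the example barcodes and aliases are hypothetical, and the workflow's own validation may be stricter.

```python
import csv
import io

REQUIRED_COLUMNS = {"barcode", "alias"}
ALLOWED_TYPES = {
    "test_sample",
    "positive_control",
    "negative_control",
    "no_template_control",
}


def read_sample_sheet(text):
    """Parse sample sheet CSV text and check the columns described above."""
    reader = csv.DictReader(io.StringIO(text))
    columns = set(reader.fieldnames or [])
    missing = REQUIRED_COLUMNS - columns
    if missing:
        raise ValueError(f"Missing required columns: {sorted(missing)}")
    rows = list(reader)
    # The type column is optional here; when present, its values are checked.
    if "type" in columns:
        for row in rows:
            if row["type"] not in ALLOWED_TYPES:
                raise ValueError(f"Invalid type value: {row['type']!r}")
    return rows


# Example sheet with hypothetical barcodes and aliases.
example = (
    "barcode,alias,type\n"
    "barcode01,sampleA,test_sample\n"
    "barcode02,ctrl,negative_control\n"
)
rows = read_sample_sheet(example)
```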

`--sample`

Type: string | Optional

A single sample name for non-multiplexed data. Permissible if passing a single .fastq(.gz) file or directory of .fastq(.gz) files.

Reference Options

`--database_set`

Type: string | Optional

Sets the reference, databases and taxonomy datasets that will be used for classifying reads. The memory requirement will be slightly higher than the size of the database. The Standard-8, PlusPF-8 and PlusPFP-8 databases require more than 8 GB and are only available with the kraken2 classifier.

This setting can be overridden by providing an explicit taxonomy, database or reference path in the other reference options.

Default: Standard-8

Allowed values:

ncbi_16s_18s
ncbi_16s_18s_28s_ITS
SILVA_138_1
Greengenes2_plus
Standard-8
PlusPF-8
PlusPFP-8
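The restriction that Standard-8, PlusPF-8 and PlusPFP-8 are only available with kraken2 can be expressed as a small pre-flight check. This is a sketch under stated assumptions: the function name `check_database_set` is hypothetical and the workflow may report such conflicts differently.

```python
CLASSIFIERS = {"kraken2", "minimap2"}
DATABASE_SETS = {
    "ncbi_16s_18s",
    "ncbi_16s_18s_28s_ITS",
    "SILVA_138_1",
    "Greengenes2_plus",
    "Standard-8",
    "PlusPF-8",
    "PlusPFP-8",
}
# Database sets documented as available only with the kraken2 classifier.
KRAKEN2_ONLY = {"Standard-8", "PlusPF-8", "PlusPFP-8"}


def check_database_set(classifier="kraken2", database_set="Standard-8"):
    """Raise ValueError for an unknown or incompatible combination."""
    if classifier not in CLASSIFIERS:
        raise ValueError(f"Unknown classifier: {classifier}")
    if database_set not in DATABASE_SETS:
        raise ValueError(f"Unknown database set: {database_set}")
    if classifier == "minimap2" and database_set in KRAKEN2_ONLY:
        raise ValueError(f"{database_set} is only available with kraken2")
    return classifier, database_set
```

The default arguments mirror the documented defaults (kraken2 and Standard-8), so calling the function with no arguments passes the check.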

`--store_dir`

Type: string | Optional | Format: directory-path

Where to store the initial download of the database.

The selected database set is downloaded as part of the workflow and saved in this location. On subsequent runs, the workflow uses the stored copy as the database.

Default: store_dir

`--database`

Type: string | Optional | Format: path

Not required, but can be used to override the Kraken2 database (.tar.gz archive or directory).

By default, the database chosen with the database_set parameter is used.

`--taxonomy`

Type: string | Optional | Format: path

Not required, but can be used to override the taxonomy database with a different taxonomy file (.tar.gz archive or directory).

By default, the NCBI taxonomy file is downloaded and used.

`--reference`

Type: string | Optional | Format: file-path

Override the FASTA reference file selected by the database_set parameter. It can be a FASTA format reference sequence collection or a minimap2 MMI format index.

This option should be used in conjunction with the database parameter to specify a custom database.

`--ref2taxid`

Type: string | Optional | Format: file-path

Not required, but can be used to specify a ref2taxid mapping. The format is a TSV file with two columns (refname, taxid) and no header row.

By default, the ref2taxid mapping for the option chosen with the database_set parameter is used.

`--taxonomic_rank`

Type: string | Optional

Returns results at the chosen taxonomic rank. In the Kraken2 pipeline, this sets the level at which Bracken estimates abundance. The default is S (species); the other options are P (phylum), C (class), O (order), F (family) and G (genus).

Default: S

Allowed values:

S
G
F
O
C
P

Kraken2 Options

`--bracken_length`

Type: integer | Optional

Set the length value Bracken will use.

Should be set to the length used to generate the k-mer distribution file supplied in the Kraken2 database input directory. For the default datasets the value is set automatically (ncbi_16s_18s = 1000, ncbi_16s_18s_28s_ITS = 1000, PlusPF-8 = 300).

`--bracken_threshold`

Type: integer | Optional

Set the minimum read threshold Bracken will use to consider a taxon.

Bracken will only consider taxa with a read count greater than or equal to this value.

Default: 10

`--kraken2_memory_mapping`

Type: boolean | Optional

Avoids loading the database into RAM.

By default, Kraken2 loads the database into process-local RAM; this flag avoids doing so. It may be useful if the available RAM is smaller than the size of the chosen database.

Default: False

`--kraken2_confidence`

Type: number | Optional

Kraken2 confidence score threshold. Valid interval: 0-1.

Apply a threshold to determine if a sequence is classified or unclassified. See the kraken2 manual section on confidence scoring for further details about how it works.

Default: 0.0

Minimap2 Options

`--minimap2filter`

Type: string | Optional

Filter the output of minimap2 by taxids, including child nodes, e.g. "9606,1404".

Provide a list of taxids if you are only interested in certain ones in your minimap2 analysis outputs.

`--minimap2exclude`

Type: boolean | Optional

Invert minimap2filter and exclude the given taxids instead.

Exclude a list of taxids from analysis outputs.

Default: False
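The interaction between the taxid filter and its inverted form can be sketched as follows. This is an illustration only: the function names, the read-to-taxid dictionary and the toy parent table are assumptions, and the workflow's own filtering is not shown in this page.

```python
def parse_taxids(value):
    """Turn a comma-separated string such as "9606,1404" into a set of ints."""
    return {int(t) for t in value.split(",") if t.strip()}


def in_clade(taxid, selected, parent):
    """Return True if taxid, or any of its ancestors, is in the selected set."""
    seen = set()
    while taxid is not None and taxid not in seen:
        if taxid in selected:
            return True
        seen.add(taxid)
        taxid = parent.get(taxid)
    return False


def filter_by_taxids(assignments, selected, parent, exclude=False):
    """Keep reads inside the selected clades, or outside them when exclude=True."""
    return {
        read: taxid
        for read, taxid in assignments.items()
        if in_clade(taxid, selected, parent) != exclude
    }


# Toy parent table (child -> parent); not real taxonomy data.
parent = {100: 9606, 9606: 1, 1404: 1, 200: 1, 1: None}
reads = {"r1": 100, "r2": 1404, "r3": 200}
selected = parse_taxids("9606,1404")
kept = filter_by_taxids(reads, selected, parent)
dropped = filter_by_taxids(reads, selected, parent, exclude=True)
```

Here `r1` is kept because its taxid is a child node of 9606, which is what "including child nodes" means in this sketch.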

`--keep_bam`

Type: boolean | Optional

Copy BAM files into the output directory.

Default: False

`--minimap2_by_reference`

Type: boolean | Optional

Add a table to the report with the mean sequencing depth per reference, its standard deviation and its coefficient of variation. Also adds a scatterplot of sequencing depth against coverage and a heatmap showing the depth per percentile.

Default: False

`--min_percent_identity`

Type: number | Optional

Minimum percentage of identity with the matched reference to define a sequence as classified; sequences with a value lower than this are defined as unclassified.

Default: 90

`--min_ref_coverage`

Type: number | Optional

Minimum coverage value to define a sequence as classified; sequences with a coverage value lower than this are defined as unclassified. Use this option if you expect reads whose lengths are similar to the references' lengths.

Default: 0
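Taken together, the two minimap2 thresholds act as a simple decision rule. The sketch below assumes the percent identity and reference coverage of an alignment are already computed (how the workflow derives them is not described here) and that a read must pass both thresholds; the function name `is_classified` is hypothetical.

```python
def is_classified(
    percent_identity,
    ref_coverage,
    min_percent_identity=90,
    min_ref_coverage=0,
):
    """Return True if a read passes both documented thresholds.

    Values lower than a threshold mark the read as unclassified;
    values equal to the threshold pass.
    """
    return (
        percent_identity >= min_percent_identity
        and ref_coverage >= min_ref_coverage
    )
```

With the default `min_ref_coverage` of 0, only the identity threshold has any effect.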

Antimicrobial Resistance Options

`--amr`

Type: boolean | Optional

Scan reads for antimicrobial resistance or virulence genes.

Reads are scanned with abricate against the chosen database (--amr_db) to identify any acquired antimicrobial resistance or virulence genes present in the dataset. Note: this scan cannot identify mutational resistance genes.

Default: False

`--amr_db`

Type: string | Optional

Database of antimicrobial resistance or virulence genes to use.

Default: resfinder

Allowed values:

resfinder
ecoli_vf
plasmidfinder
card
argannot
vfdb
ncbi
megares
ecoh

`--amr_minid`

Type: integer | Optional

Identity threshold required to report a match between a gene in the database and FASTQ reads. Valid interval: 0-100.

Default: 80

`--amr_mincov`

Type: integer | Optional

Minimum coverage (breadth of coverage) threshold required to report a match between a gene in the database and FASTQ reads. Valid interval: 0-100.

Default: 80

Report Options

`--abundance_threshold`

Type: number | Optional

Remove taxa whose abundance is lower than or equal to the chosen value.

To remove taxa by relative abundance (as a fraction of the total number of reads), use a decimal in the range 0-1 (1 not included). To remove taxa by absolute abundance, provide a number greater than or equal to 1. In both cases, taxa with an abundance lower than or equal to the threshold are removed.

Default: 0
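The relative and absolute interpretations of the threshold can be sketched in a few lines. This is an illustration of the rule as documented, not the workflow's code; the function name `filter_by_abundance` and the example counts are hypothetical.

```python
def filter_by_abundance(counts, threshold=0):
    """Drop taxa whose read count is lower than or equal to the cutoff.

    A threshold below 1 is read as a fraction of the total read count;
    a threshold of 1 or more is read as an absolute read count.
    """
    if threshold < 0:
        raise ValueError("threshold must be non-negative")
    total = sum(counts.values())
    cutoff = threshold * total if threshold < 1 else threshold
    return {taxon: n for taxon, n in counts.items() if n > cutoff}


# Hypothetical per-taxon read counts (100 reads in total).
counts = {"A": 90, "B": 9, "C": 1}
relative = filter_by_abundance(counts, 0.05)  # cutoff is 5 reads
absolute = filter_by_abundance(counts, 9)     # cutoff is 9 reads
```

In this sketch, the default of 0 removes only taxa with no reads, so every taxon in `counts` is kept.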

`--n_taxa_barplot`

Type: integer | Optional

Number of most abundant taxa to display in the barplot. The remaining taxa are grouped under the "Other" category.

Default: 9
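The grouping into "Other" amounts to a top-N selection followed by a sum. The sketch below shows one way to do this; the function name `group_for_barplot` and the example counts are hypothetical, and the report code may order or break ties differently.

```python
def group_for_barplot(counts, n_taxa_barplot=9):
    """Keep the N most abundant taxa and sum the rest into "Other"."""
    ranked = sorted(counts.items(), key=lambda item: item[1], reverse=True)
    grouped = dict(ranked[:n_taxa_barplot])
    remainder = ranked[n_taxa_barplot:]
    if remainder:
        grouped["Other"] = sum(n for _, n in remainder)
    return grouped


# Hypothetical per-taxon read counts.
counts = {"A": 5, "B": 3, "C": 2, "D": 1}
top_two = group_for_barplot(counts, n_taxa_barplot=2)
```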

Output Options

`--out_dir`

Type: string | Optional | Format: directory-path

Directory for output of all user-facing files.

Default: output

`--igv`

Type: boolean | Optional

Enable IGV visualisation in the EPI2ME Desktop Application by creating the required files. This will cause the workflow to emit the BAM files as well. If using a custom reference, this must be a FASTA file and not a minimap2 MMI format index.

Default: False

`--include_read_assignments`

Type: boolean | Optional

Output a per-sample TSV file that indicates the taxonomy assigned to each sequence.

Default: False

`--output_unclassified`

Type: boolean | Optional

Output a FASTQ of the unclassified reads.

Default: False

Advanced Options

`--min_len`

Type: integer | Optional

Specify read length lower limit.

Any reads shorter than this limit will not be included in the analysis.

Default: 0

`--min_read_qual`

Type: number | Optional

Specify read quality lower limit.

Any reads with a quality lower than this limit will not be included in the analysis.

`--max_len`

Type: integer | Optional

Specify read length upper limit.

Any reads longer than this limit will not be included in the analysis.
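The three read filters (`--min_len`, `--min_read_qual`, `--max_len`) combine into a single pass/fail test per read. The sketch below makes two assumptions that this page does not state: that the quality compared against `--min_read_qual` is a single per-read score supplied by the caller, and that an unset option (no default is listed for `--max_len` or `--min_read_qual`) means no limit. The function name `passes_read_filters` is hypothetical.

```python
def passes_read_filters(
    length,
    quality,
    min_len=0,
    max_len=None,
    min_read_qual=None,
):
    """Return True if a read survives the length and quality limits."""
    if length < min_len:
        return False  # shorter than the lower length limit
    if max_len is not None and length > max_len:
        return False  # longer than the upper length limit
    if min_read_qual is not None and quality < min_read_qual:
        return False  # quality lower than the quality limit
    return True
```

Reads exactly at a limit pass in this sketch, matching the wording "shorter than", "longer than" and "lower than" used for the three options.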

`--threads`

Type: integer | Optional

Maximum number of CPU threads to use in each parallel workflow task.

Several tasks in this workflow benefit from using multiple CPU threads. This option sets the number of CPU threads for all such processes.

Default: 4

Miscellaneous Options

`--disable_ping`

Type: boolean | Optional

Enable to prevent sending a workflow ping.

Default: False

`--help`

Type: boolean | Optional

Default: False

`--version`

Type: boolean | Optional

Display version and exit.

Default: False

This pipeline was built with Nextflow. Documentation generated by nf-docs v0.1.0 on 2026-01-23 17:27:31 UTC.