Pipeline Inputs
This page documents all input parameters for the pipeline.
Other
--aws_image_prefix
Type: string | Optional
--aws_queue
Type: string | Optional
--monochrome_logs
Type: boolean | Optional
--validate_params
Type: boolean | Optional
Default: True
--show_hidden_params
Type: boolean | Optional
Input Options
--fastq
Type: string | Optional | Format: path
FASTQ files to use in the analysis.
This accepts one of three cases: (i) the path to a single FASTQ file; (ii) the path to a top-level directory containing FASTQ files; (iii) the path to a directory containing one level of sub-directories which in turn contain FASTQ files. In the first and second case, a sample name can be supplied with
--sample. In the last case, the data is assumed to be multiplexed with the names of the sub-directories as barcodes. In this case, a sample sheet can be provided with--sample_sheet.
--bam
Type: string | Optional | Format: path
BAM or unaligned BAM (uBAM) files to use in the analysis.
This accepts one of three cases: (i) the path to a single BAM file; (ii) the path to a top-level directory containing BAM files; (iii) the path to a directory containing one level of sub-directories which in turn contain BAM files. In the first and second case, a sample name can be supplied with
--sample. In the last case, the data is assumed to be multiplexed with the names of the sub-directories as barcodes. In this case, a sample sheet can be provided with--sample_sheet.
--classifier
Type: string | Optional
Kraken2 or Minimap2 workflow to be used for classification of reads.
Use Kraken2 for fast classification and minimap2 for finer resolution, see Readme for further info.
Default: kraken2
Allowed values:
kraken2minimap2
--analyse_unclassified
Type: boolean | Optional
Analyse unclassified reads from input directory. By default the workflow will not process reads in the unclassified directory.
If selected and if the input is a multiplex directory the workflow will also process the unclassified directory.
Default: False
--exclude_host
Type: string | Optional | Format: file-path
A FASTA or MMI file of the host reference. Reads that align with this reference will be excluded from the analysis.
Sample Options
--sample_sheet
Type: string | Optional | Format: file-path
A CSV file used to map barcodes to sample aliases. The sample sheet can be provided when the input data is a directory containing sub-directories with FASTQ files.
The sample sheet is a CSV file with, minimally, columns named
barcode,alias. Extra columns are allowed. Atypecolumn is required for certain workflows and should have the following values;test_sample,positive_control,negative_control,no_template_control.
--sample
Type: string | Optional
A single sample name for non-multiplexed data. Permissible if passing a single .fastq(.gz) file or directory of .fastq(.gz) files.
Reference Options
--database_set
Type: string | Optional
Sets the reference, databases and taxonomy datasets that will be used for classifying reads. Choices: ['ncbi_16s_18s','ncbi_16s_18s_28s_ITS', 'SILVA_138_1', 'Greengenes2_plus', 'Standard-8', 'PlusPF-8', 'PlusPFP-8']. Memory requirement will be slightly higher than the size of the database. Standard-8, PlusPF-8 and PlusPFP-8 databases require more than 8GB and are only available in the kraken2 approach.
This setting is overridable by providing an explicit taxonomy, database or reference path in the other reference options.
Default: Standard-8
Allowed values:
ncbi_16s_18sncbi_16s_18s_28s_ITSSILVA_138_1Greengenes2_plusStandard-8PlusPF-8PlusPFP-8
--store_dir
Type: string | Optional | Format: directory-path
Where to store initial download of database.
database set selected will be downloaded as part of the workflow and saved in this location, on subsequent runs it will use this as the database.
Default: store_dir
--database
Type: string | Optional | Format: path
Not required but can be used to specifically override Kraken2 database [.tar.gz or Directory].
By default uses database chosen in database_set parameter.
--taxonomy
Type: string | Optional | Format: path
Not required but can be used to specifically override taxonomy database. Change the default to use a different taxonomy file [.tar.gz or directory].
By default NCBI taxonomy file will be downloaded and used.
--reference
Type: string | Optional | Format: file-path
Override the FASTA reference file selected by the database_set parameter. It can be a FASTA format reference sequence collection or a minimap2 MMI format index.
This option should be used in conjunction with the database parameter to specify a custom database.
--ref2taxid
Type: string | Optional | Format: file-path
Not required but can be used to specify a ref2taxid mapping. Format is .tsv (refname taxid), no header row.
By default uses ref2taxid for option chosen in database_set parameter.
--taxonomic_rank
Type: string | Optional
Returns results at the taxonomic rank chosen. In the Kraken2 pipeline: set the level that Bracken will estimate abundance at. Default: S (species). Other possible options are P (phylum), C (class), O (order), F (family), and G (genus).
Default: S
Allowed values:
SGFOCP
Kraken2 Options
--bracken_length
Type: integer | Optional
Set the length value Bracken will use
Should be set to the length used to generate the kmer distribution file supplied in the Kraken database input directory. For the default datasets these will be set automatically. ncbi_16s_18s = 1000 , ncbi_16s_18s_28s_ITS = 1000 , PlusPF-8 = 300
--bracken_threshold
Type: integer | Optional
Set the minimum read threshold Bracken will use to consider a taxon
Bracken will only consider taxa with a read count greater than or equal to this value.
Default: 10
--kraken2_memory_mapping
Type: boolean | Optional
Avoids loading database into RAM
Kraken 2 will by default load the database into process-local RAM; this flag will avoid doing so. It may be useful if the available RAM memory is lower than the size of the chosen database.
Default: False
--kraken2_confidence
Type: number | Optional
Kraken2 Confidence score threshold. Default: 0.0. Valid interval: 0-1
Apply a threshold to determine if a sequence is classified or unclassified. See the kraken2 manual section on confidence scoring for further details about how it works.
Default: 0.0
Minimap2 Options
--minimap2filter
Type: string | Optional
Filter output of minimap2 by taxids inc. child nodes, E.g. "9606,1404"
Provide a list of taxids if you are only interested in certain ones in your minimap2 analysis outputs.
--minimap2exclude
Type: boolean | Optional
Invert minimap2filter and exclude the given taxids instead
Exclude a list of taxids from analysis outputs.
Default: False
--keep_bam
Type: boolean | Optional
Copy bam files into the output directory.
Default: False
--minimap2_by_reference
Type: boolean | Optional
Add a table with the mean sequencing depth per reference, standard deviation and coefficient of variation. It adds a scatterplot of the sequencing depth vs. the coverage and a heatmap showing the depth per percentile to the report
Default: False
--min_percent_identity
Type: number | Optional
Minimum percentage of identity with the matched reference to define a sequence as classified; sequences with a value lower than this are defined as unclassified.
Default: 90
--min_ref_coverage
Type: number | Optional
Minimum coverage value to define a sequence as classified; sequences with a coverage value lower than this are defined as unclassified. Use this option if you expect reads whose lengths are similar to the references' lengths.
Default: 0
Antimicrobial Resistance Options
--amr
Type: boolean | Optional
Scan reads for antimicrobial resistance or virulence genes
Reads will be scanned using abricate and the chosen database (
--amr_db) to identify any acquired antimicrobial resistance or virulence genes found present in the dataset. NOTE: It cannot identify mutational resistance genes.
Default: False
--amr_db
Type: string | Optional
Database of antimicrobial resistance or virulence genes to use.
Default: resfinder
Allowed values:
resfinderecoli_vfplasmidfindercardargannotvfdbncbimegaresecoh
--amr_minid
Type: integer | Optional
Threshold of required identity to report a match between a gene in the database and fastq reads. Valid interval: 0-100
Default: 80
--amr_mincov
Type: integer | Optional
Minimum coverage (breadth-of) threshold required to report a match between a gene in the database and fastq reads. Valid interval: 0-100.
Default: 80
Report Options
--abundance_threshold
Type: number | Optional
Remove those taxa whose abundance is equal or lower than the chosen value.
To remove taxa with abundances lower than or equal to a relative value (compared to the total number of reads) use a decimal between 0-1 (1 not inclusive). To remove taxa with abundances lower than or equal to an absolute value, provide a number larger or equal to 1.
Default: 0
--n_taxa_barplot
Type: integer | Optional
Number of most abundant taxa to be displayed in the barplot. The rest of taxa will be grouped under the "Other" category.
Default: 9
Output Options
--out_dir
Type: string | Optional | Format: directory-path
Directory for output of all user-facing files.
Default: output
--igv
Type: boolean | Optional
Enable IGV visualisation in the EPI2ME Desktop Application by creating the required files. This will cause the workflow to emit the BAM files as well. If using a custom reference, this must be a FASTA file and not a minimap2 MMI format index.
Default: False
--include_read_assignments
Type: boolean | Optional
A per sample TSV file that indicates the taxonomy assigned to each sequence.
Default: False
--output_unclassified
Type: boolean | Optional
Output a FASTQ of the unclassified reads.
Default: False
Advanced Options
--min_len
Type: integer | Optional
Specify read length lower limit.
Any reads shorter than this limit will not be included in the analysis.
Default: 0
--min_read_qual
Type: number | Optional
Specify read quality lower limit.
Any reads with a quality lower than this limit will not be included in the analysis.
--max_len
Type: integer | Optional
Specify read length upper limit
Any reads longer than this limit will not be included in the analysis.
--threads
Type: integer | Optional
Maximum number of CPU threads to use in each parallel workflow task.
Several tasks in this workflow benefit from using multiple CPU threads. This option sets the number of CPU threads for all such processes.
Default: 4
Miscellaneous Options
--disable_ping
Type: boolean | Optional
Enable to prevent sending a workflow ping.
Default: False
--help
Type: boolean | Optional
Default: False
--version
Type: boolean | Optional
Display version and exit.
Default: False
This pipeline was built with Nextflow. Documentation generated by nf-docs v0.1.0 on 2026-01-23 17:27:31 UTC.