epi2me-labs/wf-metagenomics
Version: v2.14.2 · Taxonomic classification of metagenomic sequencing data.
Inputs
Other
| Name | Description | Type | Default | Required |
|---|---|---|---|---|
| `--aws_image_prefix` | n/a | string | n/a | no |
| `--aws_queue` | n/a | string | n/a | no |
| `--monochrome_logs` | n/a | boolean | n/a | no |
| `--validate_params` | n/a | boolean | True | no |
| `--show_hidden_params` | n/a | boolean | n/a | no |
Input Options
| Name | Description | Type | Default | Required |
|---|---|---|---|---|
| `--fastq` | FASTQ files to use in the analysis. | string | n/a | no |
| `--bam` | BAM or unaligned BAM (uBAM) files to use in the analysis. | string | n/a | no |
| `--classifier` | Kraken2 or Minimap2 workflow to be used for classification of reads. | string | kraken2 | no |
| `--analyse_unclassified` | Analyse unclassified reads from input directory. By default the workflow will not process reads in the unclassified directory. | boolean | False | no |
| `--exclude_host` | A FASTA or MMI file of the host reference. Reads that align with this reference will be excluded from the analysis. | string | n/a | no |
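As a rough illustration of how the options above reach the workflow, the following Python sketch assembles a `nextflow run` command line from a dictionary of parameters. The helper name `build_command` and the chosen values are assumptions made for illustration; only the parameter names and the demo path `wf-metagenomics-demo/test_data` come from this page.

```python
import shlex


def build_command(params, workflow="epi2me-labs/wf-metagenomics"):
    """Return an argv list for `nextflow run` (hypothetical helper)."""
    argv = ["nextflow", "run", workflow]
    for name, value in params.items():
        if value is None:
            continue  # unset options keep the workflow default
        argv.append(f"--{name}")
        if value is True:
            continue  # a bare flag is read by Nextflow as true
        # False is passed explicitly so that a True default can be switched off
        argv.append("false" if value is False else str(value))
    return argv


cmd = build_command({
    "fastq": "wf-metagenomics-demo/test_data",
    "classifier": "minimap2",
    "exclude_host": None,
    "analyse_unclassified": True,
})
print(shlex.join(cmd))
```

Parameters set to `None` are omitted, so the defaults listed in the tables apply to them.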
Sample Options
| Name | Description | Type | Default | Required |
|---|---|---|---|---|
| `--sample_sheet` | A CSV file used to map barcodes to sample aliases. The sample sheet can be provided when the input data is a directory containing sub-directories with FASTQ files. | string | n/a | no |
| `--sample` | A single sample name for non-multiplexed data. Permissible if passing a single .fastq(.gz) file or directory of .fastq(.gz) files. | string | n/a | no |
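The sample sheet maps barcodes to sample aliases. The sketch below writes and re-reads such a CSV with Python's standard `csv` module. The column names `barcode` and `alias` are an assumption, chosen to match the metamap keys listed for `create_metamap` under Functions; the barcode and alias values are invented for illustration.

```python
import csv
import io

# Illustrative rows; real barcodes and aliases depend on the sequencing run.
rows = [
    {"barcode": "barcode01", "alias": "sample_A"},
    {"barcode": "barcode02", "alias": "sample_B"},
]


def write_sample_sheet(rows, handle):
    """Write rows as CSV with an assumed `barcode,alias` header."""
    writer = csv.DictWriter(handle, fieldnames=["barcode", "alias"], lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)


def read_alias_map(handle):
    """Return {barcode: alias}, rejecting duplicated barcodes."""
    mapping = {}
    for row in csv.DictReader(handle):
        if row["barcode"] in mapping:
            raise ValueError(f"duplicate barcode: {row['barcode']}")
        mapping[row["barcode"]] = row["alias"]
    return mapping


buffer = io.StringIO()
write_sample_sheet(rows, buffer)
buffer.seek(0)
alias_map = read_alias_map(buffer)
print(alias_map)
```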
Reference Options
| Name | Description | Type | Default | Required |
|---|---|---|---|---|
| `--database_set` | Sets the reference, databases and taxonomy datasets that will be used for classifying reads. Choices: ['ncbi_16s_18s', 'ncbi_16s_18s_28s_ITS', 'SILVA_138_1', 'Greengenes2_plus', 'Standard-8', 'PlusPF-8', 'PlusPFP-8']. Memory requirement will be slightly higher than the size of the database. Standard-8, PlusPF-8 and PlusPFP-8 databases require more than 8GB and are only available in the kraken2 approach. | string | Standard-8 | no |
| `--store_dir` | Where to store initial download of database. | string | store_dir | no |
| `--database` | Not required but can be used to specifically override Kraken2 database [.tar.gz or Directory]. | string | n/a | no |
| `--taxonomy` | Not required but can be used to specifically override taxonomy database. Change the default to use a different taxonomy file [.tar.gz or directory]. | string | n/a | no |
| `--reference` | Override the FASTA reference file selected by the database_set parameter. It can be a FASTA format reference sequence collection or a minimap2 MMI format index. | string | n/a | no |
| `--ref2taxid` | Not required but can be used to specify a ref2taxid mapping. Format is .tsv (refname taxid), no header row. | string | n/a | no |
| `--taxonomic_rank` | Returns results at the taxonomic rank chosen. In the Kraken2 pipeline: set the level that Bracken will estimate abundance at. Default: S (species). Other possible options are P (phylum), C (class), O (order), F (family), and G (genus). | string | S | no |
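The `--ref2taxid` file is described above as a headerless TSV with two columns (refname, taxid). A minimal parsing sketch under that description follows; the function name `parse_ref2taxid` and the reference names are hypothetical, and treating taxids as integers is an assumption.

```python
import io


def parse_ref2taxid(handle):
    """Parse a headerless two-column TSV into {refname: taxid}."""
    mapping = {}
    for line_number, line in enumerate(handle, start=1):
        line = line.rstrip("\r\n")
        if not line:
            continue  # tolerate blank lines
        fields = line.split("\t")
        if len(fields) != 2:
            raise ValueError(
                f"line {line_number}: expected 2 tab-separated fields, got {len(fields)}"
            )
        refname, taxid = fields
        mapping[refname] = int(taxid)  # assumption: taxids are integers
    return mapping


# Invented reference names; the taxids reuse the example values on this page.
example = io.StringIO("ref_seq_1\t9606\nref_seq_2\t1404\n")
ref2taxid = parse_ref2taxid(example)
print(ref2taxid)
```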
Kraken2 Options
| Name | Description | Type | Default | Required |
|---|---|---|---|---|
| `--bracken_length` | Set the length value Bracken will use. | integer | n/a | no |
| `--bracken_threshold` | Set the minimum read threshold Bracken will use to consider a taxon. | integer | 10 | no |
| `--kraken2_memory_mapping` | Avoids loading database into RAM. | boolean | False | no |
| `--kraken2_confidence` | Kraken2 Confidence score threshold. Default: 0.0. Valid interval: 0-1 | number | 0.0 | no |
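How the workflow passes these options to Kraken2 internally is not documented on this page. As a hedged sketch, the hypothetical function `kraken2_args` below maps `--kraken2_confidence`, `--kraken2_memory_mapping` and `--threads` onto the corresponding flags of the `kraken2` command-line tool (`--confidence`, `--memory-mapping`, `--threads`, `--db`) and rejects confidence values outside the documented 0-1 interval.

```python
def kraken2_args(database, threads=4, confidence=0.0, memory_mapping=False):
    """Build an illustrative kraken2 argv; not the workflow's actual invocation."""
    if not 0.0 <= confidence <= 1.0:
        raise ValueError("kraken2_confidence must lie in the interval 0-1")
    args = [
        "kraken2",
        "--db", str(database),
        "--threads", str(threads),
        "--confidence", str(confidence),
    ]
    if memory_mapping:
        # kraken2 then reads the database from disk instead of loading it into RAM
        args.append("--memory-mapping")
    return args


print(kraken2_args("path/to/database", memory_mapping=True))
```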
Minimap2 Options
| Name | Description | Type | Default | Required |
|---|---|---|---|---|
| `--minimap2filter` | Filter output of minimap2 by taxids inc. child nodes, E.g. "9606,1404" | string | n/a | no |
| `--minimap2exclude` | Invert minimap2filter and exclude the given taxids instead | boolean | False | no |
| `--keep_bam` | Copy bam files into the output directory. | boolean | False | no |
| `--minimap2_by_reference` | Add a table with the mean sequencing depth per reference, standard deviation and coefficient of variation. It adds a scatterplot of the sequencing depth vs. the coverage and a heatmap showing the depth per percentile to the report. | boolean | False | no |
| `--min_percent_identity` | Minimum percentage of identity with the matched reference to define a sequence as classified; sequences with a value lower than this are defined as unclassified. | number | 90 | no |
| `--min_ref_coverage` | Minimum coverage value to define a sequence as classified; sequences with a coverage value lower than this are defined as unclassified. Use this option if you expect reads whose lengths are similar to the references' lengths. | number | 0 | no |
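The two thresholds `--min_percent_identity` and `--min_ref_coverage` can be read as a simple decision rule: a sequence stays classified only if neither value falls below its threshold. The sketch below states that rule in Python. The function name is hypothetical, and expressing coverage as a percentage is an assumption, since this page does not give the unit.

```python
def is_classified(percent_identity, ref_coverage,
                  min_percent_identity=90, min_ref_coverage=0):
    """Return True if an alignment meets both thresholds (defaults from the table)."""
    return (percent_identity >= min_percent_identity
            and ref_coverage >= min_ref_coverage)


print(is_classified(95.2, 40.0))                       # passes the defaults
print(is_classified(88.0, 40.0))                       # identity below 90
print(is_classified(95.2, 40.0, min_ref_coverage=80))  # coverage below 80
```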
Antimicrobial Resistance Options
| Name | Description | Type | Default | Required |
|---|---|---|---|---|
| `--amr` | Scan reads for antimicrobial resistance or virulence genes | boolean | False | no |
| `--amr_db` | Database of antimicrobial resistance or virulence genes to use. | string | resfinder | no |
| `--amr_minid` | Threshold of required identity to report a match between a gene in the database and fastq reads. Valid interval: 0-100 | integer | 80 | no |
| `--amr_mincov` | Minimum coverage (breadth-of) threshold required to report a match between a gene in the database and fastq reads. Valid interval: 0-100. | integer | 80 | no |
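To make the effect of `--amr_minid` and `--amr_mincov` concrete, the following sketch filters a list of candidate gene matches. The record layout (`gene`, `identity`, `coverage`), the gene names, and the inclusive comparison are assumptions for illustration; the 0-100 interval check mirrors the valid interval stated in the table.

```python
def check_percent(name, value):
    """Reject values outside the documented 0-100 interval."""
    if not 0 <= value <= 100:
        raise ValueError(f"{name} must lie in the interval 0-100, got {value}")
    return value


def filter_amr_hits(hits, amr_minid=80, amr_mincov=80):
    """Keep hits whose identity and coverage both reach the thresholds."""
    check_percent("amr_minid", amr_minid)
    check_percent("amr_mincov", amr_mincov)
    return [
        hit for hit in hits
        if hit["identity"] >= amr_minid and hit["coverage"] >= amr_mincov
    ]


# Invented example records.
hits = [
    {"gene": "gene_1", "identity": 99.1, "coverage": 100.0},
    {"gene": "gene_2", "identity": 79.5, "coverage": 95.0},
    {"gene": "gene_3", "identity": 92.0, "coverage": 60.0},
]
reported = filter_amr_hits(hits)
print([hit["gene"] for hit in reported])
```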
Report Options
| Name | Description | Type | Default | Required |
|---|---|---|---|---|
| `--abundance_threshold` | Remove taxa whose abundance is equal to or lower than the chosen value. | number | 0 | no |
| `--n_taxa_barplot` | Number of most abundant taxa to be displayed in the barplot. The remaining taxa will be grouped under the "Other" category. | integer | 9 | no |
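The two report options combine naturally: taxa at or below the abundance threshold are dropped, then the most abundant taxa are kept and the remainder is summed into "Other". The sketch below follows that reading using plain read counts. The function name and the example counts are hypothetical, and the report itself may compute this differently.

```python
def summarise_for_barplot(counts, abundance_threshold=0, n_taxa_barplot=9):
    """Apply the threshold, keep the top taxa, and group the rest as 'Other'."""
    kept = {taxon: n for taxon, n in counts.items() if n > abundance_threshold}
    ranked = sorted(kept.items(), key=lambda item: item[1], reverse=True)
    summary = dict(ranked[:n_taxa_barplot])
    remainder = sum(n for _, n in ranked[n_taxa_barplot:])
    if remainder:
        summary["Other"] = remainder
    return summary


counts = {"taxon_A": 50, "taxon_B": 30, "taxon_C": 5, "taxon_D": 1, "taxon_E": 0}
print(summarise_for_barplot(counts, abundance_threshold=1, n_taxa_barplot=2))
```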
Output Options
| Name | Description | Type | Default | Required |
|---|---|---|---|---|
| `--out_dir` | Directory for output of all user-facing files. | string | output | no |
| `--igv` | Enable IGV visualisation in the EPI2ME Desktop Application by creating the required files. This will cause the workflow to emit the BAM files as well. If using a custom reference, this must be a FASTA file and not a minimap2 MMI format index. | boolean | False | no |
| `--include_read_assignments` | A per sample TSV file that indicates the taxonomy assigned to each sequence. | boolean | False | no |
| `--output_unclassified` | Output a FASTQ of the unclassified reads. | boolean | False | no |
Advanced Options
| Name | Description | Type | Default | Required |
|---|---|---|---|---|
| `--min_len` | Specify read length lower limit. | integer | 0 | no |
| `--min_read_qual` | Specify read quality lower limit. | number | n/a | no |
| `--max_len` | Specify read length upper limit. | integer | n/a | no |
| `--threads` | Maximum number of CPU threads to use in each parallel workflow task. | integer | 4 | no |
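The read-level limits `--min_len`, `--max_len` and `--min_read_qual` amount to a per-read predicate. A minimal sketch follows, assuming the limits are inclusive and that an unset limit (`n/a` in the table) means "no limit"; the function name is hypothetical.

```python
def passes_filters(length, mean_quality, min_len=0, max_len=None, min_read_qual=None):
    """Return True if a read satisfies every limit that is set."""
    if length < min_len:
        return False
    if max_len is not None and length > max_len:
        return False
    if min_read_qual is not None and mean_quality < min_read_qual:
        return False
    return True


print(passes_filters(1500, 12.0, min_len=1000, max_len=2000, min_read_qual=10))
print(passes_filters(500, 12.0, min_len=1000))
```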
Miscellaneous Options
| Name | Description | Type | Default | Required |
|---|---|---|---|---|
| `--disable_ping` | Enable to prevent sending a workflow ping. | boolean | False | no |
| `--help` | n/a | boolean | False | no |
| `--version` | Display version and exit. | boolean | False | no |
Configuration
| Name | Type | Default | Description |
|---|---|---|---|
| `database_sets.ncbi_16s_18s.reference` | string | s3://ont-open-data/workflow-databases/wf-metagenomics-dbs/ncbi_16s_18s/ncbi_targeted_loci_16s_18s.fna | n/a |
| `database_sets.ncbi_16s_18s.database` | string | s3://ont-open-data/workflow-databases/wf-metagenomics-dbs/ncbi_16s_18s/ncbi_targeted_loci_kraken2.tar.gz | n/a |
| `database_sets.ncbi_16s_18s.ref2taxid` | string | s3://ont-open-data/workflow-databases/wf-metagenomics-dbs/ncbi_16s_18s/ref2taxid.targloci.tsv | n/a |
| `database_sets.ncbi_16s_18s.taxonomy` | string | s3://ont-open-data/workflow-databases/wf-metagenomics-dbs/ncbi_16s_18s/new_taxdump_2025-01-01.zip | n/a |
| `database_sets.ncbi_16s_18s_28s_ITS.reference` | string | s3://ont-open-data/workflow-databases/wf-metagenomics-dbs/ncbi_16s_18s_28s_ITS/ncbi_16s_18s_28s_ITS.fna | n/a |
| `database_sets.ncbi_16s_18s_28s_ITS.database` | string | s3://ont-open-data/workflow-databases/wf-metagenomics-dbs/ncbi_16s_18s_28s_ITS/ncbi_16s_18s_28s_ITS_kraken2.tar.gz | n/a |
| `database_sets.ncbi_16s_18s_28s_ITS.ref2taxid` | string | s3://ont-open-data/workflow-databases/wf-metagenomics-dbs/ncbi_16s_18s_28s_ITS/ref2taxid.ncbi_16s_18s_28s_ITS.tsv | n/a |
| `database_sets.ncbi_16s_18s_28s_ITS.taxonomy` | string | s3://ont-open-data/workflow-databases/wf-metagenomics-dbs/ncbi_16s_18s_28s_ITS/new_taxdump_2025-01-01.zip | n/a |
| `database_sets.SILVA_138_1.reference` | string | s3://ont-open-data/workflow-databases/wf-metagenomics-dbs/SILVA_138_1/silva.fna | n/a |
| `database_sets.SILVA_138_1.database` | string | s3://ont-open-data/workflow-databases/wf-metagenomics-dbs/SILVA_138_1/kraken2.tar.gz | n/a |
| `database_sets.SILVA_138_1.ref2taxid` | string | s3://ont-open-data/workflow-databases/wf-metagenomics-dbs/SILVA_138_1/seqid2taxid.map | n/a |
| `database_sets.SILVA_138_1.taxonomy` | string | s3://ont-open-data/workflow-databases/wf-metagenomics-dbs/SILVA_138_1/taxonomy.tar.gz | n/a |
| `database_sets.Greengenes2_plus.reference` | string | s3://ont-open-data/workflow-databases/wf-metagenomics-dbs/Greengenes2_plus/sequences_mm2format.fasta | n/a |
| `database_sets.Greengenes2_plus.database` | string | s3://ont-open-data/workflow-databases/wf-metagenomics-dbs/Greengenes2_plus/kraken2.tar.gz | n/a |
| `database_sets.Greengenes2_plus.ref2taxid` | string | s3://ont-open-data/workflow-databases/wf-metagenomics-dbs/Greengenes2_plus/sequences_mm2format.taxid.map | n/a |
| `database_sets.Greengenes2_plus.taxonomy` | string | s3://ont-open-data/workflow-databases/wf-metagenomics-dbs/Greengenes2_plus/taxdump.tar.gz | n/a |
| `schema_ignore_params` | string | show_hidden_params,validate_params,monochrome_logs,aws_queue,aws_image_prefix,database_sets,wf | n/a |
| `wf.example_cmd` | array | `["--fastq \\'wf-metagenomics-demo/test_data\\'"]` | n/a |
| `wf.agent` | string | n/a | n/a |
| `wf.container_sha` | string | sha38db033c15c74ae7cac3c609fdfe2c8d2a53ef6f | n/a |
| `wf.common_sha` | string | shafdd79f8e4a6faad77513c36f623693977b92b08e | n/a |
| `wf.container_sha_amr` | string | shad8ebf2fc3b15d43612df71170bdd4d8669fe1731 | n/a |
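The `database_sets` entries above supply default resources per database set, while `--reference`, `--database`, `--ref2taxid` and `--taxonomy` are documented as overrides. The sketch below shows one way such a lookup with overrides could behave, including the documented restriction that Standard-8, PlusPF-8 and PlusPFP-8 are only available with kraken2. The function `resolve_sources` is hypothetical and only one database set is spelled out; the workflow's real resolution logic is not shown on this page.

```python
KRAKEN2_ONLY = {"Standard-8", "PlusPF-8", "PlusPFP-8"}
BASE = "s3://ont-open-data/workflow-databases/wf-metagenomics-dbs"

# One entry copied from the configuration table; the others follow the same shape.
DATABASE_SETS = {
    "ncbi_16s_18s": {
        "reference": f"{BASE}/ncbi_16s_18s/ncbi_targeted_loci_16s_18s.fna",
        "database": f"{BASE}/ncbi_16s_18s/ncbi_targeted_loci_kraken2.tar.gz",
        "ref2taxid": f"{BASE}/ncbi_16s_18s/ref2taxid.targloci.tsv",
        "taxonomy": f"{BASE}/ncbi_16s_18s/new_taxdump_2025-01-01.zip",
    },
}


def resolve_sources(database_set, classifier="kraken2", **overrides):
    """Return the resources for a database set, with user overrides applied."""
    if classifier == "minimap2" and database_set in KRAKEN2_ONLY:
        raise ValueError(f"{database_set} is only available with the kraken2 classifier")
    sources = dict(DATABASE_SETS.get(database_set, {}))
    for key in ("reference", "database", "ref2taxid", "taxonomy"):
        if overrides.get(key) is not None:
            sources[key] = overrides[key]  # an explicit option wins over the default
    return sources


sources = resolve_sources("ncbi_16s_18s", classifier="minimap2", reference="my_refs.fasta")
print(sources["reference"])
print(sources["taxonomy"])
```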
Workflows
| Name | Description | Entry |
|---|---|---|
| (entry) | n/a | yes |
| `minimap_pipeline` | n/a | no |
| `kraken_pipeline` | n/a | no |
| `run_common` | n/a | no |
| `run_amr` | n/a | no |
minimap_pipeline Inputs
| Name | Description |
|---|---|
| `samples` | n/a |
| `reference` | n/a |
| `ref2taxid` | n/a |
| `taxonomy` | n/a |
| `taxonomic_rank` | n/a |
| `common_minimap2_opts` | n/a |
| `output_igv` | n/a |
minimap_pipeline Outputs
| Name | Description |
|---|---|
| `abundance_table` | n/a |
| `lineages` | n/a |
| `alignment_reports` | n/a |
| `metadata_after_taxonomy` | n/a |
kraken_pipeline Inputs
| Name | Description |
|---|---|
| `samples` | n/a |
| `taxonomy` | n/a |
| `database` | n/a |
| `bracken_length` | n/a |
| `taxonomic_rank` | n/a |
kraken_pipeline Outputs
| Name | Description |
|---|---|
| `abundance_table` | n/a |
| `lineages` | n/a |
| `metadata_after_taxonomy` | n/a |
run_common Inputs
| Name | Description |
|---|---|
| `samples` | n/a |
| `host_reference` | n/a |
| `common_minimap2_opts` | n/a |
run_common Outputs
| Name | Description |
|---|---|
| ? | n/a |
run_amr Inputs
| Name | Description |
|---|---|
| `input` | n/a |
| `amr_db` | n/a |
| `amr_minid` | n/a |
| `amr_mincov` | n/a |
run_amr Outputs
| Name | Description |
|---|---|
| `reports` | n/a |
Processes
| Name | Description |
|---|---|
| `minimap` | n/a |
| `minimapTaxonomy` | n/a |
| `extractMinimap2Reads` | n/a |
| `getAlignmentStats` | n/a |
| `run_kraken2` | n/a |
| `run_bracken` | n/a |
| `output_kraken2_read_assignments` | n/a |
| `exclude_host_reads` | n/a |
| `fastcat` | n/a |
| `checkBamHeaders` | n/a |
| `validateIndex` | n/a |
| `mergeBams` | n/a |
| `catSortBams` | n/a |
| `sortBam` | n/a |
| `bamstats` | n/a |
| `move_or_compress_fq_file` | n/a |
| `split_fq_file` | n/a |
| `validate_sample_sheet` | Python script for validating a sample sheet. The script will write messages to STDOUT if the sample sheet is invalid. In case there are no issues, no message is emitted. The sample sheet will be published to the output dir. |
| `samtools_index` | n/a |
| `getParams` | n/a |
| `configure_igv` | n/a |
| `abricate` | n/a |
| `abricate_json` | n/a |
| `filter_references` | n/a |
| `abricateVersion` | n/a |
| `getVersions` | n/a |
| `getVersionsCommon` | n/a |
| `createAbundanceTables` | n/a |
| `publishReads` | n/a |
| `publish` | n/a |
| `makeReport` | n/a |
minimap Inputs
| Name | Type | Description |
|---|---|---|
| `val(meta), path(concat_seqs), path(stats)` | tuple | n/a |
minimap Outputs
| Name | Type | Emit | Description |
|---|---|---|---|
| `val(meta)` | tuple | n/a | n/a |
minimapTaxonomy Inputs
| Name | Type | Description |
|---|---|---|
| `val(meta), path(assignments)` | tuple | n/a |
minimapTaxonomy Outputs
| Name | Type | Emit | Description |
|---|---|---|---|
| `val(meta)` | tuple | n/a | n/a |
extractMinimap2Reads Inputs
| Name | Type | Description |
|---|---|---|
| `val(meta), path("alignment.bam"), path("alignment.bai"), path("bamstats"), val(n_unmapped)` | tuple | n/a |
extractMinimap2Reads Outputs
| Name | Type | Emit | Description |
|---|---|---|---|
| `val(meta)` | tuple | n/a | n/a |
getAlignmentStats Inputs
| Name | Type | Description |
|---|---|---|
| `val(meta), path("input.bam"), path("input.bam.bai"), path("bamstats")` | tuple | n/a |
getAlignmentStats Outputs
| Name | Type | Emit | Description |
|---|---|---|---|
| `${meta.alias` | path | n/a | n/a |
run_kraken2 Inputs
| Name | Type | Description |
|---|---|---|
| `val(meta), path("reads.fq.gz"), path(fastq_stats)` | tuple | n/a |
run_kraken2 Outputs
| Name | Type | Emit | Description |
|---|---|---|---|
| `val(meta)` | tuple | n/a | n/a |
run_bracken Inputs
| Name | Type | Description |
|---|---|---|
| `val(meta), path("kraken2.report"), path("kraken2.assignments.tsv")` | tuple | n/a |
| `bracken_length.txt` | path | n/a |
run_bracken Outputs
| Name | Type | Emit | Description |
|---|---|---|---|
| `val(meta)` | tuple | n/a | n/a |
output_kraken2_read_assignments Inputs
| Name | Type | Description |
|---|---|---|
| `val(meta)` | tuple | n/a |
output_kraken2_read_assignments Outputs
| Name | Type | Emit | Description |
|---|---|---|---|
| `val(meta), path("*_lineages.kraken2.assignments.tsv")` | tuple | n/a | n/a |
exclude_host_reads Inputs
| Name | Type | Description |
|---|---|---|
| `val(meta), path(concat_seqs), path(fastcat_stats)` | tuple | n/a |
exclude_host_reads Outputs
| Name | Type | Emit | Description |
|---|---|---|---|
| `val(meta), path("*.unmapped.fastq.gz"), path("stats_unmapped"), env(n_seqs_passed_host_depletion)` | tuple | fastq | n/a |
| `val(meta), path("*.host.bam"), path("*.host.bam.bai")` | tuple | host_bam | n/a |
| `val(meta), path("*.unmapped.bam"), path("*.unmapped.bam.bai")` | tuple | no_host_bam | n/a |
fastcat Inputs
| Name | Type | Description |
|---|---|---|
| `val(meta), path(input_src, stageAs: "input_src")` | tuple | n/a |
fastcat Outputs
| Name | Type | Emit | Description |
|---|---|---|---|
| `val(meta), path("fastq_chunks/*.fastq.gz"), path("fastcat_stats")` | tuple | n/a | n/a |
checkBamHeaders Inputs
| Name | Type | Description |
|---|---|---|
| `val(meta), path("input_dir/reads*.bam")` | tuple | n/a |
checkBamHeaders Outputs
| Name | Type | Emit | Description |
|---|---|---|---|
| `val(meta), path("input_dir/reads*.bam", includeInputs: true), env(IS_UNALIGNED), env(MIXED_HEADERS), env(IS_SORTED)` | tuple | n/a | n/a |
validateIndex Inputs
| Name | Type | Description |
|---|---|---|
| `val(meta), path("reads.bam"), path("reads.bam.bai")` | tuple | n/a |
validateIndex Outputs
| Name | Type | Emit | Description |
|---|---|---|---|
| `val(meta), path("reads.bam", includeInputs: true), path("reads.bam.bai", includeInputs: true), env(HAS_VALID_INDEX)` | tuple | n/a | n/a |
mergeBams Inputs
| Name | Type | Description |
|---|---|---|
| `val(meta), path("input_bams/reads*.bam")` | tuple | n/a |
mergeBams Outputs
| Name | Type | Emit | Description |
|---|---|---|---|
| `val(meta), path("reads.bam"), path("reads.bam.bai")` | tuple | n/a | n/a |
catSortBams Inputs
| Name | Type | Description |
|---|---|---|
| `val(meta), path("input_bams/reads*.bam")` | tuple | n/a |
catSortBams Outputs
| Name | Type | Emit | Description |
|---|---|---|---|
| `val(meta), path("reads.bam"), path("reads.bam.bai")` | tuple | n/a | n/a |
sortBam Inputs
| Name | Type | Description |
|---|---|---|
| `val(meta), path("reads.bam")` | tuple | n/a |
sortBam Outputs
| Name | Type | Emit | Description |
|---|---|---|---|
| `val(meta), path("reads.sorted.bam"), path("reads.sorted.bam.bai")` | tuple | n/a | n/a |
bamstats Inputs
| Name | Type | Description |
|---|---|---|
| `val(meta), path("reads.bam"), path("reads.bam.bai")` | tuple | n/a |
bamstats Outputs
| Name | Type | Emit | Description |
|---|---|---|---|
| `val(meta), path("reads.bam"), path("reads.bam.bai"), path("bamstats_results")` | tuple | n/a | n/a |
move_or_compress_fq_file Inputs
| Name | Type | Description |
|---|---|---|
| `val(meta), path(input)` | tuple | n/a |
move_or_compress_fq_file Outputs
| Name | Type | Emit | Description |
|---|---|---|---|
| `val(meta), path("seqs.fastq.gz")` | tuple | n/a | n/a |
split_fq_file Inputs
| Name | Type | Description |
|---|---|---|
| `val(meta), path(input)` | tuple | n/a |
split_fq_file Outputs
| Name | Type | Emit | Description |
|---|---|---|---|
| `val(meta), path("fastq_chunks/*.fastq.gz")` | tuple | n/a | n/a |
validate_sample_sheet Inputs
| Name | Type | Description |
|---|---|---|
| `sample_sheet.csv` | path | n/a |
validate_sample_sheet Outputs
| Name | Type | Emit | Description |
|---|---|---|---|
| `path("sample_sheet.csv")` | tuple | n/a | n/a |
samtools_index Inputs
| Name | Type | Description |
|---|---|---|
| `val(meta), path("reads.bam")` | tuple | n/a |
samtools_index Outputs
| Name | Type | Emit | Description |
|---|---|---|---|
| `val(meta), path("reads.bam"), path("reads.bam.bai")` | tuple | n/a | n/a |
getParams Outputs
| Name | Type | Emit | Description |
|---|---|---|---|
| `params.json` | path | n/a | n/a |
configure_igv Inputs
| Name | Type | Description |
|---|---|---|
| `file-names.txt` | path | n/a |
configure_igv Outputs
| Name | Type | Emit | Description |
|---|---|---|---|
| `igv.json` | path | n/a | n/a |
abricate Inputs
| Name | Type | Description |
|---|---|---|
| `val(meta), path("input_reads.fastq.gz"), path("stats/")` | tuple | n/a |
abricate Outputs
| Name | Type | Emit | Description |
|---|---|---|---|
| `val(meta)` | tuple | n/a | n/a |
abricate_json Inputs
| Name | Type | Description |
|---|---|---|
| `val(meta)` | tuple | n/a |
abricate_json Outputs
| Name | Type | Emit | Description |
|---|---|---|---|
| `val(meta)` | tuple | n/a | n/a |
filter_references Inputs
| Name | Type | Description |
|---|---|---|
| `bam_flagstats/*` | path | n/a |
filter_references Outputs
| Name | Type | Emit | Description |
|---|---|---|---|
| `path("reduced_reference.fasta.gz"), path("reduced_reference.fasta.gz.fai"), path("reduced_reference.fasta.gz.gzi")` | tuple | n/a | n/a |
abricateVersion Inputs
| Name | Type | Description |
|---|---|---|
| `input_versions.txt` | path | n/a |
abricateVersion Outputs
| Name | Type | Emit | Description |
|---|---|---|---|
| `versions.txt` | path | n/a | n/a |
getVersions Outputs
| Name | Type | Emit | Description |
|---|---|---|---|
| `versions.txt` | path | n/a | n/a |
getVersionsCommon Inputs
| Name | Type | Description |
|---|---|---|
| `versions.txt` | path | n/a |
getVersionsCommon Outputs
| Name | Type | Emit | Description |
|---|---|---|---|
| `versions_all.txt` | path | n/a | n/a |
createAbundanceTables Inputs
| Name | Type | Description |
|---|---|---|
| `lineages/*` | path | n/a |
createAbundanceTables Outputs
| Name | Type | Emit | Description |
|---|---|---|---|
| `abundance_table_*.tsv` | path | abundance_tsv | n/a |
publishReads Inputs
| Name | Type | Description |
|---|---|---|
| `val(meta), path("reads.fq.gz"), path("ids.txt")` | tuple | n/a |
publishReads Outputs
| Name | Type | Emit | Description |
|---|---|---|---|
| `${meta.alias` | path | n/a | n/a |
publish Inputs
| Name | Type | Description |
|---|---|---|
| `path(fname), val(dirname)` | tuple | n/a |
publish Outputs
| Name | Type | Emit | Description |
|---|---|---|---|
| `fname` | path | n/a | n/a |
makeReport Inputs
| Name | Type | Description |
|---|---|---|
| `abundance_table.tsv` | path | n/a |
| `alignment_stats/*` | path | n/a |
| `lineages/*` | path | n/a |
| `versions/*` | path | n/a |
| `params.json` | path | n/a |
| `amr/*` | path | n/a |
makeReport Outputs
| Name | Type | Emit | Description |
|---|---|---|---|
| `${report_name` | path | n/a | n/a |
Functions
| Name | Parameters | Returns | Description |
|---|---|---|---|
| `is_target_file` | file, extensions | n/a | Check if a file ends with one of the target extensions. |
| `is_excluded` | p, margs | n/a | Check if a file path is flagged for exclusion. |
| `add_run_IDs_and_basecall_models_to_meta` | ch, allow_multiple_basecall_models | n/a | Take a channel of the shape [meta, reads, path-to-stats-dir \| null] (or [meta, [reads, index], path-to-stats-dir \| null] in the case of XAM) and extract the run IDs and basecall model, from the run_ids and basecaller files in the stats directory, into the metamap. If the path to the stats dir is null, add an empty list. |
| `add_number_of_reads_to_meta` | ch, input_type_format | n/a | Take a channel of the shape [meta, reads, path-to-stats-dir \| null] and do the following: - For fastcat, extract the number of reads from the n_seqs file. - For bamstats, extract the number of primary alignments and unmapped reads from the bamstats.flagstat.tsv file. Then, add these metrics to the meta map. If the path to the stats dir is null, set the values to 0 when adding them. If not set to 'fastq', input is assumed to be 'bam'. |
| `fastq_ingress` | arguments | n/a | Take a map of input arguments, find valid FASTQ inputs, and return a channel with elements of [metamap, seqs.fastq.gz \| null, path-to-fastcat-stats \| null]. The second item is null for sample sheet entries without a matching barcode directory. The last item is null if fastcat was not run (it is only run on directories containing more than one FASTQ file or when stats: true). - "input": path to either: (i) input FASTQ file, (ii) top-level directory containing FASTQ files, (iii) directory containing sub-directories which contain FASTQ files - "sample": string to name single sample - "sample_sheet": path to CSV sample sheet - "analyse_unclassified": boolean. Whether to ingress unclassified (failed to demux) reads - "analyse_fail": boolean. Whether to ingress any sequence files contained in *_fail directories. - "stats": boolean whether to write the fastcat stats - "fastcat_extra_args": string with extra arguments to pass to fastcat - "required_sample_types": list of zero or more required sample types expected to be present in the sample sheet - "per_read_stats": boolean. If true, output a bgzipped TSV containing a summary of each read to fastcat_stats/per-read-stats.tsv.gz. - "fastq_chunk": null or a number of reads to place into chunked FASTQ files - "allow_multiple_basecall_models": emit data of samples that had more than one basecall model; if this is false, such samples will be emitted as [meta, null, null] The first element is a map with metadata, the second is the path to the .fastq.gz file with the (potentially concatenated) sequences and the third is the path to the directory with the fastcat statistics. The second element is null for sample sheet entries for which no corresponding barcode directory was found. The third element is null if fastcat was not run. |
| `xam_ingress` | arguments | n/a | Take a map of input arguments, find valid (u)BAM inputs, and return a channel with elements of [metamap, reads.bam \| null, path-to-bamstats-results \| null]. The second item is null for sample sheet entries without a matching barcode directory or samples containing only uBAM files when keep_unaligned is false. The last item is null if bamstats was not run (it is only run when stats: true). - "input": path to either: (i) input (u)BAM file, (ii) top-level directory containing (u)BAM files, (iii) directory containing sub-directories which contain (u)BAM files - "sample": string to name single sample - "sample_sheet": path to CSV sample sheet - "analyse_unclassified": boolean. Whether to ingress unclassified (failed to demux) reads - "analyse_fail": boolean. Whether to ingress any sequence files contained in *_fail directories. - "stats": boolean whether to run bamstats - "keep_unaligned": boolean whether to include uBAM files - "return_fastq": boolean whether to convert to FASTQ (this will always run fastcat) - "fastcat_extra_args": string with extra arguments to pass to fastcat - "required_sample_types": list of zero or more required sample types expected to be present in the sample sheet - "per_read_stats": boolean. If true, output a bgzipped TSV containing a summary - "fastq_chunk": null or a number of reads to place into chunked FASTQ files - "allow_multiple_basecall_models": boolean. If true, emit data of samples that had more than one basecall model; if this is false, such samples will be emitted as [meta, null, null] The first element is a map with metadata, the second is the path to the .bam file with the (potentially merged) sequences and the third is the path to the directory with the bamstats statistics. The second element is null for sample sheet entries for which no corresponding barcode directory was found and for samples with only uBAM files when keep_unaligned: false. The third element is null if bamstats was not run. |
| `parse_arguments` | func_name, arguments, extra_kwargs | n/a | Parse input arguments for fastq_ingress or xam_ingress. The extra keyword arguments allow the argument parsing to be tailored to a particular ingress function. |
| `get_valid_inputs` | margs, extensions | n/a | Find valid inputs based on the target extensions and return a branched channel with branches missing, files and dir, which are of the shape [metamap, input_path \| null] (with input_path pointing to a target file or a directory containing target files, respectively). missing contains sample sheet entries for which no corresponding barcodes were found. Checks whether the input is a single target file, a top-level directory with target files, or a directory containing sub-directories (usually barcodes) with target files. |
| `create_metamap` | arguments | n/a | Create a map that contains at least these keys: [alias, barcode, type]. alias is required, barcode and type are filled with default values if missing. Additional entries are allowed. |
| `get_target_files_in_dir` | dir, extensions, margs, recursive | n/a | Get all target files below this directory. |
| `get_sample_sheet` | sample_sheet, required_sample_types | n/a | Check the sample sheet and return a channel with its rows if it is valid. |
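The workflow's helper functions are written for Nextflow, and their source is not reproduced on this page. Purely as an illustration of the behaviour described for `is_target_file` and `create_metamap`, here is a Python paraphrase. The default values chosen for `barcode` and `type` are assumptions, as are the example file names and extensions.

```python
def is_target_file(file, extensions):
    """Return True if the file name ends with one of the target extensions."""
    return any(str(file).endswith(ext) for ext in extensions)


def create_metamap(arguments):
    """Return a map with at least alias, barcode and type; alias is required."""
    if "alias" not in arguments:
        raise ValueError("'alias' is required")
    # Hypothetical defaults; this page only says that defaults are filled in.
    defaults = {"barcode": None, "type": "test_sample"}
    return {**defaults, **arguments}  # additional entries are kept as given


print(is_target_file("reads.fastq.gz", [".fastq", ".fastq.gz"]))
print(create_metamap({"alias": "sample_A", "run": "demo"}))
```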
This pipeline was built with Nextflow. Documentation generated by nf-docs v0.2.0 on 2026-03-03 22:40:56 UTC.