Rust crate

Using FastQC as a library

The fastqc-rust crate exposes the same code that powers the fastqc binary, so you can run a full QC analysis from another Rust program, stream sequences through individual modules, or plug your own sequence reader into the existing pipeline.

Full API docs are on docs.rs/fastqc-rust.

Add the dependency

cargo add fastqc-rust

The package name is fastqc-rust, but the library is imported as fastqc_rust: Rust identifiers cannot contain hyphens, so Cargo replaces the hyphen with an underscore in the library target name used in use statements.

The native-zlib feature is on by default and links system zlib for faster gzip decompression. To produce a fully static, pure-Rust build, disable default features:

cargo add fastqc-rust --no-default-features

Basic usage

Open a sequence file, run every module over each record, then write the standard FastQC outputs. This is the same pipeline the binary follows for each input.

For the one-liner equivalent that also handles multiple files, threading, and CASAVA grouping, call runner::run(&config, &files) directly. FastQCConfig mirrors the CLI flags one-to-one; see the struct docs for every field.

use std::fs::File;
use std::path::Path;
use fastqc_rust::config::FastQCConfig;
use fastqc_rust::modules::create_modules;
use fastqc_rust::report;
use fastqc_rust::sequence::open_sequence_file;

fn main() -> std::io::Result<()> {
    // Default config matches the CLI; override fields for non-default flags.
    let config = FastQCConfig::default();
    // Pass/warn/fail thresholds from the embedded limits.txt.
    let limits = config.load_limits()?;

    // Input file. Relative paths resolve against the process's current
    // working directory; absolute paths also work.
    // Format is detected from the extension: FASTQ/BAM/SAM/Fast5,
    // plus .gz and .bz2 transparently.
    let path = Path::new("sample.fastq.gz");
    let mut seq_file = open_sequence_file(&config, path)?;

    // The standard QC modules in the order they appear in the report.
    let mut modules = create_modules(&config, &limits);

    // Stream every record through every module.
    while let Some(seq) = seq_file.next() {
        let seq = seq?;
        for module in modules.iter_mut() {
            module.process_sequence(&seq);
        }
    }
    // Each module computes its results once the stream is done.
    for module in modules.iter_mut() {
        module.finalize();
    }

    // The display name shown inside the reports (usually the file name).
    let name = "sample.fastq.gz";

    // Generate the output files
    // fastqc_data.txt — full per-module data tables.
    let mut data = File::create("sample_fastqc_data.txt")?;
    report::text::write_fastqc_data(&modules, &mut data)?;

    // summary.txt — one PASS/WARN/FAIL line per module.
    let mut summary = File::create("sample_summary.txt")?;
    report::text::write_summary(&modules, name, &mut summary)?;

    // Standalone HTML report with charts embedded as base64 PNG.
    // Reused below so the zip writer doesn't re-render the charts.
    let html = report::html::generate_html_report(&modules, name, config.template)?;
    std::fs::write("sample_fastqc.html", &html)?;

    // _fastqc.zip — bundles the HTML, text outputs, and chart images
    // (the format MultiQC and other downstream tools expect).
    report::archive::create_zip_archive(
        &modules,
        name,
        "sample",
        Path::new("sample_fastqc.zip"),
        &html,
        config.svg_output,
        config.template,
    )?;

    Ok(())
}

Advanced

Stream sequences through the modules

For programmatic access to module results without writing report files, drive the pipeline yourself. create_modules returns the standard set in report order; feed each Sequence to every module, then call finalize once the stream ends.

use std::path::Path;
use fastqc_rust::config::FastQCConfig;
use fastqc_rust::modules::create_modules;
use fastqc_rust::sequence::open_sequence_file;

fn analyse(path: &Path) -> std::io::Result<()> {
    let config = FastQCConfig::default();
    let limits = config.load_limits()?;
    let mut seq_file = open_sequence_file(&config, path)?;
    let mut modules = create_modules(&config, &limits);

    while let Some(seq) = seq_file.next() {
        let seq = seq?;
        for module in modules.iter_mut() {
            if seq.is_filtered && module.ignore_filtered_sequences() {
                continue;
            }
            module.process_sequence(&seq);
        }
    }

    for module in modules.iter_mut() {
        module.finalize();
    }

    for module in &modules {
        println!("{}: {}", module.name(), module.status());
    }
    Ok(())
}

Each QCModule exposes status(), raises_warning(), raises_error(), and write_text_report() on the finalised module. The full fastqc_data.txt body is produced by report::text::write_fastqc_data(&modules, writer).
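
Because the report writers take a generic writer, their output can be captured in memory rather than written to a file. The sketch below shows that pattern on its own. write_report is a hypothetical stand-in with the same parameter shape as write_text_report, not a function from the crate, and report_to_string is likewise an illustrative helper.

```rust
use std::io::{self, Write};

// Hypothetical stand-in for a module's text report: it writes to any `Write`.
fn write_report(w: &mut dyn Write) -> io::Result<()> {
    writeln!(w, "#GC fraction\t{:.4}", 0.5)
}

// Capture the report in memory instead of writing it to a file.
fn report_to_string() -> io::Result<String> {
    // Vec<u8> implements Write, so it can stand in for a File.
    let mut buf: Vec<u8> = Vec::new();
    write_report(&mut buf)?;
    // Report text is expected to be UTF-8; surface anything else as an I/O error.
    String::from_utf8(buf).map_err(|e| io::Error::new(io::ErrorKind::InvalidData, e))
}

fn main() -> io::Result<()> {
    let text = report_to_string()?;
    print!("{text}");
    Ok(())
}
```

With the real crate, the same buffer could presumably be passed wherever a writer is accepted, which is handy for tests or for forwarding results to another service.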

Read sequences without running modules

open_sequence_file returns a Box<dyn SequenceFile> that handles FASTQ (plain, gzip, bzip2), BAM/SAM, and Fast5 with the same iteration interface. Use it directly if you only need parsing.

use std::path::Path;
use fastqc_rust::config::FastQCConfig;
use fastqc_rust::sequence::open_sequence_file;

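fn main() -> std::io::Result<()> {
    let config = FastQCConfig::default();
    // The same call opens FASTQ, SAM, BAM, or Fast5 input.
    let mut reader = open_sequence_file(&config, Path::new("sample.bam"))?;

    // Each item is a Result, so parse errors surface through `?`.
    while let Some(seq) = reader.next() {
        let seq = seq?;
        println!("{}\t{} bp", seq.id, seq.len());
    }
    Ok(())
}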

Format detection follows the same rules as the CLI: file extension by default, overridden by config.sequence_format ("fastq", "bam", "sam", "bam_mapped", "sam_mapped").
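
The override-then-extension rule can be sketched as a small standalone function. This is an illustration of the rule as described here, not the crate's implementation: the Format enum and detect_format are hypothetical names, folding the "_mapped" variants into their base format is a simplification, and the FASTQ fallback for unrecognised extensions is an assumption.

```rust
use std::path::Path;

/// Hypothetical format labels; the crate's own type may look different.
#[derive(Debug, PartialEq, Eq, Clone, Copy)]
enum Format {
    Fastq,
    Sam,
    Bam,
    Fast5,
}

/// An explicit override wins; otherwise the file extension decides.
fn detect_format(path: &Path, override_format: Option<&str>) -> Format {
    if let Some(name) = override_format {
        match name {
            "fastq" => return Format::Fastq,
            // Simplification: the mapped-only variants share the base format here.
            "bam" | "bam_mapped" => return Format::Bam,
            "sam" | "sam_mapped" => return Format::Sam,
            _ => {} // unknown override: fall through to the extension
        }
    }
    let file_name = path
        .file_name()
        .and_then(|n| n.to_str())
        .unwrap_or("")
        .to_ascii_lowercase();
    // Strip a compression suffix so "reads.fastq.gz" is treated as "reads.fastq".
    let stem = file_name
        .strip_suffix(".gz")
        .or_else(|| file_name.strip_suffix(".bz2"))
        .unwrap_or(file_name.as_str());
    if stem.ends_with(".bam") {
        Format::Bam
    } else if stem.ends_with(".sam") {
        Format::Sam
    } else if stem.ends_with(".fast5") {
        Format::Fast5
    } else {
        Format::Fastq // assumed fallback for anything else
    }
}

fn main() {
    let by_extension = detect_format(Path::new("sample.bam"), None);
    let overridden = detect_format(Path::new("sample.txt"), Some("sam"));
    println!("{by_extension:?} {overridden:?}");
}
```

The override is checked first so that a file with a misleading or missing extension can still be parsed as the format the caller asks for.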

Implement your own module

Any type implementing the QCModule trait can be fed sequences alongside the built-in modules. The trait has one required input method (process_sequence) and a handful of metadata and reporting methods.

use std::io::{self, Write};
use fastqc_rust::modules::{QCModule, ModuleStatus};
use fastqc_rust::sequence::Sequence;

struct GcCounter {
    gc: u64,
    total: u64,
}

impl QCModule for GcCounter {
    fn process_sequence(&mut self, seq: &Sequence) {
        for &b in &seq.sequence {
            if b == b'G' || b == b'C' { self.gc += 1; }
            self.total += 1;
        }
    }
    fn name(&self) -> &str { "Custom GC" }
    fn description(&self) -> &str { "GC fraction across all bases" }
    fn reset(&mut self) { self.gc = 0; self.total = 0; }
    fn raises_error(&self) -> bool { false }
    fn raises_warning(&self) -> bool { false }
    fn ignore_filtered_sequences(&self) -> bool { true }
    fn ignore_in_report(&self) -> bool { false }
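    fn write_text_report(&self, w: &mut dyn Write) -> io::Result<()> {
        // Guard the empty-input case: 0.0 / 0.0 would otherwise print NaN.
        let fraction = if self.total == 0 { 0.0 } else { self.gc as f64 / self.total as f64 };
        writeln!(w, "#GC fraction\t{:.4}", fraction)
    }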
}
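
The counting logic inside a module like this is easy to test in isolation before wiring it into the trait. The sketch below is a standalone helper that does not depend on the crate; gc_fraction is a hypothetical name. It makes two choices that differ from GcCounter above: it treats lowercase bases as equivalent to uppercase, and it returns None for empty input instead of a number.

```rust
/// Fraction of G/C bases in a sequence, or None when the input is empty.
fn gc_fraction(bases: &[u8]) -> Option<f64> {
    if bases.is_empty() {
        return None;
    }
    // Count G and C in either case (soft-masked bases are often lowercase).
    let gc = bases
        .iter()
        .filter(|b| matches!(b.to_ascii_uppercase(), b'G' | b'C'))
        .count();
    Some(gc as f64 / bases.len() as f64)
}

fn main() {
    match gc_fraction(b"ACGT") {
        Some(f) => println!("GC fraction: {f:.4}"),
        None => println!("no bases"),
    }
}
```

Returning an Option pushes the empty-input decision to the caller, which can then decide whether to report zero, skip the line, or flag the module.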

Modules with charts override chart_alt_text and generate_chart_svg; see the built-in modules in src/modules/ for examples.