High-Performance FASTX Parsing for Mojo — Zero-Copy to GPU
BlazeSeq is a high-throughput FASTQ parser written in Mojo. It targets several GB/s of throughput from disk using zero-copy parsing, and offers owned records and GPU-friendly batching for read pipelines. It also supports streaming FASTA and samtools-style .fai index files (five- or six-column rows from faidx, index metadata only). Gzip input is decompressed on multiple threads with rapidgzip, and validation is configurable, all through a single unified API.
✨ Key Features
- SIMD-accelerated scanning: vectorized from the ground up using Mojo's first-class SIMD support.
- Three parsing modes, so you can trade speed against convenience:
  - `views()`: zero-copy views (fastest, borrow semantics)
  - `records()`: owned records (thread-safe)
  - `batches()`: Structure-of-Arrays for GPU upload
- Compile-time validation toggles: enable or disable ASCII and quality-range checks at compile time for maximum throughput.
- Rapidgzip with parallel decoding: gzipped FASTQ (`.fastq.gz`) is decompressed in parallel across multiple threads for high throughput; tune with the `parallelism` argument.
- FASTA and FAI: streaming FASTA parsing and `.fai` index files; see the API reference for `FastaParser` and `FaiParser`.
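For orientation, the five- and six-column `.fai` rows follow the samtools faidx layout: NAME, LENGTH, OFFSET, LINEBASES, LINEWIDTH, plus QUALOFFSET for FASTQ indexes. The sketch below is plain Python and does not use BlazeSeq's `FaiParser`; `FaiEntry` and `parse_fai_line` are hypothetical names chosen only for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class FaiEntry:
    name: str                          # sequence name
    length: int                        # number of bases in the sequence
    offset: int                        # byte offset of the first base in the file
    line_bases: int                    # bases per line
    line_width: int                    # bytes per line, including the newline
    qual_offset: Optional[int] = None  # only present in FASTQ (six-column) indexes


def parse_fai_line(line: str) -> FaiEntry:
    """Parse one tab-separated .fai row with five or six columns."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) not in (5, 6):
        raise ValueError(f"expected 5 or 6 tab-separated columns, got {len(fields)}")
    numbers = [int(value) for value in fields[1:]]
    return FaiEntry(fields[0], *numbers)


entry = parse_fai_line("chr1\t248956422\t6\t60\t61\n")
print(entry.name, entry.length, entry.offset)
```

With a six-column row, the extra value lands in `qual_offset`; with five columns it stays `None`.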
Quick Start
Mojo package from repo (Pixi)
Use BlazeSeq as a Mojo dependency in your project. Install pixi first, then add BlazeSeq to your pixi.toml:
```toml
[dependencies]
blazeseq = { git = "https://github.com/MoSafi2/BlazeSeq", branch = "main" }
```
Then run pixi install and use the full Mojo API (e.g. FastqParser, FastaParser, FaiParser, views(), batches(), GPU batching).
Python bindings (experimental)
Python bindings are available via a wheel-only package on PyPI. They are experimental and may change. Install with pip install blazeseq or uv pip install blazeseq. Usage and API are documented in python/README.md.
🛠 Usage examples
```bash
# FastqParser with and without validation
pixi run mojo run examples/example_parser.mojo /path/to/file.fastq

# GPU Needleman-Wunsch global alignment (requires GPU)
pixi run mojo run examples/nw_gpu/main.mojo
```
Count reads and base pairs
```mojo
from blazeseq import FastqParser, FileReader
from pathlib import Path

def main() raises:
    var parser = FastqParser(FileReader(Path("data.fastq")), "sanger")
    var reads = 0
    var bases = 0
    for record in parser.records():
        reads += 1
        bases += len(record)
    print(reads, bases)
```
Maximum speed (validation off)
```mojo
from blazeseq import FastqParser, ParserConfig, FileReader
from pathlib import Path

def main() raises:
    comptime config = ParserConfig(check_ascii=False, check_quality=False)
    var parser = FastqParser[config=config](FileReader(Path("data.fastq")), "generic")
    for view in parser.views():  # zero-copy
        _ = len(view)
```
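As a rough picture of what ASCII and quality-range checks involve, the plain-Python sketch below tests that a line is 7-bit ASCII and that quality bytes fall within the printable Phred+33 range (33 to 126) used by Sanger-style FASTQ. It is not BlazeSeq's implementation: the function names and the chosen bounds are assumptions for illustration, and BlazeSeq's schemas may use different limits.

```python
from typing import List


def is_ascii(line: bytes) -> bool:
    """True if every byte is 7-bit ASCII."""
    return all(byte < 128 for byte in line)


def quality_in_range(quality: bytes, low: int = 33, high: int = 126) -> bool:
    """True if every quality byte lies in [low, high] (assumed Phred+33 bounds)."""
    return all(low <= byte <= high for byte in quality)


def phred_scores(quality: bytes, offset: int = 33) -> List[int]:
    """Decode quality bytes into integer Phred scores by subtracting the offset."""
    return [byte - offset for byte in quality]


print(is_ascii(b"ACGTN"))            # True
print(quality_in_range(b"II5!~"))    # True
print(phred_scores(b"!5I"))          # [0, 20, 40]
```

Checks like these touch every byte of every record, which is why turning them off is worthwhile only for input that is already known to be well-formed.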
Batched (for GPU pipelines)
```mojo
from blazeseq import FastqBatch, FastqParser, FileReader
from gpu.host import DeviceContext
from pathlib import Path

def main() raises:
    var ctx = DeviceContext()
    var parser = FastqParser(FileReader(Path("data.fastq")), schema="generic", batch_size=4096)
    for batch in parser.batches():  # batch is a FastqBatch (Structure-of-Arrays)
        var device_batch = batch.to_device(ctx)  # GPU upload
        # Your GPU kernel goes here; see the examples/ directory
```
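To illustrate why a Structure-of-Arrays layout suits GPU upload, the sketch below packs records into one contiguous byte buffer per field plus an offsets array, so each field could be copied to a device in a single transfer. This is plain Python with hypothetical names (`pack_soa`, `sequence_at`); the actual field layout of `FastqBatch` is not described here and may differ.

```python
from typing import Dict, List, Tuple


def pack_soa(records: List[Tuple[bytes, bytes, bytes]]) -> Dict[str, object]:
    """Pack (id, sequence, quality) records into contiguous per-field buffers."""
    ids, seqs, quals = bytearray(), bytearray(), bytearray()
    id_offsets, seq_offsets = [0], [0]
    for read_id, seq, qual in records:
        if len(seq) != len(qual):
            raise ValueError("sequence and quality lengths differ")
        ids += read_id
        seqs += seq
        quals += qual
        id_offsets.append(len(ids))    # end of this id == start of the next
        seq_offsets.append(len(seqs))  # shared by seqs and quals (same lengths)
    return {
        "ids": bytes(ids),
        "id_offsets": id_offsets,
        "seqs": bytes(seqs),
        "quals": bytes(quals),
        "seq_offsets": seq_offsets,
    }


def sequence_at(batch: Dict[str, object], index: int) -> bytes:
    """Recover one sequence from the packed buffer using the offsets array."""
    start = batch["seq_offsets"][index]
    end = batch["seq_offsets"][index + 1]
    return batch["seqs"][start:end]


batch = pack_soa([(b"r1", b"ACGT", b"IIII"), (b"r2", b"GG", b"!!")])
print(batch["seq_offsets"])   # [0, 4, 6]
print(sequence_at(batch, 1))  # b'GG'
```

An array of per-record objects would need one small copy per record; flat buffers with offsets keep the transfer count independent of the batch size.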
Reading gzip (rapidgzip, parallel decoding)
BlazeSeq uses RapidgzipReader for gzipped FASTQ. It decompresses in parallel: the compressed stream is split into chunks and multiple threads decode them concurrently, which gives much higher throughput than single-threaded readers built on zlib or libdeflate.
```mojo
from blazeseq import RapidgzipReader, FastqParser

def main() raises:
    var reader = RapidgzipReader("data.fastq.gz", parallelism=4)  # 0 = use all available cores
    var parser = FastqParser(reader^, "illumina_1.8")
    for record in parser.records():
        _ = record.id()
```
Architecture & Trade-offs
| Mode | Return Type | Copies Data? | Use When |
|---|---|---|---|
| `next_view()` / `views()` | `FastqView` | No | Streaming transforms (QC, filtering) where you process and discard. Not thread-safe |
| `next_record()` / `records()` | `FastqRecord` | Yes | Simple scripting, building in-memory collections |
| `next_batch()` / `batches()` | `FastqBatch` (SoA) | Yes | GPU pipelines, parallel CPU operations |
Critical: FastqView spans are valid only until the next parser operation. Do not store them in collections or use them after the iteration advances.
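The lifetime rule can be pictured outside Mojo. In the plain-Python sketch below, a `memoryview` stands in for a zero-copy view over a parser's internal buffer: once the buffer is refilled, a stored view silently shows the new bytes, while an owned copy keeps the original data. The in-place buffer reuse shown here is an assumed model of why views expire, not a description of BlazeSeq's internals.

```python
# Stands in for a parser's internal read buffer (assumed model).
buffer = bytearray(b"ACGTACGT")

view = memoryview(buffer)[0:4]  # zero-copy: refers to the buffer's own bytes
owned = bytes(buffer[0:4])      # owned copy: independent of the buffer

# The "parser" advances and refills the same buffer in place.
buffer[0:8] = b"TTTTGGGG"

print(bytes(view))  # b'TTTT'  (the stored view now shows the new data)
print(owned)        # b'ACGT'  (the copy still holds the original record)
```

This is the trade-off in the table above: views avoid the copy but are tied to the buffer's current contents, whereas records pay for a copy and can be stored freely.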
Benchmarks
The benchmarks measure throughput (file-based and in-memory) and compare BlazeSeq with needletail, seq_io, and kseq. See benchmark/README.md for commands and details.
Documentation
- API Reference: https://mosafi2.github.io/BlazeSeq/
  - The site is generated with Modo (plain markdown from `mojo doc` output) and Astro Starlight.
- Examples: the `examples/` directory includes parser usage, writer, and GPU alignment examples.
Limitations
- No multi-line FASTQ support: each record must span exactly four lines (standard Illumina/ONT format).
- No support for paired-end reads yet (in progress).
- No random seek within FASTQ/FASTA streams: sequence parsers are sequential; use `MemoryReader` for repeated scans. `.fai` index metadata is parsed separately with `FaiParser`.
- The Python package is wheel-only (no source build of the extension on install).
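The four-line constraint in the first limitation can be pictured with a small plain-Python reader that consumes exactly four lines per record and rejects anything else. `read_four_line_fastq` is a hypothetical helper, not part of BlazeSeq, and it performs only the structural checks shown.

```python
import io
from typing import BinaryIO, Iterator, Tuple


def read_four_line_fastq(stream: BinaryIO) -> Iterator[Tuple[bytes, bytes, bytes]]:
    """Yield (id, sequence, quality) from a strictly four-line-per-record FASTQ stream."""
    while True:
        header = stream.readline()
        if not header:
            return  # clean end of input
        seq = stream.readline().rstrip(b"\r\n")
        plus = stream.readline()
        qual = stream.readline().rstrip(b"\r\n")
        if not header.startswith(b"@"):
            raise ValueError("record header must start with '@'")
        if not plus.startswith(b"+"):
            # A wrapped (multi-line) sequence ends up here: the third line holds bases.
            raise ValueError("third line must start with '+'")
        if len(seq) != len(qual):
            raise ValueError("sequence and quality lengths differ")
        yield header[1:].rstrip(b"\r\n"), seq, qual


data = io.BytesIO(b"@r1\nACGT\n+\nIIII\n@r2\nGG\n+\n!!\n")
for read_id, seq, qual in read_four_line_fastq(data):
    print(read_id, seq, qual)
```

A file whose sequences wrap across several lines fails the `+` check on the first record, so it would need to be unwrapped before parsing.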
Testing
Run the test suite with pixi:
Tests use the same valid/invalid FASTQ corpus as BioJava, Biopython, and BioPerl FASTQ parsers. Multi-line FASTQ is not supported.
Project History
BlazeSeq is a ground-up rewrite of MojoFastTrim (now archived), redesigned for:
- Unified parser architecture (one parser, three modes)
- GPU-oriented batch types
- Compile-time configuration
Acknowledgements
The parsing algorithm is inspired by the approach of the Rust-based needletail, and was further optimized to use Mojo's first-class SIMD support.
License
This project is licensed under the MIT License.
