This dataset describes the sequencing of 24 bacterial isolates from US FDA CFSAN in quadruplicate (96 barcoded libraries in total) on a single PromethION flow cell, using Oxford Nanopore Technologies’ latest NO-MISS isolate sequencing rapid barcoding kit and protocol, which incorporates the universal bead-beating extraction method for robust DNA recovery across diverse bacterial species. The 24-strain dataset contains 15 distinct species from the following genera: Bacillus, Cronobacter, Citrobacter, Enterobacter, Klebsiella, Listeria, Pseudomonas, Salmonella, Shigella, Staphylococcus, and Vibrio. It includes strains carrying multiple plasmids and strains with a variety of antimicrobial resistance (AMR)-associated genes and mutations. Libraries were prepared using the Rapid Barcoding Kit 96 V14 (SQK-RBK114.96) and sequenced on a single PromethION 2 Integrated (P2i) device.
Raw signal data were basecalled on the P2i using the SUP@5.2.0 model. Resulting reads were analysed using the EPI2ME Bacterial & Fungal Genomes (wf-bacterial-genomes) workflow to assess genome assembly and characterisation performance across replicate samples. This dataset provides a benchmark for evaluating the reproducibility and throughput of the NO-MISS workflow when applied to bacterial isolate sequencing.
| Detail | Description |
|---|---|
| Sample Name | sample_1-sample_24 |
| Organism | bacteria |
| Molecule Type | DNA |
| Sample Type | liquid cultures |
| Technical replicates | 4 |
| Flow Cell replicates | 1 |
| Link to sample source | Not publicly available |
Sample preparation was performed according to NO-MISS, the Nanopore-only Microbial Isolate Sequencing Solution, using universal bead-beating and the 96-sample Rapid Barcoding Kit (RBK).
| Detail | Description |
|---|---|
| Extraction | Universal bead-beating |
| Library Prep | NO-MISS |
| Kit | SQK-RBK114.96 |
Further preparation information, such as sample storage suggestions, can be found on the Oxford Nanopore website.
Sequence data were generated using the following configuration:
| Detail | Description |
|---|---|
| Flow Cell | FLO-PRO114M |
| Device | P2i |
| Chemistry | R10.4.1 |
| Basecall Model | v5.2.0 SUP |
| MinKNOW Version | v25.11.0 |
The dataset is available for anonymous download, without login, from a public Amazon Web Services S3 bucket. The bucket is part of the Open Data on AWS project enabling sharing and analysis of a wide range of data. The data can be downloaded with the AWS CLI command:
aws s3 sync --no-sign-request s3://ont-open-data/nomiss_96BC_P2I_SUP_2026 nomiss_96BC_P2I_SUP_2026
See the tutorials page for information on downloading the dataset.
You can also browse and download the files in your web browser courtesy of 42basepairs.
| Folder name | Size | Description |
|---|---|---|
| raw | 1.6 TB | POD5 files |
| basecalls | 74 GB | BAM files |
| analysis | 157 GB | Workflow outputs |
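Because the raw POD5 data dominate the total size, it can be useful to download only some folders. The following is a minimal Python sketch that builds the corresponding anonymous `aws s3 sync` commands and adds up the sizes from the table above. It assumes that the folder names in the table are the prefixes directly under the dataset root (the source confirms this only for `analysis`) and that sizes are in decimal units (1 TB = 1000 GB); the function names are illustrative.

```python
import shlex

BUCKET_PREFIX = "s3://ont-open-data/nomiss_96BC_P2I_SUP_2026"
LOCAL_ROOT = "nomiss_96BC_P2I_SUP_2026"

# Sizes copied from the folder table, in GB.
# Assumption: decimal units, so 1.6 TB is treated as 1600 GB.
FOLDER_SIZES_GB = {"raw": 1600.0, "basecalls": 74.0, "analysis": 157.0}


def sync_command(folder: str) -> list[str]:
    """Build an anonymous `aws s3 sync` command for one dataset folder."""
    if folder not in FOLDER_SIZES_GB:
        raise ValueError(f"unknown folder: {folder!r}")
    return [
        "aws", "s3", "sync", "--no-sign-request",
        f"{BUCKET_PREFIX}/{folder}",
        f"{LOCAL_ROOT}/{folder}",
    ]


def total_size_gb(folders: list[str]) -> float:
    """Sum the approximate sizes of the selected folders, in GB."""
    return sum(FOLDER_SIZES_GB[name] for name in folders)


if __name__ == "__main__":
    wanted = ["basecalls", "analysis"]  # skip the large raw POD5 folder
    print(f"Approximate download size: {total_size_gb(wanted):.0f} GB")
    for name in wanted:
        print(shlex.join(sync_command(name)))
```

The script only prints the commands, so they can be reviewed before being run in a shell.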
The Bacterial & Fungal Genomes workflow was used for genome assembly and isolate characterisation. The analysis comprises de novo assembly with Flye followed by polishing with medaka. Genome assemblies are then annotated with Bakta, and plasmid contigs are identified using MOB-suite. As part of isolate analysis, the workflow also performs MLST-based species typing, Salmonella serotyping using SeqSero2, genome-identity-based species assignment using Sourmash, and AMR gene annotation using ResFinder.
The workflow was run with default parameters, except that the Flye assembler was given a target genome coverage of 50x (--flye_asm_coverage 50) and the genome size used for coverage estimation was set to 10 Mb (--flye_genome_size 10000000).
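As an illustration of the non-default settings, the sketch below assembles a Nextflow command line for the workflow in Python. Only the two Flye options come from the description above. The `nextflow run epi2me-labs/wf-bacterial-genomes` invocation, the input option (`--bam` here), `--out_dir`, and the example paths are assumptions and should be checked against the workflow documentation for the version in use. The helper `assembly_read_budget` reflects the usual reading of these Flye settings, namely that reads for the initial assembly are subsampled to roughly coverage multiplied by genome size.

```python
import shlex


def workflow_command(input_path: str, out_dir: str,
                     input_option: str = "--bam") -> list[str]:
    """Build a command line using the two Flye settings quoted in the text.

    Assumptions (not stated in the dataset description): the workflow is
    launched with Nextflow, and `input_option` / `--out_dir` are the
    input and output options of the installed workflow version.
    """
    return [
        "nextflow", "run", "epi2me-labs/wf-bacterial-genomes",
        input_option, input_path,
        "--out_dir", out_dir,
        "--flye_asm_coverage", "50",        # from the dataset description
        "--flye_genome_size", "10000000",   # from the dataset description
    ]


def assembly_read_budget(coverage: int, genome_size: int) -> int:
    """Approximate number of bases kept for the initial assembly."""
    return coverage * genome_size


if __name__ == "__main__":
    # Hypothetical local paths, for illustration only.
    print(shlex.join(workflow_command("basecalls", "analysis_rerun")))
    print(f"Read budget: {assembly_read_budget(50, 10_000_000):,} bases")
```

With 50x coverage and a 10 Mb genome size, the budget works out to 500,000,000 bases per sample.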
The analysis outputs are available in the S3 bucket under the prefix:
s3://ont-open-data/nomiss_96BC_P2I_SUP_2026/analysis
The analysis outputs comprise the complete workflow results, including HTML reports for interactive data exploration and polished genome assemblies in FASTA format.
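One simple way to compare replicate assemblies is to summarise each FASTA file by contig count, total length, and N50. The following is a minimal Python sketch of that calculation. It uses a tiny made-up FASTA string rather than real dataset output, and the function names are illustrative rather than part of the workflow.

```python
from typing import Iterable


def contig_lengths(lines: Iterable[str]) -> list[int]:
    """Return the length of each record in a FASTA file, in file order."""
    lengths: list[int] = []
    current = None  # None until the first header line is seen
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if current is not None:
                lengths.append(current)
            current = 0
        elif current is not None:
            current += len(line)
    if current is not None:
        lengths.append(current)
    return lengths


def n50(lengths: list[int]) -> int:
    """Length of the contig at which half of the total assembly is reached."""
    half = sum(lengths) / 2
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if running >= half:
            return length
    return 0


if __name__ == "__main__":
    # Made-up example records, not taken from the dataset.
    example = ">contig_1\nACGTACGT\n>contig_2\nACGT\n>contig_3\nAC\n"
    lengths = contig_lengths(example.splitlines())
    print(len(lengths), sum(lengths), n50(lengths))
```

To apply this to a real assembly, pass an open file handle to `contig_lengths` instead of the example string.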
wf-bacterial-genomes-report.html provides an overall summary report containing read and assembly statistics, plasmid analysis, gene annotations, and a summary of AMR genes identified across all analysed samples.
The workflow also generates output files from individual analysis tools to support downstream analyses and reuse. These include annotated genome files in GFF3 and GBFF formats, detailed plasmid analysis results reporting identified chromosome and plasmid markers, and plasmid FASTA files. Additionally, the full ResFinder output provides detailed alignment statistics for any identified AMR genes.
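As an example of reusing the per-tool outputs, the sketch below counts annotated features by type in a GFF3 file. It relies only on the standard GFF3 layout of nine tab-separated columns with the feature type in column three. The example records are invented for illustration and are not taken from the dataset, and the function name is hypothetical.

```python
from collections import Counter
from typing import Iterable


def count_feature_types(lines: Iterable[str]) -> Counter:
    """Count GFF3 features by type (third column)."""
    counts: Counter = Counter()
    for line in lines:
        line = line.rstrip("\n")
        if line.startswith("##FASTA"):
            break  # any embedded sequence section follows; stop parsing
        if not line or line.startswith("#"):
            continue  # skip blank lines, directives, and comments
        fields = line.split("\t")
        if len(fields) != 9:
            continue  # ignore malformed lines in this sketch
        counts[fields[2]] += 1
    return counts


# Invented records for illustration only.
EXAMPLE_GFF3 = "\n".join([
    "##gff-version 3",
    "contig_1\tExampleTool\tCDS\t10\t400\t.\t+\t0\tID=gene_1",
    "contig_1\tExampleTool\tCDS\t500\t900\t.\t-\t0\tID=gene_2",
    "contig_1\tExampleTool\ttRNA\t1000\t1075\t.\t+\t.\tID=trna_1",
    "##FASTA",
    ">contig_1",
    "ACGT",
])

if __name__ == "__main__":
    for feature_type, n in sorted(count_feature_types(EXAMPLE_GFF3.splitlines()).items()):
        print(feature_type, n)
```

For a real annotation file, open it in text mode and pass the file handle to `count_feature_types`.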
Related Links
Poster: Rapid whole-genome sequencing, de novo assembly, and characterisation of bacterial isolates - Learn how to perform rapid whole-genome sequencing, de novo assembly, and characterisation of bacterial isolates.
Poster: Rapid and scalable whole-genome microbial isolate sequencing - Another poster highlighting the end-to-end, scalable workflow for whole-genome sequencing of microbial isolates.
How to sequence microbial isolates with the NO-MISS workflow - Step-by-step guidance for sequencing microbial isolates using the NO-MISS workflow, from sample preparation through to data analysis.
