The Oxford Nanopore Technologies full-length 16S amplicon method can be used to profile complex bacterial communities. This post describes a new 16S dataset release generated with the new 16S primers included in the Microbial Amplicon Barcoding Kit 24 V14 (SQK-MAB114.24).
The dataset includes two well-characterised mock communities. The first is the ZymoBIOMICS Microbial Community DNA Standard, supplemented with genomic DNA (gDNA) from additional species to demonstrate improved inclusivity of the new 16S primers. These additional gDNA samples, sourced from ATCC, are Bifidobacterium adolescentis, Borrelia burgdorferi, Chlamydia trachomatis, and Gardnerella vaginalis.
The second is the ZymoBIOMICS Gut Microbiome Standard, which was sequenced using the same kit to further showcase performance in a more complex microbial background.
In addition to releasing the data, we benchmarked classification performance across popular 16S workflows, comparing SQK-MAB114.24 to the previous-generation SQK-16S114.24 kit. Results from this benchmark, including taxa recovery and error rates, are presented in the analysis section.
Detail | Description |
---|---|
Sample Name(s) | ZymoBIOMICS Microbial Community DNA Standard supplemented with gDNA from additional species, and ZymoBIOMICS Gut Microbiome Standard |
Organism | Bacteria, fungi |
Molecule Type | Amplicon DNA |
Sample Type | gDNA |
Biological replicates | 3 (supplemented Microbial Community DNA Standard) and 2 (Gut Microbiome Standard) |
Flow Cell replicates | 2 per sample |
Links to sample sources | |
Sample preparation was performed according to the Microbial Amplicon Barcoding Kit 24 V14 Protocol published on the Oxford Nanopore Technologies website.
Detail | Description |
---|---|
Library Prep | Microbial Amplicon Barcoding Kit 24 V14 Protocol |
Kit | Microbial Amplicon Barcoding Kit 24 V14 (SQK-MAB114.24) |
Further preparation information, such as sample storage recommendations, can be found on the Oxford Nanopore Technologies website.
Sequence data was generated using the following configuration:
Detail | Description |
---|---|
Flow Cell | FLO-MIN114 |
Device | GridION |
Chemistry | R10.4.1 |
Basecall Model | HAC and SUP v5.2.0 |
MinKNOW Version | 25.05.12 |
Dorado version (rebasecalling) | v1.1.1 |
The dataset is available for anonymous download, without login, from a public Amazon Web Services S3 bucket. The bucket is part of the Open Data on AWS project, which enables sharing and analysis of a wide range of data. The full dataset can be downloaded with the following AWS CLI command:
aws s3 sync --no-sign-request s3://ont-open-data/zymo_16s_2025.09 zymo_16s_2025.09
See the tutorials page for information on downloading the dataset.
You can also browse and download the files in your web browser courtesy of 42basepairs.
Folder name | Size | Description |
---|---|---|
raw | 36 GB | POD5 files |
basecalls | 5.1 GB | BAM files |
analysis | 699 MB | Workflow outputs |
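Because the raw folder accounts for most of the download, it can be convenient to sync a single folder rather than the whole dataset. The following is a minimal Python sketch that builds the corresponding `aws s3 sync` command for one folder. The helper name `sync_command` and the local destination layout are hypothetical, and the sketch assumes that the `raw` and `basecalls` folders sit directly under the dataset prefix in the same way as `analysis`.

```python
import shlex

# Dataset prefix taken from the download command above.
BUCKET_PREFIX = "s3://ont-open-data/zymo_16s_2025.09"

# Folder names from the table above. Assumption: each one is a
# direct child of the dataset prefix.
FOLDERS = ("raw", "basecalls", "analysis")


def sync_command(folder, dest_root="zymo_16s_2025.09"):
    """Return the argument list for syncing a single dataset folder.

    Hypothetical helper: it only builds the command and does not run it.
    """
    if folder not in FOLDERS:
        raise ValueError(f"unknown folder {folder!r}; expected one of {FOLDERS}")
    return [
        "aws", "s3", "sync", "--no-sign-request",
        f"{BUCKET_PREFIX}/{folder}",
        f"{dest_root}/{folder}",
    ]


if __name__ == "__main__":
    # Print the command for the smallest folder so it can be copied into a shell.
    print(shlex.join(sync_command("analysis")))
```

The returned list can also be passed to `subprocess.run(..., check=True)` if the AWS CLI is installed.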
To demonstrate the improved performance of SQK-MAB114.24 over SQK-16S114.24, we sequenced multiple replicates of two mock microbial communities with both kits. The first was the ZymoBIOMICS Microbial Community DNA Standard, supplemented with four species that have previously shown reduced sensitivity with earlier-generation 16S primers: Gardnerella vaginalis, Bifidobacterium adolescentis, Borrelia burgdorferi, and Chlamydia trachomatis. Because Gardnerella vaginalis is classified as Bifidobacterium vaginale in Greengenes2/GTDB, it was grouped with Bifidobacterium adolescentis for the purposes of analysis. The second community was the ZymoBIOMICS Gut Microbiome Standard.
Independent replicates were barcoded and sequenced on MinION Flow Cells and basecalled with the latest (v5.2.0) Dorado basecall models. For classification analysis, we evaluated three popular methods for long-read 16S-based classification: Emu, wf-16s using the Kraken2 classifier, and wf-16s using the minimap2 (mm2) classifier.
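The grouping of Gardnerella with Bifidobacterium described above can be illustrated with a short Python sketch that collapses genus-level counts before they are compared with the expected composition. This is not the code used for the benchmark. The function names and toy counts are hypothetical, and only the Gardnerella to Bifidobacterium mapping comes from the text.

```python
from collections import Counter

# Genus synonyms applied before comparing observed and expected profiles.
# Only this single mapping is taken from the text; everything else here
# is an illustrative assumption.
GENUS_SYNONYMS = {"Gardnerella": "Bifidobacterium"}


def harmonise_genus(genus):
    """Map a genus name to the name used by the reference taxonomy."""
    return GENUS_SYNONYMS.get(genus, genus)


def collapse_counts(counts):
    """Sum read counts over genera that share a harmonised name."""
    merged = Counter()
    for genus, n_reads in counts.items():
        merged[harmonise_genus(genus)] += n_reads
    return dict(merged)


if __name__ == "__main__":
    # Toy read counts, invented purely for illustration.
    toy_counts = {"Gardnerella": 2, "Bifidobacterium": 3, "Borrelia": 1}
    print(collapse_counts(toy_counts))
```

Real classifier outputs may carry rank prefixes or full lineage strings, so the keys would likely need to be normalised to bare genus names first.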
For classification, we used full-length 16S sequences from the Greengenes2 database release 2024.09, supplemented with Salmonella enterica and Sarcina perfringens reference sequences from GTDB release 226, as these were found to be missing or under-represented in the Greengenes2 full-length sequences.
For both communities, the best-performing combination was SQK-MAB114.24 with classification by wf-16s using the minimap2 classifier, although Emu gave very similar results.
Only wf-16s with the Kraken2 classifier produced a significant abundance of false-positive taxa, and this was observed with both kits. We found that SQK-MAB114.24 was able to recover genera that were missing or highly under-represented in SQK-16S114.24 data, including the supplemented genera in the ZymoBIOMICS Microbial Community DNA Standard and Akkermansia, Bacteroides, Bifidobacterium, and Prevotella in the ZymoBIOMICS Gut Microbiome Standard.
We also observed that Escherichia was less over-represented in SQK-MAB114.24 data than in SQK-16S114.24 data.
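As a rough illustration of how genus recovery and false-positive abundance could be summarised from a classifier's output, the Python sketch below compares observed genus counts with a set of expected genera. The metric definitions, the function name `summarise_profile`, and the toy numbers are assumptions made for illustration and are not necessarily those used in the benchmark.

```python
def summarise_profile(observed, expected, min_fraction=0.0):
    """Summarise genus recovery and false-positive abundance.

    observed: mapping of genus name to read count.
    expected: collection of genera known to be in the mock community.
    min_fraction: relative abundance a genus must exceed to count as recovered.
    """
    total = sum(observed.values())
    if total == 0:
        raise ValueError("no classified reads")
    expected = set(expected)
    fractions = {genus: count / total for genus, count in observed.items()}
    recovered = sorted(g for g in expected if fractions.get(g, 0.0) > min_fraction)
    missing = sorted(expected - set(recovered))
    # Share of classified reads assigned to genera outside the expected set.
    false_positive_fraction = sum(
        frac for genus, frac in fractions.items() if genus not in expected
    )
    return {
        "recovered": recovered,
        "missing": missing,
        "false_positive_fraction": false_positive_fraction,
    }


if __name__ == "__main__":
    # Toy counts, invented for illustration only (not taken from the dataset).
    toy_observed = {"Escherichia": 50, "Bacteroides": 30, "UnexpectedGenus": 20}
    toy_expected = {"Escherichia", "Bacteroides", "Akkermansia"}
    print(summarise_profile(toy_observed, toy_expected))
```

Setting `min_fraction` above zero treats very low-abundance hits as not recovered, which is one way to flag genera that are present but highly under-represented.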
The analysis outputs are available in the S3 bucket under the prefix:
s3://ont-open-data/zymo_16s_2025.09/analysis
EPI2ME workflows used to generate analysis outputs: