We are pleased to release a metagenomic dataset from deep sequencing of a mature compost pile, highlighting the capabilities of Oxford Nanopore Technologies in characterising highly diverse microbial communities. From 1.45 Tb of sequencing, we performed de novo assembly with metaMDBG and binned 5,598 metagenome-assembled genomes (MAGs) of medium or higher MIMAG quality, including 1,353 circularized, single-contig MAGs. This release includes basecalls, assembled contigs, binned genomes, and a poster from ASM Microbe 2025 that reports a high degree of species-level novelty, strain-dependent antimicrobial resistance (AMR) profiles, and intragenic invertons revealed through long-read sequencing.
| Detail | Description |
|---|---|
| Sample Name | Compost |
| Organism | Microbial community |
| Molecule Type | gDNA |
| Sample Type | Compost |
| Biological replicates | 1 |
| Flow Cell replicates | 8 |
Sample preparation was performed according to protocols published on the main Oxford Nanopore Technologies’ website.
| Detail | Description |
|---|---|
| Extraction | MP Biomedicals FastDNA SPIN Kit for Soil (SKU 116560200-CF) |
| Library Prep | Ligation Sequencing Kit V14 |
| Kit | SQK-LSK114 |
Further preparation information, such as sample storage suggestions, can be found on the Oxford Nanopore website.
Sequence data was generated using the following configuration. Each flow cell was run for 100 hours. Both HAC and SUP basecalls are available, but SUP basecalls were used for assembly and downstream analysis.
| Detail | Description |
|---|---|
| Flow Cell | FLO-PRO114M |
| Device | PromethION |
| Chemistry | R10.4.1 |
| Basecall Model | v5.0.0 SUP; v5.0.0 HAC |
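For readers who want to re-basecall the POD5 files themselves, the following is a minimal sketch rather than the exact command used for this release. It assumes `dorado` is on the `PATH` and that the model selectors `sup@v5.0.0` and `hac@v5.0.0` resolve to the models listed above; the function name, input directory, and output file name are hypothetical. By default it only prints the command; set `DRY_RUN=0` to execute it.

```shell
#!/usr/bin/env bash
set -euo pipefail

# basecall MODEL POD5_DIR OUT_BAM
# Prints the dorado command in dry-run mode (the default), otherwise runs it.
# All three arguments are illustrative; adjust them to your own layout.
basecall() {
  local model="$1" pod5_dir="$2" out_bam="$3"
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "dorado basecaller ${model} ${pod5_dir} > ${out_bam}"
  else
    # dorado writes unaligned BAM to stdout
    dorado basecaller "${model}" "${pod5_dir}" > "${out_bam}"
  fi
}

# Hypothetical local copy of the "raw" folder
basecall "sup@v5.0.0" "raw" "calls.sup.bam"
```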
The dataset is available for anonymous download, without login, from a public Amazon Web Services S3 bucket. The bucket is part of the Open Data on AWS project, which enables sharing and analysis of a wide range of data. The data can be downloaded with the AWS CLI command:
aws s3 sync --no-sign-request s3://ont-open-data/compost_mgx_2026.04 compost_mgx_2026.04
See the tutorials page for information on downloading the dataset.
You can also browse and download the files in your web browser courtesy of 42basepairs.
| Folder name | Size | Description |
|---|---|---|
| raw | 17 TB | POD5 files |
| basecalls | 1.1 Tb | BAM files |
| analysis | 15 Gb | Workflow outputs |
The analysis outputs are located in the S3 bucket under the prefix:
s3://ont-open-data/compost_mgx_2026.04/analysis
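Because the raw data is far larger than the analysis outputs, it may be convenient to fetch a single folder rather than the whole dataset. The sketch below wraps the same `aws s3 sync --no-sign-request` call shown earlier around one folder of the prefix above. The function name and the local destination layout are assumptions made for illustration. It prints the command by default; set `DRY_RUN=0` to start the transfer.

```shell
#!/usr/bin/env bash
set -euo pipefail

BUCKET_PREFIX="s3://ont-open-data/compost_mgx_2026.04"

# fetch_folder FOLDER [DEST]
# Syncs one top-level folder (raw, basecalls, or analysis) to a local directory.
# DEST defaults to a hypothetical local mirror of the bucket layout.
fetch_folder() {
  local folder="$1"
  local dest="${2:-compost_mgx_2026.04/${folder}}"
  local cmd=(aws s3 sync --no-sign-request "${BUCKET_PREFIX}/${folder}" "${dest}")
  if [ "${DRY_RUN:-1}" = "1" ]; then
    printf '%s\n' "${cmd[*]}"
  else
    "${cmd[@]}"
  fi
}

fetch_folder analysis
```

The AWS CLI also offers `aws s3 ls --no-sign-request` for listing a prefix before committing to a download.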
Other software used for analysis:
| Tool | Version |
|---|---|
| dorado | 0.8.3 |
| metaMDBG | 1.1 |
| checkM2 | 1.0.2 |
| minimap2 | 2.27 |
| SemiBin2 | 2.1.0 |
| MetaBAT2 | 2.17 |
| DASTool | 1.1.7 |
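The MAG count in the introduction refers to bins of medium or higher MIMAG quality. As an illustration of that cut-off, the sketch below selects bins with completeness of at least 50% and contamination below 10% from a checkM2 `quality_report.tsv`. It is an assumption that the report has `Name`, `Completeness`, and `Contamination` columns and that this simple filter matches how the release was produced. The additional rRNA and tRNA criteria for high-quality MIMAG drafts are not checked here, and the function name is hypothetical.

```shell
#!/usr/bin/env bash
set -euo pipefail

# filter_medium_quality REPORT_TSV
# Prints the names of bins with Completeness >= 50 and Contamination < 10.
# Columns are located by header name, so their order does not matter.
filter_medium_quality() {
  awk -F'\t' '
    NR == 1 {
      for (i = 1; i <= NF; i++) col[$i] = i
      next
    }
    ($(col["Completeness"]) + 0) >= 50 && ($(col["Contamination"]) + 0) < 10 {
      print $(col["Name"])
    }
  ' "$1"
}

# Run on a report path if one is given on the command line
if [ "$#" -ge 1 ]; then
  filter_medium_quality "$1"
fi
```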
Below is an outline of the analysis workflow:
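- Reads were basecalled with dorado using the SUP model (`dna_r10.4.1_e8.2_400bps_sup@v5.0.0`).
- Reads were assembled with metaMDBG using the `--in-ont` flag and default parameters.
- Reads were mapped to the assembled contigs with minimap2 using the `-x lr:hq` and `--secondary=no` flags.
- Contigs were binned with SemiBin2 using the `--self-supervised`, `--sequencing-type=long_reads`, and `--environment=human_gut` flags.
- Contigs were also binned with MetaBAT2 using `--minContig 30000`.

See the results of our MAG binning approach applied to another complex community, the ZymoBIOMICS Fecal Reference, in our Metagenomics Application Note. The corresponding Zymo Fecal dataset is also openly available.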
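The assembly, mapping, and binning steps of the workflow outlined above can be sketched as a short script. This is an illustration under stated assumptions, not the pipeline used for the release: the flags named in the outline are copied verbatim, while the subcommands (`metaMDBG asm`, `SemiBin2 single_easy_bin`), the `-a` SAM output option, the `samtools` and `jgi_summarize_bam_contig_depths` steps, the `contigs.fasta.gz` output name, and all file paths are assumptions. Check each against the tool's `--help` for the versions listed above. Bin refinement with DASTool and quality assessment with checkM2 are omitted because the outline gives no parameters for them. The script prints the plan by default; set `DRY_RUN=0` to execute.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical inputs and outputs; adjust to your own layout.
READS="${READS:-reads.sup.fastq.gz}"
OUT="${OUT:-mag_workflow}"
THREADS="${THREADS:-16}"
CONTIGS="${OUT}/assembly/contigs.fasta.gz"   # assumed metaMDBG output name

# run CMD...: print the command in dry-run mode (default), else execute it.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

assemble() {
  run metaMDBG asm --out-dir "${OUT}/assembly" --in-ont "${READS}" --threads "${THREADS}"
}

map_reads() {
  # -x lr:hq and --secondary=no come from the outline; -a (SAM output) is assumed
  run minimap2 -a -x lr:hq --secondary=no -t "${THREADS}" \
    -o "${OUT}/reads.sam" "${CONTIGS}" "${READS}"
  run samtools sort -@ "${THREADS}" -o "${OUT}/reads.sorted.bam" "${OUT}/reads.sam"
  run samtools index "${OUT}/reads.sorted.bam"
}

bin_semibin2() {
  run SemiBin2 single_easy_bin -i "${CONTIGS}" -b "${OUT}/reads.sorted.bam" \
    -o "${OUT}/semibin2" \
    --self-supervised --sequencing-type=long_reads --environment=human_gut
}

bin_metabat2() {
  run jgi_summarize_bam_contig_depths --outputDepth "${OUT}/depth.txt" "${OUT}/reads.sorted.bam"
  run metabat2 -i "${CONTIGS}" -a "${OUT}/depth.txt" -o "${OUT}/metabat2/bin" --minContig 30000
}

main() {
  run mkdir -p "${OUT}/assembly" "${OUT}/metabat2"
  assemble
  map_reads
  bin_semibin2
  bin_metabat2
}

main
```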