As with previous releases, the new dataset is available for anonymous download from an Amazon Web Services S3 bucket. The bucket is part of the Open Data on AWS project, which enables sharing and analysis of a wide range of data.

The data is located in the bucket at:

s3://ont-open-data/Q20_ULK_Cliveome/

See the tutorials page for information on downloading the dataset.
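As a rough illustration of an anonymous download, the sketch below builds an AWS CLI command for the bucket path given above. It assumes the AWS CLI is installed; the helper name `build_sync_command` and the destination directory are hypothetical, and the tutorials page remains the authoritative guide.

```python
import shlex

# Bucket path as given in the text above.
DATASET_URI = "s3://ont-open-data/Q20_ULK_Cliveome/"


def build_sync_command(dest_dir, uri=DATASET_URI, dry_run=True):
    """Return an `aws s3 sync` command (as a list) for an anonymous download.

    --no-sign-request tells the AWS CLI not to sign the request, so no
    credentials are needed for a public bucket. --dryrun only lists what
    would be copied; pass dry_run=False to build the real command.
    """
    cmd = ["aws", "s3", "sync", "--no-sign-request"]
    if dry_run:
        cmd.append("--dryrun")
    cmd += [uri, dest_dir]
    return cmd


if __name__ == "__main__":
    # Print the command so it can be inspected before running it in a shell.
    print(shlex.join(build_sync_command("Q20_ULK_Cliveome")))
```

The dry run is the default here because the dataset spans eight PromethION runs, so it is worth checking what will be transferred before starting the copy.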

Basecalling

The dataset comprises the direct output of the sequencing device software MinKNOW, along with basecalls computed post-run using the research-grade bonito basecaller with the “Q20 early access model” as follows:

pip install ont-bonito==0.4.0
bonito download --models
bonito basecaller dna_r10.3_q20ea <read directory> | bgzip -c > basecalls.fa.gz

Only reads passing the default quality filter (average Q-score > 10) were processed by bonito, i.e. only those .fast5 files located within the fast5_pass MinKNOW output folder.
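To make the quality filter concrete, the following sketch shows one way an average Q-score can be computed for a read and compared with the threshold of 10. It assumes Phred+33 encoded quality strings and assumes the average is taken over per-base error probabilities rather than over the Q-scores themselves; the text above does not state how MinKNOW computes the value, and the function names are illustrative.

```python
import math


def mean_qscore(quality_string, offset=33):
    """Average Q-score of a read, computed in error-probability space.

    Each character is converted to a Phred score Q, then to an error
    probability 10^(-Q/10). The probabilities are averaged and the mean
    is converted back to a Q-score.
    """
    if not quality_string:
        raise ValueError("quality string is empty")
    error_probs = [10 ** (-(ord(ch) - offset) / 10) for ch in quality_string]
    mean_error = sum(error_probs) / len(error_probs)
    return -10 * math.log10(mean_error)


def passes_filter(quality_string, threshold=10.0):
    """True if the read's average Q-score exceeds the threshold."""
    return mean_qscore(quality_string) > threshold


if __name__ == "__main__":
    # '+' encodes Q10 and '?' encodes Q30 under Phred+33.
    print(round(mean_qscore("+?"), 2), passes_filter("+?"))
```

Averaging error probabilities weights low-quality bases more heavily than a plain arithmetic mean of Q-scores would: one Q10 base and one Q30 base give an average of about Q13 rather than Q20.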

Data summary

The sequencing runs here represent data from pre-release versions of the sequencing and analysis components. Data throughput and quality do not reflect that of a released product.

The dataset comprises eight PromethION sequencing runs from our R&D lab using pre-release chemistry components and R10.3 flowcells. A separately prepared sample was run on each flowcell. The flowcells yielded between 10 Gbases and 18 Gbases, with read length N50 values between 60 kb and 95 kb.
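For readers unfamiliar with the N50 statistic quoted above, here is a minimal sketch of the usual definition: the read length L such that reads of length L or longer contain at least half of all sequenced bases. The function name and the example lengths are illustrative and are not taken from the dataset.

```python
def read_n50(lengths):
    """Return the N50 of a collection of read lengths.

    Reads are visited from longest to shortest; the N50 is the length of
    the read at which the running total first reaches half of all bases.
    """
    if not lengths:
        raise ValueError("no read lengths given")
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length


if __name__ == "__main__":
    # 20 bases in total; the two longest reads (6 + 5) cover at least half.
    print(read_n50([2, 3, 4, 5, 6]))
```

Because the statistic is weighted by bases rather than by read count, a small number of very long reads can raise the N50 well above the median read length.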

Basecalling accuracy was assessed by aligning the reads to the GRCh38 human reference using minimap2; alignment statistics were calculated using the stats_from_bam program from the pomoxis software package.
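The relationship between a read's alignment accuracy and the "Q20" label can be sketched as follows. This assumes one common definition of accuracy, namely one minus the fraction of alignment columns that are substitutions, insertions or deletions; it is not confirmed here to be the exact calculation performed by stats_from_bam, and the function names and counts are illustrative.

```python
import math


def alignment_accuracy(matches, substitutions, insertions, deletions):
    """Accuracy as 1 - errors / alignment length.

    The alignment length is taken to be the number of alignment columns:
    exact matches plus substitutions, insertions and deletions.
    """
    errors = substitutions + insertions + deletions
    length = matches + errors
    if length == 0:
        raise ValueError("alignment has no columns")
    return 1 - errors / length


def accuracy_to_qscore(accuracy):
    """Convert an accuracy in [0, 1) to a Phred-scaled Q-score."""
    if not 0 <= accuracy < 1:
        raise ValueError("accuracy must be in [0, 1)")
    return -10 * math.log10(1 - accuracy)


if __name__ == "__main__":
    acc = alignment_accuracy(matches=990, substitutions=5, insertions=3, deletions=2)
    print(round(acc, 4), round(accuracy_to_qscore(acc), 2))
```

Under this definition a read with 10 errors across 1000 alignment columns has 99% accuracy, which corresponds to Q20.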

Basecalling accuracy distribution for Q20 (early access) CliveOME dataset.

Single-molecule read lengths for each of the eight flowcells.

Chris Wright

Senior Director, Customer Bioinformatics

