Plasmid validation

Published in Data Releases
April 16, 2025

Overview

Oxford Nanopore Technologies’ rapid barcoding kit (RBK114) may be used to prepare multiplexed sequencing libraries from laboratory cloning plasmids. The widely used EPI2ME bioinformatics workflow, wf-clone-validation, takes RBK114-derived sequencing data (generated on MinION, GridION or PromethION devices) and assembles complete plasmid sequences. The assembled plasmids can be used to verify that the correct insert has been cloned and can provide additional information on the integrity of the plasmid backbone.

The wf-clone-validation workflow ships with only a minimal FASTQ dataset to demonstrate a successful analysis, and users have asked for more illustrative examples.

The dataset provided in this release contains sequencing data from 96 different plasmids that have been designed to address questions frequently raised when discussing plasmid sequencing and the associated bioinformatics.

  • Range of plasmid sizes from 2kb to 7kb
  • Duplicated inserts with sizes from 1kb to 2.5kb
  • Multimeric inserts - collections of trimers, tetramers, pentamers and hexamers
  • Low GC content inserts also containing various multimers
  • DAM and DCM methylation sites
  • Generic protein-coding plasmids
  • Complete plasmid duplications

The POD5 signal data is provided along with both HAC and SUP basecalls. The reference information for the plasmids and their inserts is also provided.

This data collection therefore represents a variety of plasmids with varying degrees of assembly difficulty. Flye is the default assembler in wf-clone-validation because of its continued support and its performance in most use cases. However, users working with small plasmids (typically <= 3kb) may find the alternative Canu assembler, which is also provided in wf-clone-validation, more successful.
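The size-based rule of thumb above can be written down as a small shell helper. This is only a sketch: the function name `pick_assembler` is hypothetical, the 3000 bp cut-off is taken from the "typically <= 3kb" guidance rather than from any tested threshold, and the `--assembly_tool` option mentioned in the comment is an assumption that should be checked against the workflow's own help output.

```shell
#!/usr/bin/env bash
# Sketch: choose an assembler name from an approximate plasmid size in bp.
# pick_assembler is a hypothetical helper, not part of wf-clone-validation.
pick_assembler() {
    local size_bp="$1"
    local cutoff_bp="${2:-3000}"   # assumed cut-off, from the "<= 3kb" guidance
    if [ "$size_bp" -le "$cutoff_bp" ]; then
        echo "canu"
    else
        echo "flye"
    fi
}

# Example (assumption: the workflow accepts --assembly_tool; confirm with
# `nextflow run epi2me-labs/wf-clone-validation --help` before relying on it):
#   nextflow run epi2me-labs/wf-clone-validation \
#       --fastq reads/ --assembly_tool "$(pick_assembler 2500)"
```

The cut-off is exposed as an optional second argument so that it can be adjusted if a particular set of constructs behaves differently.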

The dataset also contains laboratory artefacts - can you find sequence reads in any samples that do not appear to belong?

Dataset

The dataset is available for anonymous download, without login, from a public Amazon Web Services S3 bucket. The bucket is part of the Open Data on AWS project enabling sharing and analysis of a wide range of data. The data can be downloaded with the AWS CLI command:

aws s3 sync --no-sign-request s3://ont-open-data/plasmid_2025.04 plasmid_2025.04

See the tutorials page for information on downloading the dataset. You can also browse and download the files in your web browser courtesy of 42basepairs.

Folder name | Size | Description
RAW | 254 GB | POD5/Flowcell files
Basecalls | 90 GB | BAM files
Analysis | 1.6 GB | Workflow outputs
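Because the raw signal folder is far larger than the other two, it can be worth previewing a transfer before starting it. The sketch below wraps two standard AWS CLI calls: `aws s3 ls` to list what sits under the dataset prefix, and `aws s3 sync --dryrun` to print what would be copied without transferring anything. The function names are hypothetical, and the sub-folder names used on S3 are an assumption: only `analysis` appears in a command in this release note, so list the prefix first to confirm the others.

```shell
#!/usr/bin/env bash
# Dataset location taken from the download command in this release note.
DATASET_URI="s3://ont-open-data/plasmid_2025.04"

# List the top-level entries under the dataset prefix (anonymous access).
list_dataset() {
    aws s3 ls --no-sign-request "${DATASET_URI}/"
}

# Preview a download of one sub-folder; --dryrun prints the planned
# operations without transferring any data.
# $1 = sub-folder name on S3, $2 = local destination (defaults to $1).
preview_subfolder() {
    local subfolder="$1"
    local dest="${2:-$1}"
    aws s3 sync --no-sign-request --dryrun "${DATASET_URI}/${subfolder}" "${dest}"
}

# Example: preview_subfolder analysis
```

Dropping `--dryrun` from the second function turns the preview into the actual download.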

Sample

Attribute | Value
Sample Name | Sample01-96
Organism | synthetic construct
Molecule Type | DNA
Sample Type | glycerol stock
Biological replicates | 2
Flow Cell replicates | 2
Link to sample source | Not publicly available

Preparation

Sample preparation was performed according to protocols published on the main Oxford Nanopore Technologies’ website.

Attribute | Value
Extraction | Plasmid Extraction
Library Prep | SQK-RBK114
Kit | SQK-RBK114

Further preparation information such as sample storage suggestions can be found at https://nanoporetech.com/documentation/prepare.

Sequencing

Sequence data were generated using the following configuration:

Attribute | Value
Flow Cell | FLO-MIN114
Device | GridION
Chemistry | R10.4.1
Basecall model | dna_r10.4.1_e8.2_400bps_sup@v5.0.0, dna_r10.4.1_e8.2_400bps_hac@v5.0.0
MinKNOW version | 6.2.6
Dorado version (for rebasecalling) | 0.9.1
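For readers who want to rebasecall the POD5 data themselves, the sketch below assembles a `dorado basecaller` command line for either of the two accuracy modes listed above. It prints the command rather than running it, since basecalling needs the POD5 files and, in practice, a GPU. The function name and the directory argument are hypothetical, and the `hac@v5.0.0` / `sup@v5.0.0` shorthand is assumed to resolve to the full model names in the table. Demultiplexing options are deliberately left out because the exact barcode kit variant is not stated here.

```shell
# Sketch: print a dorado command for one of the two accuracy modes.
# rebasecall_cmd is a hypothetical helper; $1 = hac or sup, $2 = POD5 directory.
rebasecall_cmd() {
    local mode="$1"
    local pod5_dir="$2"
    case "$mode" in
        hac|sup) ;;
        *) echo "mode must be hac or sup" >&2; return 1 ;;
    esac
    # Shorthand model selection, assumed to map to the v5.0.0 models in the table.
    echo "dorado basecaller ${mode}@v5.0.0 ${pod5_dir}"
}

# Example: rebasecall_cmd sup RAW/
# When running the printed command, redirect dorado's standard output
# to a file, e.g. `... > calls.bam`.
```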

Analysis

Analysis outputs are available in our S3 bucket and can be downloaded with the following command:

aws s3 sync --no-sign-request s3://ont-open-data/plasmid_2025.04/analysis analysis

EPI2ME workflows used to generate analysis outputs:

Other software used for analysis:

Tool | Version
dorado | 0.9.1
