Summary

Experiment summary
Input reads 16,004
Estimated cells 1,962
Mean reads per cell 8
Median UMI counts per cell 4
Median genes per cell 4
Barcode rank plot
Alignment / feature summary
Reads aligned 15,397
Reads mapping to genome 15,041
Supplementary 269
Unmapped 356
Unique genes detected 220
Unique isoforms detected 197
  • Input reads: The total number of reads in the input data.
  • Estimated cells: The estimated number of real cells identified by the workflow.
  • Mean reads per cell: The average number of reads per real cell.
  • Median UMI counts per cell: The median number of unique molecular identifiers (UMIs) per real cell.
  • Median genes per cell: The median number of unique genes identified per real cell.
Cells are ranked by read count in descending order on the x-axis, and the read count for each barcode is displayed on the y-axis. Only high-quality barcodes are used to generate the rank plot (minimum qscore of 15 and a 100% match to the 10x whitelist).

The dashed line indicates the read count threshold that was determined by the workflow. Barcodes to the left of this point are considered "real cells", and those to the right are considered as non-cell barcodes and are not included in the downstream analysis.
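The ranking and thresholding described above can be illustrated with a minimal sketch. The barcodes, counts and threshold below are hypothetical toy values, not taken from this report, and the function names are invented for illustration. How the workflow derives its threshold is not stated here, so the threshold is simply passed in, and treating a count equal to the threshold as a cell is an assumption.

```python
def rank_barcodes(read_counts):
    """Return (barcode, count) pairs sorted by read count, highest first."""
    return sorted(read_counts.items(), key=lambda kv: kv[1], reverse=True)


def split_by_threshold(ranked, threshold):
    """Split ranked barcodes into cells and non-cell barcodes.

    Assumption: a barcode whose count equals the threshold is kept as a cell.
    """
    cells = [bc for bc, n in ranked if n >= threshold]
    non_cells = [bc for bc, n in ranked if n < threshold]
    return cells, non_cells


# Hypothetical toy data (not from this report)
counts = {"AACT": 3, "AAAC": 120, "AAGT": 1, "AAAG": 95}
ranked = rank_barcodes(counts)
cells, non_cells = split_by_threshold(ranked, threshold=10)
```

In this toy case the two high-count barcodes fall to the left of the threshold and would be carried forward as cells; the remaining two would be excluded from downstream analysis.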
  • Reads aligned: The total number of reads that were aligned to the reference genome sequence. This number excludes reads where the expected adapters were not found.
  • Reads mapping to genome: The number of primary alignments.
  • Supplementary: The number of supplementary alignments. These can be indicative of fusion genes or chimeric reads.
  • Unmapped: The number of reads that were not mapped to the reference genome.
  • Unique genes/isoforms detected: The total number of features identified across all cells.
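The alignment counts can be cross-checked with simple arithmetic. The sketch below uses the figures reported for the four samples in this report; sample names are not shown in the summary, so the samples are numbered in order of appearance, which is an assumption of this example. In all four samples, "Reads mapping to genome" plus "Unmapped" equals "Reads aligned", so the mapped fraction is computed here as primary alignments divided by that total. This is an observation from the reported numbers, not a statement of how the workflow computes them.

```python
# Values copied from the alignment / feature summaries in this report.
# Sample numbering (1-4) is by order of appearance and is an assumption.
samples = {
    1: {"reads_aligned": 15397, "primary": 15041, "unmapped": 356},
    2: {"reads_aligned": 4520, "primary": 1314, "unmapped": 3206},
    3: {"reads_aligned": 3457, "primary": 927, "unmapped": 2530},
    4: {"reads_aligned": 4329, "primary": 1049, "unmapped": 3280},
}


def mapped_fraction(primary, unmapped):
    """Fraction of reads with a primary alignment among primary + unmapped."""
    total = primary + unmapped
    if total == 0:
        raise ValueError("no reads to summarise")
    return primary / total


for name, s in samples.items():
    # Consistency check: primary + unmapped matches the reported total
    consistent = s["primary"] + s["unmapped"] == s["reads_aligned"]
    frac = mapped_fraction(s["primary"], s["unmapped"])
    print(f"sample {name}: mapped fraction {frac:.3f}, totals consistent: {consistent}")
```

On these numbers the first sample maps far better than the other three, which is worth bearing in mind when reading the per-cell metrics.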
Experiment summary
Input reads 4,870
Estimated cells 1,263
Mean reads per cell 4
Median UMI counts per cell 1
Median genes per cell 1
Barcode rank plot
Alignment / feature summary
Reads aligned 4,520
Reads mapping to genome 1,314
Supplementary 25
Unmapped 3,206
Unique genes detected 67
Unique isoforms detected 58
Experiment summary
Input reads 4,825
Estimated cells 690
Mean reads per cell 7
Median UMI counts per cell 1
Median genes per cell 1
Barcode rank plot
Alignment / feature summary
Reads aligned 3,457
Reads mapping to genome 927
Supplementary 12
Unmapped 2,530
Unique genes detected 55
Unique isoforms detected 44
Experiment summary
Input reads 4,970
Estimated cells 1,112
Mean reads per cell 4
Median UMI counts per cell 1
Median genes per cell 1
Barcode rank plot
Alignment / feature summary
Reads aligned 4,329
Reads mapping to genome 1,049
Supplementary 71
Unmapped 3,280
Unique genes detected 22
Unique isoforms detected 17

Read summary

Read assignment summary

         Full length  Valid barcode  Gene assigned  Transcript assigned
Reads         15,552         13,774          6,015                4,458
% of FL       100.00          88.57          38.68                28.67
  • Full length: Proportion of reads containing adapters in the expected configuration. Full-length reads are carried forward in the workflow for barcode and UMI assignment.
  • Valid barcode: Proportion of reads that have been assigned corrected cell barcodes and UMIs. All reads with valid barcodes are used in the subsequent stages of the workflow.
  • Gene assigned: Proportion of reads unambiguously assigned to a gene.
  • Transcript assigned: Proportion of reads unambiguously assigned to a transcript.
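The percentage row in the table can be reproduced by dividing each read count by the full-length read count. The sketch below uses the counts from the first table in this section; the function name is invented for illustration, and rounding to two decimal places is assumed from the displayed values.

```python
def percent_of_full_length(stage_reads, full_length_reads):
    """Express a read count as a percentage of the full-length read count."""
    if full_length_reads <= 0:
        raise ValueError("full-length read count must be positive")
    return round(100.0 * stage_reads / full_length_reads, 2)


# Read counts from the first read assignment table in this report
counts = {
    "Full length": 15552,
    "Valid barcode": 13774,
    "Gene assigned": 6015,
    "Transcript assigned": 4458,
}

percentages = {
    stage: percent_of_full_length(n, counts["Full length"])
    for stage, n in counts.items()
}
```

Each stage is a subset of the one before it, so the percentages can only decrease from left to right; a sharp drop at one stage points to where reads are being lost.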
         Full length  Valid barcode  Gene assigned  Transcript assigned
Reads          4,565            879            298                  227
% of FL       100.00          19.26           6.53                 4.97
         Full length  Valid barcode  Gene assigned  Transcript assigned
Reads          3,687            244            108                   81
% of FL       100.00           6.62           2.93                 2.20
         Full length  Valid barcode  Gene assigned  Transcript assigned
Reads          4,440            417             37                   28
% of FL       100.00           9.39           0.83                 0.63

Adapter configuration



Full-length reads are defined as those flanked by primers/adapters in the expected orientations: adapter1---cDNA---adapter2.

These full-length reads can then be oriented in the same way and are used in the next stages of the workflow. If `full_length_only` is set to `false`, reads with all primer configurations are analysed.

Every library prep contains some reads with non-standard adapter configurations. Unless `full_length_only` is set to `false`, these reads are not used in subsequent stages of the workflow. The plots show the proportion of each adapter configuration within each sample, which can help in diagnosing library preparation issues. For most applications, the majority of reads should be full_length.

The adapters used to identify read segments vary slightly between the supported kits. They are:

3prime, multiome and visium kits:

  • Adapter1: Read1
  • Adapter2: TSO

5prime kit:

  • Adapter1: Read1
  • Adapter2: Non-Poly(dT) RT primer
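As a rough illustration of the idea, the sketch below classifies reads by the order and strand of the adapters found along them, then reports the proportion of each class. It is a sketch under stated assumptions, not the workflow's implementation: the representation of adapter hits as (name, strand) pairs, the function names, and every label other than `full_length` are hypothetical.

```python
from collections import Counter

# Assumption: a read from the forward strand shows adapter1 then adapter2,
# and a read from the reverse strand shows the reverse complements in the
# opposite order. Both count as full length and can be re-oriented.
FORWARD = (("adapter1", "+"), ("adapter2", "+"))
REVERSE = (("adapter2", "-"), ("adapter1", "-"))


def classify_configuration(hits):
    """Classify a read from its ordered adapter hits (hypothetical labels)."""
    hits = tuple(hits)
    if hits in (FORWARD, REVERSE):
        return "full_length"
    if len(hits) == 0:
        return "no_adapters"
    if len(hits) == 1:
        return "single_adapter"
    return "other"


def configuration_proportions(reads):
    """Return the fraction of reads in each adapter configuration class."""
    counts = Counter(classify_configuration(h) for h in reads)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {label: n / total for label, n in counts.items()}


# Hypothetical toy reads (not from this report)
toy_reads = [
    [("adapter1", "+"), ("adapter2", "+")],  # full length, forward
    [("adapter2", "-"), ("adapter1", "-")],  # full length, reverse
    [("adapter1", "+")],                     # only one adapter found
    [("adapter1", "+"), ("adapter1", "-")],  # non-standard pairing
]
proportions = configuration_proportions(toy_reads)
```

In this toy set half of the reads are full length; in a real sample a low full-length fraction would be the signal to look at library preparation.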

Saturation

Sequencing saturation is an indication of how well the diversity of a library has been captured in an experiment. As sequencing depth increases, the number of detected genes and unique molecular identifiers (UMIs) will also increase at a rate that depends on the complexity of the input library. The curve gradient indicates the rate at which new genes or UMIs are being recovered; as saturation increases, the curve flattens.

  • Gene saturation: Genes per cell as a function of read depth.
  • UMI saturation: UMIs per cell as a function of read depth.
  • Sequencing saturation: This metric is a measure of the proportion of reads that come from a previously observed UMI, and is calculated with the following formula: 1 - (number of unique UMIs / number of reads).
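The sequencing saturation formula above can be written as a short function. This is a minimal sketch of the stated formula only; the function name and the input validation are additions for illustration, and the example counts are hypothetical rather than taken from this report.

```python
def sequencing_saturation(n_unique_umis, n_reads):
    """Compute 1 - (number of unique UMIs / number of reads)."""
    if n_reads <= 0:
        raise ValueError("n_reads must be positive")
    if not 0 <= n_unique_umis <= n_reads:
        raise ValueError("n_unique_umis must be between 0 and n_reads")
    return 1.0 - n_unique_umis / n_reads


# Hypothetical example: 80 unique UMIs observed across 100 reads,
# so roughly a fifth of the reads come from an already observed UMI.
saturation = sequencing_saturation(n_unique_umis=80, n_reads=100)
```

A value near 0 means almost every read reveals a new UMI, so deeper sequencing should recover more molecules; a value near 1 means most reads are repeat observations and additional depth adds little.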

UMAP projections

This section presents various UMAP projections of the data. UMAP is an unsupervised algorithm that projects the multidimensional single cell expression data into 2 dimensions. This could reveal structure in the data representing, for example, different cell types or cells that share common regulatory pathways. The UMAP algorithm is stochastic; analysing the same data multiple times with UMAP, using identical parameters, can lead to visually different projections. To give some confidence in the observed results, the projection is run multiple times and the resulting series of UMAP projections can be viewed below.

No data for: COX16, AAGAB, CD70, NOGENE (message repeated for each of the 12 projection panels; CD70 is listed for the first six panels only).

Software versions

Name Version
pysam 0.22.0
parasail 1.2.3
pandas 2.0.3
rapidfuzz 2.13.7
scikit-learn 1.3.2
minimap2 2.24-r1122
samtools 1.20
bedtools v2.30.0
gffread 0.12.7
seqkit v2.9.0
stringtie 2.2.2

Workflow parameters

Key Value
fastq wf-single-cell/data/test_data/fastq/
bam None
out_dir wf-single-cell
sample_sheet None
sample None
single_cell_sample_sheet wf-single-cell/data/test_data/samples.test.csv
kit_config None
kit None
threads 4
full_length_only True
min_read_qual None
fastq_chunk 2500
barcode_adapter1_suff_length 10
barcode_min_quality 15
barcode_max_ed 2
barcode_min_ed_diff 2
gene_assigns_minqv 30
matrix_min_genes 1
matrix_min_cells 1
matrix_max_mito 100
matrix_norm_count 10000
genes_of_interest None
umap_n_repeats 3
expected_cells None
estimate_cell_count True
mito_prefix MT-
stringtie_opts -c 2
call_variants False
report_variants None
call_fusions False
ref_genome_dir wf-single-cell/data/test_data/refdata-gex-GRCh38-2020-A
ctat_resources None
epi2me_resource_bundle None
store_dir wf-single-cell/store_dir
resource_bundles {'gex-GRCh38-2024-A': {'10x': 'https://ont-exd-int-s3-euwst1-epi2me-labs.s3.amazonaws.com/wf-single-cell/refdata-gex-GRCh38-2024-A.tar.gz', 'ctat-lr-fusion': 'https://ont-exd-int-s3-euwst1-epi2me-labs.s3.amazonaws.com/wf-single-cell/ctat_genome_lib_10x_2024.tar.gz'}, 'gex-GRCh38-2024-A_chr_20-21': {'10x': 'https://ont-exd-int-s3-euwst1-epi2me-labs.s3.amazonaws.com/wf-single-cell/refdata-gex-GRCh38-2024-A_chr20_21.tar.gz', 'ctat-lr-fusion': 'https://ont-exd-int-s3-euwst1-epi2me-labs.s3.amazonaws.com/wf-single-cell/ctat_genome_lib_chr20_21_UyHq1cFI.tar.gz'}}