Data Processing Pipelines QC Metrics

Overview

This page lists the quality control (QC) metrics used to evaluate data quality for each data processing pipeline. Each metric is listed with a link to the tool that computes it and a description of the metric. To suggest a QC metric for a pipeline or to send feedback, please contact us at data-help@humancellatlas.org.

Smart-seq2 Pipeline Metrics

The Smart-seq2 pipeline processes data generated with plate-based, full-transcript Smart-seq2 single-cell RNA sequencing (scRNA-seq) protocols. The metrics below are generated by the quality control module of the pipeline.

| Metric | Program | Details |
| --- | --- | --- |
| RnaSeqMetrics | CollectRnaSeqMetrics | Metrics Definitions |
| DuplicationMetrics | MarkDuplicates | Metrics Definitions |
| AlignmentSummaryMetrics | CollectMultipleMetrics | Metrics Definitions |
| InsertSizeMetrics | CollectMultipleMetrics | Metrics Definitions |
| GcBiasMetrics, GcBiasDetailMetrics | CollectMultipleMetrics | Metrics Definitions |
| QualityYieldMetrics | CollectMultipleMetrics | Metrics Definitions |
| SequencingArtifactMetrics | CollectMultipleMetrics | Metrics Definitions |
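
The Picard programs listed above write their results as plain-text metrics files. As a rough sketch of how such a file could be read, the Python snippet below assumes the usual Picard layout, in which a line starting with `## METRICS CLASS` is followed by a tab-separated header row and one or more value rows. The function name `read_picard_metrics` is hypothetical, and the sample values are made up to illustrate the layout; they are not output from the pipeline.

```python
import csv
import io


def read_picard_metrics(text):
    """Return the metrics rows of a Picard-style metrics file as dicts.

    Assumes the metrics block starts after a '## METRICS CLASS' line and
    ends at the first blank line (or at the end of the file).
    """
    lines = text.splitlines()
    start = None
    for i, line in enumerate(lines):
        if line.startswith("## METRICS CLASS"):
            start = i + 1
            break
    if start is None:
        return []

    block = []
    for line in lines[start:]:
        if not line.strip():
            break
        block.append(line)

    reader = csv.DictReader(io.StringIO("\n".join(block)), delimiter="\t")
    return list(reader)


# Illustrative input only: column names follow RnaSeqMetrics, values are made up.
SAMPLE = (
    "## htsjdk.samtools.metrics.StringHeader\n"
    "# CollectRnaSeqMetrics (illustrative header)\n"
    "\n"
    "## METRICS CLASS\tpicard.analysis.RnaSeqMetrics\n"
    "PF_BASES\tPF_ALIGNED_BASES\tCODING_BASES\n"
    "1000\t900\t450\n"
    "\n"
    "## HISTOGRAM\tjava.lang.Integer\n"
    "normalized_position\tAll_Reads.normalized_coverage\n"
    "0\t0.5\n"
)

rows = read_picard_metrics(SAMPLE)
coding_fraction = int(rows[0]["CODING_BASES"]) / int(rows[0]["PF_ALIGNED_BASES"])
print(rows[0]["PF_BASES"], coding_fraction)
```

Values are returned as strings, so numeric columns need an explicit conversion, as in the `coding_fraction` line.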

Optimus Pipeline Metrics

The Optimus pipeline processes sequencing data generated with the 10x Genomics 3' v2 and v3 assays. The metrics below are calculated using Single Cell Tools (sctools).

| Cell Metrics | Program | Details |
| --- | --- | --- |
| n_reads | SC Tools | The number of reads associated with this entity. Metrics Definitions |
| noise_reads | SC Tools | The number of reads that 10x Genomics Cell Ranger categorizes as "noise": long polymers, or reads with high numbers of N (ambiguous) nucleotides. Metrics Definitions |
| perfect_molecule_barcodes | SC Tools | The number of reads with molecule barcodes that have no errors. Metrics Definitions |
| reads_mapped_exonic | SC Tools | The number of reads for this entity that are mapped to exons. Metrics Definitions |
| reads_mapped_intronic | SC Tools | The number of reads for this entity that are mapped to introns. Metrics Definitions |
| reads_mapped_utr | SC Tools | The number of reads for this entity that are mapped to 3' untranslated regions (UTRs). Metrics Definitions |
| reads_mapped_uniquely | SC Tools | The number of reads mapped to a single unambiguous location in the genome. Metrics Definitions |
| reads_mapped_multiple | SC Tools | The number of reads mapped to multiple genomic positions with equal confidence. Metrics Definitions |
| duplicate_reads | SC Tools | The number of reads that are duplicates (see README.md for the definition of a duplicate). Metrics Definitions |
| spliced_reads | SC Tools | The number of reads that overlap splicing junctions. Metrics Definitions |
| antisense_reads | SC Tools | The number of reads that are mapped to the antisense strand instead of the transcribed strand. Metrics Definitions |
| molecule_barcode_fraction_bases_above_30_mean | SC Tools | The average fraction of bases in molecule barcodes that receive quality scores greater than 30 across the reads of this entity. Metrics Definitions |
| molecule_barcode_fraction_bases_above_30_variance | SC Tools | The variance in the fraction of bases in molecule barcodes that receive quality scores greater than 30 across the reads of this entity. Metrics Definitions |
| genomic_reads_fraction_bases_quality_above_30_mean | SC Tools | The average fraction of bases in the genomic read that receive quality scores greater than 30 across the reads of this entity (included for 10x Cell Ranger count comparison). Metrics Definitions |
| genomic_reads_fraction_bases_quality_above_30_variance | SC Tools | The variance in the fraction of bases in the genomic read that receive quality scores greater than 30 across the reads of this entity (included for 10x Cell Ranger count comparison). Metrics Definitions |
| genomic_read_quality_mean | SC Tools | The average quality of Illumina base calls in the genomic reads corresponding to this entity. Metrics Definitions |
| genomic_read_quality_variance | SC Tools | The variance in quality of Illumina base calls in the genomic reads corresponding to this entity. Metrics Definitions |
| n_molecules | SC Tools | The number of molecules corresponding to this entity (see README.md for the definition of a molecule). Metrics Definitions |
| n_fragments | SC Tools | The number of fragments corresponding to this entity (see README.md for the definition of a fragment). Metrics Definitions |
| reads_per_molecule | SC Tools | The average number of reads associated with each molecule in this entity. Metrics Definitions |
| reads_per_fragment | SC Tools | The average number of reads associated with each fragment in this entity. Metrics Definitions |
| fragments_per_molecule | SC Tools | The average number of fragments associated with each molecule in this entity. Metrics Definitions |
| fragments_with_single_read_evidence | SC Tools | The number of fragments associated with this entity that are observed by only one read. Metrics Definitions |
| molecules_with_single_read_evidence | SC Tools | The number of molecules associated with this entity that are observed by only one read. Metrics Definitions |
| perfect_cell_barcodes | SC Tools | The number of reads whose cell barcodes contain no errors. Metrics Definitions |
| reads_mapped_intergenic | SC Tools | The number of reads mapped to an intergenic region for this cell. Metrics Definitions |
| reads_mapped_too_many_loci | SC Tools | The number of reads that were mapped to too many loci across the genome and, as a consequence, are reported as unmapped by the aligner. Metrics Definitions |
| cell_barcode_fraction_bases_above_30_variance | SC Tools | The variance of the fraction of Illumina base calls for the cell barcode sequence with quality scores greater than 30, across molecules. Metrics Definitions |
| cell_barcode_fraction_bases_above_30_mean | SC Tools | The average fraction of Illumina base calls for the cell barcode sequence with quality scores greater than 30, across molecules. Metrics Definitions |
| n_genes | SC Tools | The number of genes detected in this cell. Metrics Definitions |
| genes_detected_multiple_observations | SC Tools | The number of genes that are observed by more than one read in this cell. Metrics Definitions |
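
As a minimal sketch of how the cell metrics above might be used, the Python snippet below derives two per-cell ratios from the counts and flags cells against cut-offs. It assumes the metrics are available as a CSV with one row per cell and the metric names as column headers. The `cell_barcode` column name, the example values, and the thresholds `MIN_READS` and `MAX_DUPLICATE_FRACTION` are hypothetical and chosen for illustration only.

```python
import csv
import io

# Hypothetical cut-offs for illustration; suitable values depend on the dataset.
MIN_READS = 1000
MAX_DUPLICATE_FRACTION = 0.8

# Made-up rows in an assumed layout: one row per cell, one column per metric.
CELL_METRICS_CSV = """cell_barcode,n_reads,reads_mapped_exonic,duplicate_reads,n_genes
AAACCTGA,5000,3500,1500,1200
AAACGGGT,400,200,100,90
AAAGATGC,8000,4000,7200,300
"""


def summarize_cell(row):
    """Derive simple ratios from the raw counts of one cell."""
    n_reads = int(row["n_reads"])
    exonic = int(row["reads_mapped_exonic"])
    duplicates = int(row["duplicate_reads"])
    return {
        "cell_barcode": row["cell_barcode"],
        "n_reads": n_reads,
        "exonic_fraction": exonic / n_reads if n_reads else 0.0,
        "duplicate_fraction": duplicates / n_reads if n_reads else 0.0,
    }


def passes_qc(summary):
    """Apply the illustrative thresholds defined above."""
    return (
        summary["n_reads"] >= MIN_READS
        and summary["duplicate_fraction"] <= MAX_DUPLICATE_FRACTION
    )


summaries = [summarize_cell(r) for r in csv.DictReader(io.StringIO(CELL_METRICS_CSV))]
kept = [s["cell_barcode"] for s in summaries if passes_qc(s)]
print(kept)
```

In this toy input only the first cell is kept: the second has too few reads and the third has a duplicate fraction above the cut-off.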

| Gene Metrics | Program | Details |
| --- | --- | --- |
| n_reads | SC Tools | The number of reads associated with this entity. Metrics Definitions |
| noise_reads | SC Tools | The number of reads that 10x Genomics Cell Ranger categorizes as "noise": long polymers, or reads with high numbers of N (ambiguous) nucleotides. Metrics Definitions |
| perfect_molecule_barcodes | SC Tools | The number of reads with molecule barcodes that have no errors. Metrics Definitions |
| reads_mapped_exonic | SC Tools | The number of reads for this entity that are mapped to exons. Metrics Definitions |
| reads_mapped_intronic | SC Tools | The number of reads for this entity that are mapped to introns. Metrics Definitions |
| reads_mapped_utr | SC Tools | The number of reads for this entity that are mapped to 3' untranslated regions (UTRs). Metrics Definitions |
| reads_mapped_uniquely | SC Tools | The number of reads mapped to a single unambiguous location in the genome. Metrics Definitions |
| reads_mapped_multiple | SC Tools | The number of reads mapped to multiple genomic positions with equal confidence. Metrics Definitions |
| duplicate_reads | SC Tools | The number of reads that are duplicates (see README.md for the definition of a duplicate). Metrics Definitions |
| spliced_reads | SC Tools | The number of reads that overlap splicing junctions. Metrics Definitions |
| antisense_reads | SC Tools | The number of reads that are mapped to the antisense strand instead of the transcribed strand. Metrics Definitions |
| molecule_barcode_fraction_bases_above_30_mean | SC Tools | The average fraction of bases in molecule barcodes that receive quality scores greater than 30 across the reads of this entity. Metrics Definitions |
| molecule_barcode_fraction_bases_above_30_variance | SC Tools | The variance in the fraction of bases in molecule barcodes that receive quality scores greater than 30 across the reads of this entity. Metrics Definitions |
| genomic_reads_fraction_bases_quality_above_30_mean | SC Tools | The average fraction of bases in the genomic read that receive quality scores greater than 30 across the reads of this entity (included for 10x Cell Ranger count comparison). Metrics Definitions |
| genomic_reads_fraction_bases_quality_above_30_variance | SC Tools | The variance in the fraction of bases in the genomic read that receive quality scores greater than 30 across the reads of this entity (included for 10x Cell Ranger count comparison). Metrics Definitions |
| genomic_read_quality_mean | SC Tools | The average quality of Illumina base calls in the genomic reads corresponding to this entity. Metrics Definitions |
| genomic_read_quality_variance | SC Tools | The variance in quality of Illumina base calls in the genomic reads corresponding to this entity. Metrics Definitions |
| n_molecules | SC Tools | The number of molecules corresponding to this entity (see README.md for the definition of a molecule). Metrics Definitions |
| n_fragments | SC Tools | The number of fragments corresponding to this entity (see README.md for the definition of a fragment). Metrics Definitions |
| reads_per_molecule | SC Tools | The average number of reads associated with each molecule in this entity. Metrics Definitions |
| reads_per_fragment | SC Tools | The average number of reads associated with each fragment in this entity. Metrics Definitions |
| fragments_per_molecule | SC Tools | The average number of fragments associated with each molecule in this entity. Metrics Definitions |
| fragments_with_single_read_evidence | SC Tools | The number of fragments associated with this entity that are observed by only one read. Metrics Definitions |
| molecules_with_single_read_evidence | SC Tools | The number of molecules associated with this entity that are observed by only one read. Metrics Definitions |
| number_cells_detected_multiple | SC Tools | The number of cells in which this gene is observed by more than one read. Metrics Definitions |
| number_cells_expressing | SC Tools | The number of cells in which this gene is detected. Metrics Definitions |
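
The averages `reads_per_molecule`, `reads_per_fragment`, and `fragments_per_molecule` relate the counts `n_reads`, `n_fragments`, and `n_molecules`. The Python sketch below assumes each average is the simple ratio of the corresponding counts. That is one reading of the descriptions above, not a confirmed account of how sctools computes them. The gene names and counts are made up, and the function `ratio_metrics` is hypothetical.

```python
from fractions import Fraction

# Made-up counts per gene, for illustration only.
GENE_COUNTS = {
    "GENE_A": {"n_reads": 120, "n_fragments": 60, "n_molecules": 20},
    "GENE_B": {"n_reads": 9, "n_fragments": 9, "n_molecules": 3},
}


def ratio_metrics(counts):
    """Compute the three averages as plain ratios of the counts (an assumption).

    Expects non-zero fragment and molecule counts.
    """
    reads = Fraction(counts["n_reads"])
    fragments = Fraction(counts["n_fragments"])
    molecules = Fraction(counts["n_molecules"])
    return {
        "reads_per_molecule": reads / molecules,
        "reads_per_fragment": reads / fragments,
        "fragments_per_molecule": fragments / molecules,
    }


for gene, counts in GENE_COUNTS.items():
    m = ratio_metrics(counts)
    # Under this assumption the three ratios are linked by a simple identity.
    assert m["reads_per_molecule"] == m["reads_per_fragment"] * m["fragments_per_molecule"]
    print(gene, {name: float(value) for name, value in m.items()})
```

Using `Fraction` keeps the arithmetic exact, so the identity check does not depend on floating-point rounding.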