Here we provide the quality control (QC) metrics used to evaluate data quality for each data processing pipeline. Each metric is listed along with a link to the tool used to perform that particular analysis and a description of the metric. If you want to submit a QC metric for addition to a pipeline or for feedback, please contact us at data-help@humancellatlas.org.
The Smart-seq2 pipeline processes data generated from plate-based, full-transcript Smart-seq2 single-cell RNA sequencing (scRNA-seq) protocols. The metrics below are generated by the quality control module of the pipeline.

Metric | Program | Details |
---|---|---|
RnaSeqMetrics | CollectRnaSeqMetrics | Metrics Definitions |
DuplicationMetrics | MarkDuplicates | Metrics Definitions |
AlignmentSummaryMetrics | CollectMultipleMetrics | Metrics Definitions |
InsertSizeMetrics | CollectMultipleMetrics | Metrics Definitions |
GcBiasMetrics,GcBiasDetailMetrics | CollectMultipleMetrics | Metrics Definitions |
QualityYieldMetrics | CollectMultipleMetrics | Metrics Definitions |
SequencingArtifactMetrics | CollectMultipleMetrics | Metrics Definitions |
HISAT2 Metrics | HISAT2 | HISAT2 alignment summary metrics |
RSEM Metrics | RSEM | Metrics from the RSEM cnt file |
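
Several of the metrics above are written by Picard tools as tab-separated text files, in which a `## METRICS CLASS` line is followed by a header row and one or more value rows. The sketch below shows one way such a file could be read into dictionaries. The function name `parse_picard_metrics` and the example values are hypothetical, and the sketch assumes the standard Picard layout rather than any file layout specific to this pipeline.

```python
from typing import Dict, List


def parse_picard_metrics(text: str) -> List[Dict[str, str]]:
    """Return the rows of the first METRICS CLASS table in a Picard-style metrics file."""
    lines = text.splitlines()
    rows: List[Dict[str, str]] = []
    for i, line in enumerate(lines):
        if line.startswith("## METRICS CLASS"):
            if i + 1 >= len(lines):
                break  # marker without a header row; nothing to parse
            header = lines[i + 1].split("\t")
            for data_line in lines[i + 2:]:
                if not data_line.strip():
                    break  # a blank line ends the metrics table
                rows.append(dict(zip(header, data_line.split("\t"))))
            break
    return rows


# Made-up example in the Picard metrics layout (values are illustrative only).
example = (
    "## htsjdk.samtools.metrics.StringHeader\n"
    "# illustrative command line\n"
    "\n"
    "## METRICS CLASS\tpicard.analysis.RnaSeqMetrics\n"
    "PF_BASES\tPF_ALIGNED_BASES\tPCT_MRNA_BASES\n"
    "1000\t900\t0.75\n"
    "\n"
)

rows = parse_picard_metrics(example)
print(rows[0]["PCT_MRNA_BASES"])  # prints 0.75
```

Values are kept as strings because Picard columns mix integers, fractions, and labels; convert only the fields you need.
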

This pipeline processes data generated with the 10x Genomics 3' v2 and v3 assays. The metrics below are calculated using Single Cell Tools (sctools).

Cell Metrics | Program | Details |
---|---|---|
n_reads | SC Tools | The number of reads associated with this entity. Metrics Definitions |
noise_reads | SC Tools | Number of reads that are categorized by 10x Genomics Cell Ranger as "noise". This refers to long polymers or to reads with high numbers of N (ambiguous) nucleotides. Metrics Definitions |
perfect_molecule_barcodes | SC Tools | The number of reads with molecule barcodes that have no errors. Metrics Definitions |
n_mitochondrial_genes | SC Tools | The number of mitochondrial genes detected in this cell. Metrics Definitions |
n_mitochondrial_molecules | SC Tools | The number of molecules from mitochondrial genes detected for this cell. Metrics Definitions |
pct_mitochondrial_molecules | SC Tools | The percentage of molecules from mitochondrial genes detected for this cell. Metrics Definitions |
reads_mapped_exonic | SC Tools | The number of reads for this entity that are mapped to exons. Metrics Definitions |
reads_mapped_intronic | SC Tools | The number of reads for this entity that are mapped to introns. Metrics Definitions |
reads_mapped_utr | SC Tools | The number of reads for this entity that are mapped to 3' untranslated regions (UTRs). Metrics Definitions |
reads_mapped_uniquely | SC Tools | The number of reads mapped to a single unambiguous location in the genome. Metrics Definitions |
reads_mapped_multiple | SC Tools | The number of reads mapped to multiple genomic positions with equal confidence. Metrics Definitions |
duplicate_reads | SC Tools | The number of reads that are duplicates (see README.md for definition of a duplicate). Metrics Definitions |
spliced_reads | SC Tools | The number of reads that overlap splicing junctions. Metrics Definitions |
antisense_reads | SC Tools | The number of reads that are mapped to the antisense strand instead of the transcribed strand. Metrics Definitions |
molecule_barcode_fraction_bases_above_30_mean | SC Tools | The average fraction of bases in molecule barcodes that receive quality scores greater than 30 across the reads of this entity. Metrics Definitions |
molecule_barcode_fraction_bases_above_30_variance | SC Tools | The variance in the fraction of bases in molecule barcodes that receive quality scores greater than 30 across the reads of this entity. Metrics Definitions |
genomic_reads_fraction_bases_quality_above_30_mean | SC Tools | The average fraction of bases in the genomic read that receive quality scores greater than 30 across the reads of this entity (included for 10x Cell Ranger count comparison). Metrics Definitions |
genomic_reads_fraction_bases_quality_above_30_variance | SC Tools | The variance in the fraction of bases in the genomic read that receive quality scores greater than 30 across the reads of this entity (included for 10x Cell Ranger count comparison). Metrics Definitions |
genomic_read_quality_mean | SC Tools | Average quality of Illumina base calls in the genomic reads corresponding to this entity. Metrics Definitions |
genomic_read_quality_variance | SC Tools | Variance in quality of Illumina base calls in the genomic reads corresponding to this entity. Metrics Definitions |
n_molecules | SC Tools | Number of molecules corresponding to this entity. See README.md for the definition of a Molecule. Metrics Definitions |
n_fragments | SC Tools | Number of fragments corresponding to this entity. See README.md for the definition of a Fragment. Metrics Definitions |
reads_per_fragment | SC Tools | The average number of reads associated with each fragment in this entity. Metrics Definitions |
fragments_per_molecule | SC Tools | The average number of fragments associated with each molecule in this entity. Metrics Definitions |
fragments_with_single_read_evidence | SC Tools | The number of fragments associated with this entity that are observed by only one read. Metrics Definitions |
molecules_with_single_read_evidence | SC Tools | The number of molecules associated with this entity that are observed by only one read. Metrics Definitions |
perfect_cell_barcodes | SC Tools | The number of reads whose cell barcodes contain no error. Metrics Definitions |
reads_mapped_intergenic | SC Tools | The number of reads mapped to an intergenic region for this cell. Metrics Definitions |
reads_mapped_too_many_loci | SC Tools | The number of reads that were mapped to too many loci across the genome and as a consequence, are reported unmapped by the aligner. Metrics Definitions |
cell_barcode_fraction_bases_above_30_variance | SC Tools | The variance in the fraction of Illumina base calls in the cell barcode sequences that have quality scores greater than 30, across molecules. Metrics Definitions |
cell_barcode_fraction_bases_above_30_mean | SC Tools | The average fraction of Illumina base calls in the cell barcode sequences that have quality scores greater than 30, across molecules. Metrics Definitions |
n_genes | SC Tools | The number of genes detected in this cell. Metrics Definitions |
genes_detected_multiple_observations | SC Tools | The number of genes that are observed by more than one read in this cell. Metrics Definitions |
reads_unmapped | SC Tools | The number of reads that are non-transcriptomic. |
emptydrops_FDR | DropletUtils | False Discovery Rate (FDR) for being a non-empty droplet; not included when running in single-nuclei mode |
emptydrops_IsCell | DropletUtils | Binarized call of cell/background based on predefined FDR cutoff; not included when running in single-nuclei mode |
emptydrops_Limited | DropletUtils | Indicates whether a lower p-value could be obtained by increasing the number of iterations; not included when running in single-nuclei mode |
emptydrops_LogProb | DropletUtils | The log-probability of observing the barcode’s count vector under the null model; not included when running in single-nuclei mode |
emptydrops_PValue | DropletUtils | Numeric, the Monte Carlo p-value against the null model; not included when running in single-nuclei mode |
emptydrops_Total | DropletUtils | Numeric, the total read counts for each barcode; not included when running in single-nuclei mode |
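
As an illustration of how the cell metrics above might be combined into a simple filter, the sketch below keeps cells that are called non-empty, express enough genes, and have a low mitochondrial percentage. The thresholds, the `barcode` key, and the function name `passes_qc` are all hypothetical, and the sketch assumes `pct_mitochondrial_molecules` is on a 0 to 100 scale and `emptydrops_IsCell` has been loaded as a boolean. Suitable cutoffs depend on tissue and protocol.

```python
from typing import Any, Dict

# Hypothetical thresholds for illustration only; not recommended defaults.
MAX_PCT_MITOCHONDRIAL = 20.0  # assumes a 0-100 percentage scale
MIN_GENES = 200


def passes_qc(cell: Dict[str, Any]) -> bool:
    """Return True when a cell's metrics satisfy the illustrative thresholds."""
    # emptydrops_* metrics are not included in single-nuclei mode,
    # so a missing call is treated here as "do not filter on it".
    is_cell = cell.get("emptydrops_IsCell", True)
    return (
        bool(is_cell)
        and cell["n_genes"] >= MIN_GENES
        and cell["pct_mitochondrial_molecules"] <= MAX_PCT_MITOCHONDRIAL
    )


# Made-up rows keyed by metric name (values are illustrative only).
cells = [
    {"barcode": "AAAC", "n_genes": 1500, "pct_mitochondrial_molecules": 4.2, "emptydrops_IsCell": True},
    {"barcode": "AAAG", "n_genes": 90, "pct_mitochondrial_molecules": 3.0, "emptydrops_IsCell": True},
    {"barcode": "AAAT", "n_genes": 1200, "pct_mitochondrial_molecules": 45.0, "emptydrops_IsCell": True},
    {"barcode": "AACA", "n_genes": 800, "pct_mitochondrial_molecules": 2.0, "emptydrops_IsCell": False},
    {"barcode": "AACC", "n_genes": 700, "pct_mitochondrial_molecules": 1.0},  # no emptydrops call
]

kept = [c["barcode"] for c in cells if passes_qc(c)]
print(kept)  # prints ['AAAC', 'AACC']
```
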

Gene Metrics | Program | Details |
---|---|---|
n_reads | SC Tools | The number of reads associated with this entity. Metrics Definitions |
noise_reads | SC Tools | Number of reads that are categorized by 10x Genomics Cell Ranger as "noise". This refers to long polymers or to reads with high numbers of N (ambiguous) nucleotides. Metrics Definitions |
perfect_molecule_barcodes | SC Tools | The number of reads with molecule barcodes that have no errors. Metrics Definitions |
reads_mapped_exonic | SC Tools | The number of reads for this entity that are mapped to exons. Metrics Definitions |
reads_mapped_intronic | SC Tools | The number of reads for this entity that are mapped to introns. Metrics Definitions |
reads_mapped_utr | SC Tools | The number of reads for this entity that are mapped to 3' untranslated regions (UTRs). Metrics Definitions |
reads_mapped_uniquely | SC Tools | The number of reads mapped to a single unambiguous location in the genome. Metrics Definitions |
reads_mapped_multiple | SC Tools | The number of reads mapped to multiple genomic positions with equal confidence. Metrics Definitions |
duplicate_reads | SC Tools | The number of reads that are duplicates (see README.md for definition of a duplicate). Metrics Definitions |
spliced_reads | SC Tools | The number of reads that overlap splicing junctions. Metrics Definitions |
antisense_reads | SC Tools | The number of reads that are mapped to the antisense strand instead of the transcribed strand. Metrics Definitions |
molecule_barcode_fraction_bases_above_30_mean | SC Tools | The average fraction of bases in molecule barcodes that receive quality scores greater than 30 across the reads of this entity. Metrics Definitions |
molecule_barcode_fraction_bases_above_30_variance | SC Tools | The variance in the fraction of bases in molecule barcodes that receive quality scores greater than 30 across the reads of this entity. Metrics Definitions |
genomic_reads_fraction_bases_quality_above_30_mean | SC Tools | The average fraction of bases in the genomic read that receive quality scores greater than 30 across the reads of this entity (included for 10x Cell Ranger count comparison). Metrics Definitions |
genomic_reads_fraction_bases_quality_above_30_variance | SC Tools | The variance in the fraction of bases in the genomic read that receive quality scores greater than 30 across the reads of this entity (included for 10x Cell Ranger count comparison). Metrics Definitions |
genomic_read_quality_mean | SC Tools | Average quality of Illumina base calls in the genomic reads corresponding to this entity. Metrics Definitions |
genomic_read_quality_variance | SC Tools | Variance in quality of Illumina base calls in the genomic reads corresponding to this entity. Metrics Definitions |
n_molecules | SC Tools | Number of molecules corresponding to this entity. See README.md for the definition of a Molecule. Metrics Definitions |
n_fragments | SC Tools | Number of fragments corresponding to this entity. See README.md for the definition of a Fragment. Metrics Definitions |
reads_per_molecule | SC Tools | The average number of reads associated with each molecule in this entity. Metrics Definitions |
reads_per_fragment | SC Tools | The average number of reads associated with each fragment in this entity. Metrics Definitions |
fragments_per_molecule | SC Tools | The average number of fragments associated with each molecule in this entity. Metrics Definitions |
fragments_with_single_read_evidence | SC Tools | The number of fragments associated with this entity that are observed by only one read. Metrics Definitions |
molecules_with_single_read_evidence | SC Tools | The number of molecules associated with this entity that are observed by only one read. Metrics Definitions |
number_cells_detected_multiple | SC Tools | The number of cells in which more than one read of this gene is observed. Metrics Definitions |
number_cells_expressing | SC Tools | The number of cells in which this gene is detected. Metrics Definitions |
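
The three "per" metrics above are described as averages over reads, fragments, and molecules. The sketch below shows how they would relate to the count metrics if each average is a plain ratio of counts. The table descriptions suggest this but do not state it, so treat it as an assumption. The function name `derived_ratios` and the input values are hypothetical.

```python
from typing import Dict


def derived_ratios(n_reads: int, n_fragments: int, n_molecules: int) -> Dict[str, float]:
    """Compute the averaged metrics as simple ratios of counts (an assumption)."""

    def ratio(numerator: int, denominator: int) -> float:
        # Return NaN instead of raising when an entity has no fragments or molecules.
        return numerator / denominator if denominator else float("nan")

    return {
        "reads_per_molecule": ratio(n_reads, n_molecules),
        "reads_per_fragment": ratio(n_reads, n_fragments),
        "fragments_per_molecule": ratio(n_fragments, n_molecules),
    }


# Made-up counts for a single gene (illustrative only).
ratios = derived_ratios(n_reads=120, n_fragments=60, n_molecules=40)
print(ratios)
# prints {'reads_per_molecule': 3.0, 'reads_per_fragment': 2.0, 'fragments_per_molecule': 1.5}
```

Under this assumption, `reads_per_molecule` equals `reads_per_fragment` multiplied by `fragments_per_molecule`, which can serve as a quick consistency check on a metrics table.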