University of Limerick
demuxSNP: supervised demultiplexing single-cell RNA sequencing using cell hashing and SNPs

journal contribution
posted on 2024-12-13, 09:45 authored by Michael P. Lynch, Yufei Wang, Shannan Ho Sui, Laurent Gatto, Aedin Culhane

Background: Multiplexing single-cell RNA sequencing experiments reduces sequencing cost and facilitates larger-scale studies. However, factors such as cell hashing quality and class size imbalance impact demultiplexing algorithm performance, reducing cost effectiveness. Findings: We propose a supervised algorithm, demuxSNP, which leverages both cell hashing and genetic variation between individuals (single-nucleotide polymorphisms [SNPs]). demuxSNP addresses fundamental limitations in demultiplexing methods that use only one data modality. Some cells may be confidently demultiplexed using probabilistic hashing methods. demuxSNP uses these cells to infer the genotype of singlet and doublet clusters and then predicts labels for cells assigned as negative, uncertain, or doublet using a nearest-neighbor approach adapted for missing data. We benchmarked demuxSNP against hashing, genotype-free SNP, and hybrid methods on simulated and real data from renal cell cancer. demuxSNP outperformed standalone hashing methods on the low-quality hashing data benchmark, improved overall classification accuracy, and allowed more cells with high RNA quality to be recovered. By varying simulated doublet rates, we showed that genotype-free SNP methods, and hybrid methods that leverage them, were impacted by class size imbalance and doublet rate. demuxSNP’s supervised approach was more robust to doublet rate in experiments with class size imbalance. Conclusions: demuxSNP uses hashing and SNP data to demultiplex datasets with low hashing quality where biological samples are genetically distinct. Unassigned or negative cells with high RNA quality are recovered, making more cells available for analysis. Data simulation and benchmarking pipelines, as well as processed benchmarking data for 5–50% doublets, are publicly available. demuxSNP is available as an R/Bioconductor package (https://doi.org/doi:10.18129/B9.bioc.demuxSNP).
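The nearest-neighbor step described in the abstract can be illustrated with a minimal sketch. This is not the demuxSNP package code (the package itself is written in R); it is a toy Python illustration that assumes each cell is summarised as a per-SNP profile with 1 (alternate allele seen), 0 (not seen), or None (no reads covering the SNP), and that distance is measured only over SNPs observed in both cells. The function names, the toy profiles, and the choice of k are all hypothetical.

```python
from collections import Counter


def masked_distance(a, b):
    """Fraction of mismatches over SNPs observed in both cells (None = missing)."""
    shared = [(x, y) for x, y in zip(a, b) if x is not None and y is not None]
    if not shared:
        return float("inf")  # no jointly observed SNPs: treat as maximally distant
    return sum(x != y for x, y in shared) / len(shared)


def knn_predict(train_profiles, train_labels, query, k=3):
    """Majority vote among the k training cells closest to the query cell."""
    ranked = sorted(
        zip(train_profiles, train_labels),
        key=lambda pair: masked_distance(pair[0], query),
    )
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


# Hypothetical training cells: those confidently assigned by cell hashing.
train_profiles = [
    [1, 0, 1, None, 0],
    [1, 0, None, 1, 0],
    [0, 1, 0, 0, None],
    [None, 1, 0, 0, 1],
]
train_labels = ["donor_A", "donor_A", "donor_B", "donor_B"]

# Hypothetical cell that hashing left as negative or uncertain.
query = [None, 1, 0, None, 1]
predicted = knn_predict(train_profiles, train_labels, query, k=3)
print(predicted)
```

In this sketch a doublet class would simply be an additional label in the training set; how the package actually constructs doublet profiles is not shown here.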

Publication

GigaScience 13, pp.1–12

Publisher

Oxford University Press GigaScience

Other Funding information

This project has been made possible in part by grant number CZF 2019-002,443 (Lead PI: Martin Morgan, Co-PI: A.C.C.) from the Chan Zuckerberg Initiative DAF, an advised fund of Silicon Valley Community Foundation, of which A.C.C. and M.P.L. are grantees, as well as by startup funding from the School of Medicine, University of Limerick, to A.C.C. In addition, this project was supported by the Assistant Secretary of Defense for Health Affairs endorsed by the US Department of Defense, Kidney Cancer Research Program (KCRP) through the FY21 Translational Research Partnership Award (W81XWH-21-1-0442, lead PI: Wayne A. Maraso) and FY21 Idea Development Award (W81XWH-21-1-0482, lead PI: Wayne A. Maraso), of which Y.W., A.C.C., and M.P.L. are grantees, and by the Wong Family Award and Kidney Cancer Association Trailblazer Award to Y.W.

Also affiliated with

  • Health Research Institute (HRI)

Department or School

  • School of Medicine
