Robustness of single-cell RNA-seq for identifying differentially expressed genes. BMC Genomics 2023 Jul 03;24(1):371
Date
07/03/2023
Pubmed ID
37394518
Pubmed Central ID
PMC10316566
DOI
10.1186/s12864-023-09487-y
Scopus ID
2-s2.0-85163598768 (requires institutional sign-in at Scopus site)
1 Citation
Abstract
BACKGROUND: A common feature of single-cell RNA-seq (scRNA-seq) data is that the number of cells in a cell cluster may vary widely, ranging from a few dozen to several thousand. It is not clear whether scRNA-seq data from a small number of cells allow robust identification of differentially expressed genes (DEGs) with various characteristics.
RESULTS: We addressed this question by performing scRNA-seq and poly(A)-dependent bulk RNA-seq in comparable aliquots of human induced pluripotent stem cells-derived, purified vascular endothelial and smooth muscle cells. We found that scRNA-seq data needed to have 2,000 or more cells in a cluster to identify the majority of DEGs that would show modest differences in a bulk RNA-seq analysis. On the other hand, clusters with as few as 50-100 cells may be sufficient for identifying the majority of DEGs that would have extremely small p values or transcript abundance greater than a few hundred transcripts per million in a bulk RNA-seq analysis.
CONCLUSION: Findings of the current study provide a quantitative reference for designing studies that aim for identifying DEGs for specific cell clusters using scRNA-seq data and for interpreting results of such studies.
Author List
Liu Y, Huang J, Pandey R, Liu P, Therani B, Qiu Q, Rao S, Geurts AM, Cowley AW Jr, Greene AS, Liang M
Authors
Aron Geurts, PhD, Professor in the Physiology department at Medical College of Wisconsin
Sridhar Rao, MD, PhD, Associate Professor in the Pediatrics department at Medical College of Wisconsin
MeSH terms used to index this publication - Major topics in bold
Gene Expression Profiling
Humans
Induced Pluripotent Stem Cells
Sequence Analysis, RNA
Single-Cell Analysis