Trajectory inference is a common application of scRNA-seq data. However, it often requires first identifying the origin of the trajectories: the stem or progenitor cells. In this work, we propose a computational tool to quantify pluripotency from single-cell transcriptomics data. The approach uses the protein-protein interaction (PPI) network associated with the differentiation process as a scaffold and combines it with the gene expression matrix to calculate a score that we call differentiation activity. This score reflects how active the differentiation network is in each cell. We benchmark our algorithm against two previously published tools, LandSCENT (Chen et al., 2019) and CytoTRACE (Gulati et al., 2020), on four healthy human data sets: breast, colon, hematopoietic and lung. We show that our algorithm is more efficient than LandSCENT and requires less RAM than both tools. We also illustrate a complete workflow, from the count matrix to trajectory inference, using the breast data set.
• ORIGINS is a methodology to quantify pluripotency from scRNA-seq data implemented as a freely available R package.
• ORIGINS uses the protein-protein interaction network associated with differentiation and the data set expression matrix to calculate a score (differentiation activity) that quantifies pluripotency for each cell.
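
To make the idea of a network-based score concrete, the following is a minimal Python sketch of one plausible way to combine a PPI network with an expression matrix. It is an illustration under stated assumptions, not the ORIGINS implementation (which is distributed as an R package): the scoring rule (summing, for each cell, the product of the expression values of the two genes joined by each PPI edge), the min-max rescaling, and the function names, gene names and values are all hypothetical.

```python
# Illustrative sketch only: the scoring rule below is an ASSUMPTION,
# not the formula confirmed by the text above.

def differentiation_activity(expression, genes, ppi_edges):
    """Return one raw score per cell.

    expression: list of cells, each a list of expression values
                ordered like `genes`.
    genes:      list of gene names (columns of `expression`).
    ppi_edges:  list of (gene_a, gene_b) pairs from the PPI network.
    """
    index = {gene: i for i, gene in enumerate(genes)}
    # Keep only edges whose two endpoints are both measured in the data set.
    edges = [(index[a], index[b]) for a, b in ppi_edges
             if a in index and b in index]
    # Assumed rule: an edge contributes when both of its genes are expressed.
    return [sum(cell[i] * cell[j] for i, j in edges) for cell in expression]


def min_max_scale(values):
    """Rescale scores to [0, 1] so that cells can be compared."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


# Toy data: three cells, three measured genes (hypothetical names).
genes = ["G1", "G2", "G3"]
ppi_edges = [("G1", "G2"), ("G2", "G3"), ("G3", "G4")]  # G4 is not measured
expression = [
    [1, 2, 3],
    [0, 0, 5],
    [1, 1, 1],
]

raw = differentiation_activity(expression, genes, ppi_edges)
scaled = min_max_scale(raw)
print(raw)     # [8, 0, 2]
print(scaled)  # [1.0, 0.0, 0.25]
```

In this toy example the second cell expresses only one gene, so no edge of the network is active and its score is zero, whereas the first cell co-expresses both interacting pairs and scores highest.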