A cross entropy test allows quantitative statistical comparison of t-SNE and UMAP representations
2023; Elsevier BV; Volume: 3; Issue: 1; Language: English
DOI: 10.1016/j.crmeth.2022.100390
ISSN: 2667-2375
Authors: Carlos P. Roca, Oliver T. Burton, Julika Neumann, Samar Tareen, Carly E. Whyte, Václav Gergelits, Rafael Veiga, Stéphanie Humblet‐Baron, Adrian Liston
Topic(s): Cell Image Analysis Techniques
Abstract: The advent of high-dimensional single-cell data has necessitated the development of dimensionality-reduction tools. t-Distributed stochastic neighbor embedding (t-SNE) and uniform manifold approximation and projection (UMAP) are the two most frequently used approaches, allowing clear visualization of complex single-cell datasets. Despite the need for quantitative comparison, t-SNE and UMAP have largely remained visualization tools due to the lack of robust statistical approaches. Here, we have derived a statistical test for evaluating the difference between dimensionality-reduced datasets using the Kolmogorov-Smirnov test on the distributions of cross entropy of single cells within each dataset. As the approach uses the inter-relationship of single cells for comparison, the resulting statistic is robust and capable of identifying true biological variation. Further, the test provides a valid distance between single-cell datasets, allowing the organization of multiple samples into a dendrogram for quantitative comparison of complex datasets. These results demonstrate the largely untapped potential of dimensionality-reduction tools for biomedical data analysis beyond visualization.
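The idea summarized in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation and rests on several assumptions: the function names `point_cross_entropy` and `compare_samples` are hypothetical, a fixed Gaussian bandwidth `sigma` stands in for t-SNE's perplexity calibration, similarities are normalized per point, a two-component PCA projection stands in for a real t-SNE or UMAP embedding, and the data are synthetic. The only part taken from the abstract is the overall shape of the procedure: compute one cross entropy value per cell, then compare the per-sample distributions with a Kolmogorov-Smirnov test.

```python
import numpy as np
from scipy.stats import ks_2samp


def sq_dists(x):
    """Pairwise squared Euclidean distances between the rows of x."""
    sq = (x ** 2).sum(axis=1)
    return np.maximum(sq[:, None] + sq[None, :] - 2.0 * x @ x.T, 0.0)


def point_cross_entropy(high, low, sigma=1.0):
    """Per-point cross entropy H_i = -sum_j p_{j|i} log q_{j|i}.

    Assumption: fixed bandwidth `sigma` and per-point normalization,
    a simplification of how t-SNE defines its similarities.
    """
    n = high.shape[0]
    off_diag = ~np.eye(n, dtype=bool)

    # High-dimensional similarities: Gaussian kernel, softmax over neighbors.
    logits = -sq_dists(high) / (2.0 * sigma ** 2)
    logits[~off_diag] = -np.inf                      # exclude self-similarity
    logits -= logits.max(axis=1, keepdims=True)      # numerical stability
    p = np.exp(logits)
    p /= p.sum(axis=1, keepdims=True)

    # Low-dimensional similarities: Student-t kernel, row-normalized.
    q = off_diag / (1.0 + sq_dists(low))
    q /= q.sum(axis=1, keepdims=True)

    # The diagonal contributes zero because p is zero there.
    log_q = np.log(np.where(off_diag, q, 1.0))
    return -(p * log_q).sum(axis=1)


def compare_samples(ce, labels):
    """Two-sample Kolmogorov-Smirnov test on per-point cross entropies."""
    return ks_2samp(ce[labels == 0], ce[labels == 1])


# Synthetic stand-in for two single-cell samples with 10 markers each.
rng = np.random.default_rng(0)
sample_a = rng.normal(0.0, 1.0, size=(200, 10))
sample_b = rng.normal(0.0, 1.0, size=(200, 10))
sample_b[:100, 0] += 3.0                             # shift half of sample B
pooled = np.vstack([sample_a, sample_b])
labels = np.repeat([0, 1], 200)

# Stand-in embedding (assumption): first two principal components.
centered = pooled - pooled.mean(axis=0)
_, _, vt = np.linalg.svd(centered, full_matrices=False)
embedding = centered @ vt[:2].T

ce = point_cross_entropy(pooled, embedding)
result = compare_samples(ce, labels)
print(f"KS statistic = {result.statistic:.3f}, p-value = {result.pvalue:.3g}")
```

In this sketch both samples are embedded together, so that every cell's cross entropy is measured against the same low-dimensional map. The KS statistic could in turn serve as a pairwise distance for hierarchical clustering of several samples, in the spirit of the dendrogram mentioned in the abstract.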