The link to the webinar I gave recently on nonlinear dimensionality reduction tools (e.g. t-SNE, UMAP) is in the post below. In it, I do a deep dive into k-nearest neighbor (KNN) preservation between the 2-D embedding space and the original marker space, and arrive at the following points (for the datasets I have analyzed):
1. KNN preservation reveals that t-SNE and UMAP are not as precise as they may appear. This result comes directly out of averaging the per-cell neighborhood preservation across a given dataset. These results, initially from work I did in 2018, are what led me to continue this project.
2. KNN preservation differs based on location on the map, with t-SNE and UMAP performance correlated in this regard. Larger "islands" have poorer neighborhood preservation than smaller islands or "corridors" if we're dealing with trajectory data.
3. PCA outperforms t-SNE and UMAP in terms of "k-farthest neighborhood" preservation, which is one very rough way of getting at global preservation. This serves as a sanity check, as we expect PCA to have the best global preservation, but I would not otherwise have known how well it does in comparison to UMAP, which, when it first emerged, was claimed to have better global preservation than t-SNE.
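For readers who want to try the idea behind the three points above, here is a minimal sketch of per-cell neighborhood preservation. It is an illustration under assumptions, not the KnnSleepwalk implementation: the function names `neighbor_sets` and `neighborhood_preservation` are hypothetical, distances are brute-force Euclidean, and the "embedding" in the usage lines is just a stand-in 2-D projection of random data.

```python
import numpy as np

def neighbor_sets(data, k, farthest=False):
    """Return an (n, k) array with each row's k nearest (or farthest) neighbor indices."""
    # Brute-force squared Euclidean distances; only practical for a few thousand cells.
    sq = np.sum(data ** 2, axis=1)
    dist = sq[:, None] + sq[None, :] - 2.0 * (data @ data.T)
    # Push each cell's distance to itself to the end that is never selected.
    np.fill_diagonal(dist, -np.inf if farthest else np.inf)
    order = np.argsort(dist, axis=1)
    return order[:, -k:] if farthest else order[:, :k]

def neighborhood_preservation(original, embedding, k, farthest=False):
    """Per-cell fraction of the k neighbors shared between the two spaces."""
    a = neighbor_sets(original, k, farthest)
    b = neighbor_sets(embedding, k, farthest)
    return np.array([len(set(x) & set(y)) / k for x, y in zip(a, b)])

# Usage with made-up data: 200 "cells" with 10 "markers".
rng = np.random.default_rng(0)
cells = rng.normal(size=(200, 10))
fake_embedding = cells[:, :2]  # stand-in for a t-SNE/UMAP/PCA map

knn_scores = neighborhood_preservation(cells, fake_embedding, k=15)
kfn_scores = neighborhood_preservation(cells, fake_embedding, k=15, farthest=True)
print("mean KNN preservation:", knn_scores.mean())
print("mean k-farthest preservation:", kfn_scores.mean())
```

Averaging the per-cell scores gives the dataset-level number from point 1, coloring the map by them shows the location dependence from point 2, and setting `farthest=True` gives the rough global check from point 3.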
While I can make some generalizations across datasets, I also have to note that some things depend on the dataset and data type (e.g. CyTOF vs. scRNA-seq). Accordingly, you can use my KnnSleepwalk tool to critique the embeddings you're using for your data. I will link it in the comments below.
One of the key takeaways from this whole project is that you should be very careful if you decide to "gate" or do any sort of computation directly on the map, as opposed to using the original feature space, or the top n principal components if you're doing scRNA-seq analysis.
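As a rough illustration of working in the top n principal components instead of on the 2-D map, here is a small sketch using plain NumPy. The helper name `top_principal_components` and the choice of n are assumptions for the example, and real scRNA-seq pipelines typically normalize and select features before this step.

```python
import numpy as np

def top_principal_components(data, n_components):
    """Project cells (rows) onto the first n_components principal axes."""
    centered = data - data.mean(axis=0)
    # Rows of vt are the principal axes, ordered by decreasing singular value.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:n_components].T

# Usage with made-up data: 300 cells, 50 features, keep 10 PCs.
rng = np.random.default_rng(0)
expression = rng.normal(size=(300, 50))
pcs = top_principal_components(expression, n_components=10)
print(pcs.shape)
```

Downstream steps such as clustering or neighbor searches would then take `pcs` (or the original marker matrix) as input, with the 2-D map used only for display.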
Anyway, let me know if you have any questions.
ICYMI, you can now watch Tyler Burns, PhD’s talk about “The limits of dimensionality reduction tools for single-cell analysis” on our website. This webinar is a must-see for anyone interested in #singlecell analysis and #spatialomics. Feel free to tag your friends and colleagues!
Thank you to everyone who joined us, and a special thank you to Dr. Burns for sharing his expertise.
#spatialbiology
https://lnkd.in/eu8je9TY
The limits of dimensionality reduction tools for single-cell analysis
watershed.bio