Feature Extraction with KNN
davpinto.github.io

This reminds me of the talk by Leland McInnes (employed by a branch of the Canadian NSA and creator of UMAP) on rethinking dimension reduction:
Rethinking Unsupervised Learning https://youtu.be/v_jcu7rUqqE
He boiled all dimensionality reduction down to two approaches: matrix decomposition and k-nearest neighbors.
I'm seeing the newest contenders for top algorithms leaning harder into the KNN dynamics and less into the more complex maths of UMAP (PaCMAP[1] and LocalMAP[2], which run an attractive-repulsive force simulation over a high-dimensional KNN graph).
Based on the end of his talk, Leland seems to generally agree that this is where the next innovations are: KNNs.
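If you want to poke at that KNN-graph-plus-force-simulation idea directly, the pacmap package on PyPI exposes it pretty plainly. A minimal sketch (the parameter values and toy data here are my own illustration, not anything from the post):

    import numpy as np
    import pacmap  # pip install pacmap

    # Toy high-dimensional data standing in for a real dataset.
    rng = np.random.default_rng(0)
    X = rng.normal(size=(1000, 50)).astype(np.float32)

    # PaCMAP builds a KNN graph in the original space, then runs an
    # attractive-repulsive optimization over near, mid-near, and
    # further pairs to produce the low-dimensional layout.
    reducer = pacmap.PaCMAP(n_components=2, n_neighbors=10)
    X_2d = reducer.fit_transform(X, init="pca")
    print(X_2d.shape)  # (1000, 2)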
Some interesting provocations are coming out:
Assessing and improving reliability of neighbor embedding methods: a map-continuity perspective (Apr 2025) https://arxiv.org/html/2410.16608v2
[1]: https://youtu.be/sD-uDZ8zXkc
[2]: https://youtu.be/vmSUQ-4LT4Y
EDIT: on reviewing it further, that last paper is not as relevant as I recalled. So the point is mainly the take-away from Leland's expertise in his video.
I'm not sure if I'm understanding correctly, but it reminds me of the kernel trick. The distances between the training samples and a target sample are computed, the distances are scaled through a kernel function, and the scaled distances are used as features.
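Roughly what I have in mind, as a sketch (the Gaussian kernel and all the names/values below are my own illustration, not necessarily what the post actually does):

    import numpy as np
    from sklearn.neighbors import NearestNeighbors

    rng = np.random.default_rng(0)
    X_train = rng.normal(size=(200, 5))   # training samples
    X_new = rng.normal(size=(10, 5))      # target samples to featurize

    k, gamma = 8, 0.5  # number of neighbors and kernel bandwidth, both illustrative

    # 1. Distances from each target sample to its k nearest training samples.
    nn = NearestNeighbors(n_neighbors=k).fit(X_train)
    distances, _ = nn.kneighbors(X_new)        # shape (10, k)

    # 2. Scale the distances through a kernel function (Gaussian/RBF here).
    kernel_feats = np.exp(-gamma * distances**2)

    # 3. Use the scaled distances as extra features for a downstream model.
    X_new_aug = np.hstack([X_new, kernel_feats])  # shape (10, 5 + k)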
Fast? The author should remind us of the computational complexity of calculating the kNN...
I agree, they should have said it, and I had to go looking too. But it may be that they're just leaving this to others and talking to people who already think about KNNs?
http://ann-benchmarks.com/index.html
But yeah, my own research (not just that link) says HNSWs get KNN complexity down to roughly O(log n) per search and O(n log n) for building the whole index.
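For anyone who hasn't used it, here's a minimal hnswlib sketch of that build-once / query-cheaply pattern (the parameters are illustrative defaults, not tuned):

    import numpy as np
    import hnswlib  # pip install hnswlib

    dim, n = 128, 100_000
    data = np.random.rand(n, dim).astype(np.float32)

    # Build the HNSW index once: roughly O(n log n) work up front.
    index = hnswlib.Index(space="l2", dim=dim)
    index.init_index(max_elements=n, ef_construction=200, M=16)
    index.add_items(data)

    # Each query is then roughly O(log n) for approximate k nearest neighbors.
    index.set_ef(50)  # search-time accuracy/speed trade-off
    labels, distances = index.knn_query(data[:5], k=10)
    print(labels.shape)  # (5, 10)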