curse of dimensionality

Curse of dimensionality #

The curse of dimensionality is a catchprase used in statistics, machine learning, data science, etc to refer to the weird things that can happen in datasets with large numbers of features. Mathematically, these phenomena can usually be explained by “distance concentration”: as the dimension \(d\) grows, the variance of the pairwise distance between all pairs of distinct points tends to zero. I wrote a blog post about this and how to “prove” it using the weak law of large numbers, here.

Manifold explanation #

Want to write a post about Section 4.1 of this great survey of “the mathematics of modern deep learning”, which includes a really nice explanation of why neural networks don’t seem to suffer from the curse of dimensionality, despite general “universal approximation theorems” suggesting the opposite should be true.

Links and resources #

A good explanation of distance concentration and data sparsity, which is also sometimes lumped in with the curse of dimensionality.
The curse also shows up in high-dimensional Pareto fronts.
Another nice discussion of some weird things that happen in high-dimensional space, including a mention of kissing numbers.
Why interpolation is doomed by the curse of dimensionality.