[Preprint] Neural collapse in the orthoplex regime
New preprint out! Neural collapse is a phenomenon in deep learning where features of a classifier converge to the vertices of a simplex as training progresses. We studied this phenomenon where the number of classes exceeds the dimension–so this emergent structure is no longer a simplex, but a spherical code.
A super interesting idea from this work is the surprising role of temperature–in machine learning one usually approximates the argmax function using the softmax modulated by the temperature. We find that this also governs a sharp phase transition in neural collapse: two different families of spherical code emerge depending on whether the loss function is “hot” or “cold”. This work was done as part of the AMS’ mathematical research community (MRC) on explainable, interpretable, and adversarial AI in 2024.