Detecting Overfit Layers without any Data

twitter.com

1 points by charleshmartin 11 days ago · 1 comment

charleshmartinOP 11 days ago

If you train a model for too long, it may overfit its training data. Not surprising; this has been known for like forever. But did you know you can detect the signatures of overfitting directly in the layer weight matrices, without needing access to any data (train or test)?

In our recent paper (with Hari Kishan Prakash), we show this explicitly in 2 different classic grokking experiments. And the overfitting we see is very different from what has been seen before!

paper: https://arxiv.org/abs/2602.02859
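
The post itself doesn't spell out how the weight matrices are analyzed, so the following is only a rough sketch of what a data-free check on a single layer could look like, not the paper's procedure. It assumes a spectral approach: take the eigenvalues of the layer's correlation matrix and estimate a power-law exponent for the tail of that spectrum. The function names `layer_esd` and `tail_exponent`, the cutoff `k`, and the random stand-in weight matrix are all hypothetical choices for illustration.

```python
import numpy as np

def layer_esd(W):
    """Eigenvalues of X = W^T W / N for one layer, sorted descending.

    N is taken as the larger dimension of W (an assumption of this sketch).
    """
    W = np.asarray(W, dtype=np.float64)
    n = max(W.shape)
    # Squared singular values of W are the eigenvalues of W^T W,
    # so we never need to form the correlation matrix explicitly.
    sv = np.linalg.svd(W, compute_uv=False)
    return np.sort(sv ** 2 / n)[::-1]

def tail_exponent(values, k):
    """Maximum-likelihood power-law exponent fitted to the k largest values.

    Assumes a density p(x) ~ x^(-alpha) above x_min, where x_min is the
    (k+1)-th largest value. The cutoff k is a free choice, not a tuned one.
    """
    values = np.sort(np.asarray(values, dtype=np.float64))[::-1]
    if not 1 <= k < len(values):
        raise ValueError("k must satisfy 1 <= k < len(values)")
    x_min = values[k]
    if x_min <= 0:
        raise ValueError("tail cutoff must be positive")
    return 1.0 + k / np.sum(np.log(values[:k] / x_min))

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Stand-in for a trained layer; a real check would load the weights
    # of an actual model instead of drawing them at random.
    W = rng.normal(size=(300, 100))
    eigs = layer_esd(W)
    print("largest eigenvalue:", eigs[0])
    print("tail exponent (k=20):", tail_exponent(eigs, k=20))
```

Nothing here touches training or test data: the only input is the weight matrix. How a given exponent should be interpreted, and whether it flags overfitting, is exactly the kind of question the paper addresses, so no threshold is hard-coded in this sketch.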
