Source-Optimal Training Is Transfer-Suboptimal

arxiv.org

1 points by ceh123 a month ago · 1 comment

ceh123 (OP) a month ago

This paper is a theoretical analysis showing that the ridge regularization strength that is optimal for the source task is almost never optimal for transfer performance. Interestingly, in high-SNR (low-noise) regimes the best regularization for pre-training is higher than the task-specific optimum, while in low-SNR (high-noise) regimes it’s better to regularize less than you would if you were optimizing for the source task alone.

Although the proofs are in the world of (L2-SP) ridge regression, experiments were run using an MLP on MNIST and a CNN on CIFAR-10, and they suggest the SNR-regularization relationship persists in non-linear networks.
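
For anyone who wants to poke at the linear setting, here is a minimal NumPy sketch of what I understand the setup to be: fit ridge on a source task with strength lam, then fit the target task with an L2-SP penalty that pulls the weights toward the source solution. The function names, the synthetic data model, and every default (dimension, sample sizes, noise level, task shift, mu) are my own assumptions for illustration, not the paper's exact configuration.

```python
import numpy as np

def ridge(X, y, lam):
    """Closed-form ridge: argmin_w ||Xw - y||^2 + lam * ||w||^2."""
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ y)

def l2sp_ridge(X, y, w_src, mu):
    """Closed-form L2-SP ridge: argmin_w ||Xw - y||^2 + mu * ||w - w_src||^2."""
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + mu * np.eye(d), X.T @ y + mu * w_src)

def make_task(rng, w_true, n, noise_std):
    """Synthetic linear task with isotropic Gaussian inputs (an assumption)."""
    X = rng.standard_normal((n, w_true.shape[0]))
    y = X @ w_true + noise_std * rng.standard_normal(n)
    return X, y

def sweep(seed=0, d=20, n_src=40, n_tgt=10, noise_std=0.5, shift=0.3,
          mu=1.0, lams=(0.01, 0.1, 1.0, 10.0, 100.0)):
    """Return (lam, source_error, target_error) for each source strength lam."""
    rng = np.random.default_rng(seed)
    w_s = rng.standard_normal(d) / np.sqrt(d)                # source weights
    w_t = w_s + shift * rng.standard_normal(d) / np.sqrt(d)  # related target
    Xs, ys = make_task(rng, w_s, n_src, noise_std)
    Xt, yt = make_task(rng, w_t, n_tgt, noise_std)
    results = []
    for lam in lams:
        w_hat_s = ridge(Xs, ys, lam)               # pre-train on the source
        w_hat_t = l2sp_ridge(Xt, yt, w_hat_s, mu)  # transfer to the target
        # With isotropic inputs, excess risk equals squared parameter error.
        src_err = float(np.sum((w_hat_s - w_s) ** 2))
        tgt_err = float(np.sum((w_hat_t - w_t) ** 2))
        results.append((lam, src_err, tgt_err))
    return results

if __name__ == "__main__":
    res = sweep()
    best_src = min(res, key=lambda r: r[1])[0]
    best_tgt = min(res, key=lambda r: r[2])[0]
    print("source-optimal lam:", best_src, "| transfer-optimal lam:", best_tgt)
```

A single seed is noisy, so to see whether the two optima actually separate you would probably want to average over many seeds and vary noise_std to move between the high-SNR and low-SNR regimes.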
