torch.randperm isn't fully random, and it can bias your trillion-token training run 😱 Today we're releasing ModernBERT, a new SOTA encoder-only model series. In this thread, however, I'll share how torch.randperm (temporarily) put a wrench in the works (1/10) https://t.co/5a9zXj7rNU
