📢 Excited to finally be releasing my NeurIPS 2024 submission!

Is Chinchilla universal? No! We find that:
1. language model scaling laws depend on data complexity
2. gzip effectively predicts scaling properties from training data

As compressibility 📉, data preference 📈. 🧵⬇️
