The Pile: An 800GB Dataset of Diverse Text for Language Modeling [pdf] (pile.eleuther.ai)
1 point by nixtaken 5 years ago · 1 comment

dang 5 years ago
https://news.ycombinator.com/item?id=25607809