BERT: Pre-Training of Deep Bidirectional Transformers for Language Understanding

arxiv.org

78 points by liviosoares 7 years ago · 7 comments

pacala 7 years ago

Fantastic!

> The code and pre-trained model will be available at https://goo.gl/language/bert. Will be released before the end of October 2018.

zimzim 7 years ago

"We demonstrate the importance of bidirectional pre-training for language representations". can some one help me understand what bidirectional and pre-trained means?

  • willwill100 7 years ago

    * bidirectional - build representations of the current word by looking at both the future and the past, i.e. the words to its right as well as to its left

    * pre-trained - first train on lots of language-modelling data (e.g. billions of words of Wikipedia), then train on the task you actually care about, starting from the parameters learnt on the language-modelling task (a minimal sketch of both ideas follows below)
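
    A minimal sketch (my own illustration, not the paper's code) of both ideas in Python/PyTorch, with toy sizes and made-up names like TinyEncoder: a Transformer encoder with no causal mask is "bidirectional", it is pre-trained with a masked-LM loss, and the same encoder weights are then reused under a small task head for fine-tuning.

      import torch
      import torch.nn as nn
      import torch.nn.functional as F

      VOCAB, D_MODEL, MAX_LEN = 30000, 128, 64    # toy sizes, not the paper's

      class TinyEncoder(nn.Module):
          def __init__(self):
              super().__init__()
              self.tok = nn.Embedding(VOCAB, D_MODEL)
              self.pos = nn.Embedding(MAX_LEN, D_MODEL)
              layer = nn.TransformerEncoderLayer(D_MODEL, nhead=4,
                                                 dim_feedforward=256,
                                                 batch_first=True)
              # No causal mask: every position attends to both past and
              # future tokens, i.e. the representation is bidirectional.
              self.encoder = nn.TransformerEncoder(layer, num_layers=2)

          def forward(self, ids):                     # ids: (batch, seq)
              pos = torch.arange(ids.size(1), device=ids.device)
              return self.encoder(self.tok(ids) + self.pos(pos))

      encoder = TinyEncoder()

      # Pre-training: predict the original identity of randomly masked tokens.
      mlm_head = nn.Linear(D_MODEL, VOCAB)

      def masked_lm_loss(ids, mask_prob=0.15, mask_id=103):
          labels = ids.clone()
          masked = torch.rand_like(ids, dtype=torch.float) < mask_prob
          labels[~masked] = -100                      # score only masked positions
          corrupted = ids.masked_fill(masked, mask_id)
          logits = mlm_head(encoder(corrupted))
          return F.cross_entropy(logits.view(-1, VOCAB), labels.view(-1),
                                 ignore_index=-100)

      # Fine-tuning: keep the pre-trained encoder, swap in a task head.
      clf_head = nn.Linear(D_MODEL, 2)                # e.g. sentiment: pos/neg

      def classify(ids):
          hidden = encoder(ids)                       # starts from pre-trained weights
          return clf_head(hidden[:, 0])               # first-token representation

      ids = torch.randint(0, VOCAB, (2, 16))          # fake batch for a smoke test
      loss = masked_lm_loss(ids)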

wodenokoto 7 years ago

Is this comparable to fast.ai's ULMFiT, which also promises an unsupervised pre-trained model that can be fine-tuned to beat the state of the art on NLP tasks?

  • jiuren 7 years ago

    The big picture is similar, but ULMFiT uses an AWD-LSTM for the language modelling, while BERT uses a Transformer trained with a masked-LM objective instead. BERT has some other tricks like next-sentence prediction as well.
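
    A rough sketch (my own illustration) of what "masked LM" and "next-sentence prediction" mean when a single training example is built, assuming a toy whitespace tokenizer and the 15% masking rate described in the paper; the helper name make_example is made up:

      import random

      MASK, CLS, SEP = "[MASK]", "[CLS]", "[SEP]"

      def make_example(sent_a, sent_b, is_next, mask_prob=0.15):
          # Pair two sentences; is_next says whether sent_b really followed
          # sent_a in the corpus (the next-sentence-prediction label).
          tokens = [CLS] + sent_a.split() + [SEP] + sent_b.split() + [SEP]
          inputs, targets = [], []
          for tok in tokens:
              if tok not in (CLS, SEP) and random.random() < mask_prob:
                  inputs.append(MASK)     # model must recover `tok` from both sides
                  targets.append(tok)
              else:
                  inputs.append(tok)
                  targets.append(None)    # position not scored by the MLM loss
          return inputs, targets, is_next

      # The paper additionally keeps or randomly replaces some of the selected
      # tokens (its 80/10/10 rule); that detail is omitted here.
      ex = make_example("the cat sat on the mat",
                        "it was very comfortable", is_next=True)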
