Arrows of Time for Large Language Models

arxiv.org

6 points by tianlong 2 years ago · 4 comments

nyoncore 2 years ago

Isn't it obvious that, since LLMs are trained to predict the next word, they do better at that than at predicting the previous one?

  • frotaur 2 years ago

    The paper addresses this: the LLMs that predict the previous token are themselves pre-trained on that task, so the difference is not obvious. A sketch of what that means in practice follows below.
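
    A minimal sketch of what "pre-trained in this way" amounts to (illustrative only; the model, sizes, and random data below are hypothetical stand-ins, not the paper's setup): a backward LM is just an ordinary next-token model trained on reversed sequences, so both directions use the identical objective.

        # Hypothetical sketch: a "backward" LM is a standard autoregressive
        # model fed reversed token sequences; the objective is unchanged.
        import torch
        import torch.nn as nn

        VOCAB, DIM = 256, 64

        class TinyLM(nn.Module):
            def __init__(self):
                super().__init__()
                self.emb = nn.Embedding(VOCAB, DIM)
                self.rnn = nn.GRU(DIM, DIM, batch_first=True)
                self.head = nn.Linear(DIM, VOCAB)

            def forward(self, tokens):              # tokens: (batch, seq)
                h, _ = self.rnn(self.emb(tokens))
                return self.head(h)                 # logits: (batch, seq, vocab)

        def next_token_loss(model, tokens):
            # Standard autoregressive objective: predict token t+1 from tokens <= t.
            logits = model(tokens[:, :-1])
            targets = tokens[:, 1:]
            return nn.functional.cross_entropy(
                logits.reshape(-1, VOCAB), targets.reshape(-1))

        batch = torch.randint(0, VOCAB, (8, 128))   # stand-in for real text

        fwd_lm, bwd_lm = TinyLM(), TinyLM()
        loss_fwd = next_token_loss(fwd_lm, batch)                        # forward arrow
        loss_bwd = next_token_loss(bwd_lm, torch.flip(batch, dims=[1]))  # reversed text
        print(float(loss_fwd), float(loss_bwd))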

tianlong (OP) 2 years ago

Is there a link with entropy creation?
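
(A standard information-theory identity is relevant to this question; it is a general fact, not a summary of the paper's argument. For the true distribution of a sequence x_1, ..., x_n, the chain rule gives the same total entropy in both directions,

    H(x_1, \dots, x_n) = \sum_t H(x_t \mid x_{<t}) = \sum_t H(x_t \mid x_{>t}),

so an ideal model would achieve the same loss forwards and backwards, and any observed gap reflects how learnable each factorization is rather than entropy being created in one direction.)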
