Why aren't LLMs trained on action / cause+effect data vs. just analytical stuff?

3 points by purplerabbit a year ago · 4 comments · 1 min read

Stupid question, but if we want models that are capable of doing things (agents) vs just spitting out interesting content, why isn't anyone training them on data that represents actions?

Models are incredible at generating analytical / blog-ish / Stack Overflow-ish content, but suck at tasks complex enough to require iteration.

For instance: If we want models that can handle complex projects, why don't we record actions taken in the execution of complex projects, and train models on that? Or if we want models that can use a browser competently, why don't we train models on screenshots + action descriptions? (Or is this what was done with o1, which is why it seems to have unprecedented capabilities?)

Is the problem just getting high-quality data? I know we've got internet dumps full of blog-ish content, but there are no big, easy-to-gather dumps of high-quality information about actions, or chains of actions and their effects over time.

(I'm sure there are tons of framing problems in this question -- sorry)

dtagames a year ago

What you're describing isn't how GPT training works. Mostly, these models are trained on next-token prediction, without any understanding of what those tokens actually mean. That works well for text and images, but it can't lead to a reproducible set of steps.

I wrote an article[0] about it recently that you might enjoy.

[0] Something From Nothing | A Painless Approach to Understanding AI

https://medium.com/gitconnected/something-from-nothing-d755f...

purplerabbit (OP) a year ago

The tokens could describe a sequence of actions and their consequences rather than blog / forum type content.
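To make the idea concrete, here is a minimal sketch of how a recorded run of actions and their consequences could be flattened into plain text for next-token training. The `Step` schema, the `GOAL` / `ACTION` / `RESULT` labels, and the sample trajectory are all made up for illustration; nothing here reflects how any particular model was actually trained.

```python
from dataclasses import dataclass


@dataclass
class Step:
    """One recorded action and what was observed afterwards (hypothetical schema)."""
    action: str
    result: str


def serialize_trajectory(goal: str, steps: list[Step]) -> str:
    """Flatten a goal and its recorded steps into one plain-text training sequence."""
    lines = [f"GOAL: {goal}"]
    for i, step in enumerate(steps, start=1):
        lines.append(f"ACTION {i}: {step.action}")
        lines.append(f"RESULT {i}: {step.result}")
    return "\n".join(lines)


# Invented example data: a short debugging session.
trajectory = [
    Step("run the test suite", "2 tests fail in test_login"),
    Step("open test_login and read the failing assertions", "expected status 200, got 401"),
]
text = serialize_trajectory("fix the failing login tests", trajectory)
print(text)
```

Once a trajectory is text like this, it is just another token sequence, so the hard part is likely collecting enough good trajectories rather than the format itself.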
- dtagames a year ago
  
  Unfortunately, what a token describes is exactly what an LLM doesn't understand. As I explain in the article I linked earlier, procedural steps with determinate outcomes need traditional, procedural code rather than predictive LLM results.
  If you just want something to predict the next best step or the likely outcome, LLMs can already do that if you fine-tune them on the kind of data you're talking about.
  FYI, today's LLMs aren't trained only on blog- and forum-type content as you mention; their training data also includes millions of books, academic papers, and other legit sources. Then they're often fine-tuned by a specific industry or company on actual papers and data from that industry.
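A rough sketch of what fine-tuning data for "predict the next best step" might look like: each recorded trajectory is split into one example per step, where the prompt is the goal plus the history so far and the completion is the action that was actually taken. The `prompt` / `completion` field names, the text labels, and the sample steps are assumptions for illustration, not the format of any specific fine-tuning API.

```python
import json


def next_step_examples(goal, steps):
    """Turn one recorded trajectory into prompt/completion pairs.

    `steps` is a list of (action, result) tuples. Each example asks for the
    next action given the goal and everything that happened before it.
    """
    examples = []
    history = [f"GOAL: {goal}"]
    for action, result in steps:
        prompt = "\n".join(history) + "\nNEXT ACTION:"
        examples.append({"prompt": prompt, "completion": " " + action})
        # Only after emitting the example do we reveal this step to later prompts.
        history.append(f"ACTION: {action}")
        history.append(f"RESULT: {result}")
    return examples


# Invented example data: two browser actions toward a goal.
steps = [
    ("open the pricing page", "page loaded with three plans"),
    ("click the 'Pro' plan button", "checkout form is shown"),
]
examples = next_step_examples("subscribe to the Pro plan", steps)
jsonl = "\n".join(json.dumps(example) for example in examples)
print(jsonl)
```

Splitting per step means the model is only ever asked to predict an action, never the environment's response, which is one possible design choice among several.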

wmf a year ago

This is starting to happen; they're calling them Large Action Models.
