Why aren't LLMs trained on action / cause+effect data vs. just analytical stuff?
Stupid question, but if we want models that are capable of doing things (agents) vs just spitting out interesting content, why isn't anyone training them on data that represents actions?
Models are incredible at generating analytical / blog-ish / Stack Overflow-ish content, but suck at doing things that are complex enough to require iteration.
For instance: If we want models that can handle complex projects, why don't we record actions taken in the execution of complex projects, and train models on that? Or if we want models that can use a browser competently, why don't we train models on screenshots + action descriptions? (Or is this what was done with o1, which is why it seems to have unprecedented capabilities?)
Is the problem just getting high-quality data? I know we've got internet dumps full of blog-ish content, but no big, easy-to-gather dumps of high-quality information about actions or chains of actions and their effects over time.
(I'm sure there are tons of framing problems in this question -- sorry)
What you're describing isn't how GPT training works. Mostly, they work on next token prediction without having any understanding of what those tokens actually mean. It works well for text and images but it can't lead to a reproducible set of steps. I wrote an article[0] about it recently that you might enjoy.
[0] Something From Nothing | A Painless Approach to Understanding AI https://medium.com/gitconnected/something-from-nothing-d755f...
The tokens could describe a sequence of actions and their consequences vs. blog / forum type content
Unfortunately, what a token describes is exactly what an LLM doesn't understand. As I explain in my article linked previously, procedural steps with determinate outcomes need procedural, traditional code rather than predictive LLM results. If you just want something to predict the next best step or likely outcome, LLMs can already do that by fine-tuning on the kind of data you're talking about.
FYI, today's LLMs aren't trained on blog and forum type content as you mention but actually contain millions of books, academic papers, and other legit sources. Then, they're fine-tuned by a specific industry or company to include actual papers and data from their industry.
This is starting to happen; they're calling them Large Action Models.
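For anyone wondering what "tokens that describe a sequence of actions and their consequences" might look like in practice, here is a minimal sketch. Everything in it is an assumption for illustration: the `Step` record, the `GOAL` / `OBSERVATION` / `ACTION` / `EFFECT` labels, and the prompt/completion layout are hypothetical and not taken from any particular model or fine-tuning API. It only shows that a recorded episode can be flattened into text so that "predict the next token" becomes "predict the next action".

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Step:
    """One recorded step of an episode (hypothetical schema)."""
    observation: str  # what was visible before acting, e.g. a page summary
    action: str       # what was done
    effect: str       # what changed as a result


def to_finetune_examples(goal: str, steps: List[Step]) -> List[Dict[str, str]]:
    """Build one (prompt, completion) pair per step.

    The prompt is the goal plus everything that happened so far, ending at
    the point where the next action would be written. The completion is the
    action that was actually taken in the recording.
    """
    examples = []
    history = [f"GOAL: {goal}"]
    for step in steps:
        history.append(f"OBSERVATION: {step.observation}")
        prompt = "\n".join(history) + "\nACTION:"
        examples.append({"prompt": prompt, "completion": " " + step.action})
        # Later steps get to see this action and its effect as context.
        history.append(f"ACTION: {step.action}")
        history.append(f"EFFECT: {step.effect}")
    return examples
```

With two recorded steps this yields two training pairs, and the second prompt contains the first action and its effect. Whether a model trained this way ends up reliable at multi-step work is exactly what the thread is arguing about; the sketch only covers the data format.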