Don't Pivot into AI Research
Not all kinds of scaling require high up-front capital costs. Some of us are interested in exploring the fundamental contours of intelligence here, and there are lots of fascinating threads to pull on.
Nothing wrong with exploring motivations though, as this article does. Maybe the job-market trends they describe will turn out to be true, but I hope that's not a primary motivation. There's exploration to do!
I think folks are misreading the article. It's not claiming that AI research isn't valuable; it's advocating for not pivoting into AI research. Research in any field requires a giant surplus of passion and talent. Go where your passion and talent can bear fruit, not where your spreadsheet of pros and cons and risk-adjusted pay grade leads you.
People are too obsessed with "status" in these kinds of decisions.
At the end of the day, if you go into any field chasing money or status and you fail, you will be left with nothing.
So go into AI research if that's what you are interested in. If not, don't do it. Simple as that.
> Scale beats all else. The best performance improvements come from increasing scale, rather than incremental insights in novel architectures.
...until the next novel architecture is discovered, which won't happen without said AI research.
Yep. Where I work there are _a lot_ of efforts underway in that direction. Put simply, standard transformers are great, but they’re very expensive, both to train and to do inference with. They also need enormous datasets. We need architectures that are compute and sample efficient, and friendly to hardware. A standard transformer ticks none of these checkboxes, and research is needed to actually be able to make money with these models. And because profits depend on this research, it’s going to bear fruit. The field is vast and relatively unplowed.
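To put a rough number on the "expensive" part: with standard full attention, the score and value matmuls scale quadratically with context length. The sketch below uses made-up model dimensions (d_model, layer count) purely to illustrate the scaling; it's not a measurement of any real model.

```python
# Rough illustration of quadratic attention cost; the numbers are toy
# assumptions, not measurements of any actual transformer.

def attention_flops(seq_len: int, d_model: int, n_layers: int) -> float:
    """Approximate FLOPs for the attention matmuls in one forward pass.

    Uses the standard ~2 * seq_len^2 * d_model estimate per layer for each of
    the QK^T and attention-weighted V products (projections excluded).
    """
    return float(n_layers * 2 * 2 * (seq_len ** 2) * d_model)

for seq_len in (1_024, 8_192, 65_536):
    flops = attention_flops(seq_len, d_model=4_096, n_layers=32)
    print(f"{seq_len:>6} tokens -> ~{flops:.2e} attention FLOPs per forward pass")
```

Going from 1k to 64k tokens multiplies that quadratic term by roughly 4,096x, which is a big part of why sub-quadratic, more hardware-friendly architectures are such an active research direction.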
Exactly! Data and model quality can be very important too, as a number of surprisingly capable small models have shown.
I really disagree with this. There are lots of topics in AI research that are not just "make the best possible model on this task assuming you have unlimited compute".
I won't say I totally disagree, but I think the article overstates the position.
Two of the most important developments for ANNs were backprop and then the methods for training deep networks. Without those we could only evolve ANNs, which is much less efficient than gradient-based training.
If the scale argument held 100%, then teams that focused purely on evolving ANNs toward their target (this is the "search" method of creating ANNs) would be in the lead, but I'm not aware of anyone doing that other than smaller/hobby-sized projects.
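To make the efficiency gap concrete, here's a toy sketch (my own illustration, not from the thread): gradient descent with an exact gradient versus a naive (1+1) evolutionary search on the same 100-dimensional quadratic. The dimension, step sizes, and iteration count are arbitrary assumptions chosen just to show the contrast.

```python
import numpy as np

rng = np.random.default_rng(0)
target = rng.normal(size=100)                  # the "weights" we want to recover
loss = lambda w: float(np.sum((w - target) ** 2))

# Gradient descent: the exact gradient of the loss is available at every step.
w = np.zeros(100)
for _ in range(200):
    grad = 2 * (w - target)                    # d/dw of sum((w - target)^2)
    w -= 0.1 * grad

# Naive (1+1) evolutionary search: only a scalar fitness value per candidate.
v = np.zeros(100)
for _ in range(200):
    child = v + 0.1 * rng.normal(size=100)     # random mutation
    if loss(child) < loss(v):                  # keep the child only if it improves
        v = child

print(f"gradient descent, 200 steps:    loss = {loss(w):.6f}")
print(f"evolutionary search, 200 steps: loss = {loss(v):.6f}")
```

The gradient-based run converges quickly because each step uses full derivative information, while the search-based run only learns "better or worse" per candidate. Real neuroevolution methods are far smarter than this toy loop, but the information asymmetry is the same, which is roughly the efficiency point above.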