On the speed vs. intelligence spectrum, I'd always figured the smarter AI models would outperform faster ones in all but the most niche cases. On paper and in benchmarks, that often holds true. But I've learned -- at least for me personally -- that faster models bring way more utility to the table.
"Agentic" Coding
Man, I hate the word "agentic." I find it to be trendy and pretentious, but "AI-assisted coding" is a mouthful with too many syllables, so I'll begrudgingly stick with "agentic" coding for now.
When it comes to agentic coding, the instinct is to pick a smart model that can make clever changes to your code. You know, the kind that thinks out loud, noodles on an idea for a bit, then writes code and stares at it for a bit. On the surface, that seems like the way to go -- but for me, it's often a productivity killer.
I've got ADHD, and even on meds, my attention span is flaky at best. Waiting for a model to "think" while it performs a task makes me lose track of what it's even doing. (This is also why I don't believe in running agents in parallel, or in huge plan-think-execute loops like the ones Google Antigravity runs.) My brain wanders off, and by the time the model's done, I've got to context-switch back to review its work. Then I ask for changes, wait again, and repeat the cycle. It's slow, painfully boring, and doesn't even guarantee I'll get what I want. Those little dead-air moments are what ruin it for me.
Sweet Spot: Leaf Node Edits
I took a break from agentic coding for a while and went back to writing code by hand. No waiting, no boredom, and I always got exactly what I needed (usually). But then I realized something -- not all coding tasks need deep thought. My buddy Alex likes to call them "leaf node edits": small, trivial changes that are more mechanical than cerebral. Think splitting functions, renaming stuff (when Rename symbol doesn't cut it), or writing HTML and Markdown. These are perfect to delegate to AI because failing at tasks like these is rarely consequential, and the mistakes are easy to spot.
I think the real trick here is to not rely on AI to think or to make architectural or design decisions. Do the planning and heavy lifting yourself; just use the tool as fast, broad-sweeping autocomplete. It's less like hiring a programmer and more like extending your typing speed.
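To make "leaf node edit" a bit more concrete, here's a made-up sketch in Python of the kind of function split I mean. Every name in it (summarize_before, order_total, format_lines, summarize_after) is hypothetical and invented for illustration; the point is only that the behavior stays the same and the change is purely mechanical.

```python
# Hypothetical example: all function and field names are made up for illustration.

def summarize_before(orders):
    """Original version: one function that both computes and formats."""
    total = 0.0
    for order in orders:
        total += order["price"] * order["qty"]
    lines = [f"{order['name']} x{order['qty']}" for order in orders]
    lines.append(f"Total: {total:.2f}")
    return "\n".join(lines)


def order_total(orders):
    """Extracted piece: the arithmetic only."""
    return sum(order["price"] * order["qty"] for order in orders)


def format_lines(orders):
    """Extracted piece: one display line per order."""
    return [f"{order['name']} x{order['qty']}" for order in orders]


def summarize_after(orders):
    """Split version: same output as summarize_before, built from the two helpers."""
    lines = format_lines(orders)
    lines.append(f"Total: {order_total(orders):.2f}")
    return "\n".join(lines)
```

I already know what I want before I ask; the model just does the typing. And if it botches the split, comparing the old and new output on a couple of inputs catches it right away.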
Cursor's Composer 1
I once wrote about how Dumb Cursor is the best Cursor and that Cursor peaked with its first Composer release (I don't know why they chose to give two distinctly different features the same name, but who knows anything these days). I take it all back. Composer is back with a vengeance, and it's fast. Aggressively fine-tuned for parallel tool calling, it flies through making changes -- even if it's not that smart.
It makes silly mistakes and sometimes spits out vibe-codey slop (think: too many inline comments, ignoring :=, or overusing try/except where it's not needed) -- but because it's so dang quick, it's a joy to use. For those leaf node edits, speed beats smarts every time.
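For a rough idea of what that slop looks like, here's a made-up Python sketch (not actual Composer output). Both function names are hypothetical. The first version has the three tells I mentioned: comments that restate the code, no :=, and a try/except wrapped around code that has nothing to catch when it's given a string. The second is what I'd tidy it into.

```python
import re

# Hypothetical example: both helpers are made up for illustration.

def first_number_sloppy(text):
    # Try to find a number
    try:
        # Search the text for digits
        match = re.search(r"\d+", text)
        # Check if a match was found
        if match is not None:
            # Return the number
            return int(match.group())
        # Return None
        return None
    except Exception:
        # Blanket handler that would only hide real bugs
        return None


def first_number_tidy(text):
    """Return the first run of digits in text as an int, or None if there is none."""
    if (match := re.search(r"\d+", text)) is not None:
        return int(match.group())
    return None
```

Cleaning this up takes me a few seconds, which is still a better deal than waiting a couple of minutes for a smarter model to get it right the first time.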
Gemini Flash et al.
I've also tinkered with other fast models like Gemini Flash. It's cheap and decently smart, but it's just too unreliable for me. Google's API endpoints randomly conk out, the model struggles with tool calling, and it'll hallucinate if you stuff too much into its context (which it touts as obnoxiously large). I'm sure there are workarounds -- but I don't want to fuss with it. My goal with agentic coding is low-friction help, not a side project to debug the tool itself.
Then there are superfast inference providers like Cerebras, SambaNova, and Groq. They let you run smart open-weight models (think Qwen or Kimi) at lightning speed. If I weren't already using Cursor, I'd probably go back to Roo or Crush with these. I just don't want to be managing multiple providers, API keys, and strict rate limits -- it feels like a hassle, which kinda defeats the purpose of a fast model.
Speed for my sanity
At the end of the day, my brain craves tools that keep up with me, not ones that make me wait. Faster models might not be the smartest, but for leaf node edits and mechanical tasks, I find them to be much more palatable. I'd rather iterate quickly and fix small goofs than sit through a slow model's deep thoughts.
Turns out, speed isn't just a feature for me -- it's a necessity.