Two Months of Vibe-Coding: Scala, Constraints, Trust and Shipping

Voytek Pituła

I’ve been programming for almost 15 years now, last 5 on mission-critical software in the finance domain. I know the stakes, tradeoffs and implications — most of my daily job is about those. Yet for the last two months I went full-throttle on AI-assisted coding across personal projects, OSS work, and professional tasks.

The Turning Point

I’ve been playing with agentic coding since the Junie release around May 2025. The results varied: sometimes great, sometimes completely useless. Everything changed around December when the tools got significantly better. Now I mostly use Claude Code, occasionally Junie. I tried Codex but it was completely subpar.

Scala Works Fine, Thanks for Asking

One thing people were mentioning last year is that AI struggles with niche languages, including Scala. It doesn’t.

Scala 3 syntax works without issues. Same with implicits, type-level programming, and macros. Cross-compilation between JVM and JS? No problems. sbt? Also fine. I asked it to learn Laminar from the docs and it found Waypoint on its own without any struggle. I’ve thrown obscure libraries at it like decisions4s and it handles them pretty well. Having checked-out sources around helps but isn’t necessary — it just goes to docs or GitHub if it doesn’t know something.

I don’t use significant whitespace and don’t plan to. I’ve heard AI can get confused by it but never tested this myself. It for sure doesn’t get confused by good old curly braces.

Harness, Not Prompts

If there’s one thing I want you to take away from this article, it’s this: a testing harness is the most important thing for vibe-coding. Not prompt engineering, not fancy plugins, just constraining your AI from outside the AI toolchain.

I’m calling it a harness because it’s not only tests. It’s tests, types, linters, and any other automated checks you can put in place. The more you rely on AI, the more harness you need. AI has a blind spot: it’s very much focused on local reasoning and can easily miss the big picture. Integration tests are your guardrails.
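To make the "types are harness" part concrete, here is a minimal Scala 3 sketch. All names in it (Ids, AccountId, CategoryId, Transfer, record) are hypothetical and invented for this illustration; none of them come from SSBudget or any other project mentioned here. The idea is that two identifiers that are both plain strings at runtime become distinct types at compile time, so a locally plausible edit that swaps them cannot compile.

```scala
// Hypothetical example: every name below is made up for illustration.
object Ids {
  // Both are Strings at runtime, but distinct types outside this object.
  opaque type AccountId = String
  opaque type CategoryId = String

  object AccountId {
    def apply(raw: String): AccountId = raw
  }

  object CategoryId {
    def apply(raw: String): CategoryId = raw
  }
}

import Ids.*

final case class Transfer(from: AccountId, category: CategoryId, amountCents: Long)

def record(from: AccountId, category: CategoryId, amountCents: Long): Transfer =
  Transfer(from, category, amountCents)

@main def idsDemo(): Unit = {
  val account  = AccountId("acc-1")
  val category = CategoryId("groceries")

  val ok = record(account, category, 1299L)
  println(ok)

  // The line below is rejected by the compiler because the arguments are swapped:
  // record(category, account, 1299L)
}
```

With plain String parameters the swapped call would compile and fail only at runtime, if at all. Here the mistake is caught before any test runs, which is the kind of check that does not depend on the AI noticing the bigger picture.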

But here’s what might be the most impactful part: libraries and frameworks are harness too. They enforce structure and safety by design. When AI generates code using a well-designed framework, it’s constrained to patterns that make sense.

And the same applies to the language itself: Scala’s immutable-first standard library is a harness on its own. AI can’t accidentally mutate shared state when there’s nothing to mutate. Add scalafmt for consistent formatting and sbt-tpolecat with -Werror to turn the compiler into your most powerful linter; it catches issues AI can easily miss.
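A small sketch of what that looks like in practice, assuming a hypothetical ledger model (Entry and Ledger are names invented for this example, not code from any project mentioned in this article). It shows two things: "adding" to an immutable structure returns a new value and leaves the old one untouched, and a pattern match over an enum is checked for exhaustiveness, which the Scala 3 compiler reports as a warning and -Werror turns into a build failure.

```scala
// Hypothetical domain model, for illustration only.
enum Entry {
  case Income(amountCents: Long)
  case Expense(amountCents: Long)
}

final case class Ledger(entries: Vector[Entry]) {
  // Returns a new Ledger; the receiver is never modified.
  def add(entry: Entry): Ledger = copy(entries = entries :+ entry)

  def balanceCents: Long = entries.map(Ledger.signedCents).sum
}

object Ledger {
  val empty: Ledger = Ledger(Vector.empty)

  // Deleting one of these cases produces a "match may not be exhaustive"
  // warning, which becomes a compile error when -Werror is enabled.
  def signedCents(entry: Entry): Long = entry match {
    case Entry.Income(cents)  => cents
    case Entry.Expense(cents) => -cents
  }
}

@main def ledgerDemo(): Unit = {
  val before = Ledger.empty.add(Entry.Income(10000L))
  val after  = before.add(Entry.Expense(2500L))

  println(before.balanceCents) // 10000: the earlier value is unchanged
  println(after.balanceCents)  // 7500
}
```

Because every code path holding `before` keeps seeing the same data, there is no shared mutable state for generated code to corrupt from a distance. When a new Entry case is added later, the compiler points at every match that needs updating.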

For anything UI-related, e2e testing is a blessing. AI can handle writing Selenium tests perfectly well, and having them gives it huge autonomy in fixing its own mistakes. When I built SSBudget (more on it below), developing e2e tests at the very beginning was one of the best decisions I made. Bugs happened, of course they did. So I just made a list of them, gave it to Claude, and asked it to cover them with tests and fix them. Done.

Bottom line: invest in constraints as much as you can. “Liberties constrain; constraints liberate” hits once again.

The Review Problem

Let me be blunt: reviews are a real pain in the ass, and their diligence has become a function of code importance. This is probably the biggest bottleneck right now.

For UI code, if it looks OK in the browser, I don’t give a crap how it’s expressed under the hood. I couldn’t write good frontend code in the first place, so I can’t meaningfully review it either. I just accept it and move on (at least the html part).

For OSS I use CodeRabbit and it helps a lot with PR traffic. At work I’ve used Claude for semi-manual reviews — it’s OK but not enough.

My rough estimate is that ~50% of AI output needs major refinements or followups, and only ~40% is OK with minor changes (the remaining ~10% is a total throwaway). The review burden adds up fast.

What I’ve Actually Shipped

A lot of people praise the AI revolution without any hard data to back it up. So below is what I was able to do in the last two months or so.

SSBudget, my new personal budget tracker:

I’ve built it in a week of evenings while playing Brotato or watching Netflix. Fully vibecoded. Tech stack: Scala 3, Laminar, Tapir, Selenium, SQLite.

What did AI handle? Everything, really.

  • Backend: my comfort zone, handled fine enough.
  • Frontend with Laminar: stuff I can barely do. I hadn’t used Laminar before, but I have general Scala.js experience.
  • Passkey authentication: stuff I definitely couldn’t do without significant learning; just knowing where to start would have taken me days.
  • Selenium tests: stuff I would never do because it’s a pain in the ass.
  • It even handled video creation with banners and pauses for the promo based on the e2e tests. (You can see it in the readme)

Business4s ecosystem:

All of those were significant developments waiting to be done for months. And I try to keep the quality bar there quite high.

Moreover, I also used it to refine a lot of GitHub issues. Historically, when I spotted a problem, I usually wrote 1–3 sentences just to record it. Now I can give those sentences to Claude and get a meaningful comprehensive description with a solution design draft.

Personal projects:

  • SSBudget described above
  • My personal reading tooling: transformed my string-and-tape Python scripts into a Flask app in a few days. I had never used Python for anything bigger before.

Work: A lot of tasks, but with much more nuance when it comes to AI usage — maybe 50% of OSS effectiveness.

Design, Not Just Coding

What’s equally important: I use AI for design as much as for coding. I ask it to brainstorm ideas, plan work, refine technical details and architecture. The output design docs are good both for humans and for later agentic coding sessions. But those designs need a lot more back-and-forth than actual coding, significantly more iteration, and that’s where I spend most of my time now. Once the design is solid, though, implementation is rarely a problem.

I’ve planned not just particular features but also bigger multi-phase projects. One example from work: a feature that required two big enabler refactorings first. AI helped plan the whole sequence. Another example: exploring the revival of the abtesstr and chatOps4s projects, which meant redefining goals, roadmap, product-market fit, APIs, and architecture.

And the gain is not only quantitative but also qualitative — I’ve never been so diligent about my work. Previously I would just start coding stuff and see what comes. Now AI forces me to write my thoughts and plans down, and it makes a big difference.

The Mindset Shift

Do I miss organic coding? Yes and no. I still write maybe 10–20% of code where problems are genuinely hard and AI can’t get it. But here’s what changed: solving problems gives me much more satisfaction than typing ever did, and the satisfaction shifted from crafting code to designing systems.

I multiplied my effectiveness by leaps and bounds and the tradeoff is worth it.

But — and this is crucial — knowing what to build became much more important than ever before. I could easily build stuff that makes no sense in the grand scheme of things. Producing code got very cheap, but every line of code is still a cost to maintain. You really need to know what makes sense, what features are important, what is the long-term strategy, what approaches to use, what will be the impact on the team and maintenance. Product-mindset and broad perspective are key.

Here’s my current worry: without strong software engineering fundamentals — and I don’t mean coding, I mean real long-term engineering thinking — AI-assisted coding is a certain disaster. Everyone now has access to Ferraris, but the brakes and controls are still in our heads.

And here’s another thing people don’t talk enough about: using AI is a skill in itself. You need to proactively train it the same way you trained your functional programming expertise or system design skills. It’s not just “write a prompt and magic happens”. You develop intuition for what to ask, how to structure tasks, when to intervene, when to let it run. I’m still learning this, and probably will for the foreseeable future.

The Time Revolution

This deserves its own section because it’s probably the biggest practical change for me.

I used to need large uninterrupted blocks of time to be productive. Getting into the zone, loading context into my head, building momentum — that took time. A 30-minute slot between meetings was useless. An evening when family might need me any moment was not worth starting anything serious.

That’s gone now. AI holds the context and does the grunt work while I think about the next thing, which allows me to split work into smaller chunks. Fifteen minutes of my time can yield hours’ worth of AI output: that is enough to review what it produced, give it the next task, and move on. Those fragmented evening slots when my family doesn’t need me are suddenly productive.

I adapted my workflow to exploit this: I try to have 2–3 AI-compatible tasks running all the time, so at least something is always in progress. This sometimes requires multiple clones of the same repository when I’m working on two features simultaneously.

The scope of what’s possible also expanded dramatically. Missing expertise is no longer a blocker. I would never have implemented passkey support in a toy project because of how much learning it required, but now I just describe what I need and iterate until it works.

How I Actually Work

Here’s the thing about my setup: it’s deliberately boring.

I never bother with prompt crafting. Over time I’ve built some intuition about what to mention, but I’m not doing it explicitly. I just write what I need and hope for the best. Instructions from CLAUDE.md are notoriously ignored. Even instructions from /plan can be missed. I don’t use MCPs, plugins, skills, sub-agents or any of that. They’re too non-deterministic and bleeding edge. I put a lot of effort into eliminating noise in my life in general, and those tools fall into the noise category for now. I’ll probably start using them soon, once they are reliable and give significant leverage.

And there is one constant battle: verbosity. AI loves to be verbose — in docs, code comments, code itself. “Keep succinct”, “eliminate repetition”, “remove redundant comments” are probably my most repeated phrases.

Can You Actually Trust It?

You can almost never trust AI with anything; everything comes down to reviews, harness, and diligence.

UI is my only exception — if it looks OK, I accept it. But I don’t build UIs for work or for anything serious, so my bar is low there. Anything else has to be treated with the utmost care.

Technical debt will bury you if you don’t keep things under control. And sensitive code and credentials require even more discipline than before — if you kept your secrets in a repo before, you will have an even bigger problem now.

Where This Goes

I honestly don’t know how people will get enough expertise to use AI reliably in the future. Learning from more experienced colleagues is now more important than ever. And leaving juniors without supervision is more dangerous than ever. The fundamentals that let you judge whether AI output makes sense — those still need to come from somewhere.

And a piece of friendly advice from me: don’t follow the hype. AI is here to stay, and it’s transforming how we build software, but don’t jump on every new thing. I ignored Copilot, Cursor, Google Gemini, Ollama, local models, early Claude Code and other tools for a long time. I’m currently ignoring Moltbook (OpenClaw). You have limited time and resources in your day. Using tools once they become “boring”, mainstream, and proven is usually the most effective strategy.

If you’re a developer on the fence about AI tooling: it works, it’s not hype, but it requires a different kind of vigilance than writing code yourself, and it’s a new skill you have to build.

The shift is real though. We are going from typing code to writing designs, from crafting to directing, from coding to shipping. I’m still adapting, but I’m delivering more than ever.