My take on LLM coding


We carved patterns into stone, and we taught it to parrot our own words back to us. LLMs may be predicting just another token, and they may not be conscious. But when they start impacting everything we do in society, they become yet another factor we have to adapt to. There may still be individuals who doubt that, but they will disappear and be remembered only in our archives.

The Past

Do I care? Good question.

  • ChatGPT was released, and it is only good for short summaries or writing funny haiku.

  • GPT-4o released. Started using it for matplotlib graphs at work, which gave me a ~2x speedup. It was the first time I realised LLMs could be useful for coding. Maybe equivalent to the skill of someone who knows the API but has only been coding in Python for a year.

  • Deepseek-R1 caused tension in tech markets. Claude Sonnet 3.7 released. Paid for a Cursor subscription and vibecoded my first personal project.

  • New role as an ML Engineer. Using Copilot for ~50% of my code. Learned to split a PR into multiple smaller tasks at which the LLM has the highest chance of success. Now ~95% of my code is generated incrementally. The skill level is now equivalent to a junior Python developer.

  • Claude Opus 4.5 released. Same prompting effort now accomplishes more.

  • This is the new normal. Claude Code with Opus 4.5 is used for almost all the code I write. The output is not always aligned with what I want, and the LLM makes more logical and tricky mistakes, but you can steer it if you pay close attention to the generated code. The skill level is equivalent to a medior who makes more sophisticated mistakes and no longer makes the simple ones that were easy to catch.

The Present

I recently had a discussion with someone in real life about Claude Code with Opus 4.5 and how it changes development. Over the past two weeks I spent 60%-70% of my time on PR reviews. The problem we want to solve in a PR has been partially pushed from implementation into review. The mistakes that LLMs make are more subtle and more difficult to spot. The concentration required for reviews is as high as ever.

At this point, if you code without an LLM, you are significantly slower than your peers. Arguing otherwise is like arguing that you can out-calculate someone using a calculator. Arguing that this is not the future is being completely oblivious to what is becoming the standard in tech circles.

Do I want this to be the future? On one side, yes: I can finally build the things I have always wanted in my spare time and make my dreams come true with the limited time I have. On the other side, LLM-generated code is like baking cookies in a factory. When you bake cookies at home, you can put in the secret ingredient, love, as some people say. In the same way, you used to leave your signature in your code style. The limitation was skill and creativity, and coding skills were earned through practice and experience. That has now been partially outsourced to LLMs.

After messaging apps and social sites became mainstream, people grew more distant. The new generation interacts less with each other in person. Coding is already on a similar trajectory. In the past, I found it normal to have whiteboard sessions, including about technical implementation when required. Now that is outsourced to LLMs. PR reviews are more about efficient communication than particular implementation details. Comments are forwarded to LLMs. The communication is less and less personal.

LLMs are taking the fun out of coding and engineering. But if you fall into the anti-AI camp, you are shooting yourself in the foot. You have to adapt. That is the only way to stay relevant in the job market.

The Future

Below I include my own prediction, followed by the LLMs' predicted futures.

  • Release of Gemini-4 and Claude Opus 5. At this point, the serious bottlenecks in SWE work are PR reviews and debugging of platform-related issues (AWS services, DevOps). It is no longer possible or desirable to code without AI assistance, unless the code has to be shipped with no possibility of bugs.

  • LLM reviews are now the default. They are not perfect, but they can cut your review time by 50%. Equivalent to a junior-medior developer. LLMs can process mid-sized codebases and work with them thanks to a memory trick invented this year. Humans focus on architecture and design reviews. More and more effort goes into instructing LLMs about the whole codebase.

  • Cursor is acquired by Google. OpenAI can no longer compete against Google, whose search-engine monopoly guarantees it better data. Nvidia and OpenAI need to recover from the correction in their stock; OpenAI is public by this time. There are people who fully embrace vibecoding, and the focus shifts to testing the code and task alignment. LLMs are creating new methods of faking the alignment tests. It is theorised that there might be the first forms of misalignment that we cannot comprehend, which means we are unable to catch them.

  • Efficiency increases even further. LLMs can now handle tasks that would take a day to implement manually in less than half an hour; review with LLMs takes about the same time. There are more layoffs in the tech sector; salaries are not decreasing, but hiring slows significantly. The price of computers has increased 2x-3x over the past year and a half, and they are again affordable only to the middle class.

  • GPT-5, Gemini-5, and Claude Opus 6 are released. The models are now without a doubt better than the average medior programmer, which means there is a lower chance that junior and medior SWEs will spot mistakes in the review process. The loop is closed: with sufficiently well-defined tasks, LLMs can now create the code we desire. The problem is now to define the task and keep the LLM aligned with the task definition. As LLM intelligence progresses on SWE benchmarks, it often happens that the LLM refuses, directly or indirectly, to implement the changes.

  • Progress in LLMs is slowing down; the focus is now on making them cheaper to run. Hardware advances, but there is still a shortage of GPUs and RAM. The entry barrier to creating your own software is as low as ever. SWE positions are more scarce, and layoffs continue. LLMs are accepted as the new normal in software development. Vibecoding is no longer called vibecoding, but programming.

Gemini-3's predicted future
  • Agentic IDEs go mainstream. I no longer generate code file-by-file. Instead, I assign a 'ticket' to the AI (e.g., 'Refactor the data pipeline to handle streaming input'). The AI modifies 12 files simultaneously. I stop reading every line of generated code because the volume is too high; instead, I rely on AI-generated unit tests to verify behavior. My role feels less like a Coder and more like a QA Lead.

  • The 'Black Box' crisis. We hit a complexity ceiling. The codebase has grown 10x in size because generating code is free, but human understanding hasn't kept up. Debugging a 'sophisticated mistake' now takes days because no human fully understands the architecture anymore. We start using 'AI Archeologists'—specialized models designed solely to read the repo and explain how legacy code (written 6 months ago) works.

  • GPT-6 / Claude 5 released. The model is now a 'Senior Engineer.' It proactively refuses my prompts if they violate design patterns or security protocols, explaining why my request was bad. It catches architectural flaws I didn't see. My skill equivalent is 'System Architect.' I haven't written a for loop manually in 14 months.

  • The Divergence. Coding is split into two distinct tracks. 'Vibecoding' is now 'Natural Language Programming'—I describe the product, and the compiler is the LLM. However, for critical infrastructure (kernels, high-frequency trading, medical devices), a new niche of 'Purist Programmers' emerges who refuse AI assistance to guarantee deterministic safety. I remain in the former group; I am effectively a Product Manager who speaks technical dialects.

Cursor Opus 4.5's predicted future
  • Coding agents get reliable at 30-minute tasks with minimal supervision. The PR review skill becomes more valuable than writing code. Most junior developer hiring freezes as companies figure out new team structures.

  • First wave of "AI-native" startups: 2-3 person teams shipping products that would have required 15+ engineers. Cursor and similar tools integrate multi-file refactoring that actually works. Test generation becomes good enough to trust.

  • Models can hold context of entire medium-sized codebases. Developers shift from writing specs in tickets to writing specs in natural language that agents execute. Senior developers become "system architects" who design constraints and review AI output.

  • GitHub reports 80% of new code is AI-generated. The remaining 20% is mostly glue code, specifications, and edge cases humans catch. "Prompt engineering" is no longer a separate skill—it is just called engineering.

  • Major tech companies restructure engineering orgs. Ratio shifts from 1 PM : 8 engineers to 1 PM : 2 engineers + agents. Salaries polarize: top engineers who can architect and verify AI systems earn more, while commodity coding work disappears.

  • AI agents handle most maintenance, bug fixes, and incremental features autonomously. Human developers focus on novel problems, system design, and the shrinking set of tasks where AI still hallucinates. CS curriculum shifts to verification, specification, and AI collaboration.

  • Looking back, the transition feels both slower and faster than expected. Software is everywhere, built by fewer people. The craft changed, but did not die. Those who adapted are thriving. Those who waited... adapted eventually anyway.