What is agentic engineering?

simonwillison.net

165 points by lumpa 16 days ago · 108 comments

maxbond 16 days ago

I don't think we should be making this distinction. We're still engaged in software engineering. This isn't a new discipline, it's a new technique. We're still using testing, requirements gathering, etc. to ensure we've produced the correct product and that the product is correct. Just with more automation.

  • ssgodderidge 16 days ago

    I agree, partly. I feel the main goal of the term “agentic engineering” is to distinguish the new technique of software engineering from “Vibe Coding.” Many felt vibe coding insinuated you didn’t know what you were doing; that you weren’t _engineering_.

    In other words, “Agentic engineering” feels like the response of engineers who use AI to write code, but want to maintain the skill distinction to the pure “vibe coders.”

    • zx8080 16 days ago

      > “Agentic engineering” feels like the response of engineers who use AI to write code, but want to maintain the skill distinction to the pure “vibe coders.”

      If there is such a distinction. The border is vague at best.

      There are "known unknowns" and "unknown unknowns" when working with systems. In these terms, there's no distinction between vibe coding and agentic engineering.

      • simonw 16 days ago

        My definition of "vibe coding" is the one where you prompt without ever looking at the code that's being produced.

        The moment you start paying attention to the code it's not vibe coding any more.

        Update: I added that definition to the article: https://simonwillison.net/guides/agentic-engineering-pattern...

        • zx8080 16 days ago

          What if you review 50%? Or 10%? Or only 1%, is it not vibe coding yet?

          Where is the borderline?

          • simonw 16 days ago

            I think the borderline is when you take responsibility for the code, and stop blaming the LLM for any mistakes.

            That's the level of responsibility I want to see from people using LLMs in a professional context. I want them to take full ownership of the changes they are producing.

            • zx8080 15 days ago

              Sounds good, but the bar is probably too high and far too idealistic.

              The effects of vibe coding destroy trust within teams and orgs, between engineers.

              • ssgodderidge 15 days ago

                So would shipping unverified, untested code before agents existed. Bad quality will always erode trust.

                The problem with LLM-based coding is that it can generate code (whether good or bad) much faster than before.

            • Toutouxc 15 days ago

              And are you not seeing that level of responsibility?

            • maxbond 15 days ago

              I don't blame the agent for mistakes in my vibe coded personal software, it's always my fault. To me it's like this:

              80%+: You don't understand the codebase. Correctness is ensured through manual testing and asking the agent to find bugs. You're only concerned with outcomes, the code is sloppy.

              50%: You understand the structure of the codebase, you are skimming changes in your session, but correctness is still ensured mostly through manual testing and asking the agent to review. Code quality is questionable but you're keeping it from spinning out of control. Critically, you are hands on enough to ensure security, data integrity, the stuff that really counts at the end of the day.

              20%-: You've designed the structure of the codebase, you are writing most of the code, you are probably only copypasting code from a chatbot if you're generating code at all. The code is probably well made and maintainable.

              • Toutouxc 15 days ago

                I feel like there’s one more dimension. For me, 95%+ of code that I ship has been written (i.e. typed out) by a LLM, but the architecture and structure, down to method and variable names, is mine, and completely my responsibility.

          • 000ooo000 16 days ago

            Have to consult the Definition Engineers to find out

  • skydhash 16 days ago

    My preferred definition of software engineering is found in the first chapter of Modern Software Engineering by David Farley

      Software engineering is the application of an empirical, scientific approach to finding efficient, economic solutions to practical problems in software.
    
    As for the practitioner, he said that they:

      …must become experts at learning and experts at managing complexity
    
    For the learning part, that means

      Iteration
      Feedback
      Incrementalism
      Experimentation
      Empiricism
    
    For the complexity part, that means

      Modularity
      Cohesion
      Separation of Concerns
      Abstraction
      Loose Coupling
    
    Anyone who advocates for agentic engineering has been very silent about the above points. Even for the very first definition, it seems we’re no longer seeking to solve practical problems, nor proposing economical solutions for them.

    • simonw 16 days ago

      That definition of software engineering is a great illustration of why I like the term agentic engineering.

      Using coding agents to responsibly and productively build good software benefits from all of those characteristics.

      The challenge I'm interested in is how we professionalize the way we use these new tools. I want to figure out how to use them to write better software than we were writing without them.

      See my definition of "good code" in a subsequent chapter: https://simonwillison.net/guides/agentic-engineering-pattern...

      • skydhash 16 days ago

        I’ve read the chapter, and while the description is good, there are no actual steps, or at least a general direction/philosophy, on how to get there. It does not need to be perfect, it just needs to be practical. Then we could contrast the methodology with what we already have to learn the tradeoffs, whether they can be combined, etc.

        Anything that relates to “Agentic Engineering” is still hand-wavey or trying to impose a new lens on existing practices (which is why so many professionals are skeptical)

        ADDENDUM

        I like this paragraph of yours

          We need to provide our coding agents with the tools they need to solve our problems, specify those problems in the right level of detail, and verify and iterate on the results until we are confident they address our problems in a robust and credible way.

        There’s a parallel to be made with Unix tools (best described in Unix Power Tools) or with Emacs. Both aim to give the user a set of small tools that can be composed to do amazing work. One observation from my experiments with agents was that creating small deterministic tools (much the same thing I do with my OS and Emacs), and then letting the agent be the driver, works well. Such tools have simple instructions, but their worth is in their combination. I never have to use more than 25 percent of the context, and I’m generally done within minutes.
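
The "small deterministic tools" idea above can be sketched as a toy pipeline. Each function below is a hypothetical stand-in for one tiny tool (the names are illustrative, not any real API): trivial on its own, useful in combination, and deterministic, so an agent driving them only has to choose the order of composition.

```python
# Small deterministic tools, composed Unix-style.

def grep(lines, needle):
    """Keep only lines containing needle (like Unix grep)."""
    return [line for line in lines if needle in line]

def sort_lines(lines):
    """Sort lines lexicographically (like Unix sort)."""
    return sorted(lines)

def uniq(lines):
    """Drop adjacent duplicates (like Unix uniq; assumes sorted input)."""
    out = []
    for line in lines:
        if not out or out[-1] != line:
            out.append(line)
    return out

# Composed like a shell pipeline: grep error log | sort | uniq
log = ["error: disk", "info: ok", "error: net", "error: disk"]
print(uniq(sort_lines(grep(log, "error"))))  # ['error: disk', 'error: net']
```

Each tool's contract is one sentence long, which is what keeps the agent's job small: it picks a composition rather than writing the logic.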

    • esafak 16 days ago

      You can do these things with AI, especially if you start off with a repo that demonstrates how, for the agent to imitate. I do suggest collaborating with the agent on a plan first.

  • simonw 16 days ago

    Yeah, I see agentic engineering as a sub-field or a technique within software engineering.

    I entirely agree that engineering practices still matter. It has been fascinating to watch how so many of the techniques associated with high-quality software engineering - automated tests and linting and clear documentation and CI and CD and cleanly factored code and so on - turn out to help coding agents produce better results as well.

  • archagon 16 days ago

    Actually, if you defer all your coding decisions to agents, then you're not doing engineering at all. You don't say you're doing "contractor engineering" when you pay some folks to write your app for you. At that point, you are squarely in the management field.

    • maxbond 15 days ago

      If you're producing a technological artifact and you are ensuring it has certain properties while working within certain constraints, then in my mind you're engineering, and it's a question of the degree of rigor. Engineers in the "hard engineering" fields (e.g. mechanical engineers, civil engineers) as a rule don't build the things they design; they spend a lot of time managing and working with contractors.

      • Peritract 15 days ago

        > If you're producing a technological artifact and you are ensuring it has certain properties while working within certain constraints, then in my mind you're engineering

        This covers every level of management in tech companies.

        • maxbond 15 days ago

          Not really, upper levels of management are more concerned with strategic decisions, they aren't making sure certain invariants are upheld.

      • archagon 15 days ago

        I’m pretty sure engineers in those professions need to know the physical/mathematical properties of their designs inside and out. The contractors are not involved in that and have limited autonomy.

        I would not want to drive over a vibe-coded bridge.

    • imtringued 14 days ago

      The fact that simonw is so eager to drop the word "software" in software engineer and keep the word "engineer" reeks of ego.

      You're not the engineer anymore, but you're still responsible for creating software. Why drop the most important word and keep the ego stroking word?

      • simonw 14 days ago

        Because in order to distinguish what we are doing from vibe coding we need the word that sounds more impressive.

  • vidarh 15 days ago

    I think the automation makes a significant difference, though. I'm building a tool that is self-improving, and I use "building" for a reason: I've written about 5 lines of it, to recover from early failures. Other than that, I've been reviewing and approving plans that the system has executed itself. Increasingly I'm not even doing that. Instead I'm writing requirements, reviewing high-level specs, letting the system generate its own work items and test plans, execute them, and verify the test plan was followed. Sometimes I don't even read past the headline of the plan.

    I've read a reasonable proportion of the code. Not everything is how I'd like it to be, but regularly I'll tell the system to generate a refactoring plan (with no details; that's up to the agent to figure out), and it does, and the plans actually do systematically improve the quality.

    We're not quite there yet, but I plan to build more systems with it that I have no intention of writing code for.

    This might sound like "just" vibe coding. But the difference to me is that there are extensive test plans and a wide range of guard rails: a system that rewards gradually refining hard requirements that are then validated.

neonbrain 16 days ago

The term feels broken when adhering to standard naming conventions, such as Mechanical Engineering or Electrical Engineering, where "Agentic Engineering" would logically refer to the engineering of agents.

  • simonw 16 days ago

    Yeah, Armin Ronacher has been calling it "agentic coding" which does at least make it clear that it's not a general engineering thing, but specifically a code related thing.

  • pamelafox 16 days ago

    I think “agent engineering” could refer to the latter, if a distinction needs to be made. I do get what you’re saying, but when I heard the term, I personally understood its meaning.

  • victorbjorklund 15 days ago

    Lots of things already violate that convention. The typical Site Reliability Engineer isn’t building the tools for site reliability, but rather applying those tools to other software.

  • ares623 16 days ago

    Agentic Management doesn't have quite the same ring to it.

    • jfim 16 days ago

      That's kind of how it feels though. I get the impression I'm micro managing various Claude code instances in multiple terminals.

sigbottle 16 days ago

There should be more willingness to have agents fail loudly with loud TODOs rather than try to one-shot everything.

At the very least, agentic systems must have distinct coders and verifiers. Context rot is very real, and I've found with some modern prompting systems there are severe alignment failures (literally 2023 LLM RL levels of stubbing out and hacking tests just to get tests "passing"). It's kind of absurd.

I would rather an agent leave 10 TODOs and fail loudly than make 1 silent fallback, sloppy architectural decision, or act of outright malicious compliance.

This wouldn't work in a real company because this would devolve into office politics and drudgery. But agents don't have feelings and are excellent at synthesis. Have them generate their own (TEMPORARY) data.

Agents can be spun off to do so many experiments and create so many artifacts, and furthermore, a lot more (TEMPORARY) artifacts is ripe for analysis by other agents. Is the theory, anyways.

The effectively Platonic view that we just need to keep specifying more and more formal requirements is not sustainable. Many top labs are already doing code review with AI because of the sheer volume of code output.
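
The coder/verifier split argued for above can be sketched as two roles with separate contexts. Everything here (`coder`, `verifier`, the TODO heuristic) is a hypothetical illustration, not any vendor's API: the point is only that a second role rejects stubs loudly instead of letting them pass silently.

```python
# Two-role loop: one role drafts, a second role with fresh context checks.

def coder(task, feedback=None):
    """Drafting role. The first draft stubs out the work, as agents
    sometimes do; with feedback it produces a real implementation."""
    if feedback is None:
        return "def area(r):\n    pass  # TODO: implement"
    return "def area(r):\n    return 3.14159 * r * r"

def verifier(source):
    """Checking role: reject drafts that stub out the task."""
    if "TODO" in source or "pass" in source:
        return (False, "stub detected: implement the body, do not fake it")
    return (True, "ok")

def loop(task, rounds=3):
    """Alternate coder and verifier; fail loudly if no draft survives."""
    feedback = None
    for _ in range(rounds):
        draft = coder(task, feedback)
        ok, feedback = verifier(draft)
        if ok:
            return draft
    raise RuntimeError("failed loudly after %d rounds: %s" % (rounds, feedback))

print(loop("area of a circle"))
```

The `RuntimeError` branch is the "fail loudly" part: an unresolved task surfaces as an error, not as a silently stubbed-out function.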

jbethune 16 days ago

I think there is a meaningful distinction here. It's true that writing code has never been the sole work of a software engineer. However, there is a qualitative difference between an engineer producing the code themselves and an engineer managing code generated by an LLM. When he writes there is "so much stuff" for humans to do outside of writing code, I generally agree, and would sum it up with one word: accountability. Humans have to be accountable for that code in many ways, because accountability is ultimately something AI agents lack.

  • nlawalker 16 days ago

    I think within the industry and practice there's going to be a renewed philosophical and psychological examination of exactly what accountability is over the next few years, and maybe some moral reckoning about it.

    What makes a human a suitable source of accountability and an AI agent an unsuitable one? What is the quantity and quality of value in a "throat to choke", a human soul who is dependent on employment for income and social stature and is motivated to keep things from going wrong by threat of termination?

aewens 16 days ago

“It’s not vibe coding, it’s agentic engineering”

From Kai Lentit’s most recent video: https://youtu.be/xE9W9Ghe4Jk?t=260

  • simonw 16 days ago

    Thanks for the reminder, I should add a note about vibe coding to this piece.

    • DonHopkins 15 days ago

      Except for the bad advice about using VIM. Emacs FTW! I even named my cat Emacs.

redhale 15 days ago

As someone who works with real licensed engineers (electrical, civil), I wish we would use the term "agentic software engineering" to describe this. Omitting "software" here betrays a very SWE-centric mindset.

Agents are coming for the other engineering disciplines as well.

danieltanfh95 16 days ago

Agentic engineering is working from documentation -> code and automating the translation process via agents. This is distinct from the waterfall process, which describes the program but not the code itself; waterfall documentation cannot be translated directly to code. Agent plans and sessions carry far more context and detail than waterfall captures, due to differences in scope.

nclin_ 16 days ago

I've discovered recently, as code gets cheaper and more reliable to generate, that having the LLM write code for new elements in response to particular queries, with context, works well.

Kind of like these HTML demos, but more compact and card-like. The possibilities for responsive, human-readable information display and wiki-like natural-language exploration as models get cheaper are exciting.

jdlyga 16 days ago

Sure, you could argue it's like writing code that gets optimized by the compiler for whatever CPU architecture you're using. But the main difference between layers of abstraction and agentic development is the "fuzziness" of it. It's not deterministic. It's a lot more like managing a person.

pamelafox 16 days ago

I’ve been using the term “agentic coding” more often, because I am always shy to claim that our field rises to the level of the engineers that build bridges and rockets. I’m happy to use “agentic engineering” however, and if Simon coins it, it just might stick. :) Thanks for sharing your best practices, Simon!

iamcreasy 16 days ago

Is there an article explaining how AI tools have evolved since the release of ChatGPT? Everything up to MCP makes sense to me, but since then it feels like there are no clear definitions for the new AI jargon.

kevintomlee 16 days ago

  the practice of developing software with the assistance of coding agents.

Spot on.

ChrisArchitect 16 days ago

Previously on the guide Agentic Engineering Patterns:

https://news.ycombinator.com/item?id=47243272

aryehof 15 days ago

Agentic Coding, or perhaps Agentic Software Development, is far more real and appropriate. Calling it engineering is better left to those wanting to impress family and peers.

Aafi04 16 days ago

Curious how this evolves when agents start retaining memory across projects. Feels like that could change how we think about the tool loop.

deadbabe 16 days ago

I think we all know what agentic engineering is; the question is when it should not be used in place of classical engineering.

righthand 16 days ago

How is this different than Prompt Engineering?

  • roncesvalles 16 days ago

    I think prompt engineering is obsolete at this point, partly because it's very hard to do better than just directly stating what you want. Asking for too much tone modification, role-playing or output structuring from LLMs very clearly degrades the quality of the output.

    "Prompt engineering" is a relic of the early hypothesis that how you talk to the LLM is gonna matter a lot.

  • simonw 16 days ago

    Prompt engineering didn't imply coding agents. That's the big difference: we are now using tools that write and execute the code, which makes for massively more useful results.

  • giancarlostoro 16 days ago

    Prompt engineering was coined before tooling like Claude Code existed, when everyone copied and pasted from chatgpt to their editor and back.

    Agentic coding highlights letting the model work directly on your codebase. I guess it's the next level forward.

    I keep seeing "agentic engineering" more, even in job postings, so I think this will be the terminology used to describe someone building software while letting an AI model output the code. It's not to be confused with vibe coding, which is also possible with coding agents.

  • ares623 16 days ago

    "Prompt" was derogatory /s

mmastrac 16 days ago

After three months of seeing what agentic engineering produces first-hand, I think there's going to be a pretty big correction.

Not saying that AI doesn't have a place, or that models aren't getting better, but there is a seriously delusional state in this industry right now.

P-MATRIX 16 days ago

The skepticism makes sense to me. The core issue isn't wrong outputs—it's that there's no standard way to see what the agent was actually doing when it produced them. Without some structured view of tool call patterns, norm deviations, behavioral drift, verification stays manual and expensive. The non-determinism problem and the observability problem feel like the same problem to me.

itsTyrion 15 days ago

very simple answer: it's mostly marketing fluff that embodies "2 (many) wrongs make 1 right (enough)"

allovertheworld 16 days ago

Staring at your phone while waiting for your agent to prompt you again. Code monkey might actually be real this time

CuriouslyC 16 days ago

The halo effect in action.

techpression 16 days ago

I mean, agents as a concept have been around since the '70s. We've added LLMs as an interface, but the concept (take input, loop over tools or other instructions, generate output) is very, very old.

Claude gave a spot-on description a few months back:

  The honest framing would be: “We finally have a reasoning module flexible enough to make the old agent architectures practical for general-purpose tasks.” But that doesn’t generate VC funding or Twitter engagement, so instead we get breathless announcements about “agentic AI” as if the concept just landed from space.
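
The decades-old pattern described above (take input, loop over tools, generate output) fits in a few lines. Here is a minimal sketch; `run_agent` and `decide` are illustrative names, and in practice `decide()` would be an LLM call rather than the hard-coded rule table used below.

```python
# A minimal agent loop: ask a reasoning module which tool to run next,
# feed the observation back into the history, stop when it says "done".

def run_agent(goal, tools, decide, max_steps=10):
    history = [("goal", goal)]
    for _ in range(max_steps):
        action, arg = decide(history)       # reasoning module (an LLM in practice)
        if action == "done":
            return arg                       # final output
        observation = tools[action](arg)     # deterministic tool call
        history.append((action, observation))
    raise RuntimeError("agent did not finish within max_steps")

# Example: one trivial tool and a hard-coded decision policy.
tools = {"add": lambda xs: sum(xs)}

def decide(history):
    if history[-1][0] == "goal":
        return ("add", [1, 2, 3])            # first step: call the tool
    return ("done", history[-1][1])          # then: return its result

print(run_agent("sum the numbers", tools, decide))  # prints 6
```

Swapping the rule table for a language model is the only structural change the LLM era made to this loop, which is the comment's point.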

habinero 15 days ago

Markdown engineers try to rebrand yet again.

TheAtomic 16 days ago

"Agents are...?" and the answer is circular: "agents run tools in a loop." And this guy knows things?! No. BS.

AdieuToLogic 16 days ago

The premise is flawed:

  Now that we have software that can write working code ...

While there are other points made which are worth consideration on their own, it is difficult to take this post seriously given the above.

  • simonw 16 days ago

    If you haven't seen coding agents produce working code you've not been paying attention for the past 3-12 months.

    • Eufrat 16 days ago

      I get the impression there’s a very strong bimodal experience of these tools and I don’t consider that an endorsement of their long-term viability as they are right now. For me, I am genuinely curious why this is. If the tool was so obviously useful and a key part of the future of software engineering, I would expect it to have far more support and adoption. Instead, it feels like it works for selected use cases very well and flounders around in other situations.

      This is not an attack on the tech as junk or useless, but rather that it is a useful tech within its limits being promoted as snake oil which can only end in disaster.

      • simonw 16 days ago

        My best guess is that the hype around the tooling has given the false impression that it's easy to use - which leads to disappointment when people try it and don't get exactly what they wanted after their first prompt.

        • Eufrat 16 days ago

          I think you and a lot of people have spent a lot of energy getting as much out of these models as you can and I think that’s great, but I agree that it’s not what they’re being sold as and there is plenty of space for people to treat these tools more conservatively. The idea that is being paraded around is that you can prompt the AI and the black box will yield a fully compliant, secure and robust product.

          Rationality has long since gone out of the window with this, and I think that’s sort of the problem. People who don’t understand these tools see them as a way to just get rid of troublesome people. But you need to spend a fair amount of money, fiddle with the models by cajoling them with AGENTS.md, SKILL.md, FOO.md, etc., and then have enough domain experience to actually know when they’re wrong.

          I can see the justification for a small one-person shop spending the time and energy to give it a try, provided the long-term economics of these models makes them cost-effective and the model can be coerced into working well for their specific situation. But we simply do not know, and I strongly suspect there’s been too much money dumped into Anthropic and friends for that to be an acceptable answer right now, as illustrated by the fact that we are seeing OKRs where people are forced to answer loaded questions about how AI tooling has improved their work.

    • AdieuToLogic 16 days ago

      > If you haven't seen coding agents produce working code you've not been paying attention for the past 3-12 months.

      If you believe coding agents produce working code, why was the decision below made?

        Amazon orders 90-day reset after code mishaps cause
        millions of lost orders[0]
      
      0 - https://www.businessinsider.com/amazon-tightens-code-control...

      • erklik 16 days ago

        Good journalism would include: https://www.aboutamazon.com/news/company-news/amazon-outage-...

        I find it somewhat overblown.

        Also, I think there's a difference between working code and exceptionally bug-free code. Humans produce bugs all the time. I know I do at least.

        • AdieuToLogic 15 days ago

          > Good journalism would include ...

          The link you provided begins with the declaration:

            Written by Amazon Staff
          
          I am not a journalist and even I would question the "good journalism would include" assertion given the source provided.

          > I find it somewhat overblown.

          As I quoted in a peer comment:

            Dave Treadwell, Amazon's SVP of e-commerce services, told 
            staff on Tuesday that a "trend of incidents" emerged since 
            the third quarter of 2025, including "several major" 
            incidents in the last few weeks, according to an internal 
            document obtained by Business Insider. At least one of 
            those disruptions were tied to Amazon's AI coding assistant 
            Q, while others exposed deeper issues, another internal 
            document explained.
            
            Problems included what he described as "high blast radius 
            changes," where software updates propagated broadly because 
            control planes lacked suitable safeguards. (A control plane 
            guides how data flows across a computer network).
          
          If the above is "overblown", then it is the SVP who has overblown it. I have no evidence to believe this is the case, however.

          Do you?

          • erklik 11 days ago

            > I am not a journalist and even I would question the "good journalism would include" assertion given the source provided.

            You've misunderstood. I was saying good journalism would include both sides, and hopefully primary sources alongside the reporting, so readers can evaluate both.

            > If the above is "overblown", then the SVP has done so. I have no evidence to believe this is the case however.

            It says "at least one of those disruptions were tied to Amazon's AI coding assistant Q, while others exposed deeper issues." You initially cited this article as evidence that coding agents don't produce working code. But the SVP is describing a broader trend of deployment and control plane failures, most of which are classic infrastructure problems that predate AI tooling entirely. You're attributing a systemic operational failure to AI code generation when even your own source doesn't support that.

            More fundamentally, your original argument was that the premise "software can write working code" is flawed. One company having incidents, where some of those incidents involved AI tooling doesn't prove that. Humans cause production incidents every single day. By your logic, the existence of any bug would prove humans can't write working code either.

      • simonw 16 days ago

        You appear to be confusing "produce working code" with "exclusively produce working code".

        • AdieuToLogic 16 days ago

          > You appear to be confusing "produce working code" with "exclusively produce working code".

          The confusion is not mine own. From the article cited:

            Dave Treadwell, Amazon's SVP of e-commerce services, told 
            staff on Tuesday that a "trend of incidents" emerged since 
            the third quarter of 2025, including "several major" 
            incidents in the last few weeks, according to an internal 
            document obtained by Business Insider. At least one of 
            those disruptions were tied to Amazon's AI coding assistant 
            Q, while others exposed deeper issues, another internal 
            document explained.
            
            Problems included what he described as "high blast radius 
            changes," where software updates propagated broadly because 
            control planes lacked suitable safeguards. (A control plane 
            guides how data flows across a computer network).
          
          It appears to me that "Amazon's SVP of e-commerce services" desires producing working code and has identified the ramifications of not producing same.

          • simonw 16 days ago

            That's why I'm writing a guide about how to use this stuff to produce good code.

            • AdieuToLogic 14 days ago

              > That's why I'm writing a guide about how to use this stuff to produce good code.

              Consider the halting problem[0]:

                In computability theory, the halting problem is the problem
                of determining, from a description of an arbitrary computer
                program and an input, whether the program will finish
                running, or continue to run forever. The halting problem is
                undecidable, meaning that no general algorithm exists that
                solves the halting problem for all possible program–input
                pairs.
              
              Essentially, it identifies that mathematics cannot prove an arbitrary program will or will not terminate based on the input given to it. So if math cannot express a solution to this conundrum, how can any mathematical algorithm generate solutions to arbitrary problems which can be trusted to complete (a.k.a. "halt")?

              Put another way, we have all known "1 + 2 = 3" since elementary school; it is basic math everyone is assumed to know.

              Imagine an environment where "1 + 2" 99% of the time results in "3", but may throw a `DivisionByZeroException`, return NaN[1], or rewrite the equation to be "PI x r x r".

              Why would anyone trust that environment to reliably do what they instructed it to do?

              0 - https://en.wikipedia.org/wiki/Halting_problem

              1 - https://en.wikipedia.org/wiki/NaN
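
The diagonal argument behind the halting problem quoted above can be sketched in code. This is a proof sketch, not a runnable program: `halts` is the hypothetical total decider that the theorem shows cannot exist.

```python
def halts(program, data):
    """Hypothetical total decider: True iff program(data) terminates.
    The theorem says no such function can be written."""
    ...

def paradox(program):
    # Do the opposite of whatever halts() predicts about us.
    if halts(program, program):
        while True:        # halts() said "terminates", so loop forever
            pass
    return "halted"        # halts() said "loops forever", so halt

# paradox(paradox) terminates exactly when halts(paradox, paradox) says
# it doesn't: a contradiction, so halts() cannot exist.
```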

              • simonw 13 days ago

                I find the challenge of using LLMs to usefully write software despite their non-deterministic nature to be interesting and deserving of study.

                • AdieuToLogic 13 days ago

                  I get the appeal and respect the study you are engaging.

                  A meta-question I posit: at what point does the investment in trying to get "LLMs to usefully write software despite their non-deterministic nature" become greater than solving the problems at hand without those tools?

                  For the purpose of the aforementioned, please assume commercial use as opposed to academic research.
