Pull requests are dead, long live pull requests

These days when I’m reviewing a pull request it feels like I’m talking to an AI bot, because I am. When I ask questions like “Why did you decide to do it this way?” I usually get a response (from a human) that has all the tells of AI-speak. But I’m not mad at the human, I’m mad at my tools.

I can’t ask the human “Why did you decide to use this library?” because the human didn’t make that decision, the AI did. So the human is going to turn around and ask the AI, “Why did you decide to use this library?” And then paste the answer to me. Pre-AI, you asked the developer, not their manager, questions during code review, so why are we asking the manager now? Let’s cut out the middleman.

All the authors of the code should be involved in the review. What I don’t want is to ask a question, and then an AI agent is spun up to try to reverse-engineer the reason something was done. That’s completely wrong. When we ask a human why, we are interrogating their internal thought process; we need information that was not included with the pull request.

With AI, that calculus has changed.

An AI’s chain-of-thought is externally auditable. Your interactions with the AI are externally auditable. Let’s get that context into the review and into source control.

Should we commit every token produced in the making of a pull request? Absolutely not. There’s too much noise. Instead, we should take inspiration from Random Labs’ idea of episodes where agents return transcripts of successful tool calls and conclusions.

Alongside the code involved in a pull request needs to be a transcript of decisions made and why (or why not). This scales with agent swarms: a pre-pull-request step shall concatenate these decision logs and perform some final formatting.

Inline comments in code are okay but they are problematic for a few reasons:

Changes to source files, even for comments generally requires a re-run of a build pipeline: linting, compilation, and tests
These transcripts are going to be way larger than is generally acceptable for an inline comment
Comments generally talk about how code works now and not all the decisions that went into it.

Code reviews need to be wildly different than before.

When reviewers make comments, the developer’s AI needs to receive them directly and be allowed to decide whether to take action in the form of patches and replies. This is possible now with with a combination of GitHub/GitLab webhooks and their associated MCP servers. The developer can configure their AI to ask for permission first or to go ahead and do what they think is right. Very similar to what happens with Devin AI or Claude Code via GitLab Duo. The (human) developer still has the opportunity to make their own comments and changes. This will greatly reduce the time from a reviewer asking a question to getting results. How often have you made a comment that wasn’t addressed until hours or days later? Or not at all?

For reviewers, they need to be able to interrogate pull requests in the same way they interrogate codebases. Context is king, so the decision log being checked in alongside the code is vital for this purpose.

What I mean is that I need to be able to cleanly interact with a pull request in the exact same way I do now, plus I need to have a window alongside it for interacting with Codex or Claude in the same way as it’s integrated into an IDE for development. I’m not sure there’s a clean tool for this just yet, so I’m begging someone to make one. I’m sick of creating a new git branch just to squash the developer’s branch into it just so I can ask the AI its opinion.

I want to be able to privately go back and forth with AI asking questions about best practices, unknown libraries, and unfamiliar languages in the diff before I decide a code review comment is warranted.

And if a comment is warranted, then this could indicate a gap in the decision log that should be updated before the pull request is accepted.

With seamless AI integration into the review process, it should be much easier for reviewers to go so far as to suggest a straight-up patch.

And while I’m on my soapbox, please add all our design documents/decision records, architectural diagrams to source control in an LLM parseable format like Markdown or Mermaid.

Context is king.

Edit 3/26/2025

Many readers have the impression that I’m condoning or outright advocating that we delegate our entire thinking to the machine, and that’s taking things quite a bit too far. If the AI decided to accomplish a task in a certain way, and the developer carefully read through and verified that it did what they needed (with all the associated considerations for performance, security, …), then did they even need to ask themselves “Why this way instead of some other way?” Maybe. Maybe not. It gets the job done up to our standards. If a reviewer then asks “Why this approach?” Then it’s really a question for the AI, isn’t it? It wasn’t “What does this code do?”, which the developer should be able to answer.

“Why this library?” “Because it gets the job done?” Maybe the reviewer (me) should have been more specific, like “This library introduces a new third party dependency and we already can achieve something equivalent with <other thing>”, but wanting to know someone’s reasons for doing something before implying they might have done something wrong is just good manners.