Ask HN: Line by Line Agentic Coding
I've been suckered into running claude code and codex independently and vaguely scanning the output before moving on. I don't love the code-base familiarity that produces.
Anybody figure out a neat harness setup where one goes file by file, method by method with the agent? Looking for suggestions and what the SOTA is for this sort of thing. You could use a separate project, which sounds nightmaroish given the ecosystem evolving so fast and devs that are let the LLM make beheaded chickens seemm orderly and rational. Or you can use one of the two to create a common source of context that includes code base mapping and depending on your mileage vectorization of the code itself, which I implement as an init command in oopencode as a first step with a project. The skills specs are universal, agents are too, the rest I am not 100% sure but its mostly just markdown descriptions of things until you get into advanced use cases with SDKs so you can set up a skill or agent using a skill+tool to go "file by file, method by method" then write a bash script or find a bloated multimodel harness to have each use it for your purposes. I like to use Pi (https://pi.dev/), and I recently got it to make an approval extension for itself. It has a lot of documentation built-in for the agent to modify the behavior of the app. I got it to display all proposed file change diffs and bash commands and made it so I can either approve the action or deny it with a message for it. It was surprisingly easy to tell it to modify things things the diff viewing algorithm or syntax highlighting for the diffs. For context, here is the extension it made me: https://gist.github.com/tripplyons/ec953181707b6813d4be9e934... haha i just downloaded pi after browsing HN for a bit. i'm really trying to get off the anthropic train before the subsidies explode and we're left holding onto codebases we don't understand and can't afford. What models do you use for the diff level edits? My concerns would be speed, though I'm coming from letting opus ruminate for minutes at a time so my heuristics may be off. I use a $200/mo OpenAI Codex sub. Throughout the workday I run an average of 2 concurrent agents of GPT 5.5 with high reasoning on fast mode and use less than half of my subscription usage. For more interactive/active usage you might be better off using the low reasoning level, but I have usually found high to be a good balance of intelligence and generation speed.