There’s a common belief in the programming world: LLMs collapse once a project gets too big. They’re fine for small snippets, bug fixes, or a single file. But when the codebase grows beyond a few thousand lines, the story goes, they lose track of context and start generating messy, conflicting code.
I wanted to test if that’s really true.
Today, I’m already at 10,000+ lines of code in my current project — a system with fairly complex logic and multiple interconnected modules — and the workflow is still holding up. I can make changes, extend functionality, and refactor without everything falling apart. It’s slower than “just letting the LLM code,” but it works.
Here’s the process that makes it possible.
Why the Naïve Approach Fails
My first experiments were chaotic. I tried dumping the entire design document into an LLM and asking it to “just build the system.”
It worked fine for a while, but every change introduced new bugs. When the codebase is generated in one big pass, there’s no structure, no traceability, and no way to reason about the bigger picture. After a few iterations, the project was harder to fix than to restart.
That experience made something very clear: if you want to use an LLM on large projects, you need a structured workflow.
The Workflow That Scales
Here’s the approach I now follow for larger projects:
1. Design First, Generate Second
I hand-write a DESIGN document that explains features, requirements, and goals.
The LLM is then used to fill in details or translate the design into implementation notes, not to invent the project from scratch.
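To give a feel for the level of detail, here is a trimmed DESIGN entry; the feature itself is made up for illustration:

# Feature: session recording
- Goal: every user action can be replayed later for debugging
- Requirements: recording must not noticeably slow down normal use
- Out of scope: sharing recordings between users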
2. Architecture as a Living Document
Early on, I generate an ARCHITECTURE file with data structures, data flow, and module responsibilities.
Before any code is written, I review this plan. It’s my way of seeing how the LLM intends to solve the problem. As code evolves, this architecture doc evolves with it — staying in sync is critical.
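The ARCHITECTURE file is just as plain; a shortened, hypothetical excerpt:

# Modules
- core/: main loop and event dispatch, knows nothing about concrete entities
- entities/: data structures only, always created through the factory
- io/: loading and saving, all paths come from the resource config
# Data flow
- io loads configs -> factory builds entities -> core updates them each tick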
3. Small, Independent Iterations
Each new feature starts with: pick one task, propose an implementation plan, review it, then code. Every coding stage must leave the project in a working state. This prevents accumulation of broken, half-integrated code.
4. Systematic Review and Correction
After generation, I review the code, run it, and log all corrections.
Each correction is added to a CODING-GUIDELINES.md file — a rulebook that grows over time.
Example: “Never hardcode resource paths” or “Always register entities through the factory.”
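To make rules like these concrete, here is a minimal sketch of what following them can look like in code. The EntityFactory and resources.json names are invented for illustration, not taken from my project:

import json
from pathlib import Path

class EntityFactory:
    """Single registry for entity types; nothing instantiates entities directly."""
    _registry = {}

    @classmethod
    def register(cls, name):
        def decorator(entity_cls):
            cls._registry[name] = entity_cls
            return entity_cls
        return decorator

    @classmethod
    def create(cls, name, **kwargs):
        return cls._registry[name](**kwargs)

def load_resources(config_path="resources.json"):
    # Rule: "Never hardcode resource paths" -> every concrete path lives in one config file.
    return json.loads(Path(config_path).read_text())

@EntityFactory.register("enemy")
class Enemy:
    def __init__(self, sprite):
        self.sprite = sprite

# Rule: "Always register entities through the factory."
resources = load_resources()
enemy = EntityFactory.create("enemy", sprite=resources["sprites"]["enemy"])

The pattern itself isn't the point; the point is that once a correction is written down like this, the LLM stops reinventing one-off variants of it.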
5. Optimize and Refactor Frequently
I don’t wait until the end to clean things up. I regularly use the LLM to refactor, deduplicate, and minimize the code.
This ensures the project stays readable and reduces entropy as it scales.
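A made-up but typical example of what that deduplication looks like: the LLM tends to copy-paste near-identical loaders across features, and a refactoring pass folds them into one.

import json
from dataclasses import dataclass
from pathlib import Path

@dataclass
class Player:
    hp: int
    speed: float

@dataclass
class Enemy:
    hp: int
    speed: float

# Before: one copy-pasted loader per entity type
def load_player(path):
    data = json.loads(Path(path).read_text())
    return Player(hp=data["hp"], speed=data["speed"])

def load_enemy(path):
    data = json.loads(Path(path).read_text())
    return Enemy(hp=data["hp"], speed=data["speed"])

# After: one generic loader, the duplicates get deleted
def load_entity(entity_cls, path):
    data = json.loads(Path(path).read_text())
    return entity_cls(hp=data["hp"], speed=data["speed"])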
6. Documentation and Code Stay in Lockstep
Every time something changes, I update the DESIGN and ARCHITECTURE docs.
These docs act as a shared map for me and the LLM. Without them, the tool will eventually drift.
7. Commit Often
Each session ends with a commit. This enforces discipline and provides checkpoints I can roll back to.
Why This Works
This workflow turns the LLM into a tool for code generation and transformation, not a co-developer.
- The human drives the direction (design, goals, architecture decisions).
- The LLM accelerates execution (drafting boilerplate, suggesting structures, refactoring code).
- The docs ensure alignment between intent and implementation.
Instead of collapsing at 10k+ lines, the system remains navigable, adaptable, and consistent. The LLM never “remembers” the entire codebase — but it doesn’t need to. By feeding it the right docs and context at each step, I give it exactly what it needs to operate.
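As a sketch of what "the right docs and context" means in practice, every prompt is basically the shared docs plus the few files the task touches. The build_prompt helper and the file names below are illustrative, not a tool I actually ship:

from pathlib import Path

def build_prompt(task: str, module_paths: list[str], include_guidelines: bool = False) -> str:
    """Concatenate the shared docs, the touched modules, and the task into one prompt."""
    docs = ["DESIGN.md", "ARCHITECTURE.md"]
    if include_guidelines:
        # only at the start and end of a feature, see the guideline workflow below
        docs.append("CODING-GUIDELINES.md")
    parts = [f"## {name}\n{Path(name).read_text()}" for name in docs]
    parts += [f"## {path}\n{Path(path).read_text()}" for path in module_paths]
    parts.append(f"## Task\n{task}")
    return "\n\n".join(parts)

# Example: one feature, only the modules it touches
prompt = build_prompt("Add undo support to the editor", ["editor/commands.py"])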
Lessons for Anyone Using LLMs on Bigger Projects
- Don’t skip documentation. DESIGN and ARCHITECTURE docs are not overhead; they’re the glue that holds everything together.
- Iterate in working states. Broken intermediate stages pile up into chaos.
- Treat the LLM as a coding tool, not a teammate. It’s not reasoning about the system — it’s transforming text based on your instructions.
- Start a fresh context often: the model gets slower as the context grows and eventually starts hallucinating.
- Write code yourself to stay in touch with the codebase: implement small features and fixes by hand.
- Codify corrections. Every mistake is a chance to write a new rule in your guideline file. Over time, the LLM improves because you constrain it better.
- Consistency beats speed. Yes, it’s slower than “just letting it code.” But that’s why the project is still alive at 10k+ lines, instead of abandoned after the first big refactor.
Example CODING-GUIDELINES.md Workflow
I’m currently using Claude 3.7 Sonnet together with the Windsurf editor, and here’s how I integrate the guideline file:
- I don’t feed the guideline file to the LLM on every single prompt. Instead, I reference it at the start of implementing a new feature and again at the end.
- This avoids a common pitfall: if the LLM sees all refactoring rules upfront, it may start “cleaning up” code before the feature is even complete. By holding back, I keep generation focused on building first and refactoring later.
- All prompts for a given feature are run within a single context window so the LLM “remembers” what changes it just made. This prevents it from undoing or re-introducing old patterns.
A generalized excerpt from my guideline file looks like this:
# Beginning of coding prompt
Implement feature X
- refer to ARCHITECTURE and DESIGN for reference
- check for any existing code that can be used
- stop after each working feature milestone for testing, provide clear testing instructions
# Follow-up prompt 1
- move all new code that can be shared to [utilities directory]
- move all new data structures to [structures directory]
- check if you can do something simpler
- check if function/classes names reflect their purpose
# Follow-up prompt 2
- refactor code, do minimal changes
- reduce code
- update ARCHITECTURE and DESIGN if there were changes in the logic or architectural changes
Good luck