Building a coding agent in Swift from scratch

98 points by vanyaland 3 months ago · 29 comments

I think this is a good learning project, based on a long perusal of the GitHub repo. One suggestion: don’t call the CLI component of the project ‘claude’ - that seems like asking for legal takedown problems.

vanyalandOP 3 months ago

Good point, I'll rename the binary. Thanks for actually going through the repo.

bensyverson 3 months ago

I built a Swift library called Operator [0] to run the core agent loop, if it would save anyone time.

[0]: https://github.com/bensyverson/Operator

scuff3d 3 months ago

I'm reading the first of the blog posts. I've never actually seen any Swift code before, but looking at the package definition I'm struck by how much it looks like Zig. I've never heard Andrew Kelley call Swift out as an influence, but it seems some Swift DNA is in Zig.

Also, brave calling it swift-claude-code given Anthropic's behavior.

vanyalandOP 3 months ago

Interesting observation on package definitions. Languages borrow good ideas from each other all the time, and the ecosystem is better for it.
On the naming: fair point, I already renamed the CLI binary after an earlier comment here. The repo name is more about discoverability.
- scuff3d 3 months ago
  
  Agreed. I wasn't saying it's a bad thing, just interesting.
  The thing that really stuck out to me was the dot syntax: `.SomeVariable`. I'm guessing those are enum accesses where the compiler can figure out what it is from context? It's all over Zig, and it seriously screws with me lol. I know it's a me problem, but I can never keep straight what's what.
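For readers who have not seen the leading-dot syntax before: in Swift it is called an implicit member expression, and it works for static members as well as enum cases, which is why it shows up so often in package definitions. A minimal sketch, assuming made-up names (`ToolStatus`, `Dependency`, and `describe` are illustrative and not from the repo):

```swift
// Hypothetical enum, for illustration only.
enum ToolStatus {
    case pending
    case running
    case finished
}

// The parameter type tells the compiler which type `.running` belongs to.
func describe(_ status: ToolStatus) -> String {
    switch status {
    case .pending: return "not started"
    case .running: return "in progress"
    case .finished: return "done"
    }
}

let explicit = describe(ToolStatus.running) // fully qualified
let inferred = describe(.running)           // same value, type inferred from context

// The same shorthand works for static members, not only enum cases.
struct Dependency {
    let name: String
    static func package(_ name: String) -> Dependency { Dependency(name: name) }
}

// The array's element type lets `.package(...)` resolve to Dependency.package.
let deps: [Dependency] = [.package("example")]
```

The shorthand only compiles when the expected type is known at that position, so the type being abbreviated can always be found from the surrounding declaration.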

dostick 3 months ago

It’s not quite clear what this project is: there’s no single “Claude Code” program. There’s a TUI/GUI app, a harness, prompts, and an LLM. So this is the harness part?

vanyalandOP 3 months ago

It's the harness/orchestration layer — the part that runs the agent loop, dispatches tool calls, and manages context.
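As a rough illustration of what such a layer does, here is a minimal sketch of an agent loop, not the project's actual code. Every name in it (`ModelReply`, `Harness`, `stubModel`) is hypothetical, and the model is replaced by a stub closure so the example runs offline:

```swift
// All names here are hypothetical; the model is stubbed so the sketch runs offline.
enum ModelReply {
    case toolCall(name: String, input: String)
    case answer(String)
}

struct Harness {
    var history: [String] = []
    let tools: [String: (String) -> String]
    let model: ([String]) -> ModelReply

    // Runs the loop: ask the model, dispatch any tool call, feed the result back.
    mutating func run(_ prompt: String, maxTurns: Int = 8) -> String {
        history.append("user: \(prompt)")
        for _ in 0..<maxTurns {
            switch model(history) {
            case .answer(let text):
                history.append("assistant: \(text)")
                return text
            case .toolCall(let name, let input):
                // Look up the tool by name; unknown tools become an error message in context.
                let output = tools[name]?(input) ?? "error: unknown tool \(name)"
                history.append("tool \(name): \(output)")
            }
        }
        return "stopped: turn limit reached"
    }
}

// Stub model: requests one tool call, then answers once a tool result is in the history.
let stubModel: ([String]) -> ModelReply = { history in
    if let last = history.last, last.hasPrefix("tool ") {
        return .answer("done after \(history.count) messages")
    }
    return .toolCall(name: "echo", input: "hello")
}

var harness = Harness(tools: ["echo": { $0.uppercased() }], model: stubModel)
let answer = harness.run("say hello")
```

A real harness would call an LLM API where `stubModel` sits and would use structured messages rather than plain strings, but the loop shape (ask, dispatch, append, repeat until an answer or a turn limit) stays the same.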

nhubbard 3 months ago

How practically could we drop in Apple Intelligence once it's using Gemini as its core for a 100% local AI agent in a box?

NitpickLawyer 3 months ago

IIUC Gemini will run in Apple's cloud infra, not on device. The only "gemini" local model is really old by today's standards, and is not that smart for local inference (newer open source models are better).
- nhubbard 3 months ago
  
  That's what I figured. Some day eventually it will be possible. Until then, it's only LM Studio or Ollama as a potential hookup.
  I've got some ideas inspired by this project. It's promising.

faangguyindia 3 months ago

I built my agent in Python, since the agent is a CLI.

I used Python + rich, but window resizing wrecks the UI layout.

This isn't an issue with Node.js-based stuff.

lm2s 3 months ago

Interesting, I'm also building one in Swift :D Seems like a good learning experience.

podlp 3 months ago

I’m also working on agents in Swift with the AFM; just having it locally already installed is a huge selling point. I think narrowly-focused agents with good tooling and architecture could accomplish quite a bit, with tradeoffs in speed and cost. But I’m under the assumption that local models (like frontier models) will only get better with time.

zingar 3 months ago

What is the appeal of swift for this project? Is it just what you know?

maxbeech 3 months ago

the interesting design tension i ran into building in this space is context management for longer sessions. the model accumulates tool call history that degrades output quality well before you hit the hard context limit - you start seeing "let me check that again" loops and increasingly hedged tool selection.

a few things that helped: (1) summarizing completed sub-task outputs into a compact working-memory block that replaces the full tool call history, (2) being aggressive about dropping intermediate file read results once the relevant information has been extracted, and (3) structuring the initial system prompt so the model has a clear mental model of what "done" looks like before it starts exploring.

the swift angle is actually a nice fit - the structured concurrency model maps well to the agent loop, and the strong type system makes tool schema definition less error-prone than JSON string wrangling in most other languages.

vanyalandOP 3 months ago

Yeah, this is basically what I ran into too. I actually wrote about this in Stage 6 (https://ivanmagda.dev/posts/s06-context-compaction/). I went with your option (1): once history crosses a token threshold, the agent asks the model to summarize everything so far, then swaps the full history for that summary. Keeps the context window clean, though you do lose the ability to go back and reference exact earlier tool outputs.
The hard part was picking when to trigger it. Too early and you're throwing away useful context. Too late and the model's already struggling. I ended up just using a simple token count — nothing clever, but it works.
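The threshold-triggered swap described above can be sketched roughly like this. It is an illustration under stated assumptions, not the code from the blog post: `Message`, `estimatedTokens`, and `compactIfNeeded` are hypothetical names, the token count is approximated as characters divided by four, and the summarizer is a stand-in closure where a real agent would ask the model:

```swift
// Hypothetical types; token counts are approximated as characters / 4.
struct Message {
    let role: String
    let text: String
}

func estimatedTokens(_ messages: [Message]) -> Int {
    messages.reduce(0) { $0 + $1.text.count / 4 }
}

// Replace the whole history with a single summary once it passes the threshold.
func compactIfNeeded(
    _ history: [Message],
    threshold: Int,
    summarize: ([Message]) -> String
) -> [Message] {
    guard estimatedTokens(history) > threshold else { return history }
    return [Message(role: "user", text: "Summary of earlier work: " + summarize(history))]
}

// Stand-in summarizer; a real agent would ask the model for the summary.
let fakeSummarize: ([Message]) -> String = { "\($0.count) messages condensed" }

// 50 tool results of 400 characters each: roughly 5,000 estimated tokens.
let long = (1...50).map { _ in Message(role: "tool", text: String(repeating: "x", count: 400)) }
let compacted = compactIfNeeded(long, threshold: 1_000, summarize: fakeSummarize)

// A short history stays untouched.
let short = [Message(role: "user", text: "hi")]
let untouched = compactIfNeeded(short, threshold: 1_000, summarize: fakeSummarize)
```

The trade-off mentioned in the comment is visible here: after compaction only the summary survives, so exact earlier tool outputs can no longer be referenced.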
And yeah, the Swift angle was genuinely fun. Defining tool schemas as Codable structs that auto-generate JSON schemas at compile time, and getting compiler errors instead of runtime API failures, is a huge win.
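To make the typed-tool idea concrete, here is a small sketch of the decoding side only. It does not show JSON schema generation, which the thread does not detail; `ReadFileInput` and `decodeInput` are hypothetical names, and the only library calls used are Foundation's `JSONDecoder` and `Data`:

```swift
import Foundation

// Hypothetical input type for a "read_file" tool.
struct ReadFileInput: Codable {
    let path: String   // required
    let maxLines: Int? // optional
}

// Decode the model's JSON arguments into a typed value; nil if they do not match.
func decodeInput<T: Decodable>(_ type: T.Type, from json: String) -> T? {
    try? JSONDecoder().decode(type, from: Data(json.utf8))
}

let good = decodeInput(ReadFileInput.self, from: #"{"path": "Package.swift", "maxLines": 20}"#)
let bad = decodeInput(ReadFileInput.self, from: #"{"maxLines": 20}"#) // missing required "path"
```

Once the input is a struct, a misspelled field in the tool implementation is a compile error, and malformed arguments from the model are rejected at the decoding boundary instead of deep inside the tool.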

dostick 3 months ago

So that’s what it is! I was wondering why reducing context and summarising still makes it make mistakes and forget the steering. And I couldn’t find an explanation for why it starts ignoring instructions when the context is not full at all. How did you find that tool call history is what degrades it? Isn’t this the biggest problem there is, and not just a “design tension”?
