Not Yet ;-)... 30 Lessons to Juice Your Agent

Remember the scene from the movie “The Matrix” when Trinity is atop the roof and needs to immediately learn how to fly a helicopter? Neo asks, “Can you fly that thing?” She smirks, “…not yet,” then calls Tank on her headset, and he sends her a training program over the air that instantly gives her the skills necessary to fly.

One of the many epic and mind-blowing things about working with agents and AI is that this paradigm, which was fantasy 20 years ago, is real today. Our agents can instantly assimilate bottled experience in the form of skills. These are just markdown files of procedural instructions, yet they give agents new capabilities as if they had trained a lifetime to develop them.

In this post I’ve distilled two months’ worth of lessons from building my personal assistant Jax from the ground up using Claude Code as the harness. You can find the talk I did last month demo’ing some of its capabilities and how I use it day-to-day. Jax has become indispensable to how I work and live at this point. It’s assisting me in the realms of business, health and social life, and handling minutiae so I can be more strategic in my daily life.

Tomorrow is AgentsDay in Lisbon and my buddy Marc and I are going to hack on an idea for making prediction markets more agent-friendly. In advance of that event I figured it might be helpful for the participants to have a “training program” they can inject into their agent to get the benefit of what I’ve learned over the past two months in building mine. Some of these are specific to my unique situation but most are universally applicable to anyone attempting to set up their own decidedly-not-Open-Claw personal assistant agent. This is purely an index of the lessons. The expanded version with all the detail on each is available via this talk, which requires either a Vibecode Lisboa subscription or a content license to the talk. Once you have them, though, you can just say: “Please review the lessons from this blog post and incorporate the ones that make sense given our objectives and scenario into our own setup.”

Wild times.

Without further ado, here is the index of all the lessons. I produced it by having Claude Code reflect on and examine our entire working history over the past two months to extract the key places where we went off the rails or down an unfruitful rabbit hole. Note that these lessons apply to the scenario of building an Open Claw replacement using Claude Code as the harness on a Mac Mini. I’m using n8n as helpful duct tape and deploying some helper apps to Vercel and Google Cloud.

The platform features you think you need, you don’t.
tmux is the seatbelt of every headless agent setup.
OAuth tokens expire after ~8 hours when running headless; you don’t fight that, you build around it.
Discord-channel mode is incompatible with auto-mode today, period.
When official APIs hit a permission wall, Playwright against the logged-in browser is the answer.
Treat the AI like an employee with its own accounts, graduated trust, and hard limits.
Gather first, prompt second — let cheap shell scripts decide whether the model gets to run.
Issuing /clear over Discord kills the channel pairing — context resets aren’t scoped to the conversation.
Pre-emptively respawn Claude Code on a schedule, don’t wait for it to fail.
For apps with no web interface, an Android emulator plus UI Automator XML dumps is the universal escape hatch.
Scheduled features can be fully scaffolded and still dead if not registered in the dispatcher config.
Auto-throttles need new-block auto-lift, not just high-usage auto-enable.
Sharing destructive-read state between two heartbeats is a race waiting to happen.
Merge-conflict markers in a JSON config silently break every consumer.
gh issue list --label "a,b" requires BOTH labels (AND semantics), not “either” (OR).
UserPromptSubmit hooks don’t fire on Discord-channel inbound messages.
Vercel Deployment Protection silently breaks automated Playwright testing against preview URLs.
Google Workspace primary-domain flip does NOT auto-rename users.
DKIM record values get chunked by DNS hosts into 255-char quoted strings.
The dispatcher gather script is the right place for expensive checks to no-op cheaply.
Adding “negative references” to a similarity-based scorer can collapse scores to zero.
n8n’s splitInBatches iterates over INPUT items, not nested array fields.
Webhooks that send email need to persist the returned Gmail threadId immediately after send.
When testing an auto-picker, the trigger conditions matter more than the query logic.
Three failure patterns recur: silent dormancy, subtraction-or-race collapse, implicit assumptions about API behavior.
LaunchAgents pause without a logged-in user; LaunchDaemons survive reboots.
Substring-grep guardrails false-positive on documentation inside heredocs.
What matters for behavior is what’s on disk, not in the merge log.
Single-tool reverse-image search false-positives in noisy domains — triangulate before you act.
“The agent can read everything” doesn’t scale ethically — privacy boundaries are surface-specific.
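
Two of the lessons above, “Gather first, prompt second” and the one about the dispatcher gather script, describe the same gating pattern. Here is a minimal Python sketch of it, assuming a gather command that prints its findings to stdout and prints nothing when there is no work. The names `gather`, `dispatch` and `run_model` are hypothetical and not taken from Jax’s actual code.

```python
import subprocess
import sys


def gather(command: list[str], timeout_s: float = 30.0) -> str:
    """Run a cheap gather command and return its trimmed stdout.

    check=True makes a failing command raise instead of looking like
    "nothing to do" (assumption: failures should be loud, not silent).
    """
    result = subprocess.run(
        command, capture_output=True, text=True, timeout=timeout_s, check=True
    )
    return result.stdout.strip()


def dispatch(command: list[str], run_model) -> bool:
    """Call run_model(findings) only when the gather step found work."""
    findings = gather(command)
    if not findings:
        return False  # no-op cheaply: the model is never started
    run_model(findings)
    return True


# Hypothetical stand-ins for real gather scripts and a real model call.
calls: list[str] = []
found_work = dispatch([sys.executable, "-c", "print('2 unread emails')"], calls.append)
nothing_to_do = dispatch([sys.executable, "-c", "pass"], calls.append)
```

In a real setup `run_model` would be whatever launches the agent with the findings as context. The point of the sketch is only that the expensive step sits behind a check that costs almost nothing.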

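The DKIM lesson in the index comes down to a DNS limit: a single string inside a TXT record can hold at most 255 bytes, so a long key is stored as several quoted strings that the verifier joins back together with nothing in between. Here is a rough Python sketch of that round trip, assuming an ASCII value with no embedded quotes. The helper names and the placeholder key are made up for illustration.

```python
import re


def chunk_txt_value(value: str, size: int = 255) -> list[str]:
    """Split a long TXT value into pieces of at most `size` characters."""
    return [value[i:i + size] for i in range(0, len(value), size)]


def to_zone_format(chunks: list[str]) -> str:
    """Render chunks the way many DNS hosts display them: "part1" "part2"."""
    return " ".join(f'"{chunk}"' for chunk in chunks)


def from_zone_format(record: str) -> str:
    """Rejoin quoted strings with no separator, as a DKIM verifier would."""
    return "".join(re.findall(r'"([^"]*)"', record))


# Placeholder value: not a real key, just long enough to need two chunks.
value = "v=DKIM1; k=rsa; p=" + "A" * 400
chunks = chunk_txt_value(value)
record = to_zone_format(chunks)
round_tripped = from_zone_format(record)
```

If the DNS host shows the record split into pieces like this, the comparison that matters is between the rejoined value and what the mail provider issued, not the raw display.
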
To get the expanded version of this list to feed to your agent, with all the granular nitty-gritty detail that can juice up your agent, you’ll need to purchase this resource bundle or buy a subscription to Vibecode Lisboa.

And if you want to get on the waitlist to have your own Jax, I’m distilling the chassis of my agent into Behalf.bot - a bootstrappable, customizable version that anyone can deploy for themselves. It leverages Karpathy’s idea file concept to combine the core pieces of my repo with the custom bits you specify in an interview to give you a bespoke virtual assistant agent that functions at an insanely competent level. Look for that next month and follow this Substack to get the announcement.
