We built a trading bot in two hours. Now my wife and I trade together.


My wife and I used to trade from separate Robinhood accounts. One of us would buy something, tell the other about it later, and hope we hadn't done the same thing twice. We were running two portfolios in parallel with no shared language.

Robinhood is fine for one person. It doesn't do anything for two.

So I built something that did. It took roughly two hours.

Why we built this

We had the accounts. We had the positions. What we couldn't do was look at both at once. No API to query, no shared history, no way to ask what the combined picture looked like without manually reconciling two apps on two phones. One of us would research something; the other would make a separate call the same day.

We needed an account that spoke HTTP.

Many retail traders don't know that Charles Schwab has a free, fully documented developer API. You register on their developer portal, wait a day or two for approval, and you have OAuth2 access to your own brokerage account: quotes, positions, order placement, transaction history, all of it. We transferred both our Robinhood accounts to Schwab. The whole migration took a week, including the ACATS transfer.

What we built

The trading core is about 1,500 lines of Python with zero pip dependencies. The Telegram bot that wraps it is another 1,400. Getting to a working first trade took roughly two hours.

The CLI has about twenty commands: quote, research, theme, buy, sell, pnl, audit, and more. It talks directly to Schwab's REST API, with no third-party trading library in the way.

A risk reviewer that runs before every order. Hard limits live here in Python, not in a prompt: 2× maximum leverage, $250k maximum notional per order, a 10% daily-loss kill switch, a 20% drawdown kill switch. These cannot be overridden, regardless of what the AI says.
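To make "limits in Python, not in a prompt" concrete, here is a minimal sketch of what such a check could look like. It is not the actual Reviewer: the four limit values come from the text above, while the `Snapshot` and `Verdict` types, the `review` function, and the order of the checks are assumptions made for illustration.

```python
from dataclasses import dataclass

# Limit values quoted in the article. All names below are hypothetical.
MAX_LEVERAGE = 2.0
MAX_ORDER_NOTIONAL = 250_000.0
DAILY_LOSS_KILL = 0.10
DRAWDOWN_KILL = 0.20


@dataclass(frozen=True)
class Snapshot:
    equity: float               # current account equity
    gross_exposure: float       # sum of absolute position values
    start_of_day_equity: float  # equity at the start of the trading day
    peak_equity: float          # highest equity seen so far


@dataclass(frozen=True)
class Verdict:
    approved: bool
    reason: str


def review(snapshot: Snapshot, order_notional: float) -> Verdict:
    """Check one proposed buy against every hard limit, kill switches first."""
    if snapshot.equity <= 0:
        return Verdict(False, "no equity")
    daily_loss = 1 - snapshot.equity / snapshot.start_of_day_equity
    if daily_loss >= DAILY_LOSS_KILL:
        return Verdict(False, "daily-loss kill switch")
    drawdown = 1 - snapshot.equity / snapshot.peak_equity
    if drawdown >= DRAWDOWN_KILL:
        return Verdict(False, "drawdown kill switch")
    if order_notional > MAX_ORDER_NOTIONAL:
        return Verdict(False, "order notional above cap")
    # Assumption: leverage is gross exposure after the trade divided by equity.
    post_trade_leverage = (snapshot.gross_exposure + order_notional) / snapshot.equity
    if post_trade_leverage > MAX_LEVERAGE:
        return Verdict(False, "post-trade leverage above cap")
    return Verdict(True, "ok")
```

The point of the shape is that the constants sit at module level and the function takes no argument from the model beyond the proposed order size, so nothing the AI says can widen a limit.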

A Telegram bot that wraps the CLI and serves as the shared UI. Both of us are in the same group, both can ask questions and see the same answers. Claude Sonnet does the reasoning, with a hard daily AI spend cap of $5.

OAuth tokens live in Cloudflare KV. Every order is written to a local JSONL file first, then to Cloudflare D1, so the ledger keeps every order even if the database is temporarily unreachable.
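The local-first write order can be sketched in a few lines. This is an illustration under assumptions, not the real ledger code: `record_order` is a hypothetical name, and `push_remote` stands in for whatever call writes to the remote database.

```python
import json
import os
from pathlib import Path
from typing import Callable


def record_order(order: dict, ledger_path: Path,
                 push_remote: Callable[[dict], None]) -> bool:
    """Append the order to the local JSONL file, then try the remote store.

    Returns True if the remote write also succeeded. A remote failure does
    not lose the record, because the local line is already on disk.
    """
    line = json.dumps(order, sort_keys=True)
    with ledger_path.open("a", encoding="utf-8") as f:
        f.write(line + "\n")
        f.flush()
        os.fsync(f.fileno())  # force the line to disk before going remote
    try:
        push_remote(order)  # hypothetical stand-in for the database write
    except Exception:
        return False  # caller can retry later from the local file
    return True
```

A False return means "local only for now"; a separate sync pass could replay unsent lines from the JSONL file once the database is reachable again.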

The one decision worth stealing

The Reviewer is code, not a prompt.

The dangerous failure mode in any agent system is irreversible AI action. It happens when nobody puts a real checkpoint between "recommend" and "execute." A language model confident enough to say "buy 200 shares of NVDA" should not be the one asking "but is this within our risk limits?"

So the Reviewer runs in Python before any call to Schwab's order API. It fetches the current portfolio snapshot, estimates the post-trade leverage, checks every hard limit, logs its verdict to the database, and either lets the order through or kills it with a reason. The LLM can propose. The code decides if it is safe to proceed.

Propose → Review → Confirm → Execute: the AI proposes, the code reviews, you confirm, Schwab executes.

The order flow runs one direction: Telegram, to Claude, to the Reviewer, to Schwab. The Reviewer runs before the Confirm button appears. If limits are exceeded, there is no button and no order.

The decisions that follow from it

Once the Reviewer is in place as immutable code, a few other things fall into place.

Large orders do not get rejected when they hit the per-order share cap. The system runs a full account-level risk check on the total quantity first, then splits into cap-compliant legs, then gates each leg through the Reviewer independently before submitting to Schwab. Valid large trades go through. Genuinely oversized ones do not.
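A rough sketch of that check-split-gate sequence, assuming share quantities are integers. The function names and the two callback checks are hypothetical, and stopping at the first blocked leg is an assumption; the text above does not say what happens to the remaining legs.

```python
from typing import Callable, List


def split_into_legs(total_qty: int, per_order_cap: int) -> List[int]:
    """Split a share quantity into legs that each respect the per-order cap."""
    if total_qty <= 0 or per_order_cap <= 0:
        raise ValueError("quantities must be positive")
    full_legs, remainder = divmod(total_qty, per_order_cap)
    legs = [per_order_cap] * full_legs
    if remainder:
        legs.append(remainder)
    return legs


def gate_large_order(total_qty: int, per_order_cap: int,
                     account_check: Callable[[int], bool],
                     leg_check: Callable[[int], bool]) -> List[int]:
    """Return the legs cleared for submission, or [] if the total fails."""
    # Step 1: account-level risk check on the whole quantity.
    if not account_check(total_qty):
        return []
    # Steps 2 and 3: split, then gate each leg independently.
    cleared: List[int] = []
    for leg in split_into_legs(total_qty, per_order_cap):
        if not leg_check(leg):
            break  # assumption: stop at the first blocked leg
        cleared.append(leg)
    return cleared
```

For example, `split_into_legs(1200, 500)` yields legs of 500, 500, and 200 shares.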

There is a conviction sub-portfolio: a logical higher-risk book inside the same account, with its own wider per-order and concentration limits. The account-wide ruin guards still apply to every order placed against it: leverage ceiling, kill switches, buying power check. Same floor on the dangerous things, broader ceiling on the rest.

Exit rules run automatically: a hard stop at 8%, a trailing stop, a time stop after five days, and a take-profit at 15%. They are deterministic code, with no LLM in the loop. New exposure is the opposite: it always waits for a Confirm tap in Telegram before anything goes to Schwab. The asymmetry is intentional. An automatic stop is fine. An automatic entry is not.
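Deterministic exits like these fit in one pure function. The sketch below is illustrative only: the 8%, 15%, and five-day values come from the text, but the 5% trailing distance, the use of calendar days, the rule priority, and every name here are assumptions.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional

HARD_STOP = 0.08      # from the article
TAKE_PROFIT = 0.15    # from the article
TIME_STOP_DAYS = 5    # from the article; calendar days assumed
TRAILING_STOP = 0.05  # assumption: the article gives no trailing distance


@dataclass(frozen=True)
class Position:
    entry_price: float
    high_since_entry: float
    opened: date


def exit_reason(pos: Position, price: float, today: date) -> Optional[str]:
    """Return the first exit rule that fires for a long position, else None."""
    change = price / pos.entry_price - 1
    if change <= -HARD_STOP:
        return "hard_stop"
    if change >= TAKE_PROFIT:
        return "take_profit"
    if price <= pos.high_since_entry * (1 - TRAILING_STOP):
        return "trailing_stop"
    if (today - pos.opened).days >= TIME_STOP_DAYS:
        return "time_stop"
    return None
```

Because the function depends only on the position, the price, and the date, the same inputs always give the same answer, which is what makes it safe to run without a human tap.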

The research prompts can improve over time. But there is a written contract in the codebase: prompts can change, soft parameters can tighten. The Reviewer, the Trader code, and every hard limit cannot. Any proposed change needs passing tests and human sign-off. One-tap rollback if something does not work out.

Every Schwab API call and AI call is logged by role, with a daily cap on AI spend. The costs command shows exactly what the week's research cost, broken down by who asked for what. If the cap is hit, AI spend stops. The money the system costs is as visible as the money it manages.
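One way to picture the spend cap is a small ledger keyed by day and role. This is a hedged sketch, not the real costs command: the $5 figure comes from earlier in the post, while `SpendLedger`, its methods, and the idea of checking an estimated cost before the call are assumptions.

```python
from collections import defaultdict
from datetime import date
from typing import Dict, Tuple

DAILY_AI_CAP_USD = 5.00  # daily cap quoted in the article


class SpendLedger:
    """Track AI spend per (day, role) and refuse charges past the daily cap."""

    def __init__(self, cap_usd: float = DAILY_AI_CAP_USD) -> None:
        self.cap_usd = cap_usd
        self._spend: Dict[Tuple[date, str], float] = defaultdict(float)

    def spent_on(self, day: date) -> float:
        """Total spend across all roles for one day."""
        return sum(v for (d, _), v in self._spend.items() if d == day)

    def try_charge(self, day: date, role: str, cost_usd: float) -> bool:
        """Record the cost and return True, or refuse if it would breach the cap."""
        if self.spent_on(day) + cost_usd > self.cap_usd:
            return False
        self._spend[(day, role)] += cost_usd
        return True

    def breakdown(self, day: date) -> Dict[str, float]:
        """Spend per role for one day, the raw material for a costs report."""
        return {r: v for (d, r), v in self._spend.items() if d == day}
```

A caller would check `try_charge` before making the model call and skip the call on False, so hitting the cap stops spend instead of merely reporting it.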

A decision not to trade is logged to the same ledger as the orders that went through. Every pass, every rejected thesis, every Reviewer block has a record. You can ask why the system did not trade just as easily as why it did.

How we use it

We have a shared Telegram group: me, my wife, and the bot.

Telegram group conversation showing a $30K NBIS buy request, the bot's position sizing and heads-up, the Reviewer aborting the first attempt, a retry that goes through, and a META research brief.
The bot breaks down position size, flags concentration risk, and shows its reasoning before asking to confirm. The first NBIS attempt was blocked by the Reviewer and resubmitted. The META brief at the bottom came from the model's stored knowledge; the Schwab token had expired and live data was unavailable. Numbers are illustrative; this is a sandbox environment.

She'll message the group asking about a stock. The bot fetches a brief: price, analyst targets, thematic context, risk flags. Both of us read the same answer in the same thread. We talk about it. If we want to do something, she proposes a trade. The bot sizes it, runs the Reviewer, and shows the verdict before anything is sent to Schwab.

Telegram group showing order confirmation for 1/1 META 17 submitted, then a pending orders table with 39 META shares and 5 NVDA shares queued for Monday's open.
After confirmation, the bot shows what went in and checks back on open orders. Here, four orders are queued for Monday's open because the market was closed. Both of us see the same status. Numbers are illustrative; this is a sandbox environment.

She taps Confirm. The order goes to Schwab. Both of us see the fill confirmation in the group. Both of us can check P&L and see the same picture.

The money conversation stopped being a black box.

What changed

Not the returns. I want to be honest about that. I have no idea whether this makes us better investors. Markets are hard. Anyone who says their system reliably beats them is either lying or has not run it long enough.

Before, investing was something one of us did and mentioned to the other after the fact. Now it's a shared conversation. She asks a question, I read the same answer, we disagree on something, we ask a follow-up, and we decide together or not at all.

The AI is the third voice in that conversation. It does the research, runs the numbers, surfaces what the Reviewer raises. We make the call.

That was worth the two hours.

What's live now

The system didn't stop after this post went up. Here's what runs today, on top of the Reviewer and the Telegram flow above.

  • An autopilot loop. A scheduled cycle watches the market, the positions, and the account state. Deterministic exits (hard stop, trailing stop, time stop, take-profit) fire with no model involved. New ideas go through a Scanner, then a Strategist that proposes, then a Critic that only ever sees the proposal and the market data, never the Strategist's reasoning, and tries to kill it. Anything both of them clear still needs your tap before it touches the account.
  • Trade plans, not just single orders. Ask for something like "split 10K across four chip names" and the bot builds one plan across all four legs, dry-runs the Reviewer on each, and shows you the whole thing before a single share moves. One Confirm executes every leg in sequence. Plans expire on their own: five minutes for a Telegram request, thirty for an autopilot entry. Neither ever fires on timeout.
  • Long-term memory, in two tiers. Every decision and event gets written to an append-only log, no exceptions. A curator only promotes something to a lesson after it has shown up in at least three similar cases, and it expires after 45 days unless it holds up again. A smaller layer of consolidated beliefs merges duplicates, treats a contradiction differently depending on the market regime instead of overwriting it, and fades with time. Ask /why NVDA in the group and it pulls its own decision history on that name.
  • A self-review that can't rewrite history. Every entry locks its thesis, invalidation point, and conviction before the outcome exists, in a log with no edit function. The post-trade grade scores the decision and the outcome separately, so a bad call that got lucky and a good call that got unlucky land in different buckets instead of both counting as a plain win or loss. A calibration report checks whether the times it said 70% confident actually won about 70% of the time.
  • Prompts and settings that can propose their own changes. The Strategist and Critic prompts, and seventeen soft parameters, can be proposed for change by the system itself, against a written contract listing exactly what's evolvable and what never is: the Reviewer, the Trader, any hard limit. A candidate runs against real historical cases in shadow first, with no money and no live decision affected. A human still has to tap approve, and there's a one-command rollback to any earlier version.
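The calibration check from the self-review item above reduces to grouping past calls by stated confidence and comparing each group's win rate to that confidence. A minimal sketch, assuming each record is a (stated confidence, won) pair and that ten-point buckets are wide enough; the function name and record shape are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, Tuple


def calibration_report(
    records: List[Tuple[float, bool]],
) -> Dict[int, Tuple[int, float]]:
    """Map each confidence bucket (floor, in percent) to (count, win rate)."""
    buckets: Dict[int, List[bool]] = defaultdict(list)
    for confidence, won in records:
        # Round to whole percent first so 0.7 lands in the 70 bucket
        # despite floating-point error, then floor to the nearest ten.
        floor_pct = (round(confidence * 100) // 10) * 10
        buckets[floor_pct].append(won)
    return {
        pct: (len(wins), sum(wins) / len(wins))
        for pct, wins in sorted(buckets.items())
    }
```

A well-calibrated system would show a win rate near 0.7 in the 70 bucket; a large gap in either direction is the signal worth looking at, provided the bucket has enough cases to mean anything.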

All of it runs in paper mode today. The master switch and the auto-entry switch are both off by default. Risk-reducing exits are the only thing allowed to act without a human tap. That's the design, not a placeholder.

What we're still working on

The autopilot loop has been running in paper mode, not against real money. Turning on live auto-entry, even at tiny size, is next, and it isn't happening until the paper track record earns it.

The memory system is honest about being young. It won't write a lesson until it has seen the same pattern three times, and nothing sticks around past 45 days unless it happens again. Right now it has opinions on almost nothing, which is what a system that hasn't earned any lessons yet should look like.
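The three-cases and 45-day rules can be sketched as a filter over the observation log. This is an illustration, not the curator itself: it reduces "similar cases" to an exact pattern key and starts the expiry clock at the most recent observation, both of which are assumptions.

```python
from datetime import date
from typing import Dict, List, Set, Tuple

MIN_CASES = 3         # from the article
LESSON_TTL_DAYS = 45  # from the article


def active_lessons(observations: List[Tuple[str, date]], today: date) -> Set[str]:
    """Return the pattern keys that currently count as lessons.

    A pattern qualifies once it has been seen at least MIN_CASES times and
    its most recent sighting is no more than LESSON_TTL_DAYS old.
    """
    counts: Dict[str, int] = {}
    last_seen: Dict[str, date] = {}
    for pattern, seen_on in observations:
        counts[pattern] = counts.get(pattern, 0) + 1
        if pattern not in last_seen or seen_on > last_seen[pattern]:
            last_seen[pattern] = seen_on
    return {
        p for p, n in counts.items()
        if n >= MIN_CASES and (today - last_seen[p]).days <= LESSON_TTL_DAYS
    }
```

With a short log, the result is usually an empty set, which matches the "opinions on almost nothing" state described above.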

Self-evolving prompts exist in code, not yet in daily use. A changed prompt has to run in shadow against old decisions and beat the current version before a human even sees an approve button. We haven't run that cycle enough times to have a real before-and-after story.

Macro and news awareness (FOMC, CPI, earnings calendars, headlines) is built and wired in, but off until we've confirmed a data provider and a budget for it. Live web search for the Critic's research sits behind the same kind of flag.

The pattern across all of it is the same: paper first, tiny real size second, bigger only if the numbers hold up.

What doesn't work

There's still no backtest against historical price data. There is now a paper-trading mode, which is the autopilot default, plus a shadow-eval harness that runs a candidate prompt against past decisions before it ships. Neither replays a strategy against historical prices; a real multi-year backtest is still on the list.

Schwab's refresh token expires after seven days. Once a week, one of us visits a URL, logs in, and re-authorizes. It takes thirty seconds. It is also annoying in the precise way that a real production OAuth2 implementation in a licensed brokerage is annoying. They require it for compliance, and there is no way around it.
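Since the weekly re-authorization is unavoidable, the useful thing code can do is warn before the token lapses. A small sketch under assumptions: the seven-day lifetime is from the text, while the one-day warning window, the `reauth_status` name, and the stored `issued_at` timestamp are hypothetical.

```python
from datetime import datetime, timedelta, timezone

REFRESH_TOKEN_LIFETIME = timedelta(days=7)  # from the article
WARN_BEFORE = timedelta(days=1)             # assumption: warn one day ahead


def reauth_status(issued_at: datetime, now: datetime) -> str:
    """Classify a refresh token as 'ok', 'reauthorize_soon', or 'expired'."""
    expires_at = issued_at + REFRESH_TOKEN_LIFETIME
    if now >= expires_at:
        return "expired"
    if now >= expires_at - WARN_BEFORE:
        return "reauthorize_soon"
    return "ok"


if __name__ == "__main__":
    issued = datetime(2024, 1, 1, 9, 0, tzinfo=timezone.utc)
    print(reauth_status(issued, datetime.now(timezone.utc)))
```

A bot could run this on a schedule and post a reminder to the group on "reauthorize_soon", so an expired token never surprises anyone mid-request.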

The Telegram bot now runs as an always-on background worker on Render, not the Mac mini. The Mac is local-dev only; if it's off, nothing breaks.

We have only run this in roughly normal market conditions. The kill switches exist and have been tested in development. Whether they behave well under a real crash, I do not know. Untested is untested.

What else is out there

schwagent is the closest project I found: same stack (Python + Schwab + Telegram + LLM), open source. It is much larger: 35+ technical indicators, options strategies (wheel, iron condors, covered calls), Monte Carlo backtesting, a web dashboard. It is a serious system.

Ours is intentionally smaller. No indicators, no backtesting, no options. Zero pip dependencies for the core. We wanted something we could deploy in an afternoon and trust because we'd read every line of it.

If you want full autonomy, read the agent-run trading fund post on HN. There the AI makes every call, with no human in the loop. The author documented every dead strategy with the same rigor as the wins. That discipline is admirable. We were building something different: a tool for two people to think together, with the AI helping and the humans deciding.

Reflections

I did not go into this knowing how to build a trading system. I knew Python, I had worked with APIs before, and I had spent years thinking about where human judgment should sit inside an automated loop. That was the product at Phantom Auto, and it is the product at mixus. Trading turned out to be the same design problem at a smaller scale.

Schwab's API is well-documented and easier to use than I expected. Claude is good at reading financial context and structuring a brief that two people can actually talk about. The hard part was the line-drawing: deciding what the system should refuse to do, encoding it in Python, and not moving it when the AI confidently recommended otherwise.

I haven't open-sourced the code yet. The trading core has no external dependencies and could run on most machines, but the OAuth flow and Cloudflare infrastructure are specific to how we have set things up. If you want it, email me at [email protected]. If there is enough interest I will clean it up and publish it properly.

If you have built something similar, or transferred from Robinhood to Schwab for API access, happy to compare notes.


We built this to be in the same conversation about money. Schwab's API made that possible, and it was free the whole time.


Compute helped me draft and proof this story.