How I built an AI-active Gmail inbox with real context + personalization (without Google’s AI) |…

I’ve been really interested in making AI actually useful in email — not “generic smart replies,” but an inbox where the right messages reliably turn into drafts that sound like me, reflect reality, and include the context I’d pull manually if I had time.

The catch: the context I need rarely lives inside Gmail.

It lives in:

HubSpot (who is this person, what’s the account history, are they VIP, what did we promise?)
Stripe (are they a customer, what plan, what happened with billing?)
Postgres (internal source of truth: flags, entitlements, state)
This repo (docs, runbooks, decisions, roadmaps, patterns)

So I built an AI-active Gmail inbox: Gmail stays the event source, but the agent runs inside a GitHub repo with MCP tools that can pull real context and draft replies that are personalized, accurate, and safe.

I did this without relying on Google’s AI. Gmail is plumbing (labels + push notifications). The intelligence and guardrails are mine.

Three non-negotiable constraints guided the build:

No polling: I want push-driven events, not a cron job hammering APIs.
No auto-send: drafts only; I approve every message.
No inbox-only context: the agent must be able to query the systems that actually matter.

What I wanted was a workflow where Claude Code runs inside a GitHub repo (so it can read everything I’ve written and shipped), connects to MCP tools (so it can pull live customer context), drafts a reply in Gmail, and then pings me for approval. No surprises. No polling. No hallucinated promises.

We actually looked at building this with our Fastmail MCP first. I love Fastmail, but their API didn’t give us a clean event trigger we could hook into without polling constantly. I hate polling. It feels messy and wasteful.

Gmail, however, has a push notification system that talks to Google Cloud Pub/Sub. That’s the hook.

Here’s the war story of how I built a real-time, event-driven pipeline that turns “new important email” into “draft reply ready,” with real context + personalization — and the mistakes I had to fix along the way.

Press enter or click to view image in full size

The final flow looks simple, but I didn’t get it right on the first try:

Gmail receives an email. If it matches a specific filter, it applies a label: AI_TRIGGER.
Google Cloud Pub/Sub gets a push notification from Gmail saying “history changed for this label.”
Cloud Functions (Gen2) wakes up, translates that history into concrete Gmail messageIds, and only then fires a workflow_dispatch event to GitHub.
GitHub Actions spins up, runs Claude Code, and connects to the MCPBundles Hub MCP (https://mcp.mcpbundles.com/hub). My MCPBundles account is already configured with access to a range of SaaS tools (HubSpot, Stripe, etc.), and it's already authenticated and ready to go with the right context.
Claude uses the Gmail MCP, reads each email by messageId, drafts a reply in Gmail, and creates a GitHub issue tagged for me with a direct link to that draft.

It’s fast. It’s serverless. And it keeps the heavy lifting (the AI reasoning + tool calls + policy) inside the GitHub Action where I have control and an audit trail.

Google’s AI features live inside the inbox. That’s not where my best answers come from.

My best answers come from:

Cross-system context: CRM history, billing state, internal source-of-truth data, and repo docs/runbooks.
A controlled execution environment: a repo-run agent can read my project context deterministically and follow explicit permissions.
Safety by design: drafts only, least-privilege tool access, and a clear separation between “signal” (Gmail) and “reasoning” (GitHub Action).

So I use Gmail for what it’s great at (mail + filters + push notifications) and keep the intelligence in my stack.

I don’t mean “sprinkle in the sender’s name.” I mean the draft changes based on real state:

Billing-aware: if Stripe says a trial expired yesterday, the draft includes the exact next step and avoids “we can extend it” unless that’s allowed.
Account-aware: if HubSpot shows renewal is imminent or the account is high-touch, the draft escalates tone and next actions.
Product-realistic: if Postgres says a feature flag is off, the draft doesn’t promise functionality; it offers the right workaround.

This is why the agent needs access to tools — and why inbox-only AI wasn’t the answer for me.

Email is untrusted input. People can (and will) paste instructions that try to hijack an agent: “ignore your previous instructions,” “export secrets,” “run this code,” etc.

Press enter or click to view image in full size

That’s why this system:

Never auto-sends
Runs with explicit permissions
Uses tooling guardrails (MCP tools, not arbitrary network access)
Keeps the trigger side dumb (no secret-rich reasoning in Pub/Sub handlers)

Gmail push notifications are great: they’re small and privacy-friendly. They don’t include the email body — just the email address and a historyId.

So my first connector was simple: decode Pub/Sub, dispatch GitHub with {emailAddress, historyId, publishedAt} and let Claude sort it out.

It worked… until it didn’t.

One email ≠ one historyId. historyId is a cursor for mailbox changes. Multiple changes can happen quickly, and Gmail will happily emit multiple historyIds close together.
That meant I’d sometimes see multiple GitHub workflow runs for what “felt” like one email event.
The runner then had to do extra work (list history, pick the right message, etc.), and on bad days the agent hit limits (max-turns) before finishing.

Press enter or click to view image in full size

I didn’t want “AI woke up” to mean “AI woke up for mailbox churn.”

I didn’t want the AI waking up for every newsletter or spam message. That gets expensive fast.

Instead of watching the whole INBOX, I set up a Gmail filter to apply a label-let's call it AI_TRIGGER -to the emails that matter.

So the Cloud Function renews a watch on that specific label ID.

Two key lessons here:

Gmail watches filter by label ID, not label name.
If you accidentally default to INBOX, you've basically opted back into "wake up for everything."

We run this renewer once a day via Cloud Scheduler because Gmail watches expire after 7 days. Set it and forget it.

Even with label scoping, the push notification is still “history changed.” It’s not “here’s the new message.”

So I still saw bursts: a few different historyIds, close together, for the same general moment in time.

The fix was to keep the trigger side “dumb” but not blind:

Use the Gmail History API once (cheap) to translate history into concrete message IDs.
Persist minimal state so I don’t re-trigger for the same mailbox range.

This is the smallest unit that makes the GitHub runner clean and deterministic.

My “notify” function does one job: take that signal and kick off the GitHub Action.

I learned this the hard way: make this function fire-and-forget.

At first, I had it raise an error if the GitHub API failed (like a 404 or rate limit). Bad idea. Pub/Sub saw the error, didn’t ack the message, and retried. And retried. And retried. I woke up to hundreds of triggered workflows from a single email.

Now, it catches errors, logs them, and returns “ok” no matter what.

This is where it gets fun.

The GitHub Action now receives messageIds. It still doesn't have the email content (privacy win), but it has stable identifiers that map directly to Gmail messages.

It starts Claude Code and connects it to MCPBundles over HTTPS. This is key: Claude is running inside the repo, so it can read code, docs, and blog posts (yes, even the ones you forgot you wrote). That’s where the non-inbox context and the “voice” of your product actually live.

Then MCPBundles fills in the gaps with read access to the systems that matter for email replies:

HubSpot (who is this person, what account are they on, what’s the history?)
Stripe (are they a customer, what plan, what’s going on?)
Postgres (internal state, feature flags, whatever you store as the source of truth)

In this setup, Claude Code is running inside GitHub Actions, not in my local terminal. So the MCP setup needs to happen inside the workflow.

My first version wrote an MCP config file in the runner. It worked, but it was unnecessary complexity for this setup.

Then I hit a stability lesson with anthropics/claude-code-action@v1: the most robust path is to put your Claude Code configuration into settings, not a pile of CLI flags. It matches how Claude Code is configured when you run it inside a repo, and it's more flexible as your setup grows.

So the cleaner approach was:

Put the MCP credential in an environment variable: MCPBUNDLES_API_KEY
Tell Claude Code Action to enable project MCP servers

Example snippet (inside your workflow job), with placeholders only:

Another “learned the hard way” detail: Claude can only create issues if the action allows it to run the GitHub CLI command. This is separate from GitHub Actions job permissions.

In practice, you need both:

Job permissions to write issues (issues: write)
Claude Code settings.permissions.allow to include Bash(gh issue create:*)

Minimal example:

The prompt now tells Claude exactly what to do, deterministically:

Claude handles the rest. It:

Splits messageIds into a list.
For each messageId, fetches the email content via Gmail tools.
Creates a Gmail draft reply.
Opens a GitHub issue tagging me with a link to that draft.

That GitHub issue is the human-in-the-loop checkpoint: it’s where I see the proposed reply, the supporting context the agent used (customer status, account notes, internal state), and the exact link to the Gmail draft.

The real reason is simple: Claude Code is running inside my repo.

That means it has read access to the stuff that makes an email response actually good:

The codebase
Docs and runbooks
Context about the product and roadmap
Patterns from past issues and PRs

Then MCPBundles fills in the rest. If the person emailing me is a customer, Claude can use tools to pull the right context (Gmail content, CRM notes, internal data) without me wiring a custom integration per system.

GitHub Actions is just the execution environment that makes this easy: it runs close to the repo, gives me an audit trail, and keeps the trigger side (GCP) dumb and cheap.

I receive an email. Five seconds later, a Pub/Sub message fires. Thirty seconds later, a GitHub Action spins up. A minute later, I get a ping on GitHub: “Draft response ready for: [Subject]”.

It’s slick. It cuts out the noise. And it forces me to focus only on what actually needs a human touch, with the busywork already done.

If you’re building with MCP, think about this pattern: keep your triggers dumb and your agents smart. Let Gmail pass the baton, keep the reasoning in a controlled environment, and make “personalization” mean real context — not just nicer words.