Attestation is All You Need

People want privacy when they use AI.

They send code through it. Contracts. Drafts. Medical questions. Things they would not say out loud at work.

Every LLM provider knows this and tells them the same thing: we don't log your prompts.

I'm supposed to believe them.

OpenAI publishes a 30-day retention policy. Anthropic publishes a privacy policy. Cohere has SOC 2 Type II. These are promises.

A promise is a thing you can sue over after it's broken. That isn't privacy. That's recourse.

There is a way to get the actual thing. It's called attestation, and it has been deployable in the cloud for years. The industry just never put inference behind it.

What attestation gives you

A confidential VM runs inside a hardware-isolated environment. AWS Nitro Enclaves. GCP Confidential VMs. Azure Confidential VMs. The platform itself signs a measurement of the running code: the CPU's security processor on confidential VMs, the Nitro hypervisor on AWS. The signature chains to the vendor's root key.

You can verify the chain.

You can compute the same hash from the open-source code.

If they match, you know exactly what code is running on the machine that just saw your prompt.
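
A minimal sketch of that comparison, assuming the measurement is a plain SHA-256 over the build artifact. Real platforms define their own measurement formats, so the function names and the single-hash model here are illustrative, not how any particular cloud computes it:

```python
import hashlib
import hmac


def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 digest of `data` as lowercase hex."""
    return hashlib.sha256(data).hexdigest()


def measurement_matches(local_build: bytes, attested_hex: str) -> bool:
    """Check a digest you computed yourself against the attested one.

    `local_build` is the artifact you built from the published source
    (hypothetical input); `attested_hex` is the digest taken from an
    attestation whose signature you have already verified.
    """
    expected = sha256_hex(local_build).encode("ascii")
    claimed = attested_hex.strip().lower().encode("utf-8")
    # Constant-time comparison; returns False on any difference.
    return hmac.compare_digest(expected, claimed)
```

The check only means something if the build is reproducible: two people building the same commit have to get the same bytes.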

That gives you something contracts cannot:

The code that is running is the code that was published.

The code cannot be changed at runtime without breaking the attestation.

You stop trusting the operator. You start trusting the silicon. The silicon's attack surface is much smaller.

If the code you can read does not write prompts to disk, then prompts do not get written to disk. Not because somebody promised. Because the code says so and you can check.

That is privacy you can actually verify.

How we do it at TrustedRouter

api.quillrouter.com runs inside an attested gateway.

AWS Nitro Enclaves in us-east-1. GCP Confidential VMs in us-central1 and europe-west4. Cross-cloud, so a single vendor's compromise cannot take everything.

Every request can be paired with a live attestation.

You generate a nonce. You hit /attestation?nonce=<your-nonce>. The gateway returns a JWT whose signature chains to the hardware root key.

The JWT contains:

  • eat_nonce — your nonce, so the response cannot be replayed
  • image_digest — SHA-256 of the running container image
  • pcrs — the platform measurements at boot

You match image_digest against the artifact hash published at trustedrouter.com/security for every commit. If they match, the code processing your prompts is the code on GitHub.
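
The client side of that check can be sketched roughly as follows. The claim names eat_nonce and image_digest come from the list above; everything else (function names, the assumption that eat_nonce is a single string, the published_digest argument) is hypothetical. Signature verification against the vendor certificate chain is deliberately left out and has to happen before any claim is trusted:

```python
import base64
import hmac
import json
import secrets


def new_nonce() -> str:
    """Generate a fresh, unguessable nonce for one attestation request."""
    return secrets.token_urlsafe(32)


def decode_claims(jwt: str) -> dict:
    """Decode the payload of a compact JWT WITHOUT verifying its signature.

    Only call this on a token whose signature chain was already checked.
    """
    parts = jwt.split(".")
    if len(parts) != 3:
        raise ValueError("not a compact JWT")
    payload = parts[1]
    # base64url payloads drop their padding; restore it before decoding.
    padded = payload + "=" * (-len(payload) % 4)
    return json.loads(base64.urlsafe_b64decode(padded))


def _same(a: str, b: str) -> bool:
    """Constant-time string equality."""
    return hmac.compare_digest(a.encode("utf-8"), b.encode("utf-8"))


def check_claims(claims: dict, nonce: str, published_digest: str) -> bool:
    """True only if the token echoes our nonce and the published digest."""
    nonce_ok = _same(str(claims.get("eat_nonce", "")), nonce)
    digest_ok = _same(
        str(claims.get("image_digest", "")).lower(),
        published_digest.lower(),
    )
    return nonce_ok and digest_ok
```

In use, you would call new_nonce(), request /attestation?nonce=<your-nonce>, verify the signature with a JWT library, and only then pass the token through decode_claims and check_claims.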

If the attestation fails — image drift, hardware fault, expired cert, anything — the gateway fails closed. No request reaches a provider until attestation is valid again.
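
Fail-closed behaviour can be sketched as a gate in front of the forwarding path. This is an illustration, not the gateway's actual code: the AttestationGate class, the max_age_s freshness window, and the injected verify and send callables are all assumptions made for the example.

```python
import time
from typing import Callable, Optional


class AttestationGate:
    """Forward requests only while a recent attestation check has passed."""

    def __init__(
        self,
        verify: Callable[[], bool],
        max_age_s: float = 60.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._verify = verify        # hypothetical attestation check
        self._max_age_s = max_age_s  # assumed freshness window
        self._clock = clock
        self._last_ok: Optional[float] = None

    def refresh(self) -> bool:
        """Re-run the check; any exception counts as a failure."""
        try:
            ok = bool(self._verify())
        except Exception:
            ok = False
        self._last_ok = self._clock() if ok else None
        return ok

    def is_open(self) -> bool:
        """True while the last successful check is still fresh."""
        if self._last_ok is None:
            return False
        return (self._clock() - self._last_ok) <= self._max_age_s

    def forward(self, send: Callable[[], str]) -> str:
        """Call `send` only if attestation is valid; otherwise refuse."""
        if not self.is_open() and not self.refresh():
            raise PermissionError("attestation invalid: refusing to forward")
        return send()
```

The design choice that matters is the default: with no successful check on record, the gate is closed, and an error inside the check closes it again.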

The synthetic monitor probes the attestation path every minute. It pages on the first miss.

What I am not claiming

Attestation is not a wand.

A nation-state with hardware access could try side-channel attacks. AWS and GCP have hardened against most known classes. Not all.

The hardware vendor's root key is a trust anchor. If AWS's or GCP's signing infrastructure is compromised, that chain breaks. Cross-cloud helps. It does not remove the dependency.

Open-source code can have bugs. Attestation proves the running binary is the published binary. It does not prove the published binary is correct. Many eyes still help.

The point isn't that attestation is perfect.

The point is that without it, every privacy claim from every provider is the operator asking you to trust them. With it, you can check.

End-to-end: ZDR and Secure Enclave providers

Attestation on the gateway closes the front half of the path. The back half is the provider you route to.

On TrustedRouter you can pick. Route to a Zero Data Retention provider and the prompt is not stored downstream. Route to a Secure Enclave provider and inference itself runs inside a confidential GPU — encrypted GPU memory, attested model server, no plaintext outside the enclave.

Combine the two and you get end-to-end privacy from your app, through the attested gateway, all the way down to encrypted GPU memory at the model.
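
As a rough sketch of what "combine the two" means for routing, the filter below keeps only providers that satisfy both properties. The Provider record, its field names, and the placeholder provider names are made up for illustration; the real list lives on the providers page.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class Provider:
    """Hypothetical provider record; fields are assumptions."""
    name: str
    zero_data_retention: bool
    secure_enclave: bool


def eligible(
    providers: Iterable[Provider],
    require_zdr: bool = True,
    require_enclave: bool = True,
) -> List[Provider]:
    """Keep providers that meet every requirement that is switched on."""
    return [
        p
        for p in providers
        if (p.zero_data_retention or not require_zdr)
        and (p.secure_enclave or not require_enclave)
    ]


# Placeholder data, not real providers.
catalog = [
    Provider("provider-a", zero_data_retention=True, secure_enclave=True),
    Provider("provider-b", zero_data_retention=True, secure_enclave=False),
    Provider("provider-c", zero_data_retention=False, secure_enclave=False),
]
strict = eligible(catalog)
zdr_only = eligible(catalog, require_enclave=False)
```

With the placeholder catalog, strict contains only provider-a, and zdr_only contains provider-a and provider-b.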

The full provider list, ZDR status, and enclave support are at trustedrouter.com/providers.

Why now

LLM traffic is becoming the new sensitive data path. Code goes through it. Drafts. Contracts. Medical notes. Therapy logs.

The default privacy story is still "trust us."

That was acceptable for SaaS in 2010. It is not acceptable for inference in 2026, when one well-placed prompt log can leak more about a person than their email.

Confidential computing is fast now. Nitro overhead is single-digit milliseconds. GCP Confidential VM overhead is the same.

There is no performance reason to keep shipping inference without attestation.

There is no honest privacy reason either.

TrustedRouter is live. The trust surface, attestation flow, and full open-source code are at trustedrouter.com/security.