Insights from my (mostly) successful attempt to create API tokens and SSH keys for github.com that trigger an alert (including source IP/User-Agent) when a malicious actor discovers and attempts to abuse the credentials.
Background
A few weeks ago I got lightly involved in the community response to the supply chain attack against the Trivy vulnerability scanner. On March 19th, 2026, malicious versions of the Trivy binary and GitHub Action were published as part of a larger attack that has continued to expand in scope over the following weeks to include popular NPM packages, LiteLLM, and more.
When the Trivy attack was first discovered, initial triage by the community began as a thread on the (now deleted) discussion topic from the previous compromise of Trivy that had occurred in late February. As TeamPCP realized they had been discovered, they used their GitHub access to delete the entire discussion thread, presumably in an attempt to slow down incident response by impacted users/companies.
When I added a comment preserving the original IoCs on a new GitHub discussion, TeamPCP was so kind as to send me 1000+ spam comments/emails from compromised GitHub accounts:
[Screenshot: a flood of spam comments from compromised GitHub accounts]
Of course the Trivy incident has kicked off the same conversations as previous supply-chain attacks about dependency cooldowns, trusted publishers, immutable releases, etc. However, very little of the conversation has focused on what Aqua Security or the companies/users who unintentionally executed the malicious Trivy binary/GitHub Action (which included extensive credential harvesting malware) could have done to detect the attack sooner.
In particular, I haven’t seen many mentions of my favorite tools: Canarytokens (aka Honeytokens). These are digital “tripwires” disguised as attractive targets such as fake files, API keys, or database entries that trigger an alert the moment a threat actor touches or opens them. They act as a proactive security alarm, notifying you of a breach by revealing exactly when and where an unauthorized user is snooping in your system.
Goals
There is already an excellent list of supported Canarytokens, including AWS keys and Kubeconfig, both of which were exfiltrated and later abused by TeamPCP. However, notably missing from this list of supported credentials (and from the various competitors) is a GitHub Canarytoken, the primary credential type used to stage the compromise of Trivy.
Our goal is to create an SSH key and API token for github.com and then leave it somewhere an attacker/malware is likely to stumble upon it, for example:
- Within a GitHub Actions secret with an interesting name like GH_ADMIN_TOKEN
- Inside the ~/.ssh directory on a self-hosted GitHub Actions runner image
- A machine entry in the ~/.netrc file on a developer’s workstation
- Inside a Kubernetes Secret/ConfigMap within a development environment
If an attacker attempts to use the token/key, it should trigger an alert as quickly as possible, capturing the source IP address, User-Agent header, etc. Ideally, it shouldn’t be possible for an attacker to identify these are canary credentials without first detonating them.
The implementation will look something like this:
- Identify the types of credentials that are best suited for canarytokens (i.e. read-only) and how they are provisioned (ideally they can be created programmatically).
- Create fake resources (repositories, issues, pull-requests), accessible via the credentials, that will look attractive for an attacker to interact with.
- Configure audit logging using available features from the vendor, filter for events related to the canarytokens and trigger an alert, for example ‘paging’ via Slack.
- If the audit logs are incomplete/broken (foreshadowing), additional complexity/hacks may be required to alert on any usage…
Security log vs Audit log
If you’ve ever dug into your GitHub account settings (or had the misfortune of doing incident response on a GitHub org/user), you may already be familiar with the ‘Security log’ which records actions taken on a personal account, ex:
[Screenshot: ‘Security log’ events for a personal account]
While there are quite a few different event types in this log, notably missing are any events for the use of an API token/SSH key; only the initial creation or deletion of credential resources will emit a ‘Security log’ event. Additionally, there is no API* for programmatically accessing ‘Security log’ events, only a manual export button in the GUI.
Of course, this lack of visibility was not acceptable for ‘Enterprise’ customers, so GitHub introduced a related feature: Audit logs. These logs track actions taken against an Organization instead of a User (more on the limitations of this later). ‘Audit logs’ are far more verbose than ‘Security logs’ and can be accessed programmatically*. In particular, we will be interested in the git.clone or git.fetch events, which were introduced in December 2020, as well as the api.request event, added in February 2023 and generally available since January 2025.
Unfortunately, GitHub Audit logs are not free. In order to gain access to the events we care about, our Organization(s) must be part of a GitHub Enterprise Cloud (GHEC) Enterprise. This will run a minimum of $21/month (after the 30-day trial expires) for the single admin user required to set up the pipelines/credentials:
[Screenshot: GitHub Enterprise pricing]
Configuring and Streaming Audit Logs
Once our GitHub Enterprise is created, we need to configure the Audit Log settings which will be shared across every Organization in our Enterprise. In particular we need to enable source IP disclosure and API request events:
[Screenshot: Enterprise audit log settings with source IP disclosure and API request events enabled]
It’s worth noting that GitHub has some (frankly absurd) opinions about which audit events can contain source IPs. You cannot see source IPs for any activity involving a public repository (you know, the ones being targeted in all these supply chain attacks). Generally, it’s a reasonable stance that I shouldn’t get to see the source IP/actor behind some unassociated GitHub user performing a ‘git clone’ of my public repository.
However, refusing to include source IPs on write actions like a git.push to a protected branch is unreasonable, especially when the actor is inherently a member/admin of the organization, acting on a public repository owned by said organization. This forces us to use private/internal repositories exclusively as the targets for our canarytokens.
Although git audit log events can be obtained by polling the REST API, the api.request event is only available when using audit log streaming, largely due to the event volume. The audit log streaming feature supports a variety of destinations including AWS S3 buckets, Google Cloud Storage, Azure Event Hub, etc.
For the purposes of building a PoC, we want minimal setup/dependencies, so I’m going to focus on the Splunk HTTP Event Collector (HEC) destination, which can be configured with an arbitrary hostname/port. When events arrive, GitHub will emit them as an HTTP POST to the /services/collector endpoint:
POST /services/collector HTTP/1.1
Host: cf-github-canarytokens.bored-engineer.workers.dev:443
Accept-Encoding: gzip
Authorization: Splunk hunter2
Connection: close
Content-Length: 1337
Content-Type: application/json
User-Agent: Go-http-client/1.1

{"event":{...},"time":"1776463522.955"}{"event":{...},"time":"1776463524.123"}
We can pretend to be a compliant Splunk HEC collector by returning a successful JSON response (as well as answering health-check requests). Annoyingly, the events are delivered without any delimiter like a newline separating them, which makes parsing with JSON.parse fail. I wrote a simple Cloudflare worker which handles this format and just logs each incoming event:
[Screenshot: Cloudflare worker code handling the HEC format]
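For reference, here’s a minimal sketch of what such a worker can look like (my approximation, not the exact code from the screenshot; it skips validating the Authorization: Splunk <token> header, which a real deployment should check):

// Minimal Cloudflare Worker sketch: pretend to be a Splunk HEC collector
// for GitHub audit log streaming. The health-check path is taken from
// Splunk's HEC API; Authorization validation is intentionally omitted here.
export default {
  async fetch(request: Request): Promise<Response> {
    const url = new URL(request.url);
    // GitHub probes this endpoint when validating the stream destination
    if (url.pathname === "/services/collector/health") {
      return Response.json({ text: "HEC is healthy", code: 17 });
    }
    if (url.pathname !== "/services/collector" || request.method !== "POST") {
      return new Response("Not Found", { status: 404 });
    }
    // Events arrive as concatenated JSON objects with no delimiter, so walk
    // the body tracking brace depth (ignoring braces inside strings)
    for (const event of splitConcatenatedJSON(await request.text())) {
      console.log(JSON.stringify(event));
    }
    return Response.json({ text: "Success", code: 0 });
  },
};

function splitConcatenatedJSON(body: string): unknown[] {
  const events: unknown[] = [];
  let depth = 0, start = 0, inString = false, escaped = false;
  for (let i = 0; i < body.length; i++) {
    const ch = body[i];
    if (inString) {
      if (escaped) escaped = false;
      else if (ch === "\\") escaped = true;
      else if (ch === '"') inString = false;
    } else if (ch === '"') {
      inString = true;
    } else if (ch === "{") {
      if (depth++ === 0) start = i; // first brace of a new event
    } else if (ch === "}" && --depth === 0) {
      events.push(JSON.parse(body.slice(start, i + 1)));
    }
  }
  return events;
}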
A few seconds later, audit events begin to arrive:
[Screenshot: audit events arriving in the worker logs]
Organization and Repository Setup
With audit logs configured, we need to create our fake Organization under our Enterprise. I suggest a name that closely matches your actual public GitHub organization and repository, so it will appear enticing enough for an attacker to clone/access.
[Screenshot: creating the canary Organization]
Next, we need to create an internal/private repository. I’d suggest populating the repository with some amount of content (ex: AI slop terraform) or at least a README.md so it’s not an empty/uninitialized repository:
[Screenshot: the canary repository populated with plausible content]
Creating an SSH Deploy Key
Now that we have all the audit logging components set up, we can create our first canary credential, starting with an SSH key, as these are easier to reason about than API tokens. Specifically, a deploy key is an SSH key that is tied to an individual GitHub repository instead of a GitHub user and defaults to read-only access:
[Screenshot: adding a read-only deploy key to the repository]
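Deploy keys can also be provisioned via the REST API (POST /repos/{owner}/{repo}/keys), so creation can be automated. A sketch, where the admin token is a real setup credential (not the canary) and the key material is elided:

// Sketch: registering the canary public key as a read-only deploy key
await fetch("https://api.github.com/repos/contoso-private/infra-secrets/keys", {
  method: "POST",
  headers: {
    Authorization: "Bearer <admin token>", // real setup credential, elided
    Accept: "application/vnd.github+json",
  },
  body: JSON.stringify({
    title: "prod-deploy", // an innocuous-looking name
    key: "ssh-ed25519 AAAA...", // the canary key's public half, elided
    read_only: true,
  }),
});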
If we SSH to github.com using the newly created key, the response will direct the attacker to our repository:
[Screenshot: the SSH greeting revealing the repository name]
When we use this SSH key to clone the repository, we receive a git.clone audit event that includes the source IP and even the build/version of git used!
{
"repository_public": false,
"_document_id": "PHelWiQ6to2YlxnWOph03A==",
"action": "git.clone",
"actor": "deploy_key",
"actor_ip": "203.0.113.37",
"actor_location": {
"country_code": "US"
},
"business": "canary-bored-engineer",
"hashed_token": "+qycZFngYHBoZcx/+Ri5lIYs8FnI8t0DPZpEECxeuLc",
"org": "contoso-private",
"programmatic_access_type": "Public Key (User/Deploy)",
"repo": "contoso-private/infra-secrets",
"repository": "contoso-private/infra-secrets",
"request_access_security_header": "",
"request_id": "d389dcd282ff50415c7b41c0d1fd09ed",
"transport_protocol_name": "ssh",
"user": "",
"user_agent": "git/2.53.0-Darwin",
"@timestamp": 1776398159553,
"actor_id": 0,
"business_id": 588174,
"org_id": 275559016,
"repository_id": 1208913072,
"token_id": 0,
"transport_protocol": 2,
"user_id": 0
}
What about User SSH Keys?
As a (more popular) alternative to deploy keys, every GitHub user can configure SSH keys associated with their account instead of a repository. While this can also be done programmatically via the REST API, these SSH keys are more powerful and publicly accessible, making them poorly suited for canarytokens:
Specifically, there are no fine-grained authorization controls for SSH keys on GitHub: every SSH key attached to your user will have full read/write access to every repository your GitHub user has access to (with the exception of organizations that enforce SSO, which require a one-time approval/authorization before the SSH key can be used).
Additionally, it’s entirely possible for an attacker who has found an SSH private key to check if it’s valid and which GitHub user it corresponds to without actually using it. The SSH (public) keys for a user can be trivially accessed by adding ‘.keys’ to their GitHub URL, ex: https://github.com/bored-engineer.keys
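As a quick illustration (hypothetical sketch; derivedPublicKey stands in for the public key an attacker recovers offline from the stolen private key, e.g. via ssh-keygen -y -f stolen_key):

// Sketch: an attacker validating a stolen key and identifying its owner
// without ever authenticating to GitHub (and thus without any audit event)
declare const derivedPublicKey: string; // recovered offline from the stolen key
const resp = await fetch("https://github.com/bored-engineer.keys");
const publicKeys = (await resp.text()).trim().split("\n");
console.log(publicKeys.includes(derivedPublicKey)); // true => valid key, known owner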
This feature is what allows you to quickly add your SSH keys to a new Ubuntu install and what powers the wonderful whoami.filippo.io. Some secret scanners like trufflehog will do this out of the box. An educated attacker may be able to identify patterns in our ‘canary’ GitHub users, especially if they are being shared across multiple unrelated GitHub organizations. This violates our goal of making it difficult to check the validity of the credential without detonating it.
A final problem is discoverability: when an SSH key is found by a secret scanner, there’s nothing that will immediately lead the attacker to the canary repository where we get audit events for clones. Some tools like trufflehog will attempt to directly SSH to GitHub to obtain the username, but unlike with deploy keys, our attacker still won’t know the name of our private repository to clone and trigger an alert:
[Screenshot: trufflehog resolving the username behind an SSH key]
As such, I would strongly suggest using deploy SSH keys instead of user SSH keys for canarytokens.
Personal Access Tokens
With SSH credentials figured out, we can move on to personal access tokens (PATs). These are a far more interesting target for attackers, as they allow enumeration of private repositories, pulls, issues, etc. However, as we’ll soon discover, they are quite a bit messier to build alerting for compared to SSH keys.
Unfortunately, one of the biggest limitations of PATs is that they cannot be created programmatically (at least without screen-scraping the main github.com UI). This will significantly limit the ability of commercial Canary vendors to offer these tokens, as provisioning requires manual steps by a human.
Most GitHub API users are familiar with “classic” PATs, identified by the ‘ghp_’ prefix. These PATs use broad permissions such as ‘repo’, which grants read/write to all private repositories accessible to a user. Because this also permits dangerous actions like deleting the canary repository, I would not recommend classic PATs. That said, it is possible to create “safer” classic PATs by using a dedicated GitHub user who has restricted permissions (read-only) in the canary GitHub organization.
Instead of classic PATs, we can use fine-grained personal access tokens (identified by the github_pat_ prefix), which grant permissions on specific GitHub repos:
[Screenshot: creating a fine-grained PAT scoped to the canary repository]
We can then use this token via the REST API or GraphQL API to access any details about our canary repository (or clone the contents) the same way an attacker might if they had discovered it:
[Screenshot: accessing the canary repository with the token via the GitHub CLI]
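For example, the equivalent raw REST call looks something like this (a sketch; github_pat_... stands in for the real token):

// Sketch: listing pull requests on the canary repository with the
// fine-grained PAT, the same kind of request that produced the event below
const resp = await fetch(
  "https://api.github.com/repos/contoso-private/infra-secrets/pulls",
  {
    headers: {
      Authorization: "Bearer github_pat_...", // canary token, elided
      Accept: "application/vnd.github+json",
    },
  },
);
console.log(resp.status, await resp.json());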
Shortly after, an audit event will be sent containing the source IP, User-Agent and normalized request URL/body:
{
"actor_is_bot": false,
"public_repo": false,
"_document_id": "tWjPbax39pvd8MyeK-rZCA",
"action": "api.request",
"actor": "canary-bored-engineer",
"actor_ip": "203.0.113.37",
"actor_location": {
"country_code": "US"
},
"business": "canary-bored-engineer",
"hashed_token": "CUQ+aPiFMrcEOLXuuZ+0MDlCyaV5YXRoTkNgQMke5BE=",
"operation_type": "access",
"org": "contoso-private",
"programmatic_access_type": "Fine-grained personal access token",
"query_string": "",
"repo": "contoso-private/infra-secrets",
"request_body": "",
"request_id": "A381:1F948A:17901F3:17F0724:69E1D1D7",
"request_method": "GET",
"route": "/repositories/:repository_id/pulls",
"url_path": "/repositories/1208913072/pulls",
"user": "canary-bored-engineer",
"user_agent": "GitHub CLI 2.89.0",
"user_programmatic_access_name": "ci-canary",
"@timestamp": 1776407000157,
"actor_id": 272395934,
"business_id": 588174,
"created_at": 1776407000157,
"org_id": 275559016,
"rate_limit_remaining": 4994,
"repo_id": 1208913072,
"status_code": 200,
"token_id": 13603103,
"user_id": 272395934
}
Cracks in API Audit Logs
Unfortunately, the gaps in GitHub audit logs begin to surface as we move on to PATs. Because audit logs are associated with a GitHub Organization, we will only see events related to Organization resources.
For example, the first request made using a compromised API token is typically to the /user endpoint, to obtain the id/login of the user (and the associated permissions via the ‘x-oauth-scopes’ header). This is NOT an organization resource, and as such no audit log event will be generated when this request is made.
What’s unexpected is that some endpoints like /user/repos will list every repository a user has access to (including organization ones) without generating an audit log event. This is why it’s critical to create legitimate-looking orgs/repositories that an attacker will be interested in exploring further.
Interestingly, forked repositories are a weird middle ground in this logic. If you fork a private organization repository into your personal account, it will continue to emit organization audit logs when you interact with the fork via git or the REST/GraphQL API.
As best I can tell, GitHub only emits api.request audit logs from the REST API when the URL path includes /repos/{owner}/{repo} or /orgs/{org} (where the organization has audit logs enabled). This means that, in addition to /user/repos, the entire /search family of endpoints can be used to access private organization code, commits, issues, pulls, etc. using a basic org:contoso-private query without generating a single audit event.
Even within this simple logic, there appear to be gaps! Reading repository files via /repos/{owner}/{repo}/contents/{path} does not generate any audit events, despite the tarball and zipball APIs right next to it doing so!
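To make these gaps concrete, here’s a sketch of two requests that read private organization data without emitting a single api.request event (the org/repo names are from this post’s setup; the token is elided):

const headers = {
  Authorization: "Bearer github_pat_...", // canary token, elided
  Accept: "application/vnd.github+json",
};
// 1. Search is scoped to the org via the query string, so the URL path
// never contains /repos/{owner}/{repo} or /orgs/{org} -- no audit event
await fetch("https://api.github.com/search/code?q=org:contoso-private+secret", { headers });
// 2. The contents endpoint DOES contain /repos/{owner}/{repo}, yet still
// emits nothing, unlike the tarball/zipball endpoints next to it
await fetch("https://api.github.com/repos/contoso-private/infra-secrets/contents/README.md", { headers });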
The GraphQL API is even worse: as long as you avoid directly returning a Repository or Organization object, you can perform all the queries in the world without generating audit events. For example, using the search query with either the ISSUES or DISCUSSION type grants access to issues, PRs (a sub-type of Issue), discussions, etc.
With a bit of knowledge about how GitHub GraphQL node_id values are constructed, as well as a repo ID (ex: 12345) obtained from /user/repos, we can skip right past the Repository object. For example, the node_id of a git Ref is simply "REF_" + base64url(msgpack([0, 12345, "refs/heads/main"])), which can be directly used in a node query to read repository contents:
query {
node(id:"REF_kwDNMDmvcmVmcy9oZWFkcy9tYWlu") {
... on Ref {
target {
... on Commit {
file(path: "README.md") {
object {
... on Blob {
text
}
}
}
}
}
}
}
}
These gaps extend to GraphQL mutations as well, such as updateRef, which can be used to effectively force push branches/tags without any audit events (api.request or git.push) being emitted!
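Going back to the node query above, here’s a sketch that reproduces the node_id construction for that exact Ref. Note the uint16 msgpack marker (0xcd) only covers repo IDs up to 65,535; real repository IDs need the uint32/uint64 markers (0xce/0xcf):

// Sketch: building the GraphQL node_id for a git Ref from a repository ID
// by hand-rolling the msgpack encoding of [0, repoId, refName]
function refNodeId(repoId: number, refName: string): string {
  const name = new TextEncoder().encode(refName);
  const bytes = [
    0x93, // msgpack fixarray of 3 elements
    0x00, // element 0: the integer 0
    0xcd, (repoId >> 8) & 0xff, repoId & 0xff, // element 1: repoId as uint16
    0xa0 | name.length, ...name, // element 2: refName as a fixstr (< 32 bytes)
  ];
  return "REF_" + btoa(String.fromCharCode(...bytes)) // base64url, no padding
    .replaceAll("+", "-").replaceAll("/", "_").replace(/=+$/, "");
}

console.log(refNodeId(12345, "refs/heads/main"));
// => REF_kwDNMDmvcmVmcy9oZWFkcy9tYWlu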
A draft of this post highlighting the gaps in audit logs was shared with GitHub on 4/19/2026, who stated they will take this research into account as they continue to build out the audit logging roadmap.
Building a (broken) Safety Net
Sadly, these sorts of audit logging gaps are a pretty common problem for multi-tenant SaaS providers. For example, in AWS there are multiple ways to test IAM credentials without generating a CloudTrail event (see Nick Frichette’s wonderful talk at fwd:cloudsec).
To solve this gap for AWS canarytokens, Thinkst developed the “AWS Safety Net,” which periodically polls the credential report API; the report returns a last_used_date for each IAM access key and is always updated when the key is used. You won’t get access to the IP address/User-Agent, but at least you’ll know the credential was compromised wherever it was stored and can begin incident response.
We can see in the GitHub UI that a similar “last used” timestamp is tracked for PATs and SSH keys:
[Screenshot: ‘last used’ timestamps for credentials in the GitHub UI]
After some digging, the raw timestamp is accessible via the /orgs/{org}/personal-access-tokens REST API, which returns a token_last_used_at field. This API does come with a few restrictions: it only works when authenticated using a GitHub App installation, and it only returns fine-grained PATs (no classic PATs). The /user/keys and /repos/{owner}/{repo}/keys REST APIs can be used for similar purposes for SSH keys, though with slightly less value.
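A sketch of that poll (installationToken is a GitHub App installation token; obtaining one is out of scope here):

// Sketch: listing fine-grained PATs with access to the org, including the
// raw token_last_used_at timestamp behind the UI's "last used" wording
declare const installationToken: string; // GitHub App installation token
const resp = await fetch(
  "https://api.github.com/orgs/contoso-private/personal-access-tokens",
  {
    headers: {
      Authorization: `Bearer ${installationToken}`,
      Accept: "application/vnd.github+json",
    },
  },
);
for (const pat of await resp.json()) {
  console.log(pat.token_id, pat.token_last_used_at);
}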
However, it seems the “last used” tracking for PATs is broken. For classic PATs, the field is simply never updated anymore, even when a token is actively being used to read (or even write) organization resources (hope you haven’t relied on that for any incident response!).
For fine-grained PATs, the behavior is inconsistently broken. Some critical actions, such as git clone/git fetch/git push or using the previously mentioned contents REST API, do not update the token_last_used_at field (the token will incorrectly show as “Never used” in the UI). Most other API calls do populate the field; however, after it’s been populated, it remains fixed at that initial timestamp.
What I suspect is happening here is that GitHub has implemented a roughly 1-week backoff for updates to this field; otherwise, every API request (even read-only ones that could be served from cache) would require a DB write. We can see this in the UI with the wording “Last used within the last week” rather than an exact timestamp; they’ve just forgotten that the raw timestamp is exposed via the API, which appears to return per-second precision...
A slow descent into madness
At this point I was ready to give up, publish this blog post, and claim it was impossible to build the “perfect” GitHub canarytoken. At best we can use audit logs to alert on the most common actions, and maybe we can detect some other usage by polling the “last used” timestamp for each credential with a granularity of ~1 week. To be clear, this is still pretty good and worth implementing, but we can do better….
In the middle of the night, the solution hit me: GitHub may not be willing to handle a DB write for every API request just for some “last used” field in the UI, but they can’t afford to ignore API abuse and scraping...
The GitHub REST and GraphQL APIs both have primary rate limits which cap the number of HTTP requests you can make per hour. By default, this is 5,000 requests/points per hour for the core/graphql resources. Every API request will consume at least 1 point of rate limit.
We can use the raw/plaintext canarytoken PATs to frequently poll the /rate_limit REST API (which ironically does not consume any rate limit) and compare with the last seen value. If the used field for any rate type (core, graphql, etc.) increases, the token has been compromised and we need to trigger an alert! Of course, in practice there are a few other edge cases to account for:
- The rate limit resets every hour, so there’s a small window where an attacker might use a token just before it resets and we’d never know, as it’s still at 5,000 during the next poll. Thankfully, GitHub uses redis to store these rate limits, populating an expires_at property with the timestamp when the limit will reset. If a PAT is never actually used, no redis key gets created and no reset gets scheduled. You can see this if you use a brand-new PAT to poll /rate_limit: the reset timestamp is always 1 hour in the future, effectively a placeholder.
- A really crafty attacker might think to abuse our PAT and then quickly use the /credentials/revoke API to invalidate it before our next scheduled poll of /rate_limit, preventing us from identifying the number of requests made. So we’ll need to alert if the API response indicates a credential was revoked as well.
- Finally, rate limits (for PATs) are per-user, not per-token, which means if we have multiple tokens owned by the same GitHub user, we won’t be able to identify which specific token was compromised/used. This also makes any canarytokens owned by a “real” user who consumes API rate limit normally (such as via the GitHub CLI) impractical.
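Accounting for those edge cases, a minimal sketch of the core check (the types/names here are my own, not GitHub’s):

// Shape of the relevant parts of the /rate_limit response
type RateLimits = {
  resources: Record<string, { used: number; reset: number }>;
};

// Poll /rate_limit with the canary PAT; throw (i.e. alert) if the token was
// revoked or if any resource's `used` counter shows activity
async function checkCanary(token: string, last: RateLimits | null): Promise<RateLimits> {
  const resp = await fetch("https://api.github.com/rate_limit", {
    headers: { Authorization: `Bearer ${token}`, Accept: "application/vnd.github+json" },
  });
  if (resp.status === 401) throw new Error("ALERT: canary token was revoked!");
  const current = (await resp.json()) as RateLimits;
  for (const [name, limit] of Object.entries(current.resources)) {
    const prev = last?.resources[name];
    // More usage within the same reset window, or any usage in a new window
    // (the reset timestamp only becomes "real" once the token has been used)
    if (prev && limit.used > (limit.reset === prev.reset ? prev.used : 0)) {
      throw new Error(`ALERT: canary token used (${name}: ${limit.used} requests)!`);
    }
  }
  return current;
}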
Putting it all together into a PoC
To build an end-to-end PoC, I modified the previously mentioned Cloudflare worker to publish basic alerts to Slack using incoming webhooks.
Alerting on specific audit events is as simple as filtering the incoming event.hashed_token field against the known canarytoken credentials (specified via the GITHUB_CREDENTIALS environment variable/secret):
[Screenshot: Slack alert triggered by a matching audit event]
To implement the “Safety Net” functionality, I used Workers KV to track the last response from /rate_limit for each token. When the Cron Trigger runs (every 5 minutes), it alerts if the used field has increased for any resource type:
[Screenshot: Slack alert triggered by an increase in rate-limit usage]
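The Cron Trigger side is a straightforward scheduled() handler; a sketch reusing checkCanary/RateLimits from the earlier safety-net sketch (the CANARY_KV and GITHUB_CREDENTIALS binding names are mine, not necessarily the actual worker’s):

interface Env {
  CANARY_KV: KVNamespace; // Workers KV, stores the last /rate_limit response
  GITHUB_CREDENTIALS: string; // comma-separated canary PATs
}

export default {
  async scheduled(_controller: ScheduledController, env: Env): Promise<void> {
    for (const token of env.GITHUB_CREDENTIALS.split(",")) {
      const key = token.slice(-8); // avoid storing the raw token as a KV key
      const last = (await env.CANARY_KV.get(key, "json")) as RateLimits | null;
      const current = await checkCanary(token, last); // throws => alert to Slack
      await env.CANARY_KV.put(key, JSON.stringify(current));
    }
  },
};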
Revocation is also detected within the same process when polling the rate-limits fails:
[Screenshot: Slack alert triggered by token revocation]
I would love to see the Canarytoken vendors take this research and implement it in their products; if you’d like to do so and want to chat more, feel free to reach out on LinkedIn. For corporations already using the GitHub Audit Log streaming feature, it should be trivial to build similar alerting directly in their SIEM, with a small CronJob to implement the safety net functionality.
Recommendations for GitHub organizations
If you own a GitHub organization/enterprise, the visibility gaps in audit logs discussed in this post may have scared you (and they probably should!), but there are still some things you can do today to reduce risk:
- Enable audit log streaming on your enterprise, including source IPs and API requests. Even if it’s just going to an S3 bucket nobody looks at, your incident response team will thank you later.
- Enforce the use of SSO on your GitHub organization, not just because SSO is good but because it forces an explicit authorization action by users to grant an SSH key/PAT access to your organization resources, instead of granting access implicitly. That way the PAT created for someone’s weekend project won’t have access to your organization resources.
- Enforce an IP allowlist for your organization from a set of known trusted VPN/corporate IPs. This is by far the strongest control (and the most painful to roll out) as it will prevent stolen credentials (even if still valid) from being used by an attacker except on the intended systems, where you (hopefully) have other visibility/alerting via EDR or related tooling.
- If you can, restrict access from personal access tokens to your organization resources. Blocking classic PATs and enforcing a maximum expiration (ex: 3 months) on fine-grained PATs is a great way to reduce risk if you can’t eliminate PATs altogether.
- If you use GitHub Enterprise Server (on-prem), configure collection of the raw HTTP access logs in addition to the native GitHub audit logs; it may prove critical during incident response.
If you are some random person in Nebraska who maintains a critical repository under your GitHub user, please consider moving it under an organization so you can gain access to the above controls.
My requests for GitHub
In the hopes someone who works on audit logs at GitHub has made it this far into the post, please consider my (roughly prioritized) list of feature requests/fixes:
- Resolve the gaps in api.request audit log events from the REST API when the response returns any organization-owned resources (ex: /user/repos, /search, /repos/{owner}/{repo}/contents/{path}, etc.)
- Resolve the gaps in api.request audit log events from the GraphQL API when the response returns or modifies organization-owned resources (ex: node, nodes, resource queries, or mutations that accept a node ID)
- Resolve the bugs that are preventing updates to the last_used fields for SSH keys and PATs.
  - For SSH keys this should include ssh git@github.com sessions used by secret scanners that don’t invoke git-shell.
  - For API keys this should include the /user endpoint used by secret scanners.
  - If the ~1 week backoff for updates to this field is intentional/necessary, please document it clearly.
- Please reconsider your position on source IP disclosure for audit log events involving public repositories, in particular if the actor is a member of the organization and/or has any privileged access to the repository.
- Add more details to the git.{clone,fetch,push} audit events; in particular, a list of OIDs being fetched/pushed could be incredibly helpful during incident response.
- Emit a git.push audit event when a repository is modified using the REST/GraphQL APIs, such as via the updateRef mutation or the /repos/{owner}/{repo}/contents/{path} endpoint.
- Consider adding an API for provisioning fine-grained PATs; this would at least allow automated rotation for APIs where a GitHub App cannot be used (and it would allow automated canarytoken provisioning).
- Consider adding an audit log (streaming) feature for api.request and git.{clone,fetch,push} events for GitHub App owners. This would allow app owners to proactively monitor for abuse of their GitHub App installation tokens/OAuth tokens (ex: the 2022 Heroku breach).
  - At the very least, if a GitHub App has done the right thing and configured IP allowlisting, send an email/alert when an otherwise valid token is blocked due to an invalid source IP, indicating compromise.
- Consider allowing users to opt in to audit logs as well (not just security logs). I don’t care if you have to charge money for it, or gate it behind owning repositories with enough stars; as we continue to see more supply chain attacks, there are too many high-value repositories owned by GitHub users instead of organizations.
- I’ve avoided mentioning Enterprise Managed Users (EMUs) (which actually allow programmatic access to security log events via audit log streaming) to reduce confusion and because they don’t add much additional value in the context of canarytokens. But they could! I would love to receive an api.request event for every API call (regardless of whether it targets an organization resource) when the actor is an EMU user, and I think there’s a strong argument for supporting this.
If you’re at GitHub and want to chat more about these, please don’t hesitate to reach out!