I am encountering the same thing when trying to browse code in repositories. God forbid you can't find what you're looking for in the first few clicks, or you get locked out. If I were a Wikipedia mod, this new "feature" of locking users out of open source code unless they have an account and log in would be the textbook example of enshittification, and your logo would be placed right next to that handsome photo of Cory Doctorow. Although this could also be a great r/LeopardsAteMyFace moment, as it was likely implemented to mitigate the AI bot hellscape you helped unleash on the net. Please remove this dark-pattern nonsense, as it is an insult to the open source ethos you once claimed to support. If you are reading this and are a developer or maintainer of an open source project, I highly encourage you to move your project to a less predatory and more welcoming platform like Codeberg.
If you are reading this and are a developer or maintainer of an open source project, I highly encourage you to move your project to a less predatory and more welcoming platform like Codeberg.
Unfortunately, I can see this becoming a common trope amongst every service whose necessary functionality depends on expensive (in terms of CPU time) API calls... AI training bots are nasty.
What's missing is the explanation of why AI training bots won't simply crawl while logged in to burner accounts. Right now it looks like usability is being thrown away because it is easy, not because it helps in any way.
Unfortunately, I can see this becoming a common trope amongst every service whose necessary functionality depends on expensive (in terms of CPU time) API calls...
Which can be solved by making it CPU-intensive for the scrapers. Though FOSS-adjacent metadata could be valuable enough that they would try to scrape it anyway. I'm trying, and failing, to see any way to interpret Microsoft's actions here other than as a big middle finger to anybody who is still hosting their projects on GitHub.
Whoa there! I just came across this. Please don't do this. As the comment above says, we have users of our project who don't have GH accounts and who would like to see more than three files in our repo.
It's on HN - https://news.ycombinator.com/item?id=43981344
The limit seems to be way higher now; it took me around 20 attempts to reproduce the issue.
How does this affect Golang code when using "go get" or "go mod download"? Sounds to me like it might fail when fetching dependencies.
e.g. when using GOPROXY=direct.
I wonder how GitHub handles Google's Go module mirrors (i.e. the default GOPROXY), given that SourceHut reported that they're quite aggressive in their fetching.
@Kryan90 By default, requests go through the module mirror hosted by Google. This proxy does not work (well) from everywhere. There are also reasons against using the proxy (e.g. data protection).
Even when using the proxy, the proxy itself will be sending a lot of requests. We can only assume Google and Microsoft have some kind of deal.
This does not address the issue that was raised. This issue is about blocking end users from browsing via a web browser using the web interface. Nobody is using curl, nobody is trying to clone, nobody is trying to use the API here. This is like getting blocked at the library because you are turning the pages of a book too fast. Saying you now have to create an account and log in to view something that was freely created and given to the world is a betrayal of the social contract and of the very community that built this platform, and a 180-degree turn from the intentions GH once claimed to have, hence my strongly worded admonishment above. A classic "bait and switch", if you will.
Hey folks 👋 - Martin from GitHub here. Thanks for reporting the issues. As a few folks mentioned, this was an unintended consequence of throttling limits we've had to put in due to the amount of anonymous scraping activity we've been seeing. However, it's not our intention to limit legitimate human browsing of open source code on GitHub, or to adversely restrict legitimate research and archiving activities. Sorry for the issues you've run into. As of 7:00 UTC this morning we've adjusted the throttling logic, and I'm seeing far fewer requests getting caught. Hoping this has stopped the problem for most folks now, but we'll keep monitoring until we've found the right balance.
It seems like bad faith to suggest there's something wrong with GitHub just because it got bought by Microsoft. Pre-acquisition GitHub was extremely unstable and lacked many features like reviews; since then they have become quite robust and, more importantly, so popular for OSS as to single-handedly change the face of computing forever. Seems like a pretty good contribution to me. Now GitHub, still mostly Ruby on Rails as far as I know, is trying to manage their load and may have had some issues, but I'm sure they are working diligently to help their users. What AI teams in a huge company like Microsoft are doing many buildings or even regions away doesn't seem to have much to do with them.
Can't we respect GitHub's contributions and treat them with good faith instead of being so negative automatically just because of the "M word"?
What AI teams in a huge company like Microsoft are doing many buildings or even regions away doesn't seem to have much to do with them.
Can't we respect GitHub's contributions and treat them with good faith instead of being so negative automatically just because of the "M word"?
Good grief. I must have simply imagined that GitHub specifically, was forcibly cramming some "Copilot" AI product down the throats of OSS users recently.
Thanks for pointing out that GitHub isn't involved in AI.
It seems like bad faith to suggest there's something wrong with GitHub just because it got bought by Microsoft.
Call it experience.
Pre-acquisition GitHub was extremely unstable and lacked many features
Does not match my recollection.
What AI teams in a huge company like Microsoft are doing many buildings or even regions away doesn't seem to have much to do with them.
I strongly disagree. If several departments of your business are effectively in an arms race with each other, you are doing a poor job running your business. And I will make sure mine does not depend on how well you do.
Can't we respect GitHub's contributions and treat them with good faith instead of being so negative automatically just because of the "M word"?
I did. For years. To this day I keep engaging with projects on GitHub, and I am not advocating against them. When they made the move to use 'our' code for training Copilot with no regard for the licensing terms, I thought 'the storm is coming'. And now look where we ended up. A proper solution would be to re-evaluate whether the costs of training LLMs without consent are worth the benefits.
Long before chatbots, Wikipedia was a place that 'gathered all knowledge of the internet', and people contributed. I am sure many programmers would be fine with their code being used in LLMs. You could get chatbot time in exchange for checking in your code with them. Then they wouldn't need to scrape. The core problem here is that the FOSS community is not considered when big tech makes a decision. And Microsoft promised differently. So now we call them out. And if they don't listen, this platform will lose relevance, like many others have before.
I am currently getting rate-limited after responsibly running a manual Python script with a token: it first uses PyGithub to check for the right list of a dozen or so commits (this works), then uses requests with the same token to fetch github.com/USER/REPO/commit/SHA1.patch (this is being rate limited).
I cannot get even a SINGLE request through to commit/.
api.github.com works fine. What is the logic in this? What are we "expected" to do?
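Since api.github.com works for you, one workaround (a sketch, not anything GitHub endorses as *the* fix) is to fetch the patch through the REST API instead of the github.com web URL: the "Get a commit" endpoint should return patch text when asked for the `application/vnd.github.patch` media type, and authenticated API calls get far higher limits. The helper name, owner/repo/SHA, and token below are placeholders:

```python
import urllib.request

def fetch_commit_patch(owner, repo, sha, token, open_url=None):
    """Fetch one commit as a patch via api.github.com instead of the
    github.com/.../commit/<sha>.patch web URL (hypothetical helper).

    Requests the application/vnd.github.patch media type from the
    REST "Get a commit" endpoint, authenticated with a token.
    """
    req = urllib.request.Request(
        f"https://api.github.com/repos/{owner}/{repo}/commits/{sha}",
        headers={
            "Authorization": f"Bearer {token}",
            "Accept": "application/vnd.github.patch",
        },
    )
    # open_url is injectable so the helper can be exercised without network
    open_url = open_url or urllib.request.urlopen
    with open_url(req) as resp:
        return resp.read().decode("utf-8")
```

No idea whether the throttling also hits api.github.com media-type requests at your volume, but it is at least the documented, token-scoped path.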
I suppose you could use git directly, but that’s harder to implement. It also would probably place more burden on GitHub’s servers, not less, but it’s harder to block without breaking unauthenticated clones altogether, so it might work for you for a while.
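A rough sketch of what "using git directly" could look like, assuming `git` is installed; the repo URL and SHA are placeholders, and this does a plain clone (where the server supports it, `--filter=blob:none` would cut the transfer down):

```python
import subprocess
import tempfile

def patch_via_git(repo_url, sha):
    """Export one commit as a patch by cloning and running
    `git format-patch`, sidestepping GitHub's web endpoints entirely
    (hypothetical helper)."""
    with tempfile.TemporaryDirectory() as tmp:
        # Clone into a throwaway directory...
        subprocess.run(
            ["git", "clone", "--quiet", repo_url, tmp],
            check=True,
        )
        # ...then render the single commit as a mailbox-style patch.
        result = subprocess.run(
            ["git", "-C", tmp, "format-patch", "-1", "--stdout", sha],
            check=True, capture_output=True, text=True,
        )
        return result.stdout
```

As noted above, cloning per patch is heavier on the servers than one HTTP GET, so this is a stopgap rather than a recommendation.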
I'm sure there's lots of things I could do to be more of a burden on Microsoft resources while evading their ChatGPT throttling, but there's only so much interest or energy I have to constantly rewrite my project's tooling to add more and more elaborate state tracking. :)
It is probably practical to do it anyway. But then again, probably the most practical thing to do would be to find a non-GitHub forge that is run with an interest in supporting FOSS communities.
Having this issue myself as of the last two days, only it happens to me when making HTTP requests with curl to anything GitHub-related. Changing IPs via a VPN temporarily solves it, but getting rate limited while making one request every couple of minutes seems extreme.
I was having this issue earlier. I was downloading about 10 STL files from a 3D printing repo and hadn't bothered to log in. In my case I think there needs to be some work done: there was no messaging on the site to indicate an issue, I just started getting 0B downloads with the appropriate name for the file I had requested. I only realized what was happening after opening the browser console and seeing the 429 coming back. I don't especially mind a rate limiting approach (within reason), but there should definitely be an actual message of some sort instead of acting as if it's downloading and handing over a 0B file.
There are many scripts floating around that install things, and most of them use unauthenticated HTTPS to pull down binaries or the scripts themselves. For some examples, check out Atuin's and Babashka's. With these new rate limits I'm getting 429s while using my Ansible scripts on my personal desktop machines. I could add authentication for the requests I manage, but the scripts themselves often run their own HTTPS requests. Is the expected outcome of this that all these third-party scripts now have to manage GitHub authentication?
In addition, projects such as Linux GSM make extensive use of unauthenticated HTTPS requests to GitHub. This will also require extensive work from the community to handle authentication. Even if that work is done it adds significant friction to usage.
Also, nobody should be forced to even have a Microsoft account to access FOSS code in the first place.
I understand why Microsoft wishes to add limits as the AI scrapers are a drain on resources. The problem is really that the unauthenticated limits are extremely low and hamper even light legitimate usage. I have a couple machines that I just tried to update by using my Ansible playbooks to run a few of the above named install scripts (Atuin, Babashka, and also Clojure). I run them occasionally, one machine at a time, and the second machine already hit the limit. Since I have a static IP, my entire house is now blocked from unauthenticated GitHub requests for the next hour.
I've been getting the same 429 Too Many Requests errors when updating from GitHub based APT repos.
Getting the same error on GitHub Actions. Any solution?
Dear GitHub Team,
While I understand the need to mitigate scraping, the current rate limiting on raw.githubusercontent.com and https://api.github.com/ is disrupting legitimate open-source workflows and automated scripts. A more nuanced approach is required to balance security with open access. Specifically, smarter throttling techniques and exempting GitHub Actions (including self-hosted runners) would greatly alleviate these concerns and reaffirm GitHub's commitment to open source.
Thank you for your consideration.
GitHub Releases is now completely useless for me due to rate limits. I have a project with an installer that auto-downloads from one of my repos' latest releases. Most of my users cannot use my project anymore due to being rate limited simply from downloading the installer and opening it. I'm having to tell them to wait, install things manually, or use a VPN to change IPs and try again.
I'm seeing this issue when searching for a public repository without logging in. I actually am unable to use GitHub at all; the very first search runs into this "rate limiting". This throttling essentially makes GitHub useless in the browser for non-logged-in users.
I also stumbled upon this when searching for a repository via github.com, without being logged in.
I literally entered github.com in my URL bar, clicked the search, and entered the name of a repository. Nothing else, no detours. Boom, immediate 429. If a single search is already too much, just removing the search bar would be a simpler solution...
I wonder how OpenAI is bypassing this rate limit.
I'm running my actions inside a self-hosted runner, and I can't push more than 4 or 5 releases a day
Do you think that after Microsoft bought GitHub for $7.5 billion, they would add such a rate limit feature to harm OpenAI, a for-profit organisation that Microsoft invested even more money in, so that they now own more stakes than any other stakeholder, including the non-profit OpenAI Foundation itself? If anything, this is meant to give OpenAI an advantage over their competitors. And forcing even more open source developers to sign up (and in!) to GitHub so OpenAI can get their usage data as well (and 'learn' from their search patterns in the context of their recent/future comments/commits) might be a nice little bonus. 😉
At this point I may just move back to raw Pastebin; at least one can put code as plain text there, and people can access and copy it if they so wish without getting a 429 thrown in their faces.
I’m seeing the same “temporary” 429 that somehow lasts 24+ hours and only affects logged-out /blob/ browsing. Meanwhile, /raw/, /blame/, /commits/, .patch, and .diff still work fine without authentication. Logging in fixes it instantly, so this looks less like abuse prevention and more like anonymous users getting caught in IP reputation/CGNAT collateral damage. Brilliant anti-scraping design: punish normal humans while barely slowing anything else down. If GitHub keeps enshittifying basic public browsing like this, people are obviously going to start switching to other platforms.
Me too, and I'm the only one browsing GitHub at home.
I just tried to look at the library's example, but it spits out a 429 error. Turning Wi-Fi off or reconnecting to the Internet solves it, but it's uncomfortable doing this every time.
User/reponame/blob/master/examples/example.html
@EiJackGH Please stop tagging me in these spam posts. One more from you and you are reported as spam and blocked forever.
Stupid question: couldn't they just block Microsoft IPs?
like with https://www.iblocklist.com/list?list=xshktygkujudfnjfioro
or maybe there needs to be a block list specific to AI training bots.
Couldn't they also make it against the EULA (to train an AI off the site)?
Also, send them (Microsoft or whoever) a bill just for the bandwidth used, without even legally extending it to the value of the stolen data.
@AenAllAin: Why would they do this? Microsoft owns Github. They want you to log in so they can track your behaviour across sessions so they can feed their AI with it.
@AenAllAin It's not just MS. It's tons of companies, some of whom are going to great lengths to hide themselves, which makes IP blocking very difficult.
And you can put whatever you want in the EULA; many will ignore it. See the various lawsuits at the moment.
As for billing them, first you have to work out who they are, and then they will probably ignore you anyway. Some people have got money back, but that's very rare.
https://about.readthedocs.com/blog/2024/07/ai-crawlers-abuse/
Hello,
The behavior you’re experiencing is expected and is related to GitHub’s rate limiting for unauthenticated users. When you access repositories without logging in, GitHub applies stricter limits on the number of requests that can be made within a short period of time. After exceeding that threshold, an HTTP 429 (Too Many Requests) response is returned temporarily.
This is why the issue appears after a few repeated file views and persists across different repositories — it is tied to your session/IP rather than a specific project.
How to resolve or avoid this:
Log in to your GitHub account — authenticated users have significantly higher rate limits
Wait a few minutes before retrying, as the limit resets automatically
Avoid rapid repeated requests when browsing without authentication
Regarding your feedback:
You’re correct that the current error message provides limited guidance. Improving user-facing messaging for rate limit scenarios would help clarify what’s happening and how to proceed.
In summary, this is not a bug in repository access but a result of GitHub’s rate limiting policy for unauthenticated usage.
Straight truth:
This isn’t something GitHub will “fix” — it’s intentional
The only real solution is log in or slow down requests
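For the "slow down requests" part, a minimal client-side sketch that honours the `Retry-After` header when the server sends one, and otherwise backs off exponentially. The numeric defaults are illustrative, not anything GitHub documents:

```python
import email.utils
import time

def retry_delay(attempt, retry_after=None, base=2.0, cap=300.0):
    """Seconds to wait before retrying after an HTTP 429.

    Uses the Retry-After header value when present (either a plain
    number of seconds or an HTTP-date); otherwise falls back to capped
    exponential backoff. `base` and `cap` are illustrative defaults.
    """
    if retry_after:
        try:
            return max(0.0, float(retry_after))  # e.g. "120"
        except ValueError:
            # HTTP-date form, e.g. "Wed, 21 Oct 2026 07:28:00 GMT"
            when = email.utils.parsedate_to_datetime(retry_after)
            return max(0.0, when.timestamp() - time.time())
    return min(cap, base * (2 ** attempt))  # 2s, 4s, 8s, ... capped
```

A caller would `time.sleep(retry_delay(attempt, resp.headers.get("Retry-After")))` before retrying; whether GitHub actually sets `Retry-After` on these 429s seems to vary by endpoint.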
@DurgaVaraPrasadBandi: Are you speaking on behalf of Microsoft/GitHub, or are you just sharing your suspicions as 'facts'? No offense; I largely agree with your assessment, but I would be curious to have an official source confirming that this is indeed company policy. As for the 'fix': I guess leaving GitHub behind and migrating to other forges will be the best solution long-term. Git is distributed by design; there is no need for a centralized upstream in principle. And maybe distributing the 'main' repo copy across different sites, ideally with a common way to log in and communicate (think fediverse), could give us all the advantages of GitHub without the lock-in effect and, as a side effect, distribute the (apparently unavoidable) bot load across many more shoulders.
I still think the answer is terrible code...
everyone should commit branches of anti-patterns and examples of "what not to do" along with their regular code.
You can even use AI to help, "Computer, rewrite my code in the form of a Shakespearean play."
For those who don't know, that is proper LLM/AI prompt etiquette: start every sentence by addressing it as "Computer," like this is Star Trek.
...together we help AI reach its full potential!