GitHub was down again

githubstatus.com

166 points by originof 5 years ago · 124 comments

arilotter 5 years ago

I was seeing PRs failing to update & webhooks failing to trigger upon pushing code for 30 minutes before GH's status page acknowledged anything. I'm surprised they don't have monitoring in place that would catch webhooks failing within minutes of the failure beginning.

  • malux85 5 years ago

    At large bureaucratic organisations there are often political implications to changing the official status, so it often lags behind reality until it can no longer be swept under the rug.

    Not saying it's right, just an observation

    • KingOfCoders 5 years ago

      As the CTO responsible for an often-failing eCommerce website that I took over and that lost millions when down, I fell into the same trap of trying to sweep things under the rug.

      Until I decided to no longer do that and my life improved considerably.

      • malux85 5 years ago

        Yeah it's very frustrating - especially if your customers are technical, they are seeing the errors and the status page says everything is fine.

        I've seen status pages and error counts tied to bonuses, which only caused a giant mess of bad incentive alignment and internal lies: customers are unhappy, developers are unhappy, management lies to upper management. It's so much easier to focus efforts on real problems and just be honest and improve. Thank goodness I don't work there anymore (cough cough Google)

        • penagwin 5 years ago

          Do these companies not have live error reporting and tracing? Like surely GitHub got alerts that things weren't working? Why don't they just hook up their status page and their alerts? Or is it a political/relationship thing, and they want to have a human give out the status page updates?

          This could have been caught with a cron job and some curl requests :\
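
          A minimal sketch of that kind of external check (assuming a hypothetical Slack webhook URL in $SLACK_WEBHOOK_URL; run it from cron every minute or so):

              # Alert if the GitHub API doesn't answer 200 within 10 seconds.
              STATUS=$(curl -s -o /dev/null -w '%{http_code}' --max-time 10 https://api.github.com/)
              if [ "$STATUS" != "200" ]; then
                curl -s -X POST -H 'Content-Type: application/json' \
                  -d "{\"text\":\"GitHub API returned $STATUS\"}" \
                  "$SLACK_WEBHOOK_URL"
              fi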

          • mjayhn 5 years ago

            In all honesty, they're typically banking on people not noticing, trying to make as little fuss as possible and get it back up before it hits Twitter. The problem is when it's not just a small blip: they haven't addressed it, it goes mainstream, and it's still down, which just leads to concerns about transparency.

            Building infra, I have to work around all sorts of 3rd party services going out or having blips throughout the day (docker registries, caches, bgp, etc.). It's totally an expected part of infra design, but not every team has the time or need to build in the resiliency. I see tons of outages that never get reported or IMO aren't reported adequately.

            With that said, I'm no angel, I get all my service down notifications through Slack, so when Slack's down..

    • JMTQp8lwXL 5 years ago

      What's the point in having a status page if it's a political artifact? In that case, it serves zero customer value.

      • opportune 5 years ago

        It’s still an indicator that “no you’re not crazy, we’re having issues on our end” but not a foolproof one. It’s kind of sad that twitter is usually the best place to confirm an outage as it begins, rather than the software providers themselves. I assume if they actually exposed global availability metrics in most cases it would not look as good as they would want it to

        • paulie_a 5 years ago

          Down detector is based on twitter complaints and is pretty damn accurate.

    • tfolbrecht 5 years ago

      3 words: Service Level Agreement

      I've caught a big cloud provider not reporting a degraded service. I assume they knew, but politics and $ come in and it's easier to just gaslight everyone. I get it, but my frustration is worth losing a trailing 9.

      I think there should be some 3rd party continuously testing APIs. Degraded states are downtime!

    • tdeck 5 years ago

      Honestly I think this can be true at any size organization. Small startups often take the approach of "let's hope no one noticed while we try to fix it", it's just that they have fewer users to notice so it's more likely to work.

  • njsubedi 5 years ago

    They used to have real-time graphs and stuff on their status page. That's a thing of the past; with a more distributed system, they're probably not sure whether the service is down everywhere. If a node somewhere is still up, they might consider the service up. I don't know much, but it depends on the kind of downtime measurement system they have.

  • emilfihlman 5 years ago

    That's the way it is _everywhere_.

    For example status.digitalocean.com is _not_ real time, it's manually updated.

    And it's irritating as fuck.

  • chrispauley 5 years ago

    Had the same issue this morning. The lagging status always causes the issue of "is it you, me or GitHub?" snaffoos. Really annoying to have these issues so consistently. Would switch to gitea or similar in a moment given the choice.

    • hyperdimension 5 years ago

      Just a friendly correction: "SNAFU": Situation Normal: All Fucked Up

      • hinkley 5 years ago

        This misspelling brought to you by foobar.

        Foobar: for when you are too polite to say FUBAR (Fucked Up Beyond All Recognition).

        • grecy 5 years ago

          My two favourite lakes in the Yukon - SNAFU and TARFU (Things Are Really F'ed Up).

          Named, of course, by the Army when they built the Alaska Highway.

      • chrispauley 5 years ago

        I had never given any thought to this word, as I had heard and used it since childhood. I had no idea that was the origin!

silasdavis 5 years ago

GitHub Actions downtime is becoming painful for us. Having been lured in with 10,000 included minutes, which they shortly thereafter dropped to 3,000, I feel aggrieved paying for overages incurred from Actions regularly shitting the bed.

  • carstenhag 5 years ago

    We're also seeing outages at Azure DevOps Pipelines every other month or so, it seems. And that's a paid product - for hours there's no mention on the status page and we're stuck, unable to merge PRs or release our app in the standard way.

    • hinkley 5 years ago

      Paying Saucelabs customer here.

      It's gotten more reliable over time (especially selenium events being dropped on the floor causing tests to stall and fail), but I used to have to babysit it quite a bit and there were quite a number of times where IE instances just would not spool up (with a multiple minute timeout set). Sometimes it was a one-shot thing, other times it went on for hours.

      During these incidents the average allocation times listed on their status page would double for Windows VMs (I don't recall the exact numbers but they were on the order of 10 seconds vs 5) but nothing would be red, and most of the time nothing ever did go red.

      And that's what you get for using averages for things and dividing infinity by n improperly.

    • atraac 5 years ago

      This is weird, because we haven't encountered any real issues with agents in Azure DevOps Pipelines. I think we've maybe had a single downtime in the last 6 months. They recently removed the .NET Core 2.2 SDK without any notice and broke our builds, but that's another thing.

  • penagwin 5 years ago

    GitHub Actions has been a huge letdown for me. The uptime issues and the lack of support for so many basic CI features are killing it for me (and have been for a year).

    The only reason we're using it is because it's free..

    • kwhat4 5 years ago

      Honestly I've had the opposite experience. With so many community actions available, I've had little trouble finding anything I could dream up. Sure, some of the actions features are a little immature but they are improving with time. The uptime issues are annoying and I feel like the lack of transparency is not helping that situation, but as far as CI solutions go, I feel like my move to actions has been a great way to get up and running with far less effort than other offerings like Code Pipeline.

bob1029 5 years ago

Here we are again: me taking a break on Hacker News because all my webhooks and pull requests are fucked and I have no idea where my devops tools stand relative to the real state of affairs. I have pretty much had enough of this. It is too disruptive to our process. It is causing fragility and loss of confidence in our build pipeline.

At this point, we would probably be better off just bolting some lightweight git solution onto our devops tools (which are 100% custom, in-house developed), rather than fighting with some more-durably-hosted offering from GitHub et al.

Anyone who posts the "but you can't make it more reliable than Microsoft" line is not thinking about the dependencies between systems and the considerable impact incurred on a service just by virtue of it being a publicly accessible platform without any cost barrier to entry. Sure, bringing it in house might bring additional difficulties, but I think I can eliminate a shitload of existing difficulties if we moved from webhooks across the public internet to a direct method invocation within the same binary image.

  • dabeeeenster 5 years ago

    We've been self-hosting GitLab CE for a couple of years. It's been great. No downtime, upgrades are seamless, it's fast, it works.

    • bob1029 5 years ago

      Gitlab is probably at the top of the list of candidates if we go down this road. I don't necessarily need it to be in the same binary as my devops tools, but certainly no further than localhost or another machine on the same network.

  • mattstrayer 5 years ago

    or, you know, host a gitlab instance yourself & call it a day

zelly 5 years ago

https://gitea.io/en-us/

https://git.zx2c4.com/cgit/

https://about.gitlab.com/install/?version=ce

  • hundchenkatze 5 years ago

    I've been having a great experience with https://sourcehut.org/ as well.

  • INTPenis 5 years ago

    I'll use your comment to say that Federation[1] has also been discussed in Gitlab for 2 years now.

    Frankly I can't wait. Imagine being able to reference other users across instances with @username:instance or something to that extent, or projects and tickets.

    1. https://gitlab.com/gitlab-org/gitlab/-/issues/6468

  • nine_k 5 years ago

    The first two links miss the idea a bit, I'd say.

    I don't often need a web interface to a git repo. I can pull and do everything locally.

    What I do use GitHub for is (1) code review and approval process, (2) CICD / actions, (3) releases to push stuff out.

    The branch / tag / file browser is a nice addition, but it's not key. Rendering README.md is almost as important, if not more.

  • vorpalhex 5 years ago
    • sdesol 5 years ago

      For those that don't know, Gitea forked from Gogs a while back and they are very much being developed with two different philosophies. If you take a look at the active contributors for Gitea and Gogs, you can tell how much they differ now.

      https://imgur.com/ZExNVV4

      https://imgur.com/v0fGXgv

      There are two active contributors for Gogs, while Gitea has 27. Note: the number of contributors can't tell you whether one has higher quality than the other; I just wanted to point out the difference in development philosophy.

      Given that Gitea has significantly more active developers working on it, we can probably assume it can add functionality faster than Gogs though.

niftylettuce 5 years ago

There have been at least three major outages (e.g. git clone of a repo failing) in the past week alone. All three have gone unreported (and are NOT shown on their incident page), but I have email confirmation from GitHub support of these issues. It's almost time to switch to GitLab. I have hundreds of repositories, organizations, and packages to transfer, and while it will be daunting... I need reliability. I have several paid GitHub orgs and accounts as well.

  • ShorsHammer 5 years ago

    To be fair they've been busy fixing the issue of slavery nomenclature in that time too. Respect where respect is due, important issues are being tackled here, you can't do everything at once.

    https://twitter.com/natfriedman/status/1271253144442253312

    • forty 5 years ago

      I can't tell if this comment is being sarcastic, but for the record I found it hilarious. Thanks :)

    • 0xy 5 years ago

      Hilarious. GitHub has their fingers on the pulse of what developers and their customers really want. Not stability, but pretending to do things to help POCs through mindless censorship.

rollulus 5 years ago

GitHub used to have a pretty cool status page, with all kinds of real-time graphs. Does anyone know what happened to it? It makes me really sad that this status page is a plain lie; I had to visit HN to get confirmation that they are having issues again, and that it wasn't just me.

  • originof (OP) 5 years ago

    I read an article about it, look for "status page evolution" https://nimbleindustries.io/2020/06/04/has-github-been-down-...

    • hinkley 5 years ago

      > But that could be all a part of coordinated effort to be more transparent about their service status, an effort that should be applauded.

      Microsoft could be pushing for transparency. Or people are more relaxed about transparency now that GitHub has its exit. How long did GitHub know they were looking to be acquired? Maybe this analysis should look at a longer time interval..

    • kohtatsu 5 years ago

      From the first two graphs it looks like they are a lot less liberal about using "down" instead of "warn".

      • hinkley 5 years ago

        The best triage policies I've ever gotten to work with had severity and priority separated.

        Severity went something like this (sometimes the numbers flip, which always confuses at least 20% of the team about whether things are almost normal or people are hunting each other for sport):

        1: data loss

        2: some workflows blocked

        3: some workflows unavailable w/ workarounds (ie other routes)

        4: Everything else except

        5: Irritations

        Having a UI break while the underlying functionality still works is not good, but people can still do their jobs, if more slowly. It's important to classify these separately from S2 and S4. There is urgency, but don't panic: go eat lunch or have your planning meeting, then go fix it. If data is getting lost, ain't nobody doing nothin' until we figure it out, and then some people can go back to work, but don't interrupt the people still working on it.

        I think the problem is that so many metrically dysfunctional people, to the point of cliché, have rationalized that an S2 means "only 20% of our customers can't do their jobs, so we are degraded but still working normally", when really a yellow status should be reserved for S3, and S2 should be at least orange, even though those affected will be upset that it's not red.

        Over time that 20% will shift around to cover most of your customers, eventually several times over, and then you'll wonder why everyone is talking trash about you on HN. It's not like that many people were affected!

  • uberman 5 years ago

    The status page clearly states (when I look) that:

    Incident on 2020-07-15 15:41 UTC We are investigating reports of degraded performance. Posted 9 minutes ago. Jul 15, 2020 - 15:41 UTC

    • rollulus 5 years ago

      Currently it does, indeed. From my Slack logs, at 15:00 UTC I noticed problems. I'm pretty sure that message is manually created, at least 41 minutes after the fact.

      • luckylion 5 years ago

        That's the most annoying thing. Usually when I get notifications from monitoring about some issue, the first thing I do is check the vendor or provider's status page to see whether it's an issue on their end. If there's nothing, I go and investigate.

        Recently, more and more of them take 10-15 minutes until they mention a service outage. I don't work in super HA, and I don't want to get an alarm because a single ping failed etc., so I'm lenient and allow a few minutes of delay in my alarms. If I'm writing an internal incident report before the official status page is updated, that's bad.

        This seems similar: external users noticing the outage and posting on HN before GitHub notices & acknowledges it.

        • tpetry 5 years ago

          Idea for a startup: a paid service that does independent health checks against popular services, with the ability to select which services I'd like health-status notifications for.

DevKoala 5 years ago

The company I work for moved to GitLab because we have been pessimistic about GitHub for the past few years. I don't really have a strong opinion on which is better, though; I still keep my private repositories on GitHub. However, I feel that Microsoft will start feeling the pain soon as more people in the development community sour on GitHub.

  • ldiracdelta 5 years ago

    Why do you think they will ever feel the pain?

    • DevKoala 5 years ago

      GitHub had been in growth mode up until the acquisition. If GitHub stops being the nirvana for developers that it once was, it will be another dark mark in the history of MS acquisitions. Moreover, considering how sentiment-influenced the stock market is at the moment, continued news of one of their products having outages could easily shed a considerable amount of Microsoft's valuation, ~1%. They say stocks only go up nowadays, but when everything goes up, whoever grows at the slowest rate is really going down. I'd assume that the Microsoft executive team won't be happy with the new perception of GitHub.

cameronfraser 5 years ago

Why has pre-acquisition outage history been removed? If you try to go back in time, it seems they only retain history up to a couple months after the acquisition. Is this just a 2-year retention policy or something being swept under the rug?

  • ketzu 5 years ago

    I can go back all the way to 2010: https://imgur.com/DsSKcFV

    • mardifoufs 5 years ago

      Wow, it used to be so much more detailed! I get they probably can't have that level of "casual" disclosure now that they are so big, but man the current status updates just feel so... useless and unhelpful in comparison.

  • jkaplowitz 5 years ago

    I have never worked at GitHub or MS and have no inside info on this, but it may be as simple as having switched to an MS-run system for outage history tracking as part of their own M&A integration.

  • CameronNemo 5 years ago

    This change must have happened this year. I remember comparing pre and post acquisition outage rates a few months ago.

    If it is not purposefully being swept under the rug, it sure is convenient.

  • bdcravens 5 years ago

    I think it's worth noting that this also corresponds with the Actions feature (went GA 11/11/19).

Animats 5 years ago

What's the easiest way to duplicate all your Github repositories, with history, somewhere else?

Ideally, I'd like to have two synchronized repositories, for no single point of failure, organizational or otherwise.

  • monokh 5 years ago

    Run git on a personal server[1]? It's not as complicated as you might think. It's probably much more usable to set up GitLab, though.

    Then set up the alternative remote on your repos.

    [1]https://www.linux.com/training-tutorials/how-run-your-own-gi...

    • rovr138 5 years ago

      I posted this on another thread. If you only want the commits, something like this works,

          ssh user@git.example.com
          mkdir project-1.git
          cd project-1.git
          git init --bare
          exit
          git remote add alternate user@git.example.com:project-1.git
      
      All you need is SSH.

      • judge2020 5 years ago

        All you need to get all commits and all tags/branches:

          git clone --mirror https://github.com/you/repo
        
        Push those to another server:

          git remote add new https://gitlab.com/you/repo
          git push --mirror new

      • kenniskrag 5 years ago

        I also use it that way. GitLab does the same if you use SSH. They provide hooks on init so that GitLab knows if something happens.

    • johannes1234321 5 years ago

      That's fine for your code repository.

      However, GitHub has more:
      - Bug tickets
      - PRs
      - Wiki
      - many projects use their GitHub Pages as their primary homepage
      - and, more recently, GitHub Actions

      And then: collaborators are often only known and identified by their GitHub handle. Running your own server requires some mechanism to identify them again and a way to handle their access credentials (SSH keys etc.).

      Moving a mildly successful project isn't easy. Good if more people plan for that eventuality, even if they stay on GH for the time being.

  • ddevault 5 years ago

        git remote set-url --add origin git@somewhere.else:my-project
    
    This will make it so that every time you push to "origin", it'll push twice, to two places. You can repeat this to add a third or more.

  • vorpalhex 5 years ago

    Gitlab has a feature set for this that's automated but it may be rolled under a paid plan if you want it to be bidirectional.

    Building this in git itself is not hard at all, and there's likely a script or plugin for gogs or gitea.

  • ebg13 5 years ago

    Other answers talking about using git features are assuming that you don't care about Wiki/PRs/Issues/Labels/etc that are GitHub metadata not part of your repo history.

    But GitLab does have feature support for extensive importing and mirroring. https://docs.gitlab.com/ee/user/project/import/github.html (Import your project from GitHub to GitLab) has a section on project mirroring.

  • throwaway744678 5 years ago

        git remote add NAME URL
        git push NAME
    
    It will not transfer GitHub-specific content (issues, PRs, wiki, etc.), though.

  • kenniskrag 5 years ago

    git clone ... which you already have

    git remote add originFoo URL

    git push --all originFoo

    There are other flags like --mirror, but I've never used them.

    Source: https://git-scm.com/docs/git-push

  • jankassens 5 years ago

    You could probably use GitHub Actions to push any commits and branches to GitLab or anywhere else.

miguelmota 5 years ago

Where does GitHub publish post-mortems of downtime? I only see things like "We have deployed a fix and are monitoring recovery." in the GitHub status history, which doesn't provide details.

originof (OP) 5 years ago

The latest commits don't appear in the commits tab.

varbhat 5 years ago

For a free hosted Gitea instance, you can try https://codeberg.org

rvz 5 years ago

Do I have to repeat this over and over again? If these non-profit open-source projects [0] are able to self-host a git solution like a GitLab, Gitea, cgit or Phabricator instance somewhere, surely your team or open-source project can too.

Even a self-hosted GH Enterprise would suffice for some businesses but this would be overkill for others. I even see the Wireguard author using his own creation (cgit) to self-host on his own git solution for years. [1]

This is problematic, since many JS/TS, Go and Rust packages that developers rely on are hosted on GitHub. Thus, it would be risky to tie an open-source project to GitHub-specific features (Actions, Apps, etc.).

[0] https://news.ycombinator.com/item?id=23818020

[1] https://git.zx2c4.com

  • nurettin 5 years ago

    Wait, will you really repeat this until you no longer see a "github down" link on HN frontpage?

    That is dedication.

kchoudhu 5 years ago

That's it, I'm launching SubversionHub.

zymhan 5 years ago

How do they not have a single update in over 1.5 hours? This is ridiculous.

yizhang7210 5 years ago

Ugh the last month has been pretty difficult. Hope they get better soon.

neurostimulant 5 years ago

So that's why my automated build wasn't triggered ~4 hours ago. I was like "no way GitHub is having issues again, they were down just the other day; it's probably just Docker Hub's fault". If they decide to publish a blog post about this series of outages later, I bet it would be pretty interesting.

donatj 5 years ago

It's been having issues all day. Wanted to show a coworker some changes I was proposing but the site wouldn't show the changes I'd pushed to my pull request. Ended up just having him pull the changes.

FWIW the git backend always seems rock solid in comparison to the front end they have displaying it.

  • throwanem 5 years ago

    I'm not sure this time. I had a PR update and kick off a build half an hour or so ago, only to see the build fail because git couldn't parse what it got from the clone operation.

rydre 5 years ago

I really want to move to GitLab, but its UI is atrocious... it looks too much like a mobile app on desktop.

mikewhy 5 years ago

Noticing issues on GitHub, CircleCI, and LaunchDarkly.

may4m 5 years ago

I had a problem with GitHub a while ago when I tried merging a PR to the master branch: the merge commits showed up on master but the PR was still open. I would repeatedly click the merge button but the PR wouldn't show as merged.

mlang23 5 years ago

Likely unrelated, but I recently noticed that GitHub stopped updating my activity overview for July. I definitely pushed commits, but they don't show up. Anyone else having a similar issue?

marcinzm 5 years ago

What is GitLab like in terms of downtime? I looked at their status history page and I'm seeing a lot of incidents, but it's hard to figure out what that actually means.

  • s_dev 5 years ago

    That's why the self-hosted options are there -- and why GitLab has a competitive advantage in this sense.

    Cloud solutions are great -- however, they have a golden rule: don't go down, ever. This is seriously damaging to GitHub's reputation.

    • hn_throwaway_99 5 years ago

      > That's why the self-hosted options are there -- and why GitLab has a competitive advantage in this sense.

      GitHub has GitHub Enterprise

      > Cloud solutions are great -- however, they have a golden rule: don't go down, ever.

      Oh where oh where can I sign up for this mythical unicorn cloud service?

  • ocdtrekkie 5 years ago

    IIRC GitLab is super transparent about service outages, so it may not be that they're less reliable, but that they're more honest about it.

  • neurostimulant 5 years ago

    GitLab used to have more outages than GitHub, but these days they're about the same or even better. Also, they're really transparent about handling outages. They post a link directly to the issue page on their status page, so you can see all those GitLab employees frantically trying to restore the service. I was pretty mad during the last outage because I couldn't finish my work, but after checking the issue page and seeing how hard they work, I felt bad and decided to cut them some slack :)

talkingtab 5 years ago

Running your own git server is trivial. I have been doing it for years on a very cheap DigitalOcean instance. Set up SSH keys, lock it down with ufw, done.
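
For anyone curious, a rough sketch of that setup on a Debian/Ubuntu box (the hostname and repo path below are placeholders):

    # On the server: dedicated git user, bare repo, SSH-only firewall.
    sudo adduser --disabled-password --gecos "" git
    sudo -u git git init --bare /home/git/project.git
    sudo ufw allow OpenSSH && sudo ufw enable
    # Put your public key in /home/git/.ssh/authorized_keys, then locally:
    git remote add backup git@git.example.com:/home/git/project.git
    git push backup --all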

If that is not enough, run your own instance of GitLab.

If that is not enough, use hosted GitLab.

Microsoft is going to attempt to make a profit on GitHub. That's okay, but based on past experience and current issues, their business model is lock-in, not service.

I suspect the same is true for NPM.

stunt 5 years ago

At the current rate, whatever you host yourself will have better uptime than GitHub.

drcongo 5 years ago

They're probably using the Facebook SDK /s

revskill 5 years ago

Scaling Rails is hard? GitHub needs to move to a CDN and static-site deployment instead.

juped 5 years ago

Git is distributed.

MattGaiser 5 years ago

Did Microsoft adopt Scrum?

josefrichter 5 years ago

The Microsoft Effect

iso947 5 years ago

Don’t host yourself, it’s impossible to meet the reliability of the professionals

  • wizzwizz4 5 years ago

    No, it's not. Apart from scheduled downtime when nobody's using it (e.g. restarts in the morning to update the kernel), it's not that hard to beat GitHub's uptime for a small Gitea instance. My power's on more than GitHub is up.

    A UPS and a tethered smartphone would get me three nines uptime-while-anyone-needs-it, which is well in excess of what I need.
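
    (For reference, three nines is 99.9% availability, which works out to about 0.001 × 8,766 hours ≈ 8.8 hours of allowed downtime per year.)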

  • tapoxi 5 years ago

    We migrated to on-prem GitLab running on k8s via the official Helm chart a year ago. We have ~50 users and so far have only had downtime when they required us to migrate from PostgreSQL 9.6 to 11 with the release of GitLab 13, and that was planned. We upgrade multiple times a month to stay up-to-date with the latest patches, and it's painless.

    I don't regret it.

  • subssn21 5 years ago

    I think a one-size-fits-all "host it yourself / don't host it yourself" answer is the wrong approach. Some organizations that have dedicated devops people and can easily maintain their own servers may be able to get better uptime and reliability from their own instance. For smaller shops that don't have the time or expertise, I think it is true that Git hosting is one of the many services that should be handled by a cloud service, whether GitHub or GitLab (or someone else).

  • s_dev 5 years ago

    ... HN is mostly professional software developers though. Running a GitLab instance really isn't hard.

  • ed1222190 5 years ago

    I used to agree. Now I work with a locally hosted GitHub. It is down all the time, and sometimes it just deletes all of the work from the past day. I thought it wasn't possible to do much worse, but I was obviously wrong.

    • js4ever 5 years ago

      At work I'm managing a GitLab instance for 15k users and 5k projects. Uptime has been 100% for a year, except for a few minutes of planned downtime every month for the monthly upgrade. To be honest, I expected it to be a lot harder and to run into trouble... but I always find answers quickly in the GitLab docs or forum.

  • syshum 5 years ago

    Hmm, I am a professional, so it is easily possible to meet my own reliability.

  • bdcravens 5 years ago

    For many organizations, that's still true.
