UniSuper members go a week with no account access after Google Cloud misconfig

theguardian.com

186 points by pgreenwood 2 years ago · 45 comments

jo-m 2 years ago

I had something similar happen once at my previous job. The company was using Google Workspace and GCP. The person who had set up GCP initially left the company, and a month later HR deleted his Google account. Little did we know, the payment profile was attached to the GCP projects through his account, so deleting his account effectively removed the CC. All our stuff was shut down immediately (within minutes).

At first we had no idea what was going on. GCP support just told us "it seems you deleted your CC". Eventually, we figured out what had happened.

We set up a new payment profile and started migrating our GCP projects to it. Eventually we had to create several of them, because there is an arbitrary quota on how many projects you can have per payment profile (~4), and support told us it would take days to increase it.

Fortunately, all our data was still there. However, support had initially told us it was "all gone".
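
For anyone untangling the same thing, a minimal sketch (the project and billing-account IDs are placeholders, and it assumes the google-cloud-billing Python client) of checking which billing account a project is linked to and re-linking it to a new one:

    # pip install google-cloud-billing
    from google.cloud import billing_v1

    client = billing_v1.CloudBillingClient()

    # Inspect the current link; billing_enabled goes False once the
    # payment profile behind it disappears.
    info = client.get_project_billing_info(name="projects/example-project")
    print(info.billing_account_name, info.billing_enabled)

    # Re-attach the project to a replacement billing account.
    client.update_project_billing_info(
        name="projects/example-project",
        project_billing_info=billing_v1.ProjectBillingInfo(
            billing_account_name="billingAccounts/000000-000000-000000"
        ),
    )

Scripting it this way at least makes it obvious which projects are still pointing at the dead profile.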

  • bongodongobob 2 years ago

    That's why you always use service accounts for those kinds of things. Admin, billing, etc. Never let a "daily driver" account hold the keys to the kingdom.

    • bombcar 2 years ago

      The fun part is when startupserviceaccount@gmail.com or whatever gets flagged as not being a "real person's name" and then deleted.

      • bongodongobob 2 years ago

        I've never heard of this and I'm not aware of any requirement to use a real name. I tried googling and I'm not seeing people having that issue.

        • bombcar 2 years ago

          There's not currently a requirement for a real name (though Google did push that at one point, when it went nuts with Google+), but they do really strongly push a cell phone number, which can easily end up attached to someone who is later no longer with the company.

          You need to manufacture a persona (with password hints, a cell phone plan, etc) to really be secure - or have multiple avenues to access your system.

        • andrewinardeer 2 years ago

          Their point still stands. A false positive algorithmic account deactivation coupled with the impossibility of getting a human to review the decision is a very real scenario.

    • candiodari 2 years ago

      Unfortunately, personal accounts come with a usable quota; service accounts have to go through all the approvals (at least four: getting allocated at all, which cost center, what resource allocation, what scheduling priority, and this is assuming you need zero "special permissions").

      Doing with service accounts what you can do with a personal account takes weeks before you can even get started, and it informs the whole management chain of what you're doing, which means every manager who could complain about it learns exactly the right time to complain to be maximally obstructionist.

      Or to put it perhaps less ...: using service accounts requires the company's processes to be well thought out, well resourced with people who understand the system (which, as this incident shows, they don't even have at Google itself), well planned, and generally cooperative. Often, there will be a problem somewhere.

dualscyther 2 years ago

The title I'm seeing is

> Google Cloud accidentally deletes UniSuper’s online account due to ‘unprecedented misconfiguration’

which is a lot more alarming.

I've heard of sudden and unwarranted bans before, but never an accidental deletion of a customer who they only just convinced to migrate to Google Cloud last year!

  • abrookewood 2 years ago

    Pretty much what this update says: https://www.itnews.com.au/news/unisupers-google-cloud-enviro...

    Just incredible that their entire account, spanning two geographies, was entirely deleted.

    • tiew9Vii 2 years ago

      Yes, surprised this hasn't hit the top of Hacker News and instead gone unnoticed. If Google did delete the account, this is massive.

      A large financial pension fund with an advertised $124 billion in funds under management (so not some toy cat-gif startup) has its account accidentally deleted by Google. That can very easily wipe out a company that relies on the cloud the way cloud vendors encourage you to. From the article it sounds like they were lucky to have offsite backups, but there was still the potential for data loss, and restoring offsite backups is likely a task in itself.

      It's a major incident, and I feel for the ops team, who'll be working under massive pressure trying to get everything back up.

      • dualscyther 2 years ago

        Indeed. My gut feeling is that most companies using AWS, Azure or Google Cloud are not making backups elsewhere. I wonder how much data would've been lost if they hadn't had backups elsewhere?

        • octodog 2 years ago

          Interestingly the Australian financial services regulator (APRA) has a requirement that companies have a multi-cloud plan for each of their applications. For example, a 'company critical' application needs to be capable of migrating to a secondary cloud service within 4 weeks.

          I'm not sure how common this regulation would be across industries in Australia or whether it's something that is common in other countries as well.

          • toomuchtodo 2 years ago

            US federal financial regulators and NYDFS have similar concerns and strong opinions, but nothing in statute or regulatory rule making yet (to my knowledge; I haven’t had to sit in those meetings for about 2 years).

        • abrookewood 2 years ago

          I would imagine all of it. The underlying data is going to be on encrypted volumes and I'd expect deleting them would render the data non-recoverable.

      • sumedh 2 years ago

        > surprised this hasn't hit the top of Hacker News and instead gone unnoticed.

        Most people outside Australia don't know about UniSuper, so HN probably assumed it's a small company and not really important.

taspeotis 2 years ago

> UniSuper was able to eventually restore services because the fund had backups in place with another provider.

Lesson learned: back your GCP data up to a real cloud like Azure or AWS.

  • olliej 2 years ago

    No, back up your data to a service independent of the service hosting what you're backing up.

    It reminds me of "I use box/iCloud/some-other-cloud-drive-service so I don't need backups" from people who don't understand that the model is "I deleted/broke the data on my machine, and then synced that data loss to every machine".

  • bananapub 2 years ago

    No, just the usual: obviously, always back up data somewhere normal processes can't delete it.

    That means not on the same cloud infra as the rest of it, but also different creds and different access paths.
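
    A rough sketch of that idea (bucket names and the credential file are placeholders, and it assumes the google-cloud-storage and boto3 clients): pull backups out of GCS with one set of credentials and write them to a different provider with a completely separate set.

        # pip install google-cloud-storage boto3
        import boto3
        from google.cloud import storage

        # Read from GCS with a dedicated, read-only service account key.
        gcs = storage.Client.from_service_account_json("gcs-backup-reader.json")
        # boto3 picks up its own, unrelated AWS credentials (env vars/profile).
        s3 = boto3.client("s3")

        for blob in gcs.list_blobs("example-gcs-backup-bucket"):
            # download_as_bytes() loads each object into memory; fine for a
            # sketch, a real backup job would stream or use multipart uploads.
            s3.put_object(
                Bucket="example-s3-backup-bucket",
                Key=blob.name,
                Body=blob.download_as_bytes(),
            )

    The point being that nothing holding GCP credentials, human or automated, can touch the copy on the other side.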

sidcool 2 years ago

Google Cloud accidentally deleted a company's entire cloud environment (UniSuper, an investment company which manages $80B). The company had backups in another region, but GCP deleted those too. Luckily, they had yet more backups with another provider.

siquick 2 years ago

They only finished migrating to Google Cloud last year.

https://www.googlecloudpresscorner.com/2023-06-05-UniSuper-A...

steve_taylor 2 years ago

Whoever made the decision to use GCP should be fired. Google's habit of randomly deleting accounts and their resources is well known. Somehow there wasn't anyone in the whole organisation who knew about this risk, or Google had convinced them it doesn't happen to big players.

This article doesn’t challenge the assertion by Google that this is a once-off, which is really sloppy journalism.

bananapub 2 years ago

I really would like to hear the actual-actual story here, since it's basically impossible that it really was "Google let customer data be completely lost in ~hours/days". This is compounded by the bizarre announcements: UniSuper putting up TK quotes on their website, which Google doesn't publish but also doesn't dispute.

If a massive client came and said "hey, our thing is completely broken", there would have been a war room of SREs and SWEs running 24/7 across two continents until it wasn't.

octodog 2 years ago

Hopefully there is a full RCA published once the services are fully restored. This is really concerning for any GCP customer.

dankotanko1599 2 years ago

It's just mind-boggling that their architecture allows this to happen so quickly, IMO. There are so many resources and dependencies that completely nuking a cloud account cannot and should not be easy or fast, and should not actually be possible for the cloud vendor. I suppose they need to guard against someone setting up costly infrastructure and doing a "runner" (letting a credit card lapse), but in that scenario deleting all the customer's data should be the absolute last resort, after it's been reasonably determined that they are being malicious. How does AWS manage these scenarios? I'm sure they follow up multiple times before hitting the nuke button. In fact, they know their "larger accounts" and treat them with special privileges and assurances. UniSuper is not a small fish.

ecliptik 2 years ago

Did they think it was leftover Google Reader infra?

rswail 2 years ago

The choice of cloud operators is down to two. How is it that Google can screw up so badly?

  • neffy 2 years ago

    Hate to disillusion you, but exactly this happened to us on AWS a couple of years ago. A month's research compute - no biggie, but no more AWS for us either.

    • chucky_z 2 years ago

      To counterpoint this with 'not for thee, but for me': if you are spending > $1m/yr with AWS, this will never happen. The decisions have to go through a TAM, who will block this kind of thing.

      For smaller users I can imagine this sort of thing happens pretty regularly with every cloud, even smaller ones.

      Please reply if you had a TAM and were spending that much! I'd be personally interested to hear that was the case.

      • neffy 2 years ago

        No, we had the lowest tier (research - funding is an issue). But more seriously, SMEs will get trapped in this as well, and if I had $1m/yr I would definitely be running my own datacenter.

        They offered to investigate if we paid for support, I counter-offered with not using the chat script in one of my courses as an example of AWS customer "support", and we ended up getting a full refund at least.

  • sidcool 2 years ago

    You mean AWS and Azure?

dehugger 2 years ago

My previous employer had this happen with OCI (Oracle's cloud). Hundreds of servers were suddenly deleted due to an internal billing problem on Oracle's side.

Luckily they were able to restore from backups, but it took a full day and there was still significant data loss (the delta since the last backup).

Since those servers were used to host the company's own cloud-managed offering, it ended up affecting all of their own cloud customers as well.

delduca 2 years ago

I bet it was some stupid AI running in the background.

ein0p 2 years ago

Pichai be like: I know how to fix this! Let’s fire a bunch of people and move their jobs to Bangalore. /s

  • h4ch1 2 years ago

    yeah that /s didn't help you a lot there mate.

    • ein0p 2 years ago

      I’m outta that hellhole, so I don’t need help. Condolences to the rest.
