Google Cloud Datastore Incident
status.cloud.google.com
This is covered here: https://news.ycombinator.com/item?id=21503773
(Although they keep fiddling with the title. It should just be "Google Cloud Incident", since it affected a lot of services.)
I perceive Google Cloud to have more outages than other providers, given the rate at which these announcements are posted on HN versus, say, AWS. Is this fair to say, or is the GCP team just more honest than others?
It would be nice if there were some sort of independent monitor for that. That might actually be a fun little project now that I think about it: deploy a VM on all major platforms that monitors some core services, and collect all the data in Zabbix or so...
Anyway, I've never used GCE or Azure, but plenty of AWS ;) From my experience, Amazon tends to 'forget' about issues quite quickly. Last month, for instance, there was a DNS issue; roughly 12 hours after it was 'resolved' according to their status page, I was still having problems.
That being said, I have not had any full outages or downtime due to AWS issues in 4 or 5 years.
It would be very good to have a monitor like the one you mentioned. Something that gathers problems detected by you, not reported by them. We know that the two sets do not always match.
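A minimal sketch of the kind of probe discussed above, assuming one small VM per provider that runs a check on a schedule and ships the results somewhere central (Zabbix or anything else). All names here (`ProbeResult`, `http_check`, `probe`, `availability`) are hypothetical and made up for illustration, and which endpoint counts as a "core service" for each provider is left open.

```python
import time
import urllib.request
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class ProbeResult:
    provider: str
    ok: bool
    latency_s: float
    error: str = ""


def http_check(url: str, timeout: float = 5.0) -> None:
    """Raise if the endpoint cannot be reached or answers with an error status.

    The URL is supplied by the caller; no provider endpoint is assumed here.
    """
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        if resp.status >= 400:
            raise RuntimeError(f"HTTP {resp.status}")


def probe(provider: str, check: Callable[[], None]) -> ProbeResult:
    """Run one check and record what *we* observed, not what a status page says."""
    start = time.monotonic()
    try:
        check()
        return ProbeResult(provider, True, time.monotonic() - start)
    except Exception as exc:  # any failure counts as an observed problem
        return ProbeResult(provider, False, time.monotonic() - start, str(exc))


def availability(results: List[ProbeResult]) -> Dict[str, float]:
    """Fraction of successful probes per provider."""
    totals: Dict[str, int] = {}
    good: Dict[str, int] = {}
    for r in results:
        totals[r.provider] = totals.get(r.provider, 0) + 1
        good[r.provider] = good.get(r.provider, 0) + (1 if r.ok else 0)
    return {p: good[p] / totals[p] for p in totals}
```

In practice each VM would call something like `probe("some-provider", lambda: http_check(url))` every minute against an endpoint you deployed yourself, and the comparison would be between these self-observed failures and what the provider's status page claimed for the same window.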
AWS regions are independent from each other. An outage might happen, but it tends to be localized to a region or even a zone. AWS is not good at reporting errors; there were plenty of times when their dashboard was green but we had issues.
Google's network spans the globe, so configuration changes are more likely to affect all regions. Google is also better at reporting errors.
This is based on my own experience. YMMV
The headline is a little understated. Here's what they say is down:
"Description: We are experiencing a major issue with Cloud Dataflow, AppEngine, Compute Engine, Cloud Storage, Dataflow, Dataproc, Pub/Sub, BigQuery, Networking beginning at Monday, 2019-11-11 01:15 US/Pacific"