What is BGP? – BGP routing explained

cloudflare.com

315 points by franl 4 years ago · 18 comments

sarosh 4 years ago

There is already a nice writeup on the current incident from Cloudflare at https://blog.cloudflare.com/october-2021-facebook-outage/

The key observations:

"Due to Facebook stopping announcing their DNS prefix routes through BGP, our and everyone else's DNS resolvers had no way to connect to their nameservers. Consequently, 1.1.1.1, 8.8.8.8, and other major public DNS resolvers started issuing (and caching) SERVFAIL responses.

But that's not all. Now human behavior and application logic kicks in and causes another exponential effect. A tsunami of additional DNS traffic follows.

This happened in part because apps won't accept an error for an answer and start retrying, sometimes aggressively, and in part because end-users also won't take an error for an answer and start reloading the pages, or killing and relaunching their apps, sometimes also aggressively."
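The retry storm described in that quote is the usual argument for exponential backoff with jitter on the client side. Below is a minimal sketch of the idea. The function name `backoff_delays` and the `base` and `cap` values are hypothetical, chosen for illustration; nothing here is taken from the Cloudflare post or from any Facebook client.

```python
import random


def backoff_delays(attempts, base=0.5, cap=30.0, rng=None):
    """Return one randomized wait (in seconds) per retry attempt.

    Uses "full jitter": each wait is drawn uniformly from
    [0, min(cap, base * 2**attempt)], so clients that failed at the same
    moment do not all retry at the same moment.
    """
    rng = rng or random.Random()
    delays = []
    for attempt in range(attempts):
        # The ceiling doubles on every attempt until it reaches the cap.
        ceiling = min(cap, base * (2 ** attempt))
        delays.append(rng.uniform(0, ceiling))
    return delays


# Example: waits for eight retries (values differ from run to run).
example_delays = backoff_delays(8)
```

A caller would sleep for each value in turn between attempts and stop as soon as a lookup succeeds. The cap keeps the worst-case wait bounded, and the randomness spreads the retries out over time.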

NetBeck 4 years ago

Cloudflare has a useful tool for measuring if your ISP is using RPKI.[0] For Facebook, this is the latest I could find for their implementation of BGP.[1][2]

[0] https://isbgpsafeyet.com/

[1] https://engineering.fb.com/2021/05/13/data-center-engineerin...

[2] https://www.usenix.org/conference/nsdi21/presentation/abhash...
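For anyone wondering what RPKI actually checks: route origin validation compares an announcement against signed records (ROAs) that say which AS may originate a prefix and how specific the announcement may be. Here is a rough sketch of that three-way decision in the spirit of RFC 6811. The `ROA` tuple, the `validate_origin` function, and the prefixes and AS numbers (all from the documentation ranges) are made up for illustration; a real validator gets its ROAs from the RPKI repositories rather than a hard-coded list.

```python
import ipaddress
from typing import NamedTuple


class ROA(NamedTuple):
    prefix: str      # prefix the holder has authorized
    max_length: int  # longest prefix length allowed in an announcement
    asn: int         # AS allowed to originate the prefix


def validate_origin(prefix, origin_asn, roas):
    """Classify an announcement as 'valid', 'invalid', or 'not-found'."""
    announced = ipaddress.ip_network(prefix)
    covered = False
    for roa in roas:
        roa_net = ipaddress.ip_network(roa.prefix)
        # An IPv4 ROA can never cover an IPv6 announcement, and vice versa.
        if announced.version != roa_net.version:
            continue
        # The ROA covers the route only if the announced prefix lies inside it.
        if not announced.subnet_of(roa_net):
            continue
        covered = True
        if roa.asn == origin_asn and announced.prefixlen <= roa.max_length:
            return "valid"
    # Covered by some ROA but matched by none: invalid. No covering ROA: not-found.
    return "invalid" if covered else "not-found"


roas = [ROA("192.0.2.0/24", 24, 64496)]
print(validate_origin("192.0.2.0/24", 64496, roas))     # valid
print(validate_origin("192.0.2.0/25", 64496, roas))     # invalid (too specific)
print(validate_origin("192.0.2.0/24", 64511, roas))     # invalid (wrong origin)
print(validate_origin("198.51.100.0/24", 64496, roas))  # not-found
```

Note that this only checks the origin AS. It says nothing about the rest of the AS path, and it would not have helped with a network withdrawing its own routes.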

motohagiography 4 years ago

Was banging on about this with some of the people probably here over 20 years ago. Not sure what this issue with FB was as I'm not on nanog anymore, but if it's bgp, it's a short list of likely events, as I foggily remember.

- someone big redistributed their static routes for FB into their announcements to peers.

- someone who has mapped peer filters and their prefix lengths has figured out how to announce smaller prefixes for FB routes and have them propagate.

- someone with enable somewhere in one of the major ASNs (like 701 back in my day etc) is doing a straightforward attack on FB.

- someone inside FB messed with load balancing and prepended a bunch of their routes internally and redistributed the long AS paths themselves and just broke shit with internal routing loops.

I have no idea how people unbefunge routing problems now that you have to coordinate multiple teams on the phone to get anything done instead of just one router guru just logging into everything and fixing it. I would be useless at it now, but this is not a recent problem. If it's still a problem, it will always be a problem.
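The "announce smaller prefixes" scenario in the list above works because routers forward on the longest matching prefix, so a more-specific announcement that gets past peer filters wins over the legitimate covering route. A toy sketch of that lookup follows. The `best_route` function, the route table, and its labels are hypothetical, and the prefixes come from the documentation range; a real router also applies BGP policy and path selection before a route ever reaches the forwarding table.

```python
import ipaddress


def best_route(destination, routes):
    """Return (prefix, label) of the longest prefix containing destination.

    `routes` maps prefix strings to a label for whoever announced them.
    Returns None when no prefix matches.
    """
    addr = ipaddress.ip_address(destination)
    matches = []
    for prefix, label in routes.items():
        network = ipaddress.ip_network(prefix)
        if addr in network:
            matches.append((network, label))
    if not matches:
        return None
    # Longest prefix match: the most specific covering prefix wins.
    network, label = max(matches, key=lambda match: match[0].prefixlen)
    return str(network), label


routes = {
    "203.0.113.0/24": "legitimate origin",
    "203.0.113.0/25": "more-specific announcer",
}
print(best_route("203.0.113.10", routes))   # ('203.0.113.0/25', 'more-specific announcer')
print(best_route("203.0.113.200", routes))  # ('203.0.113.0/24', 'legitimate origin')
```

The /25 only captures the lower half of the /24, which is why the second lookup still follows the legitimate route.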

ijidak 4 years ago

> While there have been a number of ambitious proposals intended to make BGP more secure, these are hard to implement because they would require every autonomous system to simultaneously update their behavior. Since this would require the coordination of hundreds of thousands of organizations and potentially result in a temporary takedown of the entire Internet, it seems unlikely that any of these major proposals will be put into place anytime soon.

Excellent. Just what I like to hear /s

kristjank 4 years ago

I have a hunch that the "How BGP can break the Internet" will get updated in the near future :^)

ngcc_hk 4 years ago

Why can't they at least start recording who is advertising what? After, say, 1 year we would have most if not all … Gradually we could build a grey BGP, not all white, but at least in case of some … wonder. Or any other option. Total trust is so untrustworthy.

dghughes 4 years ago

I recall from networking classes that messing around with BGP can be bad. Very bad.

RustyConsul 4 years ago

How does one go about setting up an autonomous system? Seems like a shadowy world based on the impact they could potentially have.
