After five years I discovered why my network goes down (2020)

community.wd.com

27 points by yitchelle 2 years ago · 37 comments

Nextgrid 2 years ago

That post has provided no proof this is indeed a MAC address conflict issue, and as other comments suggest even an actual MAC address duplicate should only cause connectivity issues for said MAC address and no other devices.

MAC addresses belong to the NIC and not the machine, and while the OS can override them, it won't do so without express user intervention. I'm especially skeptical of the Ethernet MAC being a duplicate of the Wi-Fi MAC, as this would cause obvious issues (especially considering that Wi-Fi stays up even in the presence of Ethernet; it's just that the routing table is configured to prefer Ethernet over it). If this was indeed the case, he would never have had network access on that machine.

However, it is known that some Realtek USB Ethernet controllers have unexpected behavior when powered but no longer enumerated on the USB bus, and they send some low-level Ethernet frame that effectively causes all traffic to stop on that L2 network segment. I'm not sure who is at fault (whether the controller or the switch it's connected to mistakenly rebroadcasting that frame), but here are more details: http://jeffq.com/blog/the-ethernet-pause-frame/ and https://lucumr.pocoo.org/2020/7/6/usb-c-network-hubs/.

  • toast0 2 years ago

    > I'm not sure who is at fault

    It's the people who thought Ethernet Pause was ever a good idea.

    I've only seen it lead to tears. In all of the situations I've run into, it would be better to drop packets that are arriving too fast to handle than to try to fix it up by trying to do flow control. Ethernet is unreliable, and packet loss is the overload signal; PAUSE changes overload from drop to buffer, and then you end up with problems with overlarge buffers. On top of that, there are things like a multi-queue NIC having one queue overloaded and sending PAUSE, which results in starvation for the other queues; or PAUSE getting broadcast over a switch when it shouldn't have been, etc.

  • WirelessGigabit 2 years ago

    That's the one that killed my network every evening when I stopped working. My MacBook would go to sleep, with a Dell DA310 plugged in, and serving as its network connection.

    I would go on a walk and every time the missus would call me to tell me that the network is down again...

ChrisMarshallNY 2 years ago

My experience: 95% of all network problems are dodgy cables. The other 5% is dodgy switches.

The switches one is a bear to debug. I just tossed out two old switches, because I wasn't sure which one was the one that polluted the network. I could have spent several hours, setting up little sandboxes, but it was easier just to toss them both, and get new ones.

  • hypercube33 2 years ago

    I just retired some 3com 1900 series 1gbps switches for my home lab. Only problems I've had with them were those dell docks I mentioned corrupting arp tables requiring a full network reboot to resolve but it also could be reproduced on Cisco and Dell switches so there's that. HP still fully warranties the damn things to this day.

  • bradfa 2 years ago

    And the final 90% is usually DNS's fault.

    • ycombinete 2 years ago

      Finding this out the hard way at the moment. My pihole appears to cause my Orbi router to intermittently lose internet connectivity.

major505 2 years ago

I used to have a dell vostro notebook that every time I logged into the network with windows, it would drop every other device on the wifi. Funny thing, is that if I used linux, the router would be alright and not collapse.

Never found out the reason. In the end I divorced the wife and she got that notebook.

Wifi is weird.

metanonsense 2 years ago

I find it curious that the wifi interface and the USB/ethernet interface are using the same MAC address. Nothing keeps a computer from being attached with both at the same time, in which case this would definitely break things.

  • dessimus 2 years ago

    Also, doesn't Apple do MAC randomization on WiFi by default? So, it shouldn't be using the same MAC that already exists on the machine.

    • alexpotato 2 years ago

      I believe Macintoshes "remember" the most recent IP they were given on a network rather than always doing a DHCP.

      It's one of the reasons given for why people say Macs "feel better" e.g. you sit down in a room with a friend who has a Dell. Your Mac gets WiFi almost immediately while they have to wait a few seconds even though both of you have been on this network before.

      One of the benefits of owning the stack all the way from bare metal to OS to applications.

      • jethro_tell 2 years ago

        Except that it breaks other people in busy networks.

        Fuck everyone else my WiFi feels good.

        • mschuster91 2 years ago

          IIRC it keeps the IP address for as long as the DHCP lease is valid, whereas Windows will do a full DHCP request on every reconnect. That's perfectly valid behavior, and DHCP servers should not delete reservations before the lease expires - which doesn't stop some particularly shoddy implementations from doing so. Cheap SOHO routers tend to not persist leases on the flash to not use up NAND rewrite cycles, so they reset everything upon reboot.
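
A rough sketch of the two behaviors being compared here, for anyone following along. RFC 2131 calls the shortcut INIT-REBOOT: the client skips DISCOVER and sends a REQUEST for the address it already holds. Everything in the snippet (`CachedLease`, `next_dhcp_step`, the SSID check) is a made-up illustration of that decision, not how any particular OS actually implements it.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class CachedLease:
    """Hypothetical record a client keeps per network it has joined."""
    ssid: str
    address: str
    expires_at: float  # Unix time at which the lease runs out


def next_dhcp_step(cached: Optional[CachedLease], ssid: str, now: float) -> str:
    """Decide which DHCP message to send first when (re)joining a network."""
    # Lease still valid for this network: ask for the same address again
    # (the INIT-REBOOT path), which saves a DISCOVER/OFFER round trip.
    if cached and cached.ssid == ssid and now < cached.expires_at:
        return f"REQUEST {cached.address}"
    # No usable lease: start from scratch.
    return "DISCOVER"
```

If the server has forgotten the lease (say, a router that rebooted and never persisted its table), it can answer the REQUEST with a NAK and the client falls back to DISCOVER.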

  • detourdog 2 years ago

    Probably DHCP lease cache problem.

akira2501 2 years ago

We had Dell laptops with the dock. If you undocked the laptop under the right circumstances, which we could never determine, the dock would go into a mode where it would start spamming IEEE 802.3x PAUSE frames into the network as fast as it possibly could.

Our switches didn't handle this correctly and would forward the PAUSE frame as if it were a broadcast. When this happened, the entire network would cease to function until the offending dock was found and disconnected.
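
For reference, a minimal sketch of what a PAUSE frame looks like on the wire, in case someone wants to spot one in a capture. The field values (EtherType 0x8808, opcode 0x0001, a 16-bit pause time counted in quanta of 512 bit times, normally addressed to 01:80:C2:00:00:01) come from the 802.3 MAC Control definition; the function names and the source MAC in the usage note are made up for illustration.

```python
import struct

PAUSE_DST = bytes.fromhex("0180c2000001")  # reserved MAC Control multicast address
MAC_CONTROL_ETHERTYPE = 0x8808
PAUSE_OPCODE = 0x0001
QUANTUM_BITS = 512  # one pause quantum is 512 bit times


def build_example_frame(src_mac: bytes, quanta: int) -> bytes:
    """Build a PAUSE frame (without FCS) purely as test data for the parser."""
    header = struct.pack("!6s6sHHH", PAUSE_DST, src_mac,
                         MAC_CONTROL_ETHERTYPE, PAUSE_OPCODE, quanta)
    return header + bytes(42)  # zero padding up to the 60-byte minimum


def parse_pause(frame: bytes):
    """Return the pause time in quanta if `frame` is a PAUSE frame, else None."""
    if len(frame) < 18:
        return None
    _dst, _src, ethertype, opcode, quanta = struct.unpack("!6s6sHHH", frame[:18])
    if ethertype != MAC_CONTROL_ETHERTYPE or opcode != PAUSE_OPCODE:
        return None
    return quanta


def pause_seconds(quanta: int, link_bps: int) -> float:
    """How long one PAUSE frame asks the link partner to stay quiet."""
    return quanta * QUANTUM_BITS / link_bps
```

Even the maximum pause time, `pause_seconds(0xFFFF, 10**9)`, is only about 33.5 ms on a gigabit link, which is why the dock has to keep spamming frames to hold the network down. The 01:80:C2:00:00:01 destination is in the range that bridges are not supposed to forward, so a switch passing it along like a broadcast is misbehaving too.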

  • RedShift1 2 years ago

    Sounds to me like something's crashing on disconnect, the ethernet watchdog notices a full receive buffer so it starts sending out pause frames, but the buffer never gets emptied anymore so the watchdog keeps sending out pause frames. Similar to how you keep hearing the same audio sample over and over again when Windows bluescreens.

  • caconym_ 2 years ago

    I never root-caused it so thoroughly, but I have an Anker USB-C hub that causes exactly the same effect when it's unplugged from a machine while still connected to ethernet. Took me months to figure out wtf was happening.

bediger4000 2 years ago

Why would a MAC address duplication "lock up" your whole network? I would guess that at worst that particular laptop would either intermittently get packets or not get packets at all. Which is bad, but not a "lock up". Other stuff should still work.

  • wjholden 2 years ago

    I agree with your assessment. A duplicate MAC address could cause a problem for one device, but it shouldn't have caused a full network outage.

    Simply opening Wireshark for a few minutes might have helped the poster identify their problem faster. Maybe you would observe the same frame transmitting again and again, maybe you would see corrupted frames, or maybe nothing at all.

  • starfallg 2 years ago

    Ethernet switches keep track of which ports have which MAC addresses behind them. This is stored in the CAM table of a layer 2 switch. When duplicate entries occur, this results in MAC address flapping, and switches have different ways of handling this, which may result in network instability.
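
A toy model of the flapping, assuming the simplest possible learning switch: it remembers the last port each source MAC was seen on and nothing else. `LearningSwitch`, the port numbers, and the MAC are all invented for the example; real switches add aging timers, flap detection, port security and so on.

```python
from dataclasses import dataclass, field


@dataclass
class LearningSwitch:
    table: dict = field(default_factory=dict)  # MAC address -> port last seen on
    flaps: int = 0

    def receive(self, src_mac: str, in_port: int) -> None:
        """Learn (or relearn) which port a source MAC lives behind."""
        known = self.table.get(src_mac)
        if known is not None and known != in_port:
            self.flaps += 1  # same MAC just showed up on a different port
        self.table[src_mac] = in_port

    def out_port(self, dst_mac: str):
        """Port to forward to, or None meaning 'unknown, flood everywhere'."""
        return self.table.get(dst_mac)


sw = LearningSwitch()
dup = "aa:bb:cc:dd:ee:ff"  # hypothetical MAC shared by two devices
for _ in range(3):
    sw.receive(dup, 1)  # device A talks from port 1
    sw.receive(dup, 2)  # device B talks from port 2
print(sw.flaps, sw.out_port(dup))  # prints: 5 2
```

In this model only traffic for the duplicated MAC gets misdelivered, going to whichever port spoke last. Anything wider than that depends on how a particular switch reacts to the flapping.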

  • hypercube33 2 years ago

    It shouldn't but I've seen and had some dell docks corrupt arp tables on my switches

  • jonah 2 years ago

    Things like that can cause routing "loops" in your network which can mess up everything, not just the device in question. We've had that issue with "dumb" switches.

fmeyer 2 years ago

I've brought down two networks in my life.

The first time, I ran a DHCP server by accident and things suddenly went sideways, but the blast radius was small.

The second time, the more interesting one: my campus had a MAC address allow list. When I got a new computer and didn't want to go through the process of updating my access permission, I just ran a script to change my MAC to the old, known address.

Later, I also sold the old computer to a colleague who didn't bother to register it either, since everything was working. Long story short, we kept disconnecting each other.

One day I was like, "c'mon, I'll fix this". I opened Wireshark and started to capture network traffic. I got a list of MAC addresses from the pcap dump, and every time I got disconnected, I ran the script, spoofing my address to the next one in my list.

That worked fine until the day I spoofed the MAC address of a centrally managed switch, which shit itself out of the network.

:)

aag2113 2 years ago

I recently solved a 5-10 year bug in a similar vein. My local network DHCP shared an IP space with the VPN to which I have been connecting. This was causing intermittent and "unexplainable" connection issues that were ultimately due to plain old IP-address collisions.

Sadly, the immense joy that I felt upon resolution is mysteriously lacking from this person's post.

Edit: DHCP not DNS
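
For anyone wanting to check for the same thing, Python's standard `ipaddress` module can tell you whether two ranges collide. The two networks below are invented examples, not the ones from my setup.

```python
import ipaddress

# Hypothetical ranges: substitute your LAN's DHCP subnet and the VPN's routes.
lan = ipaddress.ip_network("192.168.1.0/24")
vpn = ipaddress.ip_network("192.168.0.0/16")

# True whenever the two ranges share at least one address.
print(lan.overlaps(vpn))  # True

# Any host in the overlap is ambiguous: is it on the LAN or behind the VPN?
host = ipaddress.ip_address("192.168.1.50")
print(host in lan and host in vpn)  # True
```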

StapleHorse 2 years ago

I have a dumb 1gb switch that hangs semi-randomly but very seldom. Not all devices go through it, so the first time it happened it was crazy to debug, because I didn't keep track of which devices go through the switch and which ones go directly to one of the 4 ports of the router.

And when I found out: "Mother f

fmajid 2 years ago

Apple has this "feature" called Bonjour Sleep Proxy whereby certain stationary devices like an Airport AP, HomePod or AppleTV will impersonate your Mac when it goes to sleep, so if you want to access a service on it, e.g. stream music from it, the other device acts as a proxy, sends a WoL packet to the Mac and passes the MAC back to it. Needless to say, this often plays havoc with poorly coded routers or switches with wonky ARP or bridge MAC to port lookup tables.

https://en.wikipedia.org/wiki/Bonjour_Sleep_Proxy
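
For the curious, the WoL "magic packet" payload itself is tiny: six 0xFF bytes followed by the target's MAC repeated 16 times. A minimal sketch that only builds the payload (the `magic_packet` name is mine; how the sleep proxy actually delivers it is not shown here):

```python
def magic_packet(mac: str) -> bytes:
    """Build a Wake-on-LAN magic packet payload for a MAC like 'aa:bb:cc:dd:ee:ff'."""
    mac_bytes = bytes.fromhex(mac.replace(":", "").replace("-", ""))
    if len(mac_bytes) != 6:
        raise ValueError("expected a 48-bit MAC address")
    # Sync stream of six 0xFF bytes, then the MAC sixteen times: 102 bytes total.
    return b"\xff" * 6 + mac_bytes * 16
```

It is commonly sent as a UDP broadcast, but the sleeping NIC only looks for that byte pattern anywhere in a frame.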

ladberg 2 years ago

I've experienced a similar situation where a USB-C hub would bring down everything connected to the switch, but only when it was disconnected from the laptop. No clue how this stuff makes it out of the factory...

  • dylan604 2 years ago

    I mean, why would you possibly have a hub connected to the switch if it wasn't connected to the laptop? Surely, nobody would ever disconnect their laptop and leave all of the other devices connected waiting and ready for the laptop to be reconnected at a later time. That's such an edge case that it could never have been thought of as a normal thing to test against. Only weirdos take their laptops away from a desk environment. What are they gonna do, use it on their laps?

g051051 2 years ago

I had (probably still have) a similar-ish problem. I have an old nVidia Shield handheld that I bought a wired ethernet adapter for. Something about that adapter would kill my network dead after a random interval. It took a while to figure out what device was causing it, and unplugging the adapter would instantly cause the network to come back to life. I never figured out what the root cause was, I just stopped using the adapter.

ElijahLynn 2 years ago

Woah! What an experience to go through, I'd find myself relieved and pissed at the same time.

> The dock apparently holds the mac address of the MacBook and retains the IP even when the MacBook isn’t docked. Thus when the MacBook is removed from the dock and connects to the Network via Wifi with the same Mac address it causes a conflict and worse, it locks the whole network.

jokoon 2 years ago

I have a thinkpad, and for some reason my wifi firmware crashes at random times, I can't fix it.

Other annoying problem: if I start my computer with the jack plugged in, unplugging it doesn't switch to speakers.

SpaceManNabs 2 years ago

I don't understand networking enough or maybe what docks actually do. What was the OOP's solution?
