Unexpected sudo hang under VPN kill-switch

Recently I ran into a frustrating but interesting issue on my Linux machine (Ubuntu 24.04.2 LTS): whenever my VPN’s kill switch engaged (when the tunnel went down for any reason), using sudo became noticeably slow. Eventually, I figured out what was happening and found several ways to fix it. The root of the problem was surprising, at least for me, and I thought it was worth sharing.

TL;DR

My sudo was trying to resolve the hostname to its FQDN (enabled by default on Debian-based systems like Ubuntu, even without an explicit Defaults fqdn in /etc/sudoers). NSS was configured to use files, resolve, and dns. The files method failed because a hostname change left /etc/hosts stale. Then systemd-resolved and DNS lookups were blocked by the VPN kill-switch, which caused the command to hang until resolution timed out.

To avoid this, make sure your current hostname and FQDN are correctly mapped in /etc/hosts, add Defaults !fqdn to /etc/sudoers unless you rely on FQDNs, and optionally enable nss-myhostname for reliable local name resolution.

Causes

When fully qualified domain name (FQDN) resolution is enabled on your system, i.e. when Defaults fqdn is set in /etc/sudoers, sudo resolves the hostname to its canonical FQDN before executing a command. This is mainly done for logging (to record the FQDN in logs) and for matching host-specific rules in /etc/sudoers.

What’s surprising is that on some Debian-based systems (like Ubuntu), sudo performs FQDN resolution by default, even if there’s no Defaults fqdn line in /etc/sudoers. That’s because it’s compiled with the --with-fqdn flag, which hard-codes this behavior into the binary. You can verify this on your system by running:
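
One way to run this check, as a minimal sketch: it assumes that sudo -V lists its configure options only when invoked as root (hence the doubled sudo in the usage comment), and the helper name show_with_fqdn is made up for this example.

```shell
# Hypothetical helper: print the lines on stdin that mention --with-fqdn.
# Exit status is 0 if at least one line matched, non-zero otherwise.
show_with_fqdn() {
    # -e stops grep from parsing the pattern as one of its own options
    grep -e '--with-fqdn'
}

# Intended use (needs root, because regular users do not get the
# configure options from sudo -V):
#   sudo sudo -V | show_with_fqdn
```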

If you see --with-fqdn in the output, it means FQDN resolution is enabled regardless of whether Defaults fqdn is in your /etc/sudoers config.

The order and methods used for resolving hostnames are defined in the Name Service Switch (NSS) configuration file /etc/nsswitch.conf. NSS defines, per database (e.g., hosts, passwd, etc.), which services to consult and in what order. For host lookups, the configuration on my system was:

hosts: files resolve dns

This means the resolution path is:

  • files — first consult /etc/hosts.
  • resolve — if name resolution fails via files, query systemd-resolved via a local UNIX socket.
  • dns — as a fallback, attempt a traditional DNS query using the nameservers in /etc/resolv.conf (often 127.0.0.53, the stub resolver).
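
To see the lookup order on a given machine, the hosts line can be pulled out of the file. This is a minimal sketch: the function name nss_hosts_order is made up here, and bracketed action items such as [NOTFOUND=return] are skipped so that only the service names remain.

```shell
# Hypothetical helper: print the services on the "hosts:" line, one per line.
# Usage: nss_hosts_order [FILE]   (FILE defaults to /etc/nsswitch.conf)
nss_hosts_order() {
    awk '$1 == "hosts:" {
        skip = 0
        for (i = 2; i <= NF; i++) {
            if ($i ~ /^#/) break          # rest of the line is a comment
            if ($i ~ /^\[/) skip = 1      # start of an action item like [NOTFOUND=return]
            if (!skip) print $i
            if ($i ~ /\]$/) skip = 0      # end of the action item
        }
    }' "${1:-/etc/nsswitch.conf}"
}
```

With a hosts line of files resolve dns, calling nss_hosts_order prints those three service names in that order.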

So why was my sudo hanging? The first point of failure was that hostname resolution via files failed on my machine due to a misconfigured /etc/hosts file. But that wasn’t entirely my fault…

Most Linux distributions correctly set the hostname and add it to /etc/hosts during installation. On my system, that entry mapped the original hostname, ubuntu-box.

However, at some point I changed my hostname to starship, either using hostnamectl set-hostname starship or through the GUI: Settings → System → About → Device name: starship — I don’t remember the exact method I used. The crucial gotcha is that changing the system hostname through either method does not update /etc/hosts. That created a silent mismatch: the system now called itself starship, but the local lookup table still only knew the old hostname, ubuntu-box.
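
A quick way to spot this kind of mismatch is to check whether the current hostname appears anywhere in /etc/hosts. The following is a minimal sketch with a made-up helper name (hosts_has_name); it matches whole names only, ignores case, and takes the file as an optional argument so it can be tried on a copy.

```shell
# Hypothetical helper: succeeds if NAME is listed in a hosts-format file.
# Usage: hosts_has_name NAME [FILE]   (FILE defaults to /etc/hosts)
hosts_has_name() {
    awk -v name="$1" '
        { sub(/#.*/, "") }                       # drop comments
        { for (i = 2; i <= NF; i++)              # field 1 is the address
              if (tolower($i) == tolower(name)) found = 1 }
        END { exit found ? 0 : 1 }
    ' "${2:-/etc/hosts}"
}

# Intended use:
#   hosts_has_name "$(hostname)" || echo "hostname is missing from /etc/hosts"
```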

The next piece of the puzzle was that I was using a VPN with the kill-switch feature enabled. This mechanism works by configuring the firewall to block all traffic that would bypass the VPN if the tunnel goes down, for example when WiFi disconnects or the VPN process crashes. That is exactly what happened in my case: my WiFi dropped, the kill-switch activated, and all outbound connections were blocked, including DNS queries to my local router or public resolvers.

So when I ran sudo, it tried to resolve starship and failed. The resolution path looked like this:

  • files failed: /etc/hosts had no entry for starship.
  • resolve was next: NSS asked systemd-resolved via the local socket.
  • Upstream query: systemd-resolved didn’t have a cached record, so it forwarded to upstream DNS.
  • Kill-switch engaged: outbound DNS packets were silently dropped, so no replies ever came back.
  • Timeouts: systemd-resolved exhausted its retry attempts for AAAA and A queries across all configured DNS servers.

That prolonged wait was the cause of sudo hanging.
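
To see which stage eats the time, each NSS service can be queried on its own with getent -s and timed. This is a rough sketch: the helper names elapsed_seconds and time_nss_services are made up here, and the whole-second resolution of date +%s is assumed to be good enough for multi-second timeouts.

```shell
# Hypothetical helper: run a command, discard its output,
# and print how many whole seconds it took.
elapsed_seconds() {
    start=$(date +%s)
    "$@" >/dev/null 2>&1 || true      # a failed lookup is expected here
    end=$(date +%s)
    echo $((end - start))
}

# Hypothetical helper: time a lookup of NAME through each service in turn.
# Usage: time_nss_services NAME
time_nss_services() {
    for svc in files resolve dns; do
        printf '%-8s %ss\n' "$svc" "$(elapsed_seconds getent -s "$svc" hosts "$1")"
    done
}
```

Running time_nss_services "$(hostname)" while the kill-switch is active should show files returning at once and the later services taking the bulk of the wait.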

Reproduction

You can reproduce this issue yourself. Use two terminals: one with a root shell to manage system files and iptables rules, and another as a regular user to run the commands that demonstrate the hang.

First, prepare your system like this:

  1. Back up your config files so you can safely revert later:
    sudo cp /etc/hosts /etc/hosts.bak
    sudo cp /etc/nsswitch.conf /etc/nsswitch.conf.bak
    sudo cp /etc/sudoers /etc/sudoers.bak
    
  2. Edit your /etc/hosts file so that it does not contain a mapping for your current hostname $(hostname).
  3. In /etc/nsswitch.conf set the line for hosts to:
    hosts: files resolve dns
  4. Make sure /etc/sudoers does not contain the line Defaults !fqdn to ensure that sudo performs FQDN resolution (always use visudo to edit /etc/sudoers safely).

To simulate the VPN kill switch, we’ll use iptables to drop all IP traffic except loopback, which keeps the local 127.0.0.53 stub resolver reachable. Run the following from the root shell:

# Back up current firewall rules
iptables-save > iptables.bak
ip6tables-save > ip6tables.bak

# Flush and delete existing rules and chains
iptables  -F 
iptables  -X
ip6tables -F
ip6tables -X

# Set default DROP policy
iptables  -P INPUT DROP
iptables  -P FORWARD DROP
iptables  -P OUTPUT DROP
ip6tables -P INPUT DROP
ip6tables -P FORWARD DROP
ip6tables -P OUTPUT DROP

# Allow loopback traffic
iptables  -A INPUT -i lo -j ACCEPT
iptables  -A OUTPUT -o lo -j ACCEPT
ip6tables -A INPUT -i lo -j ACCEPT
ip6tables -A OUTPUT -o lo -j ACCEPT
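
Before testing, it is worth confirming that the policies actually took effect. iptables -S prints each built-in chain's policy as a line such as -P OUTPUT DROP, so a small filter is enough. This is a sketch with a made-up helper name (policy_is_drop).

```shell
# Hypothetical helper: reads `iptables -S` output on stdin and succeeds
# if the named built-in chain has a default policy of DROP.
# Usage: iptables -S | policy_is_drop CHAIN
policy_is_drop() {
    # -x matches the whole line; -e keeps the leading dash from
    # being parsed as a grep option
    grep -q -x -e "-P $1 DROP"
}

# Intended use, from the root shell:
#   iptables  -S | policy_is_drop OUTPUT && echo "IPv4 OUTPUT policy is DROP"
#   ip6tables -S | policy_is_drop OUTPUT && echo "IPv6 OUTPUT policy is DROP"
```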

To ensure reproducible results, restart systemd-resolved and flush its DNS cache after applying the firewall rules:

systemctl restart systemd-resolved
resolvectl flush-caches

Then, in the user shell run the following commands to test hostname resolution:

hostname -f
getent hosts "$(hostname -f)"

Both will hang and fail with:

hostname: Temporary failure in name resolution

This happens because the system tries to resolve the hostname, and when that falls through to DNS, the queries are blocked by the firewall.

Next, test sudo by timing it:

time sudo -k true

Here, sudo -k invalidates your credentials and true is a dummy command that immediately exits successfully.

It will hang, and after a timeout, print a warning:

sudo: unable to resolve host starship: Temporary failure in name resolution

Then it proceeds normally. That’s because sudo only needs the hostname for logging and policy checks, not to actually run the command.

To revert the changes, run the following from a root shell:

# Restore iptables
iptables-restore < iptables.bak
ip6tables-restore < ip6tables.bak
rm -f iptables.bak ip6tables.bak

# Restore config files
mv /etc/hosts.bak /etc/hosts
mv /etc/nsswitch.conf.bak /etc/nsswitch.conf
mv /etc/sudoers.bak /etc/sudoers

Solutions

There are three main ways to solve this, and you can combine them for a robust solution.

  1. Have a static entry for your current hostname and FQDN so files lookup succeeds immediately. For example:

    127.0.1.1 starship.lan starship
    

    The FQDN must come first because it is treated as the canonical name, and the short hostname follows as an alias.

  2. Tell sudo to avoid FQDN canonicalization by adding this line to /etc/sudoers (use visudo to edit safely):

    Defaults !fqdn

    Do this only if you don’t rely on FQDNs in sudoers rules or logs. This is a useful workaround for sudo, but it doesn’t fix resolution for other apps.

  3. Enable the nss-myhostname module for “synthetic” local answers without needing DNS. It resolves the current hostname (from gethostname()/hostnamectl) to the IPs currently assigned to your interfaces, and provides reverse lookups. If no non-loopback addresses exist, it falls back to 127.0.0.2 and ::1. First, install it:

    sudo apt install libnss-myhostname
    

    Then modify /etc/nsswitch.conf to include myhostname immediately after files in the hosts line, like this:

    hosts: files myhostname resolve dns
    

    Test it by running:

    getent -s myhostname hosts "$(hostname)"
    

    The module is perfect for the short hostname, but it doesn’t synthesize an FQDN. Keeping a stable FQDN in /etc/hosts is still the best way to make hostname -f instant and predictable.
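
The first fix can also be scripted so that it is safe to run more than once. This is a minimal sketch, assuming the 127.0.1.1 address and the starship.lan name from the example above; the function name ensure_hosts_entry is made up, and the file is a parameter so the function can be tried on a copy before touching /etc/hosts.

```shell
# Hypothetical helper: append "IP FQDN SHORT" to FILE unless SHORT is
# already listed there as a name. Assumes FILE ends with a newline.
# Usage: ensure_hosts_entry FILE IP FQDN SHORT
ensure_hosts_entry() {
    file=$1 ip=$2 fqdn=$3 short=$4
    if awk -v name="$short" '
            { sub(/#.*/, "") }                   # drop comments
            { for (i = 2; i <= NF; i++) if ($i == name) found = 1 }
            END { exit found ? 0 : 1 }
        ' "$file"; then
        return 0                                 # already mapped, leave the file alone
    fi
    # FQDN first (canonical name), short hostname second (alias)
    printf '%s %s %s\n' "$ip" "$fqdn" "$short" >> "$file"
}
```

For a dry run, copy /etc/hosts to a scratch file, call ensure_hosts_entry on that copy with 127.0.1.1 starship.lan starship, and inspect the result before applying the same change to the real file as root.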

Final thoughts

This whole issue started from a simple hostname change, something you can do in the “Settings” app or with hostnamectl. But neither of these tools updates /etc/hosts by default, leaving the system in an inconsistent state where local hostname resolution can silently break and only fail under specific conditions. A better approach would be for these tools to offer to update /etc/hosts when the hostname changes, showing the diff and creating a backup, or at the very least, warn the user that local name resolution may break.

Another surprising part is that some systems (like Ubuntu) enable FQDN resolution in sudo by default, due to how it’s compiled (with the --with-fqdn flag). That means sudo tries to resolve the machine’s hostname to its FQDN before doing anything, just for logging or policy matching. If the name isn’t found locally and DNS is broken or blocked (as it was in my case), the lookup, and eventually sudo, hangs. A more robust default would be to keep FQDN resolution disabled unless explicitly needed. Short hostnames are usually enough, and admins who rely on host-specific sudoers rules can always enable it manually.