Bypassing DPI with eBPF sock_ops

The problem

A site I wanted to open never loaded. The TLS handshake stalled. Somewhere upstream, a middlebox was reading the SNI in my ClientHello and dropping the connection.

DNS was the other half. Even if you slip past the DPI, the resolver hands you a fake IP that lands on a block page. So both have to be dealt with.

Most fixes mean a VPN or a proxy. I didn’t want either. A VPN sends every byte you have through some remote server. Way too much for this. A proxy needs per-app configuration, and plenty of apps just ignore the system proxy. What I wanted was simpler: system-level, transparent, one command.

sudo gecit run

The idea

The trick is simple. Send a fake ClientHello before the real one reaches the DPI.

The fake carries a different SNI, www.google.com, with a low TTL. Just high enough to reach the middlebox. Low enough to expire on the way to the server. The DPI sees it, logs “google.com”, and waves the connection through. The server never lays eyes on it.

Then the real ClientHello slips by. The DPI’s already made up its mind, and it’s wrong.

App connects to target:443
    |
gecit intercepts the connection
  Linux:  eBPF sock_ops fires (inside kernel, before app sends data)
  macOS:  TUN device captures packet, gVisor netstack terminates TCP
    |
Fake ClientHello with SNI "www.google.com" sent with low TTL
    |
Fake reaches DPI -> DPI records "google.com" -> allows connection
Fake expires before server (low TTL) -> server never sees it
    |
Real ClientHello passes through -> DPI already desynchronized

There’s one more piece. The eBPF program clamps TCP MSS down to 88 bytes, which forces the kernel to split the real ClientHello into tiny TCP segments. Some DPI boxes only inspect the first segment, so when the SNI ends up split across two, they can’t read it at all.
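To get a feel for the numbers, here is a rough sketch of the arithmetic. The ClientHello size and the SNI offset are made-up values for illustration, and the function assumes every segment carries exactly MSS bytes of payload. Real segments can be smaller once TCP options such as timestamps are subtracted.

```go
package main

import "fmt"

// segmentsFor reports the zero-based index of the first and last TCP segment
// that carry bytes [start, end) of a payload, assuming every segment holds
// exactly mss payload bytes.
func segmentsFor(start, end, mss int) (first, last int) {
    return start / mss, (end - 1) / mss
}

func main() {
    // Hypothetical layout: a 517-byte ClientHello whose SNI hostname
    // occupies bytes 80..95, straddling the 88-byte boundary.
    const helloLen, sniStart, sniEnd, mss = 517, 80, 96, 88

    total := (helloLen + mss - 1) / mss
    first, last := segmentsFor(sniStart, sniEnd, mss)
    fmt.Printf("segments: %d, SNI spans segments %d..%d\n", total, first, last)
}
```

Whether the hostname actually straddles a boundary depends on where it sits in a given ClientHello. If it fits entirely inside one segment, a DPI that reads segments individually can still see it.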

DNS gets its own treatment. gecit runs a local DNS resolver on 127.0.0.1:53 and points the system at it. The resolver forwards every query over DoH (DNS over HTTPS) to Cloudflare, Google, or whatever upstream you pick. Plaintext poisoning has nothing to grab onto.
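The DoH leg itself is small. A minimal sketch, assuming the RFC 8484 POST form: encode a DNS query in wire format and wrap it in an HTTPS request with the application/dns-message content type. buildQuery and newDoHRequest are illustrative names, not gecit’s API, and the upstream URL is just an example. Nothing is sent.

```go
package main

import (
    "bytes"
    "errors"
    "fmt"
    "net/http"
    "strings"
)

// buildQuery encodes a minimal DNS query for an A record in wire format.
// The ID is 0, as RFC 8484 recommends for cache friendliness.
func buildQuery(name string) ([]byte, error) {
    msg := []byte{
        0x00, 0x00, // ID
        0x01, 0x00, // flags: recursion desired
        0x00, 0x01, // QDCOUNT = 1
        0x00, 0x00, 0x00, 0x00, 0x00, 0x00, // ANCOUNT, NSCOUNT, ARCOUNT
    }
    for _, label := range strings.Split(strings.TrimSuffix(name, "."), ".") {
        if len(label) == 0 || len(label) > 63 {
            return nil, errors.New("invalid label in " + name)
        }
        msg = append(msg, byte(len(label)))
        msg = append(msg, label...)
    }
    msg = append(msg, 0x00)       // root label
    msg = append(msg, 0x00, 0x01) // QTYPE = A
    msg = append(msg, 0x00, 0x01) // QCLASS = IN
    return msg, nil
}

// newDoHRequest wraps a wire-format query in an RFC 8484 POST request.
func newDoHRequest(upstream string, query []byte) (*http.Request, error) {
    req, err := http.NewRequest(http.MethodPost, upstream, bytes.NewReader(query))
    if err != nil {
        return nil, err
    }
    req.Header.Set("Content-Type", "application/dns-message")
    req.Header.Set("Accept", "application/dns-message")
    return req, nil
}

func main() {
    q, err := buildQuery("example.com")
    if err != nil {
        panic(err)
    }
    // Example upstream only; the request is built but never sent.
    req, err := newDoHRequest("https://cloudflare-dns.com/dns-query", q)
    if err != nil {
        panic(err)
    }
    fmt.Println(req.Method, req.URL.Host, len(q), "bytes")
}
```

A local resolver would do this in reverse too: take the raw query that arrives on port 53, forward the bytes as the POST body, and hand the response body back to the client unchanged.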

Linux: eBPF sock_ops

The fake has to leave before the application sends any data. Miss that window and the bypass fails.

eBPF sock_ops gives you that window cleanly. You attach a BPF program to a cgroup. The kernel calls into it at specific points in the TCP lifecycle. The one I want is BPF_SOCK_OPS_ACTIVE_ESTABLISHED_CB. It fires the moment an outgoing TCP connection finishes its three-way handshake. Connection up, app hasn’t said a word yet.

The program:

SEC("sockops")
int gecit_sockops(struct bpf_sock_ops *skops)
{
    __u32 key = 0;
    struct gecit_config_t *cfg = bpf_map_lookup_elem(&gecit_config, &key);
    if (!cfg || !cfg->enabled)
        return 1;

    switch (skops->op) {
    case BPF_SOCK_OPS_ACTIVE_ESTABLISHED_CB:
        return handle_established(skops, cfg);
    case BPF_SOCK_OPS_HDR_OPT_LEN_CB:
        return handle_hdr_opt_len(skops);
    case BPF_SOCK_OPS_WRITE_HDR_OPT_CB:
        return handle_write_hdr_opt(skops, cfg);
    }

    return 1;
}

When a fresh connection to port 443 lands, handle_established does two things:

static __always_inline int handle_established(struct bpf_sock_ops *skops,
                                              struct gecit_config_t *cfg)
{
    __u32 dst_ip = skops->remote_ip4;
    if (bpf_map_lookup_elem(&exclude_ips, &dst_ip))
        return 1;

    __u16 dst_port = (__u16)bpf_ntohl(skops->remote_port);
    if (!bpf_map_lookup_elem(&target_ports, &dst_port))
        return 1;

    // Clamp the MSS so the kernel splits the ClientHello into small segments.
    // Best effort: the return value is ignored, so a failed clamp is silent.
    int mss = cfg->mss;
    bpf_setsockopt(skops, IPPROTO_TCP, TCP_MAXSEG, &mss, sizeof(mss));

    // Notify userspace via perf event.
    struct conn_event evt = {};
    evt.src_ip   = skops->local_ip4;
    evt.dst_ip   = skops->remote_ip4;
    evt.src_port = skops->local_port;
    evt.dst_port = dst_port;
    evt.seq      = skops->snd_nxt;
    evt.ack      = skops->rcv_nxt;
    bpf_perf_event_output(skops, &conn_events, BPF_F_CURRENT_CPU,
                          &evt, sizeof(evt));

    // ... MSS restoration tracking omitted for brevity
    return 1;
}

First, bpf_setsockopt clamps TCP_MAXSEG to 88. In-kernel, per connection. The app has no idea. Whenever the ClientHello goes out, the kernel chops it into small segments on its way down.

Then the program fires a perf event. IPs, ports, sequence and ack numbers, all packed in. The seq/ack matter. The fake has to carry the same values as the real connection or the DPI throws it out.

A Go goroutine reads the events and ships the fake:

func (m *Manager) readEvents(ctx context.Context) {
    defer m.wg.Done()

    for {
        record, err := m.reader.Read()
        if err != nil {
            // Every read error ends the loop, whether the reader was
            // closed on shutdown or failed for another reason.
            return
        }

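        // A conn_event is 20 bytes: two IPs, two ports, seq, ack.
        // Skip anything shorter rather than index out of range.
        if len(record.RawSample) < 20 {
            continue
        }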

        var evt connEvent
        evt.SrcIP = binary.NativeEndian.Uint32(record.RawSample[0:4])
        evt.DstIP = binary.NativeEndian.Uint32(record.RawSample[4:8])
        evt.SrcPort = binary.NativeEndian.Uint16(record.RawSample[8:10])
        evt.DstPort = binary.NativeEndian.Uint16(record.RawSample[10:12])
        evt.Seq = binary.NativeEndian.Uint32(record.RawSample[12:16])
        evt.Ack = binary.NativeEndian.Uint32(record.RawSample[16:20])

        m.injectFake(evt)
    }
}

The fake is a stripped-down TLS ClientHello, SNI set to www.google.com. Out it goes through a raw socket with IP_HDRINCL, same 5-tuple, same seq/ack from the eBPF event. One difference: TTL 8 instead of 64.

Detection and MSS clamping live in the kernel. Building the fake and writing it to a raw socket happens in Go, in userspace. Only the connection event crosses the boundary, once per handshake. After that, traffic stays in the kernel, full speed. gecit adds no userspace overhead to bulk transfer.

macOS: no eBPF, now what?

macOS has no eBPF. No programmable in-kernel packet hooks like sock_ops. DTrace, Endpoint Security, and the MAC framework exist, but none of them let you splice fake TCP into a live connection. The closest path is a Network Extension, which needs an Apple-approved entitlement on top of Developer ID signing. Not realistic for a small open-source tool.

So I tried the obvious thing first: an HTTP CONNECT proxy. Point system HTTPS at 127.0.0.1:8443, intercept CONNECT, inject the fake, pipe data. Browsers loved it. Then someone tried Discord. Discord doesn’t care about your proxy settings. Plenty of apps don’t. That killed the proxy approach.

TUN was next. A TUN device is a virtual network interface. You route traffic to it, and your userspace program reads and writes raw IP packets on the other side. Every app’s traffic flows through. Nothing escapes.

The setup uses sing-tun on top of gVisor’s userspace TCP/IP stack. A TCP connection to port 443 hits the TUN, gVisor terminates the handshake locally, and gecit opens its own connection to the real server. It reads the ClientHello off the app side, fires the fake, then forwards the real one:

func (h *handler) injectAndForward(appConn, serverConn net.Conn, dst string) {
    appConn.SetReadDeadline(time.Now().Add(5 * time.Second))
    clientHello := make([]byte, 16384)
    n, err := appConn.Read(clientHello)
    if err != nil {
        return
    }
    clientHello = clientHello[:n]
    appConn.SetReadDeadline(time.Time{})

    if sni := fake.ParseSNI(clientHello); sni != "" {
        dst = fmt.Sprintf("%s:%d", sni, serverConn.RemoteAddr().(*net.TCPAddr).Port)
    }

    seq, ack := seqtrack.GetSeqAck(serverConn)

    // ... build ConnInfo with seq/ack ...

    for i := 0; i < 3; i++ {
        h.mgr.rawSock.SendFake(connInfo, fake.TLSClientHello, h.mgr.cfg.FakeTTL)
    }

    time.Sleep(2 * time.Millisecond)

    serverConn.Write(clientHello)
    pipe(appConn, serverConn)
}
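fake.ParseSNI is not shown here. As a rough idea of what such a parser involves, the sketch below walks a single TLS record down to the server_name extension. It is an illustration under assumptions, not gecit’s implementation: it expects the whole ClientHello in one record, and helloWithSNI only builds a toy message for the demo, not something a server would accept.

```go
package main

import (
    "encoding/binary"
    "fmt"
)

// reader is a tiny bounds-checked cursor over a byte slice.
type reader struct {
    b  []byte
    ok bool
}

// take returns the next n bytes, or marks the reader failed.
func (r *reader) take(n int) []byte {
    if !r.ok || n > len(r.b) {
        r.ok = false
        return nil
    }
    out := r.b[:n]
    r.b = r.b[n:]
    return out
}

// vec reads a 1- or 2-byte length prefix, then that many bytes.
func (r *reader) vec(lenBytes int) []byte {
    p := r.take(lenBytes)
    if !r.ok {
        return nil
    }
    n := int(p[0])
    if lenBytes == 2 {
        n = int(binary.BigEndian.Uint16(p))
    }
    return r.take(n)
}

// parseSNI returns the host_name from a ClientHello record, or "".
func parseSNI(record []byte) string {
    r := &reader{b: record, ok: true}
    hdr := r.take(5) // record header: type, version, length
    if !r.ok || hdr[0] != 0x16 {
        return ""
    }
    hs := r.take(4) // handshake header: type, 24-bit length
    if !r.ok || hs[0] != 0x01 {
        return ""
    }
    r.take(2 + 32) // client version, random
    r.vec(1)       // session ID
    r.vec(2)       // cipher suites
    r.vec(1)       // compression methods
    extBytes := r.vec(2)
    exts := &reader{b: extBytes, ok: r.ok}
    for exts.ok && len(exts.b) > 0 {
        typ := exts.take(2)
        data := exts.vec(2)
        if !exts.ok {
            return ""
        }
        if binary.BigEndian.Uint16(typ) != 0 { // 0 = server_name
            continue
        }
        list := &reader{b: data, ok: true}
        names := &reader{b: list.vec(2), ok: true}
        nameType := names.take(1)
        host := names.vec(2)
        if names.ok && nameType[0] == 0 { // 0 = host_name
            return string(host)
        }
        return ""
    }
    return ""
}

// helloWithSNI builds a toy ClientHello carrying only an SNI extension.
func helloWithSNI(host string) []byte {
    name := append([]byte{0, byte(len(host) >> 8), byte(len(host))}, host...)
    list := append([]byte{byte(len(name) >> 8), byte(len(name))}, name...)
    ext := append([]byte{0, 0, byte(len(list) >> 8), byte(len(list))}, list...)
    body := append([]byte{0x03, 0x03}, make([]byte, 32)...) // version, random
    body = append(body, 0, 0, 2, 0x13, 0x01, 1, 0)          // session ID, one suite, null compression
    body = append(body, byte(len(ext)>>8), byte(len(ext)))
    body = append(body, ext...)
    hs := append([]byte{0x01, 0, byte(len(body) >> 8), byte(len(body))}, body...)
    return append([]byte{0x16, 0x03, 0x01, byte(len(hs) >> 8), byte(len(hs))}, hs...)
}

func main() {
    fmt.Println(parseSNI(helloWithSNI("example.org")))
}
```

Every length is checked before it is used, so a truncated or hostile ClientHello yields an empty string instead of a panic.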

It works. Every app gets caught. There’s a price, though, and it’s not small. All traffic now passes through userspace. Every packet crosses the kernel-user boundary twice. On Linux, only the handshake leaves the kernel. On macOS, everything does. Same overhead as a VPN, no remote server.

Then the seq/ack problem. On Linux, the eBPF program reads snd_nxt and rcv_nxt straight off the socket. macOS has no equivalent. So I capture SYN-ACKs on the physical NIC with pcap and pull the sequence numbers out of there.

Windows: same approach, different pain

Same TUN + gVisor architecture as macOS. Different headaches.

Raw sockets, first. Windows has blocked sending TCP data over raw sockets since XP SP2. Spoofed packets through Winsock are off the table. Workaround: Npcap. Its pcap_sendpacket injects raw Ethernet frames through a kernel driver. Which means I’m now responsible for the entire frame, gateway MAC and all (which I have to fish out of the ARP table):

func (s *pcapRawSocket) SendFake(conn ConnInfo, payload []byte, ttl int) error {
    ipTcp := BuildPacket(conn, payload, ttl)

    frame := make([]byte, 14+len(ipTcp))
    copy(frame[0:6], s.dstMAC)   // gateway MAC
    copy(frame[6:12], s.srcMAC)  // our MAC
    frame[12] = 0x08             // EtherType: IPv4
    frame[13] = 0x00
    copy(frame[14:], ipTcp)

    return s.handle.WritePacketData(frame)
}

Most Windows DPI bypass tools reach for WinDivert. Solid project, but the code signing certificate expired in 2023. Defender flags it. Some systems just won’t load the driver. gecit uses WinTUN from the WireGuard project instead. Properly signed, actively maintained.

Npcap is its own thing. You can’t redistribute it without an OEM license, so users have to grab it from npcap.com themselves. Not ideal. But on Windows, raw packet injection has no other path.

What eBPF gives you

Doing this on three platforms makes the gap obvious.

Linux (eBPF): synchronous hook into the kernel TCP stack. Fires at exactly the right moment, connection established, no data sent yet. MSS clamping in-kernel via bpf_setsockopt. Seq/ack handed to you directly. Only the fake send touches userspace. Bulk traffic, zero overhead.

macOS (TUN): virtual interface, userspace TCP/IP stack via gVisor, routing table to manage, pcap to extract seq/ack, mDNSResponder to wrangle, network service detection for DNS. Everything goes through userspace.

Windows (TUN + Npcap): all of the above, plus Ethernet frame construction, ARP table parsing for the gateway MAC, IP checksum work, Npcap as a runtime dependency, and Defender false positives to manage.

The gap isn’t a slope. It’s a step. eBPF lets you reach into the kernel TCP stack at exactly the right point with no scaffolding around it. On the other platforms, you end up building a tiny VPN to do what a short BPF program does in-kernel.

TUN does have one advantage worth pointing out. It catches everything at the IP layer, including apps that ignore proxy settings. Linux gets a similar property through eBPF: sock_ops attaches to a cgroup, so any process inside that cgroup is covered no matter what it does with its own networking.

Rough edges

DPI behavior varies a lot from network to network. Some send RSTs. Some just drop the packet. The TTL has to be high enough to reach the middlebox and low enough to expire before the server. Default is 8, which works on most networks. traceroute is your friend for tuning.
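If traceroute gives you a guess at both hop counts, picking a value is simple arithmetic. The sketch below is a hypothetical helper, not something gecit ships. It assumes a packet sent with TTL n gets as far as hop n, so any TTL from the middlebox hop up to one less than the server hop should work, and it returns the midpoint to leave slack on both sides.

```go
package main

import (
    "errors"
    "fmt"
)

// pickTTL returns a TTL that reaches hop dpiHop but expires before
// serverHop. Both values are hop counts as reported by traceroute.
func pickTTL(dpiHop, serverHop int) (int, error) {
    if dpiHop < 1 || dpiHop > 255 || serverHop <= dpiHop {
        return 0, errors.New("no TTL reaches the middlebox but not the server")
    }
    lo, hi := dpiHop, serverHop-1
    if hi > 255 {
        hi = 255
    }
    return lo + (hi-lo)/2, nil
}

func main() {
    // Hypothetical path: middlebox at hop 4, server at hop 14.
    ttl, err := pickTTL(4, 14)
    fmt.Println(ttl, err) // prints "8 <nil>"
}
```

In practice you rarely know exactly where the middlebox sits, so this is a starting point for trial and error, not an answer.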

The DPI also wants real seq/ack on the fake. Placeholders get rejected outright. So when pcap fails to catch the SYN-ACK, the fake goes out with garbage values and the DPI just shrugs.

DoH has a chicken-and-egg problem. gecit redirects system DNS to its local server. But the DoH client itself needs to resolve the upstream. If that upstream is a hostname (dns.nextdns.io, say), it has to be resolved before gecit takes DNS over. So gecit resolves all upstream hostnames at startup and pins them to IPs.
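One way to do the pinning in Go is a custom DialContext on the HTTP transport. This is a sketch of the general technique, not necessarily how gecit does it. pinnedDialer and rewrite are made-up names, and 192.0.2.1 is a documentation address standing in for an IP resolved at startup.

```go
package main

import (
    "context"
    "fmt"
    "net"
    "net/http"
)

// rewrite replaces the host part of addr with its pinned IP, if any.
func rewrite(pins map[string]string, addr string) string {
    host, port, err := net.SplitHostPort(addr)
    if err != nil {
        return addr
    }
    if ip, ok := pins[host]; ok {
        return net.JoinHostPort(ip, port)
    }
    return addr
}

// pinnedDialer returns a DialContext that swaps pinned hostnames for their
// pre-resolved IPs, so the HTTP client never asks the system resolver.
func pinnedDialer(pins map[string]string) func(ctx context.Context, network, addr string) (net.Conn, error) {
    d := &net.Dialer{}
    return func(ctx context.Context, network, addr string) (net.Conn, error) {
        return d.DialContext(ctx, network, rewrite(pins, addr))
    }
}

func main() {
    // In a real tool the pin would come from a lookup done at startup,
    // before system DNS is redirected.
    pins := map[string]string{"dns.nextdns.io": "192.0.2.1"}
    client := &http.Client{
        Transport: &http.Transport{DialContext: pinnedDialer(pins)},
    }
    _ = client // built for illustration; no request is made here

    fmt.Println(rewrite(pins, "dns.nextdns.io:443")) // prints "192.0.2.1:443"
}
```

Only the dial address changes. The request URL still carries the hostname, so TLS certificate verification runs against the name, not the pinned IP.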

On macOS, DNS configuration is per network service. Connected via USB tethering? gecit needs to set DNS on the tethering service, not Wi-Fi. It picks the active one off the default route.

Flatpak on Linux is a fun one. Each app runs in its own sandbox with its own DNS resolution. gecit edits /etc/resolv.conf to point at its local resolver, but Flatpak doesn’t see that. The DPI bypass still works just fine, because eBPF runs in the kernel, below every sandbox. The DNS bypass doesn’t, and you have to set it manually inside Flatpak. A neat illustration of where kernel hooks succeed and userspace tweaks fall short.

Links

gecit is on GitHub. GPL-3.0. Supports Linux, macOS, and Windows.

It does one thing. It does not hide your IP address, encrypt your traffic, or provide anonymity. It prevents DPI middleboxes from reading the SNI field in TLS handshakes. That’s it.