Tracing HTTP Requests with Go's net/http/httptrace

7 min read Original article ↗

[Main] [Essays] [Articles] [Projects] [Now] [Resume] [Talks]

net/http/httptrace has been in the standard library since Go 1.7 and most Go developers I talk to have never used it. It exposes hooks for the points in an outgoing HTTP request that you usually cannot see from outside the transport: DNS resolution, connection acquisition, TLS handshake, the moment bytes go on the wire, the moment the first response byte comes back.

The interesting part is how it plugs in. There is no Tracer interface on http.Client, no middleware to register. You attach a ClientTrace to a context.Context and the transport pulls it back out via httptrace.ContextClientTrace at the points where it matters. I want to walk through that design choice first because it explains how the package composes with the rest of the stdlib, then build two things with it: a curl --trace-style CLI and a reusable http.RoundTripper that logs timings for every request.

Why Context, Not an Interface

The obvious design for request tracing would be to define a Tracer interface, add a Tracer field to http.Client or http.Transport, and call methods on it from inside the transport. That is roughly how most languages handle this.

Go's standard library does not work that way. Instead, httptrace.WithClientTrace returns a new context carrying a *ClientTrace, you attach that context to your request with req.WithContext(ctx), and the transport pulls the trace back out via httptrace.ContextClientTrace at the points where it matters.

trace := &httptrace.ClientTrace{
    DNSStart: func(info httptrace.DNSStartInfo) {
        fmt.Printf("DNS start: %s\n", info.Host)
    },
    DNSDone: func(info httptrace.DNSDoneInfo) {
        fmt.Printf("DNS done: %v\n", info.Addrs)
    },
}

ctx := httptrace.WithClientTrace(context.Background(), trace)
req, _ := http.NewRequestWithContext(ctx, http.MethodGet, "https://example.com", nil)
http.DefaultClient.Do(req)

This is unusual but it pays off. The trace travels with the request, so any middleware that forwards the context propagates tracing for free. Nothing on the client is shared mutable state, so concurrent requests from the same http.Client can carry different traces. And the transport ignores the trace entirely if no one attached one, so the cost when unused is a nil check.

ClientTrace itself is a struct of optional function fields:

type ClientTrace struct {
    GetConn              func(hostPort string)
    GotConn              func(GotConnInfo)
    PutIdleConn          func(err error)
    GotFirstResponseByte func()
    Got100Continue       func()
    Got1xxResponse       func(code int, header textproto.MIMEHeader) error
    DNSStart             func(DNSStartInfo)
    DNSDone              func(DNSDoneInfo)
    ConnectStart         func(network, addr string)
    ConnectDone          func(network, addr string, err error)
    TLSHandshakeStart    func()
    TLSHandshakeDone     func(tls.ConnectionState, error)
    WroteHeaderField     func(key string, value []string)
    WroteHeaders         func()
    Wait100Continue      func()
    WroteRequest         func(WroteRequestInfo)
}

You set only the fields you care about. Unset fields are nil and skipped. This is also why the package can add new hooks without breaking anyone - existing code just leaves the new field nil.

Building a curl --trace

The first thing worth building is a CLI that takes a URL and prints a timing breakdown like curl -w does, but with the granularity httptrace exposes. The trick is recording timestamps in each hook and computing durations relative to a start time.

package main

import (
    "context"
    "crypto/tls"
    "fmt"
    "net/http"
    "net/http/httptrace"
    "os"
    "time"
)

type timings struct {
    start            time.Time
    dnsStart         time.Time
    dnsDone          time.Time
    connectStart     time.Time
    connectDone      time.Time
    tlsStart         time.Time
    tlsDone          time.Time
    gotConn          time.Time
    firstByte        time.Time
    done             time.Time
}

func (t *timings) elapsed(at time.Time) time.Duration {
    return at.Sub(t.start)
}

The ClientTrace for this is mechanical - capture time.Now() in each hook:

func newTrace(t *timings) *httptrace.ClientTrace {
    return &httptrace.ClientTrace{
        DNSStart: func(_ httptrace.DNSStartInfo) {
            t.dnsStart = time.Now()
        },
        DNSDone: func(_ httptrace.DNSDoneInfo) {
            t.dnsDone = time.Now()
        },
        ConnectStart: func(_, _ string) {
            t.connectStart = time.Now()
        },
        ConnectDone: func(_, _ string, _ error) {
            t.connectDone = time.Now()
        },
        TLSHandshakeStart: func() {
            t.tlsStart = time.Now()
        },
        TLSHandshakeDone: func(_ tls.ConnectionState, _ error) {
            t.tlsDone = time.Now()
        },
        GotConn: func(_ httptrace.GotConnInfo) {
            t.gotConn = time.Now()
        },
        GotFirstResponseByte: func() {
            t.firstByte = time.Now()
        },
    }
}

And the main function wires it up and prints the breakdown:

func main() {
    url := os.Args[1]

    t := &timings{start: time.Now()}
    trace := newTrace(t)

    ctx := httptrace.WithClientTrace(context.Background(), trace)
    req, err := http.NewRequestWithContext(ctx, http.MethodGet, url, nil)
    if err != nil {
        fmt.Fprintln(os.Stderr, err)
        os.Exit(1)
    }

    res, err := http.DefaultClient.Do(req)
    if err != nil {
        fmt.Fprintln(os.Stderr, err)
        os.Exit(1)
    }
    res.Body.Close()
    t.done = time.Now()

    fmt.Printf("DNS lookup:       %v\n", t.dnsDone.Sub(t.dnsStart))
    fmt.Printf("TCP connect:      %v\n", t.connectDone.Sub(t.connectStart))
    fmt.Printf("TLS handshake:    %v\n", t.tlsDone.Sub(t.tlsStart))
    fmt.Printf("Server processing: %v\n", t.firstByte.Sub(t.gotConn))
    fmt.Printf("Content transfer: %v\n", t.done.Sub(t.firstByte))
    fmt.Printf("Total:            %v\n", t.done.Sub(t.start))
}

Run it against any URL and the output tells you where the time went. A slow DNS lookup, a slow TLS handshake, or a server that takes a long time to produce the first byte all show up as their own line in the breakdown. No proxy required, no APM agent, no instrumentation library in the dependency graph.

A few things to know. DNSStart and DNSDone only fire if Go's resolver does the lookup - if the address is already in the kernel's DNS cache or you passed an IP directly, the hooks stay quiet. TLSHandshakeStart and TLSHandshakeDone only fire on HTTPS. GotConn fires whether the connection was new or reused, and the GotConnInfo struct has a Reused field that tells you which.

Building a RoundTripper

A one-shot CLI is useful but most of the time you want every request through a given http.Client traced automatically. That is what http.RoundTripper is for. Wrap the default transport, attach a trace to the context before delegating, log the result when the request finishes.

type TracingTransport struct {
    Base http.RoundTripper
    Log  func(req *http.Request, t *timings)
}

func (tt *TracingTransport) RoundTrip(req *http.Request) (*http.Response, error) {
    base := tt.Base
    if base == nil {
        base = http.DefaultTransport
    }

    t := &timings{start: time.Now()}
    trace := newTrace(t)
    req = req.WithContext(httptrace.WithClientTrace(req.Context(), trace))

    res, err := base.RoundTrip(req)
    t.done = time.Now()

    if tt.Log != nil {
        tt.Log(req, t)
    }

    return res, err
}

Using it is just:

client := &http.Client{
    Transport: &TracingTransport{
        Log: func(req *http.Request, t *timings) {
            log.Printf("%s %s dns=%v tls=%v ttfb=%v total=%v",
                req.Method, req.URL,
                t.dnsDone.Sub(t.dnsStart),
                t.tlsDone.Sub(t.tlsStart),
                t.firstByte.Sub(t.gotConn),
                t.done.Sub(t.start),
            )
        },
    },
}

client.Get("https://example.com")

There is one subtlety here that catches people. If the caller already attached a ClientTrace to the request's context, calling httptrace.WithClientTrace again does not replace it - it composes. ContextClientTrace returns the most recently attached trace, but the hooks from both fire. For a RoundTripper that is usually what you want: the caller's tracing is preserved and yours runs alongside it. If you wanted to replace the existing trace you would need to inspect the context yourself.

The other thing to know is that RoundTrip returns when the response headers are read, not when the body is consumed. t.done here measures TTFB plus header read, not total request time including body. For total time including body transfer, wrap res.Body in a reader that records time.Now() on Close:

type timedBody struct {
    io.ReadCloser
    onClose func()
}

func (tb *timedBody) Close() error {
    tb.onClose()
    return tb.ReadCloser.Close()
}

// ...inside RoundTrip, after the base.RoundTrip call:
res.Body = &timedBody{
    ReadCloser: res.Body,
    onClose:    func() { t.done = time.Now() },
}

Now t.done is set when the caller finishes reading the body and closes it, which is what you want for total request duration.

Connection Reuse

One of the more useful things httptrace reveals is whether your client is actually reusing connections. The GotConnInfo struct passed to GotConn includes Reused bool and WasIdle bool fields. If you are making repeated calls to the same host and Reused is false every time, something in your code is preventing pooling - usually a per-request http.Client, or a response body that never gets closed.

Adding it to the trace is one field:

GotConn: func(info httptrace.GotConnInfo) {
    t.gotConn = time.Now()
    if info.Reused {
        log.Printf("connection reused (idle for %v)", info.IdleTime)
    } else {
        log.Printf("new connection to %s", info.Conn.RemoteAddr())
    }
},

This is the kind of thing you would otherwise need a packet capture to verify.

Conclusion

net/http/httptrace is a small surface: one function to attach a trace, one to retrieve it, and a struct of hooks. The context-based design means it composes with anything in the stdlib that already threads context.Context through, which is most things. The CLI and RoundTripper shown here are short enough to drop into any project as a starting point for debugging slow HTTP calls before reaching for distributed tracing or an APM agent.

If you find this tracing information useful then you might appreciate https://probes.dev since we, at Limeleaf, are building such a tool to monitor websites and APIs with this detailed information. Check it out!