Zero-Downtime Ingress Controller Migration in Kubernetes

The Kubernetes community announced the retirement of ingress-nginx, with maintenance ending in March 2026. After that date: no releases, no bugfixes, no security patches. If you’re running ingress-nginx, you need a migration plan.

This article presents a generic blueprint for migrating from ingress-nginx to any alternative ingress controller with zero downtime. The approach relies on two properties of Kubernetes: multiple ingress controllers can run simultaneously, and ingressClassName determines which controller handles which Ingress resource. Combined with DNS-based traffic switching, this allows a gradual, verifiable migration without service interruption.

This post is for: Platform and DevOps engineers running ingress-nginx in production who need to migrate to a different ingress controller without downtime.


TL;DR

  1. Install the new ingress controller alongside ingress-nginx (each gets its own LoadBalancer IP)
  2. Duplicate Ingress resources with ingressClassName pointing to the new controller
  3. Verify the application works through both controllers
  4. Switch DNS from the old IP to the new IP
  5. Keep ingress-nginx running for 24-48h to serve clients with stale DNS caches
  6. Remove old Ingress resources and uninstall ingress-nginx

The full process is a parallel-run migration: both controllers serve traffic simultaneously during the transition, and DNS provides the cutover mechanism.

🔗 Reference Implementation on GitHub


Why This Approach Works

The blueprint applies a well-known pattern from application deployments to infrastructure components: parallel deployment with gradual traffic shifting.

The approach works because of three Kubernetes properties:

  1. Controller isolation via ingressClassName: Each Ingress resource declares which controller should handle it. Two controllers can coexist without interfering with each other, even in the same cluster — just make sure not to configure both as the default ingress class, otherwise Ingress resources without an explicit ingressClassName may get assigned to the wrong controller.

  2. Independent LoadBalancer services: Each ingress controller gets its own external IP through a separate LoadBalancer Service. Traffic routing happens at the DNS level, outside the cluster, giving you full control over the cutover.

  3. Graceful DNS propagation: DNS TTLs provide a built-in transition window. By keeping the old controller running after the DNS switch, clients with cached DNS entries continue to be served until their caches expire.

The result: at no point during the migration is there a moment where incoming traffic has nowhere to go.
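Property 1 is visible directly in the IngressClass objects. A sketch of how the two classes coexist during the migration (resource names are illustrative; in practice the Helm charts create these for you):

```yaml
# Both classes exist side by side; at most one is marked as default,
# so Ingress resources without an explicit ingressClassName stay with nginx.
apiVersion: networking.k8s.io/v1
kind: IngressClass
metadata:
  name: nginx
  annotations:
    ingressclass.kubernetes.io/is-default-class: "true"
spec:
  controller: k8s.io/ingress-nginx
---
apiVersion: networking.k8s.io/v1
kind: IngressClass
metadata:
  name: traefik   # deliberately not the default during the transition
spec:
  controller: traefik.io/ingress-controller
```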

The Blueprint

Step 1: Document the Baseline

Before changing anything, document your current state:

  • Which Ingress resources exist, in which namespaces
  • What nginx-specific annotations are in use (rewrite rules, rate limiting, custom headers, SSL configuration)
  • The current LoadBalancer IP and DNS records pointing to it
  • Any nginx ConfigMap customizations

Nginx-specific annotations won’t carry over to the new controller. Each controller has its own annotation format for features like URL rewriting, header manipulation, or rate limiting. Identifying these early prevents surprises in step 3.
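One way to build that inventory is to extract the nginx-specific annotations from the cluster's Ingress objects. A sketch assuming kubectl and jq are available (the helper name `list_nginx_annotations` is ours):

```shell
# Print "namespace/name  annotation" for every nginx-specific annotation,
# so you know exactly what must be translated in step 3.
list_nginx_annotations() {
  jq -r '.items[]
    | . as $ing
    | ($ing.metadata.annotations // {})
    | to_entries[]
    | select(.key | startswith("nginx.ingress.kubernetes.io/"))
    | "\($ing.metadata.namespace)/\($ing.metadata.name)  \(.key)"'
}

# Usage (reads the live cluster state):
# kubectl get ingress -A -o json | list_nginx_annotations
```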

Step 2: Install the New Controller

Install the target ingress controller alongside ingress-nginx. The important configuration detail: set isDefaultClass: false on the new controller to prevent it from claiming existing Ingress resources.

Example: Traefik values.yaml

```yaml
ingressClass:
  isDefaultClass: false
```

After this step, both controllers are running, each with its own LoadBalancer IP. The existing ingress-nginx setup is completely unaffected since no Ingress resources reference the new controller yet.

Verify by checking that both controller Services have external IPs:

```shell
kubectl get svc -A | grep -E 'LoadBalancer'
```

Step 3: Create Dual Ingress Resources

For each existing Ingress resource, create a second one targeting the new controller. The key fields to change:

  • metadata.name: Use a distinct name (e.g., append -traefik)
  • spec.ingressClassName: Set to the new controller’s class name
  • Annotations: Translate nginx-specific annotations to the new controller’s format

For example, nginx’s rewrite-target annotation becomes a Middleware resource in Traefik, a url-rewrite-target annotation in HAProxy, or an HTTPRoute filter in Gateway API-based controllers. Just to illustrate the difference (don’t replace the existing NGINX ingress yet):

nginx → Traefik Ingress

```diff
 apiVersion: networking.k8s.io/v1
 kind: Ingress
 metadata:
-  name: my-app-nginx
+  name: my-app-traefik
   annotations:
-    nginx.ingress.kubernetes.io/rewrite-target: /$2
+    traefik.ingress.kubernetes.io/router.middlewares: default-strip-prefix@kubernetescrd
 spec:
-  ingressClassName: nginx
+  ingressClassName: traefik
   rules:
     - http:
         paths:
-          - path: /my-app(/|$)(.*)
-            pathType: ImplementationSpecific
+          - path: /my-app
+            pathType: Prefix
             backend:
               service:
                 name: my-app
                 port:
                   number: 80
```

The Traefik Ingress references a Middleware that strips the prefix:

```yaml
apiVersion: traefik.io/v1alpha1
kind: Middleware
metadata:
  name: strip-prefix
spec:
  stripPrefix:
    prefixes:
      - /my-app
```

After applying the new Ingress resources, test thoroughly via the new controller’s IP:

```shell
curl http://<NEW_CONTROLLER_IP>/my-app
```

Both controllers now serve the same application independently. This is the critical verification step. Every route, every rewrite rule, every TLS certificate must work through the new controller before proceeding.

For complex setups, Traefik provides an NGINX Migration Report tool that analyzes existing nginx Ingress resources and reports what needs to change.

Step 4: Switch DNS

Lower your DNS TTL to 60 seconds before the switch (ideally 24h in advance to let the lower TTL propagate). Then update your DNS A/AAAA records to point to the new controller’s LoadBalancer IP.
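In zone-file terms, the change looks like this (hostname and addresses are placeholders from the documentation IP ranges):

```dns
; before (TTL already lowered in advance):
; app.example.com.  60  IN  A  198.51.100.10   ; ingress-nginx LoadBalancer
; after the switch:
app.example.com.    60  IN  A  203.0.113.20    ; new controller LoadBalancer
```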

Do not remove ingress-nginx yet. DNS caches exist at multiple levels: recursive resolvers, operating systems, browsers, and applications. Even with a 60-second TTL, some clients may hold stale records for hours. Keep ingress-nginx running for at least 24-48 hours after the switch.

During this window, monitor both controllers. Traffic on the old controller should taper off as DNS caches expire. Traffic on the new controller should ramp up correspondingly.

Step 5: Remove Old Ingress Resources

Once traffic on ingress-nginx has dropped to zero (or near-zero), delete the nginx-specific Ingress resources:

```shell
kubectl delete ingress my-app-nginx -n default
```

The new controller is now the sole ingress path.

Step 6: Uninstall ingress-nginx

Remove the ingress-nginx Helm release and its namespace:

```shell
helm uninstall ingress-nginx --namespace ingress-nginx
kubectl delete namespace ingress-nginx
```

Verify cleanup:

```shell
kubectl get ingressclass # should only show the new controller
```

Monitoring During Migration

Continuous monitoring during the migration is important for verifying zero downtime. A simple approach: poll your application endpoint every second and log HTTP status codes throughout the process.

The reference implementation includes a monitor.sh script that does exactly this. It tracks request success/failure rates and reports downtime windows with timestamps. Since both controllers serve traffic throughout the DNS transition, no requests should fail: clients either resolve the old IP (old controller still running) or the new IP (new controller ready). DNS resolvers return one or the other, not an error.
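A stripped-down sketch of the idea behind such a poller (the monitor.sh in the reference repo is more complete; the URL is a placeholder for your application's public hostname):

```shell
URL="${1:-http://app.example.com/my-app}"

check() {
  # Print only the HTTP status code of one request; curl reports 000
  # when the connection fails entirely.
  curl -s -o /dev/null -w '%{http_code}' --max-time 5 "$URL"
}

monitor() {
  while true; do
    code="$(check)"
    case "$code" in
      2*|3*) echo "$(date -u +%FT%TZ) OK   $code" ;;
      *)     echo "$(date -u +%FT%TZ) FAIL $code" ;;
    esac
    sleep 1
  done
}

# Uncomment to run continuously during the migration:
# monitor
```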

Choosing a Target Controller

The blueprint above works with any Kubernetes ingress controller. Here’s a brief overview of the main alternatives.

Traefik

Traefik is a CNCF project with native Kubernetes support. It auto-discovers services, handles Let’s Encrypt certificate provisioning out of the box, and supports both the Ingress API and its own IngressRoute CRD for advanced routing. Traefik also supports the Kubernetes Gateway API. Configuration happens primarily through Kubernetes resources (Ingress, IngressRoute, Middleware), which fits well into GitOps workflows.

HAProxy Kubernetes Ingress Controller

The HAProxy Kubernetes Ingress Controller brings HAProxy’s battle-tested load balancing to Kubernetes. It supports advanced traffic management features like connection draining, circuit breaking, and detailed traffic metrics. HAProxy’s strength is raw performance and fine-grained control over load balancing behavior.

Envoy Gateway

Envoy Gateway is a CNCF project that implements the Kubernetes Gateway API using Envoy as the data plane. If you’re looking to adopt the Gateway API (which is the intended successor to the Ingress API), Envoy Gateway is a natural fit. It provides advanced traffic routing, traffic splitting, and extensibility through Envoy’s filter chain.

Cilium

Cilium is primarily known as a CNI plugin using eBPF, but it also includes a full ingress controller and Gateway API implementation. If you’re already using Cilium for networking or network policies, its ingress capabilities eliminate the need for a separate controller. The eBPF-based data plane avoids iptables overhead and can provide better performance at scale.

A Note on Gateway API

The Kubernetes Gateway API is the successor to the Ingress API, and the retirement of ingress-nginx is partly motivated by this shift. Gateway API provides a more expressive resource model (Gateway, HTTPRoute, GRPCRoute) with role-oriented design. If you’re migrating anyway, it may be worth evaluating whether to adopt Gateway API at the same time. Traefik, Envoy Gateway, Cilium, and HAProxy all support it to varying degrees. This is a separate migration, though, and can be done independently after switching controllers.
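For orientation, a minimal HTTPRoute, the Gateway API rough equivalent of an Ingress rule (resource names are illustrative):

```yaml
apiVersion: gateway.networking.k8s.io/v1
kind: HTTPRoute
metadata:
  name: my-app
spec:
  parentRefs:
    - name: my-gateway   # a Gateway managed by the controller
  rules:
    - matches:
        - path:
            type: PathPrefix
            value: /my-app
      backendRefs:
        - name: my-app
          port: 80
```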

Key Considerations

Annotation translation is the hardest part. The actual controller swap is mechanical. Translating nginx-specific behavior (rewrite rules, custom headers, rate limits, authentication) to the new controller’s configuration model takes the most effort. Test each route individually.

TLS certificates need attention. If you’re using cert-manager, your Certificate resources and Ingress TLS configuration may need adjustments for the new controller. Create the new certificates before the DNS switch so they’re ready when traffic arrives.
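With cert-manager's ingress-shim, the duplicated Ingress typically carries its own TLS stanza so the certificate exists before cutover. A sketch (issuer, hostname, and secret name are placeholders):

```yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: my-app-traefik
  annotations:
    cert-manager.io/cluster-issuer: letsencrypt-prod   # placeholder issuer
spec:
  ingressClassName: traefik
  tls:
    - hosts:
        - app.example.com
      secretName: my-app-traefik-tls   # provisioned by cert-manager
  rules:
    - host: app.example.com
      http:
        paths:
          - path: /my-app
            pathType: Prefix
            backend:
              service:
                name: my-app
                port:
                  number: 80
```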

Don’t rush the cleanup. The cost of running two controllers for 48 hours is negligible. The cost of removing the old controller too early while clients still resolve to its IP is downtime.

Automate the verification. For clusters with many Ingress resources, scripting the verification step (curling each route through the new controller) saves time and catches issues that manual testing misses.
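A sketch of such a verification script (the function name and route list are illustrative; adapt them to your cluster):

```shell
# Curl every route through the new controller and report failures.
# Returns non-zero if any route does not answer with a 2xx/3xx status.
verify_routes() {
  base="$1"; shift
  failed=0
  for path in "$@"; do
    code="$(curl -s -o /dev/null -w '%{http_code}' --max-time 5 "$base$path")"
    case "$code" in
      2*|3*) echo "OK   $path ($code)" ;;
      *)     echo "FAIL $path ($code)"; failed=1 ;;
    esac
  done
  return "$failed"
}

# Usage:
# verify_routes "http://<NEW_CONTROLLER_IP>" /my-app /api/health
```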

Conclusion

Zero-downtime ingress controller migration follows a straightforward pattern: run both controllers in parallel, verify through the new one, switch DNS, and clean up. The approach is generic and works with any target controller.

The retirement of ingress-nginx is a forcing function, but the migration pattern itself is useful beyond this specific event. Ingress controllers are infrastructure, and infrastructure changes over time. Having a tested migration blueprint makes that inevitable change less disruptive.

A reference implementation targeting Traefik with step-by-step scripts is available at georg-schwarz/zero-downtime-nginx-to-traefik.

