How to load balance the Kubernetes API server with NGINX

A lab-oriented explanation of placing NGINX in front of Kubernetes API servers, with notes on high availability alternatives.

What you will learn

Why a stable Kubernetes control plane endpoint matters.
How NGINX can distribute API server traffic.
What to watch for when running self-managed clusters.

Problem statement

In a self-managed Kubernetes cluster, worker nodes, administrators, and automation need a stable API endpoint. If you have more than one control plane node, clients should not be configured against one individual node. A load balancer gives the cluster a single endpoint while distributing traffic across healthy API servers.

When this matters in real work

This pattern matters when you run your own control plane, build a lab that resembles production, or need a clear mental model for highly available Kubernetes control plane access. Managed Kubernetes services usually provide this endpoint for you. Self-managed clusters require you to design it.

Prerequisites

At least two Kubernetes control plane nodes listening on TCP port 6443.
An NGINX host or pair of hosts that clients can reach.
Firewall rules that allow TCP 6443 from clients to NGINX and from NGINX to the API servers.
Kubernetes certificates that include the load balancer DNS name or virtual IP in the API server SANs.

NGINX TCP stream example

The Kubernetes API uses HTTPS, but the load balancer does not need to terminate TLS. Use the NGINX stream module to pass TCP traffic through to the API servers.

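stream {
    upstream kube_apiservers {
        # Passive health checks: after 3 failed attempts within 10s,
        # a server is skipped for the next 10s.
        server 10.0.0.11:6443 max_fails=3 fail_timeout=10s;
        server 10.0.0.12:6443 max_fails=3 fail_timeout=10s;
        server 10.0.0.13:6443 max_fails=3 fail_timeout=10s;
    }

    server {
        listen 6443;                # plain TCP; TLS passes through untouched
        proxy_pass kube_apiservers;
        proxy_connect_timeout 3s;   # give up on an unreachable API server quickly
        proxy_timeout 10m;          # close connections that stay idle for 10 minutes
    }
}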
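
Where the stream block lives is a frequent source of confusion, so here is a minimal sketch of the surrounding nginx.conf layout. It assumes the stream module may be packaged as a dynamic module, which is the case on some distributions; the module path shown is an assumption and differs between packages. Running `nginx -V` shows `--with-stream` when the module is compiled in and `--with-stream=dynamic` when it has to be loaded.

```nginx
# Top of nginx.conf (main context).
# Only needed when stream is built as a dynamic module.
# The path is an assumption; check your distribution's package layout.
load_module modules/ngx_stream_module.so;

events {}

http {
    # Existing HTTP sites, if any. The API server proxy does NOT go here.
}

# stream {} is a sibling of http {}, never nested inside it.
stream {
    # upstream kube_apiservers { ... } and server { listen 6443; ... } go here.
}
```

If the module is compiled in statically, the load_module line should be left out.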

After changing the configuration, validate and reload NGINX:

nginx -t
systemctl reload nginx

Common mistakes

Using an HTTP reverse proxy configuration instead of TCP stream load balancing.
Forgetting to include the load balancer address in the API server certificate SANs.
Making NGINX a single point of failure without a virtual IP, redundant node, or cloud load balancer in front of it.
Testing only kubectl access and not node bootstrap, controllers, and automation that also depend on the API endpoint.

Production notes and security tradeoffs

Keep TLS end-to-end unless you have a deliberate reason to terminate it. Restrict who can reach port 6443, monitor NGINX health, and make sure the load balancer layer is itself highly available. For production environments, also compare this approach with a managed cloud load balancer or a purpose-built HA design such as HAProxy with Keepalived.
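
As one possible way to restrict access and gain some visibility at the NGINX layer, the sketch below adds source-address filtering and a per-connection access log to the earlier example. The subnet, the log path, and the log format name `kube_tcp` are assumptions chosen for illustration; the directives and variables come from the standard stream modules. This complements host or network firewall rules and does not replace them.

```nginx
stream {
    # Hypothetical format name; one line is written when each TCP session ends.
    log_format kube_tcp '$remote_addr -> $upstream_addr '
                        'status=$status session=$session_time '
                        'sent=$bytes_sent received=$bytes_received';

    upstream kube_apiservers {
        server 10.0.0.11:6443 max_fails=3 fail_timeout=10s;
        server 10.0.0.12:6443 max_fails=3 fail_timeout=10s;
        server 10.0.0.13:6443 max_fails=3 fail_timeout=10s;
    }

    server {
        listen 6443;

        # Assumed node and admin subnet; adjust to your own network.
        allow 10.0.0.0/24;
        deny  all;

        # Hypothetical log path.
        access_log /var/log/nginx/kube-api.log kube_tcp;

        proxy_pass kube_apiservers;
        proxy_connect_timeout 3s;
        proxy_timeout 10m;
    }
}
```

Because TLS is passed through, the log shows only connection-level data (peer addresses, byte counts, session duration), not API paths or user identities. Connections rejected by the deny rule are logged with status 403.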

Summary

NGINX can be a useful TCP load balancer for a self-managed Kubernetes API endpoint. The important details are certificate SANs, stream configuration, health behavior, access control, and avoiding a new single point of failure.

Traffic flow diagram

The basic lab shape is a client or worker node talking to one stable endpoint, with NGINX forwarding to healthy API server instances.

kubectl / kubelet / controllers
          |
          v
   NGINX TCP load balancer
      |        |        |
      v        v        v
 api-1:6443 api-2:6443 api-3:6443
          |
          v
        etcd quorum

Lab pattern, not production HA

A single NGINX node is useful for learning the traffic flow, passive health checks, and TLS pass-through behavior, but it is still a single point of failure. For real high availability, evaluate a managed Kubernetes control plane, a cloud load balancer, HAProxy with Keepalived, kube-vip, or a platform-supported virtual IP design.
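
To experiment with the health behavior in the lab, a variation like the following may be useful. It is a sketch rather than a recommended production setting: `least_conn` is swapped in for the default round-robin on the assumption that long-lived connections (for example watches) spread more evenly that way, and the retry limit of 2 is an arbitrary illustrative value. Open source NGINX only performs the passive checks shown here; the active `health_check` directive for stream is part of the commercial NGINX Plus offering.

```nginx
stream {
    upstream kube_apiservers {
        # Prefer the API server with the fewest active connections.
        least_conn;

        # Passive checks only: failures are noticed when real connections fail.
        server 10.0.0.11:6443 max_fails=3 fail_timeout=10s;
        server 10.0.0.12:6443 max_fails=3 fail_timeout=10s;
        server 10.0.0.13:6443 max_fails=3 fail_timeout=10s;
    }

    server {
        listen 6443;
        proxy_pass kube_apiservers;
        proxy_connect_timeout 3s;
        proxy_timeout 10m;

        # If connecting to one API server fails, try another one
        # (on is already the default; shown for clarity).
        proxy_next_upstream on;
        # Illustrative cap on how many servers are tried per client connection.
        proxy_next_upstream_tries 2;
    }
}
```

Stopping one API server and watching how quickly new connections move to the remaining ones is a simple way to observe what `max_fails`, `fail_timeout`, and `proxy_connect_timeout` do in practice.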

Also separate API-server availability from etcd health. A load balancer can send traffic to healthy API servers, but it cannot fix an unhealthy etcd quorum or broken certificates.
