Why Kubernetes exists.
Kubernetes is an orchestrator for containers: declarative, self-healing, control-loop-shaped. It exists because once you have more than 50 containers running across more than two machines, you start re-inventing the same five things: scheduling, health checks, rolling updates, service discovery, and config distribution. Kubernetes is what you'd build if you ran that fleet in production for a decade. Google did, called it Borg, then open-sourced Kubernetes in 2014 as a new system built on Borg's lessons rather than a port of its code.
The mental model is simple and worth committing to memory: every Kubernetes concept is a declarative resource reconciled by a control loop. You write down what you want; the controller compares it to what is, and acts to close the gap. Pods, Deployments, Services, Ingress, ConfigMaps — every one of them follows the same pattern. The complexity is in the surface area, not the design.
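The loop is easier to trust once you've watched one converge. Here is a toy sketch in plain shell, with no cluster involved: `desired` stands in for a Deployment's replica count and `actual` for what is really running. Both names, and the one-step-per-pass behaviour, are assumptions made for illustration; real controllers watch the API server and handle far more cases.

```shell
#!/bin/sh
# Toy reconciliation loop. Nothing here talks to Kubernetes.
set -eu

desired=3   # what you wrote down (think spec.replicas)
actual=0    # what is observed to be running

# One pass of the loop: observe, compare, take one step toward the goal.
reconcile() {
  if [ "$actual" -lt "$desired" ]; then
    actual=$((actual + 1))
    echo "actual=$actual desired=$desired action=start-one"
  elif [ "$actual" -gt "$desired" ]; then
    actual=$((actual - 1))
    echo "actual=$actual desired=$desired action=stop-one"
  else
    echo "actual=$actual desired=$desired action=none"
  fi
}

# Converge on the initial desired state.
until [ "$actual" -eq "$desired" ]; do reconcile; done

# Simulate a crash: the world drifts, the loop repairs it.
actual=1
until [ "$actual" -eq "$desired" ]; do reconcile; done

# Steady state: nothing left to do.
reconcile
```

Run it and you get three start-one lines, then two more after the simulated crash, then a final action=none. That last line is the state every controller is trying to reach and hold.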
When not to use Kubernetes. Single-binary apps. Fleets under ten containers. Workloads that fit in a managed PaaS (Heroku, Fly.io, Railway). Anything where the operational tax of a control plane (roughly $70/month at list price on EKS or GKE, before you pay for a single node) outweighs the benefit. Kubernetes pays back at scale; it punishes small fleets.
The twelve mental models you must build.
Twelve concepts cover ~95% of Kubernetes' user-facing surface. Get these into your bones in the first week — every other resource (CronJob, HorizontalPodAutoscaler, NetworkPolicy, ResourceQuota) reduces to a recombination of these.
Day-zero — your first hour.
One hour, no cloud credit, no YAML. You'll have a working local cluster, a running pod, and a service you can hit from your browser. Everything below runs on macOS, Linux, or WSL.
    # 1. Install Docker Desktop or Podman, then kind:
    brew install kind                     # macOS
    go install sigs.k8s.io/kind@latest    # any platform with Go

    # 2. Create a cluster (60 seconds)
    kind create cluster --name first

    # 3. Run your first pod
    kubectl run nginx --image=nginx --port=80

    # 4. Expose it as a service
    kubectl expose pod nginx --type=ClusterIP --port=80

    # 5. Tunnel and visit
    kubectl port-forward svc/nginx 8080:80
    # open http://localhost:8080 and there's nginx.

    # 6. Read the events (this is the most useful command in K8s)
    kubectl get events --sort-by=.metadata.creationTimestamp

    # 7. Tear down cleanly
    kubectl delete pod nginx
    kind delete cluster --name first
Done. You've now exercised four of the twelve mental models — Pod, Service, the kubectl client, and the controller loop that scheduled the pod onto your single-node cluster. The next step is muscle memory.
Day-1 to Day-7 — the muscle memory.
Seven days, ~30–60 minutes each. Each day is a single concept and a single hands-on task. By the end of the week your fingers will type kubectl get pods -A faster than you type your name.
Day 1
Pods, containers, exec
Run a pod, exec a shell, read its logs, kill a misbehaving container by hand.
Day 2
Deployments and rolling updates
Deploy nginx, change image, watch the rollout. Reverse it with kubectl rollout undo.
Day 3
Services and DNS
ClusterIP, NodePort, LoadBalancer, Headless — when to use each. nslookup from a debug pod.
Day 4
ConfigMaps, Secrets, env
Twelve-factor configuration in K8s. Mount a Secret as a file; inject a ConfigMap as environment variables.
Day 5
Volumes and PVCs
emptyDir, hostPath, PVC. Why "stateful in K8s" is genuinely hard; why people pay for managed databases.
Day 6
Probes — liveness, readiness, startup
Build a pod that fails its readiness probe. Watch the Service drain it. Use the probe-generator tool.
Day 7
Namespaces and RBAC basics
Create a service account that can only read pods in one namespace. Test it with kubectl auth can-i.
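The Day 7 task fits in a dozen lines of kubectl. A minimal sketch, assuming a reachable cluster (the kind cluster from day zero will do) and a user allowed to impersonate service accounts; the names `demo` and `pod-reader` are made up for the exercise.

```shell
# Hypothetical names for this exercise.
NS=demo
SA=pod-reader

kubectl create namespace "$NS"
kubectl create serviceaccount "$SA" -n "$NS"

# A Role is namespaced: it grants read-only verbs on pods in $NS only.
kubectl create role "$SA" -n "$NS" --verb=get,list,watch --resource=pods

# Bind the Role to the service account.
kubectl create rolebinding "$SA" -n "$NS" \
  --role="$SA" --serviceaccount="$NS:$SA"

# Impersonate the service account and ask the API server what it may do.
AS="system:serviceaccount:$NS:$SA"
kubectl auth can-i list pods   -n "$NS"   --as="$AS"   # expect: yes
kubectl auth can-i delete pods -n "$NS"   --as="$AS"   # expect: no
kubectl auth can-i list pods   -n default --as="$AS"   # expect: no

# Clean up: deleting the namespace removes the Role, binding, and account.
kubectl delete namespace "$NS"
```

The third check is the one worth staring at: a Role plus RoleBinding never leaks outside its namespace. Cluster-wide access needs ClusterRole and ClusterRoleBinding, which is exactly why you should reach for them last.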
Week-1 to Month-3 — pick a track.
After the first week you'll know enough to be dangerous. The next two months should be one track at a time, depth-first. Don't try to learn networking and operators in the same fortnight. Pick the track that maps to your job, finish it, then come back.
Networking
kube-proxy modes (iptables, IPVS) and eBPF-based replacements, CNI choices (Cilium, Calico, Flannel), Network Policies, Service Mesh basics.
Storage
CSI drivers, dynamic provisioning, StatefulSet caveats, why volume snapshots are not backups.
Scheduling
Taints, tolerations, affinity, anti-affinity, priority classes, eviction. Why your pod is Pending.
Observability
Prometheus + Grafana, OpenTelemetry, kubectl logs vs Loki, kubectl top and the metrics-server behind it.
Security
RBAC patterns, Pod Security Standards, NetworkPolicy templates, Kyverno / OPA Gatekeeper.
CI/CD
Argo CD, Flux, Helm, Kustomize. GitOps as the default mental model.
Operators
kubebuilder, Operator SDK, controller-runtime. When to write your own CRD vs ship a Helm chart.
The four books that matter.
2nd ed · 2023 · Manning
Marko Lukša — Kubernetes in Action
~700 pages and the most coherent narrative anyone has written. Ignore its length: it's the textbook of the field. Read it in order; do the exercises.
3rd ed · 2022 · O'Reilly
Brendan Burns, Joe Beda, Kelsey Hightower — Kubernetes Up & Running
Terse, opinionated, written by the people who shipped it. The "what would Kelsey do" reference. Pair with Lukša.
2019 · O'Reilly
Hausenblas & Schimanski — Programming Kubernetes
For the day you write your first controller. controller-runtime, custom resources, the API machinery internals. Dated in places (2019) but still the right starting point.
2020 · O'Reilly
Liz Rice — Container Security
The security book that doesn't read like a checklist. Capabilities, namespaces, seccomp, the kernel-level reasoning behind every Pod Security Standard.
Honourable mentions: Cloud Native DevOps with Kubernetes (Arundel/Domingus), Production Kubernetes (Rosso/Lander/Brand/Harris), Designing Distributed Systems (Burns).
Courses, labs, and where to actually run things.
Free first. Paid second only when you've outgrown the free tier and want exam-grade lab access.
Free
Kelsey Hightower · GitHub
Kubernetes the Hard Way
Build a cluster from raw VMs. Painful. Irreplaceable. Every operator should do it once.
Linux Foundation · edX
LFS158 — Introduction to Kubernetes
Free, official, 14 chapters, ~10 hours. The standard "I want my résumé to say I know K8s" course.
VMware · KubeAcademy
Free video courses
Beginner, Intermediate, and "K8s by example" tracks. Video-first. Fast.
Killercoda
Browser-based labs
Free interactive scenarios. Killer.sh runs the official CKA/CKAD simulators on the same engine.
Docker
Play with Kubernetes
Throwaway browser cluster. Four-hour sessions; sign in with a Docker or GitHub account.
Paid (worth it)
KodeKloud
CKA / CKAD / CKS prep
Best-in-class for cert prep. Hands-on labs match the exam style. ~$30/month gets you through one cert.
A Cloud Guru / Pluralsight
Broader cloud-native paths
Less hands-on than KodeKloud, broader coverage. Nigel Poulton's path is the standout.
Linux Foundation
Official cert prep courses
LFS158 (free intro), LFS258 (CKA), LFD259 (CKAD), LFS260 (CKS). The canonical bundle.
The four-step certification ladder.
Certifications are not the goal — but the CKA and CKAD exam formats (hands-on, 2 hours, real cluster, must-finish-fast) are some of the best forcing functions for memorisation that exist in the industry. The KCNA is the cheap warm-up; CKS is the late-stage flex.
| Cert | Audience | Prep | Format |
|---|---|---|---|
| KCNA | Day-zero → Practitioner | ~40 hours | Multiple choice. Vocabulary cert. |
| CKA | Operator | 80–120 hours | Hands-on. 2 hours, roughly 15–20 tasks, real cluster. The flagship. |
| CKAD | Practitioner | 60–100 hours | Hands-on. Closer to "build a workload" than "operate a cluster". |
| CKS | Operator + security | 100–150 hours | Hands-on. Requires CKA first. The "you really know K8s" badge. |
Vouchers go on sale at KubeCon and during CNCF promo weeks, so wait for them. The hands-on certs have listed at around $395 each (the KCNA costs less); you can usually get the bundle for $200–250.
The documentation canon.
Get used to reading these directly. They are the source of truth — every blog, every Stack Overflow answer, every YouTube tutorial is a derivative.
kubernetes.io
Kubernetes documentation
The primary source. Tasks, tutorials, concepts, reference. Pin the table-of-contents.
kubernetes.io
kubectl reference
Every flag, every verb. kubectl explain <resource> is the most under-used command on Earth.
kubernetes/community
API conventions
The document every controller author re-reads twice a year. Status vs spec, optional vs required, the meaning of "PATCH".
CNCF
CNCF Landscape
The map. ~1,200 cloud-native projects categorised. Useful when picking observability/CI/security tools — and humbling.
Hannaford · GitHub
what-happens-when-k8s
A trace of every code path between kubectl run nginx and a running pod. Read once a year.
Community · GitHub
Awesome Kubernetes
Curated link list. Use as a discovery tool, not as a reading order.
Talks and videos worth your evening.
Kelsey Hightower · 2017
Kubernetes for sysadmins
The talk. Forty-three minutes. Gets the architecture across better than most books.
Joe Beda · YouTube archive
TGI Kubernetes
Long-form live coding. Where TGIK started.
CNCF · YouTube
KubeCon recordings
Every KubeCon since 2016. Filter by track. The "best practices" tracks age well; the "what's new" ones don't.
Brendan Burns
Designing Distributed Systems talks
His Microsoft-era talks, covering the same ground as the Designing Distributed Systems book. The "what we'd do differently" perspective from a Kubernetes co-founder.
Hands-on environments — where to actually run things.
| Environment | Cost | Best for |
|---|---|---|
| kind | Free, local | Day-zero through CKA prep. Runs each node as a Docker container. |
| minikube | Free, local | Same niche as kind. Slightly different defaults; more docs. |
| k3d | Free, local | k3s-based. Very fast spin-up. Lighter than kind. |
| k3s | Free, single-node Pi | Edge / homelab learning. Resource-tiny. |
| GKE Autopilot | Pay-per-pod | Cheapest cloud cluster — under $5/day for tinkering. |
| EKS / AKS | Pay-per-cluster + nodes | Worth doing once for "real cloud" exposure. |
| Civo / Linode LKE | Cheap managed | Cheapest "real cloud" K8s. ~$10/month for a hobby cluster. |
| Killercoda / PWK | Free, browser | Time-boxed practice without setup. Best for cert prep simulators. |
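For drain, taint, and scheduling practice you want more than one node, and kind can fake that on a laptop. A minimal sketch, assuming Docker is running and kind is installed; the file name `kind-three-nodes.yaml` and the cluster name `practice` are made up.

```shell
# Describe a three-node cluster using kind's v1alpha4 config schema.
cat > kind-three-nodes.yaml <<'EOF'
kind: Cluster
apiVersion: kind.x-k8s.io/v1alpha4
nodes:
  - role: control-plane
  - role: worker
  - role: worker
EOF

kind create cluster --name practice --config kind-three-nodes.yaml

# Each Kubernetes "node" is just a container on your machine.
docker ps --filter "name=practice" --format '{{.Names}}'
kubectl get nodes

# Tear down when finished.
kind delete cluster --name practice
```

With two workers you can drain one and watch pods reschedule onto the other, which a single-node cluster can never show you.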
kubectl cheat sheet.
Thirty commands cover ~80% of operator daily use. Memorise these; everything else is kubectl explain.
| Command | What it does |
|---|---|
| kubectl get pods -A | List every pod, every namespace. |
| kubectl get pods -w | Watch pods change state in real time. |
| kubectl describe pod <p> | Read events, conditions, container statuses. |
| kubectl logs <p> -c <c> | Logs from a specific container in a pod. |
| kubectl logs <p> -f --tail=200 | Follow logs, last 200 lines. |
| kubectl logs <p> --previous | Logs from the previous (crashed) container. |
| kubectl exec -it <p> -- sh | Shell into a running container. |
| kubectl port-forward <p> 8080:80 | Tunnel local 8080 to pod port 80. |
| kubectl apply -f <file> | Declarative create or update. |
| kubectl delete -f <file> | Inverse of apply. |
| kubectl edit deployment <d> | Open a deployment in $EDITOR. |
| kubectl rollout status deployment/<d> | Wait for a rollout to finish. |
| kubectl rollout undo deployment/<d> | Roll back to the previous revision. |
| kubectl scale deployment/<d> --replicas=5 | Manual scale. |
| kubectl get events --sort-by=.metadata.creationTimestamp | Time-ordered cluster events. |
| kubectl explain pod.spec.containers | Schema reference for any field. |
| kubectl auth can-i list pods | Check what your current identity can do. |
| kubectl get pods -l app=foo | Filter by label. |
| kubectl top pods | Live CPU/memory (needs metrics-server). |
| kubectl config get-contexts | List configured clusters. |
| kubectl config use-context <ctx> | Switch active cluster. |
| kubectl cp <p>:/path local-path | Copy a file out of a pod. |
| kubectl run -it --rm tmp --image=busybox -- sh | Throwaway debug pod. |
| kubectl get all -n <ns> | Pods, services, deployments, etc. in one view. |
| kubectl drain <node> --ignore-daemonsets | Cordon and evict pods for maintenance. Without the flag, drain refuses to proceed when DaemonSet pods are present. |
| kubectl uncordon <node> | Mark a node schedulable again. |
| kubectl taint nodes <n> key=value:NoSchedule | Reserve nodes for specific workloads. |
| kubectl label pod <p> env=staging | Add a label. |
| kubectl annotate pod <p> note="restart-me" | Add an annotation. |
| kubectl proxy | Local proxy to the API server. Try /api/v1/namespaces. |
Bonus: the official kubectl quick reference covers the long tail. Also: kubectx + kubens are mandatory.
Common mistakes you'll make on day 30.
Patterns that bite every team. Put the prevention in place before you ship.
- No requests / limits
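- Without requests the scheduler cannot pack nodes effectively, and the pod lands in the BestEffort QoS class: first to be evicted, and first in line for the OOM killer, when a node runs short of memory. Set resources.requests on every container.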
- Missing PodDisruptionBudget
- Drains kill quorum on stateful workloads. PDB tells K8s how many pods must stay up during voluntary disruptions.
- Hardcoded namespaces in YAML
- Breaks Helm and Kustomize. Use --namespace at apply-time, or template it.
- Liveness == readiness
- Restart loops on cold start. Liveness should detect deadlock, not "not ready yet". Use startupProbe for slow-boot apps.
- Ingress without TLS
- Use cert-manager + Let's Encrypt. Set HSTS at the ingress.
- Secrets in ConfigMap
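- ConfigMap data is stored as plain text, visible to anyone who can read the object. Use Secret resources (base64-encoded, not encrypted, unless you enable encryption at rest), ideally with an external secret manager (Vault, AWS Secrets Manager).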
- No namespace ResourceQuota
- One workload eats the cluster. ResourceQuota at the namespace level keeps tenants honest.
- image: latest
- Rollouts become unreproducible. Pin the image digest, or at least a unique tag such as the git commit SHA.
- No HPA, no VPA, no Cluster Autoscaler
- Static fleet sizing wastes money during quiet hours and pages the on-call during peaks.
- Cluster-admin everywhere
- Audit fails. Real RBAC takes a week to learn; skipping it costs you for as long as the cluster lives.
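Several of the fixes above fit in one manifest. A minimal sketch, assuming a reachable cluster; the name `web`, the image tag, the probe paths, and every threshold are illustrative choices, not recommendations. A real app would expose separate health endpoints instead of probing `/` three times.

```shell
# Apply a Deployment and a PodDisruptionBudget in one go.
# No namespace is hardcoded: pass -n <ns> here if you want one.
kubectl apply -f - <<'EOF'
apiVersion: apps/v1
kind: Deployment
metadata:
  name: web
spec:
  replicas: 3
  selector:
    matchLabels: {app: web}
  template:
    metadata:
      labels: {app: web}
    spec:
      containers:
        - name: web
          image: nginx:1.25.3          # pinned tag, not :latest; a digest is stricter
          ports:
            - containerPort: 80
          resources:
            requests: {cpu: 100m, memory: 64Mi}
            limits: {memory: 128Mi}
          startupProbe:                # allows up to 30 x 2s of boot time
            httpGet: {path: /, port: 80}
            periodSeconds: 2
            failureThreshold: 30
          readinessProbe:              # failing removes the pod from Service endpoints
            httpGet: {path: /, port: 80}
            periodSeconds: 5
          livenessProbe:               # failing restarts the container
            httpGet: {path: /, port: 80}
            periodSeconds: 10
            failureThreshold: 3
---
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: web
spec:
  minAvailable: 2                      # a drain may take down one of three at a time
  selector:
    matchLabels: {app: web}
EOF

kubectl rollout status deployment/web
kubectl get pdb web
```

The startupProbe is what keeps liveness from killing a slow-booting container: liveness and readiness checks only begin once it has succeeded.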
Semicolony's own Kubernetes assets.
Everything Semicolony publishes on Kubernetes, in one grid. Use these as the "play with the concept after you read the chapter" companions.
Guides (how it works)
- → Pod creation, end to end
- → Kubernetes networking — kube-proxy, CNI, NetworkPolicy
- → Autoscaling — HPA, VPA, Cluster Autoscaler, Karpenter
- → Containers — namespaces, cgroups, the kernel mechanics
- → Service discovery
Simulators
- → Rolling deployments — maxSurge / maxUnavailable in action
- → Pod eviction — the kubelet's decision tree
Tools
- → Probe generator — liveness / readiness / startup YAML
- → Resource calculator — requests, limits, QoS class
Going deeper — internals
Once you can use Kubernetes, the next stop is understanding how it works under the hood: the api-server, etcd, the scheduler framework, the controller pattern. The multi-page Kubernetes internals guide covers the control plane, the data plane, the apply lifecycle (twelve hops, end to end), and the controller pattern itself. It is written for infra engineers, SREs, and operator authors.
The six-month roadmap.
A visual timeline. Each station is where most learners are at that point — not a strict gate.
kubectl flashcards.
A practice deck. The questions are the kinds of "you have ten seconds to type the right command" moments the CKA exam puts in front of you. Reveal the answer when you've taken your guess.
Sample card (1 of 10)
You suspect a pod is crash-looping. What single command shows logs from the previous container instance? (Answer: kubectl logs <p> --previous.)
Suggested sequences
Reading progressions
Three ordered paths through this material — pick the one that matches where you are.
Keep going.
Kubernetes rewards repetition. The CKA exam is mostly muscle memory; production operation is mostly pattern recognition; writing a controller is mostly knowing what conventions exist. None of these are intellectual feats. They're feats of accumulation.
Build a homelab cluster on a Raspberry Pi or a spare Mac mini. Break it. Read the kubectl source — it's Go and surprisingly readable. Pick one operator and read its entire reconciler from the top of the file. Subscribe to KubeWeekly and read it every Tuesday. In six months you'll be one of the people the rest of the team asks.
If you find yourself stuck, three places help: the Kubernetes Slack, the /r/kubernetes subreddit, and the SIG-specific GitHub repos. Use them.