Kubernetes for Beginners: Complete Guide to Container Orchestration
Kubernetes has become the default platform for running containerized applications in production. Over 96% of organizations surveyed by the CNCF in 2025 reported using or evaluating Kubernetes, and every major cloud provider offers a managed Kubernetes service. If you are a cloud engineer, DevOps practitioner, or software developer building applications that need to scale, understanding Kubernetes is no longer optional.
This guide is structured for someone with basic Docker knowledge who has never deployed a Kubernetes cluster. By the end, you will understand the core architecture, be able to deploy applications with deployments and services, configure storage and networking, implement basic security with RBAC, and know how to choose between EKS, AKS, and GKE for managed production clusters.
Why Kubernetes Exists
Before Kubernetes, scaling containerized applications meant writing custom scripts to start containers across servers, implementing health checks manually, building service discovery from scratch, and handling rolling updates without downtime through bespoke deployment pipelines. Every team solved these problems differently, creating snowflake infrastructure that was expensive to maintain.
Kubernetes solves this by providing a declarative API for defining your desired state — "I want 5 replicas of my web server, each with 512 MB of memory, behind a load balancer" — and a control plane that continuously reconciles actual state with desired state. If a node fails, Kubernetes reschedules pods on healthy nodes. If traffic spikes, the Horizontal Pod Autoscaler adds replicas. If you push a new container image, Kubernetes performs a rolling update with zero downtime.
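That desired-state model extends to autoscaling. A minimal HorizontalPodAutoscaler sketch, assuming a Deployment named web-app and a metrics server running in the cluster:

```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: web-app-hpa
spec:
  scaleTargetRef:            # which workload to scale
    apiVersion: apps/v1
    kind: Deployment
    name: web-app
  minReplicas: 3
  maxReplicas: 10
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 70   # add replicas when average CPU exceeds 70% of requests
```

You declare the bounds and the target; the control plane does the reconciling.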
Google developed Kubernetes based on 15 years of running production containers with Borg and Omega. It was open-sourced in 2014 and donated to the Cloud Native Computing Foundation (CNCF) in 2015. That lineage matters — the design decisions baked into Kubernetes reflect lessons learned from running millions of containers at planet scale.
Core Architecture
A Kubernetes cluster consists of a control plane (the brain) and worker nodes (the muscle).
Control Plane Components
kube-apiserver — The front door to Kubernetes. Every operation — creating a pod, scaling a deployment, reading logs — goes through the API server. It validates requests, authenticates callers, and persists state to etcd. In production, the API server runs as multiple replicas behind a load balancer for high availability.
etcd — A distributed key-value store that holds all cluster state. Every resource you create (pods, services, secrets, config maps) is serialized and stored in etcd. This is the single source of truth. If etcd is lost and unrecoverable, the cluster state is gone. Production clusters back up etcd every 30 minutes at minimum.
kube-scheduler — Watches for newly created pods that have no assigned node and selects a node based on resource requirements, affinity rules, taints, tolerations, and topology constraints. The scheduler does not run pods — it only decides where they should run.
kube-controller-manager — Runs a collection of controllers that watch for state drift and reconcile. The ReplicaSet controller ensures the correct number of pod replicas exist. The Node controller detects when nodes go offline. The Job controller manages batch workloads. Each controller operates independently, watching its resource type and taking action when actual state diverges from desired state.
Worker Node Components
kubelet — An agent running on every node that receives pod specifications from the API server and ensures the described containers are running. It reports node status and pod health back to the control plane. If a container crashes, kubelet restarts it according to the pod's restart policy.
kube-proxy — Maintains network rules on each node that implement Kubernetes Services. When you create a Service, kube-proxy configures iptables or IPVS rules so that traffic sent to the Service's cluster IP is forwarded to healthy pod endpoints.
Container runtime — The software that actually runs containers. Kubernetes supports containerd (the default since v1.24), CRI-O, and other CRI-compliant runtimes. The dockershim that let Docker Engine act as the runtime was removed in Kubernetes 1.24, though Docker-built images still work fine — they are standard OCI images, and only the runtime interface changed.
Fundamental Resources
Pods
A Pod is the smallest deployable unit in Kubernetes — one or more containers that share a network namespace and storage volumes. In practice, most pods run a single application container. Multi-container pods are used for sidecars (log shippers, proxies, secret injectors).
```yaml
apiVersion: v1
kind: Pod
metadata:
  name: nginx-basic
  labels:
    app: web
spec:
  containers:
  - name: nginx
    image: nginx:1.27-alpine
    ports:
    - containerPort: 80
    resources:
      requests:
        cpu: 100m
        memory: 128Mi
      limits:
        cpu: 250m
        memory: 256Mi
```
Always set resource requests and limits. Requests determine scheduling — the scheduler places the pod on a node with at least 100m CPU and 128Mi memory available. Limits prevent a runaway container from consuming all node resources and affecting other pods.
Deployments
You rarely create pods directly. Deployments manage the lifecycle of pods through ReplicaSets, providing declarative updates, rollbacks, and scaling.
```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: web-app
spec:
  replicas: 3
  selector:
    matchLabels:
      app: web-app
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxUnavailable: 1
      maxSurge: 1
  template:
    metadata:
      labels:
        app: web-app
    spec:
      containers:
      - name: app
        image: myregistry/web-app:v2.1.0
        ports:
        - containerPort: 8080
        readinessProbe:
          httpGet:
            path: /healthz
            port: 8080
          initialDelaySeconds: 5
          periodSeconds: 10
        livenessProbe:
          httpGet:
            path: /healthz
            port: 8080
          initialDelaySeconds: 15
          periodSeconds: 20
        resources:
          requests:
            cpu: 200m
            memory: 256Mi
          limits:
            cpu: 500m
            memory: 512Mi
```
Key details: readinessProbe controls when a pod receives traffic — it will not be added to Service endpoints until the readiness check passes. livenessProbe determines if a container is still healthy — a failing liveness probe triggers a container restart. These probes are essential for production reliability.
Services
Services provide stable networking for pods. Since pods are ephemeral — they get new IP addresses when rescheduled — Services give you a consistent DNS name and IP address that routes to healthy pod endpoints.
```yaml
apiVersion: v1
kind: Service
metadata:
  name: web-app-service
spec:
  type: ClusterIP
  selector:
    app: web-app
  ports:
  - port: 80
    targetPort: 8080
```
Service Types:
- ClusterIP (default) — Accessible only within the cluster. Use for internal service-to-service communication.
- NodePort — Exposes the service on a static port (30000-32767) on every node's IP. Rarely used in production.
- LoadBalancer — Provisions a cloud provider load balancer (a Network Load Balancer or Classic ELB on AWS, Azure Load Balancer, a passthrough Network LB on GCP). The standard way to expose services to the internet on managed Kubernetes; HTTP-aware load balancers such as AWS ALB are provisioned through Ingress instead.
Ingress
For HTTP/HTTPS routing, Ingress provides path-based and host-based routing to multiple backend services through a single load balancer.
```yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: web-ingress
  annotations:
    nginx.ingress.kubernetes.io/ssl-redirect: "true"
spec:
  ingressClassName: nginx
  tls:
  - hosts:
    - app.example.com
    secretName: tls-secret
  rules:
  - host: app.example.com
    http:
      paths:
      - path: /
        pathType: Prefix
        backend:
          service:
            name: web-app-service
            port:
              number: 80
      - path: /api
        pathType: Prefix
        backend:
          service:
            name: api-service
            port:
              number: 80
```
You need an Ingress Controller installed in the cluster for Ingress resources to function. Popular choices include NGINX Ingress Controller, Traefik, and cloud-native options like the AWS Load Balancer Controller (formerly the ALB Ingress Controller).
Configuration and Secrets
ConfigMaps
Store non-sensitive configuration data that can be injected into pods as environment variables or mounted as files.
```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: app-config
data:
  DATABASE_HOST: "postgres.default.svc.cluster.local"
  LOG_LEVEL: "info"
  MAX_CONNECTIONS: "100"
```
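One way to consume this ConfigMap is to inject every key as an environment variable with envFrom (the pod name and image here are illustrative):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: app-with-config
spec:
  containers:
  - name: app
    image: myregistry/web-app:v2.1.0
    envFrom:
    - configMapRef:
        name: app-config   # every key in the ConfigMap becomes an env var
```

Alternatively, mount the ConfigMap as a volume when the application expects config files rather than environment variables.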
Secrets
Secrets hold sensitive data — API keys, database passwords, TLS certificates. They are base64-encoded (not encrypted) by default. In production, enable encryption at rest for etcd and consider external secret managers.
```yaml
apiVersion: v1
kind: Secret
metadata:
  name: db-credentials
type: Opaque
stringData:
  username: app_user
  password: "use-external-secret-manager-in-production"
```
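To see why base64 offers no protection, here is a quick shell check (no cluster needed) that round-trips the username from the Secret above:

```shell
# Base64 is an encoding, not encryption -- anyone who can read
# the Secret object can recover the plaintext instantly.
echo -n 'app_user' | base64
# YXBwX3VzZXI=
echo 'YXBwX3VzZXI=' | base64 -d
# app_user
```

This is exactly what `kubectl get secret db-credentials -o yaml` plus a pipe through `base64 -d` does, which is why RBAC on Secrets and etcd encryption at rest both matter.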
For production workloads, use the External Secrets Operator to sync secrets from AWS Secrets Manager, Azure Key Vault, or GCP Secret Manager into Kubernetes Secrets automatically.
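As a sketch of what that looks like with the External Secrets Operator (the store name aws-secrets-manager and the remote key prod/db-credentials are assumptions, and the operator's CRDs must already be installed):

```yaml
apiVersion: external-secrets.io/v1beta1
kind: ExternalSecret
metadata:
  name: db-credentials
spec:
  refreshInterval: 1h           # re-sync from the external store hourly
  secretStoreRef:
    name: aws-secrets-manager   # assumed ClusterSecretStore, configured separately
    kind: ClusterSecretStore
  target:
    name: db-credentials        # Kubernetes Secret the operator creates and keeps in sync
  data:
  - secretKey: password
    remoteRef:
      key: prod/db-credentials  # assumed secret name in AWS Secrets Manager
      property: password
```

The secret value never lives in Git; only this pointer to the external store does.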
Storage
PersistentVolumes and PersistentVolumeClaims
Pods are ephemeral — when a pod is deleted, its local filesystem is gone. For databases, file uploads, and any workload that needs data to survive pod restarts, you need persistent storage.
```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: postgres-data
spec:
  accessModes:
  - ReadWriteOnce
  storageClassName: gp3
  resources:
    requests:
      storage: 50Gi
```
On managed Kubernetes, StorageClasses map to cloud provider storage — gp3 on EKS provisions EBS gp3 volumes, managed-csi on AKS provisions Azure Managed Disks, and standard-rwo on GKE provisions Persistent Disks.
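A claim does nothing until a pod mounts it. A minimal sketch of a Postgres pod using the claim above (the image tag and mount path are assumptions, and a real database would typically run as a StatefulSet):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: postgres
spec:
  containers:
  - name: postgres
    image: postgres:16-alpine
    volumeMounts:
    - name: data
      mountPath: /var/lib/postgresql/data   # data survives pod restarts
  volumes:
  - name: data
    persistentVolumeClaim:
      claimName: postgres-data              # binds the PVC defined above
```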
RBAC: Role-Based Access Control
RBAC controls who can do what in your cluster. It is enabled by default on all managed Kubernetes services.
```yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  namespace: production
  name: pod-reader
rules:
- apiGroups: [""]
  resources: ["pods", "pods/log"]
  verbs: ["get", "list", "watch"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  namespace: production
  name: read-pods-binding
subjects:
- kind: User
  name: "developer@example.com"
  apiGroup: rbac.authorization.k8s.io
roleRef:
  kind: Role
  name: pod-reader
  apiGroup: rbac.authorization.k8s.io
```
Principle of least privilege: give users and service accounts only the permissions they need. Start restrictive and expand as needed, rather than starting with cluster-admin and trying to lock down later.
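Workloads follow the same model through ServiceAccounts. A sketch granting the pod-reader Role above to a hypothetical log-shipper ServiceAccount (the account name is an illustrative assumption):

```yaml
apiVersion: v1
kind: ServiceAccount
metadata:
  name: log-shipper
  namespace: production
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: log-shipper-read-pods
  namespace: production
subjects:
- kind: ServiceAccount       # a workload identity, not a human user
  name: log-shipper
  namespace: production
roleRef:
  kind: Role
  name: pod-reader           # reuses the Role defined above
  apiGroup: rbac.authorization.k8s.io
```

Pods that set spec.serviceAccountName: log-shipper can then list and read pod logs in the production namespace, and nothing else.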
Managed Kubernetes: EKS vs AKS vs GKE
Amazon EKS
EKS charges $0.10/hour for the control plane ($73/month). Worker nodes are billed as regular EC2 instances. EKS integrates deeply with AWS IAM (IRSA for pod-level IAM roles), ALB for ingress, EBS/EFS for storage, and CloudWatch for logging. Fargate profiles allow serverless pod execution without managing nodes.
Best for: AWS-centric organizations, complex multi-account architectures, workloads needing deep AWS service integration.
Azure AKS
AKS does not charge for the control plane on the standard tier (the uptime SLA tier costs $0.10/hour). Worker nodes are billed as Azure VMs. AKS integrates with Entra ID for authentication, Azure Monitor for observability, and Azure Policy for governance. Virtual Nodes (Azure Container Instances) provide serverless burst capacity.
Best for: Microsoft shops, Azure AD-centric organizations, Windows container workloads (AKS has the best Windows node support).
Google GKE
GKE Autopilot is the most opinionated managed Kubernetes — Google manages the nodes entirely, and you pay only for pod resource requests. GKE Standard charges $0.10/hour for the control plane plus node costs. Autopilot also enforces security best practices, such as disallowing privileged containers and host networking.
Best for: Teams wanting the least operational overhead, Kubernetes-purist architectures, organizations prioritizing developer productivity over infrastructure control.
Production Best Practices
- Namespaces for isolation — Separate environments (dev, staging, production) and teams into namespaces. Apply ResourceQuotas to prevent any namespace from consuming disproportionate cluster resources.
- Pod Disruption Budgets — Define the minimum number of pods that must remain available during voluntary disruptions (node drains, cluster upgrades).
- Network Policies — By default, all pods can communicate with all other pods. Network Policies restrict traffic flow — for example, allowing only the API pods to reach the database pods.
- Image scanning — Scan container images for vulnerabilities before deployment. Tools: Trivy, Snyk Container, AWS ECR scanning, GCP Artifact Analysis.
- GitOps deployment — Use ArgoCD or Flux to sync cluster state from a Git repository. Every change goes through a pull request, providing audit trails and easy rollbacks.
- Observability — Deploy Prometheus for metrics, Grafana for dashboards, and a log aggregation stack (Loki, EFK, or cloud-native logging). Set alerts for pod restart loops, high memory pressure, and certificate expiration.
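Two of the practices above — Pod Disruption Budgets and Network Policies — can be sketched as manifests. The label selectors (app: web-app, app: postgres, app: api) are illustrative assumptions:

```yaml
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: web-app-pdb
spec:
  minAvailable: 2            # never drain below 2 ready replicas
  selector:
    matchLabels:
      app: web-app
---
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: db-allow-api-only
spec:
  podSelector:
    matchLabels:
      app: postgres          # policy applies to database pods
  policyTypes:
  - Ingress
  ingress:
  - from:
    - podSelector:
        matchLabels:
          app: api           # only API pods may connect
    ports:
    - protocol: TCP
      port: 5432
```

Note that Network Policies require a CNI plugin that enforces them (Calico, Cilium, or the cloud provider's network policy support).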
Getting Started: Your First Deployment
Install kubectl and create a local cluster with kind (Kubernetes in Docker) or minikube:
```shell
# Install kind
go install sigs.k8s.io/kind@v0.24.0

# Create a cluster
kind create cluster --name learning

# Verify connection
kubectl cluster-info
kubectl get nodes

# Deploy nginx
kubectl create deployment nginx --image=nginx:1.27-alpine --replicas=3

# Expose it
kubectl expose deployment nginx --port=80 --type=NodePort

# Check status
kubectl get pods -o wide
kubectl get services
```
Practice deploying applications, scaling them, performing rolling updates, and recovering from pod failures. The muscle memory of kubectl commands and YAML manifests comes from repetition, not reading.
Building Container Orchestration Skills
Kubernetes is a deep platform — the resources covered here are approximately 20% of what is available. StatefulSets, DaemonSets, Jobs, CronJobs, Custom Resource Definitions, Operators, and service meshes all build on these fundamentals.
Citadel Cloud Management's DevOps courses cover Kubernetes from beginner to production-grade deployments, including hands-on labs for EKS, AKS, and GKE. The DevOps Tools collection provides Helm charts, Kustomize overlays, and ArgoCD configurations ready for production use.
For cloud engineers preparing for certification exams, the Career Resources include Kubernetes-specific study guides aligned to the CKA, CKAD, and cloud provider Kubernetes certifications.
Ready to master container orchestration? Start with Citadel's free DevOps and cloud courses and build production-grade Kubernetes skills with real-world labs and enterprise architecture patterns. Explore the full toolkit catalog to accelerate your learning.
Continue Learning
Start Your Cloud Career Today
Access 17 free courses covering AWS, Azure, GCP, DevOps, AI/ML, and cloud security — built by a practicing Senior Cloud Architect with enterprise experience.
Get Free Cloud Career Resources