Kubernetes Application Scaling
Here are some notes on Kubernetes application scaling, both manual and automatic.
You can scale a Deployment so that an appropriate number of pods are available to fulfil requests. The Service that exposes them includes an integrated load balancer, which distributes traffic across the pods.
A vanilla deployment starts with 1 pod:
$ kubectl get deployments
NAME                  READY   UP-TO-DATE   AVAILABLE   AGE
kubernetes-bootcamp   1/1     1            1           47s
The underlying ReplicaSet can be viewed with:
$ kubectl get rs
NAME                            DESIRED   CURRENT   READY   AGE
kubernetes-bootcamp-fb5c67579   1         1         1       47s
The replica count can be changed with:
$ kubectl scale deployments/kubernetes-bootcamp --replicas=4
deployment.apps/kubernetes-bootcamp scaled
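The same change can also be made by patching the Deployment spec directly, sketched below (editing a manifest and running kubectl apply works equally well):

```shell
# Set the replica count by patching the Deployment's spec field.
# This is equivalent to the kubectl scale command above.
kubectl patch deployment kubernetes-bootcamp -p '{"spec":{"replicas":4}}'
```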
The Deployment now creates additional replica pods:
$ kubectl get deployments
NAME                  READY   UP-TO-DATE   AVAILABLE   AGE
kubernetes-bootcamp   4/4     4            4           21m
$ kubectl get pods -o wide
NAME                                  READY   STATUS    RESTARTS   AGE   IP           NODE       NOMINATED NODE   READINESS GATES
kubernetes-bootcamp-fb5c67579-p4pxk   1/1     Running   0          63s   172.18.0.7   minikube   <none>           <none>
kubernetes-bootcamp-fb5c67579-q42js   1/1     Running   0          63s   172.18.0.9   minikube   <none>           <none>
kubernetes-bootcamp-fb5c67579-qzxjl   1/1     Running   0          63s   172.18.0.8   minikube   <none>           <none>
kubernetes-bootcamp-fb5c67579-rjmz4   1/1     Running   0          21m   172.18.0.2   minikube   <none>           <none>
$ kubectl get rs
NAME                            DESIRED   CURRENT   READY   AGE
kubernetes-bootcamp-fb5c67579   4         4         4       21m
Likewise, we can scale down with:
$ kubectl scale deployments/kubernetes-bootcamp --replicas=2
deployment.apps/kubernetes-bootcamp scaled
Scaling events are recorded in the Deployment's event log. See:
$ kubectl describe deployments/kubernetes-bootcamp
Name:                   kubernetes-bootcamp
Namespace:              default
CreationTimestamp:      Wed, 24 Aug 2022 17:40:26 +0000
Labels:                 app=kubernetes-bootcamp
Annotations:            deployment.kubernetes.io/revision: 1
Selector:               app=kubernetes-bootcamp
Replicas:               2 desired | 2 updated | 2 total | 2 available | 0 unavailable
StrategyType:           RollingUpdate
MinReadySeconds:        0
RollingUpdateStrategy:  25% max unavailable, 25% max surge
Pod Template:
  Labels:  app=kubernetes-bootcamp
  Containers:
   kubernetes-bootcamp:
    Image:        gcr.io/google-samples/kubernetes-bootcamp:v1
    Port:         8080/TCP
    Host Port:    0/TCP
    Environment:  <none>
    Mounts:       <none>
  Volumes:        <none>
Conditions:
  Type           Status  Reason
  ----           ------  ------
  Progressing    True    NewReplicaSetAvailable
  Available      True    MinimumReplicasAvailable
OldReplicaSets:  <none>
NewReplicaSet:   kubernetes-bootcamp-fb5c67579 (2/2 replicas created)
Events:
  Type    Reason             Age    From                   Message
  ----    ------             ----   ----                   -------
  Normal  ScalingReplicaSet  29m    deployment-controller  Scaled up replica set kubernetes-bootcamp-fb5c67579 to 1
  Normal  ScalingReplicaSet  9m23s  deployment-controller  Scaled up replica set kubernetes-bootcamp-fb5c67579 to 4
  Normal  ScalingReplicaSet  48s    deployment-controller  Scaled down replica set kubernetes-bootcamp-fb5c67579 to 2
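As an aside, the RollingUpdateStrategy line above governs rolling updates rather than manual scaling: with the default 25% max surge / 25% max unavailable, Kubernetes rounds surge up and unavailability down. The rounding can be checked with shell arithmetic:

```shell
# For a 4-replica Deployment with 25% maxSurge / 25% maxUnavailable:
# maxSurge rounds up, maxUnavailable rounds down.
replicas=4
max_surge=$(( (replicas * 25 + 99) / 100 ))   # ceil(4 * 0.25) = 1
max_unavailable=$(( replicas * 25 / 100 ))    # floor(4 * 0.25) = 1
echo "during a rollout: up to $((replicas + max_surge)) pods, at least $((replicas - max_unavailable)) ready"
```

So a rollout of this Deployment may briefly run five pods, with no fewer than three ready at any time.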
We can confirm that load balancing is working by making multiple requests to the application. Below, we view the Service information, extract the node port, and make requests to the Service, each of which is handled by one of the pods.
$ kubectl describe services/kubernetes-bootcamp
Name:                     kubernetes-bootcamp
Namespace:                default
Labels:                   app=kubernetes-bootcamp
Annotations:              <none>
Selector:                 app=kubernetes-bootcamp
Type:                     NodePort
IP Families:              <none>
IP:                       10.108.183.223
IPs:                      10.108.183.223
Port:                     <unset>  8080/TCP
TargetPort:               8080/TCP
NodePort:                 <unset>  32555/TCP
Endpoints:                172.18.0.2:8080,172.18.0.7:8080,172.18.0.8:8080 + 1 more...
Session Affinity:         None
External Traffic Policy:  Cluster
Events:                   <none>
$ export NODE_PORT=$(kubectl get services/kubernetes-bootcamp -o go-template='{{(index .spec.ports 0).nodePort}}')
$ echo NODE_PORT=$NODE_PORT
NODE_PORT=32555
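An equivalent lookup can be done with jsonpath output, reading the same field of the Service spec:

```shell
# Same node-port extraction, using -o jsonpath instead of a go-template.
export NODE_PORT=$(kubectl get services/kubernetes-bootcamp \
  -o jsonpath='{.spec.ports[0].nodePort}')
```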
$ curl $(minikube ip):$NODE_PORT
Hello Kubernetes bootcamp! | Running on: kubernetes-bootcamp-fb5c67579-qzxjl | v=1
$ curl $(minikube ip):$NODE_PORT
Hello Kubernetes bootcamp! | Running on: kubernetes-bootcamp-fb5c67579-qzxjl | v=1
$ curl $(minikube ip):$NODE_PORT
Hello Kubernetes bootcamp! | Running on: kubernetes-bootcamp-fb5c67579-rjmz4 | v=1
$ curl $(minikube ip):$NODE_PORT
Hello Kubernetes bootcamp! | Running on: kubernetes-bootcamp-fb5c67579-p4pxk | v=1
$ curl $(minikube ip):$NODE_PORT
Hello Kubernetes bootcamp! | Running on: kubernetes-bootcamp-fb5c67579-qzxjl | v=1
$ curl $(minikube ip):$NODE_PORT
Hello Kubernetes bootcamp! | Running on: kubernetes-bootcamp-fb5c67579-qzxjl | v=1
$ curl $(minikube ip):$NODE_PORT
Hello Kubernetes bootcamp! | Running on: kubernetes-bootcamp-fb5c67579-qzxjl | v=1
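To see the distribution more clearly, the responses can be tallied by pod name. The helper below is a sketch (tally_pods is not part of the tutorial); the sed pattern assumes the "Running on: <pod>" response format shown above:

```shell
# Count how many responses each pod served, by extracting the pod name
# from the "Running on: <pod>" part of each response line.
tally_pods() {
  sed -n 's/.*Running on: \([^ ]*\).*/\1/p' | sort | uniq -c | sort -rn
}

# Usage against the live service (assumes NODE_PORT is set as above):
# for i in $(seq 1 20); do curl -s "$(minikube ip):$NODE_PORT"; echo; done | tally_pods
```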
Autoscaling
You can also have Kubernetes autoscale for you and, more advanced still, use a service mesh for more sophisticated traffic control.
To autoscale, the cluster needs resource metrics for your nodes and pods. Check whether the Metrics Server is installed:
$ kubectl top nodes
NAME                   CPU(cores)   CPU%   MEMORY(bytes)   MEMORY%
lima-rancher-desktop   387m         19%    1375Mi          23%
Yes! Rancher Desktop comes with the Metrics Server included (see Rancher Docs: Metrics Server).
If you see:
$ kubectl top nodes
error: Metrics API not available.
then the Metrics Server is not installed in your cluster.
You can use Helm to install Bitnami’s metrics-server chart. This might require extra configuration, depending on your deployment platform.
From Helm Charts to deploy Metrics Server in Kubernetes:
helm repo add bitnami https://charts.bitnami.com/bitnami
helm install my-release bitnami/metrics-server
Once the metrics server is running, you can create a horizontal pod autoscaler:
$ kubectl autoscale [-n namespace] deployment blog --min=1 --max=5 --cpu-percent=75
$ kubectl get hpa [-n namespace] blog
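The kubectl autoscale command above keeps average CPU utilisation across the pods near 75%, scaling between 1 and 5 replicas. The equivalent declarative object is a HorizontalPodAutoscaler; the manifest below is a sketch using the autoscaling/v2 API, assuming the same "blog" Deployment as above:

```shell
kubectl apply -f - <<'EOF'
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: blog
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: blog
  minReplicas: 1
  maxReplicas: 5
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 75
EOF
```

Note that CPU-based autoscaling only works if the target pods declare CPU resource requests, since utilisation is measured as a percentage of the request.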