In the last blogpost I explained how to manually scale a TIBCO BusinessWorks Container Edition (BWCE) application Docker image on a local Kubernetes cluster (Minikube) to keep up with user demand. In this blogpost I will explain how to use the autoscaling functionality in Kubernetes to scale a BWCE application Docker image automatically.
Scaling a deployment means changing its number of replicas, so that Kubernetes creates new pods or removes existing ones. Kubernetes supports autoscaling through horizontal pod autoscaling.
“With Horizontal Pod Autoscaling, Kubernetes automatically scales the number of pods in a replication controller, deployment or replica set based on observed CPU utilization (or, with alpha support, on some other, application-provided metrics).” (source: https://kubernetes.io/docs/tasks/run-application/horizontal-pod-autoscale/)
How does this work?
The horizontal pod autoscaler (from now on hpa) consists of two parts: a Kubernetes API resource and a controller. The controller periodically queries the resource utilization and compares the returned metric against the target specified in the horizontal pod autoscaler object. We can use either default or custom metrics. However, to keep things easy we will be using the default supported CPU autoscaling.
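To make the controller's comparison concrete, here is a minimal sketch of the scaling rule as described in the Kubernetes documentation: the desired replica count is the current count multiplied by the ratio of observed to target utilization, rounded up and clamped to the configured minimum and maximum. The function name desired_replicas and its argument order are made up for illustration, and the real controller also applies a tolerance and delays between scaling actions, which are left out here.

```shell
#!/bin/sh
# Hypothetical helper (not part of kubectl): approximates the hpa scaling rule
#   desired = ceil(current_replicas * current_cpu / target_cpu)
# clamped to [min, max]. All arguments are whole numbers; CPU values are percentages.
desired_replicas() {
  current=$1; usage=$2; target=$3; min=$4; max=$5
  # Integer ceiling division: (a + b - 1) / b
  desired=$(( (current * usage + target - 1) / target ))
  [ "$desired" -lt "$min" ] && desired=$min
  [ "$desired" -gt "$max" ] && desired=$max
  echo "$desired"
}

desired_replicas 1 60 50 1 5   # one pod at 60% against a 50% target
desired_replicas 5 60 50 1 5   # already at the maximum of 5
desired_replicas 5 0 50 1 5    # idle pods, scale back to the minimum
```

These three calls print 2, 5 and 1: the rule asks for more pods while the average is above the target, never exceeds the maximum, and falls back to the minimum when the load disappears.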
Before we move ahead we need to enable the addon Heapster. Heapster is a cluster-wide aggregator of monitoring and event data. It supports Kubernetes natively and runs within a Kubernetes pod. From this pod it discovers all nodes in the cluster and queries usage information from each node’s Kubelet (node agent).
MacBook-Pro:~ ruben.middeljans$ minikube addons enable heapster
heapster was successfully enabled
By running the following command we can verify that Heapster has been enabled successfully:
MacBook-Pro:~ ruben.middeljans$ minikube addons list
- kube-dns: enabled
- heapster: enabled
- ingress: disabled
- registry: disabled
- registry-creds: disabled
- addon-manager: enabled
- dashboard: enabled
- default-storageclass: enabled
“InfluxDB is an open source time series database (written in Go) with no external dependencies. It’s useful for recording metrics, events, and performing analytics.”
Important: The addon InfluxDB comes out-of-the-box; Heapster is set up to use InfluxDB as its storage backend by default.
“Grafana is an open source, feature rich metrics dashboard and graph editor for Graphite, Elasticsearch, OpenTSDB, Prometheus and InfluxDB.”
Important: The addon Grafana comes out-of-the-box and displays resource usage of the Kubernetes cluster and the pods inside of it.
Please note that in my case the “Heapster + InfluxDB + Grafana” stack was available out-of-the-box with Kubernetes (Minikube). In some cases you’ll have to manually install these addons.
Autoscaling on Kubernetes
1. To create an hpa object we use the “kubectl autoscale” command. The following command will create an hpa object that maintains between 1 and 5 replicas of the pods controlled by the “helloworldbwce-node” deployment we created before. Roughly, the autoscaler will increase and decrease the number of pods to keep the average CPU utilization across all pods at around 50%.
MacBook-Pro:~ ruben.middeljans$ kubectl autoscale deployment helloworldbwce-node --cpu-percent=50 --min=1 --max=5
deployment "helloworldbwce-node" autoscaled
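The same autoscaler can also be described declaratively. The sketch below writes a manifest that should be equivalent to the kubectl autoscale command above, using the autoscaling/v1 API. The file name hpa.yaml is arbitrary, and the apiVersion inside scaleTargetRef is an assumption: it has to match the API version your deployment was created with (apps/v1 on current clusters, an older beta group on older ones).

```shell
#!/bin/sh
# Write a HorizontalPodAutoscaler manifest mirroring:
#   kubectl autoscale deployment helloworldbwce-node --cpu-percent=50 --min=1 --max=5
# hpa.yaml is an arbitrary file name chosen for this example.
cat > hpa.yaml <<'EOF'
apiVersion: autoscaling/v1
kind: HorizontalPodAutoscaler
metadata:
  name: helloworldbwce-node
spec:
  scaleTargetRef:
    apiVersion: apps/v1   # assumption: adjust to your deployment's API version
    kind: Deployment
    name: helloworldbwce-node
  minReplicas: 1
  maxReplicas: 5
  targetCPUUtilizationPercentage: 50
EOF

# Apply it to the cluster (left commented out so the script has no side effects):
# kubectl apply -f hpa.yaml
```

Keeping the manifest in a file makes it possible to store the autoscaling settings next to the deployment configuration instead of relying on a one-off command.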
2.1 To verify that the hpa object has been created successfully we can run the following command:
MacBook-Pro:~ ruben.middeljans$ kubectl get hpa
NAME REFERENCE TARGETS MINPODS MAXPODS REPLICAS AGE
helloworldbwce-node Deployment/helloworldbwce-node <unknown> / 50% 1 5 1 4m
Although the hpa object has been created successfully, we notice that no current CPU utilization is available (<unknown>)! We can also see this in the Kubernetes dashboard (Deployments): in the section “Horizontal Pod Autoscalers” there is no value at all for “Current CPU Utilization”.
2.2 To verify what is going on we run the following command to view the extensive description of the hpa object and to see the last log-entries (events).
MacBook-Pro:~ ruben.middeljans$ kubectl describe hpa helloworldbwce-node
As part of the last log-entries we notice the following “warning”:
"FailedComputeMetricsReplica failed to get cpu utilization: missing request for cpu on container helloworldbwce-node in pod default/helloworldbwce-node-2301106929-hprg4"
After doing some research it appears that we need to set “cpu” as a resource request on the container. This is not done automatically when creating an hpa object, nor does Kubernetes warn you about it. The solution here is very easy, but the fact that you aren’t warned about it up front is the most frustrating part. Why would you want to enable autoscaling on a deployment but not set a CPU resource request on its pods?
To solve the problem simply add the “cpu” resource request to the configuration of the pod (yaml):
...
spec:
  containers:
  - name: helloworldbwce-node
    resources:
      requests:
        cpu: 1000m
...
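The request is required because the autoscaler expresses CPU utilization as a percentage of the requested CPU, so without a request there is nothing to divide by. The sketch below illustrates that arithmetic. The helper cpu_utilization_percent is hypothetical, the millicore values are made-up inputs, and the error text only paraphrases the warning shown earlier.

```shell
#!/bin/sh
# Hypothetical helper: CPU utilization the way the hpa looks at it, i.e. actual
# usage as a percentage of the container's CPU request (both in millicores).
cpu_utilization_percent() {
  usage_m=$1; request_m=${2:-0}
  if [ "$request_m" -le 0 ]; then
    # No request set: the percentage is undefined, which shows up as <unknown>.
    echo "missing request for cpu" >&2
    return 1
  fi
  echo $(( usage_m * 100 / request_m ))
}

cpu_utilization_percent 600 1000                  # 600m used of a 1000m request
cpu_utilization_percent 600 || echo "<unknown>"   # no request configured
```

The first call prints 60, the second falls through to <unknown>. A request of 1000m stands for one full CPU core; with a smaller request the same load would register as a higher percentage and trigger scaling sooner.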
2.3 If we now rerun the following command we get a proper result (0%). We can now also see this in the Kubernetes dashboard (Deployments) and in the Grafana dashboard (see below).
MacBook-Pro:~ ruben.middeljans$ kubectl get hpa
NAME REFERENCE TARGETS MINPODS MAXPODS REPLICAS AGE
helloworldbwce-node Deployment/helloworldbwce-node 0% / 50% 1 5 1 1h
2.4 Fire up the Grafana dashboard by running the following command:
MacBook-Pro:~ ruben.middeljans$ minikube addons open heapster
Opening kubernetes service kube-system/monitoring-grafana in default browser...
3. And last but not least, let the party begin! Let’s do some stress tests to verify that the hpa object is actually doing what it is supposed to do. For this I created a simple bash script which will basically run a lot of curl requests consecutively. I will be running multiple instances of the terminal on my MacBook Pro executing the same bash script in parallel.
for ((i=1;i<=10000;i++)); do curl -v --header "Connection: keep-alive" "192.168.99.100:30724/helloworld/test"; done
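Instead of opening several terminals, the same load can be generated from one script by starting a few background workers. This is only a sketch under assumptions: run_load and its parameters are hypothetical names, the URL is the Minikube address used above, and the worker and request counts in the usage comment are arbitrary.

```shell
#!/bin/sh
# Hypothetical helper: run_load URL WORKERS REQUESTS_PER_WORKER
# Starts WORKERS background loops that each send REQUESTS_PER_WORKER requests
# one after another, then waits for all of them to finish.
run_load() {
  url=$1; workers=$2; requests=$3
  w=1
  while [ "$w" -le "$workers" ]; do
    (
      i=1
      while [ "$i" -le "$requests" ]; do
        # -s: no progress output, -o /dev/null: discard the response body
        curl -s -o /dev/null "$url"
        i=$((i + 1))
      done
    ) &
    w=$((w + 1))
  done
  wait   # block until every background worker has exited
}

# Example (not executed here): 4 workers sending 10000 requests each
# run_load "192.168.99.100:30724/helloworld/test" 4 10000
```

Each worker behaves like one terminal running the loop above, so the number of workers takes the place of the number of terminal windows.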
4. While the bash script is running, run the following command to verify that the hpa object is working and scaling up the pods:
MacBook-Pro:~ ruben.middeljans$ kubectl get hpa
NAME REFERENCE TARGETS MINPODS MAXPODS REPLICAS AGE
helloworldbwce-node Deployment/helloworldbwce-node 60% / 50% 1 5 5 2h
We can clearly see that the hpa object is autoscaling the pods by automatically increasing the number of replicas to 5. We can also see this by looking at the log-entries (events) belonging to the deployment object “helloworldbwce-node”, which is responsible for scaling the pods. To view these events run the following (describe) command:
MacBook-Pro:~ ruben.middeljans$ kubectl describe deployment helloworldbwce-node
...
Events:
  FirstSeen  LastSeen  Count  From                   SubObjectPath  Type      Reason             Message
  ---------  --------  -----  ----                   -------------  --------  ------             -------
  26m        26m       1      deployment-controller                 Normal    ScalingReplicaSet  Scaled up replica set helloworldbwce-node-3458225718 to 2
  17m        17m       1      deployment-controller                 Normal    ScalingReplicaSet  Scaled up replica set helloworldbwce-node-3458225718 to 4
  13m        13m       1      deployment-controller                 Normal    ScalingReplicaSet  Scaled up replica set helloworldbwce-node-3458225718 to 5
  21m        8m        2      deployment-controller                 Normal    ScalingReplicaSet  Scaled down replica set helloworldbwce-node-3458225718 to 1
  6m         6m        1      deployment-controller                 Normal    ScalingReplicaSet  Scaled up replica set helloworldbwce-node-2253477073 to 1
  6m         6m        1      deployment-controller                 Normal    ScalingReplicaSet  Scaled down replica set helloworldbwce-node-3458225718 to 0
  4m         4m        1      deployment-controller                 Normal    ScalingReplicaSet  Scaled up replica set helloworldbwce-node-2253477073 to 2
As we have seen, autoscaling on Kubernetes is very easy to set up and actually looks to be working quite well! For TIBCO this is a very smart choice as well: we can now easily and automatically scale up in just a matter of seconds! We don’t need to deploy TIBCO instances multiple times anymore just in case we need some additional peak capacity. No, we just let the Kubernetes cluster handle the scaling.