Kubernetes Best Practices That Are Easy to Forget
What you should do if you are running K8s
Logging and Monitoring
I can’t stress enough how important logging and monitoring are, so I will focus on the ‘how-to’.

What should we monitor?
- Every component in Kubernetes (control plane, worker nodes)
- DevOps Pipeline
- Applications
- Other cloud resources (virtual machines, networks, storage), i.e. the underlying hardware
- Administrator activity logs (for all tools)
To go further, map the end-to-end traffic (data) flow between the client and the application, and monitor every point that traffic passes through.
You also need to audit and analyze the logs regularly. Don’t just sit and wait for the alerting system to notify you; by then it is already too late. There are many log analysis tools to help with this: https://logz.io/blog/open-source-monitoring-tools-for-kubernetes/ “Logging should be for logging it.”
Avoid running a separate logging container (sidecar) for every application container unless you really need one, because each sidecar adds resource overhead. See https://platform9.com/blog/kubernetes-logging-best-practices/#sidecar
Make your application shut down gracefully
Do you always have downtime when you update or upgrade on K8s, even though it supports rolling updates? The cause may be an application that is not prepared for a graceful shutdown.
(How does graceful shutdown work? See https://pracucci.com/graceful-shutdown-of-kubernetes-pods.html)
- A SIGTERM signal is sent to the main process (PID 1) in each container, and a “grace period” countdown starts (defaults to 30 seconds; it can be changed with terminationGracePeriodSeconds in the pod spec).
- Upon receiving SIGTERM, each container should start a graceful shutdown of the running application and exit.
- If a container doesn’t terminate within the grace period, a SIGKILL signal will be sent and the container violently terminated.
So your application must stop accepting new requests, finish the work that is already queued, and close the remaining connections once the queue is drained. If the application still receives incoming requests during the grace period, consider a preStop handler: https://kubernetes.io/docs/tasks/configure-pod-container/attach-handler-lifecycle-event/#define-poststart-and-prestop-handlers
A SIGTERM signal is sent to the main process (PID 1) in each container. Your application should catch the SIGTERM, which means the pod is about to be terminated, and then run the right steps to shut itself down gracefully.
- https://www.ctl.io/developers/blog/post/gracefully-stopping-docker-containers/
- (NodeJS) http://dillonbuchanan.com/programming/gracefully-shutting-down-a-nodejs-http-server/
- (Java) https://dzone.com/articles/gracefully-shutting-down-java-in-containers
- (Python) https://technology.amis.nl/platform/docker/graceful-shutdown-of-forked-workers-in-python-and-javascript-running-in-docker-containers/
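The shutdown sequence described above can be sketched in Python roughly as follows. This is a minimal illustration and not taken from any of the linked articles: the names `shutdown_requested`, `submit`, `drain`, and `main` are hypothetical, and an in-memory queue stands in for whatever pending work your application really holds.

```python
import queue
import signal
import threading

# Set when SIGTERM arrives; the rest of the application checks this flag.
shutdown_requested = threading.Event()


def handle_sigterm(signum, frame):
    """Signal handler: only record the request, do the real work elsewhere."""
    shutdown_requested.set()


def submit(work_queue, item):
    """Accept new work unless shutdown has started. Returns True if accepted."""
    if shutdown_requested.is_set():
        return False
    work_queue.put(item)
    return True


def drain(work_queue, process):
    """Process the items that are already queued and return how many there were."""
    done = 0
    while True:
        try:
            item = work_queue.get_nowait()
        except queue.Empty:
            return done
        process(item)
        done += 1


def main():
    """Hypothetical entry point: wait for SIGTERM, drain the queue, then return."""
    signal.signal(signal.SIGTERM, handle_sigterm)
    work_queue = queue.Queue()
    for i in range(3):  # pretend these requests arrived before the shutdown
        submit(work_queue, i)
    # Block until Kubernetes (or `kill <pid>`) sends SIGTERM.
    while not shutdown_requested.wait(timeout=0.5):
        pass
    finished = drain(work_queue, process=lambda item: None)
    print(f"drained {finished} queued items, exiting")
```

Calling `main()` blocks until the process receives SIGTERM. The handler itself only sets a flag, so the actual cleanup runs in normal application code, and it has to finish before the grace period expires or the container will be killed with SIGKILL.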
Enable autoscaling
There are three types of autoscaling: Horizontal Pod Autoscaler, Cluster Autoscaler, and Vertical Pod Autoscaler. To make your system more resilient, consider enabling all three.
- Vertical Pod Autoscaler: automatically adjusts the CPU and memory requests and limits of your pods. https://docs.aws.amazon.com/eks/latest/userguide/vertical-pod-autoscaler.html
- Horizontal Pod Autoscaler: scales the number of pod replicas up or down based on observed metrics such as CPU utilization. https://docs.aws.amazon.com/eks/latest/userguide/horizontal-pod-autoscaler.html
- Cluster Autoscaler: expands and shrinks the pool of worker nodes. https://docs.aws.amazon.com/eks/latest/userguide/cluster-autoscaler.html
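To give a feel for how the Horizontal Pod Autoscaler decides on a replica count, here is a small Python sketch of the formula given in the Kubernetes documentation, desiredReplicas = ceil(currentReplicas * currentMetricValue / desiredMetricValue). It is a simplification: the function name and parameters are hypothetical, the 0.1 tolerance is assumed to match the documented default, and the real controller also handles pod readiness, missing metrics, and stabilization windows.

```python
import math


def desired_replicas(current_replicas, current_metric, target_metric,
                     min_replicas=1, max_replicas=10, tolerance=0.1):
    """Replica count suggested by the documented HPA formula (simplified)."""
    ratio = current_metric / target_metric
    # Skip scaling when the current value is close enough to the target.
    if abs(ratio - 1.0) <= tolerance:
        return current_replicas
    wanted = math.ceil(current_replicas * ratio)
    # Clamp to the bounds configured on the autoscaler.
    return max(min_replicas, min(max_replicas, wanted))
```

For example, with 4 replicas running at 200m CPU on average against a 100m target, `desired_replicas(4, 200, 100)` returns 8; at 50m it returns 2, and at 105m the deployment stays at 4 because the ratio is within the tolerance.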
Small container images
Keeping container images small is very important. This is not about the number of microservices, but about the size of each image. The benefits include:
- Less wasted space, because unnecessary libraries are removed (less storage)
- Faster builds and faster pipelines
- A narrower attack surface
Best practices from Google Cloud Architecture Center: https://cloud.google.com/architecture/best-practices-for-building-containers
Readiness and Liveness Probes
The kubelet uses liveness probes to know when to restart a container, readiness probes to know when a container is ready to start accepting traffic, and startup probes to know when a container application has started.
With these checks (especially readinessProbe and livenessProbe), K8s improves reliability: traffic is kept away from pods that are not ready, and containers that have stopped responding are restarted.
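On the application side, HTTP probes only need endpoints that answer with a success status (any code from 200 up to, but not including, 400 counts as healthy for an HTTP probe). The sketch below uses only the Python standard library; the paths `/healthz` and `/readyz`, the `ready` flag, and `start_probe_server` are assumptions for illustration, and the probe configuration in your pod spec would have to point at whatever paths and port you actually expose.

```python
import threading
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer

# Set once startup work (cache warm-up, database connection, ...) is done.
ready = threading.Event()


class ProbeHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        if self.path == "/healthz":      # liveness: the process can still answer
            status = 200
        elif self.path == "/readyz":     # readiness: safe to receive traffic?
            status = 200 if ready.is_set() else 503
        else:
            status = 404
        body = {200: b"ok\n", 503: b"not ready\n"}.get(status, b"not found\n")
        self.send_response(status)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep probe requests out of the application log


def start_probe_server(port=8080):
    """Serve the probe endpoints on a background thread and return the server."""
    server = ThreadingHTTPServer(("0.0.0.0", port), ProbeHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server
```

The application would call `start_probe_server()` early and `ready.set()` once it can serve requests. Until then `/readyz` answers 503, so a readiness probe keeps the pod out of the Service endpoints without restarting it.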
References
- https://www.weave.works/blog/kubernetes-best-practices
- https://containerjournal.com/topics/container-management/kubernetes-best-practices-in-production/
- https://www.bmc.com/blogs/kubernetes-best-practices/
- https://learnk8s.io/production-best-practices
- https://cloud.google.com/architecture/best-practices-for-building-containers
- https://cloud.google.com/blog/products/containers-kubernetes/your-guide-kubernetes-best-practices
- https://cloud.google.com/blog/products/containers-kubernetes/kubernetes-best-practices-how-and-why-to-build-small-container-images