Tools you need to know when developing for and operating a Kubernetes cluster
KubeCon 2019 has just ended. It was a huge event that gathered thousands of people and hundreds of sponsors. With such an enormous number of sessions on offer, it was hard to choose which ones to attend.
During KubeCon I learned about many tools, paradigms, concepts, and ideas that can be used to develop for and operate a Kubernetes cluster. This article covers the following topics:
- Cluster management
- Reusable declarative configuration
- Service mesh
- Edge proxy
- Monitoring and logging
- Tracing and debugging
- Edge computing
- Distributed computing
- Serverless computing
- Machine learning
Ready? Let’s begin!
Cluster management
Rancher allows running your Kubernetes cluster anywhere: in the cloud as well as on-prem. With Rancher, you can manage multiple clusters from a single web app. It also provides a nice UI that helps you understand your Kubernetes resources.
Reusable declarative configuration
Kubernetes provides a declarative configuration syntax that you can use to write YAML config files (manifests) describing your application configuration. These manifests are applied with the kubectl command to make sure the running state of your application matches the desired state.
Reusing existing configurations across different environments, or even across microservices, is a common problem. There are multiple tools that solve it.
Kustomize was born as a separate project to help DevOps engineers and developers override base YAML configurations. E.g. your prod environment may need a larger number of pod replicas, so you can create a special overlay patch on top of the base configuration specifying the replica count.
Kustomize provides a nice set of features such as common labels/annotations for multiple resources, a ConfigMap generator, and many others. Kustomize is now a part of Kubernetes and is invoked via the kubectl apply command. Here are some good slides on it.
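For illustration, a hypothetical prod overlay that bumps the replica count could look like this (directory layout and resource names are assumptions):

```yaml
# overlays/prod/kustomization.yaml
resources:
  - ../../base
commonLabels:
  env: prod
patchesStrategicMerge:
  - replica-count.yaml
```

```yaml
# overlays/prod/replica-count.yaml — patches the base Deployment
apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-app
spec:
  replicas: 5
```

Such an overlay would then be applied with `kubectl apply -k overlays/prod`.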
Helm is a package manager for Kubernetes. It allows creating reusable charts for applications that run multiple services. A chart uses special template syntax on top of YAML configuration files allowing you to use conditions, inclusions, and functions. Additionally, a chart can require other charts, so Helm provides a modular way to define your application components. You can maintain your own chart repository or rely on the official one. Check out this Drupal chart to figure out what the syntax is.
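As a sketch of the template syntax, a chart's Deployment template might parameterize the image and replica count via values.yaml (all names and value keys here are hypothetical):

```yaml
# templates/deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: {{ .Release.Name }}-web
spec:
  replicas: {{ .Values.replicaCount }}
  selector:
    matchLabels:
      app: {{ .Release.Name }}-web
  template:
    metadata:
      labels:
        app: {{ .Release.Name }}-web
    spec:
      containers:
        - name: web
          image: "{{ .Values.image.repository }}:{{ .Values.image.tag }}"
```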
Kubernetes declarative deployments rely on rolling out changes to production using manifests that are stored in a Git repo. In the simplest case, this process consists of pulling changes from the Git repository and running the kubectl apply command. This process can be easily automated, and it is called GitOps.
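In its most basic form, this loop could be a cron job or CI step along these lines (paths and repo layout are hypothetical):

```
# Pull the latest manifests and apply them to the cluster
git -C /opt/deploy/my-app pull --ff-only
kubectl apply -f /opt/deploy/my-app/manifests/
```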
In large applications, there is a bigger level of complexity, which may require a special GitOps CD system tailored for Kubernetes.
Argo CD is a declarative GitOps continuous delivery system for Kubernetes. The idea behind GitOps is simple: Kubernetes configuration files for different environments are stored in a Git repository. Upon an update of such a repository, a webhook triggers the CD system to roll out the configuration changes; deployment can be manual or automatic. Argo CD provides a user interface and a command-line tool to set up and manage deployments. The UI makes it easy to navigate across deployed applications, see their states, and perform manual deployments and rollbacks. Argo CD can be used as a stand-alone system or in conjunction with other CI systems to implement complex workflows such as on-demand environments. Argo CD also supports Blue/green and Canary deployment strategies.
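In Argo CD, a deployment is described declaratively with an Application resource; a minimal sketch (repo URL, path, and namespace are hypothetical) might look like:

```yaml
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: my-app
  namespace: argocd
spec:
  project: default
  source:
    repoURL: https://github.com/example/my-app-config.git
    targetRevision: HEAD
    path: overlays/prod
  destination:
    server: https://kubernetes.default.svc
    namespace: my-app
  syncPolicy:
    automated:        # sync automatically when the Git repo changes
      prune: true     # delete resources removed from Git
      selfHeal: true  # revert manual changes made in the cluster
```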
Flux and Flagger
Flux is another GitOps operator that keeps the cluster state in sync with a Git repository, while Flagger builds on top of it to automate progressive delivery strategies such as Canary releases.
Argo CD + Flux == GitOps Engine
GitOps Engine has been introduced recently. The companies behind Argo CD and Flux decided to join forces in order to build a top-notch tool for GitOps. GitOps Engine is very fresh: there is not even an alpha release yet.
There is also a useful deployment tool called Krane that provides meaningful messages about how the deployment really went.
Service mesh
Dealing with microservices in an enterprise application can be a complex task because of the growing number of microservices and the hybrid nature of an enterprise cloud. A common problem is connecting a service running in cloud A to a service running in cloud B. This is usually solved by sidecar proxy containers that seamlessly redirect all traffic back and forth.
Using service mesh you will be able to:
- Interconnect services in a hybrid cloud
- Load-balance various kinds of traffic
- Set up rules for fine-grained traffic control
- Gather logs and metrics automatically
- Secure service-to-service communication
There is also a much simpler tool called Skupper that interconnects multiple services at layer 7 of the OSI model.
Edge proxy
Another important aspect of microservice architecture is the concept of an edge proxy, which serves multiple purposes: API gateway, access control, and traffic management. You can also use it for Canary deployments. Think of the edge proxy as an advanced ingress.
Kubernetes ingress is very simple and only supports the HTTP and HTTPS protocols, which is why you may need an edge proxy. One of the edge proxies for Kubernetes is Ambassador. Ambassador is based on the Envoy proxy and can reroute traffic to Istio (a service mesh) if needed.
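As a sketch, routing requests with an /orders/ prefix to a backing service could look like this Ambassador Mapping (names are hypothetical, and the exact apiVersion depends on your Ambassador release):

```yaml
apiVersion: getambassador.io/v1
kind: Mapping
metadata:
  name: orders-mapping
spec:
  prefix: /orders/      # match requests under this path
  service: orders:8080  # forward them to the orders Service
```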
Monitoring and logging
No matter how big an application is, you want to be able to monitor it and log what happens inside it. Fortunately, there are tools that have become the de facto standard in the Kubernetes world. Unfortunately, you need to spend some time setting them up. Also, they are quite demanding in terms of memory and CPU.
While monitoring is useful to understand what is going on with your cluster resources at the node level, such as CPU and memory, logging helps you understand what is happening inside your applications.
Prometheus + Grafana
Prometheus is used for storing and visualizing metrics of your cluster. It provides many useful features such as query language, alerting functionality, and multiple integrations with 3rd party applications.
Grafana is a monitoring and analytics front-end that works with multiple data sources, including Prometheus. It provides powerful dashboards and tons of integrations. Also, Grafana is a part of Istio and Linkerd.
While Prometheus has its own dashboard, many folks use one provided by Grafana.
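As a small example of the alerting functionality, a rule file could fire when a node runs low on memory (this assumes node_exporter metrics are being scraped; the threshold is illustrative):

```yaml
groups:
  - name: node-alerts
    rules:
      - alert: HighMemoryUsage
        # fraction of memory in use, computed from node_exporter metrics
        expr: (1 - node_memory_MemAvailable_bytes / node_memory_MemTotal_bytes) > 0.9
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: "Node {{ $labels.instance }} is running out of memory"
```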
Fluentd + Fluent Bit + Elasticsearch + Kibana
Fluentd is a log collector that allows grabbing logs from the Kubernetes cluster and forwarding them to the log databases and search engines.
Fluent Bit is another log collector that is more performant than Fluentd; it runs at the node level and forwards logs to Fluentd.
Elasticsearch is a distributed log analytics engine that allows you to search and analyze a huge number of logs with blazing speed. Logs gathered by Fluent Bit and Fluentd are forwarded to Elasticsearch, which provides a query language to filter them.
Kibana is a visualization tool for Elasticsearch that allows building powerful dashboards using a web UI.
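To make the pipeline above concrete, a minimal Fluent Bit configuration might tail container logs on each node and forward them to a Fluentd aggregator (hostnames and paths are assumptions):

```
[INPUT]
    Name    tail
    Path    /var/log/containers/*.log
    Parser  docker
    Tag     kube.*

[FILTER]
    Name    kubernetes      # enrich records with pod metadata
    Match   kube.*

[OUTPUT]
    Name    forward         # ship to Fluentd, which writes to Elasticsearch
    Match   *
    Host    fluentd.logging.svc
    Port    24224
```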
Tracing and debugging
Another important aspect of developing an app composed of microservices is tracing and debugging. The more microservices an app has, the more complex debugging can be. The following tools aim to simplify this process.
OpenTelemetry provides a set of APIs that microservices can use to let you introspect what is going on behind the scenes. One of the most important features of OpenTelemetry is context propagation, which allows you to trace a chain of calls from one service to another, filtering out calls that do not fit the specific context. E.g. setting an order ID as the context allows tracing front-end service calls to the back-end and other services associated with this ID; as a result, you can see traces grouped by an order ID. Context propagation also works with logs and metrics.
Jaeger does basically the same thing as OpenTelemetry, but it also provides a user interface that can visualize traces.
Both Jaeger and OpenTelemetry are CNCF projects providing overlapping functionality. The decision has been made to freeze adding new features to Jaeger but to add them to the OpenTelemetry project. You can learn more about OpenTelemetry and Jaeger in this article.
Ephemeral containers provide an easy way to debug running containers when using the kubectl exec command is not enough. Most Docker images do not include any debugging utilities, while distroless images do not even provide a shell. This is where ephemeral containers come into play. With ephemeral containers, you can access the pod's file system as well as its running processes. Additionally, an ephemeral container can carry all the debugging tools that are required. For security reasons, ephemeral containers do not communicate with the outside world, so the ports definition is disabled for them. Also, their availability is not guaranteed. Ephemeral containers are currently an alpha feature of Kubernetes.
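For illustration, the fragment added to a pod spec (via the ephemeralcontainers subresource) might look like this; the container and image names are hypothetical:

```yaml
ephemeralContainers:
  - name: debugger
    image: busybox               # an image that actually ships debugging tools
    command: ["sh"]
    stdin: true
    tty: true
    targetContainerName: my-app  # attach alongside this existing container
```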
Telepresence is a very useful tool for debugging and developing microservices. It allows swapping a pod running in the cluster with your local Docker container, which gives you numerous possibilities such as debugging with your IDE or even making code changes on the fly. That is very useful when dealing with an application that consists of a large number of microservices, so you don't need to run a whole copy of it locally. With the help of Ambassador, Telepresence supports dynamic swapping to make sure you are not interfering with other developers. Alternatively, Telepresence can be used to run integration tests in the CI system, which makes this process quite fast because you rely on your existing stack rather than spinning it up again.
Telepresence is a relatively new system, but it has huge potential. It is not recommended to run Telepresence in production.
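A typical session, as a sketch (the deployment name is hypothetical), swaps a cluster deployment for a local shell:

```
# Replace the "orders" deployment with a proxy and open a local shell
# whose network traffic behaves as if it ran inside the cluster
telepresence --swap-deployment orders --run-shell
```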
There is one more tool, called Octant, that helps visualize your cluster structure. The good thing is that it runs in your local environment.
Docker has changed the way we develop applications. Since the code needs to be built and is supposed to run inside a container, it is no longer easy to make changes on the fly. Also, the container encapsulates your app, so it is not easy to access it from outside. The following tools help deal with such problems.
Tilt is a tool that simplifies development. It works with your local Kubernetes cluster and automatically rebuilds images when needed. If you work with languages such as Go or PHP, you can use a special live update feature that simply syncs modified files into the running container without rebuilding the image. Tilt is highly customizable, so you can specify how and which files should be synced as well as which additional commands should be executed. Additionally, Tilt provides a nice web UI showing what is going on with your containers, so you don't need to use the command line.
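A minimal Tiltfile using live update could look like this (image name, paths, and commands are assumptions):

```python
# Tiltfile
docker_build(
    'example.com/my-app',
    '.',
    live_update=[
        # copy changed source files straight into the running container
        sync('./src', '/app/src'),
        # re-run dependency installation only when the manifest changes
        run('composer install', trigger='composer.json'),
    ],
)
k8s_yaml('k8s/deployment.yaml')
k8s_resource('my-app', port_forwards=8080)
```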
Ahoy! allows projects to have their own command-line interface. It works with Docker and Kubernetes. That is useful if you want to run special commands inside containers, but those commands are too long to type. You define all the commands you need in a YAML file and call them while you are developing. For example, you may want to clear caches for your application or import a DB from a remote server.
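A hypothetical .ahoy.yml could define such shortcuts like this:

```yaml
ahoyapi: v2
commands:
  cache-clear:
    usage: Clear the application cache inside the app container
    cmd: kubectl exec deploy/my-app -- bin/console cache:clear
  db-import:
    usage: Import the latest database dump from the backup server
    cmd: ssh backup.example.com 'cat latest.sql.gz' | gunzip | kubectl exec -i deploy/db -- mysql app
```

With this in place, `ahoy cache-clear` replaces the long kubectl invocation.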
Edge computing
Edge computing is a distributed computing paradigm that brings computational and storage resources to the places where they are needed. This paradigm is becoming popular with the evolution of the Internet of Things.
At the moment there are two different approaches and two different tools correspondingly:
- KubeEdge helps to offload the cloud and run pods directly on the edge. It consists of a cloud part and an edge part that communicate with each other via MQTT. You can deploy your pods easily by running the usual kubectl commands against the cloud part.
- k3s is a lightweight Kubernetes distribution that runs directly on the edge. In this case, you need to take care of controlling the k3s instances on the edge from your command-and-control server, which can be implemented using RSocket, NATS, or other tools.
Here is an interesting presentation on how to run k3s in the car.
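Getting k3s onto a device is a one-liner via its official install script; joining an agent node additionally needs the server URL and its node token:

```
# Install a k3s server on the edge device
curl -sfL https://get.k3s.io | sh -

# Join another device as an agent
curl -sfL https://get.k3s.io | K3S_URL=https://my-server:6443 K3S_TOKEN=<node-token> sh -
```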
Distributed computing
There are also many interesting projects that change the way cloud computing works. You should meet them! All of these projects are cloud-native, meaning that they provide a cloud-like experience (AWS/Azure) in your Kubernetes cluster.
Vitess is a database clustering system for horizontal scaling of MySQL. It does many useful things for you, such as sharding, master failovers, backups, and query optimization. And of course, it runs on top of Kubernetes.
Longhorn is a distributed block storage system for Kubernetes. It provides various features such as no single point of failure, snapshots, backups, and even a GUI. It adds persistent volume support to your Kubernetes cluster.
Rook is a storage orchestrator for Kubernetes. It automates deployment, configuration, scaling, and monitoring of various storage providers such as Ceph, EdgeFS, Cassandra, Minio, and NFS.
Serverless computing
Another interesting trend is serverless computing, which aims to let you focus on development rather than operations. Knative provides an AWS Lambda-like experience for running your applications in your Kubernetes cluster. It abstracts away various Kubernetes concepts such as pods, services, and ingress, so you don't need to worry about them. Knative takes care of autoscaling and simplifies deployment strategies such as Canary. Additionally, it provides an eventing mechanism for interservice communication.
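A minimal Knative Service illustrating this abstraction (the image name is hypothetical) looks like a stripped-down Deployment and Service combined into one object:

```yaml
apiVersion: serving.knative.dev/v1
kind: Service
metadata:
  name: hello
spec:
  template:
    spec:
      containers:
        - image: example.com/hello:latest
          env:
            - name: TARGET
              value: "world"
```

Knative creates the underlying pods, scales them (down to zero when idle), and wires up the routing for you.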
Machine learning
Kubeflow is a machine learning framework that works on top of Kubernetes. It includes Jupyter notebooks and TensorFlow. Basically, it is a preconfigured stack aimed at doing machine learning at scale.
Kudos if you made it through this entire article! There is a lot of information here, and I hope it was useful to you. Kubernetes is developing at a huge pace. If you have not started using Kubernetes yet, now is a good time!