An Operating Model for Dynamic, Distributed Kubernetes Environments
Kubernetes clusters have a lot of moving parts, and so does each application running on a cluster. With frequent application and environment updates, the state of every cluster can change rapidly. When you operate at scale, with dozens of clusters and hundreds of application instances, it can be almost impossible to avoid configuration inconsistencies between clusters and configuration mistakes that result in prolonged troubleshooting, downtime, or worse.
It’s exactly these types of challenges that have led so many organizations to adopt GitOps, bringing the familiar capabilities of Git tools to infrastructure management and continuous delivery (CD). In last year’s AWS Container Security Survey, a whopping 64.5% of the respondents indicated that they were already using GitOps. That number will undoubtedly grow when this year’s survey is released.
This blog takes a look at the GitOps principles and GitOps workflows that make the approach so powerful. Although GitOps methods can be applied to other infrastructure approaches, Kubernetes is the main focus.
What Is GitOps?
With GitOps, a Git repository stores all the information for defining, creating, and updating applications and infrastructure. By leveraging software development lifecycle principles such as version control, collaboration, and compliance and applying them to infrastructure, DevOps teams can more effectively manage resources. When changes are made to a Git repository, code is pushed to (or rolled back from) the production infrastructure, thus automating deployments quickly and reliably.
GitOps is really an extension of the infrastructure-as-code (IaC) concept. Using the same approach for managing infrastructure configuration files as for software code enables your team to collaborate more effectively on infrastructure changes and vet configuration files with the same rigor you apply to code.
Why should you use GitOps? GitOps leverages Git as a single source of truth for both infrastructure and applications. Because GitOps is declarative, it provides for better standardization, enhanced security, and improved productivity.
The GitOps operating model offers teams a number of advantages:
- Familiarity: The same process used to merge code with pull or merge requests is used to drive workload deployments to Kubernetes clusters.
- Velocity: Deployments are performed in near real-time. The time to deploy to a cluster is seconds from the time the artifact repository is updated.
- Agility: By slashing the cost and operational burden of deployment, application teams can focus on moving fast with as many releases as required to support the business.
- Consistency: Aligns developers and operations teams by managing operational workflows for Kubernetes using a single, unified version control system for apps and infrastructure.
- Security: The elimination of manual processes means less opportunity for inadvertent, manual errors. GitOps guarantees that the “desired state” expressed in version control is implemented on Kubernetes clusters. Drift from the desired state (whether inadvertent or malicious) is instantly detected and optionally blocked in some implementations.
- Compliance: Reduces compliance burden by centralizing Git as the source of truth for artifact versions, changes, and audits.
- Cost savings: Eliminating unnecessary manual processes makes application teams more effective and efficient.
GitOps is based on a set of core principles that are easy to understand. These principles align well with the underlying design principles of Kubernetes, which is why GitOps and Kubernetes work so well together:
- Declarative: Like Kubernetes, GitOps is declarative. You use Git to “declare” the desired state for your application, cluster, etc., and GitOps works in the background to achieve and maintain that state.
- Versioned: Everything (app code, app configuration, cluster configuration, etc.) is versioned and controlled in Git (or another version control system), providing a single canonical source of truth, all stored in a single location.
- Automatic: Once you’ve updated and approved the desired state in Git, that state can be automatically applied with no manual effort.
- Self-healing: Software agents run in the background to ensure the desired state is maintained and alert you if anything drifts.
GitOps Workflows and Procedures
So now that you know what GitOps is and the underlying GitOps principles, how do people actually use it? One of the strengths of GitOps is that infrastructure changes are automated alongside the builds that turn application code into binaries, which fosters better collaboration between the teams responsible for each.
For example, teams typically use GitOps to automate the heavy lifting required when deploying a new software feature. In addition to creating and checking in the code for the feature, you also update and check in the application manifest. Completion of these check-ins can trigger a deployment of the updated code and configuration files to the specified cluster(s), or you can trigger the deployment manually. If something goes wrong with the deployment, you can roll back to a previous state just as easily.
You can control how the deployment of your application on Kubernetes is performed simply by setting the strategy for the Deployment as illustrated in the following code fragment:
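As a minimal sketch of such a fragment, the manifest below sets the strategy on a Deployment. The Deployment name (web), the labels, the image reference, the replica count, and the maxSurge and maxUnavailable values are all illustrative assumptions, not values taken from a real environment.

```yaml
# Hypothetical Deployment; name, labels, image, and numbers are placeholders.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: web
spec:
  replicas: 4
  selector:
    matchLabels:
      app: web
  strategy:
    type: RollingUpdate      # the other built-in value is Recreate
    rollingUpdate:           # only valid when type is RollingUpdate
      maxSurge: 1            # at most 1 pod above the desired replica count
      maxUnavailable: 0      # no pods may be unavailable during the update
  template:
    metadata:
      labels:
        app: web
    spec:
      containers:
        - name: web
          image: registry.example.com/web:1.2.0
```

In a GitOps workflow, a change to this file would typically reach the cluster through a merged pull request rather than a manually run command.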
Options for strategy type include:
The Recreate strategy ensures that old and new pods do not run concurrently. This can be useful when synchronizing changes to a backend data store that does not support access from two different client versions.
The Rolling Update strategy ensures there are some pods available to continue serving traffic during the update, so there is no downtime. However, both the old and new pods run side by side while the update is taking place, meaning any data stores or clients must be able to interact with both versions.
More advanced deployment patterns such as blue-green and canary deployments can be implemented with a little additional effort.
In a blue-green deployment, there are two versions of the application, but only one version is live and accessible to users at a time. This works well for applications where a rolling update won’t work but you don’t want any downtime. This approach also enables you to roll back to the old version immediately if something goes wrong.
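One common way to get this behavior with plain Kubernetes objects, sketched here under assumptions, is to run two Deployments labeled by version and let a single Service select only one of them. The names and labels (web, version=blue, version=green) and the port numbers are hypothetical, and the two Deployments are assumed to exist and are not shown.

```yaml
# Hypothetical Service in front of two Deployments (not shown) whose pods
# carry the labels app=web plus version=blue or version=green.
apiVersion: v1
kind: Service
metadata:
  name: web
spec:
  selector:
    app: web
    version: blue      # live version; set to "green" to cut over
  ports:
    - port: 80
      targetPort: 8080
```

Because the cutover is a one-line change to the selector, rolling back amounts to reverting that commit in Git.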
A canary deployment creates new pods in parallel with existing ones in the same way as a Rolling Update does, but gives you more control over the update process. This is implemented by periodically updating the application manifest to increase the number of new pods while decreasing the number of old pods. If something goes wrong with the new version, you can detect the problems without affecting too many users.
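The manifest-driven canary described above can be sketched with two Deployments whose pods share a label that a Service (not shown) selects on. The names web-stable and web-canary, the track label, the image tags, and the 9-to-1 split are assumptions for illustration. With a Service selecting only app=web, traffic divides roughly in proportion to pod counts, so the replica numbers act as the dial.

```yaml
# Hypothetical canary step: about 10% of pods run the new version.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: web-stable
spec:
  replicas: 9                # lowered in later commits
  selector:
    matchLabels:
      app: web
      track: stable
  template:
    metadata:
      labels:
        app: web
        track: stable
    spec:
      containers:
        - name: web
          image: registry.example.com/web:1.2.0
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: web-canary
spec:
  replicas: 1                # raised in later commits
  selector:
    matchLabels:
      app: web
      track: canary
  template:
    metadata:
      labels:
        app: web
        track: canary
    spec:
      containers:
        - name: web
          image: registry.example.com/web:1.3.0
```

Each step of the rollout is then a small commit that shifts the two replica counts, which keeps the progression reviewable and easy to revert.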
While you can implement the GitOps methodology and GitOps workflows using standard Git tools, you’ll need some additional tooling to get the full benefits, especially the ability to ensure that the desired state is maintained. Popular open-source GitOps tools that work with Kubernetes include Flux and ArgoCD.
It’s worth noting that a GitOps pipeline can be pull-based or push-based. In a pull-based pipeline, a GitOps Kubernetes operator on each cluster watches for changes to the Git repository and pulls them into the cluster when they occur. In the push-based approach, repository updates trigger the build and deploy pipeline to push updates to each target cluster.
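As an illustration of the pull-based model, the sketch below uses Flux, one of the tools mentioned earlier. A GitRepository object tells the in-cluster controller which repository to poll, and a Kustomization object tells it which path to apply. The repository URL, object names, path, and intervals are hypothetical, and the API versions assume a Flux v2 release that serves these kinds at v1.

```yaml
# Hypothetical Flux objects; URL, names, path, and intervals are placeholders.
apiVersion: source.toolkit.fluxcd.io/v1
kind: GitRepository
metadata:
  name: platform-config
  namespace: flux-system
spec:
  interval: 1m               # how often the controller checks the repository
  url: https://git.example.com/acme/platform-config.git
  ref:
    branch: main
---
apiVersion: kustomize.toolkit.fluxcd.io/v1
kind: Kustomization
metadata:
  name: web
  namespace: flux-system
spec:
  interval: 5m               # how often the manifests are re-applied
  sourceRef:
    kind: GitRepository
    name: platform-config
  path: ./clusters/prod/web
  prune: true                # delete cluster objects that were removed from Git
```

Nothing outside the cluster needs to connect in: the controller reaches out to the repository on its own schedule.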
Pull-based GitOps pipelines have a number of advantages versus push-based:
- Require read-only (RO) credentials versus full credentials
- Don’t require inbound connectivity to each cluster
- Provide active detection and blocking/remediation of configuration drift
As a rule, pull-based GitOps is more secure, and active detection and remediation can also be highly beneficial.
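Drift remediation is usually something you switch on in the tool's configuration. As a hedged example, an Argo CD Application with automated sync might look like the following. The application name, repository URL, path, and namespaces are hypothetical.

```yaml
# Hypothetical Argo CD Application; names, URL, and path are placeholders.
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: web
  namespace: argocd
spec:
  project: default
  source:
    repoURL: https://git.example.com/acme/platform-config.git
    targetRevision: main
    path: clusters/prod/web
  destination:
    server: https://kubernetes.default.svc   # the cluster Argo CD runs in
    namespace: web
  syncPolicy:
    automated:
      prune: true      # remove resources that no longer exist in Git
      selfHeal: true   # revert changes made directly on the cluster
```

Leaving out the automated block gives detection without remediation: the application is reported as out of sync, and a person decides when to sync it.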