This is an early prototype for running an autoscaling Buildkite Agent stack on Kubernetes.
We've seen many customers running the Agent on their own Kubernetes clusters. This is an extraction of some of the patterns we've seen. The stack works today, but we'll be improving it over time as we discover the best ways to run Buildkite pipelines on Kubernetes.
You'll need to create your own overlay to add:
- Buildkite agent token
- Private repository access using either
  - Git credentials
  - SSH key
cp -R k8s/overlays/example k8s/overlays/my-stackname
Your Buildkite agent token can be found here: https://buildkite.com/organizations/~/agents
Paste that value into the space labeled "BUILDKITE_AGENT_TOKEN" in k8s/overlays/my-stackname/kustomization.yaml
You can create a set of Git credentials for testing on GitHub. You only need to select repository access. Fill in the values in k8s/overlays/my-stackname/git-credentials.
We recommend generating a dedicated SSH key for the agent and adding it as a deploy key to the repository you want to test, or using a machine user with its own SSH key. For simplicity during local testing, you can also use your own SSH key.
Paste the private key into ./k8s/overlays/my-stackname/private-ssh-key
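If you want a dedicated key for the agent, one way to create it is sketched below. This is a minimal example, not something the repository ships: the ed25519 key type, the empty passphrase, the "buildkite-agent" comment, and the temporary directory are all assumptions, and the overlay path should match the name you chose earlier.

```shell
# Assumed overlay path; change it if you named your overlay differently.
overlay_dir="k8s/overlays/my-stackname"
mkdir -p "$overlay_dir"  # no-op if you already copied the example overlay

# Generate the key pair in a fresh temporary directory so that
# ssh-keygen never prompts to overwrite an existing file.
key_dir=$(mktemp -d)
ssh-keygen -q -t ed25519 -N "" -C "buildkite-agent" -f "$key_dir/agent_key"

# Only the private half goes into the overlay.
cp "$key_dir/agent_key" "$overlay_dir/private-ssh-key"
chmod 600 "$overlay_dir/private-ssh-key"

# Print the public half so it can be added as a deploy key on the repository.
cat "$key_dir/agent_key.pub"
```

Once the public key has been registered as a deploy key, the temporary directory can be deleted.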
You can view the generated manifests before applying them to the cluster with:
kustomize build k8s/overlays/my-stackname
You can pipe this output directly into kubectl to apply it:
kustomize build k8s/overlays/my-stackname | kubectl apply -f -
The example scales Buildkite agent pods using a horizontal pod autoscaler and Buildkite metrics from the default job queue. Whenever scheduled jobs are waiting for execution, the number of agent pods scales up every 30 seconds by either doubling or adding 7 agents, whichever is greater. Whenever there are idle agents, the stack scales down by 1 agent every 20 seconds, although there may appear to be a delay if that agent is currently running a job. These rules can be seen and modified in the supplied manifests.
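As a rough illustration of the scaling rules described above, the sketch below works through the arithmetic. The function names `next_agent_count` and `min_scale_down_seconds` are hypothetical and not part of this repository; the real behaviour is defined by the autoscaler rules in the supplied manifests, which may also enforce minimum and maximum counts that this sketch ignores.

```shell
# Hypothetical helper: models one 30-second scale-up step,
# "double, or add 7, whichever is greater".
next_agent_count() {
  doubled=$(( $1 * 2 ))
  added=$(( $1 + 7 ))
  if [ "$doubled" -gt "$added" ]; then
    echo "$doubled"
  else
    echo "$added"
  fi
}

# Hypothetical helper: lower bound, in seconds, for removing the given
# number of idle agents at 1 agent every 20 seconds. Agents that are
# still running a job take longer.
min_scale_down_seconds() {
  echo $(( $1 * 20 ))
}

# Walk through three scale-up steps starting from a single agent.
count=1
for step in 1 2 3; do
  count=$(next_agent_count "$count")
  echo "after step $step: $count agents"
done
```

Starting from 1 agent, this prints 8, 16, and 32 agents for the three steps. Below 7 agents the "add 7" rule wins, above 7 agents doubling wins, and at exactly 7 both give 14.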
The example uses the Buildkite k8s job plugin to allow running Buildkite pipeline jobs as a Kubernetes Job using a Pod spec. Some example pipelines are included in this repository, as well as the source Dockerfiles for the associated containers. If you need to network between containers in a pod step, you can use kubedns to talk between containers.
There are a couple of really simple test services -- win penguin and fail whale. We also pushed the images for win penguin and fail whale up to Docker Hub for testing.
The easiest way to get started is to run a kind (kubernetes-in-docker) cluster on your local machine. We have a few scripts that make cluster provisioning and bootstrap easier.
You'll need to have Docker Community Edition installed to run the local tests with kind (kubernetes-in-docker). We recommend installing it directly from https://www.docker.com/get-started/
The rest of the local development environment dependencies are managed with Homebrew and can be installed with the bootstrap script:
./bin/bootstrap
To get started, run the command below. This will set up a single-node Kubernetes cluster in Docker, add the Kubernetes metrics server, and then add the manifests for the Buildkite Kubernetes stack.
./bin/up k8s/overlays/my-stackname
When you are finished, tear it down with the command below. This deletes all the resources from the cluster.
./bin/down k8s/overlays/my-stackname
- Build steps running in a pod spec are currently limited to checking the status code of the first container in the container list.
- Agent scaling is predefined at the moment to show the functionality, but will be customizable in the future.
- The pod steps are not expected to run for more than 60 seconds within the pod. If they need longer to run, adjust the timeout in the plugin configuration.