id | title | sidebar_label |
---|---|---|
byoc-pod-delete | BYOC Pod Delete Experiment Details | Service Pod - Application |
Type | Description | Tested K8s Platform |
---|---|---|
ChaosToolKit | BYOC pod delete experiment | Kubeadm, Minikube |
- Ensure that the Litmus ChaosOperator is running by executing `kubectl get pods` in the operator namespace (typically `litmus`). If not, install from here.
- Ensure that the `k8-pod-delete` experiment resource is available in the cluster by executing `kubectl get chaosexperiments` in the desired namespace. If not, install from here.
- Ensure that a default nginx application is set up in the default namespace (if you are using a specific namespace, run the steps below against that namespace). A sketch of these verification commands follows this list.
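A minimal sketch of these prerequisite checks, assuming the operator runs in the `litmus` namespace and the sample application is a plain nginx deployment in `default` (the deployment name, replica count, and exposed port are illustrative):

```bash
# Verify the Litmus ChaosOperator is up (assumes the default 'litmus' namespace)
kubectl get pods -n litmus

# Confirm the k8-pod-delete ChaosExperiment CR is installed in the target namespace
kubectl get chaosexperiments -n default

# Deploy and expose a sample nginx application in the default namespace (illustrative)
kubectl create deployment nginx --image=nginx -n default
kubectl scale deployment nginx --replicas=3 -n default
kubectl expose deployment nginx --port=80 -n default
```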
- Application replicas are healthy before chaos injection
- Service resolution works successfully before chaos injection, as determined by deploying a sample nginx application and a custom liveness app querying the nginx application's health endpoint
- Application replicas are healthy after chaos injection
- Service resolution works successfully after chaos injection, as determined by deploying a sample nginx application and a custom liveness app querying the nginx application's health endpoint
- Causes graceful pod failure of application replicas using ChaosToolKit, based on the provided namespace and label, while performing health checks against the application endpoint (a rough sketch of such a check follows this list)
- Tests deployment sanity with a steady-state hypothesis executed before and after the pod failures
- Service resolution will fail if application replicas are not present.
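As an illustration only (not the ChaosToolKit implementation itself), the steady-state check amounts to an HTTP probe of the application endpoint; a rough bash equivalent, with an assumed endpoint URL and expected status code, might look like this:

```bash
# Illustrative steady-state probe (assumed endpoint and status code, not the
# actual ChaosToolKit code): the application must answer HTTP 200 before and
# after the pod-delete chaos for the hypothesis to hold.
APP_ENDPOINT="http://localhost/"
STATUS=$(curl -s -o /dev/null -w '%{http_code}' "$APP_ENDPOINT")
if [ "$STATUS" = "200" ]; then
  echo "steady state holds: application reachable at $APP_ENDPOINT"
else
  echo "steady state violated: got HTTP $STATUS from $APP_ENDPOINT" >&2
  exit 1
fi
```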
Type | Experiment | Details | JSON file |
---|---|---|---|
ChaosToolKit | ChaosToolKit single, random pod delete experiment with count | Executing via label name app=<> | pod-app-kill-count.json |
ChaosToolKit | ChaosToolKit single, random pod delete experiment | Executing via label name app=<> | pod-app-kill-health.json |
ChaosToolKit | ChaosToolKit single, random pod delete experiment with count | Executing via Custom label name =<> | pod-app-kill-count.json |
ChaosToolKit | ChaosToolKit single, random pod delete experiment | Executing via Custom label name =<> | pod-app-kill-health.json |
ChaosToolKit | ChaosToolKit All pod delete experiment with health validation | Executing via Custom label name app=<> | pod-app-kill-all.json |
ChaosToolKit | ChaosToolKit All pod delete experiment with health validation | Executing via Custom label name =<> | pod-custom-kill-all.json |
- Pod failures can be injected using one of these chaos libraries: `litmus`
- This ChaosExperiment can be triggered by creating a ChaosEngine resource on the cluster. To understand the values to provide in a ChaosEngine specification, refer to Getting Started.
- Follow the steps in the sections below to create the chaosServiceAccount, prepare the ChaosEngine, and execute the experiment.
- Use this sample RBAC manifest to create a chaosServiceAccount in the desired (app) namespace. This example consists of the minimum necessary role permissions to execute the experiment. A short apply-and-verify sketch follows the manifest.
---
apiVersion: v1
kind: ServiceAccount
metadata:
  name: k8-pod-delete-sa
  namespace: default
  labels:
    name: k8-pod-delete-sa
    app.kubernetes.io/part-of: litmus
---
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: k8-pod-delete-sa
  namespace: default
  labels:
    name: k8-pod-delete-sa
    app.kubernetes.io/part-of: litmus
rules:
- apiGroups: ["","apps","batch"]
  resources: ["jobs","deployments","daemonsets"]
  verbs: ["create","list","get","patch","delete"]
- apiGroups: ["","litmuschaos.io"]
  resources: ["pods","configmaps","events","services","chaosengines","chaosexperiments","chaosresults","deployments","jobs"]
  verbs: ["get","create","update","patch","delete","list"]
- apiGroups: [""]
  resources: ["nodes"]
  verbs: ["get","list"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: k8-pod-delete-sa
  namespace: default
  labels:
    name: k8-pod-delete-sa
    app.kubernetes.io/part-of: litmus
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: k8-pod-delete-sa
subjects:
- kind: ServiceAccount
  name: k8-pod-delete-sa
  namespace: default
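Assuming the manifest above is saved as `rbac.yaml` (the filename is arbitrary), it can be applied and verified as follows:

```bash
# Apply the RBAC manifest and confirm the resources exist in the app namespace
kubectl apply -f rbac.yaml
kubectl get serviceaccount k8-pod-delete-sa -n default
kubectl get role k8-pod-delete-sa -n default
kubectl get rolebinding k8-pod-delete-sa -n default
```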
Note: In case of restricted systems/setups, create a PodSecurityPolicy (psp) with the required permissions and have the chaosServiceAccount subscribe to it to work around the respective limitations. An example of a standard psp that can be used for Litmus chaos experiments can be found here.
- Provide the application info in `spec.appinfo`
- Override the experiment tunables if desired in `experiments.spec.components.env`
- To understand the values to provide in a ChaosEngine specification, refer to ChaosEngine Concepts
Variables | Description | Specify In ChaosEngine | Notes |
---|---|---|---|
NAME_SPACE | Chaos namespace in which all the infra chaos resources are created | Mandatory | Defaults to `default` |
LABEL_NAME | Label name of the target application | Mandatory | Defaults to `nginx` |
APP_ENDPOINT | Endpoint that ChaosToolKit calls to ensure the application is healthy | Mandatory | Defaults to `localhost` |
FILE | Type of pod-delete chaos to execute (in terms of the steady-state checks performed), represented by the ChaosToolKit json file | Mandatory | Defaults to `pod-app-kill-health.json` |
REPORT | Whether to generate an execution report in json format | Optional | Defaults to `true` |
REPORT_ENDPOINT | Reporting endpoint that accepts the json report | Optional | Defaults to a Kafka topic set up for chaos, but any reporting database can be supported |
TEST_NAMESPACE | Namespace from which the chaos experiment is executed | Optional | Defaults to `default` |
apiVersion: litmuschaos.io/v1alpha1
kind: ChaosEngine
metadata:
  name: nginx-chaos-app-health
  namespace: default
spec:
  appinfo:
    appns: 'default'
    applabel: 'app=nginx'
    appkind: 'deployment'
  engineState: 'active'
  chaosServiceAccount: k8-pod-delete-sa
  experiments:
    - name: k8-pod-delete
      spec:
        components:
          env:
            # set chaos namespace
            - name: NAME_SPACE
              value: 'default'
            # set chaos label name
            - name: LABEL_NAME
              value: 'nginx'
            # pod endpoint
            - name: APP_ENDPOINT
              value: 'localhost'
            - name: FILE
              value: 'pod-app-kill-health.json'
            - name: REPORT
              value: 'true'
            - name: REPORT_ENDPOINT
              value: 'none'
            - name: TEST_NAMESPACE
              value: 'default'
- Apply the ChaosEngine manifest prepared in the previous step to trigger the chaos.
kubectl apply -f chaosengine.yml
- View application pod termination & recovery by setting up a watch on the pods in the application namespace.
watch kubectl get pods
- Check whether the application is resilient to the ChaosToolKit pod failure once the experiment (job) has completed. The ChaosResult resource name is derived as `<ChaosEngine-Name>-<ChaosExperiment-Name>`.
kubectl describe chaosresult k8-pod-delete -n <chaos-namespace>
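To inspect just the verdict rather than the full description, a jsonpath query along these lines can be used (the status field path is assumed from the litmuschaos.io/v1alpha1 schema and may vary with your Litmus version):

```bash
# Print only the experiment verdict from the ChaosResult
# (field path assumed for the v1alpha1 schema; adjust if your Litmus version differs)
kubectl get chaosresult k8-pod-delete -n <chaos-namespace> \
  -o jsonpath='{.status.experimentstatus.verdict}'
```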
- Check the logs and result of the experiment.
kubectl logs -f k8-pod-delete-<> -n <chaos-namespace>
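Since the experiment pod name carries a generated suffix (the `<>` above), it can be resolved before tailing the logs; a small sketch, assuming the pod name is prefixed with `k8-pod-delete-` and using a shell variable for the chaos namespace:

```bash
# Resolve the generated experiment pod name and tail its logs
# (assumes the pod name starts with 'k8-pod-delete-')
CHAOS_NS=default   # replace with your chaos namespace
POD=$(kubectl get pods -n "$CHAOS_NS" --no-headers | awk '/^k8-pod-delete-/{print $1; exit}')
kubectl logs -f "$POD" -n "$CHAOS_NS"
```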