---
id: openebs-target-container-failure
title: OpenEBS Target Container Failure Experiment Details
sidebar_label: Target Container Failure
---

Experiment Metadata

| Type | Description | Tested K8s Platform |
| --- | --- | --- |
| OpenEBS | Kill the cStor target/Jiva controller container | GKE, EKS, Konvoy(AWS), Packet(Kubeadm), Minikube, OpenShift(Baremetal) |

Note: In this example, nginx is used as the stateful application; it stores static pages on a Kubernetes volume.

Prerequisites

Ensure that the Litmus Chaos Operator is running by executing `kubectl get pods` in the operator namespace (typically, `litmus`). If not, install from here
Ensure that the openebs-target-container-failure experiment resource is available in the cluster. If not, install from here
DATA_PERSISTENCE can be enabled by providing the application's info in a configmap volume so that the experiment can perform the necessary checks. Currently, LitmusChaos supports data consistency checks only for MySQL and Busybox.
- For the MySQL data persistence check, create a configmap as shown below in the application namespace (replace with actual credentials):
```
---
apiVersion: v1
kind: ConfigMap
metadata:
  name: openebs-target-container-failure
data:
  parameters.yml: | 
    dbuser: root
    dbpassword: k8sDem0
    dbname: test
```
- For the Busybox data persistence check, create a configmap as shown below in the application namespace (replace with the actual values):
```
---
apiVersion: v1
kind: ConfigMap
metadata:
  name: openebs-target-container-failure
data:
  parameters.yml: | 
    blocksize: 4k
    blockcount: 1024
    testfile: exampleFile
```
Ensure that the chaosServiceAccount used for the experiment has cluster-scope permissions, as the experiment may carry out the chaos in the openebs namespace while performing application health checks in the application's own namespace.
Ensure that adequate CPU and memory resources are available in your cluster to run the experiment.
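
The Busybox configmap shown above can also be generated from a few parameters rather than edited by hand. The following is a minimal sketch, not part of the experiment itself: the function name `render_busybox_configmap` is hypothetical, and the defaults simply mirror the sample values above.

```shell
# Hypothetical helper: prints the Busybox data-persistence configmap
# to stdout. Arguments: blocksize, blockcount, testfile (all optional;
# defaults mirror the sample manifest in this document).
render_busybox_configmap() {
  blocksize="${1:-4k}"
  blockcount="${2:-1024}"
  testfile="${3:-exampleFile}"
  cat <<EOF
---
apiVersion: v1
kind: ConfigMap
metadata:
  name: openebs-target-container-failure
data:
  parameters.yml: |
    blocksize: ${blocksize}
    blockcount: ${blockcount}
    testfile: ${testfile}
EOF
}
```

The output can be piped to the cluster, for example `render_busybox_configmap 4k 1024 exampleFile | kubectl apply -n <application-namespace> -f -`.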

Entry Criteria

Application pods are healthy before chaos injection
Application writes are successful on OpenEBS PVs

Exit Criteria

Stateful application pods are healthy post chaos injection
OpenEBS Storage target pods are healthy

If the experiment tunable DATA_PERSISTENCE is set to 'enabled':

Application data written prior to chaos is successfully retrieved/read
Database consistency is maintained as per db integrity check utils

Details

This scenario validates the behaviour of stateful applications and the OpenEBS data plane upon forced termination of the controller container.
Kills the specified container in the controller pod by sending a SIGKILL signal through the docker socket (hence the docker runtime is required when pumba is used).
Containers are killed using the kill command provided by pumba.
Pumba is run as a daemonset on all nodes in dry-run mode to begin with; the kill command is issued during experiment execution via kubectl exec.
Tests the stateful application's resilience to momentary iSCSI connection loss.

Integrations

Container kill is achieved using the pumba chaos library for the docker runtime, and using litmuslib with the crictl tool for the containerd runtime.
The desired lib image can be configured in the env variable LIB_IMAGE.

Steps to Execute the Chaos Experiment

This Chaos Experiment can be triggered by creating a ChaosEngine resource on the cluster. To understand the values to be provided in a ChaosEngine specification, refer to Getting Started
Follow the steps in the sections below to create the chaosServiceAccount, prepare the ChaosEngine, and execute the experiment.

Prepare chaosServiceAccount

Use this sample RBAC manifest to create a chaosServiceAccount in the desired (app) namespace. This example consists of the minimum necessary role permissions to execute the experiment.

Sample RBAC Manifest

---
apiVersion: v1
kind: ServiceAccount
metadata:
  name: target-container-failure-sa
  namespace: default
  labels:
    name: target-container-failure-sa
    app.kubernetes.io/part-of: litmus
---
# Source: openebs/templates/clusterrole.yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: target-container-failure-sa
  labels:
    name: target-container-failure-sa
    app.kubernetes.io/part-of: litmus
rules:
- apiGroups: ["","litmuschaos.io","batch","apps","storage.k8s.io"]
  resources: ["pods","jobs","pods/log","pods/exec","events","configmaps","secrets","persistentvolumeclaims","storageclasses","persistentvolumes","chaosengines","chaosexperiments","chaosresults"]
  verbs: ["create","list","get","patch","update","delete"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: target-container-failure-sa
  labels:
    name: target-container-failure-sa
    app.kubernetes.io/part-of: litmus
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: target-container-failure-sa
subjects:
- kind: ServiceAccount
  name: target-container-failure-sa
  namespace: default

Prepare ChaosEngine

Provide the application info in spec.appinfo
Override the experiment tunables, if desired, in experiments.spec.components.env
Provide the auxiliary applications' info (namespace and labels) in spec.auxiliaryAppInfo
Provide the configMaps and secrets in experiments.spec.components.configMaps/secrets. For more info, refer to Sample ChaosEngine
To understand the values to be provided in a ChaosEngine specification, refer to ChaosEngine Concepts

Supported Experiment Tunables

| Variables | Description | Type | Notes |
| --- | --- | --- | --- |
| APP_PVC | The PersistentVolumeClaim used by the stateful application | Mandatory | The PVC may use either an OpenEBS Jiva or cStor storage class |
| LIB_IMAGE | The chaos library image used to run the kill command | Optional | Defaults to `gaiaadm/pumba:0.6.5`. Supported: `{docker : gaiaadm/pumba:0.6.5, containerd: gprasath/crictl:ci}` |
| CONTAINER_RUNTIME | The container runtime used in the Kubernetes cluster | Optional | Defaults to `docker`. Supported: `docker`, `containerd` |
| TARGET_CONTAINER | The container to be killed in the storage controller pod | Optional | Defaults to `cstor-volume-mgmt` |
| TOTAL_CHAOS_DURATION | Amount of soak time for I/O after the container kill | Optional | Defaults to 60 seconds |
| DEPLOY_TYPE | Type of Kubernetes resource used by the stateful application | Optional | Defaults to `deployment`. Supported: `deployment`, `statefulset` |
| DATA_PERSISTENCE | Flag to perform data consistency checks on the application | Optional | Disabled by default (empty/unset). Supports only `mysql` and `busybox`. Ensure that a configmap with the app details is created |
| INSTANCE_ID | A user-defined string that holds metadata/info about the current run/instance of chaos. Ex: 04-05-2020-9-00. This string is appended as a suffix to the chaosresult CR name. | Optional | Ensure that the overall length of the chaosresult CR name is still < 64 characters |
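
Because LIB_IMAGE has to match CONTAINER_RUNTIME, it can help to derive one from the other before filling in the ChaosEngine env. The sketch below only encodes the two pairs listed in the table; the function name `lib_image_for_runtime` is hypothetical and not part of Litmus.

```shell
# Hypothetical helper: maps a CONTAINER_RUNTIME value to the matching
# LIB_IMAGE, using only the pairs listed in the tunables table.
lib_image_for_runtime() {
  case "$1" in
    docker)     printf '%s\n' 'gaiaadm/pumba:0.6.5' ;;
    containerd) printf '%s\n' 'gprasath/crictl:ci' ;;
    *)
      # Any other runtime is not listed as supported in this document.
      printf 'unsupported CONTAINER_RUNTIME: %s\n' "$1" >&2
      return 1
      ;;
  esac
}
```

For example, `lib_image_for_runtime containerd` prints the image to set as the LIB_IMAGE value when CONTAINER_RUNTIME is `containerd`.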

Sample ChaosEngine Manifest

apiVersion: litmuschaos.io/v1alpha1
kind: ChaosEngine
metadata:
  name: target-chaos
  namespace: default
spec:
  # It can be active/stop
  engineState: 'active'
  #ex. values: ns1:name=percona,ns2:run=nginx 
  auxiliaryAppInfo: ''
  appinfo:
    appns: 'default'
    applabel: 'app=nginx'
    appkind: 'deployment'
  chaosServiceAccount: target-container-failure-sa
  experiments:
    - name: openebs-target-container-failure
      spec:
        components:
          env:
            # provide the total chaos duration
            - name: TOTAL_CHAOS_DURATION
              value: '20'
              
            - name: TARGET_CONTAINER
              value: 'cstor-istgt'

            - name: APP_PVC
              value: 'demo-nginx-claim'   
              
            - name: DEPLOY_TYPE
              value: 'deployment'
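
The `auxiliaryAppInfo` field in the manifest above takes comma-separated `namespace:label` pairs, as in the example comment `ns1:name=percona,ns2:run=nginx`. As an illustration of that format only, the sketch below splits such a string and prints one `kubectl get pods` command per pair for inspecting the auxiliary applications by hand. The function name `aux_app_commands` is hypothetical.

```shell
# Hypothetical helper: splits an auxiliaryAppInfo string of the form
# "ns1:label1,ns2:label2" and prints a pod-listing command per entry.
# The body runs in a subshell so the IFS and globbing changes stay local.
aux_app_commands() (
  set -f      # disable globbing while splitting on commas
  IFS=','
  for entry in $1; do
    ns="${entry%%:*}"     # text before the first ':' is the namespace
    label="${entry#*:}"   # text after the first ':' is the label selector
    printf 'kubectl get pods -n %s -l %s\n' "$ns" "$label"
  done
)
```

An empty string, as used in the sample manifest, produces no output.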

Create the ChaosEngine Resource

Apply the ChaosEngine manifest prepared in the previous step to trigger the chaos.

kubectl apply -f chaosengine.yml
If the chaos experiment is not executed, refer to the troubleshooting section to identify the root cause and fix the issues.

Watch Chaos progress

View the restart count of the target pods by setting up a watch on the pods in the OpenEBS namespace

watch -n 1 kubectl get pods -n <openebs-namespace>

Check Chaos Experiment Result

Check whether the application is resilient to the target container kill, once the experiment (job) is completed. The ChaosResult resource naming convention is: <ChaosEngine-Name>-<ChaosExperiment-Name>.

kubectl describe chaosresult target-chaos-openebs-target-container-failure -n <application-namespace>
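
The ChaosResult name can be built from the naming convention above, and checked against the 64-character limit noted for INSTANCE_ID, before running the experiment. This is a minimal sketch under one assumption: that the INSTANCE_ID suffix is joined with a hyphen. The separator is not stated in this document, and both function names are hypothetical.

```shell
# Hypothetical helper: builds <ChaosEngine-Name>-<ChaosExperiment-Name>,
# optionally followed by an INSTANCE_ID suffix.
# ASSUMPTION: the suffix is joined with "-"; verify against your cluster.
chaosresult_name() {
  name="$1-$2"
  if [ -n "${3:-}" ]; then
    name="${name}-$3"
  fi
  printf '%s\n' "$name"
}

# Succeeds only if the resulting name is shorter than 64 characters.
check_chaosresult_name() {
  name="$(chaosresult_name "$@")"
  if [ "${#name}" -lt 64 ]; then
    printf 'ok: %s (%d chars)\n' "$name" "${#name}"
  else
    printf 'too long: %s (%d chars)\n' "$name" "${#name}" >&2
    return 1
  fi
}
```

With the sample engine, `chaosresult_name target-chaos openebs-target-container-failure` yields the name used in the `kubectl describe` command above.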

Recovery

If the verdict of the ChaosResult is Fail, and/or the OpenEBS components do not return to a healthy state after the chaos experiment, refer to the OpenEBS troubleshooting guide for more info on how to recover them.

OpenEBS Target Container Failure Demo [TODO]

A sample recording of this experiment execution is provided here.
