id	title	sidebar_label
openebs-pool-disk-loss	OpenEBS Pool Disk Loss Experiment Details	Pool Disk Loss

Experiment Metadata

Type	Description	Tested K8s Platform
OpenEBS	Injects disk loss against an OpenEBS storage pool to disrupt the state of the underlying infrastructure resources.	GKE, AWS (KOPS)

Prerequisites

Ensure that the Litmus Chaos Operator is running by executing kubectl get pods in the operator namespace (typically, litmus). If not, install from here
Ensure that the openebs-pool-disk-loss experiment resource is available in the cluster by executing kubectl get chaosexperiments in the specified namespace. If not, install from here
DATA_PERSISTENCE can be enabled by providing the application's info in a configmap volume so that the experiment can perform the necessary checks. Currently, LitmusChaos supports data consistency checks only for MySQL and Busybox.
For the MySQL data persistence check, create a configmap as shown below in the application namespace (replace with actual credentials):

---
apiVersion: v1
kind: ConfigMap
metadata:
  name: openebs-pool-disk-loss
data:
  parameters.yml: | 
    dbuser: root
    dbpassword: k8sDem0
    dbname: test

For the Busybox data persistence check, create a configmap as shown below in the application namespace (replace with actual values):

---
apiVersion: v1
kind: ConfigMap
metadata:
  name: openebs-pool-disk-loss
data:
  parameters.yml: | 
    blocksize: 4k
    blockcount: 1024
    testfile: exampleFile

Administrative access to the platform hosting the cluster is required, as recovery of the affected node may have to be done manually (for example, gcloud access to the project on GCP). The cloud credentials are supplied through a secret such as the following:

apiVersion: v1
kind: Secret
metadata:
  name: cloud-secret
type: Opaque
stringData:
  cloud_config.yml: |-
    # Add the cloud AWS credentials or GCP service account respectively

Ensure that the chaosServiceAccount used for the experiment has cluster-scope permissions as the experiment may involve carrying out the chaos in the openebs namespace while performing application health checks in its respective namespace.
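
The first two prerequisite checks can be scripted. The following is a minimal sketch, not part of Litmus: `preflight` is a hypothetical helper name, and the default namespaces (`litmus` for the operator, `default` for the experiment) are assumptions to adjust for your cluster.

```shell
#!/bin/sh
# Hypothetical helper: runs the two kubectl checks listed in the prerequisites.
# $1 = operator namespace (assumed default: litmus)
# $2 = namespace holding the ChaosExperiment CR (assumed default: default)
preflight() {
  operator_ns="${1:-litmus}"
  experiment_ns="${2:-default}"

  # The Litmus Chaos Operator pod should be listed here in Running state.
  kubectl get pods -n "$operator_ns" || return 1

  # Fails with a NotFound error if the experiment CR is not installed.
  kubectl get chaosexperiments openebs-pool-disk-loss -n "$experiment_ns" || return 1
}
```

Calling `preflight litmus default` prints the output of both commands and returns a non-zero status if either of them fails. It does not check that the operator pod is actually Running, so the pod listing still needs a glance.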

Entry Criteria

Application pods are healthy before chaos injection
Application writes are successful on OpenEBS PVs
The pool disk is healthy before chaos injection

Exit Criteria

Application pods are healthy post chaos injection
OpenEBS Storage pool pods are healthy
The disk is healthy after chaos injection

If the experiment tunable DATA_PERSISTENCE is set to 'mysql' or 'busybox':

Application data written prior to chaos is successfully retrieved/read
Database consistency is maintained as per db integrity check utils

Details

This scenario validates the behaviour of stateful applications and OpenEBS disk pool upon disk loss.
Injects disk loss on the specified OpenEBS disk pool and node pool
Can test the stateful application's resilience to disk loss

Integrations

Disk loss is achieved using the litmus chaos library

Steps to Execute the Chaos Experiment

This Chaos Experiment can be triggered by creating a ChaosEngine resource on the cluster. To understand the values to provide in a ChaosEngine specification, refer to Getting Started
Follow the steps in the sections below to create the chaosServiceAccount, prepare the ChaosEngine & execute the experiment.

Prepare chaosServiceAccount

Use this sample RBAC manifest to create a chaosServiceAccount in the desired (app) namespace. This example consists of the minimum necessary cluster role permissions to execute the experiment.

Sample Rbac Manifest

---
apiVersion: v1
kind: ServiceAccount
metadata:
  name: pool-disk-loss-sa
  namespace: default
  labels:
    name: pool-disk-loss-sa
    app.kubernetes.io/part-of: litmus
---
# Source: openebs/templates/clusterrole.yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: pool-disk-loss-sa
  labels:
    name: pool-disk-loss-sa
    app.kubernetes.io/part-of: litmus
rules:
- apiGroups: ["","apps","litmuschaos.io","batch","extensions","storage.k8s.io","openebs.io"]
  resources: ["pods", "pods/log", "jobs", "events", "pods/exec", "cstorpools", "configmaps", "secrets", "storageclasses", "persistentvolumes", "persistentvolumeclaims", "cstorvolumereplicas", "chaosexperiments", "chaosresults", "chaosengines"]
  verbs: ["create","list","get","patch","update","delete"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: pool-disk-loss-sa
  labels:
    name: pool-disk-loss-sa
    app.kubernetes.io/part-of: litmus
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: pool-disk-loss-sa
subjects:
- kind: ServiceAccount
  name: pool-disk-loss-sa
  namespace: default

Prepare ChaosEngine

Provide the application info in spec.appinfo
Provide the auxiliary applications info (ns & labels) in spec.auxiliaryAppInfo
Override the experiment tunables if desired in experiments.spec.components.env
Provide the configMaps and secrets in experiments.spec.components.configMaps/secrets. For more info, refer to Sample ChaosEngine
To understand the values to provide in a ChaosEngine specification, refer to ChaosEngine Concepts

Supported Experiment Tunables

Variables	Description	Specify In ChaosEngine	Notes
APP_PVC	The PersistentVolumeClaim used by the stateful application	Mandatory	Corresponds to the PVC using OpenEBS cStor storage class
CLOUD_PLATFORM	Cloud Platform name	Mandatory	Supported platforms: GKE, AWS
PROJECT_ID	GCP project ID, leave blank if it's AWS	Mandatory
NODE_NAME	Node name of the cluster	Mandatory
DISK_NAME	Name of the external/cloud disk attached to the node	Mandatory
DEVICE_NAME	Device name under which the disk is mounted. Applies only to AWS.	Mandatory
ZONE_NAME	Zone name for GCP; region name for AWS	Mandatory	Note: Use REGION_NAME for AWS
TOTAL_CHAOS_DURATION	Total duration for which disk loss is injected	Optional	Defaults to 60 seconds
DATA_PERSISTENCE	Flag to perform data consistency checks on the application	Optional	Default value is disabled (empty/unset). It supports only `mysql` and `busybox`. Ensure that a configmap with the app details is created
APP_CHECK	If set to true, the experiment checks the status of the application.	Optional
RAMP_TIME	Period to wait before and after injection of chaos, in seconds	Optional
OPENEBS_NAMESPACE	Namespace in which OpenEBS pods are deployed	Optional
INSTANCE_ID	A user-defined string that holds metadata/info about current run/instance of chaos. Ex: 04-05-2020-9-00. This string is appended as suffix in the chaosresult CR name.	Optional	Ensure that the overall length of the chaosresult CR is still < 64 characters
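
The constraints in the table can be checked before the values are written into a ChaosEngine. The sketch below is illustrative only: `validate_tunables` is a hypothetical helper, not a Litmus utility, and it reads the tunables from shell variables of the same name. Accepting either ZONE_NAME or REGION_NAME on AWS is an assumption based on the note in the table.

```shell
#!/bin/sh
# Hypothetical helper: sanity-checks the tunables against the table above.
# Prints one line per problem and returns non-zero if any were found.
validate_tunables() {
  rc=0

  # Tunables marked Mandatory regardless of platform.
  for name in APP_PVC CLOUD_PLATFORM NODE_NAME DISK_NAME; do
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
      echo "missing mandatory tunable: $name"
      rc=1
    fi
  done

  case "${CLOUD_PLATFORM:-}" in
    GKE)
      [ -n "${PROJECT_ID:-}" ] || { echo "PROJECT_ID is required on GKE"; rc=1; }
      [ -n "${ZONE_NAME:-}" ] || { echo "ZONE_NAME is required on GKE"; rc=1; }
      ;;
    AWS)
      [ -n "${DEVICE_NAME:-}" ] || { echo "DEVICE_NAME is required on AWS"; rc=1; }
      # Assumption: the region may be supplied as ZONE_NAME or REGION_NAME.
      [ -n "${ZONE_NAME:-}${REGION_NAME:-}" ] || { echo "region name is required on AWS"; rc=1; }
      ;;
    "") ;;  # already reported as missing above
    *) echo "unsupported CLOUD_PLATFORM: $CLOUD_PLATFORM"; rc=1 ;;
  esac

  # Empty/unset means the data consistency check is disabled.
  case "${DATA_PERSISTENCE:-}" in
    ""|mysql|busybox) ;;
    *) echo "unsupported DATA_PERSISTENCE: $DATA_PERSISTENCE"; rc=1 ;;
  esac

  return "$rc"
}
```

For example, setting `CLOUD_PLATFORM=GKE` without `PROJECT_ID` and calling `validate_tunables` reports the missing project ID and returns a non-zero status.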

Sample ChaosEngine Manifest

apiVersion: litmuschaos.io/v1alpha1
kind: ChaosEngine
metadata:
  name: pool-chaos
  namespace: default
spec:
  # It can be active/stop
  engineState: 'active'
  #ex. values: ns1:name=percona,ns2:run=busybox 
  auxiliaryAppInfo: ''
  appinfo:
    appns: 'default'
    applabel: 'app=nginx'
    appkind: 'deployment'
  chaosServiceAccount: pool-disk-loss-sa
  experiments:
    - name: openebs-pool-disk-loss
      spec:
        components:
          env:  
            # provide the total chaos duration
            - name: TOTAL_CHAOS_DURATION
              value: '60'    

            - name: APP_PVC
              value: 'demo-nginx-claim'

            # GKE and AWS supported
            - name: CLOUD_PLATFORM
              value: 'GKE'

            # Enter the project id for gcp only
            - name: PROJECT_ID 
              value: 'litmus-demo-123'

            # Enter the node name
            - name: NODE_NAME
              value: 'demo-node-123' 

            # Enter the disk name
            - name: DISK_NAME
              value: 'demo-disk-123'
            
            # Enter the device name
            - name: DEVICE_NAME
              value: '/dev/sdb'

            # Enter the zone name
            - name: ZONE_NAME
              value: 'us-central1-a'

Create the ChaosEngine Resource

Apply the ChaosEngine manifest prepared in the previous step to trigger the chaos.

kubectl apply -f chaosengine.yml

If the chaos experiment is not executed, refer to the troubleshooting section to identify the root cause and fix the issues.

Watch Chaos progress

Watch the behaviour of the application pod and the OpenEBS data replica/pool pods by setting up a watch on the respective namespaces

watch -n 1 kubectl get pods -n <application-namespace>
watch -n 1 kubectl get pods -n <openebs-namespace>

Check Chaos Experiment Result

Check whether the application is resilient to the pool disk loss, once the experiment (job) is completed. The ChaosResult resource naming convention is: <ChaosEngine-Name>-<ChaosExperiment-Name>.

kubectl describe chaosresult pool-chaos-openebs-pool-disk-loss -n <application-namespace>
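
The resource name can be derived from the naming convention rather than typed by hand. This is a minimal sketch: `chaosresult_name` is a hypothetical helper, and joining INSTANCE_ID with a hyphen is an assumption, since the tunables table only states that it is appended as a suffix. The length check mirrors the "< 64 characters" note for INSTANCE_ID.

```shell
#!/bin/sh
# Hypothetical helper: builds <ChaosEngine-Name>-<ChaosExperiment-Name>,
# optionally followed by an INSTANCE_ID suffix.
# $1 = ChaosEngine name, $2 = ChaosExperiment name, $3 = INSTANCE_ID (optional)
chaosresult_name() {
  name="$1-$2"

  # Assumption: the suffix is joined with a hyphen.
  if [ -n "${3:-}" ]; then
    name="$name-$3"
  fi

  # The overall name is expected to stay under 64 characters.
  if [ "${#name}" -ge 64 ]; then
    echo "chaosresult name too long (${#name} characters): $name" >&2
    return 1
  fi

  echo "$name"
}
```

With the sample manifest above, `chaosresult_name pool-chaos openebs-pool-disk-loss` prints `pool-chaos-openebs-pool-disk-loss`, which can be passed to `kubectl describe chaosresult`.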

OpenEBS Pool Disk Loss Demo [TODO]

A sample recording of this experiment execution is provided here.
