This repository has been archived by the owner on Nov 19, 2024. It is now read-only.

id: chaosengine
title: Constructing the ChaosEngine
sidebar_label: ChaosEngine

The ChaosEngine is the main user-facing chaos custom resource. It is namespace-scoped and holds the information about how the chaos experiments are executed. It connects an application instance with one or more chaos experiments and lets users specify run-level details (overriding experiment defaults, providing new environment variables and volumes, choosing whether to delete or retain experiment pods, etc.). This CR is also updated/patched with the status of the chaos experiments, making it the single source of truth with respect to the chaos.

This section describes the fields in the ChaosEngine spec and the values that can be set for each of them.

State Specification

Field .spec.engineState
Description Flag to control the state of the chaosengine
Type Mandatory
Range active, stop
Default active
Notes The engineState in the spec is a user-defined flag to trigger chaos. Setting it to active starts chaos execution; patching it with stop aborts ongoing experiments. It has a corresponding field in the chaosengine status, called engineStatus, which the controller updates based on the actual state of the ChaosEngine.
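
As a rough illustration of the abort flow described in the Notes, the Python sketch below builds a JSON merge-patch body that sets .spec.engineState. The helper name engine_state_patch is hypothetical and not part of Litmus; only the field name and its two allowed values come from the table above.

```python
import json

# Allowed values for .spec.engineState, taken from the Range row above.
ALLOWED_ENGINE_STATES = ("active", "stop")


def engine_state_patch(state: str) -> str:
    """Return a JSON merge-patch body that sets .spec.engineState.

    Hypothetical helper for illustration: it only builds the payload and
    does not talk to a cluster.
    """
    if state not in ALLOWED_ENGINE_STATES:
        raise ValueError(
            f"engineState must be one of {ALLOWED_ENGINE_STATES}, got {state!r}"
        )
    return json.dumps({"spec": {"engineState": state}})


# Payload that would request an abort of ongoing experiments.
abort_patch = engine_state_patch("stop")
print(abort_patch)
```

The resulting string could be supplied as the body of a merge patch against the ChaosEngine resource (for example via kubectl patch with --type merge); the engine name and namespace are left to the reader.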

Application Specification

Field .spec.appinfo.appns
Description Flag to specify namespace of application under test
Type Optional
Range user-defined (type: string)
Default n/a
Notes The appns in the spec specifies the namespace of the AUT. Usually provided as a quoted string. It is optional for infra-level chaos.
Field .spec.appinfo.applabel
Description Flag to specify unique label of application under test
Type Optional
Range user-defined (type: string)(pattern: "label_key=label_value")
Default n/a
Notes The applabel in the spec specifies a unique label of the AUT. Usually provided as a quoted string of pattern key=value. Note that if multiple applications share the same label within a given namespace, the AUT is filtered based on the presence of the chaos annotation litmuschaos.io/chaos: "true". If, however, annotationCheck is disabled, a random application (pod) sharing the specified label is selected for chaos. It is optional for infra-level chaos.
Field .spec.appinfo.appkind
Description Flag to specify resource kind of application under test
Type Optional
Range deployment, statefulset, daemonset, deploymentconfig, rollout
Default n/a (depends on app type)
Notes The appkind in the spec specifies the Kubernetes resource type of the application under test. The Litmus ChaosOperator supports chaos on deployments, statefulsets, daemonsets, deploymentconfigs and rollouts (see Range). For some experiments, the application health check routines depend on the resource type. It is optional for infra-level chaos.
Field .spec.auxiliaryAppInfo
Description Flag to specify one or more app namespace-label pairs whose health is also monitored as part of the chaos experiment, in addition to the primary application specified in .spec.appinfo. NOTE: If the auxiliary applications are deployed in namespaces other than that of the AUT, ensure that the chaosServiceAccount is bound to a cluster role and has adequate permissions to list pods in those namespaces.
Type Optional
Range user-defined (type: string)(pattern: "namespace:label_key=label_value")
Default n/a
Notes The auxiliaryAppInfo in the spec specifies a comma-separated list of namespace-label pairs. For pod-level chaos experiments, these are the downstream (dependent) apps of the primary app specified in .spec.appinfo. For infra-level chaos experiments, they are the apps that may be directly impacted by chaos and on which health checks are necessary.

Note: Irrespective of the nature of the chaos experiment, i.e., pod-level (single-app impact, smaller blast radius) or infra-level (multi-app impact, larger blast radius), .spec.appinfo is a must-fill: the experiment must point to at least one primary app whose health is measured as an indicator of the resiliency / success of the chaos experiment.
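
The two string patterns in this section ("label_key=label_value" for applabel and a comma-separated list of "namespace:label_key=label_value" for auxiliaryAppInfo) can be checked before a ChaosEngine is applied. The following is a minimal Python sketch under that reading of the patterns; the function names and the sample values are hypothetical and do not come from Litmus.

```python
from typing import List, Tuple


def parse_applabel(applabel: str) -> Tuple[str, str]:
    """Split an applabel string of the form "label_key=label_value"."""
    key, sep, value = applabel.partition("=")
    if not sep or not key or not value:
        raise ValueError(f"expected 'label_key=label_value', got {applabel!r}")
    return key, value


def parse_auxiliary_app_info(aux: str) -> List[Tuple[str, str, str]]:
    """Split a comma-separated list of "namespace:label_key=label_value" pairs.

    Returns (namespace, label_key, label_value) tuples in input order.
    """
    result = []
    for item in aux.split(","):
        item = item.strip()
        if not item:
            continue  # tolerate stray commas or surrounding whitespace
        namespace, sep, label = item.partition(":")
        if not sep or not namespace:
            raise ValueError(
                f"expected 'namespace:label_key=label_value', got {item!r}"
            )
        key, value = parse_applabel(label)
        result.append((namespace, key, value))
    return result


# Sample values are made up for illustration.
print(parse_applabel("app=nginx"))
print(parse_auxiliary_app_info("shop:app=cart, shop:app=catalog"))
```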

RBAC Specification

Field .spec.chaosServiceAccount
Description Flag to specify serviceaccount used for chaos experiment
Type Mandatory
Range user-defined (type: string)
Default n/a
Notes The chaosServiceAccount in the spec specifies the name of the serviceaccount mapped to a role/clusterRole with enough permissions to execute the desired chaos experiment. The minimum permissions needed for any given experiment are provided in the .spec.definition.permissions field of the respective chaosexperiment CR.

Runtime Specification

Field .spec.annotationCheck
Description Flag to control annotationChecks on applications as prerequisites for chaos
Type Optional
Range true, false
Default true
Notes The annotationCheck in the spec controls whether or not the operator checks for the annotation "litmuschaos.io/chaos" to be set against the application under test (AUT). Setting it to true ensures the check is performed, with chaos being skipped if the app is not annotated, while setting it to false suppresses this check and proceeds with chaos injection.
Field .spec.terminationGracePeriodSeconds
Description Flag to control terminationGracePeriodSeconds for the chaos pods (abort case)
Type Optional
Range integer value
Default 30
Notes The terminationGracePeriodSeconds in the spec controls the terminationGracePeriodSeconds for the chaos resources in the abort case. Chaos pods contain steps that revert the chaos on abort, and they continuously watch for termination signals. Set terminationGracePeriodSeconds so that the chaos pods have enough time to complete the revert before they are terminated.
Field .spec.jobCleanUpPolicy
Description Flag to control cleanup of chaos experiment job post execution of chaos
Type Optional
Range delete, retain
Default delete
Notes jobCleanUpPolicy controls whether or not the experiment pods are removed once execution completes. Set to retain for debug purposes (in the absence of standard logging mechanisms).
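
To make the defaults and ranges of the three runtime fields concrete, here is a small Python sketch that fills in missing values and rejects out-of-range ones. The helper with_runtime_defaults is hypothetical. The sketch assumes annotationCheck is carried as the strings "true"/"false" (this section does not say whether the CR expects a string or a boolean) and that the grace period must be non-negative.

```python
from typing import Any, Dict

# Defaults copied from the Default rows of the Runtime Specification.
RUNTIME_DEFAULTS: Dict[str, Any] = {
    "annotationCheck": "true",  # assumption: string form of true/false
    "terminationGracePeriodSeconds": 30,
    "jobCleanUpPolicy": "delete",
}


def with_runtime_defaults(spec: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of `spec` with runtime defaults applied and ranges checked."""
    merged = {**RUNTIME_DEFAULTS, **spec}
    if merged["annotationCheck"] not in ("true", "false"):
        raise ValueError("annotationCheck must be 'true' or 'false'")
    if merged["jobCleanUpPolicy"] not in ("delete", "retain"):
        raise ValueError("jobCleanUpPolicy must be 'delete' or 'retain'")
    grace = merged["terminationGracePeriodSeconds"]
    # bool is a subclass of int in Python, so exclude it explicitly.
    if isinstance(grace, bool) or not isinstance(grace, int) or grace < 0:
        raise ValueError("terminationGracePeriodSeconds must be a non-negative integer")
    return merged


# Keep experiment pods around for debugging; other fields fall back to defaults.
debug_spec = with_runtime_defaults({"jobCleanUpPolicy": "retain"})
print(debug_spec)
```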

Component Specification

Field .spec.components.runner.image
Description Flag to specify image of ChaosRunner pod
Type Optional
Range user-defined (type: string)
Default n/a (refer to Notes)
Notes The .components.runner.image allows developers to specify their own debug runner images. Defaults for the runner image can be enforced via the operator env CHAOS_RUNNER_IMAGE.
Field .spec.components.runner.imagePullPolicy
Description Flag to specify imagePullPolicy for the ChaosRunner
Type Optional
Range Always, IfNotPresent
Default IfNotPresent
Notes The .components.runner.imagePullPolicy allows developers to specify the pull policy for chaos-runner. Set to Always during debug/test.
Field .spec.components.runner.imagePullSecrets
Description Flag to specify imagePullSecrets for the ChaosRunner
Type Optional
Range user-defined (type: []corev1.LocalObjectReference)
Default n/a
Notes The .components.runner.imagePullSecrets allows developers to specify the imagePullSecret name for ChaosRunner.
Field .spec.components.runner.runnerAnnotations
Description Annotations to be set on the runner pod that will be created
Type Optional
Range user-defined (type: map[string]string)
Default n/a
Notes The .components.runner.runnerAnnotations allows developers to specify custom annotations for the runner pod.
Field .spec.components.runner.args
Description Specify the args for the ChaosRunner Pod
Type Optional
Range user-defined (type: []string)
Default n/a
Notes The .components.runner.args allows developers to specify their own debug runner args.
Field .spec.components.runner.command
Description Specify the commands for the ChaosRunner Pod
Type Optional
Range user-defined (type: []string)
Default n/a
Notes The .components.runner.command allows developers to specify their own debug runner commands.
Field .spec.components.runner.configMaps
Description Configmaps passed to the chaos runner pod
Type Optional
Range user-defined (type: {name: string, mountPath: string})
Default n/a
Notes The .spec.components.runner.configMaps provides a means to insert config information into the runner pod.
Field .spec.components.runner.secrets
Description Kubernetes secrets passed to the chaos runner pod.
Type Optional
Range user-defined (type: {name: string, mountPath: string})
Default n/a
Notes The .spec.components.runner.secrets provides a means to push secrets (typically project ids, access credentials, etc.) into the chaos runner pod. These are especially useful in case of platform-level/infra-level chaos experiments.
Field .spec.components.runner.nodeSelector
Description Node selectors for the runner pod
Type Optional
Range Labels in the form of label key=value
Default n/a
Notes The .spec.components.runner.nodeSelector contains the labels of the node on which the runner pod should be scheduled. Typically used in case of infra/node-level chaos.
Field .spec.components.runner.resources
Description Specify the resource requirements for the ChaosRunner pod
Type Optional
Range user-defined (type: corev1.ResourceRequirements)
Default n/a
Notes The .spec.components.runner.resources contains the resource requirements for the ChaosRunner pod, i.e. the resource requests and limits for the pod.
Field .spec.components.runner.tolerations
Description Toleration for the runner pod
Type Optional
Range user-defined (type: []corev1.Toleration)
Default n/a
Notes The .spec.components.runner.tolerations provides tolerations for the runner pod so that it can be scheduled on the respective tainted node. Typically used in case of infra/node-level chaos.
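
As an illustration of how the runner component fields fit together, the Python sketch below assembles a .spec.components.runner fragment and checks that configMaps and secrets entries follow the {name, mountPath} shape given in the Range rows. The helper validate_mounts and every name, path and label used are placeholders. The sketch also assumes nodeSelector is expressed as a key/value map and that tolerations use the standard corev1.Toleration fields (key, operator, value, effect).

```python
from typing import Dict, List


def validate_mounts(entries: List[Dict[str, str]], field: str) -> List[Dict[str, str]]:
    """Check that every entry has the {name, mountPath} shape from the Range rows."""
    for i, entry in enumerate(entries):
        if set(entry) != {"name", "mountPath"}:
            raise ValueError(
                f"{field}[{i}] must have exactly 'name' and 'mountPath' keys"
            )
        for key in ("name", "mountPath"):
            if not isinstance(entry[key], str) or not entry[key]:
                raise ValueError(f"{field}[{i}].{key} must be a non-empty string")
    return entries


# All names, paths and labels below are placeholders.
runner = {
    "imagePullPolicy": "Always",
    "configMaps": validate_mounts(
        [{"name": "runner-config", "mountPath": "/etc/runner"}], "configMaps"
    ),
    "secrets": validate_mounts(
        [{"name": "cloud-credentials", "mountPath": "/etc/creds"}], "secrets"
    ),
    "nodeSelector": {"kubernetes.io/hostname": "node-1"},
    "tolerations": [
        {"key": "dedicated", "operator": "Equal", "value": "chaos", "effect": "NoSchedule"}
    ],
}
engine_fragment = {"spec": {"components": {"runner": runner}}}
print(engine_fragment)
```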

Experiment Specification

Field .spec.experiments[].name
Description Name of the chaos experiment CR
Type Mandatory
Range user-defined (type: string)
Default n/a
Notes The experiments[].name specifies the chaos experiment to be executed by the ChaosOperator.
Field .spec.experiments[].spec.components.env
Description Environment variables passed to the chaos experiment
Type Optional
Range user-defined (type: {name: string, value: string})
Default n/a
Notes The experiments[].spec.components.env specifies the array of tunables passed to the experiment pods. Though the field is optional from a chaosengine definition viewpoint, it is almost always necessary to provide experiment tunables via this definition. Some of the env variables override the defaults in the experiment CR, while others are mandatory additions that fill in placeholders/empty values in the experiment CR. For a list of "mandatory" & "optional" env for an experiment, refer to the respective experiment documentation.
Field .spec.experiments[].spec.components.configMaps
Description Configmaps passed to the chaos experiment
Type Optional
Range user-defined (type: {name: string, mountPath: string})
Default n/a
Notes The experiments[].spec.components.configMaps provides a means to insert config information into the experiment. The configmaps definition is validated for correctness, and the configmaps specified are checked for availability (in the cluster/namespace) before being mounted into the experiment pods.
Field .spec.experiments[].spec.components.secrets
Description Kubernetes secrets passed to the chaos experiment
Type Optional
Range user-defined (type: {name: string, mountPath: string})
Default n/a
Notes The experiments[].spec.components.secrets provides a means to push secrets (typically project ids, access credentials, etc.) into the experiment pods. These are especially useful in case of platform-level/infra-level chaos experiments. The secrets definition is validated for correctness, and the secrets specified are checked for availability (in the cluster/namespace) before being mounted into the experiment pods.
Field .spec.experiments[].spec.components.experimentImage
Description Override the image of the chaos experiment
Type Optional
Range string
Default n/a
Notes The experiments[].spec.components.experimentImage overrides the experiment image for the chaosexperiment.
Field .spec.experiments[].spec.components.experimentImagePullSecrets
Description Flag to specify imagePullSecrets for the ChaosExperiment
Type Optional
Range user-defined (type: []corev1.LocalObjectReference)
Default n/a
Notes The experiments[].spec.components.experimentImagePullSecrets allows developers to specify the imagePullSecret name for the ChaosExperiment.
Field .spec.experiments[].spec.components.nodeSelector
Description Provide the node selector for the experiment pod
Type Optional
Range Labels in the form of label key=value
Default n/a
Notes The experiments[].spec.components.nodeSelector contains the labels of the node on which the experiment pod should be scheduled. Typically used in case of infra/node-level chaos.
Field .spec.experiments[].spec.components.statusCheckTimeouts
Description Provides the timeout and retry values for the status checks. Defaults to a 180s timeout with 90 retries (2s delay per retry)
Type Optional
Range Values in the form {delay: int, timeout: int}
Default delay: 2s and timeout: 180s
Notes The experiments[].spec.components.statusCheckTimeouts overrides the status check timeouts inside the chaosexperiments. It contains the timeout and delay in seconds.
Field .spec.experiments[].spec.components.resources
Description Specify the resource requirements for the ChaosExperiment pod
Type Optional
Range user-defined (type: corev1.ResourceRequirements)
Default n/a
Notes The experiments[].spec.components.resources contains the resource requirements for the ChaosExperiment pod, i.e. the resource requests and limits for the pod.
Field .spec.experiments[].spec.components.experimentAnnotations
Description Annotations to be set on the experiment pod that will be created
Type Optional
Range user-defined (type: label key=value)
Default n/a
Notes The experiments[].spec.components.experimentAnnotations allows developers to specify custom annotations for the experiment pod.
Field .spec.experiments[].spec.components.tolerations
Description Toleration for the experiment pod
Type Optional
Range user-defined (type: []corev1.Toleration)
Default n/a
Notes The experiments[].spec.components.tolerations provides tolerations for the experiment pod so that it can be scheduled on the respective tainted node. Typically used in case of infra/node-level chaos.
Field .spec.experiments[].spec.probe
Description Declarative way to define the chaos hypothesis
Type Optional
Range user-defined
Default n/a
Notes The .probe allows developers to specify the chaos hypothesis. It supports four types: cmdProbe, k8sProbe, httpProbe, promProbe. For more details refer
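
The statusCheckTimeouts row pairs a 180s timeout with 90 retries at a 2s delay, which suggests retries = timeout / delay. The Python sketch below assumes that relationship and also shows one way to build an entry for .spec.experiments[] with env tunables in the {name, value} shape. The function names, the experiment name and the env variable are placeholders, not values taken from Litmus.

```python
from typing import Any, Dict


def status_check_retries(delay: int = 2, timeout: int = 180) -> int:
    """Number of status checks that fit in `timeout` seconds at one per `delay` seconds.

    Assumes retries = timeout // delay, as implied by the documented defaults.
    """
    if delay <= 0 or timeout < 0:
        raise ValueError("delay must be positive and timeout non-negative")
    return timeout // delay


def experiment_entry(name: str, env: Dict[str, Any], delay: int = 2, timeout: int = 180) -> Dict[str, Any]:
    """Build one .spec.experiments[] entry with env overrides and status check timeouts."""
    return {
        "name": name,
        "spec": {
            "components": {
                # env values are passed as strings, matching {name: string, value: string}
                "env": [{"name": k, "value": str(v)} for k, v in env.items()],
                "statusCheckTimeouts": {"delay": delay, "timeout": timeout},
            }
        },
    }


# Placeholder experiment name and tunable.
entry = experiment_entry("example-experiment", {"EXAMPLE_TUNABLE": 30})
print(entry)
print(status_check_retries())
```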