id | title | sidebar_label |
---|---|---|
litmus-demo |
Chaos Engineering in a Microservices Environment |
Litmus Demo |
The Litmus Demo is a quick way to introduce yourself to the world of Cloud Native Chaos Engineering. It helps familiarize oneself with running litmus chaos experiments in a realistic app environment running multiple services on a Kubernetes cluster. In other words, you will be able to create a cluster, install a sample application that will be subject to chaos, build the chaos infrastructure & run the chaos experiments, all in a matter of minutes, using a simple python script.
It is designed to help you experience chaos injection almost out of the box and is expected to serve as a precursor to a more involved exploration of LitmusChaos framework for your own business application/infrastructure chaos needs.
Three cluster types are supported:
-
KinD: A 3 node KinD cluster pre-installed with the sock-shop demo application, litmus chaos CRDs, operator, and a minimal set of prebuilt chaos experiment CRs is setup.
-
GKE: A 3 node GKE cluster pre-installed with the sock-shop demo application, litmus chaos CRDs, operator, and the full generic Kubernetes chaos experiment suite is setup.
-
EKS: A 3 node EKS cluster pre-installed with the sock-shop demo application, litmus chaos CRDs, operator, and the full generic Kubernetes chaos experiment suite is setup.
-
Docker 18.09 or greater (When using containerized setup)
-
Docker, Kubectl & Python3.7+ (with the PyYaml package) are all you will need for running the KinD platform based chaos demo. If GKE/EKS is your platform choice, you may need to configure gcloud/aws (When using a non-containerized setup).
To get started with any of the above platforms we will follow the following steps.
-
Clone litmus demo repository in your system.
git clone https://github.com/litmuschaos/litmus-demo.git cd litmus-demo
We can Setup litmus demo in two different ways:
Containerized Setup | All the dependencies are installed inside a container we just need to have docker installed to run the container. It has been tested for Kind platform. |
---|---|
Non-Containerized Setup | We need to make sure that all the prerequisites are installed manually before running the demo script. |
You can setup & run the demo from a containerized environment by following the below-mentioned steps:
make build
OR
docker build -t litmuschaos/litmus-demo .
Run docker container interactive, now you can run any commands mentioned in the usage section with python3.
make exec
OR
docker run -v /var/run/docker.sock:/var/run/docker.sock --net="host" -it --entrypoint bash litmuschaos/litmus-demo
Now you can run the litmus demo script in the following ways:
- Execing into the container: The
make exec
will exec into the litmus demo container where you can use the./manage.py
script directly to run the demo script.
$ make exec
------------------
--> Login to Litmus Demo container
bash-5.0# ./manage.py -h
usage: manage.py [-h] {start,test,list,stop} ...
Spin up a Demo Environment on Kubernetes.
positional arguments:
{start,test,list,stop}
start Start a Cluster with the demo environment deployed.
test Run Litmus ChaosEngine Experiments inside litmus demo environment.
list List all available Litmus ChaosEngine Experiments available to run.
stop Shutdown the Cluster with the demo environment deployed.
optional arguments:
-h, --help show this help message and exit
bash-5.0#
- Without execing into the container: You can run the demo script from outside the container using the
runcmd
script.
$ ./runcmd -h
running -h inside container
usage: manage.py [-h] {start,test,list,stop} ...
Spin up a Demo Environment on Kubernetes.
positional arguments:
{start,test,list,stop}
start Start a Cluster with the demo environment deployed.
test Run Litmus ChaosEngine Experiments inside litmus demo environment.
list List all available Litmus ChaosEngine Experiments available to run.
stop Shutdown the Cluster with the demo environment deployed.
optional arguments:
-h, --help show this help message and exit
To get started with the non containerized setup we need to clone the litmus-demo repository and run the demo script as mentioned in the Usage section below.
git clone https://github.com/litmuschaos/litmus-demo.git
cd litmus-demo
Now we can use the manage.py
python script to setup the litmus demo environment.
$ ./manage.py -h
usage: manage.py [-h] {start,test,list,stop} ...
Spin up a Demo Environment on Kubernetes.
positional arguments:
{start,test,list,stop}
start Start a Cluster with the demo environment deployed.
test Run Litmus ChaosEngine Experiments inside litmus demo
environment.
list List all available Litmus ChaosEngine Experiments
available to run.
stop Shutdown the Cluster with the demo environment
deployed.
optional arguments:
-h, --help show this help message and exit
If you are using containerized setup. Follow one of the steps mentioned above to run the litmus demo. If you want to run the demo script without execing into the container
replace ./manage.py
with ./runcmd
from the below commands. For a non-containerized setup you can directly run the commands mentioned below.
-
Install the demo environment using one of the platforms with the start argument
For KinD Cluster
-
Install & bring-up the KinD cluster using the following command
./manage.py start --platform kind
-
Wait for all the pods to get in a ready state. You can monitor this using
watch kubectl get pods --all-namespaces
Once all pods come into Running state we can access the sock-shop application through web-ui which will help us to visualize the impact of chaos on the application and whether the application persists after chaos injections.
-
Get the port of frontend deployment
kubectl get deploy front-end -n sock-shop -o jsonpath='{.spec.template.spec.containers[?(@.name == "front-end")].ports[0].containerPort}'
-
Perform port forwarding on the port obtained above
kubectl port-forward deploy/front-end -n sock-shop 3000:<port-number> (typically, 8079)
-
Access the web-ui (catalogue) of the sock-shop app via the address
127.0.0.1:3000
on your browser
For GKE Cluster
-
Create the GKE cluster (ensure you have set up access to your gcloud project)
./manage.py start --platform GKE
-
Wait for all the pods to get in a ready state. You can monitor this using
watch kubectl get pods --all-namespaces
Once all pods come into Running state we can access the sock-shop application through web-ui which will help us to visualize the impact of chaos on the application and whether the application persists after chaos injections.
-
After a few min, identify the ingress IP to access the web-ui
kubectl get ingress basic-ingress --namespace=sock-shop
You can access the web application in a few minutes at
http://<ingress-ip>
For EKS Cluster
-
Create the EKS cluster (ensure you have setup access to your AWS project)
./manage.py start --platform EKS --name {EKS_CLUSTER_NAME}
-
Wait for all the pods to get in a ready state. You can monitor this using
watch kubectl get pods --all-namespaces
Once all pods come into Running state we can access the sock-shop application through web-ui which will help us to visualize the impact of chaos on the application and whether the application persists after chaos injections.
-
After a few min, identify the ingress IP to access the web-ui
kubectl get ingress basic-ingress --namespace=sock-shop
You can access the web application in a few minutes at
http://<ingress-ip>
-
-
To find out the supported tests for a platform, execute the following command:
./manage.py list --platform <kind|gke>
-
For running all experiments, run:
./manage.py test --platform <platform-name>
-
For running selective experiments
./manage.py test --platform <platform-name> --test <test-name>
Example: For running the pod-delete experiment.
./manage.py test --platform kind --test pod-delete
The experiment results (Pass/Fail) are derived based on the simple criteria of app availability post chaos & are summarized on the console once the execution completes. You are also encouraged to check the changes to the status of the web-ui & the respective microservices/pods as the experiment execute to get an idea of the failure injection process and subsequent recovery.
Get more details about the flags used to configure and run the chaos tests please refer to the parameter tables in the test section.
We can also generate the pdf report of the experiment result summary using --report
flag as follow:
./manage.py test --report=yes
It will generate a pdf report of the name chaos-report.pdf
in the current location containing the ChaosResult summary.
-
To shut down and destroy the cluster when you're finished, run the following commands:
For KinD cluster
./manage.py --platform kind stop
For GKE cluster
./manage.py --platform GKE stop --project {GC_PROJECT}
For EKS cluster
./manage.py --platform EKS stop --name {EKS_CLUSTER_NAME} --awsregion {EKS_REGION_NAME}