Gremlin - The Magic Of Chaos Engineering
Onkar Naik
DevOps @Forescout | Google Champion Innovator | AWS | DevOps | 3X GCP | 1X Azure | 1X Terraform | Ansible | Kubernetes | SRE | Jenkins | Tech Blogger
Hello Connections
Welcome to the world of Chaos Engineering
In this article, we are going to see how to perform Chaos Engineering on a Docker container using the Gremlin Blackhole attack.
What Is Chaos Engineering?
Chaos Engineering is a disciplined approach to identifying potential failures before they become outages.
Chaos Engineering lets you compare what you think will happen to what actually happens in your systems. You literally "break things on purpose" to learn how to build more resilient systems.
What Is Gremlin?
Gremlin is a simple, safe, and secure way to improve the resilience of your systems by using Chaos Engineering to identify and fix failure modes.
Gremlin is a cloud-native platform that runs in any environment. It supports the major public cloud environments (AWS, Azure, and GCP) and runs on Linux, Windows, containerized environments like Kubernetes, and bare metal too.
Now let's begin the Chaos Engineering experiments using Gremlin.
Prerequisites
Before you begin these Chaos Engineering experiments, you'll need a Gremlin account and a host with Docker installed.
1) Installing Gremlin In a Docker Container
After you have created your Gremlin account, you will need to find your Gremlin credentials. Log in to the Gremlin web app using your company name and sign-in credentials, which were emailed to you when you signed up for Gremlin.
Navigate to Team Settings by clicking on the user icon in the top right (next to the Halt button), then clicking Team Settings. Select the Configuration tab. Here, you'll see your Team ID and Secret Key. Store both of these as environment variables by running the following commands (replacing YOUR_TEAM_ID and YOUR_SECRET_KEY respectively):
export GREMLIN_TEAM_ID=YOUR_TEAM_ID
export GREMLIN_TEAM_SECRET=YOUR_SECRET_KEY
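Both variables are passed into every Gremlin container below, so a typo here only surfaces later as an authentication failure. A minimal sketch of an early check follows; the function name `require_gremlin_env` is hypothetical and not part of the Gremlin tooling.

```shell
# Hypothetical helper: fail early if either Gremlin credential is missing.
require_gremlin_env() {
    missing=0
    for name in GREMLIN_TEAM_ID GREMLIN_TEAM_SECRET; do
        # Look up the variable whose name is stored in $name (POSIX-compatible).
        eval "value=\${$name:-}"
        if [ -z "$value" ]; then
            echo "missing environment variable: $name" >&2
            missing=1
        fi
    done
    return "$missing"
}
```

Calling `require_gremlin_env && echo "credentials are set"` before the `docker run` commands is one way to use it.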
Next, run the Gremlin Docker container. Use docker run to pull the official Gremlin Docker image and start the Gremlin agent:
docker run -d --net=host \
--cap-add=NET_ADMIN --cap-add=SYS_BOOT --cap-add=SYS_TIME \
--cap-add=KILL \
--pid=host \
-v $PWD/var/lib/gremlin:/var/lib/gremlin \
-v $PWD/var/log/gremlin:/var/log/gremlin \
-v /var/run/docker.sock:/var/run/docker.sock \
-e GREMLIN_TEAM_ID="$GREMLIN_TEAM_ID" \
-e GREMLIN_TEAM_SECRET="$GREMLIN_TEAM_SECRET" \
gremlin/gremlin daemon
Let's check whether the Gremlin container launched successfully using the following docker command:
docker ps
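Instead of reading the container ID off the `docker ps` table by hand, it can be captured in a shell variable. This is a sketch that assumes only one container from the `gremlin/gremlin` image is running; the function name `gremlin_container_id` is hypothetical.

```shell
# Hypothetical helper: print the ID of the first running container that was
# started from the gremlin/gremlin image (prints nothing if none is running).
gremlin_container_id() {
    docker ps --filter "ancestor=gremlin/gremlin" --format '{{.ID}}' | head -n 1
}
```

Running `GREM_ID=$(gremlin_container_id)` then makes `$GREM_ID` available to later commands.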
Now that everything's up and running, let's open an interactive shell on the Gremlin container and use the Gremlin CLI. Run the following command, replacing GREM_ID with the ID or name of your Gremlin container:
docker exec -it $GREM_ID /bin/sh
Once inside the container, try the basic Gremlin help commands, such as:
gremlin help attack-container
2) Creating a CPU Attack from Gremlin Container against host using Gremlin CLI
Creating an htop container for monitoring
htop is an interactive process viewer for Unix. This step isn't a requirement for installing Gremlin in Docker, but we'll use htop in this tutorial to observe the impact of our attacks. You can skip this step completely, or use another monitoring tool of your choice.
First, create the Dockerfile for your htop container:
FROM alpine:latest
RUN apk add --update htop && rm -rf /var/cache/apk/*
ENTRYPOINT ["htop"]
Build the Dockerfile and tag the image:
sudo docker build -t htop .
Now, start an htop container. Using --pid=host grants htop access to the host's process space so that htop can monitor processes running on the host:
sudo docker run -it --rm --pid=host htop
Building the Slack App Integration with the Gremlin App
We can also integrate Slack with Gremlin to receive attack reports and notifications on a Slack channel.
To set up the integration, open the Integrations page in Gremlin and add the Slack channel we created for receiving Gremlin attack notifications.
Then select the Slack channel where you want to receive Gremlin notifications and click Allow. That's all: your Slack integration with Gremlin is now ready.
We will use the Gremlin CLI attack command to create a CPU attack. This attack will consume the CPU using the default settings of 1 core for 60 seconds.
We could use our running Gremlin container to run the attack, but for this, we'll actually create a new container that will stop once the attack is finished. Run the following to create the CPU attack:
docker run -d \
--net=host \
--pid=host \
--cap-add=NET_ADMIN \
--cap-add=SYS_BOOT \
--cap-add=SYS_TIME \
--cap-add=KILL \
-e GREMLIN_TEAM_ID="${GREMLIN_TEAM_ID}" \
-e GREMLIN_TEAM_SECRET="${GREMLIN_TEAM_SECRET}" \
-v /var/run/docker.sock:/var/run/docker.sock \
-v /var/log/gremlin:/var/log/gremlin \
-v /var/lib/gremlin:/var/lib/gremlin \
gremlin/gremlin attack cpu
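Because the attack container is started with `-d`, `docker run` prints the new container's ID and returns immediately. One way to find out when the attack has finished is `docker wait`, which blocks until the container stops and prints its exit code. This is a sketch: the function name `wait_for_attack` is hypothetical, and treating exit code 0 as a successful attack is an assumption about how the Gremlin CLI reports its result.

```shell
# Hypothetical helper: block until the given attack container has stopped,
# then report its exit code.
wait_for_attack() {
    attack_id="$1"
    # docker wait prints the container's exit code once it stops.
    exit_code=$(docker wait "$attack_id")
    echo "attack container $attack_id exited with code $exit_code"
    # Assumption: a zero exit code means the attack completed normally.
    [ "$exit_code" -eq 0 ]
}
```

To use it, capture the ID printed by the `docker run -d` command above (for example `ATTACK_ID=$(docker run -d ... gremlin/gremlin attack cpu)`) and call `wait_for_attack "$ATTACK_ID"` from the same shell.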
Let's see the progress of the attack using the htop container we created earlier:
docker run -it --rm --pid=host htop
We can see that CPU core 7 shows 100% utilization caused by the Gremlin CPU attack.
Because we set up the Slack integration with Gremlin, we also receive the "Attack Started" notification on the Slack channel.
3) Creating Blackhole Attack On Nginx Docker Container using Gremlin Docker Container
Nginx is a popular web server that we will use as the target of our chaos experiments. First we will create a directory for the HTML page we will serve using Nginx:
mkdir -p ~/docker-nginx/html
cd ~/docker-nginx/html
Create a simple HTML page named index.html.
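Any valid HTML will do for this page. As a minimal sketch, the helper below writes a placeholder page into a directory of your choice; the function name `write_index_html` and the page contents are hypothetical, not taken from the original setup.

```shell
# Hypothetical helper: write a minimal placeholder page into the given directory.
write_index_html() {
    dir="$1"
    mkdir -p "$dir"
    # The quoted 'EOF' keeps the shell from expanding anything inside the page.
    cat > "$dir/index.html" <<'EOF'
<!DOCTYPE html>
<html>
  <head>
    <title>Docker Nginx</title>
  </head>
  <body>
    <h1>Hello from Nginx running in Docker</h1>
  </body>
</html>
EOF
}
```

For this tutorial the call would be `write_index_html ~/docker-nginx/html`.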
Create a container using the Nginx Docker image (note that if you aren't a member of the docker group, you'll need to add sudo to the start of each command):
docker run -l service=nginx --name docker-nginx -p 90:80 -d -v ~/docker-nginx/html:/usr/share/nginx/html nginx
View the docker-nginx container:
docker ps
Next, we'll run a blackhole attack on the Nginx container. A blackhole attack drops all network traffic to and from a container, making it appear offline. First, run the attack (make sure to replace the container ID f291a040a6aa with the ID of your own Nginx container):
docker run -it \
--cap-add=NET_ADMIN \
-e GREMLIN_TEAM_ID="${GREMLIN_TEAM_ID}" \
-e GREMLIN_TEAM_SECRET="${GREMLIN_TEAM_SECRET}" \
-v /var/run/docker.sock:/var/run/docker.sock \
gremlin/gremlin attack-container f291a040a6aa blackhole --ingress_port 80
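The Nginx container was started with the label `service=nginx`, so its ID can be looked up rather than pasted in by hand. The sketch below wraps the same attack command in a function that resolves the target first and refuses to run if no matching container is found. The function name `blackhole_nginx` is hypothetical; the `docker run` arguments are the ones used above.

```shell
# Hypothetical helper: find the container labelled service=nginx and run the
# blackhole attack from above against it.
blackhole_nginx() {
    target=$(docker ps --filter "label=service=nginx" --format '{{.ID}}' | head -n 1)
    if [ -z "$target" ]; then
        echo "no running container with label service=nginx" >&2
        return 1
    fi
    docker run -it \
        --cap-add=NET_ADMIN \
        -e GREMLIN_TEAM_ID="${GREMLIN_TEAM_ID}" \
        -e GREMLIN_TEAM_SECRET="${GREMLIN_TEAM_SECRET}" \
        -v /var/run/docker.sock:/var/run/docker.sock \
        gremlin/gremlin attack-container "$target" blackhole --ingress_port 80
}
```

Failing before the attack starts avoids launching a Gremlin container with an empty target.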
We can view the progress of the attack using the htop image we built earlier (note: make sure to replace GREM_ID with the ID of your Nginx Docker container):
sudo docker run -it --rm --pid=container:$GREM_ID htop
We can see that all incoming and outgoing traffic for the Nginx container on port 80 is being dropped by the blackhole attack.
We can also verify the attack by accessing the Nginx container from outside: the page hangs while loading for the 60-second duration of the blackhole attack.
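The outside check can also be done from the command line, which makes it easy to compare the result before and during the attack. This sketch assumes the page is published on host port 90, as in the `docker run -p 90:80` command above; the function name `probe_nginx` and the 5-second default timeout are hypothetical choices.

```shell
# Hypothetical helper: report whether the page answers within a time limit.
# Arguments: URL (default http://localhost:90) and timeout in seconds (default 5).
probe_nginx() {
    url="${1:-http://localhost:90}"
    timeout_s="${2:-5}"
    # --fail makes curl return non-zero on HTTP errors; --max-time bounds the wait.
    if curl --silent --fail --max-time "$timeout_s" --output /dev/null "$url"; then
        echo "reachable"
    else
        echo "unreachable"
        return 1
    fi
}
```

If traffic is being dropped as described, we would expect `probe_nginx` to print "reachable" before the attack and "unreachable" while it is in progress.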
We can see that Gremlin also sends a notification of the blackhole attack status on Slack.
4) Running attacks from the Gremlin web app
Now that the Gremlin container is running in your Docker environment, you can use the Gremlin web app to run attacks on the host, or other Docker containers running on the host.
To start an attack from the web app, log in using your Gremlin credentials and select Attacks from the left panel. Then, select New Attack to get the following screen:
Next, we'll select an attack to run against the container. Like our CLI example, we'll use the CPU attack. For more information about all of Gremlin's attacks, please visit Attacks.
Once the attack begins, you'll be taken to the following screen. You can follow the progress of the attack from this page. The Stage field under Details shows the current progress of the attack. If for some reason you need to stop the attack, click the Halt button.
We can also monitor the attack using the htop container we created earlier:
sudo docker run -it --rm --pid=host htop
The htop output shows that CPU cores 0 and 5 are at 100% utilization because of the CPU attack:
Let's check the status of the CPU attack from the Gremlin web app:
As you can see, the CPU attack was run successfully from the Gremlin web app.
Conclusion
We've installed Gremlin in a Docker container and validated that Gremlin works by running the "Hello, World!" of Chaos Engineering experiments: the CPU resource attack. We have run a CPU resource attack from the Gremlin Docker container against the host. We have also run a blackhole attack from the Gremlin Docker container against an Nginx Docker container.