Chaos engineering on Kubernetes has never been more powerful. Tools like Krkn now support over twenty scenario types, such as pod disruptions, node failures, network chaos, CPU and memory stress, zone outages, and more. Krkn's documentation is thorough, consisting of well-defined scenario types with clear parameters and defaults. But there is still a gap between knowing what you want to test and expressing it in the exact CLI syntax. The more scenarios a tool supports, the more flags and options you need to get right.
I built a solution that allows you to generate validated Krkn commands from plain English. You can describe the failure you want to simulate, and get a ready-to-run command. The skill handles the syntax translation for you.
The gap between intent and syntax
While chaos testing a Red Hat OpenShift cluster, I kept running into the following loop every time I needed to execute a new scenario.
- Find the right scenario: Is it pod-scenarios, container-scenarios, or application-outages?
- Look up every parameter: What’s the flag name? Is it --chaos-duration or --duration? What does it map to in krkn-hub?
- Check defaults and valid values: Can I pass random as a kill mode? What’s the default iteration count?
- Wire it all together: Construct the full krknctl or krkn-hub command with the right volume mounts and kubeconfig paths.
A single typo in a flag name meant a failed run with a cryptic error, and I’d be right back in the documentation. I was spending more time reading about chaos tests than actually running them. If you’ve run Krkn scenarios before, you’ve probably felt this too.
What if you could just say what you want?
This question led me to build a Claude Code skill, a natural language interface for Krkn that already knows every krknctl and krkn-hub parameter, its default, type, and allowed values.
It pulls from an authoritative knowledge base containing structured JSON definitions for all 20+ Krkn scenarios, so nothing is guessed. Every flag name and environment variable comes directly from the source.
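To make the idea of checking against structured definitions concrete, here is a minimal Python sketch. The JSON layout (the "parameters", "type", and "default" fields) and the validate function are assumptions made for illustration. They are not Krkn's actual schema or the skill's real implementation.

```python
import json

# Hypothetical scenario definition. The field names below are illustrative
# assumptions, not the real layout of Krkn's knowledge base.
SCENARIO_JSON = """
{
  "name": "pod-scenarios",
  "parameters": {
    "namespace": {"type": "string", "default": null},
    "kill_count": {"type": "number", "default": 1}
  }
}
"""

PYTHON_TYPES = {"string": str, "number": int}


def validate(definition, requested):
    """Return a list of problems; an empty list means every value checks out."""
    problems = []
    params = definition["parameters"]
    for name, value in requested.items():
        spec = params.get(name)
        if spec is None:
            # A typo in a parameter name is caught here, before anything runs.
            problems.append(f"unknown parameter: {name}")
            continue
        if not isinstance(value, PYTHON_TYPES[spec["type"]]):
            problems.append(
                f"{name}: expected {spec['type']}, got {type(value).__name__}"
            )
    return problems


definition = json.loads(SCENARIO_JSON)
print(validate(definition, {"namespace": "openshift-etcd", "kill_count": 1}))
print(validate(definition, {"namespce": "openshift-etcd"}))
```

The second call shows the kind of misspelling that would otherwise surface only as a failed run.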
How the Krkn scenario generator skill works
The workflow is simple. Type a natural language description of the failure you want to simulate as follows:
/krkn-scenario kill etcd pods in openshift-etcd namespace
You’ll get:
- A validated krknctl CLI command, ready to copy and run
- A validated krkn-hub Docker command as an alternative
- A parameter table with descriptions and justifications for each value
- Relevant edge cases and warnings from the scenario definition
Here are a few more examples:
/krkn-scenario add 200ms network latency to worker nodes for 5 minutes
/krkn-scenario hog 4 CPU cores at 80% on nodes labeled stress-test=true
/krkn-scenario fill PVCs to 90% in namespace my-app
/krkn-scenario simulate zone outage in us-east-1a
Each of these produces a complete, validated command, not a template with placeholders to fill in.
Describe what you want—no syntax required
The most important design goal is that you don’t need to know anything about Krkn’s CLI syntax. No flag names, no environment variable mappings, no digging through scenario JSON files. Just describe the failure condition you want to test in plain English.
The skill handles the translation as follows:
- You say: Kill etcd pods
  - The skill generates: krknctl run pod-scenarios --namespace openshift-etcd …
- You say: Stress CPU on 2 worker nodes
  - The skill generates: krknctl run node-cpu-hog --node-selector node-role.kubernetes.io/worker= …
- You say: Add network latency to pods in my-app
  - The skill generates: krknctl run pod-network-chaos --namespace my-app --latency 200 …
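The skill itself decides which scenario and parameters match a request. The last, mechanical step of turning that choice into a shell-safe command line could look roughly like the Python sketch below. The build_krknctl_command helper is hypothetical, and only the --namespace flag is taken from the examples above.

```python
import shlex


def build_krknctl_command(scenario, params):
    """Assemble a krknctl invocation from a scenario name and a flag/value mapping."""
    argv = ["krknctl", "run", scenario]
    for flag, value in params.items():
        argv += [f"--{flag}", str(value)]
    # shlex.join quotes any argument that contains spaces or shell metacharacters.
    return shlex.join(argv)


print(build_krknctl_command("pod-scenarios", {"namespace": "openshift-etcd"}))
```

Building the argument list first and quoting at the end keeps values with spaces or special characters from being split by the shell.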
This opens up chaos engineering to anyone on your team, not just the person who memorized the Krkn docs. An SRE investigating a production incident can quickly set up the right failure scenario. A developer can test how their application handles pod disruptions without learning a new CLI.
The skill asks before it assumes
Unlike a static template generator, the skill understands what information is missing and asks targeted questions before generating commands. It classifies every parameter into three buckets:
- Required and missing: Parameters you must provide, such as the target namespace for pod scenarios. The skill always asks for these.
- Optional but important: Parameters where the default could be risky or surprising, such as kill_count defaulting to a high number or targeting all namespaces. The skill flags these for your review.
- Safe defaults: Parameters where the default is sensible. The skill uses these silently to keep the output clean.
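The three buckets can be pictured as a simple classification pass over the scenario's parameters. The following Python sketch assumes a hypothetical ParamSpec structure with a risky_default marker. The parameter names and their markings are illustrative, not taken from Krkn's definitions.

```python
from dataclasses import dataclass


@dataclass
class ParamSpec:
    name: str
    required: bool = False
    default: object = None
    risky_default: bool = False  # hypothetical marker for surprising defaults


def classify(specs, provided):
    """Sort parameters the user did not supply into the three buckets."""
    buckets = {"ask": [], "review": [], "silent": []}
    for spec in specs:
        if spec.name in provided:
            continue  # the user already gave a value
        if spec.required:
            buckets["ask"].append(spec.name)
        elif spec.risky_default:
            buckets["review"].append(spec.name)
        else:
            buckets["silent"].append(spec.name)
    return buckets


# Illustrative parameters only; names and markings are assumptions.
specs = [
    ParamSpec("namespace", required=True),
    ParamSpec("label_selector", risky_default=True),
    ParamSpec("kill_count", default=1, risky_default=True),
    ParamSpec("iterations", default=1),
]

print(classify(specs, provided={}))
print(classify(specs, provided={"namespace": "openshift-etcd"}))
```

Once the namespace is supplied, the "ask" bucket is empty and only the review items remain to confirm.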
For example, if you type /krkn-scenario kill pods, the skill responds as follows:
Before I generate the commands, I need a few details:
Required:
Namespace: Which namespace are the target pods in?
Recommended to specify:
Label selector: Do you want to target specific pods by label?
Kill count: How many pods per iteration? Default is 1.
It also proposes values based on the scenario context, so you’re not just filling in blanks. You'll get guided recommendations from someone (well, something) that has read all the docs.
It knows your cluster, not just the docs
Since the skill runs inside Claude Code, it has access to your terminal session and can pull context directly from your connected cluster. Instead of generating commands with placeholder values like <your-namespace>, it can:
- List namespaces and suggest the right one based on your description
- Fetch pod labels to construct precise label selectors
- Query node selectors to target the correct node pools
- Verify resources exist before generating a command that would fail
Here’s what that looks like in practice:
$ kubectl get pods -n openshift-etcd --show-labels
NAME READY STATUS LABELS
etcd-master-0 1/1 Running app=etcd,node=master-0
etcd-master-1 1/1 Running app=etcd,node=master-1
etcd-master-2 1/1 Running app=etcd,node=master-2
The skill uses this real cluster data to generate environment-aware commands you can run immediately with no manual editing.
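One plausible way to turn pod labels like these into a selector is to keep only the labels that every target pod shares. The Python sketch below works on a list shaped like the items array of kubectl get pods -o json. The shared_label_selector function is a hypothetical helper and not necessarily how the skill derives its selectors.

```python
def shared_label_selector(pods):
    """Build a selector from labels that every pod carries with the same value."""
    if not pods:
        return ""
    common = dict(pods[0]["metadata"].get("labels", {}))
    for pod in pods[1:]:
        labels = pod["metadata"].get("labels", {})
        # Drop any label that is missing or has a different value on this pod.
        common = {k: v for k, v in common.items() if labels.get(k) == v}
    return ",".join(f"{k}={v}" for k, v in sorted(common.items()))


# Sample data mirroring the kubectl output above.
pods = [
    {"metadata": {"name": "etcd-master-0", "labels": {"app": "etcd", "node": "master-0"}}},
    {"metadata": {"name": "etcd-master-1", "labels": {"app": "etcd", "node": "master-1"}}},
    {"metadata": {"name": "etcd-master-2", "labels": {"app": "etcd", "node": "master-2"}}},
]

print(shared_label_selector(pods))  # app=etcd
```

The per-pod node label is dropped because its value differs between pods, which leaves app=etcd as the selector that matches all three.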
Every Krkn scenario covered
The skill supports all Krkn scenario types available today:
- Pod-level: pod-scenarios, container-scenarios
- Node-level: node-scenarios, node-scenarios-bm, node-cpu-hog, node-memory-hog, node-io-hog
- Network: network-chaos, pod-network-chaos, node-network-filter, pod-network-filter, syn-flood
- Time: time-scenarios
- Application: application-outages
- Service: service-disruption-scenarios, service-hijacking
- Storage: pvc-scenarios
- Cluster-wide: power-outages, zone-outages
- Virtualization: kubevirt-outage
It fetches the knowledge base fresh on every invocation, so your scenario definitions stay up to date as Krkn evolves.
Try it yourself
Install the skill and start generating scenarios:
npx skills add krkn-chaos/krkn-skill
Then open Claude Code and run your first scenario:
/krkn-scenario kill pods in namespace kube-system
That’s it! Two commands and you’re generating validated chaos scenarios from natural language.
If you’re running chaos tests against OpenShift clusters, this pairs well with krknctl and the krkn-hub container images to give you a complete chaos engineering workflow, from scenario generation to execution.
What’s next
This skill has simplified my daily workflow for CCLM testing. What used to take multiple documentation lookups and careful flag wiring now takes a single prompt.
Here’s where I’d like to take it next:
- Scenario chaining: Generate multi-step chaos workflows (e.g., drain a node, then stress CPU on the remaining nodes).
- Dry-run validation: Pre-validate commands against the cluster before execution.
- Result analysis: Interpret chaos test results and suggest follow-up scenarios.
The skill is open source, and you are welcome to contribute. If you have ideas, feedback, or want to get involved, open an issue on the krkn-skills repository or check out Krkn on GitHub to learn more about the chaos engineering framework behind it.