Type what you want to break: AI-assisted chaos engineering with Krkn

Chaos engineering on Kubernetes has never been more powerful. Tools like Krkn now support over twenty scenario types, such as pod disruptions, node failures, network chaos, CPU and memory stress, zone outages, and more. Krkn's documentation is thorough, consisting of well-defined scenario types with clear parameters and defaults. But there is still a gap between knowing what you want to test and expressing it in the exact CLI syntax. The more scenarios a tool supports, the more flags and options you need to get right.

I built a solution, packaged as a Claude Code skill, that generates validated Krkn commands from plain English. You describe the failure you want to simulate and get a ready-to-run command. The skill handles the syntax translation for you.

The gap between intent and syntax

While chaos testing a Red Hat OpenShift cluster, I kept running into the following loop every time I needed to execute a new scenario.

Find the right scenario: Is it pod-scenarios, container-scenarios, or application-outages?
Look up every parameter: What’s the flag name? Is it --chaos-duration or --duration? What does it map to in krkn-hub?
Check defaults and valid values: Can I pass random as a kill mode? What’s the default iteration count?
Wire it all together: Construct the full krknctl or krkn-hub command with the right volume mounts and kubeconfig paths.

A single typo in a flag name meant a failed run with a cryptic error, and I’d be right back in the documentation. I was spending more time reading about chaos tests than actually running them. If you’ve run Krkn scenarios before, you’ve probably felt this too.

What if you could just say what you want?

This question led me to build a Claude Code skill, a natural language interface for Krkn that already knows every krknctl and krkn-hub parameter, its default, type, and allowed values.

It pulls from an authoritative knowledge base containing structured JSON definitions for all 20+ Krkn scenarios, so it does not have to guess: every flag name and environment variable comes directly from the source.
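To make the idea of validating against structured definitions concrete, here is a minimal Python sketch. The JSON layout, the field names (type, allowed, default), and the example-mode parameter are assumptions made for illustration; the actual knowledge-base schema is not shown in this article and may look different.

```python
import json

# Hypothetical scenario definition. The real knowledge-base schema may differ;
# every field name here is an assumption made for illustration.
DEFINITION = json.loads("""
{
  "scenario": "pod-scenarios",
  "parameters": [
    {"name": "namespace", "type": "string"},
    {"name": "kill-count", "type": "number", "default": 1},
    {"name": "example-mode", "type": "enum", "allowed": ["a", "b"]}
  ]
}
""")


def validate(definition, supplied):
    """Return a list of human-readable problems; an empty list means the input passed."""
    known = {p["name"]: p for p in definition["parameters"]}
    problems = []
    for name, value in supplied.items():
        spec = known.get(name)
        if spec is None:
            # A typo in a flag name is caught here, before anything runs.
            problems.append(f"unknown parameter: {name}")
            continue
        is_number = isinstance(value, (int, float)) and not isinstance(value, bool)
        if spec["type"] == "number" and not is_number:
            problems.append(f"{name}: expected a number, got {value!r}")
        elif spec["type"] == "enum" and value not in spec["allowed"]:
            problems.append(f"{name}: {value!r} is not one of {spec['allowed']}")
    return problems


print(validate(DEFINITION, {"namespace": "openshift-etcd", "kil-count": 1}))
```

The point of the sketch is that a misspelled flag is rejected against the definition up front rather than surfacing later as a cryptic runtime error.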

How the Krkn scenario generator skill works

The workflow is simple. Type a natural language description of the failure you want to simulate as follows:

/krkn-scenario kill etcd pods in openshift-etcd namespace

You’ll get:

A validated krknctl CLI command, ready to copy and run
A validated krkn-hub Docker command as an alternative
A parameter table with descriptions and justifications for each value
Relevant edge cases and warnings from the scenario definition

Here are a few more examples:

/krkn-scenario add 200ms network latency to worker nodes for 5 minutes 
/krkn-scenario hog 4 CPU cores at 80% on nodes labeled stress-test=true 
/krkn-scenario fill PVCs to 90% in namespace my-app 
/krkn-scenario simulate zone outage in us-east-1a

Each of these produces a complete, validated command, not a template with placeholders to fill in.

Describe what you want—no syntax required

The most important design goal is that you should not need to know anything about Krkn’s CLI syntax. No flag names, no environment variable mappings, no digging through scenario JSON files. Just describe the failure condition you want to test in plain English.

The skill handles the translation as follows:

You say: Kill etcd pods
- The skill generates: krknctl run pod-scenarios --namespace openshift-etcd …
You say: Stress CPU on 2 worker nodes
- The skill generates: krknctl run node-cpu-hog --node-selector node-role.kubernetes.io/worker= …
You say: Add network latency to pods in my-app
- The skill generates: krknctl run pod-network-chaos --namespace my-app --latency 200 …
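The last step of that translation, turning a chosen scenario and its resolved values into a command line, can be sketched in a few lines of Python. This is only an illustration of the assembly step, not the skill's implementation; build_krknctl_command is a hypothetical helper, and the flag names are the ones that appear in the examples above.

```python
import shlex


def build_krknctl_command(scenario, params):
    """Assemble a `krknctl run` command line from a scenario name and flag values."""
    argv = ["krknctl", "run", scenario]
    for flag, value in params.items():
        argv += [f"--{flag}", str(value)]
    # shlex.join quotes any value that would otherwise be split by the shell.
    return shlex.join(argv)


print(build_krknctl_command("pod-scenarios", {"namespace": "openshift-etcd"}))
# krknctl run pod-scenarios --namespace openshift-etcd
```

Building the command as a list and joining it at the end keeps values with spaces or special characters safely quoted.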

This opens up chaos engineering to anyone on your team, not just the person who memorized the Krkn docs. An SRE investigating a production incident can quickly set up the right failure scenario. A developer can test how their application handles pod disruptions without learning a new CLI.

The skill asks before it assumes

Unlike a static template generator, the skill understands what information is missing and asks targeted questions before generating commands. It classifies every parameter into three buckets:

Required and missing: Parameters you must provide, such as the target namespace for pod scenarios. The skill always asks for these.
Optional but important: Parameters where the default could be risky or surprising, such as kill_count defaulting to a high number or targeting all namespaces. The skill flags these for your review.
Safe defaults: Parameters where the default is sensible. The skill uses these silently to keep the output clean.
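The three buckets can be sketched as a small classification function. This is a hedged illustration rather than the skill's actual logic: the Param structure, the risky_default marker, and the example-timeout parameter are hypothetical names introduced here.

```python
from dataclasses import dataclass
from typing import Any


@dataclass
class Param:
    name: str
    required: bool = False
    default: Any = None
    risky_default: bool = False  # hypothetical marker for defaults worth reviewing


def classify(params, supplied):
    """Sort every parameter the user did not supply into one of three buckets."""
    buckets = {"ask": [], "review": [], "silent": []}
    for p in params:
        if p.name in supplied:
            continue  # already provided, nothing to decide
        if p.required:
            buckets["ask"].append(p.name)      # required and missing
        elif p.risky_default:
            buckets["review"].append(p.name)   # optional but important
        else:
            buckets["silent"].append(p.name)   # safe default
    return buckets


# Illustrative parameter set; names and flags are assumptions, not Krkn's schema.
PARAMS = [
    Param("namespace", required=True),
    Param("pod-label", risky_default=True),
    Param("kill-count", default=1, risky_default=True),
    Param("example-timeout", default=120),
]

print(classify(PARAMS, {}))
```

Only the first bucket blocks generation; the second produces review prompts, and the third never appears in the conversation.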

For example, if you type /krkn-scenario kill pods, the skill responds as follows:

Before I generate the commands, I need a few details: 
Required: 
Namespace: Which namespace are the target pods in? 
Recommended to specify: 
Label selector: Do you want to target specific pods by label? 
Kill count: How many pods per iteration? Default is 1.

It also proposes values based on the scenario context, so you’re not just filling in blanks. You'll get guided recommendations from someone (well, something) that has read all the docs.

It knows your cluster, not just the docs

Since the skill runs inside Claude Code, it has access to your terminal session and can pull context directly from your connected cluster. Instead of generating commands with placeholder values like <your-namespace>, it can:

List namespaces and suggest the right one based on your description
Fetch pod labels to construct precise label selectors
Query node selectors to target the correct node pools
Verify resources exist before generating a command that would fail

Here’s what that looks like in practice:

$ kubectl get pods -n openshift-etcd --show-labels
NAME            READY   STATUS    LABELS
etcd-master-0   1/1     Running   app=etcd,node=master-0
etcd-master-1   1/1     Running   app=etcd,node=master-1
etcd-master-2   1/1     Running   app=etcd,node=master-2

The skill uses this real cluster data to generate environment-aware commands you can run immediately with no manual editing.
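One way such label data could be turned into a selector is to keep only the key=value pairs shared by every target pod. The Python sketch below assumes that approach and uses the LABELS values from the listing above; parse_labels and common_label_selector are hypothetical helpers, not part of the skill or of kubectl.

```python
def parse_labels(text):
    """Turn a LABELS column value such as 'app=etcd,node=master-0' into a dict."""
    return dict(pair.split("=", 1) for pair in text.split(",") if pair)


def common_label_selector(pod_labels):
    """Return a selector built from the key=value pairs shared by every pod."""
    if not pod_labels:
        return ""
    shared = set(pod_labels[0].items())
    for labels in pod_labels[1:]:
        shared &= set(labels.items())
    return ",".join(f"{key}={value}" for key, value in sorted(shared))


# LABELS values copied from the kubectl output above.
rows = [
    "app=etcd,node=master-0",
    "app=etcd,node=master-1",
    "app=etcd,node=master-2",
]
print(common_label_selector([parse_labels(row) for row in rows]))  # app=etcd
```

Per-pod labels such as node=master-0 drop out of the intersection, leaving a selector that matches all three etcd pods.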

Every Krkn scenario covered

The skill supports all Krkn scenario types available today:

Pod-level: pod-scenarios, container-scenarios
Node-level: node-scenarios, node-scenarios-bm, node-cpu-hog, node-memory-hog, node-io-hog
Network: network-chaos, pod-network-chaos, node-network-filter, pod-network-filter, syn-flood
Time: time-scenarios
Application: application-outages
Service: service-disruption-scenarios, service-hijacking
Storage: pvc-scenarios
Cluster-wide: power-outages, zone-outages
Virtualization: kubevirt-outage

It fetches the knowledge base fresh on every invocation, so your scenario definitions stay up to date as Krkn evolves.

Try it yourself

Install the skill and start generating scenarios:

npx skills add krkn-chaos/krkn-skill

Then open Claude Code and run your first scenario:

/krkn-scenario kill pods in namespace kube-system

That’s it! Two commands and you’re generating validated chaos scenarios from natural language.

If you’re running chaos tests against OpenShift clusters, this pairs well with krknctl and the krkn-hub container images to give you a complete chaos engineering workflow, from scenario generation to execution.

What’s next

This skill has simplified my daily workflow for CCLM testing. What used to take multiple documentation lookups and careful flag wiring now takes a single prompt.

Here’s where I’d like to take it next:

Scenario chaining: Generate multi-step chaos workflows (e.g., drain a node, then stress CPU on the remaining nodes).
Dry-run validation: Pre-validate commands against the cluster before execution.
Result analysis: Interpret chaos test results and suggest follow-up scenarios.

The skill is open source, and you are welcome to contribute. If you have ideas, feedback, or want to get involved, open an issue on the krkn-skills repository or check out Krkn on GitHub to learn more about the chaos engineering framework behind it.
