
How we turned Storybook into a behavioral verification engine

April 29, 2026
Riccardo Forina
Related topics:
Application development and delivery, Artificial intelligence, DevOps, Platform engineering
Related products:
Red Hat OpenShift

    This is part 3 of a four-part series. In part 1, we covered governance: how we made the code base AI-ready. In part 2, we covered delivery: the migration strategy. This post details how we made the test suite trustworthy—specifically, how we turned Storybook from a component playground into something closer to a behavioral specification for the access management interface on Red Hat Hybrid Cloud Console.

    A common approach is to use Storybook to render components in isolation with static props. We use it to verify that the application behaves correctly end-to-end: real components, real routing, real data fetching, real permission checks, and real error handling. The only thing that's simulated is the network. A combination of patterns (typed mock factories, seed data, interaction helpers, stateful databases, and step isolation) turns individual stories into executable specifications. Together, they form a system where adding a feature means adding the proof that it works; breaking a feature means failing a test that describes exactly what broke and why.

    That's what we mean by verification engine. Here's how it works.

    The no-fake-layers principle

    Most user interface (UI) test suites mock at the wrong layer. They replace data-fetching functions with static returns. The tests pass, but the integration between the component and the network was never exercised. Bugs ship because the mock boundary was too close to the component.

    We wanted a different guarantee: if it works in Storybook, it works in the browser.

    We use Mock Service Worker (MSW) to intercept HTTP requests at the browser level and return controlled responses. Each story exercises the full stack from the component down to the network boundary. If a component makes a request to the wrong endpoint, the test fails. If the error handling path has a bug, the test catches it because the error response comes through the same code path as a real one.

    The patterns (with code)

    Making that guarantee real required enforcing a few patterns across the codebase. These are extracted from our governance documentation (the same docs that AI coding assistants read automatically, as described in part 1). Each one exists because we hit the problem it prevents.

    Handler factories, not inline mocks

    The most common source of technical debt in test code is hardcoding mock responses inline. Each test defines its own data, with its own field names. When the API changes, you update the real code and then spend a day updating test files—or you don't and the tests pass against stale data.

    We use handler factories: functions that return typed MSW handlers for a specific API domain. The types flow from the SDK through the data layer into the factory. When the SDK updates a response shape, the factory breaks at compile time.

    // DO:  Factory calls — typed, reusable, version-aware
    handlers: [...v2RolesHandlers()]                              // happy path, default seed data
    handlers: [...v2RolesHandlers(customRoles, { onList: spy })]  // custom data + spy callback
    handlers: [...groupsHandlers([])]                             // empty state
    handlers: [...groupsErrorHandlers(500)]                       // error state
    handlers: [...v2RolesLoadingHandlers()]                       // loading state (skeleton screens)
    // DONT:  Inline handlers — untyped, duplicated, invisible to SDK updates
    parameters: {
      msw: {
        handlers: [
          http.get('/api/rbac/v2/roles/', () =>
            HttpResponse.json({ data: [{ name: 'Admin', uuid: '123' }] })
          ),
        ],
      },
    },

    One factory update fixes every story that uses it. One inline handler is one more thing to forget. (We didn't start here. The first stories used inline handlers. After the third API change required updating dozens of files, we extracted the factories and never looked back.)
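To make the factory side of this concrete, here is a simplified, library-free sketch of the idea. It is an illustration, not our actual code: the real factories return MSW handlers, while the `RoleV2` type, the `Handler` shape, and the default seed values below are hypothetical stand-ins. The point is the shape of the signature: typed defaults, optional overrides, and an optional spy.

```typescript
// Hypothetical, library-free sketch of a handler factory.
// RoleV2, Handler, and DEFAULT_ROLES are illustrative names.
interface RoleV2 {
  uuid: string;
  name: string;
}

interface ListResponse<T> {
  data: T[];
}

// A plain object stands in for a real MSW handler here.
interface Handler {
  method: "GET";
  path: string;
  resolve: () => ListResponse<RoleV2>;
}

const DEFAULT_ROLES: RoleV2[] = [
  { uuid: "role-1", name: "Tenant admin" },
  { uuid: "role-2", name: "Viewer" },
];

// Typed defaults plus an optional spy, mirroring the calls shown above.
// One place to update when the API response shape changes.
function v2RolesHandlers(
  roles: RoleV2[] = DEFAULT_ROLES,
  options: { onList?: () => void } = {},
): Handler[] {
  return [
    {
      method: "GET",
      path: "/api/rbac/v2/roles/",
      resolve: () => {
        options.onList?.(); // lets a story assert the request happened
        return { data: roles };
      },
    },
  ];
}
```

Because the seed data and response shape live in the factory, a story that only cares about "the list loaded" calls `v2RolesHandlers()` with no arguments, and a story about empty states passes `[]`.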

    Seed constants, not hardcoded strings

    Play functions (Storybook's mechanism for running user interactions) must never use string literals for entity names. When seed data changes, hardcoded strings silently break.

    // DO:  Named constants — breaks visibly when seed data changes
    await canvas.findByText(ROLE_TENANT_ADMIN.name);
    await expandWorkspaceRow(user, canvas, WS_ROOT.name);
    // DONT:  Hardcoded strings — silently wrong when seed data changes
    await canvas.findByText('Tenant admin');
    await expandWorkspaceRow(user, canvas, 'Root Workspace');
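A seed module for this can be a handful of typed constants. The sketch below uses hypothetical names mirroring the ones above; the point is that the mock handlers and the play-function assertions import the same constants, so they cannot drift apart silently.

```typescript
// Hypothetical seed module. Names are illustrative.
interface RoleSeed {
  uuid: string;
  name: string;
}

// One definition, shared by the mock handlers and the play functions.
// Renaming "Tenant admin" here updates every consumer at once.
const ROLE_TENANT_ADMIN: RoleSeed = { uuid: "role-1", name: "Tenant admin" };
const ROLE_VIEWER: RoleSeed = { uuid: "role-2", name: "Viewer" };

// The handlers seed the mock API from the same constants the
// assertions reference, so a changed name breaks visibly, not silently.
const DEFAULT_ROLES: RoleSeed[] = [ROLE_TENANT_ADMIN, ROLE_VIEWER];
```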

    Interaction helpers, not copy-paste

    The first test inlines every Document Object Model (DOM) query. The second test copies the first. By the twentieth test, you have twenty slightly different implementations of "open a modal and fill in a form."

    We extracted every reusable interaction into shared helpers and banned the inline alternatives:

    // DO:  Shared helpers — async-safe, reusable, maintained in one place
    const modal = await waitForModal();
    await clickWizardNext(user, modal);
    await confirmDestructiveModal(user, { buttonLabel: /remove/i });
    await selectTableRow(user, canvas, ROLE_TENANT_ADMIN.name);
    // DONT:  Banned in play functions
    document.querySelector('[role="dialog"]');        // direct DOM access
    await delay(500);                                 // arbitrary waits
    element.dispatchEvent(new MouseEvent('click'));    // raw DOM events
    canvas.getByRole('button', { name: /save/i });    // sync query after async boundary

    The helpers enforce correct async patterns. Every DOM query after an action uses findBy* (which retries) instead of getBy* (which doesn't). Every assertion that depends on an async operation uses waitFor. The banned patterns in the preceding list are codified in our governance docs; the async patterns (findBy* over getBy*, no arbitrary waits) are enforced by lint rules that fail the build.
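To make the findBy*-versus-getBy* distinction concrete, here is a hypothetical minimal version of the retry loop that findBy*-style queries run. This illustrates the semantics only; it is not Testing Library's actual implementation.

```typescript
// Sketch of retry semantics: a getBy*-style query runs once and fails
// fast; a findBy*-style query polls it until it succeeds or times out.
async function findBy<T>(
  query: () => T | null, // a sync query, e.g. "first element with this text"
  { timeout = 1000, interval = 50 } = {},
): Promise<T> {
  const deadline = Date.now() + timeout;
  for (;;) {
    const result = query();
    if (result !== null) return result; // found: resolve immediately
    if (Date.now() > deadline) {
      throw new Error("Timed out waiting for element");
    }
    // Not found yet: wait and retry instead of failing fast.
    await new Promise((resolve) => setTimeout(resolve, interval));
  }
}
```

This is why a sync query after an async boundary is banned: the element may simply not exist yet when the query runs once, and the retry loop is what absorbs that timing gap without an arbitrary `delay(500)`.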

    Step isolation in journey stories

    Component stories test individual surfaces. Journey stories test multi-step flows: create a role, see it appear, edit it, delete it, verify it's gone. These use stateful mock databases that support real create, read, update, and delete (CRUD) operations in memory—the handler factories read from and write to the same collection.

    The step() function organizes phases and creates closure boundaries. DOM references from one step can't leak into the next. This prevents a category of flaky tests where a stale reference causes a later assertion to fail:

    play: async ({ canvasElement, step }) => {
      const canvas = within(canvasElement);
      await step('Navigate to roles list', async () => {
        const table = await canvas.findByRole('grid');
        await expect(within(table).findByText(FIRST_ROLE.name)).resolves.toBeInTheDocument();
      });
      await step('Create a new role', async () => {
        await user.click(await canvas.findByRole('button', { name: /create role/i }));
        const modal = await waitForModal();
        await clearAndType(user, modal, /role name/i, 'New role');
        await clickWizardNext(user, modal);
        await expect(createRoleSpy).toHaveBeenCalled();
      });
    },
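The stateful mock database behind a journey story can be a plain in-memory collection. The sketch below is a hypothetical minimal version (names and shapes are illustrative, not our actual implementation): the handlers for list, create, update, and delete all close over the same `Map`, so a create performed in one step is visible to the list in the next.

```typescript
// Hypothetical in-memory database for journey stories.
interface Role {
  uuid: string;
  name: string;
}

// Each story builds a fresh database from seed data, so journeys
// stay independent while steps within one journey share state.
function createRolesDb(seed: Role[]) {
  const rows = new Map<string, Role>(seed.map((role) => [role.uuid, role]));
  return {
    list: (): Role[] => [...rows.values()],
    create: (role: Role): Role => {
      rows.set(role.uuid, role);
      return role;
    },
    update: (uuid: string, patch: Partial<Role>): Role | null => {
      const current = rows.get(uuid);
      if (!current) return null;
      const next = { ...current, ...patch };
      rows.set(uuid, next);
      return next;
    },
    remove: (uuid: string): boolean => rows.delete(uuid),
  };
}
```

In this arrangement, the GET handler calls `list()`, the POST handler calls `create()`, and so on, which is what lets a "create a role, see it appear, delete it, verify it's gone" flow exercise real request/response round trips instead of canned responses.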

    What the engine looks like at scale

    None of these patterns are remarkable on their own. Handler factories are a convenience. Seed constants are good hygiene. Interaction helpers reduce duplication. Step isolation prevents flakiness.

    When composed together, however, they produce something qualitatively different from a test suite. Each feature in the application has a story that describes its behavior: what the user sees, what API calls are made, what happens on error, what happens with empty data, what happens without permission.

    A total of 959 of these specifications run in continuous integration (CI) on every change. When we upgraded the component library across 272 files, the suite showed exactly which behaviors survived and which did not. When we removed 216 files of legacy state management in a single commit, the suite proved the application still worked.

    The stories aren't tests that verify the code. They're the specification that defines what "working" means.

    The cross-functional win

    The most exciting benefit had nothing to do with code quality. It changed how the team collaborates.

    Every story is automatically deployed to a hosted environment via CI. That environment became the place where design reviews happen—not staging, not a branch deploy, not a screen recording. Designers open the story, navigate through the flow, and see exactly what users will see.

    A product manager reviewed one of our wizard flows in Storybook before it ever reached staging and caught a step-ordering issue that would have shipped otherwise. The cost of that catch was zero. The cost of finding it in production would have been a support ticket and a confused customer.

    Stories also became the reference artifact the team points to when behavioral questions come up. "Does the delete confirmation show the resource name?" is no longer a question someone answers from memory. Someone pulls up the story. The story is the answer. We used the same stories to demo features to stakeholders outside the immediate team. The behavioral accuracy meant we were showing real product behavior, not a polished prototype that would diverge from what shipped.

    The technical investment in making stories accurate (no fake layers, real network mocking, and real permission checks) is what makes all of this possible. If the stories used fake data or skipped permission checks, nobody outside engineering would trust them. Because the stories exercise the same code paths as the real product, everyone uses them. Engineering, design, and product now share a single artifact that describes how the application behaves—and that artifact runs in CI on every change.

    What I'd tell you if you're building this

    Mock at the network boundary, not the component boundary. The further your mock is from the component, the more real code your test exercises.

    Build factories, not inline mocks. A handler factory pays for itself after the third test that uses it. When the API changes, you update the factory and every consumer gets the fix.

    Extract interactions into helpers immediately. The first time you write "open modal, fill form, submit" is the last time it should be inline.

    Invest in the cross-functional story. The engineering value of behavioral tests is real. But the organizational value of giving design and product a shared artifact they can trust is larger than any single technical benefit.

    Try Red Hat Hybrid Cloud Console at console.redhat.com.

    Learn more

    • Red Hat Hybrid Cloud Console
    • Inventory Groups are now Workspaces
    • Read part 1: Engineering an AI-ready code base: Governance lessons from the Red Hat Hybrid Cloud Console
    • Read part 2: How we rewrote a production UI without stopping it
