How to implement and monitor circuit breakers in OpenShift Service Mesh 3

September 29, 2025
Shailendra Kumar Singh
Related topics: Application modernization, DevOps, Microservices, Service mesh
Related products: Red Hat OpenShift Container Platform, Red Hat OpenShift Service Mesh

    In a distributed microservices architecture, the failure of one service can cascade, leading to system-wide outages. To build resilient and fault-tolerant applications, we must isolate failures and prevent them from spreading. The circuit breaker is a critical design pattern that addresses this challenge by temporarily blocking traffic to a service that it detects as unhealthy, giving it time to recover.

    This guide shows you how to configure, trigger, and monitor a circuit breaker using Red Hat OpenShift Service Mesh 3.0. By the end, you'll have a hands-on understanding of how to use Istio's outlier detection to automatically improve your application's stability on OpenShift.

    Prerequisites

    Before you begin, ensure your environment is fully prepared. This guide assumes you have the following setup:

    • An OpenShift Container Platform cluster: You will need access to a cluster running version 4.16 or newer with administrator privileges.
    • Command-line tools: The OpenShift CLI (oc) and Kubernetes CLI (kubectl) must be installed and configured to connect to your cluster.
    • Red Hat OpenShift Service Mesh and the Bookinfo sample application: You need a project (for example, bookinfo) where the OpenShift Service Mesh control plane is installed and the Bookinfo sample application is deployed. The application's product page must be accessible via the Istio ingress gateway.

      If you need to set this up, follow the official Red Hat documentation to install OpenShift Service Mesh and deploy the Bookinfo application. (Complete sections 2.1 through 2.5.3 of the tutorial.)

    • Kiali for monitoring: This tutorial uses Kiali to visualize the circuit breaker's status. Ensure you have configured access to the Kiali console.

      To set this up, follow the official documentation to expose and access the Kiali console. (Complete sections 4.1.1 through 4.1.3.)

    Step-by-step instructions

    Follow these steps to deploy the application, configure the circuit breaker, and monitor the results.

    Step 1: Preparation

    First, verify that the Bookinfo application is running correctly and that all pods are in a Running state.

    oc get pods -n bookinfo

    You should see output similar to this, with pods for productpage, details, ratings, and three versions of reviews, along with istio-ingressgateway.

    NAME                                   READY   STATUS    RESTARTS      AGE
    details-v1-7c799b8b4b-7npbl            2/2     Running   0             9d
    istio-ingressgateway-7bb7fb8fd-8sbxr   1/1     Running   0             9d
    productpage-v1-f8479c768-s72st         2/2     Running   0             9d
    ratings-v1-7fccfc8b8b-dr6xp            2/2     Running   4 (9d ago)    18d
    reviews-v1-8cc49957f-gswj6             2/2     Running   0             9d
    reviews-v2-5bf9856f5c-bcswn            2/2     Running   0             9d
    reviews-v3-6d8f75d44c-fqmzf            2/2     Running   3 (17h ago)   17h
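
    The readiness check can also be scripted. The following is a minimal sketch that assumes the column layout shown above; not_ready_pods is a hypothetical helper name, not an oc feature.

```shell
# Hypothetical helper: read `oc get pods --no-headers` output on stdin and
# print the name of every pod that is not Running or not fully ready.
not_ready_pods() {
  awk '{
    split($2, ready, "/")   # READY column, for example "2/2"
    if ($3 != "Running" || ready[1] != ready[2]) print $1
  }'
}

# Usage against a live cluster (prints nothing when every pod is healthy):
#   oc get pods -n bookinfo --no-headers | not_ready_pods
```

    An empty result means every pod reports Running with all of its containers ready.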

    Generate load and inspect the Kiali graph for traffic (see Figure 1).

    while true; do
      echo "$(date) - Status: $(curl -s -o /dev/null -w '%{http_code}' http://istio-ingressgateway-bookinfo.<yourdomainName>/productpage)"
      sleep 1
    done
    Figure 1: Kiali graph showing the traffic flow for bookinfo.

    Step 2: Configure the circuit breaker

    Circuit breaking is configured in Istio using a DestinationRule. We will apply a policy that monitors the reviews service. Specifically, we'll target the v3 subset. If an instance in this subset returns even a single 5xx error, the Envoy proxy will "eject" it from the load-balancing pool for 300 seconds.

    Apply the following DestinationRule manifest:

    apiVersion: networking.istio.io/v1
    kind: DestinationRule
    metadata:
      name: reviews
      namespace: bookinfo
    spec:
      host: reviews
      subsets:
      - labels:
          version: v1
        name: v1
        trafficPolicy:
          loadBalancer:
            simple: ROUND_ROBIN
      - labels:
          version: v2
        name: v2
        trafficPolicy:
          loadBalancer:
            simple: RANDOM
      - labels:
          version: v3
        name: v3
        trafficPolicy:
          connectionPool:
            http:
              http1MaxPendingRequests: 1
              maxRequestsPerConnection: 1
            tcp:
              maxConnections: 1
          outlierDetection:
            baseEjectionTime: 300s
            consecutive5xxErrors: 1
            interval: 1s
            maxEjectionPercent: 100
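    • consecutive5xxErrors: 1: Ejects an instance after a single 5xx response.
    • interval: 1s: How often Envoy runs its ejection analysis sweep.
    • baseEjectionTime: 300s: The base ejection duration. Envoy multiplies this base time by the number of times the instance has been ejected, up to a maximum, so an ejected instance stays out of the pool for at least 300 seconds.
    • maxEjectionPercent: 100: Allows up to 100% of the instances in the pool to be ejected.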
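
    The baseEjectionTime arithmetic can be illustrated with a small shell sketch. The Envoy documentation describes the actual ejection time as the base time multiplied by the number of times the host has been ejected, capped at a maximum. The 300-second cap used here is an assumption about that default, and ejection_seconds is a hypothetical helper, not part of any CLI.

```shell
# Hypothetical helper: ejection_seconds BASE COUNT MAX
#   BASE  = baseEjectionTime in seconds
#   COUNT = how many times the host has been ejected so far
#   MAX   = upper bound on the ejection time in seconds (assumed value)
ejection_seconds() {
  es_total=$(( $1 * $2 ))
  if [ "$es_total" -gt "$3" ]; then
    es_total=$3
  fi
  echo "$es_total"
}

ejection_seconds 30 1 300    # prints 30
ejection_seconds 30 4 300    # prints 120
ejection_seconds 300 2 300   # prints 300 (capped at MAX)
```

    Under these assumptions, the 300s base used in this tutorial already equals the cap, so repeated ejections would not lengthen the window.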

    Next, we will update our traffic policy to route requests only to the v1 and v3 versions of the reviews service. Apply the following VirtualService manifest to implement this rule: 

    apiVersion: networking.istio.io/v1
    kind: VirtualService
    metadata:
      name: reviews
      namespace: bookinfo
    spec:
      hosts:
      - reviews
      http:
      - route:
        - destination:
            host: reviews
            subset: v1
          weight: 50
        - destination:
            host: reviews
            subset: v2
          weight: 0
        - destination:
            host: reviews
            subset: v3
          weight: 50
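
    After applying the VirtualService, it can help to confirm the effective traffic split. The following sketch assumes the route weights are read back as subset=weight lines; traffic_share is a hypothetical helper that converts them into percentages.

```shell
# Hypothetical helper: read "subset=weight" lines on stdin and print each
# subset's share of the traffic as a percentage of the total weight.
traffic_share() {
  awk -F= '
    { name[NR] = $1; weight[NR] = $2; sum += $2 }
    END {
      for (i = 1; i <= NR; i++)
        printf "%s %d%%\n", name[i], (sum ? weight[i] * 100 / sum : 0)
    }'
}

# Usage against a live cluster:
#   oc get virtualservice reviews -n bookinfo \
#     -o jsonpath='{range .spec.http[0].route[*]}{.destination.subset}={.weight}{"\n"}{end}' \
#     | traffic_share
```

    With the manifest above, v1 and v3 should each show 50% and v2 should show 0%.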

    Step 3: Enable detailed circuit breaker metrics

    By default, Red Hat OpenShift Service Mesh collects a minimal set of statistics from its Envoy proxies to reduce resource consumption and improve performance. The specific metric we need to monitor our circuit breaker, envoy_cluster_outlier_detection_ejections_active, is not included in this default set. Refer to the documentation for more details.

    To enable it, we must add an annotation to our application's pods. This annotation instructs the Envoy sidecar to include additional metrics, specifically those related to outbound cluster statistics. 

    The following is an abbreviated example of the reviews-v3 deployment manifest, modified to include the required annotation. Note the new annotations section under spec.template.metadata.

    apiVersion: apps/v1
    kind: Deployment
    metadata:
      labels:
        app: reviews
        version: v3
      name: reviews-v3
      namespace: bookinfo
    spec:
      progressDeadlineSeconds: 600
      replicas: 1
      revisionHistoryLimit: 10
      selector:
        matchLabels:
          app: reviews
          version: v3
      strategy:
        rollingUpdate:
          maxSurge: 25%
          maxUnavailable: 25%
        type: RollingUpdate
      template:
        metadata:
        # --- ANNOTATION ADDED HERE ---
          annotations:
            proxy.istio.io/config: |
              proxyStatsMatcher:
                inclusionPrefixes:
                - "cluster.outbound"
                - "cluster_manager"
                - "listener_manager"
                - "server"
                - "cluster.xds-grpc"
         # ---------------------------
          creationTimestamp: null
          labels:
            app: reviews
            version: v3
        spec:
         ........

    Important: For this tutorial, the annotation must be applied to all deployments involved in the requests (productpage, reviews-v1, reviews-v2, reviews-v3).
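
    Instead of editing each manifest by hand, the annotation could be applied with oc patch. This is a sketch under assumptions: the Deployment names are inferred from the pod names in Step 1, and annotate_deployments is a hypothetical helper.

```shell
# Deployment names inferred from the pod names shown in Step 1 (assumption).
DEPLOYMENTS="productpage-v1 reviews-v1 reviews-v2 reviews-v3"

# JSON merge patch that sets the proxy.istio.io/config annotation.
# Inside single quotes, \n stays literal and the JSON parser decodes it as a newline.
PATCH='{"spec":{"template":{"metadata":{"annotations":{"proxy.istio.io/config":"proxyStatsMatcher:\n  inclusionPrefixes:\n  - cluster.outbound\n  - cluster_manager\n  - listener_manager\n  - server\n  - cluster.xds-grpc\n"}}}}}'

# Hypothetical helper: patch every Deployment in the list.
annotate_deployments() {
  for d in $DEPLOYMENTS; do
    oc patch deployment "$d" -n bookinfo --type merge -p "$PATCH"
  done
}

# Run against a live cluster:
#   annotate_deployments
```

    Because the patch changes the pod template, each Deployment rolls out new pods, so wait for them to reach Running before generating traffic.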

    Step 4: Generate traffic

    With our configurations in place, we need a steady stream of user traffic. Run the following command in a new terminal window. It sends a request to /productpage every second and prints the HTTP status code of each response.

    while true; do
      echo "$(date) - Status: $(curl -s -o /dev/null -w '%{http_code}' http://istio-ingressgateway-bookinfo.<yourdomainName>/productpage)"
      sleep 1
    done

    In the Kiali service graph (Figure 2), observe the traffic flow. You will see that requests to the reviews service are split between the v1 and v3 versions, and that only the reviews:v3 workload calls the ratings service.

    Figure 2: Kiali graph showing the traffic flow for bookinfo using only review v1 and v3 versions.

    Step 5: Simulate a service failure

    Now, we will deliberately cause the reviews:v3 service to fail. This will generate the 5xx errors needed to trip the circuit breaker we configured earlier. A direct way to simulate a critical failure is to terminate the main process within the container, causing the pod to crash and become temporarily unavailable.

    Execute the kill 1 command inside the reviews container of the reviews-v3 pod, substituting the pod name from your own cluster. This command terminates the main application process, causing the container to exit with an error. See Figure 3.

    oc exec -n bookinfo reviews-v3-6d8f75d44c-fqmzf -c reviews -- kill 1
    Figure 3: OpenShift Container Platform Pod terminal to execute kill command.
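
    Because the pod name differs on every cluster, it may be easier to look it up by label. The sketch below assumes the app=reviews and version=v3 labels from the Deployment manifest in Step 3; both function names are hypothetical.

```shell
# Hypothetical helper: print the name of the first reviews v3 pod,
# selected by the app/version labels from the Deployment manifest.
reviews_v3_pod() {
  oc get pods -n bookinfo -l app=reviews,version=v3 \
    -o jsonpath='{.items[0].metadata.name}'
}

# Hypothetical helper: kill PID 1 in the reviews container of that pod.
kill_reviews_v3() {
  pod=$(reviews_v3_pod) || return 1
  if [ -z "$pod" ]; then
    echo "no reviews v3 pod found" >&2
    return 1
  fi
  oc exec -n bookinfo "$pod" -c reviews -- kill 1
}

# Run against a live cluster:
#   kill_reviews_v3
```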

    Immediately after running this command, look at your traffic generation terminal from Step 4. You will see the output change from 200 to 503 (Service Unavailable) as the Envoy proxy attempts to route requests to the now-unresponsive pod.
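
    To quantify the shift from 200 to 503 rather than watching it scroll by, the loop output can be summarized. This is a minimal sketch that assumes the output was saved to a file named traffic.log (a hypothetical name), for example by appending | tee traffic.log after the loop's done keyword; count_statuses is a hypothetical helper.

```shell
# Hypothetical helper: read the traffic loop's output on stdin and count how
# often each HTTP status code appears (the code is the last field of a line).
count_statuses() {
  awk '{ n[$NF]++ } END { for (code in n) print code, n[code] }' | sort
}

# Usage:
#   count_statuses < traffic.log
```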

    OpenShift will automatically restart the crashed pod, but during this failure window, our circuit breaker will detect the 5xx errors and trip.

    Step 6: Monitor the circuit breaker in the console 

    While the traffic generation script is running, let's observe the circuit breaker in action.

    1. Navigate to the Observe → Metrics section in your OpenShift web console.
    2. In the Expression field of the PromQL UI, enter the following query. It returns every Envoy cluster in the bookinfo namespace that currently has at least one ejected host, which in this tutorial is the reviews v3 cluster.
    envoy_cluster_outlier_detection_ejections_active{namespace='bookinfo'} > 0
    3. Select Run queries.

    You should see a graph where the value is 1, as shown in Figure 4. This indicates that the single instance of reviews:v3 has been ejected. The value will periodically drop to 0 for a brief moment before returning to 1 as the 300-second ejection period expires and the circuit is immediately re-tripped by the next failed request.

    Figure 4: Metrics showing the outlier active status.

    In the Kiali service graph, you can now see the circuit breaker in action. Traffic to the reviews service is being routed exclusively to the healthy v1 version. The path to reviews:v3 shows an open circuit breaker, and no traffic is flowing to it or its downstream ratings service. See Figure 5.

    Figure 5: Kiali graph showing the circuit breaker is open for the reviews:v3.

    You will now observe that requests routed to the reviews:v1 service succeed without issue (Figure 6).

    Figure 6: Bookinfo page showing success for v1 review service.

    Any traffic intended for reviews:v3 will result in an error, as shown in Figure 7. This happens because the circuit breaker is active for its 300-second ejection period, blocking calls to the v3 pod even though it is running.

    Figure 7: Bookinfo page showing error for v3 review service.

    Understanding key outlier detection metrics

    While ejections_active is perfect for seeing the real-time state, Envoy provides a rich set of metrics for a deeper understanding of your circuit breaker's behavior. According to the official Envoy proxy documentation, these statistics give you a more complete picture for monitoring and tuning.
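
    These statistics can also be read straight from the calling sidecar, without going through Prometheus. The sketch below assumes the productpage pod carries the app=productpage label and that its sidecar container is named istio-proxy. outlier_stats is a hypothetical helper that filters the raw Envoy stats output.

```shell
# Hypothetical helper: keep only outlier detection stats for clusters whose
# name contains the given substring (for example "reviews").
outlier_stats() {
  grep 'outlier_detection\.' | grep -- "$1"
}

# Usage against a live cluster. The stats live in the caller's sidecar,
# and productpage is the service that calls reviews:
#   pod=$(oc get pods -n bookinfo -l app=productpage -o jsonpath='{.items[0].metadata.name}')
#   oc exec -n bookinfo "$pod" -c istio-proxy -- pilot-agent request GET stats | outlier_stats reviews
```

    Only statistics matched by the proxyStatsMatcher prefixes configured in Step 3 appear in this output.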

    Wrap up

    Congratulations! You have successfully configured a circuit breaker for a microservice, simulated a failure by killing the application process, and monitored the circuit's state in real time using metrics in the OpenShift console. This resilience pattern is a fundamental tool for building robust, self-healing applications with Red Hat OpenShift Service Mesh.
