How to implement and monitor circuit breakers in OpenShift Service Mesh 3

September 29, 2025
Shailendra Kumar Singh
Related topics:
Application modernization, DevOps, Microservices, Service Mesh
Related products:
Red Hat OpenShift Container Platform, Red Hat OpenShift Service Mesh


    In a distributed microservices architecture, the failure of one service can cascade, leading to system-wide outages. To build resilient and fault-tolerant applications, we must isolate failures and prevent them from spreading. The circuit breaker is a critical design pattern that addresses this challenge by temporarily blocking traffic to a service that it detects as unhealthy, giving it time to recover.

    This guide shows you how to configure, trigger, and monitor a circuit breaker using Red Hat OpenShift Service Mesh 3.0. By the end, you'll have a hands-on understanding of how to use Istio's outlier detection to automatically improve your application's stability on OpenShift.

    Prerequisites

    Before you begin, ensure your environment is fully prepared. This guide assumes you have the following setup:

    • An OpenShift Container Platform cluster: You will need access to a cluster running version 4.16 or newer with administrator privileges.
    • Command-line tools: The OpenShift CLI (oc) and Kubernetes CLI (kubectl) must be installed and configured to connect to your cluster.
    • Red Hat OpenShift Service Mesh and the Bookinfo sample application: You need a project (for example, bookinfo) where the OpenShift Service Mesh control plane is installed and the Bookinfo sample application is deployed. The application's product page must be accessible via the Istio ingress gateway.

      If you need to set this up, follow the official Red Hat documentation to install OpenShift Service Mesh and deploy the Bookinfo application. (Complete sections 2.1 through 2.5.3 of the tutorial.)

    • Kiali for monitoring: This tutorial uses Kiali to visualize the circuit breaker's status. Ensure you have configured access to the Kiali console.

      To set this up, follow the official documentation to expose and access the Kiali console. (Complete sections 4.1.1 through 4.1.3.)

    Step-by-step instructions

    Follow these steps to deploy the application, configure the circuit breaker, and monitor the results.

    Step 1: Preparation

    First, verify that the Bookinfo application is running correctly and that all pods are in a Running state.

    oc get pods -n bookinfo

    You should see output similar to this, with pods for productpage, details, ratings, and three versions of reviews, along with istio-ingressgateway.

    NAME                                   READY   STATUS    RESTARTS      AGE
    details-v1-7c799b8b4b-7npbl            2/2     Running   0             9d
    istio-ingressgateway-7bb7fb8fd-8sbxr   1/1     Running   0             9d
    productpage-v1-f8479c768-s72st         2/2     Running   0             9d
    ratings-v1-7fccfc8b8b-dr6xp            2/2     Running   4 (9d ago)    18d
    reviews-v1-8cc49957f-gswj6             2/2     Running   0             9d
    reviews-v2-5bf9856f5c-bcswn            2/2     Running   0             9d
    reviews-v3-6d8f75d44c-fqmzf            2/2     Running   3 (17h ago)   17h
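    Checking the READY and STATUS columns by eye gets tedious as the pod list grows. As a minimal sketch, the helper below filters the table for pods that are not fully ready. The name not_ready_pods is hypothetical (it is not part of oc), and the function assumes the default column layout shown above, including the header row.

```shell
# Print the names of pods whose READY count is incomplete (for example 1/2)
# or whose STATUS is anything other than Running.
# Reads the table printed by `oc get pods` on stdin and skips the header row.
not_ready_pods() {
  awk 'NR > 1 {
    split($2, ready, "/")
    if (ready[1] != ready[2] || $3 != "Running") print $1
  }'
}

# Usage against a live cluster (assumes the bookinfo namespace from this guide):
#   oc get pods -n bookinfo | not_ready_pods
```

    An empty result means every pod in the namespace is ready, so you can move on to generating load.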

    Generate load and inspect the Kiali graph for traffic (see Figure 1).

    while true; do
      echo "$(date) - Status: $(curl -s -o /dev/null -w '%{http_code}' http://istio-ingressgateway-bookinfo.<yourdomainName>/productpage)"
      sleep 1
    done
    Figure 1: Kiali graph showing the traffic flow for bookinfo.

    Step 2: Configure the circuit breaker

    Circuit breaking is configured in Istio using a DestinationRule. We will apply a policy that monitors the reviews service. Specifically, we'll target the v3 subset. If an instance in this subset returns even a single 5xx error, the Envoy proxy will "eject" it from the load-balancing pool for 300 seconds.

    Apply the following DestinationRule manifest:

    apiVersion: networking.istio.io/v1
    kind: DestinationRule
    metadata:
      name: reviews
      namespace: bookinfo
    spec:
      host: reviews
      subsets:
      - labels:
          version: v1
        name: v1
        trafficPolicy:
          loadBalancer:
            simple: ROUND_ROBIN
      - labels:
          version: v2
        name: v2
        trafficPolicy:
          loadBalancer:
            simple: RANDOM
      - labels:
          version: v3
        name: v3
        trafficPolicy:
          connectionPool:
            http:
              http1MaxPendingRequests: 1
              maxRequestsPerConnection: 1
            tcp:
              maxConnections: 1
          outlierDetection:
            baseEjectionTime: 300s
            consecutive5xxErrors: 1
            interval: 1s
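            maxEjectionPercent: 100

    The outlierDetection settings work as follows:

    • consecutive5xxErrors: 1: Ejects an instance after a single 5xx response.
    • interval: 1s: How often Envoy runs its ejection analysis sweep.
    • baseEjectionTime: 300s: The minimum ejection duration. Istio multiplies this value by the number of times the instance has been ejected, so an instance that keeps failing can stay out longer.
    • maxEjectionPercent: 100: Allows up to 100% of the instances in the subset to be ejected.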
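    The effect of baseEjectionTime is easier to see with numbers. The sketch below is illustrative arithmetic only, not a call into Istio or Envoy: it multiplies the base time by the ejection count, as the Istio DestinationRule reference describes, and then applies an upper bound. The function name ejection_seconds is hypothetical, and the 300-second default for the bound assumes that Envoy's max_ejection_time default applies; check the Envoy documentation for your mesh version.

```shell
# ejection_seconds BASE COUNT [MAX]
# BASE  = baseEjectionTime in seconds
# COUNT = number of times the instance has been ejected
# MAX   = upper bound in seconds (assumed default: 300)
ejection_seconds() {
  base=$1
  count=$2
  max=${3:-300}
  secs=$(( base * count ))
  if [ "$secs" -gt "$max" ]; then
    secs=$max
  fi
  echo "$secs"
}

ejection_seconds 30 3    # prints 90
ejection_seconds 300 2   # prints 300 (bounded by MAX)
```

    Under these assumptions, the 300s value used in this tutorial already sits at the bound, so repeated ejections would not lengthen the ejection window.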

    Next, we will update our traffic policy to route requests only to the v1 and v3 versions of the reviews service. Apply the following VirtualService manifest to implement this rule: 

    apiVersion: networking.istio.io/v1
    kind: VirtualService
    metadata:
      name: reviews
      namespace: bookinfo
    spec:
      hosts:
      - reviews
      http:
      - route:
        - destination:
            host: reviews
            subset: v1
          weight: 50
        - destination:
            host: reviews
            subset: v2
          weight: 0
        - destination:
            host: reviews
            subset: v3
          weight: 50
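    One way to apply both manifests is to save them to files and pass them to oc apply. This is a sketch under stated assumptions: the file names reviews-destinationrule.yaml and reviews-virtualservice.yaml and the function name apply_reviews_policy are placeholders chosen for this example.

```shell
# Apply the DestinationRule and VirtualService, then list both resource types
# so you can confirm the objects exist in the bookinfo namespace.
# The default file names are placeholders; pass your own as arguments.
apply_reviews_policy() {
  dr_file=${1:-reviews-destinationrule.yaml}
  vs_file=${2:-reviews-virtualservice.yaml}
  oc apply -n bookinfo -f "$dr_file" -f "$vs_file" &&
    oc get destinationrule,virtualservice -n bookinfo
}

# Usage:
#   apply_reviews_policy
#   apply_reviews_policy my-dr.yaml my-vs.yaml
```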

    Step 3: Enable detailed circuit breaker metrics

    By default, Red Hat OpenShift Service Mesh collects a minimal set of statistics from its Envoy proxies to reduce resource consumption and improve performance. The specific metric we need to monitor our circuit breaker, envoy_cluster_outlier_detection_ejections_active, is not included in this default set. Refer to the documentation for more details.

    To enable it, we must add an annotation to our application's pods. This annotation instructs the Envoy sidecar to include additional metrics, specifically those related to outbound cluster statistics. 

    The following is an abbreviated example of the reviews-v3 deployment manifest, modified to include the required annotation. Note the new annotations section under spec.template.metadata.

    apiVersion: apps/v1
    kind: Deployment
    metadata:
      labels:
        app: reviews
        version: v3
      name: reviews-v3
      namespace: bookinfo
    spec:
      progressDeadlineSeconds: 600
      replicas: 1
      revisionHistoryLimit: 10
      selector:
        matchLabels:
          app: reviews
          version: v3
      strategy:
        rollingUpdate:
          maxSurge: 25%
          maxUnavailable: 25%
        type: RollingUpdate
      template:
        metadata:
          # --- ANNOTATION ADDED HERE ---
          annotations:
            proxy.istio.io/config: |
              proxyStatsMatcher:
                inclusionPrefixes:
                - "cluster.outbound"
                - "cluster_manager"
                - "listener_manager"
                - "server"
                - "cluster.xds-grpc"
          # ---------------------------
          creationTimestamp: null
          labels:
            app: reviews
            version: v3
        spec:
          # ... (remainder of the pod spec omitted)

    Important: For this tutorial, the annotation must be applied to all deployments involved in the requests (productpage, reviews-v1, reviews-v2, reviews-v3).
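    Editing four deployment manifests by hand is error-prone, so a patch loop can help. The following is a minimal sketch rather than the only way to do it: stats_patch and annotate_deployments are hypothetical helper names, and the deployment names in the usage line are inferred from the pod names shown in Step 1.

```shell
# Emit a merge patch (as YAML) that adds the proxyStatsMatcher annotation
# to a Deployment's pod template.
stats_patch() {
  cat <<'EOF'
spec:
  template:
    metadata:
      annotations:
        proxy.istio.io/config: |
          proxyStatsMatcher:
            inclusionPrefixes:
            - "cluster.outbound"
            - "cluster_manager"
            - "listener_manager"
            - "server"
            - "cluster.xds-grpc"
EOF
}

# Patch every deployment named on the command line. Changing the pod template
# starts a new rollout, so the sidecars restart with the extra stats enabled.
annotate_deployments() {
  for d in "$@"; do
    oc patch deployment "$d" -n bookinfo --type merge -p "$(stats_patch)" || return 1
  done
}

# Usage (deployment names assumed from the Step 1 pod list):
#   annotate_deployments productpage-v1 reviews-v1 reviews-v2 reviews-v3
```

    Because the patch restarts the pods, wait for the rollouts to finish before generating traffic.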

    Step 4: Generate traffic

    With the configuration in place, generate a steady stream of user traffic. Run the following loop in a new terminal window. It sends a request to /productpage every second and prints the HTTP status code of each response.

    while true; do
      echo "$(date) - Status: $(curl -s -o /dev/null -w '%{http_code}' http://istio-ingressgateway-bookinfo.<yourdomainName>/productpage)"
      sleep 1
    done

    In the Kiali service graph (Figure 2), observe the traffic flow. You will see that requests to the reviews service are split between the v1 and v3 versions, and that only the reviews:v3 workload calls the ratings service.

    Figure 2: Kiali graph showing the traffic flow for bookinfo using only review v1 and v3 versions.

    Step 5: Simulate a service failure

    Now, we will deliberately cause the reviews:v3 service to fail. This will generate the 5xx errors needed to trip the circuit breaker we configured earlier. A direct way to simulate a critical failure is to terminate the main process within the container, causing the pod to crash and become temporarily unavailable.

    Execute the kill 1 command inside the reviews container of that specific pod. This command terminates the main application process, causing the container to exit with an error. See Figure 3.

    oc exec -n bookinfo reviews-v3-6d8f75d44c-fqmzf -c reviews -- kill 1
    Figure 3: OpenShift Container Platform Pod terminal to execute kill command.
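    The pod name in the command above is specific to one cluster and changes whenever the pod is recreated. As a sketch, you can look the pod up by the app and version labels from the reviews-v3 deployment instead. The function names reviews_v3_pod and kill_reviews_v3 are hypothetical.

```shell
# Print the name of the first pod that carries the reviews v3 labels.
reviews_v3_pod() {
  oc get pods -n bookinfo -l app=reviews,version=v3 \
    -o jsonpath='{.items[0].metadata.name}'
}

# Terminate PID 1 in the reviews container of that pod.
kill_reviews_v3() {
  pod=$(reviews_v3_pod) || return 1
  if [ -z "$pod" ]; then
    echo "no reviews v3 pod found in bookinfo" >&2
    return 1
  fi
  oc exec -n bookinfo "$pod" -c reviews -- kill 1
}

# Usage:
#   kill_reviews_v3
```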

    Immediately after running this command, look at your traffic generation terminal from Step 4. You will see the output change from 200 to 503 (Service Unavailable) as the Envoy proxy attempts to route requests to the now-unresponsive pod.

    OpenShift will automatically restart the crashed pod, but during this failure window, our circuit breaker will detect the 5xx errors and trip.
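    Watching status codes scroll past makes it hard to judge how many requests failed. A small sketch for summarizing them follows, assuming you saved the loop's output to a file (for example by piping it through tee traffic.log). The name status_summary and the log file name are hypothetical.

```shell
# Count responses per HTTP status code.
# Expects lines in the format printed by the traffic loop, ending in
# "Status: <code>", on stdin.
status_summary() {
  awk '/Status: / { count[$NF]++ }
       END { for (code in count) print code, count[code] }' | sort
}

# Usage:
#   status_summary < traffic.log
```

    Each output line pairs a status code with the number of times it appeared, which makes the shift from 200 to 503 responses easy to quantify.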

    Step 6: Monitor the circuit breaker in the console 

    While the traffic generation script is running, let's observe the circuit breaker in action.

    1. Navigate to the Observe → Metrics section in your OpenShift web console.
    2. In the Expression field of the PromQL UI, enter the following query. It returns each series in the bookinfo namespace that currently reports at least one ejected host, which in this tutorial is the reviews v3 cluster.
    envoy_cluster_outlier_detection_ejections_active{namespace='bookinfo'} > 0
    3. Select Run queries.

    You should see a graph where the value is 1, as shown in Figure 4. This indicates that the single instance of reviews:v3 has been ejected. The value will periodically drop to 0 for a brief moment before returning to 1 as the 300-second ejection period expires and the circuit is immediately re-tripped by the next failed request.

    Figure 4: Metrics showing the outlier detection active status.

    In the Kiali service graph, you can now see the circuit breaker in action. Traffic to the reviews service is being routed exclusively to the healthy v1 version. The path to reviews:v3 shows an open circuit breaker, and no traffic is flowing to it or its downstream ratings service. See Figure 5.

    Figure 5: Kiali graph showing the circuit breaker is open for the reviews:v3.

    You will now observe that requests routed to the reviews:v1 service succeed without issue (Figure 6).

    Figure 6: Bookinfo page showing success for v1 review service.

    Any traffic intended for reviews:v3 will result in an error, as shown in Figure 7. This happens because the circuit breaker is active for its 300-second ejection period, blocking calls to the v3 pod even though it is running.

    Figure 7: Bookinfo page showing error for v3 review service.

    Understanding key outlier detection metrics

    While ejections_active is ideal for seeing the real-time state, Envoy exposes a broader set of outlier detection statistics for understanding your circuit breaker's behavior. The official Envoy proxy documentation describes each of them; together they give you a more complete picture for monitoring and tuning.
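    You can also read these statistics straight from a sidecar, without going through Prometheus. The sketch below assumes the stats annotation from Step 3 is in place and that you query the calling workload's proxy (productpage in this guide), because ejections are tracked by the client-side Envoy. The function names outlier_stats and active_ejections are hypothetical; counters such as ejections_enforced_total and ejections_detected_consecutive_5xx appear alongside ejections_active in the output.

```shell
# Dump Envoy stats from a pod's istio-proxy container and keep only the
# outlier detection lines.
outlier_stats() {
  pod=$1
  oc exec -n bookinfo "$pod" -c istio-proxy -- pilot-agent request GET stats |
    grep 'outlier_detection'
}

# Read stats lines ("<name>: <value>") on stdin and print the total number
# of hosts that are currently ejected across all clusters.
active_ejections() {
  awk -F': ' '/outlier_detection\.ejections_active/ { total += $2 }
              END { print total + 0 }'
}

# Usage (pod name taken from the Step 1 output; yours will differ):
#   outlier_stats productpage-v1-f8479c768-s72st
#   outlier_stats productpage-v1-f8479c768-s72st | active_ejections
```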

    Wrap up

    Congratulations! You have successfully configured a circuit breaker for a microservice, simulated a failure by terminating the application process, and monitored the circuit's state in real time using metrics in the OpenShift console. This resilience pattern is a fundamental tool for building robust, self-healing applications with Red Hat OpenShift Service Mesh.
