Evaluate OpenShift cluster health with the cluster observability operator

Introducing the component health overview in cluster observability 1.4, now available in Dev Preview

March 24, 2026
Tomas Remes
Related topics: Kubernetes, Observability, Operators
Related products: Red Hat OpenShift

    Evaluating overall cluster health is complex. To help, the cluster observability operator for Red Hat OpenShift now includes a component health overview, which is currently available as a Developer Preview feature.

    This overview helps you assess the status of the OpenShift control plane and other integrated components. It displays health information in a custom Perses dashboard and classifies each component as OK, warning, or error, which helps administrators quickly identify the cluster components that require immediate attention. Component health was introduced in cluster-health-analyzer version 1.1 and is available in cluster-observability-operator 1.4 or later.

    Install the cluster observability operator

    The component health feature is part of the cluster observability operator 1.4 or later. You can install this operator using OperatorHub in the Red Hat OpenShift Container Platform web console.

    1. Select the Enable Operator recommended cluster monitoring on this Namespace check box, as shown in Figure 1. Otherwise, the component health overview will not be available.

      Figure 1: Enabling Operator recommended cluster monitoring during namespace creation.
    2. Create the monitoring UI plug-in by applying the following YAML definition:

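      oc apply -f - <<EOF
      apiVersion: observability.openshift.io/v1alpha1
      kind: UIPlugin
      metadata:
        name: monitoring
      spec:
        monitoring:
          # Deploys the health-analyzer that exposes the component health metrics
          clusterHealthAnalyzer:
            enabled: true
          # Enables Perses dashboards in the monitoring UI plug-in
          perses:
            enabled: true
        type: Monitoring
      EOF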
    3. Wait for the OpenShift web console to refresh.

    The health-analyzer pod is now running in the openshift-cluster-observability-operator namespace. You can check the status with the following command:

    oc get pod -l app.kubernetes.io/instance=health-analyzer -n openshift-cluster-observability-operator

    After the installation, the following Prometheus metrics are available in the cluster:

    • component_health
    • component_health_object
    • component_health_alert

    Components tree

    When you explore the Prometheus metrics, you might see series such as component_health_object{component="control-plane.nodes"} and component_health{component="control-plane"}. Together, these series reflect a parent-child relationship in which control-plane has a child component called nodes. This hierarchy forms a tree that is defined in a ConfigMap in the cluster:

    components:
      - name: control-plane
        children:
        - name: nodes
          objects:
          - resource: nodes
            selectors:
            - matchLabels:
                node-role.kubernetes.io/control-plane: []
          - resource: machineconfigpools
            group: machineconfiguration.openshift.io
            selectors:
            - matchLabels:
                pools.operator.machineconfiguration.openshift.io/master: []
        - name: capacity
          children:
          - name: cpu
            alerts:
              selectors:
              - matchLabels:
                  alertname: ["KubeCPUOvercommit","HighOverallControlPlaneCPU", "ExtremelyHighIndividualControlPlaneCPU"]
          - name: memory
            alerts:
              selectors:
              - matchLabels:
                  alertname: ["HighOverallControlPlaneMemory", "ExtremelyHighIndividualControlPlaneMemory", "SystemMemoryExceedsReservation"]
        - name: operators
          children:
          - name: etcd
            alerts:
              selectors:
              - matchLabels:
                  namespace: ["openshift-etcd","openshift-etcd-operator"]
      - name: addons
        children:
        - name: kubevirt
          alerts:
            selectors:
            - matchLabels:
                kubernetes_operator_part_of: ["kubevirt"]
            - matchLabels:
                namespace: ["openshift-cnv"]
          objects:
          - group: kubevirt.io
            resource: kubevirts
            namespace: openshift-cnv

    The health status of a component is exposed through both the status label and the metric value: 0 for OK, 1 for warning, and 2 for error. A parent component's health status is determined by its child components, and the most severe status propagates upward. For example, if a node is in an error state, the control-plane component also reports an error status.
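
The roll-up rule described above can be sketched in a few lines of Python. This is only an illustration of how the most severe status propagates upward under the 0/1/2 mapping. The function name, the nested-dict tree shape, and the sample statuses are hypothetical, and the operator's actual implementation may differ.

```python
# Metric value -> status name, as described in the text.
STATUS_NAMES = {0: "OK", 1: "warning", 2: "error"}


def rollup(component, leaf_values):
    """Return the health value (0, 1, or 2) for `component`.

    `component` is a dict with a "name" and an optional "children" list.
    `leaf_values` maps leaf names to their own health value; leaves that
    are missing from it are treated as OK (an assumption of this sketch).
    A parent takes the most severe (highest) value among its children.
    """
    children = component.get("children", [])
    if not children:
        return leaf_values.get(component["name"], 0)
    return max(rollup(child, leaf_values) for child in children)


if __name__ == "__main__":
    control_plane = {
        "name": "control-plane",
        "children": [
            {"name": "nodes"},
            {"name": "capacity", "children": [{"name": "cpu"}, {"name": "memory"}]},
            {"name": "operators", "children": [{"name": "etcd"}]},
        ],
    }
    # Hypothetical input: nodes report an error, cpu reports a warning.
    value = rollup(control_plane, {"nodes": 2, "cpu": 1})
    print(STATUS_NAMES[value])  # prints "error"
```

Because the numeric values are ordered by severity, taking the maximum over the children is enough to express the rule.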

    Perses dashboard

    Components and their health statuses form a tree. We plan to visualize this hierarchy in the OpenShift web console observability overview. At this time, you can interact with this data through a Perses dashboard, which is a Developer Preview feature.

    Create the Perses dashboard with the following command:

    oc apply -f - <<'EOF'
    apiVersion: perses.dev/v1alpha2
    kind: PersesDashboard
    metadata:
      name: component-health-dashboard
      namespace: openshift-cluster-observability-operator
    spec:
      config:
        display:
          name: Component Health Dashboard
        duration: 1h
        layouts:
          - kind: Grid
            spec:
              display:
                title: Component Health Overview
              items:
                - content:
                    $ref: '#/spec/panels/0_0'
                  height: 8
                  width: 24
                  x: 0
                  'y': 0
          - kind: Grid
            spec:
              display:
                title: Component Details
              items:
                - content:
                    $ref: '#/spec/panels/1_0'
                  height: 8
                  width: 24
                  x: 0
                  'y': 0
        panels:
          '0_0':
            kind: Panel
            spec:
              display:
                name: Top level components
              plugin:
                kind: Table
                spec:
                  cellSettings:
                    - condition:
                        kind: Value
                        spec:
                          value: warning
                      text: WARNING
                      textColor: '#ffb700'
                    - condition:
                        kind: Value
                        spec:
                          value: error
                      text: ERROR
                      textColor: '#ff0000'
                    - condition:
                        kind: Value
                        spec:
                          value: OK
                      text: OK
                      textColor: '#23c200'
                  columnSettings:
                    - hide: true
                      name: timestamp
                    - hide: true
                      name: value
                  density: comfortable
              queries:
                - kind: TimeSeriesQuery
                  spec:
                    plugin:
                      kind: PrometheusTimeSeriesQuery
                      spec:
                        query: 'sum without(job,instance,container,endpoint,namespace,pod,prometheus,service) (component_health)'
                        seriesNameFormat: '{{component}}'
          '1_0':
            kind: Panel
            spec:
              display:
                name: 'Component Details: ${component}'
              plugin:
                kind: Table
                spec:
                  cellSettings:
                    - condition:
                        kind: Value
                        spec:
                          value: warning
                      text: WARNING
                      textColor: '#ffb700'
                    - condition:
                        kind: Value
                        spec:
                          value: error
                      text: ERROR
                      textColor: '#ff0000'
                    - condition:
                        kind: Value
                        spec:
                          value: OK
                      text: OK
                      textColor: '#23c200'
                  columnSettings:
                    - hide: true
                      name: timestamp
                    - hide: true
                      name: value
                    - name: component
                    - name: name
                    - name: resource
                    - name: progressing
                    - name: status
                  enableFiltering: true
                  transforms:
                    - kind: MergeColumns
                      spec:
                        columns:
                          - name
                          - src_alertname
                        name: name
              queries:
                - kind: TimeSeriesQuery
                  spec:
                    plugin:
                      kind: PrometheusTimeSeriesQuery
                      spec:
                        query: 'sum by(component,name,progressing,resource,status,src_alertname) (component_health_object{component=~"${component}.*"} or component_health_alert{component=~"${component}.*"})'
        refreshInterval: 30s
        variables:
          - kind: ListVariable
            spec:
              allowAllValue: true
              allowMultiple: false
              defaultValue: $__all
              display:
                description: Select a component to view detailed health information. Use 'All Components' to see everything.
                hidden: false
                name: Component Filter
              name: component
              plugin:
                kind: PrometheusLabelValuesVariable
                spec:
                  labelName: component
                  matchers:
                    - 'component_health{}'
    EOF

    This dashboard defines two tables. The Component Health Overview table, shown in Figure 2, provides a health overview of the top-level components (those with child components).

    Figure 2: Health overview of top-level components.
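
The overview panel's query wraps component_health in sum without(...), which drops the listed scrape-related labels so that series differing only in those labels collapse into one row per remaining label set. The Python sketch below imitates that aggregation on made-up samples. The label values and the sum_without helper are hypothetical and only illustrate the PromQL semantics.

```python
from collections import defaultdict

# Labels removed by the dashboard's `sum without(...)` clause.
DROPPED = {"job", "instance", "container", "endpoint",
           "namespace", "pod", "prometheus", "service"}


def sum_without(samples, dropped=DROPPED):
    """Group samples by their labels minus `dropped`, then sum the values.

    `samples` is a list of (labels_dict, value) pairs.
    """
    totals = defaultdict(float)
    for labels, value in samples:
        key = tuple(sorted((k, v) for k, v in labels.items() if k not in dropped))
        totals[key] += value
    return dict(totals)


if __name__ == "__main__":
    # Made-up samples for illustration only; not read from a cluster.
    samples = [
        ({"component": "control-plane", "status": "warning", "pod": "health-analyzer-abc"}, 1),
        ({"component": "addons", "status": "OK", "pod": "health-analyzer-abc"}, 0),
    ]
    for key, total in sum_without(samples).items():
        print(dict(key), total)
```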

    The second table, Component Details, lists all child components (Figure 3).

    Figure 3: Component Details table listing all child components.
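
The Component Details query filters with component=~"${component}.*". Prometheus anchors regex matchers to the whole label value, so this pattern selects the chosen component and anything whose label starts with the same text. The sketch below imitates that behavior with Python's re.fullmatch. The label values are assumed from the component tree shown earlier, the helper is hypothetical, and Python's regex dialect is only an approximation of the RE2 syntax that Prometheus uses.

```python
import re


def select_components(selected, labels):
    """Return the labels matched by the dashboard-style pattern '<selected>.*'.

    re.fullmatch mirrors the full anchoring of Prometheus regex matchers.
    """
    pattern = re.compile(f"{selected}.*")
    return [label for label in labels if pattern.fullmatch(label)]


if __name__ == "__main__":
    # Label values assumed from the component tree; not read from a cluster.
    labels = [
        "control-plane.nodes",
        "control-plane.capacity.cpu",
        "control-plane.operators.etcd",
        "addons.kubevirt",
    ]
    print(select_components("control-plane", labels))
    print(select_components("addons", labels))
```

Because the match is anchored at the start of the label, a child name on its own, such as nodes, would select nothing in this sketch.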

    Limitations and next steps

    The Perses dashboard is a Developer Preview feature and might have limitations and bugs, particularly in its tables. For instance, table column filtering appears to work, but the corresponding values in other columns might be displayed incorrectly.

    We plan to add a drill-down component to the Observability view in the OpenShift web console.

    Future plans include allowing cluster administrators to extend the component tree definition by adding custom components.

    Share your questions and recommendations with us using the Red Hat OpenShift feedback form.
