Get started with the OpenShift Cluster Observability Operator

July 9, 2024
Michael Greenberg
Related topics:
DevOps, Integration, Observability
Related products:
Red Hat OpenShift Container Platform


    There are cases in which a Red Hat OpenShift team supporting a central monitoring stack can struggle to fulfill application teams' requests or demands for Prometheus metrics and configuration changes. For instance:

    • One department wants metrics retained for 3 months, whereas another wants them retained for 6 months.
    • One department needs remote_write to send metrics matching one pattern to one external server and metrics matching another pattern to a different server, and it changes these requirements every few weeks.
    • One department uses a Prometheus exporter to dump all of its data into Prometheus for easy viewing in Grafana. This overloads the Prometheus server, which then consumes many gigabytes of RAM and gives other departments poor query response times.

    These scenarios are problematic for teams supporting the existing OpenShift User Workload Monitoring. A single ConfigMap, user-workload-monitoring-config, configures User Workload Monitoring, and its settings apply to all user workload metrics. Only cluster administrators can modify this ConfigMap, so there is overhead whenever a department requires unique settings. Furthermore, some settings are global for the entire cluster and cannot be configured for a subset of cluster namespaces. This centralized configuration is beneficial in many cases because of its simplicity, but in more complex setups and organizations it sometimes doesn’t provide enough flexibility.

    OpenShift Cluster Observability Operator

    To support more complex scenarios, Red Hat recently announced the Cluster Observability Operator (COO), a new OpenShift Operator designed to manage observability stacks on OpenShift clusters.

    COO is now available as a technology preview for all OpenShift users, introducing the Red Hat Observability MonitoringStack custom resource definition (CRD) as an initial feature set, which lets you run highly available monitoring stacks consisting of Prometheus, Alertmanager, and Thanos Querier.

    This article provides an example of how to use the Cluster Observability Operator.

    COO installation

    First, create an OpenShift project named coo-demo for resources in this demo. Run the following:

    oc new-project coo-demo

    We will use the OpenShift Operator Lifecycle Manager (OLM) to install the Cluster Observability Operator. In the OpenShift Administrator menu, select Operators and then OperatorHub. Search for cluster observability, as shown in Figure 1.

    Figure 1: Installing the Cluster Observability Operator on OpenShift.

    Select the operator and click Install. On the Install Operator page, accept all the default settings and click Install. Wait for the "Installed operator: ready for use" message to appear.

    COO instance creation

    Use the oc command (for example, oc apply -f with the CR saved to a file) to create a Red Hat Observability MonitoringStack, the equivalent of a Prometheus stack, using the CR description below. This specification runs a single Prometheus replica, disables Alertmanager, and retains metrics for 2 days:

    apiVersion: monitoring.rhobs/v1alpha1
    kind: MonitoringStack
    metadata:
     labels:
       coo: coo-monitoring-stack
     name: coo-monitoring-stack
    spec:
     alertmanagerConfig:
       disabled: true
     prometheusConfig:
       replicas: 1
     retention: 2d
     resourceSelector:
       matchLabels:
         monitoredby: coo-monitoring-stack

    Additional directives could be used to specify persistent storage. However, that is out of the scope of this demo.
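
    The retention value uses Prometheus duration syntax. As a rough sketch of how such a value maps to a time span, the following Python converts a duration string to a timedelta. The helper name parse_retention is hypothetical and not part of COO or Prometheus, and the sketch assumes the unit set ms, s, m, h, d, w, y, with a week counted as 7 days and a year as 365 days:

```python
import re
from datetime import timedelta

# Seconds per unit, following Prometheus duration conventions
# (assumed here: a week is 7 days and a year is 365 days).
_UNIT_SECONDS = {
    "ms": 0.001,
    "s": 1,
    "m": 60,
    "h": 3600,
    "d": 86400,
    "w": 7 * 86400,
    "y": 365 * 86400,
}

# "ms" is listed first so it is not mistaken for "m" followed by "s".
_PART = re.compile(r"(\d+)(ms|s|m|h|d|w|y)")


def parse_retention(value: str) -> timedelta:
    """Convert a duration string such as '2d' or '1h30m' to a timedelta."""
    pos = 0
    total = 0.0
    for match in _PART.finditer(value):
        if match.start() != pos:
            # Unrecognized characters between two valid parts.
            raise ValueError(f"invalid duration: {value!r}")
        total += int(match.group(1)) * _UNIT_SECONDS[match.group(2)]
        pos = match.end()
    if pos != len(value) or pos == 0:
        raise ValueError(f"invalid duration: {value!r}")
    return timedelta(seconds=total)


print(parse_retention("2d"))   # the retention used in this demo
print(parse_retention("90d"))  # roughly 3 months
```

    Because every MonitoringStack carries its own retention value, departments with different retention needs can each run a stack with a suitable setting.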

    Next, create a Red Hat Observability ThanosQuerier instance using the CR description below. It gathers the data needed to evaluate PromQL queries from the MonitoringStack whose label matches its selector:

    kind: ThanosQuerier
    apiVersion: monitoring.rhobs/v1alpha1
    metadata:
     name: coo-demo
    spec:
     selector:
       matchLabels:
         coo: coo-monitoring-stack

    Create an OpenShift Route to the ThanosQuerier by running the following command:

    oc expose service thanos-querier-coo-demo

    Demo application installation

    We will now create a Python application that includes a very simple web server. The web server accepts requests to the root URL (/) and returns HTTP 200 (OK). For any other URL, it returns HTTP 404 (Not Found). The application also counts the HTTP 200 and HTTP 404 responses it returns and exposes these counts as metrics on port 9000 for our MonitoringStack to scrape.

    Use the following manifest to create a Deployment with the Python application:

    apiVersion: apps/v1
    kind: Deployment
    metadata:
     name: coo-demo
    spec:
     replicas: 1
     selector:
       matchLabels:
         app: coo-demo
     template:
       metadata:
         labels:
           app: coo-demo
       spec:
         containers:
         - name: httpserver
           image: registry.access.redhat.com/ubi9/python-311:1
           command:
           - bash
           - -c
           - |2-
             pip install prometheus_client && python - <<EOF
             from http.server import BaseHTTPRequestHandler, HTTPServer
             from prometheus_client import start_http_server, Counter
             class HTTPRequestHandler(BaseHTTPRequestHandler):
               def do_GET(self):
                 if self.path == '/':
                   self.send_response(200)
                   self.end_headers()
                   self.wfile.write(b'<html>Hello!</html>\n')
                   respCtr.labels(response='200').inc()
                 else:
                   self.send_error(404)
                   respCtr.labels(response='404').inc()
             start_http_server(9000)
             respCtr = Counter('coo_responses','Responses',["response"])
             HTTPServer(("", 8080), HTTPRequestHandler).serve_forever()
             EOF
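
    The prometheus_client library serves the counter on port 9000 in the Prometheus text exposition format and appends a _total suffix to counter names, so the Counter named coo_responses is exposed as coo_responses_total. A minimal sketch of reading such output follows. The SAMPLE text is illustrative, not captured from a real run, and parse_samples is a hypothetical helper that handles only the simple label syntax shown here:

```python
import re

# Illustrative exposition text (assumed values, not real output).
SAMPLE = '''\
# HELP coo_responses_total Responses
# TYPE coo_responses_total counter
coo_responses_total{response="200"} 3.0
coo_responses_total{response="404"} 2.0
'''

# metric_name, optional {labels}, then the sample value.
_LINE = re.compile(r'^(\w+)(?:\{(.*)\})?\s+(\S+)$')
_LABEL = re.compile(r'(\w+)="([^"]*)"')


def parse_samples(text):
    """Return a list of (metric_name, labels_dict, value) tuples."""
    samples = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        match = _LINE.match(line)
        if match is None:
            continue  # ignore lines this simple parser does not understand
        name, raw_labels, value = match.groups()
        labels = dict(_LABEL.findall(raw_labels or ""))
        samples.append((name, labels, float(value)))
    return samples


for name, labels, value in parse_samples(SAMPLE):
    print(name, labels, value)
```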

    Create a Service for the Deployment above using the following manifest:

    kind: Service
    apiVersion: v1
    metadata:
     labels:
       app: coo-demo
     name: coo-demo
    spec:
     ports:
       - name: http
         port: 8080
       - name: metrics
         port: 9000
     selector:
         app: coo-demo

    Create an OpenShift Route to our Python application by running the following command:

    oc expose service coo-demo

    ServiceMonitor installation

    Create a Red Hat Observability ServiceMonitor to scrape metrics from the application created above using the following CR description. Note that its monitoredby: coo-monitoring-stack label matches the resourceSelector in the Red Hat Observability MonitoringStack above:

    apiVersion: monitoring.rhobs/v1
    kind: ServiceMonitor
    metadata:
     name: coo-demo
     labels:
       monitoredby: coo-monitoring-stack
    spec:
     endpoints:
       - port: metrics
     selector:
       matchLabels:
         app: coo-demo
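
    The wiring between these resources relies on Kubernetes matchLabels semantics: a selector matches a resource when every key/value pair in the selector is present in the resource's labels. The sketch below mirrors the labels and selectors from the manifests in this article as plain Python dictionaries. The matches function is a hypothetical helper for illustration, not part of any Kubernetes client:

```python
def matches(match_labels, labels):
    """True when every selector pair is present in the resource labels."""
    return all(labels.get(key) == value for key, value in match_labels.items())


# Labels and selectors copied from the manifests in this article.
stack_resource_selector = {"monitoredby": "coo-monitoring-stack"}
service_monitor_labels = {"monitoredby": "coo-monitoring-stack"}
service_monitor_selector = {"app": "coo-demo"}
service_monitor_port = "metrics"
service_labels = {"app": "coo-demo"}
service_port_names = ["http", "metrics"]

# The MonitoringStack picks up the ServiceMonitor...
print(matches(stack_resource_selector, service_monitor_labels))
# ...the ServiceMonitor selects the Service...
print(matches(service_monitor_selector, service_labels))
# ...and the endpoint port name exists on the Service.
print(service_monitor_port in service_port_names)
```

    If any of these three checks fails in a real cluster (for example, a typo in the monitoredby label), the MonitoringStack will not scrape the application.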

    Wait for the three pods to reach a READY state (your pod names will differ):

    $ oc get pods
    NAME                                     READY STATUS  RESTARTS AGE
    coo-demo-7bc8c649dc-7sww8                1/1   Running 0        3m37s
    prometheus-coo-monitoring-stack-0        3/3   Running 0        2m1s
    thanos-querier-coo-demo-7654cd6df9-wqbcg 1/1   Running 0        8s
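
    Rather than checking the READY column by eye, the output can be checked programmatically. The following sketch parses the tabular output shown above. The all_pods_ready function is a hypothetical helper, and it assumes the default column layout in which READY is the second column:

```python
def all_pods_ready(oc_output):
    """Return True when every pod row reports READY as n/n with n > 0."""
    rows = [line.split() for line in oc_output.strip().splitlines()]
    pods = rows[1:]  # skip the header row
    if not pods:
        return False
    for columns in pods:
        ready, total = columns[1].split("/")
        if int(ready) != int(total) or int(total) == 0:
            return False
    return True


# Output copied from the example above.
EXAMPLE = """\
NAME                                     READY STATUS  RESTARTS AGE
coo-demo-7bc8c649dc-7sww8                1/1   Running 0        3m37s
prometheus-coo-monitoring-stack-0        3/3   Running 0        2m1s
thanos-querier-coo-demo-7654cd6df9-wqbcg 1/1   Running 0        8s
"""

print(all_pods_ready(EXAMPLE))
```

    In practice, the input could come from subprocess.run(["oc", "get", "pods"], capture_output=True, text=True).stdout.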

    Generate metrics data

    Generate valid HTTP requests by running the following several times in a bash shell:

    curl http://$(oc get route coo-demo -o jsonpath='{.spec.host}')/

    The HTML output will include a Hello! message.

    Generate invalid HTTP requests by running the following several times in a bash shell:

    curl http://$(oc get route coo-demo -o jsonpath='{.spec.host}')/notfound

    The HTML output will include an Error code: 404 message.
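
    The repeated curl invocations can also be scripted. A minimal sketch using only the Python standard library follows. The send_requests function is a hypothetical helper, and the base URL is whatever the oc get route command above returns for your cluster:

```python
import urllib.error
import urllib.request
from collections import Counter


def send_requests(base_url, path, count):
    """Send `count` GET requests and tally the HTTP status codes returned."""
    statuses = Counter()
    for _ in range(count):
        try:
            with urllib.request.urlopen(base_url + path) as response:
                statuses[response.status] += 1
        except urllib.error.HTTPError as error:
            # urlopen raises HTTPError for 4xx/5xx responses such as the 404.
            statuses[error.code] += 1
    return statuses
```

    For example, send_requests("http://<route-host>", "/", 5) followed by send_requests("http://<route-host>", "/notfound", 5) generates five responses of each kind, and the returned tallies can later be compared with the counters reported by the monitoring stack.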

    Viewing the metrics

    Open your browser to the Red Hat Observability ThanosQuerier at the URL generated from the following command:

    oc get route thanos-querier-coo-demo -o jsonpath='{"http://"}{.spec.host}{"\n"}'

    After a minute or two, check the number of requests recorded by entering the following PromQL query in the Expression field of the ThanosQuerier and then pressing Execute:

    coo_responses_total

    The output should be similar to the following (Figure 2).  

    Figure 2: PromQL query results.

    The totals shown for each response should be equal to the number of invocations of the application URLs above.
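
    The same query can be issued without the browser. Thanos Querier serves a Prometheus-compatible HTTP API, so an instant query goes to /api/v1/query. The sketch below builds that URL and extracts per-label values from a response. Both build_query_url and extract_values are hypothetical helpers, thanos.example.com is a placeholder host, and SAMPLE_RESPONSE is an illustrative payload in the API's response shape, not output from a real cluster:

```python
import json
import urllib.parse


def build_query_url(base_url, promql):
    """Return the instant-query URL for a PromQL expression."""
    query = urllib.parse.urlencode({"query": promql})
    return f"{base_url.rstrip('/')}/api/v1/query?{query}"


def extract_values(payload, label):
    """Map each series' `label` value to its current sample value."""
    if payload.get("status") != "success":
        raise RuntimeError(f"query failed: {payload}")
    return {
        series["metric"].get(label): float(series["value"][1])
        for series in payload["data"]["result"]
    }


# Illustrative response (assumed values) for an instant vector.
SAMPLE_RESPONSE = json.loads("""
{"status": "success",
 "data": {"resultType": "vector",
          "result": [
            {"metric": {"__name__": "coo_responses_total", "response": "200"},
             "value": [1720500000, "3"]},
            {"metric": {"__name__": "coo_responses_total", "response": "404"},
             "value": [1720500000, "2"]}]}}
""")

print(build_query_url("http://thanos.example.com", "coo_responses_total"))
print(extract_values(SAMPLE_RESPONSE, "response"))
```

    Against a live cluster, the payload could be fetched with urllib.request.urlopen(url) and decoded with json.load, using the route host printed by the oc get route command above as the base URL.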

    Summary and outlook

    The Red Hat Cluster Observability Operator provides a full monitoring stack that each department in an organization can configure to meet its needs, thereby offloading work from the OpenShift platform support team. 

    Stay tuned for more great features to land in COO, as monitoring and alerting are just the beginning. Red Hat teams are currently working on integrating logging, distributed tracing, signal correlation, and UI features into COO.

    Cleanup

    To clean up the resources created in this demo, run the following:

    oc delete deployment coo-demo
    oc delete service coo-demo
    oc delete monitoringstack.monitoring.rhobs coo-monitoring-stack
    oc delete servicemonitor.monitoring.rhobs coo-demo
    oc delete thanosqueriers.monitoring.rhobs coo-demo
    oc delete routes coo-demo thanos-querier-coo-demo

    Last updated: September 27, 2024
