Transform complex metrics into actionable insights with this AI quickstart

Transform infrastructure monitoring with AI Observability Metric Summarizer

January 20, 2026
Twinkll Sisodia
Related topics: Artificial intelligence
Related products: Red Hat AI, Red Hat OpenShift AI

    Managing AI infrastructure in production can often feel overwhelming. A single vLLM model deployment can generate hundreds of metrics across GPU utilization, token throughput, cache efficiency, and latency measurements. Add OpenShift cluster metrics, and you're looking at thousands of data points that need constant monitoring.

    As a result, most teams either ignore metrics entirely (dangerous), get lost in dashboards full of technical jargon (ineffective), or spend hours manually correlating data across systems (inefficient). What if you could just ask "How is my GPU performing?" or "Why is my model slow?" and get clear, actionable answers?

    A team of Red Hatters, including Sandip Gahlot, Jianrong Zhang, Abdul Hameed, Manna Kong, and me, created an AI quickstart to transform complex infrastructure monitoring into an intelligent, conversational experience. Built specifically for Red Hat OpenShift AI environments, this quickstart provides true root-cause visibility into your AI workloads and the underlying Red Hat OpenShift cluster. It goes beyond basic metrics by leveraging an LLM-powered analysis engine, which uses Korrel8r to seamlessly correlate the four pillars of observability: real-time metrics, logs, traces, and alerts. This integrated approach ensures you get clear, actionable insights through natural language interactions.

    Learn more: Introducing AI quickstarts

    Introducing the AI Observability Metric Summarizer

    Traditional monitoring tools weren't designed for this complexity. They show you what is happening, but not why it matters or what to do about it. Our AI-powered approach provides:

    • Context-aware analysis: Understands AI-specific performance patterns
    • Intelligent correlation: Connects GPU metrics to business impact
    • Actionable recommendations: Specific steps to improve performance
    • Predictive insights: Early warning before problems impact users

    Why AI observability matters for your organization

    As AI adoption accelerates, the complexity increases: more models, more clusters, more metrics, more dashboards. Without the right tools, it quickly becomes impossible to see the full picture, let alone act on it. That's where the AI Observability Metric Summarizer changes the game. Organizations need tools that:

    • Scale with AI workloads: Go beyond a few models in one cluster. You need visibility across hundreds of models, namespaces, and clusters, with summaries that stay clear even as your architecture grows more complex.
    • Make insights accessible: Observability shouldn't be locked in the heads of SREs. Product managers, data scientists, platform engineers, and leadership should all be able to ask, "How are things going?" and get a plain-language answer they can act on.
    • Provide business value: Metrics alone don't move the needle. You need a narrative that connects GPU utilization, latency, and error rates to customer experience, cost efficiency, and SLAs so you can make better business decisions, not just better dashboards.
    • Enable proactive optimization: Get early warnings and clear recommendations: where to right-size, which models are at risk, and where performance or cost issues are likely to appear next.

    The AI Observability Metric Summarizer turns raw telemetry into stories, recommendations, and next steps, helping your teams run AI workloads with confidence, speed, and control.

    Key features of AI Observability Metric Summarizer

    The user interface of the AI Observability Metric Summarizer is a multi-page Streamlit dashboard with three specialized interfaces for comprehensive analysis:

    • vLLM dashboard: Real-time GPU and model performance metrics
    • OpenShift dashboard: Cluster-wide and namespace-specific infrastructure monitoring
    • Chat interface: Conversational interaction with Prometheus, Alertmanager, Tempo, and Loki

    Plus, there's seamless AI assistant integration with Claude Desktop and Cursor IDE through the Model Context Protocol (MCP). The integrated MCP server provides more than 20 specialized tools (see the sketch after this list), including:

    • Metric discovery and search with semantic understanding
    • PromQL query generation and execution
    • Natural language result explanation
    • AI-powered analysis for vLLM and OpenShift metrics
    • Report generation in HTML, PDF, and Markdown formats
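
    To give a feel for how an assistant discovers these tools, here is a minimal sketch using the official MCP Python SDK. The server URL and the SSE transport are illustrative assumptions, not the quickstart's documented endpoint; check its docs for the actual route:

    # Minimal sketch: list the observability tools exposed by an MCP server.
    # The URL and SSE transport below are hypothetical placeholders.
    import asyncio

    from mcp import ClientSession
    from mcp.client.sse import sse_client

    MCP_URL = "http://mcp-server.example.com/sse"  # hypothetical endpoint

    async def main() -> None:
        async with sse_client(MCP_URL) as (read_stream, write_stream):
            async with ClientSession(read_stream, write_stream) as session:
                await session.initialize()
                # Enumerate the server's tools, e.g., metric search or
                # PromQL generation, so the assistant knows what it can call.
                tools = await session.list_tools()
                for tool in tools.tools:
                    print(f"{tool.name}: {tool.description}")

    asyncio.run(main())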

    Multi-model AI support provides flexible summarization using your choice of external models, such as OpenAI GPT-4, Anthropic Claude, and Google Gemini, or local models such as Llama 3.1-8B and Llama 3.2 (1B/3B/70B variants).

    Intelligent query processing transforms your questions, written in natural language, into precise PromQL queries with context-aware metric selection.
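
    To make that concrete, here is a minimal sketch of the flow: a natural-language question is mapped to a PromQL query, which is then executed against the Prometheus HTTP API. The route and the vLLM metric names are illustrative assumptions, not the quickstart's actual code:

    import requests

    PROMETHEUS_URL = "https://thanos-querier.example.com"  # hypothetical route

    # Question: "What is my model's average time to first token over the
    # last hour?" A context-aware mapping might select a vLLM latency
    # histogram and emit the following PromQL:
    promql = (
        "rate(vllm:time_to_first_token_seconds_sum[1h])"
        " / rate(vllm:time_to_first_token_seconds_count[1h])"
    )

    # Execute the generated query via Prometheus's instant-query endpoint.
    resp = requests.get(
        f"{PROMETHEUS_URL}/api/v1/query",
        params={"query": promql},
        timeout=30,
    )
    resp.raise_for_status()
    for series in resp.json()["data"]["result"]:
        print(series["metric"], series["value"])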

    Observability stack

    Of course, this is first and foremost about observability. Many tools are at work in this space, and the AI Observability Metric Summarizer can interface with each of them:

    • Prometheus/Thanos: Metrics collection and long-term storage with multi-cluster aggregation
    • Tempo: Distributed tracing for end-to-end request flow analysis
    • Loki: Centralized log aggregation and querying for correlated troubleshooting
    • OpenTelemetry Collector: Automatic instrumentation for Python applications with trace and metrics export (see the sketch after Figure 1)
    • MinIO: Object storage backend for traces and logs

    For monitoring and alerting:

    • vLLM Metrics: GPU temperature, utilization, memory usage, request throughput, and token statistics
    • DCGM Integration: Deep GPU diagnostics including power consumption and error tracking
    • AI-Powered Alerting: Slack notifications with intelligent metric summaries and trend analysis
    • Custom Alert Rules: CronJob-based monitoring with configurable thresholds
    • OpenShift Metrics: System-level metrics about the infrastructure

    For backend services:

    • llama-stack: Backend API powering LLM inference and metric analysis
    • llm-service: Model serving layer handling inference requests
    • metric-ui-svc: Streamlit application service
    • mcp-server: MCP server exposing observability tools
    Figure 1: Architecture of the AI Observability Metric Summarizer.
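
    As a taste of the OpenTelemetry side, here is a minimal sketch of manually instrumenting a Python application to export traces to a collector. The endpoint and service name are assumptions for illustration; the quickstart itself relies on automatic instrumentation:

    # pip install opentelemetry-sdk opentelemetry-exporter-otlp
    from opentelemetry import trace
    from opentelemetry.exporter.otlp.proto.grpc.trace_exporter import OTLPSpanExporter
    from opentelemetry.sdk.resources import Resource
    from opentelemetry.sdk.trace import TracerProvider
    from opentelemetry.sdk.trace.export import BatchSpanProcessor

    # Hypothetical in-cluster collector service; adjust to your deployment.
    exporter = OTLPSpanExporter(endpoint="http://otel-collector:4317", insecure=True)

    provider = TracerProvider(
        resource=Resource.create({"service.name": "my-inference-app"})
    )
    provider.add_span_processor(BatchSpanProcessor(exporter))
    trace.set_tracer_provider(provider)

    # Spans created here are batched and shipped to the collector, which
    # forwards traces to Tempo for the summarizer to correlate.
    tracer = trace.get_tracer(__name__)
    with tracer.start_as_current_span("inference-request"):
        pass  # your model call goes here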

    Our Streamlit UI

    Figure 2 shows the main dashboard of the AI Observability Metric Summarizer. At the top is the vLLM Metric Summarizer view, which generates a natural-language model insights summary, alongside a metric dashboard with key vLLM GPU and latency statistics and trends over time. On the left, you can choose the namespace, model, and time range. At the bottom of the window is a space to chat with Prometheus, where you can ask questions, written in natural language, about your metrics and get AI-generated insights.

    Figure 2: The main dashboard of the AI Observability Metric Summarizer.

    Figure 3 highlights the Chat with Prometheus view. In this tab, you can ask questions about any Prometheus metric across the cluster, pick the AI model to power the analysis, and verify that the MCP tools are connected. The app also pulls data points from logs and traces coming from Loki and Tempo, and uses Korrel8r to correlate metrics, logs, and traces so the AI can return richer, end-to-end insights about the system.

    Figure 3: You can chat with the Prometheus monitoring tool through AI.

    Figure 4 shows the OpenShift Metrics Dashboard view. In this tab, the app analyzes cluster-wide OpenShift metrics and generates an OpenShift insights summary that addresses overall fleet health, reliability concerns, and recommended actions. On the right, the OpenShift Fleet Dashboard presents key KPIs, such as total pods running and failed, cluster CPU and memory usage, GPU utilization, and GPU temperature, with trend charts over time. On the left, you can choose the analysis scope, metric category (for example, Fleet Overview), and a time range for the AI-powered summary.

    Figure 4: A metrics dashboard for Red Hat OpenShift.

    Installation in one command

    It's easy to install the AI Observability Metric Summarizer. First, log in to your OpenShift cluster from a terminal with oc login as usual.

    Download the source code from our repository, change directory to it, and deploy a complete AI observability stack:

    $ git clone git@github.com:rh-ai-quickstart/ai-observability-summarizer.git
    $ cd ai-observability-summarizer
    $ make install NAMESPACE=your-namespace

    After the installation is complete, open the OpenShift console in a web browser and navigate to Networking > Routes > Streamlit UI. Try asking your first question, such as "How is my model performing?" Then use it to get instant insights about GPU utilization, token throughput, and latency, or to generate reports to share with your team!

    Ready to transform your AI monitoring?

    Stop drowning in dashboards and metrics. Start having real conversations with your infrastructure.

    Whether you're running a single proof-of-concept model or operating a full-scale enterprise AI platform, this intelligent observability layer helps you:

    • Slash MTTR: Turn "What's going on?" from a multi-hour war room into a few minutes of targeted insight.
    • Cut costs with confidence: Get AI-powered recommendations on where to right-size, consolidate, or optimize GPU and cloud spend.
    • Prevent outages, not just react: Use predictive analysis to spot unhealthy trends before they become incidents.
    • Scale without blind spots: Maintain clear, end-to-end visibility as you add more models, clusters, and teams.

    To get started, be sure to read the docs, and we'd love to see you join our community!

    Also, watch the video about it from KubeCon North America 2025.

