Your LLM is too large: How I generate production-ready failure analysis on a toaster

Why pattern preprocessing makes small models mighty

September 2, 2025
Caleb Evans
Related topics:
Artificial intelligence, Automation and management, DevOps, Kubernetes, Platform engineering
Related products:
Red Hat AI, Red Hat OpenShift

    I'm running production-grade Kubernetes failure analysis on an edge computing device—a piece of hardware that costs less than what many teams spend on LLM API calls in just two to three months. The model is Llama 3.2:3B with 4-bit quantization, delivering comprehensive root cause analysis in 70 seconds that, for common production failures, matches the practical value of commercial models.

    Let me show you how pattern preprocessing fundamentally changes the economics and performance of production AI.

    The challenge with LLMs in production

    When you send 10,000 lines of raw logs to a state-of-the-art LLM, you're essentially paying it to rediscover what grep already knows. Common patterns like connection refused, out of memory errors, and permission failures were solved decades ago. Yet we're burning tokens to teach sophisticated AI systems to recognize basic connection timeouts.

    Despite their compute power, large models can struggle with the signal-to-noise ratio in raw logs. They excel at complex reasoning but may miss obvious patterns hidden in thousands of lines of output.

    Enter pattern preprocessing: The architecture

    Instead of throwing raw logs at an LLM and hoping for brilliance, I built a multilayer system:

    Raw Logs (10,000 lines) 
        ↓
    Pattern Engine (deterministic regex matching)
        ↓
    Scored & Contextualized Matches (500 lines)
        ↓
    Small LLM
        ↓
    Human-Readable Analysis

    The pattern engine does the heavy lifting—identifying known failures, extracting relevant context, and scoring matches based on severity and proximity. The LLM then takes this pre-digested information and explains it like a senior engineer would.
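To make the architecture concrete, here is a minimal sketch of such a pattern engine. This is illustrative only, not the actual implementation: the pattern table, severity values, and the `analyze` function are assumptions, and the real engine described above also weights proximity and temporal relationships, not just severity.

```python
import re

# Hypothetical pattern table mapping a regex to a base severity score.
PATTERNS = {
    r"Connection refused": 8.0,
    r"OutOfMemoryError": 9.0,
    r"Permission denied": 6.0,
}

def analyze(log_lines, context=2):
    """Return scored matches with surrounding context lines."""
    matches = []
    for i, line in enumerate(log_lines):
        for regex, severity in PATTERNS.items():
            if re.search(regex, line):
                matches.append({
                    "line_no": i + 1,
                    "severity": severity,
                    "evidence": line.strip(),
                    # A few lines on either side of the hit -- this, not
                    # the full log, is what eventually reaches the LLM.
                    "context": log_lines[max(0, i - context): i + context + 1],
                })
    # Highest-severity matches first.
    return sorted(matches, key=lambda m: m["severity"], reverse=True)

logs = [
    "Quarkus started in 2.234s",
    "GET /health 200",
    "java.net.ConnectException: Connection refused (Connection refused)",
    "REST client call timed out",
]
top = analyze(logs)[0]
print(top["line_no"], top["severity"])  # → 3 8.0
```

The key property is determinism: the same logs always produce the same scored matches, so only the explanation step involves the model.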

    Real-world results: 270 lines to root cause in 70 seconds

    Let me show you actual output from analyzing a Kubernetes pod failure with database connection issues. The system processed 270+ lines of logs and delivered this analysis:

    Output

    Root Cause: The primary root cause of the pod failure is the **REST Client Connection Error** 
    with a score of 9.775666542449912. This error occurred due to the remote service being unreachable, 
    leading to multiple timeouts and ultimately causing the REST client to fail.
    Evidence: 
    * Log message from Line 13: `ConnectException: Connection refused (Connection refused)`
    * Context Before: The Quarkus application was started in 2.234s.
    * Context After: A `java.net.ConnectException` occurred with a connection refused error.
    Fix: 
    1. Verify that the remote service is running and accessible by checking its health endpoints or status pages.
    2. If the remote service is down, restart it or investigate any underlying issues causing the downtime.
    3. Once the remote service is up and running, update the Quarkus configuration to increase the timeout 
       values for the REST client.
    Example command:
    # Increase timeout values in configuration
    echo "quarkus.rest.client.timeout=30" >> application.properties
    Prevention: 
    1. Regularly monitor the health of remote services using health endpoints or status pages.
    2. Implement circuit breakers or retries to handle temporary connectivity issues.
    3. Consider using a load balancer or service discovery mechanism to ensure that requests are routed 
       to available instances of the remote service.
    Sequence of Events: 
    * Initial trigger: The REST client attempts to connect to the remote service, but it is unreachable 
      due to being down or unavailable (Line 13).
    * Cascade: The `ConnectException` causes multiple timeouts and ultimately leads to a failure of the 
      REST client (Lines 14-17, 30-32).
    * Final state: The pod fails due to the propagation of the errors from the REST client connection error.
    By addressing the root cause of the REST Client Connection Error and implementing preventative 
    measures, you can reduce the likelihood of similar failures occurring in the future.

    The performance comparison

    Traditional approach:

    • Input: All 270 lines sent to large commercial LLM
    • Cost: $0.30-3.00 per analysis (depending on model/service)
    • Result: Often generic advice

    Preprocessed approach (above):

    • Input: Pattern engine identifies key events, sends contextualized lines to the model
    • Cost: <$0.001
    • Result: Specific root cause, evidence trail, and actionable remediation

    All from a model that fits in 2 GB of RAM.

    Why this changes everything

    • Accessible hardware: This approach runs on consumer-grade hardware rather than enterprise GPU clusters. I'm using an edge device originally designed for autonomous vehicles, but it works equally well on a decent laptop with a GPU.
    • Dramatic cost reduction: We achieve a 99.7% reduction in inference costs. In practical terms, a traditional LLM approach costs more for a single analysis than this system costs for an entire day of operations.
    • Speed without sacrifice: Prefiltered context means the model sees exactly what matters. The system focuses on relevant error patterns rather than processing thousands of lines of startup logs and normal operations.
    • Community intelligence: These patterns represent community knowledge that can be shared and improved collectively, similar to how antivirus definitions work:

      
      patterns:
        - id: "quarkus_connection_pool_exhausted"
          primary_pattern:
            regex: "Connection pool.*exhausted|Unable to acquire connection"
            confidence: 0.95
          secondary_patterns:
            - regex: "timeout.*waiting for connection"
              weight: 0.7
          remediation:
            description: "Database connection pool is exhausted"
            common_causes:
              - "Spike in traffic"
              - "Connection leak in application"
              - "Database performance degradation"

      Every pattern is reviewable, versioned, and improvable through standard Git workflows. Your senior engineers' knowledge becomes codified, shareable, and composable.
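A schema like the one above can be evaluated mechanically. The following is a hedged sketch of one plausible scoring scheme (not necessarily the one Podmortem uses), assuming the YAML has already been parsed into a dict, for example with PyYAML's `safe_load`:

```python
import re

# The pattern from the YAML above, shown here already parsed into a dict.
pattern = {
    "id": "quarkus_connection_pool_exhausted",
    "primary_pattern": {
        "regex": r"Connection pool.*exhausted|Unable to acquire connection",
        "confidence": 0.95,
    },
    "secondary_patterns": [
        {"regex": r"timeout.*waiting for connection", "weight": 0.7},
    ],
}

def score(pattern, log_text):
    """Confidence that this failure pattern is present in the logs.

    Illustrative scheme: no match at all without the primary pattern;
    each matching secondary pattern scales up the remaining headroom.
    """
    if not re.search(pattern["primary_pattern"]["regex"], log_text):
        return 0.0
    s = pattern["primary_pattern"]["confidence"]
    for sec in pattern.get("secondary_patterns", []):
        if re.search(sec["regex"], log_text):
            s += (1.0 - s) * sec["weight"]
    return s

logs = ("WARN Connection pool 'default' exhausted\n"
        "ERROR timeout while waiting for connection")
print(round(score(pattern, logs), 3))  # → 0.985
```

Because the scoring is just data plus a small evaluator, a pattern change in Git changes analysis behavior without touching any model.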

    Beyond logs: Expanding the pattern

    Once you have pattern preprocessing infrastructure, the same approach extends to many other domains:

    • Metrics anomalies: Patterns for CPU spikes, memory leaks, disk pressure
    • Security events: Known attack signatures, suspicious access patterns
    • Performance regressions: Response time degradations, throughput drops
    • User behavior: Error click patterns, rage-quit sequences

    The same architecture that makes log analysis efficient works for any structured data where domain expertise exists.
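For a non-log domain, the only change is the matcher: instead of regexes over text, a metrics pattern can be a threshold predicate over a sliding window. This sketch is hypothetical (the pattern IDs, thresholds, and window format are invented for illustration):

```python
# Metric-domain analogue of the log patterns: each pattern is a
# predicate over a window of recent samples (e.g., CPU utilization 0-1).
METRIC_PATTERNS = [
    {"id": "cpu_spike",
     # High peak plus a large swing within the window.
     "check": lambda w: max(w) > 0.9 and max(w) - min(w) > 0.5,
     "severity": 7.0},
    {"id": "memory_leak",
     # Strictly monotonic growth across the window suggests a leak.
     "check": lambda w: all(a < b for a, b in zip(w, w[1:])),
     "severity": 8.0},
]

def match_metrics(window):
    """Return the IDs of all patterns that fire on this window."""
    return [p["id"] for p in METRIC_PATTERNS if p["check"](window)]

print(match_metrics([0.2, 0.3, 0.5, 0.7, 0.95]))
# → ['cpu_spike', 'memory_leak']
```

The downstream scoring, context extraction, and LLM explanation stages stay identical; only the front-end matcher is swapped.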

    Building your own pattern-augmented system

    The pattern is simple:

    1. Collect domain patterns: Start with your runbooks. Every "if you see X, do Y" is a pattern.
    2. Build a scoring engine: Patterns rarely appear in isolation. Score them by severity, proximity, and temporal relationships.
    3. Create context windows: Extract relevant surrounding information for each match.
    4. Choose a small model: Llama 3.2, Phi-3, or even Mistral 7B work brilliantly with good context.
    5. Iterate with feedback: Every false positive or negative improves the patterns.
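Steps 3 and 4 above meet at prompt assembly: the scored, contextualized matches are condensed into a compact prompt for the small model. A sketch, with an illustrative match format and prompt wording (none of this is the author's exact code):

```python
def build_prompt(matches, max_matches=5):
    """Condense scored matches into the context a small LLM actually sees."""
    sections = []
    for m in matches[:max_matches]:
        sections.append(
            f"[{m['pattern_id']}] score={m['score']:.2f} line {m['line_no']}\n"
            f"evidence: {m['evidence']}\n"
            f"context: {' | '.join(m['context'])}"
        )
    return (
        "You are a senior engineer. Given these pre-scored failure "
        "signals, explain the root cause, evidence, and fix:\n\n"
        + "\n\n".join(sections)
    )

matches = [{
    "pattern_id": "rest_client_connection_error",
    "score": 9.78,
    "line_no": 13,
    "evidence": "ConnectException: Connection refused",
    "context": ["Quarkus started in 2.234s",
                "ConnectException: Connection refused"],
}]
prompt = build_prompt(matches)
```

A prompt built this way stays at a few hundred tokens regardless of how large the raw logs were, which is what lets a 3B model respond quickly and stay on topic.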

    The GitOps advantage for AI knowledge

    Managing patterns in Git provides unexpected benefits:

    • Code review for AI: Pattern changes go through standard review processes.
    • Accountability: Git blame shows who added or modified each pattern.
    • Collaborative improvement: Teams can iterate on patterns based on real incidents.
    • Versioned intelligence: Roll back patterns if they cause issues.

    This approach makes AI knowledge manageable and auditable by engineering teams.

    Back to reality

    Let me be clear: Large language models are remarkable. On general knowledge tasks, creative writing, and complex reasoning, they're in a different league. But for production operations—where patterns are known, speed matters, and costs compound—pattern preprocessing with small models isn't just competitive; it's superior for the majority of common failure scenarios.

    The "toaster" in my title is a deliberate exaggeration—my edge device is considerably more capable. But the point stands: the future of production AI isn't necessarily bigger models. It's smarter engineering around smaller ones.

    Practical next steps

    Before investing in large-scale AI infrastructure, consider these questions:

    • What percentage of your problems are truly novel versus known patterns?
    • How much are you currently spending to analyze well-understood failures?
    • Can your team's expertise be codified into reviewable patterns?

    If you're interested in seeing this in action, I've open-sourced the entire stack. The Podmortem operator demonstrates pattern-augmented analysis for Kubernetes, but the principles apply everywhere.

    Note: Performance metrics are based on real-world testing with common Kubernetes failure patterns. Results may vary based on specific use cases and pattern coverage.

    A simple way to experiment: Podman AI Lab

    The idea of running powerful models on your laptop might seem complex, but tools are emerging to make it incredibly straightforward. If you want to experiment with the pattern-augmented approach I've described, one of the easiest ways to get started is with Podman AI Lab.

    Podman AI Lab lets you download and run popular, optimized models with just a few commands. It handles the environmental setup for you, so you can focus on development and experimentation with AI. For hands-on guides to getting a model running in minutes, check out these excellent articles:

    • Getting started: AI meets containers: My first step into Podman AI Lab
    • Building an application: Build your AI application with an AI Lab extension in Podman Desktop

    Conclusion: Intelligence is more than model size

    The industry has focused heavily on model size as the primary metric for capability. But effective intelligence isn't just about raw capability—it's about applying the right tool to the right problem. When you combine human expertise (patterns) with AI explanation (small LLMs), you get something powerful: production-ready AI that's fast, cheap, and reliable.
