Boring RAG: When similarity is just a SQL query

Retrieval-augmented generation with Apache Camel, PostgreSQL, and pgvector

March 5, 2026
Ivo Bek
Related topics:
Artificial intelligence, Data integration, Integration
Related products:
Red Hat build of Apache Camel

    Retrieval-augmented generation (RAG) is a practical way to answer questions using your own content (such as policies, docs, tickets, and product descriptions) without assuming a general-purpose LLM already contains that information.

    At its core, RAG follows a "retrieve context, then answer" pattern. Retrieval is the part that often becomes overcomplicated. Once you store embeddings alongside text in a database, retrieval becomes a standard nearest-neighbor query. In other words: similarity is a query.

    This article demonstrates a "boring" implementation using Apache Camel, PostgreSQL, and pgvector. The goal is to create a baseline that is easy to understand and debug. You can see exactly what was indexed, what was retrieved, and the context provided to the model.

    If you want the bigger-picture framing (treating LLMs as semantic processors and keeping the "AI parts" at the edges), read Making LLMs boring: From chatbots to semantic processors.

    A quick glossary

    An embedding is a vector (a list of numbers) produced from text. Similar text tends to end up near each other in that vector space.

    Chunking is splitting a document into smaller pieces before embedding it. It's rarely optional. Without chunking, you retrieve entire documents when you only need a paragraph.

    pgvector adds a vector(N) column type and distance operators (such as <=>) to PostgreSQL. This allows you to store embeddings and run similarity searches using plain SQL.
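
    To make that concrete, here is a toy sketch (hypothetical items table with 3-dimensional vectors for readability; real embeddings have hundreds of dimensions):

     CREATE TABLE items (id SERIAL PRIMARY KEY, name TEXT, embedding vector(3));
     INSERT INTO items (name, embedding) VALUES
        ('red shirt',  '[0.9, 0.1, 0.0]'),
        ('blue jeans', '[0.1, 0.8, 0.2]');

     -- <=> returns cosine distance; smaller means more similar
     SELECT name, embedding <=> '[0.85, 0.15, 0.05]' AS distance
     FROM items
     ORDER BY distance
     LIMIT 1;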

    The anatomy of a RAG pipeline

    Most RAG systems rely on three primary steps.

    • First, you index the content. This involves taking the content, chunking it, and storing each chunk's text alongside its embedding vector. This is typically a batch job.
    • Second, you retrieve information. When a user asks a question, the system embeds the question and queries the database for the nearest chunks. You usually apply a similarity threshold (to avoid weak matches) and a topK (to keep context bounded).
    • Third, you provide an answer. If the retrieval finds no matches, the system returns a "not found" response or asks a clarifying question. If retrieval found something, you pass the retrieved chunks into the prompt as context and tell the model to answer using only that context.

    Let's make those steps concrete.

    Step 1: Index (chunk → embed → store)

    Indexing transforms static files into a queryable knowledge base. At a minimum, you should store the chunk text, a little metadata to help with tracing (such as the source, section ID, and document name), and the embedding vector.

    With pgvector, a basic schema looks like this:

    CREATE EXTENSION IF NOT EXISTS vector;
    CREATE TABLE IF NOT EXISTS chunks (
       id SERIAL PRIMARY KEY,
       content TEXT NOT NULL,
       source VARCHAR(255),
       chunk_index INTEGER,
       embedding vector(768),
       created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
    );

    The vector(768) dimension must match your embedding model. If you switch embedding models, you might need a different dimension (and usually a reindex).
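
    Switching models is a schema change, not just a config change. A hypothetical migration sketch (the 1024 dimension is illustrative):

     -- Old vectors are not comparable across embedding models,
     -- so drop the column, recreate it, and re-embed everything
     ALTER TABLE chunks DROP COLUMN embedding;
     ALTER TABLE chunks ADD COLUMN embedding vector(1024);
     -- ...then rerun the indexing route to repopulate the column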

    Use the following Camel route to implement the chunk → embed → store process, as shown in Figure 1:

    - beans:
        - name: markdownSemanticTokenizer
          type: org.apache.camel.tokenizer.MarkdownSemanticTokenizer
          properties:
            headerMode: "RAG_CONTEXT"
    - route:
        id: index-files
        description: "Document Indexing"
        from:
          uri: file:documents
          parameters:
            noop: true
            include: ".*\\.md"
          steps:
            - setVariable:
                description: "Save File Name"
                name: fileName
                simple: "${header.CamelFileName}"
            - split:
                description: "Split Markdown"
                method:
                  ref: "markdownSemanticTokenizer"
                steps:
                  - setVariable:
                      description: "Save Index"
                      name: chunkIndex
                      simple: "${exchangeProperty.CamelSplitIndex}"
                  - setVariable:
                      description: "Save Chunk"
                      name: chunkText
                      simple: "${body.trim()}"
                  - to:
                      description: "Generate Embedding"
                      uri: openai:embeddings
                  - setVariable:
                      description: "Save Vector"
                      name: embeddingVector
                      simple: "${body.toString()}"
                  - to:
                      description: "Insert into DB"
                      uri: sql:INSERT INTO chunks (content, source, chunk_index, embedding) VALUES (:#chunkText, :#fileName, :#chunkIndex, :#embeddingVector::vector)
    Figure 1: Document Indexing Pipeline in Kaoto Integration Designer.

    Note

    This route assumes you are starting with clean Markdown files. In the real world, enterprise knowledge is usually locked in PDFs or Word documents. To handle this, you can drop a camel-docling step (docling:CONVERT_TO_MARKDOWN) into your route. Powered by IBM's Docling document parser, camel-docling understands complex document layouts (including reading order, multi-column text, and even tables) and converts them into structured Markdown.

    You can then index the documents by running the following Camel CLI command:

    camel run index-documents.camel.yaml utils/* application.properties

    Apache Camel includes more than 300 components, allowing you to ingest documents and data from wherever your enterprise stores them: Amazon S3, Google Drive, Azure Files, message brokers, Salesforce, Jira, or secure FTP servers.
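
    For example, moving the indexing route to Amazon S3 only changes the from endpoint; the split, embed, and insert steps stay the same. A hypothetical sketch (assumes the camel-aws2-s3 component is available and AWS credentials are configured):

     - route:
         id: index-files-s3
         from:
           # Hypothetical bucket name; consume without deleting the originals
           uri: aws2-s3:my-knowledge-bucket
           parameters:
             deleteAfterRead: false
           steps:
             # ...same split → embed → insert steps as the file-based route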

    Step 2: Retrieve (similarity as SQL)

    At query time, the system embeds the user's question and runs a nearest-neighbor query against the stored vectors.

    In pgvector, <=> is a distance operator (a smaller value indicates a closer match). A common pattern is to convert distance into a similarity score (often 1 - distance), filter by a threshold, then take the top results.

    Using camel-openai for embeddings, this workflow involves calling the openai:embeddings endpoint and then running an sql:SELECT query with the resulting vector.

    - route:
        id: document-qa-route
        description: RAG Pipeline
        from:
          description: document-qa
          uri: direct
          parameters:
            name: document-qa
          steps:
            - setVariable:
                description: Save Question
                name: question
                simple: ${body.trim()}
            - log:
                description: Log Question
                message: "Question: ${variable.question}"
            - to:
                description: Get Embeddings
                uri: openai:embeddings
            - setVariable:
                description: Save Embedding
                name: queryEmbedding
                simple: ${body.toString()}
            - to:
                description: Vector Search
                uri: >
                  sql:SELECT content, source,
                       1 - (embedding <=> :#queryEmbedding::vector) as similarity
                  FROM chunks
                  WHERE 1 - (embedding <=> :#queryEmbedding::vector) > {{rag.similarity.threshold}}
                  ORDER BY embedding <=> :#queryEmbedding::vector
                  LIMIT {{rag.topK}}

    Two settings are important during the initial configuration; both are resolved from application.properties, as shown below:

    • threshold (0.6): prevents the system from adding weakly related chunks to the prompt.
    • topK (5): limits the amount of context provided to the model.
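
    A sketch of the corresponding application.properties entries (the property names match the {{...}} placeholders in the route above):

     # Tuning knobs referenced by the route's placeholders
     rag.similarity.threshold=0.6
     rag.topK=5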

    Step 3: Answer (or refuse)

    After the system retrieves the rows, it passes them into the model prompt as context and instructs the model to answer using only that information. This grounds the model in the retrieved data and helps prevent it from improvising.

    You must decide what happens if the retrieval step returns no results. For internal knowledge bases, forcing an answer is a recipe for hallucinations. A response such as, "I do not have enough information in the provided documents to answer that," is useful because it identifies where to improve your corpus, chunking strategy, or threshold limits.

            - setVariable:
                description: Save Results
                name: searchResults
                simple: ${body}
            - log:
                description: Log Results Count
                message: Found ${body.size()} relevant chunks
            - choice:
                description: Check Results
                when:
                  # No rows retrieved: refuse rather than improvise
                  - simple: ${body.size()} == 0
                    steps:
                      - setBody:
                          description: No Context Found
                          constant: "I do not have enough information in the provided documents to answer that."
                otherwise:
                  steps:
                    - setVariable:
                        description: Prepare Context
                        name: context
                        simple: ${variable.searchResults}
                    - setBody:
                        description: Set User Prompt
                        simple: ${variable.question}
                    - setHeader:
                        description: System Instructions
                        name: CamelOpenAISystemMessage
                        simple: >
                          Answer the question using ONLY the context below.
                          If the context doesn't contain enough information, say "I
                          don't have complete information on that."
                          Be concise and cite the source when relevant.
                          Context:
                          ${variable.context}
                    - to:
                        description: Generate Answer
                        uri: openai:chat-completion

    Now you are ready to ask questions:

    echo "What is the return policy?" | camel run document-qa.camel.yaml application.properties

    Will the model always stay perfectly inside the lines? Not always. This design makes the failure mode visible by allowing you to log retrieved rows and verify the context provided to the model, as illustrated in Figure 2:

    Figure 2: Document Q&A RAG Pipeline in Kaoto Integration Designer.

    Beyond document Q&A: Reusing the pattern

    The beauty of storing embeddings in PostgreSQL and treating similarity as a SQL query is that you aren't limited to building Q&A chatbots. Once the core foundation of indexing and retrieval is in place, you can adapt the final answer phase to solve various engineering challenges.

    Because you are using standard SQL, you can easily join your vector similarity searches with your existing business logic.

    Semantic product search

    Standard keyword searches often fail if users are unfamiliar with your exact terminology. By embedding your product catalog, you can map fuzzy user inputs ("a large screen for design work") to the closest items in vector space. From there, you have options: you can return the database rows directly to the UI for a fast, deterministic search experience, or you can pass the retrieved rows to an LLM to generate a conversational summary of their options.
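
    As a sketch, the retrieval query is just the Step 2 query pointed at a different table, with business filters joined in (hypothetical products table with in_stock and price columns; :#queryEmbedding uses Camel's SQL placeholder syntax):

     -- Similarity plus ordinary business filters in one query
     SELECT name, price,
            1 - (embedding <=> :#queryEmbedding::vector) AS similarity
     FROM products
     WHERE in_stock = true
     ORDER BY embedding <=> :#queryEmbedding::vector
     LIMIT 5;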

    Automated ticket deduplication

    You don't always need an LLM at the end of a RAG pipeline; sometimes, you can skip the "answer" step entirely. When a new support ticket is submitted, embed the text and run a similarity query against your historical, closed tickets. If the similarity score crosses a high threshold, you can automatically link the new ticket as a duplicate or route it to the exact engineer who solved the previous issue.
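
    A hedged sketch of that check (hypothetical tickets table; the 0.9 threshold is illustrative and needs tuning against your own data):

     -- Find the closest closed ticket to the new one
     SELECT id, title,
            1 - (embedding <=> :#newTicketEmbedding::vector) AS similarity
     FROM tickets
     WHERE status = 'closed'
       AND 1 - (embedding <=> :#newTicketEmbedding::vector) > 0.9
     ORDER BY embedding <=> :#newTicketEmbedding::vector
     LIMIT 1;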

    By treating vectors as standard database rows, you can transform AI features into common backend engineering tasks.

    Limitations

    This baseline works well, but there are practical limits you should anticipate as you scale:

    • Chunking strategies significantly affect retrieval quality. If a chunk boundary separates a rule from its exception, the system might retrieve text that appears relevant but leads to an incorrect answer. This is a data-shaping problem more than a model problem.
    • Similarity is not correctness. Nearest neighbor means "close in embedding space," not "true," "complete," or "up to date." In practice you often combine vectors with metadata filters (source, version, access control) and keyword search for exact terms.
    • threshold and topK tuning is unavoidable. Too low and you inject noise. Too high and you refuse too often. You adjust based on real queries and real failure cases.
    • Cost and latency can add up as the system scales. Many RAG flows are two model calls per request (embeddings plus chat completion). At scale, caching and batching become important.

    While a more advanced stack (like hybrid keyword + vector search + reranking) can outperform this baseline on relevance, the trade-off is complexity. Start simple, and only add components when you have a concrete metric that demands it.
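
    Even a first step toward hybrid search can stay in plain SQL. A hedged sketch using PostgreSQL's built-in full-text search to pre-filter by keywords before ranking by similarity (:#questionText is a hypothetical parameter holding the raw question):

     -- Keyword pre-filter, then vector ranking
     SELECT content, source
     FROM chunks
     WHERE to_tsvector('english', content) @@ plainto_tsquery('english', :#questionText)
     ORDER BY embedding <=> :#queryEmbedding::vector
     LIMIT 5;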

    Takeaway

    The primary advantage of this "boring RAG" approach is that it turns a complex system into a standard software engineering task. By treating semantic search as a SQL query and keeping the AI calls at the edges of your pipeline, you ensure that every failure mode (a bad retrieval, the wrong context, a bad answer) is isolated and debuggable.

    Start simple. Once your core pipeline is running smoothly, you can confidently introduce complexity like advanced chunking or hybrid search exactly where the metrics tell you to.

    Next steps

    You can find fully runnable Apache Camel routes for document Q&A, product similarity, and ticket deduplication in the companion GitHub repository.
