Integrate Red Hat AI Inference Server & LangChain in agentic workflows

June 19, 2025
Alexander Barbosa Ayala
Related topics: Artificial intelligence, Containers, Python
Related products: Red Hat AI, Red Hat Enterprise Linux AI, Red Hat OpenShift AI


    Integrating the powerful capabilities of large language models (LLMs) into real-world workflows is one of the most significant advancements in modern application development. This article explores how to integrate Red Hat AI Inference Server with LangChain to build agentic workflows, with a focus on document processing. In this example, Red Hat AI Inference Server serves the LLM, while LangChain manages the application state and workflow.

    About LangChain

    LangChain is a framework designed to facilitate the development of applications powered by large language models. It implements a standard interface for LLMs and related technologies, such as embedding models and vector stores, and integrates with many other providers.

    LangChain can also be integrated with the LangGraph framework, which enables fine-grained control over workflows that integrate LLMs. Therefore, if the use case involves a sequence of steps requiring orchestration and decision-making at key points, LangGraph provides the structure to manage that complexity effectively.
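    To illustrate the decision-making LangGraph enables, here is a minimal, hypothetical sketch of a conditional branch. The node names, routing function, and placeholder logic are illustrative only and are not part of the example application:

    from typing import TypedDict
    from langgraph.graph import END, StateGraph

    class ReviewState(TypedDict):
        compliant: bool

    def route_after_check(state: ReviewState) -> str:
        # decide which node runs next based on the current state
        return "approve" if state["compliant"] else "reject"

    g = StateGraph(ReviewState)
    g.add_node("check", lambda s: {"compliant": True})  # placeholder check
    g.add_node("approve", lambda s: s)
    g.add_node("reject", lambda s: s)
    g.set_entry_point("check")
    g.add_conditional_edges("check", route_after_check, {"approve": "approve", "reject": "reject"})
    g.add_edge("approve", END)
    g.add_edge("reject", END)
    workflow = g.compile()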

    About Red Hat AI Inference Server

    Red Hat AI Inference Server is a robust, enterprise-grade solution for LLM inference and serving, supported and maintained by Red Hat and built on the popular community-supported vLLM inference server.

    One of Red Hat AI Inference Server's great advantages is its flexibility in model serving options, with a wide range of validated models, extensive parametrization, and more. It is also cloud native, allowing it to scale in any hybrid cloud OpenShift environment. For more details, refer to the Red Hat AI Hugging Face space and the Red Hat AI Inference Server official documentation.
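    Because Red Hat AI Inference Server exposes an OpenAI-compatible API (inherited from vLLM), LangChain can reach it through its standard chat model interface. The following is a minimal sketch, assuming a server listening on 127.0.0.1:8000 and serving the model used later in this article:

    from langchain_openai import ChatOpenAI

    llm = ChatOpenAI(
        base_url="http://127.0.0.1:8000/v1",  # RHAIIS OpenAI-compatible endpoint
        api_key="EMPTY",  # vLLM does not require a real key by default
        model="RedHatAI/Meta-Llama-3.1-8B-Instruct-quantized.w8a8",
        temperature=0,
    )
    print(llm.invoke("Does this contract include a termination clause?").content)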

    Use case description

    To demonstrate the integration between LangChain and Red Hat AI Inference Server, a document processing use case was developed. Depending on the LLM's reasoning, the document will be either approved or rejected. 

    The application contains a document-processing agent that extracts information from an uploaded PDF and embeds it in a vector store. This vectorized information is processed by the LLM, which checks the compliance rules and finally responds with a status of Approved or Rejected, all coordinated via LangGraph states.

    Planning the app workflow

    Having described the use case, the next step is to design the graph nodes and workflow (Figure 1). For this use case, there are three main nodes to build:

    • Upload the file: Locate the document at a valid local path. Its contents will be processed by the LLM.
    • Check compliance rules: Verify whether the previously uploaded document satisfies the desired rule; in this case, whether it includes a termination clause.
    • Respond: Generate a final response approving or rejecting the document, including the LLM's reasoning.
    Figure 1: Designed application workflow.

    Translating this design into Python code with LangChain, one option is to define the nodes as functions (you could also define them as classes, depending on the use case and development practices). In this case, each node is defined as a function:

    def upload_and_extract(state):
        print("Loading & Chunking PDF...")
        ...

    def check_compliance_with_retrieval(state):
        print("Checking for 'termination clause'...")
        ...

    def respond(state):
        if state["compliant"]:
            msg = "Document Approved."
        else:
            msg = "Document Rejected. Missing termination clause."
        ...
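    These functions read and write shared keys such as state["compliant"], so the graph needs a shared state schema. Here is a minimal sketch of what that State could look like (the actual fields in app.py may differ):

    from typing import TypedDict

    class State(TypedDict, total=False):
        vectorstore: object  # vector store built from the PDF chunks
        compliant: bool      # result of the compliance check
        response: str        # final approval/rejection message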

    Once the node functions and logic are defined, the next step is to build the state graph that manages the use case workflow:

    # imports (State is the shared state schema defined earlier in app.py)
    from langchain_core.runnables import RunnableLambda
    from langgraph.graph import StateGraph

    # build the state graph
    graph = StateGraph(State)
    graph.add_node("upload", RunnableLambda(upload_and_extract))
    graph.add_node("check", RunnableLambda(check_compliance_with_retrieval))
    graph.add_node("respond", RunnableLambda(respond))
    graph.set_entry_point("upload")
    graph.add_edge("upload", "check")
    graph.add_edge("check", "respond")
    graph.set_finish_point("respond")
    # execute the workflow
    workflow = graph.compile()
    workflow.invoke({})
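    For context, the upload node could be fleshed out along these lines. This is only a sketch built from common LangChain components (PyPDFLoader, RecursiveCharacterTextSplitter, FAISS, and HuggingFaceEmbeddings are assumptions here); the repository's actual implementation may differ:

    import os
    from langchain_community.document_loaders import PyPDFLoader
    from langchain_community.vectorstores import FAISS
    from langchain_huggingface import HuggingFaceEmbeddings
    from langchain_text_splitters import RecursiveCharacterTextSplitter

    def upload_and_extract(state):
        print("Loading & Chunking PDF...")
        # load the PDF referenced by the DOCUMENT_PATH variable from the .env file
        docs = PyPDFLoader(os.environ["DOCUMENT_PATH"]).load()
        # split the pages into overlapping chunks suitable for retrieval
        splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=100)
        chunks = splitter.split_documents(docs)
        # embed the chunks into an in-memory FAISS vector store
        store = FAISS.from_documents(chunks, HuggingFaceEmbeddings())
        return {"vectorstore": store}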

    The full application example, along with its local execution configuration, is detailed in the next section.

    Set up the environment and execute the application

    Once the use case workflow and each node's purpose are designed, it's time to implement the code. The following procedure was tested in an environment with these technical details:

    • OS: Fedora 41
    • GPU: NVIDIA RTX 4060 Ti (16 GB vRAM)
    • Red Hat AI Inference Server (RHAIIS) 3.0.0 image
    • Python 3.12 

    Configure the application environment

    1. Clone the GitHub repository:

      git clone https://github.com/alexbarbosa1989/rhaiis-langchain.git
    2. Move to the application directory:

      cd rhaiis-langchain
    3. Create an .env file to hold sensitive configuration, such as environment variables and credentials. In this case, define the DOCUMENT_PATH variable with the full path to the PDF document that will be analyzed. An example PDF document is provided in the application repository at the rhaiis-langchain/docs/example/contract-template.pdf path:

      echo "DOCUMENT_PATH=/home/user/rhaiis-langchain/docs/example/contract-template.pdf" >> .env

      Verify the contents:

      cat .env
      DOCUMENT_PATH=/home/user/rhaiis-langchain/docs/example/contract-template.pdf
    4. Create a Python virtual environment:

      python3.12 -m venv --upgrade-deps venv
    5. Activate the environment:

      source venv/bin/activate
    6. Install the required libraries to run the application correctly:

      pip install -r requirements.txt

    Start the Red Hat AI Inference Server container

    In another terminal window, start a Red Hat AI Inference Server instance to serve an LLM; in this case, the quantized RedHatAI/Meta-Llama-3.1-8B-Instruct-quantized.w8a8 model is served. Make sure the Podman environment is configured beforehand by following the suggested procedure in the product documentation, specifically the Serving and inferencing with AI Inference Server section:

    podman run -ti --rm --pull=newer \
    --user 0 \
    --shm-size=0 \
    -p 127.0.0.1:8000:8000 \
    --env "HUGGING_FACE_HUB_TOKEN=$HF_TOKEN" \
    --env "HF_HUB_OFFLINE=0" \
    -v ./rhaiis-cache:/opt/app-root/src/.cache  \
    --device nvidia.com/gpu=all \
    --security-opt=label=disable \
    --name rhaiis \
    registry.redhat.io/rhaiis/vllm-cuda-rhel9:3.0.0 \
    --model RedHatAI/Meta-Llama-3.1-8B-Instruct-quantized.w8a8 \
    --max_model_len=4096
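    Once the container is up, you can optionally verify that the model is being served by querying the OpenAI-compatible /v1/models endpoint that vLLM exposes (this sketch assumes the requests library is available):

    import requests

    # list the models served by the local RHAIIS instance
    resp = requests.get("http://127.0.0.1:8000/v1/models", timeout=10)
    print(resp.json())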

    Run the application

    Go back to the terminal session where the application was cloned and the Python virtual environment is activated. There, run the Python application:

    python app.py

    The application will automatically follow the designed workflow, processing the PDF document specified in the DOCUMENT_PATH variable. It will analyze the document's contents using the served LLM to check for the presence of a termination clause. Based on this review, the document will either be approved or rejected. The provided contract-template.pdf document complies with the established rule; therefore, it should be approved:

    Loading & Chunking PDF...
    Checking for 'termination clause'...
    Final Response: Document Approved.
    Reasoning: yes.
    the document contains a termination clause in section 3, which states: "either party may terminate this agreement with 30 days written notice. upon termination, all outstanding balances must be paid." this clause outlines the conditions for terminating the agreement.

    To test a rejected document, update the DOCUMENT_PATH variable in the .env file to point to a document that doesn't contain a termination clause, and re-run the application:

    Loading & Chunking PDF...
    Checking for 'termination clause'...
    Final Response: Document Rejected. Missing termination clause.
    Reasoning: no, the document does not contain a termination clause. 
    a termination clause is a provision in a contract that specifies the conditions under which the contract may be terminated. the document provided appears to be an instruction manual for installing a topcase for a royal enfield himalayan motorcycle, and it does not mention anything about terminating a contract or agreement.

    Finally, it is possible to print the graph based on the configured workflow by uncommenting the last line in app.py and re-running the application:

    # print the graph in ASCII format (OPTIONAL)
    print(workflow.get_graph().draw_ascii())

    Additional expected output:

    +-----------+  
    | __start__ |  
    +-----------+  
          *        
          *        
          *        
      +--------+   
      | upload |   
      +--------+   
          *        
          *        
          *        
      +-------+    
      | check |    
      +-------+    
          *        
          *        
          *        
     +---------+   
     | respond |   
     +---------+   
          *        
          *        
          *        
     +---------+   
     | __end__ |   
     +---------+

    Wrapping up

    This is a basic example that shows one way to integrate LLMs into an application's design. Frameworks like LangChain take the power of LLMs further by adding the workflow and state management needed to fit specific business needs. Incorporating agentic approaches and tools like vector stores for document processing is just one of many advanced capabilities you can leverage with these integrations.

    Next, you can explore more alternatives for serving and integrating RHAIIS. We encourage you to walk through the steps to run OpenAI's Whisper model using Red Hat AI Inference Server on a Red Hat Enterprise Linux 9 environment: Speech-to-text with Whisper and Red Hat AI Inference Server.
