Deliver generative AI at scale with NVIDIA NIM on OpenShift AI

Accelerate your application development at scale

March 26, 2025
Tomer Figenblat
Related topics:
Artificial intelligence, Data science, Hybrid cloud, Microservices
Related products:
Red Hat OpenShift AI

    Native support for NVIDIA NIM microservices is now generally available on Red Hat OpenShift AI to help streamline inferencing for dozens of AI/ML models on a consistent, flexible hybrid cloud platform. NVIDIA NIM, part of the NVIDIA AI Enterprise software platform, is a set of easy-to-use inference microservices for accelerating the deployment of foundation models and keeping your data secure.

    With NVIDIA NIM on OpenShift AI, data scientists, engineers, and application developers can collaborate in a single destination that promotes consistency, security, and scalability, shortening the time to market for applications.

    This how-to article helps you get started with creating and delivering AI-enabled applications with NVIDIA NIM on OpenShift AI.

    Enable NVIDIA NIM

    First, go to the NVIDIA NGC catalog to generate an API key. From the top right profile menu, select the Setup option and click to generate your API key, as shown in Figure 1.

    Figure 1: Generate the API key to use the NVIDIA NGC catalog.

    In your Red Hat OpenShift AI dashboard, locate and click the NVIDIA NIM tile. See Figure 2.

    Figure 2: Locate the NVIDIA NIM app in your OpenShift AI instance.

    Next, click Enable and input the API key that you generated from the NVIDIA NGC catalog in the previous step (Figure 1), and click Submit to enable NVIDIA NIM. See Figure 3.

    Note

    To enable NVIDIA NIM, you must be logged in to OpenShift AI as a user with OpenShift AI administrator privileges.

    Figure 3: Enable NVIDIA NIM.

    Watch for the notification confirming that your API key was validated successfully. See Figure 4.

    Figure 4: Verify validation of the API key.

    Verify the enablement by selecting the Enabled option from the left navigation bar, as marked in Figure 4. Note the NVIDIA NIM card as one of your apps. See Figure 5.

    Figure 5: Verify NVIDIA NIM enablement.

    Create and deploy a model

    Next, we will create a data science project. Data science projects allow you to collect your work—including Jupyter workbenches, storage, data connections, models, and servers—into a single project.

    From the left navigation bar, select Data Science Projects, and click to create a project. Enter a project name and description, then click Create, as shown in Figure 6.

    Figure 6: Create a new data science project.

    Once the project is created, select the model serving platform for your project, as shown in Figure 7.

    Figure 7: Select a model serving platform.

    After selecting the platform, you will be able to click Deploy model; see Figure 8.

    Figure 8: Click to configure a model serving deployment.

    Select your desired model, configure your deployment, and click Deploy. See Figure 9.

    Figure 9: Configure the model deployment and deploy it.

    Wait for the model to be available. See Figure 10.

    Figure 10: A green check mark appears in the tile when the model is available.

    Switch over to the Models tab and take note of your external URL and access token. These are marked in Figure 11.

    Figure 11: Copy the model's external URL and token from the Models tab of the project.
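Before moving on, it can help to confirm that the external URL and token actually work. The following is a minimal sketch, not part of the original walkthrough. It assumes the deployed model exposes the OpenAI-compatible `/v1/models` route and accepts the token as a bearer token. The environment variable names `NIM_URL` and `NIM_TOKEN` are hypothetical placeholders for the two values from the Models tab.

```python
import json
import os
import urllib.request


def build_models_request(external_url: str, token: str) -> urllib.request.Request:
    """Build a GET request for the OpenAI-compatible model listing route."""
    # Assumption: the endpoint serves its API under /v1, like the base_url
    # used in NVIDIA's example excerpt.
    url = external_url.rstrip("/") + "/v1/models"
    return urllib.request.Request(url, headers={"Authorization": f"Bearer {token}"})


def list_model_ids(external_url: str, token: str) -> list[str]:
    """Return the model IDs reported by the endpoint."""
    request = build_models_request(external_url, token)
    with urllib.request.urlopen(request, timeout=30) as response:
        payload = json.load(response)
    return [entry["id"] for entry in payload.get("data", [])]


if __name__ == "__main__":
    # NIM_URL and NIM_TOKEN are hypothetical names chosen for this sketch.
    url, token = os.environ.get("NIM_URL"), os.environ.get("NIM_TOKEN")
    if url and token:
        print(list_model_ids(url, token))
    else:
        print("Set NIM_URL and NIM_TOKEN to the values from the Models tab.")
```

If the call succeeds, the printed list should include the model you deployed. An authorization error usually points to a wrong or missing token.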

    Configure and create a workbench

    Now that the model is deployed, let’s create a workbench. A workbench is an instance of your development environment. In it, you'll find all the tools required for your data science work.

    From the same Data Science Projects Overview tab (or the Workbenches tab), click Create a workbench, as shown in Figure 12.

    Figure 12: Click to create a workbench.

    Describe and create your workbench, as shown in Figure 13.

    Figure 13: Describe and deploy your workbench.

    Wait for the workbench to be in a running state and click Open, as seen in Figure 14. You'll return to the opened workbench next.

    Figure 14: Open the workbench when running.

    Execute example code

    To demonstrate that the model is accessible, we'll use the code excerpt from NVIDIA's build cloud. We'll run it from the workbench we created earlier, replacing only the URL and the token in the excerpt with the ones from our deployed model.

    Locate your model of choice in NVIDIA's build cloud and copy the example excerpt, as shown in Figure 15. The code snippet used in this example follows the figure.

    Figure 15: Copy example excerpt from NVIDIA's build cloud.
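    from openai import OpenAI

    # Client for an OpenAI-compatible endpoint. Replace base_url and api_key
    # with your deployed model's external URL and token before running.
    client = OpenAI(
      base_url = "https://integrate.api.nvidia.com/v1",
      api_key = "$API_KEY_REQUIRED_IF_EXECUTING_OUTSIDE_NGC"
    )

    # Request a streamed chat completion; put your prompt in "content".
    completion = client.chat.completions.create(
      model="meta/llama3-8b-instruct",
      messages=[{"role":"user","content":""}],
      temperature=0.5,
      top_p=1,
      max_tokens=1024,
      stream=True
    )

    # Print each streamed fragment as it arrives, skipping empty chunks.
    for chunk in completion:
      if chunk.choices and chunk.choices[0].delta.content is not None:
        print(chunk.choices[0].delta.content, end="")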
    

    Switch to the workbench you opened in Figure 14, launch a new Python notebook, and install the openai library required to run this example. See Figure 16; the command follows the figure.

    Figure 16: Install the openai Python library inside the workbench.
    !pip install openai

    In the same notebook, paste the code excerpt from Figure 15, replacing base_url and api_key with the external URL and token you noted in Figure 11, and execute it. See the example in Figure 17.

    Figure 17: Execute the modified excerpt from the workbench against the deployed model's external URL.
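As a text alternative to the screenshot, here is a rough sketch of what the modified excerpt might look like. It rests on a few assumptions: the external URL from the Models tab does not already end in `/v1` (the helper appends it only when missing), the deployed model is served under the same name as in NVIDIA's excerpt, and the URL and token are read from the hypothetical environment variables `NIM_URL` and `NIM_TOKEN` rather than pasted into the notebook. The prompt text is only an illustration.

```python
import os


def to_base_url(external_url: str) -> str:
    """Append the /v1 suffix used by OpenAI-compatible clients if it is missing."""
    trimmed = external_url.rstrip("/")
    return trimmed if trimmed.endswith("/v1") else trimmed + "/v1"


def stream_reply(external_url: str, token: str, model: str, prompt: str) -> str:
    """Stream a chat completion from the deployed model and return the full text."""
    # Imported here so the helper above works even before openai is installed.
    from openai import OpenAI

    client = OpenAI(base_url=to_base_url(external_url), api_key=token)
    completion = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": prompt}],
        temperature=0.5,
        top_p=1,
        max_tokens=1024,
        stream=True,
    )
    parts = []
    for chunk in completion:
        # Some chunks carry no choices or no text; skip those.
        if chunk.choices and chunk.choices[0].delta.content is not None:
            parts.append(chunk.choices[0].delta.content)
            print(parts[-1], end="")
    return "".join(parts)


if __name__ == "__main__":
    # NIM_URL and NIM_TOKEN are hypothetical names chosen for this sketch.
    url, token = os.environ.get("NIM_URL"), os.environ.get("NIM_TOKEN")
    if url and token:
        stream_reply(url, token, "meta/llama3-8b-instruct", "Say hello in one sentence.")
    else:
        print("Set NIM_URL and NIM_TOKEN to the values from the Models tab.")
```

Keeping the token in an environment variable instead of the notebook source is a design choice for this sketch. It avoids saving the credential alongside the code.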

    Observe metric graphs

    Now that your model is up and running and you have an environment to work from, let's observe the performance of the model deployment.

    From your project's Models tab, click the model's name, and observe the endpoint performance metric graphs shown in Figure 18.

    Figure 18: Observe performance metrics.

    Switch to the NIM Metrics tab and observe NIM-specific inference-related metric graphs. See Figure 19.

    Figure 19: Observe NIM-specific metrics.
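The dashboard graphs show the server's view. For a rough client-side complement, you could time a streamed response from the workbench. The sketch below is an illustration only, not something the dashboard or NIM provides. The helper names `timed` and `summarize` are made up for this example, and the numbers measure streamed fragments as seen by the client, not tokens as counted by the server.

```python
import time
from typing import Iterable, Iterator


def timed(chunks: Iterable[str], clock=time.perf_counter) -> Iterator[tuple[float, str]]:
    """Yield (seconds since the stream was started, chunk) for each chunk."""
    start = clock()
    for chunk in chunks:
        yield clock() - start, chunk


def summarize(samples: list[tuple[float, str]]) -> dict:
    """Reduce timed chunks to first-chunk latency, total time, and chunk rate."""
    if not samples:
        return {"chunks": 0, "first_chunk_s": None, "total_s": None, "chunks_per_s": None}
    first, total = samples[0][0], samples[-1][0]
    rate = len(samples) / total if total > 0 else None
    return {
        "chunks": len(samples),
        "first_chunk_s": first,
        "total_s": total,
        "chunks_per_s": rate,
    }
```

To use it, wrap any iterable of streamed text fragments in `timed(...)`, collect the results into a list, and pass the list to `summarize`. Comparing the first-chunk latency with the graphs in the dashboard can hint at how much time is spent on the network rather than in the model server.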

    Get started with NVIDIA NIM on OpenShift AI

    We hope you found this short tutorial helpful!

    NVIDIA NIM integration on Red Hat OpenShift AI is now generally available. With this integration, enterprises can increase productivity by implementing generative AI to address real business use cases like expanding customer service with virtual assistants, case summarization for IT tickets, and accelerating business operations with domain-specific copilots.

    Get started today with NVIDIA NIM on Red Hat OpenShift AI. You can also find more information on the OpenShift AI product page.

    Last updated: June 11, 2025
