
Model servers, as illustrated in Figure 1, are very convenient for AI applications. They act as microservices and can abstract the entirety of inference execution, making them agnostic to the training framework and hardware. They also offer easy scalability and efficient resource utilization.

Diagram showing a model server as part of an AI application
Figure 1: A model server as part of an AI application.

Red Hat OpenShift and Kubernetes are optimal places for deploying model servers. However, managing them directly can be a complex task in a large-scale environment. In this article, you'll learn how the OpenVINO Model Server Operator can make it straightforward.

Operator installation

The operator can be easily installed from the OpenShift console. Just navigate to the OperatorHub menu (Figure 2), search for OpenVINO™ Toolkit Operator, then click the Install button.

Screenshot showing the installation of the OpenVINO Toolkit Operator
Figure 2: Install the OpenVINO Toolkit Operator.

Deploying an OpenVINO Model Server in OpenShift

Creating a new instance of the model server is easy in the OpenShift console interface (Figure 3). Click the Create ModelServer button, then fill in the interactive form.

Screenshot showing the creation of a Model Server
Figure 3: Create a Model Server

The default sample parameters deploy a fully functional model server serving the well-known ResNet-50 image classification model, which is hosted in a public cloud storage bucket for anyone to use. We use this model because it saves us the time of training an image classification model from scratch.

In case you have never heard of ResNet-50 before: it is a pre-trained deep learning model for image classification based on a convolutional neural network (CNN), a class of deep neural networks most commonly applied to analyzing images. The 50 in the name indicates that the network is 50 layers deep. The model was trained on a million images in a thousand categories from the ImageNet database.

If you'd rather use the command-line interface (CLI) instead of the OpenShift console, you would use a command like this:

oc apply -f https://raw.githubusercontent.com/openvinotoolkit/operator/main/config/samples/intel_v1alpha1_ovms.yaml

More complex deployments with multiple models or DAG pipelines can also be deployed fairly easily by adding a config.json file into a configmap and linking it with the ModelServer resource.
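As a sketch of what such a configuration might look like, here is a minimal config.json in the standard OVMS model_config_list format. The model names and storage paths below are placeholders, not values from this deployment:

```json
{
  "model_config_list": [
    {
      "config": {
        "name": "resnet",
        "base_path": "gs://example-bucket/resnet50"
      }
    },
    {
      "config": {
        "name": "my-second-model",
        "base_path": "s3://example-bucket/my-second-model"
      }
    }
  ]
}
```

You would store this file in a configmap and reference that configmap from the ModelServer resource instead of specifying a single model.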

In this article, let's stick with the default ResNet model. Once deployed, it creates the resources shown in Figure 4.

Screenshot showing resources for model deployment
Figure 4: Resources for model deployment.

How to run inferences from ovmsclient

In this demonstration, let's create a pod in our OpenShift cluster to act as a client. This can be done from the OpenShift console or from the CLI. We'll use a python:3.8.13 image with a sleep infinity command just to have a place for an interactive shell. We will then submit a JPEG image of a zebra and see whether our model can identify it.

oc create deployment client-test --image=python:3.8.13 -- sleep infinity

oc exec -it $(oc get pod -o jsonpath="{.items[0].metadata.name}" -l app=client-test) -- bash

From the interactive shell inside the client container, let's quickly test connectivity with the model server and check the model parameters.

curl http://model-server-sample-ovms:8081/v1/config
{
  "resnet": {
    "model_version_status": [
      {
        "version": "1",
        "state": "AVAILABLE",
        "status": {
          "error_code": "OK",
          "error_message": "OK"
        }
      }
    ]
  }
}
Other REST API calls are described in the OpenVINO API reference guide.
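The /v1/config response shown above can also be checked programmatically before sending traffic. Below is a minimal sketch; in the cluster you would fetch the JSON over HTTP (for example with urllib.request against the same URL), while here the response captured above is parsed directly:

```python
import json

# Response body captured from the /v1/config call above.
raw = """
{"resnet": {"model_version_status": [
    {"version": "1", "state": "AVAILABLE",
     "status": {"error_code": "OK", "error_message": "OK"}}]}}
"""

def available_versions(config: dict, model: str) -> list:
    """Return the version strings of a model that are in the AVAILABLE state."""
    statuses = config.get(model, {}).get("model_version_status", [])
    return [s["version"] for s in statuses if s["state"] == "AVAILABLE"]

config = json.loads(raw)
print(available_versions(config, "resnet"))  # ['1']
```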

Now let's use the Python library ovmsclient to run the inference request:

 python3 -m venv /tmp/venv
 source /tmp/venv/bin/activate
 pip install ovmsclient

We'll download a zebra picture to test out the classification:

curl https://raw.githubusercontent.com/openvinotoolkit/model_server/main/demos/common/static/images/zebra.jpeg -o /tmp/zebra.jpeg

Image of a zebra
Figure 5: Picture of a zebra used for prediction.
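Since we will send the raw file bytes to the server, a quick sanity check that the downloaded file really is a JPEG can save debugging time. A small sketch; the magic-byte check is generic, and the path is the one used above:

```python
def looks_like_jpeg(data: bytes) -> bool:
    """JPEG files start with the SOI marker bytes FF D8 FF."""
    return data[:3] == b"\xff\xd8\xff"

# Inline sample of a typical JPEG/JFIF header start:
sample = b"\xff\xd8\xff\xe0" + b"\x00" * 8
print(looks_like_jpeg(sample))  # True

# Against the downloaded file:
# with open("/tmp/zebra.jpeg", "rb") as f:
#     print(looks_like_jpeg(f.read()))
```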

Below are the Python commands that will display the model metadata using the ovmsclient library:

 from ovmsclient import make_grpc_client
 client = make_grpc_client("model-server-sample-ovms:8080")
 model_metadata = client.get_model_metadata(model_name="resnet")

Those commands produce the following response:

{'model_version': 1,
 'inputs': {'map/TensorArrayStack/TensorArrayGatherV3:0':
            {'shape': [-1, -1, -1, -1], 'dtype': 'DT_FLOAT'}},
 'outputs': {'softmax_tensor': {'shape': [-1, 1001], 'dtype': 'DT_FLOAT'}}}
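In this metadata, -1 marks a dynamic dimension: the input accepts any batch size and image dimensions, and the output is a softmax over 1,001 classes (ImageNet's thousand categories plus a background class in this model's indexing). A small helper to read shapes like these, using a dict that mirrors the response above:

```python
def describe_shape(shape):
    """Render a model tensor shape, marking -1 entries as dynamic."""
    return " x ".join("dynamic" if d == -1 else str(d) for d in shape)

metadata = {
    "inputs": {"map/TensorArrayStack/TensorArrayGatherV3:0":
               {"shape": [-1, -1, -1, -1], "dtype": "DT_FLOAT"}},
    "outputs": {"softmax_tensor": {"shape": [-1, 1001], "dtype": "DT_FLOAT"}},
}
for name, info in metadata["outputs"].items():
    print(name, "->", describe_shape(info["shape"]))  # softmax_tensor -> dynamic x 1001
```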

Now you can create a Python script with basic client content:

cat > /tmp/predict.py <<EOL
from ovmsclient import make_grpc_client
import numpy as np
client = make_grpc_client("model-server-sample-ovms:8080")
with open("/tmp/zebra.jpeg", "rb") as f:
    data = f.read()
inputs = {"map/TensorArrayStack/TensorArrayGatherV3:0": data}
results = client.predict(inputs=inputs, model_name="resnet")
print("Detected class:", np.argmax(results))
EOL

 python /tmp/predict.py
 Detected class: 341

The ImageNet database behind this model contains a thousand classes, and class ID 341 corresponds to zebra. Our image was therefore classified correctly: it is confirmed as a zebra!
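np.argmax returns only the single best class. If you want a confidence ranking, the same softmax output of shape (1, 1001) can be sorted for the top matches. A minimal sketch using a mock output array in place of the server response:

```python
import numpy as np

def top_k(softmax_output, k=5):
    """Return (class_id, score) pairs for the k highest-scoring classes."""
    scores = np.asarray(softmax_output).ravel()
    ids = np.argsort(scores)[::-1][:k]
    return [(int(i), float(scores[i])) for i in ids]

# Mock a (1, 1001) softmax vector peaking at class 341 (zebra).
mock = np.full((1, 1001), 0.0005)
mock[0, 341] = 0.95
print(top_k(mock, k=3)[0])  # (341, 0.95)
```

In the client script above, you would pass `results` (the value returned by `client.predict`) to `top_k` instead of the mock array.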


As you've seen, the OpenVINO Model Server can be easily deployed and used in OpenShift and Kubernetes environments. In this article, you learned how to run predictions using the ovmsclient Python library.

You can learn more about the Operator and check out other demos with OpenVINO Model Server.

Last updated: October 6, 2022