
Perform inference using Intel OpenVINO Model Server on OpenShift

September 30, 2022
By Audrey Reznik, Darius Trawinski, and Ryan Loney
Related topics: Artificial intelligence, Data Science
Related products: Red Hat OpenShift Container Platform


    Model servers, as illustrated in Figure 1, are very convenient for AI applications. They act as microservices and can abstract the entirety of inference execution, making them agnostic to the training framework and hardware. They also offer easy scalability and efficient resource utilization.

    Figure 1: A model server as part of an AI application.
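    The framework-agnostic idea can be sketched with a toy, stdlib-only model "server." This is an illustration, not OVMS: the endpoint, payload shape, and dummy model are all made up. The point is that the client only exchanges JSON over HTTP and never touches the framework behind the endpoint.

    ```python
    import json
    import threading
    import urllib.request
    from http.server import BaseHTTPRequestHandler, HTTPServer

    def dummy_model(values):
        # Stand-in for any framework's inference call: index of the max score.
        return max(range(len(values)), key=lambda i: values[i])

    class PredictHandler(BaseHTTPRequestHandler):
        def do_POST(self):
            length = int(self.headers["Content-Length"])
            body = json.loads(self.rfile.read(length))
            payload = json.dumps({"class_id": dummy_model(body["inputs"])}).encode()
            self.send_response(200)
            self.send_header("Content-Type", "application/json")
            self.end_headers()
            self.wfile.write(payload)

        def log_message(self, *args):
            pass  # keep the demo quiet

    # Serve on an ephemeral port in a background thread.
    server = HTTPServer(("127.0.0.1", 0), PredictHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()

    # The client sees only an HTTP endpoint, not the model's framework.
    req = urllib.request.Request(
        f"http://127.0.0.1:{server.server_port}/predict",
        data=json.dumps({"inputs": [0.1, 0.7, 0.2]}).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        answer = json.loads(resp.read())
    print(answer)  # {'class_id': 1}
    server.shutdown()
    ```

    Swapping the model behind `dummy_model` for a different framework changes nothing on the client side, which is exactly the property that makes model servers easy to scale and replace.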

    Red Hat OpenShift and Kubernetes are optimal places for deploying model servers. However, managing them directly can be a complex task in a large-scale environment. In this article, you'll learn how the OpenVINO Model Server Operator can make it straightforward.

    Operator installation

    The operator can be easily installed from the OpenShift console. Just navigate to the OperatorHub menu (Figure 2), search for OpenVINO™ Toolkit Operator, then click the Install button.

    Figure 2: Install the OpenVINO Toolkit Operator.

    Deploying an OpenVINO Model Server in OpenShift

    Creating a new instance of the model server is easy in the OpenShift console interface (Figure 3). Click the Create ModelServer button, then fill in the interactive form.

    Figure 3: Create a Model Server.

    The default sample parameters deploy a fully functional model server with the well-known ResNet-50 image classification model. The model is publicly hosted, so using it saves us the time of creating and training our own image classification model from scratch.

    In case you have never heard of ResNet-50 before: it is a pre-trained deep learning model for image classification based on a convolutional neural network, a class of deep neural networks most commonly applied to analyzing images. The 50 in the name indicates that the model is 50 layers deep. It was trained on over a million images across a thousand categories from the ImageNet database.

    If you'd rather use the command-line interface (CLI) instead of the OpenShift console, you would use a command like this:

    oc apply -f https://raw.githubusercontent.com/openvinotoolkit/operator/main/config/samples/intel_v1alpha1_ovms.yaml
    

    More complex setups with multiple models or DAG pipelines can also be deployed fairly easily by adding a config.json file to a ConfigMap and linking it with the ModelServer resource.
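    As a rough sketch, such a config.json follows the OpenVINO Model Server multi-model configuration format; the model names and storage paths below are hypothetical placeholders:

    ```json
    {
      "model_config_list": [
        {
          "config": {
            "name": "resnet",
            "base_path": "gs://ovms-public-eu/resnet50"
          }
        },
        {
          "config": {
            "name": "my-second-model",
            "base_path": "s3://my-bucket/models/second-model"
          }
        }
      ]
    }
    ```

    You would then create a ConfigMap from this file (for example, with oc create configmap --from-file=config.json) and reference it from the ModelServer resource, as described in the operator documentation.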

    In this article, let's check the usage with the default ResNet model. Once deployed, it will create the resources shown in Figure 4.

    Figure 4: Resources for model deployment.

    How to run inferences from ovmsclient

    In this demonstration, let's create a pod in our OpenShift cluster that will act as a client. This can be done from the OpenShift console or from the CLI. We'll use a python:3.8.13 image with a sleep infinity command just to have a place for an interactive shell. We will submit a jpeg image of a zebra and see if the image can be identified by our model.

    oc create deployment client-test --image=python:3.8.13 -- sleep infinity

    oc exec -it $(oc get pod -o jsonpath="{.items[0].metadata.name}" -l app=client-test) -- bash

    From the interactive shell inside the client container, let's quickly test connectivity with the model server and check the model parameters.

    
    curl http://model-server-sample-ovms:8081/v1/config
    {
      "resnet": {
        "model_version_status": [
          {
            "version": "1",
            "state": "AVAILABLE",
            "status": {
              "error_code": "OK",
              "error_message": "OK"
            }
          }
        ]
      }
    }
    
    

    Other REST API calls are described in the OpenVINO API reference guide.

    Now let's use the Python library ovmsclient to run the inference request:

    
     python3 -m venv /tmp/venv
     source /tmp/venv/bin/activate
     pip install ovmsclient
    
    

    We'll download a zebra picture to test out the classification:

    curl https://raw.githubusercontent.com/openvinotoolkit/model_server/main/demos/common/static/images/zebra.jpeg -o /tmp/zebra.jpeg
    
    
    Image of a zebra
    Figure 5: Picture of a zebra used for prediction.

    Below are the Python commands that will display the model metadata using the ovmsclient library:

    
     from ovmsclient import make_grpc_client
     client = make_grpc_client("model-server-sample-ovms:8080")
     model_metadata = client.get_model_metadata(model_name="resnet")
     print(model_metadata)
    
    

    Those commands produce the following response:

    
    {'model_version': 1,
     'inputs': {'map/TensorArrayStack/TensorArrayGatherV3:0':
                {'shape': [-1, -1, -1, -1], 'dtype': 'DT_FLOAT'}},
     'outputs': {'softmax_tensor': {'shape': [-1, 1001], 'dtype': 'DT_FLOAT'}}}
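    The -1 entries mark dynamic dimensions (any batch size or image size). As a small standalone sketch, here is how you might pull the input name and class count out of a metadata dict like the one above (hard-coded here so the snippet runs without a live server):

    ```python
    # The metadata dict returned by client.get_model_metadata(), hard-coded
    # here so this snippet runs without a live model server.
    metadata = {
        "model_version": 1,
        "inputs": {
            "map/TensorArrayStack/TensorArrayGatherV3:0": {
                "shape": [-1, -1, -1, -1], "dtype": "DT_FLOAT"
            }
        },
        "outputs": {
            "softmax_tensor": {"shape": [-1, 1001], "dtype": "DT_FLOAT"}
        },
    }

    # This model has a single input and a single output, so take the first key.
    input_name = next(iter(metadata["inputs"]))
    output_name = next(iter(metadata["outputs"]))
    num_classes = metadata["outputs"][output_name]["shape"][-1]

    print(input_name)   # map/TensorArrayStack/TensorArrayGatherV3:0
    print(num_classes)  # 1001 (1000 ImageNet classes plus a background class)
    ```

    Reading the input name from metadata, rather than hard-coding it, keeps a client working if the model is exported with different tensor names.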
    
    

    Now you can create a Python script with basic client content:

    
     cat > /tmp/predict.py <<EOL
     from ovmsclient import make_grpc_client
     import numpy as np
     client = make_grpc_client("model-server-sample-ovms:8080")
     with open("/tmp/zebra.jpeg", "rb") as f:
         data = f.read()
     inputs = {"map/TensorArrayStack/TensorArrayGatherV3:0": data}
     results = client.predict(inputs=inputs, model_name="resnet")
     print("Detected class:", np.argmax(results))
     EOL

     python /tmp/predict.py
     Detected class: 341
    

    In the ImageNet database, which contains a thousand classes, our picture was matched to class ID 341, the ID associated with zebra. This means our image was successfully classified as a zebra!
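    The predict() call returns the raw softmax tensor, and np.argmax() picks only the single best class. If you want the top few candidates instead, a small sketch (using a dummy (1, 1001) array in place of a real server response so it runs offline):

    ```python
    import numpy as np

    # Dummy stand-in for the (1, 1001) softmax tensor that client.predict()
    # returns; in the real client, `results` comes from the model server.
    results = np.zeros((1, 1001), dtype=np.float32)
    results[0, 341] = 0.92  # pretend "zebra" scored highest
    results[0, 340] = 0.05  # runner-up

    # argsort gives ascending order; reverse it and take the first five.
    top5 = np.argsort(results[0])[::-1][:5]
    print("Top-5 class IDs:", top5.tolist())
    print("Best class:", int(np.argmax(results)))  # Best class: 341
    ```

    Reporting several candidates with their scores is often more useful than a single ID, since it shows how confident the model actually was.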

    Conclusion

    As you've seen, the OpenVINO Model Server can be easily deployed and used in OpenShift and Kubernetes environments. In this article, you learned how to run predictions using the ovmsclient Python library.

    You can learn more about the Operator and check out other demos with OpenVINO Model Server.

    Last updated: October 20, 2023
