Now that the model server is ready, we can deploy the model.

Click the Deployed models tab. Initially, it shows “0” because no models have been deployed yet. Now click Deploy model (Figure 10).

Figure 10: The Deploy model button.

Enter the following information for your new model:

  • Model Name: The name you want to give to your model (e.g., coolstore).
  • Model framework: The framework used to save this model. At this time, ONNX and OpenVINO IR are supported.
  • Model location: Select the data connection that you created to store the model. Alternatively, you can create another data connection directly from this menu.
  • Folder path: If your model is not located at the root of your data connection’s bucket, enter the path to the folder that contains it. In this example, the model was stored in the “coolstore-model” folder.

When your configuration is complete, click Deploy (Figure 11).

Figure 11: Deploy the model.

The model will now be deployed. You will see the status icon spinning while this is in progress (Figure 12).

Figure 12: The status shows the model deploying.

When the model has finished deploying, the status icon will be a green checkmark (Figure 13).

Figure 13: The status is now green, indicating that the model deployment is complete.

The model is now accessible through the model server’s API endpoint. The endpoint information differs depending on how you configured the model server.

  • If you did not expose the model externally through a route, click on the Internal Service link in the Inference endpoint section (Figure 14).

 

Figure 14: The Internal Service Inference endpoint link.

A popup displays the gRPC and REST URLs (Figure 15).

Figure 15: The gRPC and REST URLs.

Notes:

  • The REST URL displayed is only the base address of the endpoint. You must append /v2/models/name-of-your-model/infer to it to obtain the full address. Example: http://modelmesh-serving.model-serving:8008/v2/models/coolstore/infer
  • The full documentation of the API (REST and gRPC) is available here.
  • The gRPC proto file for the Model Server is available here.
  • If you have exposed the model through an external route, the Inference endpoint section displays the full URL that you can copy (Figure 16).
Figure 16: The external route to the inference endpoint.

Note: Even when you expose the model through an external route, the internal endpoints are still available. They use the following formats:

  • REST: http://modelmesh-serving.name-of-your-project:8008/v2/models/name-of-your-model/infer
  • gRPC: grpc://modelmesh-serving.name-of-your-project:8033
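Once you have the REST URL, you can test the deployment by sending a request to the /v2/models/name-of-your-model/infer endpoint described above. The following is a minimal Python sketch, not a definitive recipe: it assumes the internal URL and the coolstore model name from this example, and the input name, shape, datatype, and data values are placeholders that must be replaced to match your model’s actual input signature.

import requests

# Hypothetical values: adjust to your own project, endpoint, and model name.
base_url = "http://modelmesh-serving.model-serving:8008"  # internal REST endpoint
model_name = "coolstore"

# v2 inference protocol request body. The input name, shape, datatype, and
# data below are placeholders; they must match your model's signature.
payload = {
    "inputs": [
        {
            "name": "input_1",
            "shape": [1, 4],
            "datatype": "FP32",
            "data": [0.1, 0.2, 0.3, 0.4],
        }
    ]
}

response = requests.post(
    f"{base_url}/v2/models/{model_name}/infer",
    json=payload,
    timeout=30,
)
response.raise_for_status()

# A successful call returns a JSON body with an "outputs" list
# containing the model's predictions.
print(response.json())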

Your model is now deployed and ready to use!