Now that the model server is ready, we can deploy the model.
Click on the Deployed models tab. Initially it shows “0” because no models are deployed yet. Now click Deploy model (Figure 10).
Enter the following information for your new model:
- Model Name: The name you want to give to your model (e.g., coolstore).
- Model framework: The framework used to save this model. At this time, ONNX and OpenVINO IR are supported.
- Model location: Select the data connection that you created to store the model. Alternatively, you can create another data connection directly from this menu.
- Folder path: If your model is not located at the root of the bucket of your data connection, you must enter the path to the folder it is in. In this example, the model was stored in the “coolstore-model” folder.
When your configuration is done, click on Deploy (Figure 11).
The model will now be deployed. The status icon spins during this process (Figure 12).
When the model has finished deploying, the status icon will be a green checkmark (Figure 13).
The model is now accessible through the API endpoint of the model server. The endpoint details differ depending on how you configured the model server.
- If you did not expose the model externally through a route, click on the Internal Service link in the Inference endpoint section (Figure 14).
A popup will display the address for the gRPC and the REST URLs (Figure 15).
- The REST URL displayed is only the base address of the endpoint. You must append /v2/models/name-of-your-model/infer to it to form the full address. Example: http://modelmesh-serving.model-serving:8008/v2/models/coolstore/infer
- The full documentation of the API (REST and gRPC) is available here.
- The gRPC proto file for the Model Server is available here.
- If you have exposed the model through an external route, the Inference endpoint section displays the full URL, which you can copy (Figure 16).
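To make the URL construction above concrete, here is a minimal sketch of building and sending a v2 inference request in Python. The base URL and model name match the example above, but the input tensor name, shape, and datatype are placeholders: they must match what your specific model expects.

```python
import json
from urllib import request

def build_infer_request(base_url, model_name, data):
    """Build the full v2 inference URL and a minimal request payload."""
    url = f"{base_url}/v2/models/{model_name}/infer"
    payload = {
        "inputs": [
            {
                # Placeholder tensor definition -- replace the name,
                # shape, and datatype with your model's actual inputs.
                "name": "input",
                "shape": [1, len(data)],
                "datatype": "FP32",
                "data": data,
            }
        ]
    }
    return url, payload

if __name__ == "__main__":
    url, payload = build_infer_request(
        "http://modelmesh-serving.model-serving:8008", "coolstore", [0.1, 0.2, 0.3]
    )
    req = request.Request(
        url,
        data=json.dumps(payload).encode(),
        headers={"Content-Type": "application/json"},
    )
    # response = request.urlopen(req)  # only reachable from inside the cluster
    print(url)
```

Note that the internal service address only resolves from inside the cluster, which is why the actual call is left commented out.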
Note: Even when you expose the model through an external route, the internal endpoints remain available. They use this format:
- REST: http://modelmesh-serving.name-of-your-project:8008/v2/models/name-of-your-model/infer
- gRPC: grpc://modelmesh-serving.name-of-your-project:8033
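The two internal endpoint patterns above can be expressed as simple helpers; the project and model names passed in below are illustrative examples, not fixed values.

```python
def internal_rest_endpoint(project, model):
    """REST inference URL reachable from inside the cluster."""
    return f"http://modelmesh-serving.{project}:8008/v2/models/{model}/infer"

def internal_grpc_endpoint(project):
    """gRPC target reachable from inside the cluster."""
    return f"grpc://modelmesh-serving.{project}:8033"

print(internal_rest_endpoint("my-project", "coolstore"))
print(internal_grpc_endpoint("my-project"))
```

Note that the gRPC address carries no model name; the target model is specified in the gRPC request itself.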
Your model is now deployed and ready to use!