Open Data Hub is an open source project providing an end-to-end artificial intelligence and machine learning (AI/ML) platform that runs on Red Hat OpenShift. As we explained in our previous article, we see real potential and value in the Kubeflow project, and we've enabled Kubeflow 0.7 on Red Hat OpenShift 4.2. Kubeflow installs multiple AI/ML components and requires Istio to control and route service traffic.
As part of the Open Data Hub project, we've also integrated Kubeflow with Red Hat OpenShift Service Mesh. In this article, we present Red Hat OpenShift Service Mesh as an alternative to the native Kubeflow Istio installation, especially for users who already have OpenShift Service Mesh installed on their cluster.
Red Hat OpenShift Service Mesh
For scalability and fault tolerance, most cloud-based applications are built from many interdependent microservices. A service mesh is an infrastructure layer that connects microservices working together in a shared-services architecture. OpenShift Service Mesh is based on Istio and provides similar mechanisms for securing, controlling, and routing microservices, while adding extra features and a simpler installation experience.
As a key feature, OpenShift Service Mesh supports a multi-tenant control plane, which makes it easy to manage multiple service-mesh ecosystems within a single cluster. Simply declare a namespace in your instance of the Istio Service Mesh Member Roll and it will be included in a service mesh.
OpenShift Service Mesh also comes with Kiali and Jaeger installed by default. Kiali provides an interactive graph of the microservices in your service mesh, where you can see how services are connected and what controls are imposed for each one. Jaeger is an open source distributed-tracing platform that lets you monitor and troubleshoot interactions between microservices.
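Once the service mesh is installed, both consoles are exposed as OpenShift routes in the control plane namespace. A quick way to find them from the CLI, assuming the default route names (kiali and jaeger) and the istio-system control plane namespace used later in this article:

$ oc get route kiali jaeger -n istio-system

The HOST column of each route gives the console URL.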
Later in the article, we'll guide you through the process of installing OpenShift Service Mesh and its required components, which you can do easily from Red Hat OpenShift Container Platform's OperatorHub.
Note: See the Red Hat OpenShift Service Mesh documentation for more detailed information.
Integrating Kubeflow 0.7 with Red Hat OpenShift Service Mesh
We had to make several changes to integrate Kubeflow 0.7 with Red Hat OpenShift Service Mesh. In this section, we'll briefly describe the changes, which you can also find listed on Open Data Hub's GitHub repository.
Allow traffic to the Katib controller service
OpenShift Service Mesh defines network policies for all namespaces that are listed as members of a given service mesh. Traffic that is not explicitly allowed by a network policy is blocked. We added an overlay to the Kubeflow Istio component that creates a network policy allowing traffic to the Katib controller service:
kind: NetworkPolicy
apiVersion: networking.k8s.io/v1
metadata:
  name: allow-ingress-to-kubeflow
  namespace: kubeflow
spec:
  podSelector:
    matchLabels:
      app: katib-controller
  ingress:
    - from:
        - namespaceSelector: {}
    - ports:
        - protocol: TCP
          port: 443
  policyTypes:
    - Ingress
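After Kubeflow is installed, you can verify that this policy was created in the kubeflow namespace (a quick sanity check):

$ oc get networkpolicy -n kubeflow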
Support for the multi-tenant control plane
In the same overlay, we replaced Istio's ClusterRbacConfig with ServiceMeshRbacConfig, which supports a multi-tenant control plane. We also replaced the ClusterRoleBinding with a project-scoped RoleBinding.
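For illustration, here is a minimal sketch of what a namespace-scoped ServiceMeshRbacConfig can look like. The apiVersion, mode, and inclusion values below are assumptions for this sketch; check the overlay in the Open Data Hub repository and the CRDs installed by your Service Mesh version for the exact definition:

# Hypothetical sketch of a namespace-scoped RBAC configuration for the mesh
apiVersion: rbac.maistra.io/v1   # assumption; verify against your installed CRDs
kind: ServiceMeshRbacConfig
metadata:
  # like ClusterRbacConfig, the resource is named default
  name: default
  namespace: kubeflow
spec:
  mode: ON_WITH_INCLUSION
  inclusion:
    services: ["*"]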
A new component and Kfctl file
We added a new component within the istio-system namespace that creates a ServiceMeshMemberRoll resource and adds the kubeflow namespace as a member:
apiVersion: maistra.io/v1
kind: ServiceMeshMemberRoll
metadata:
  # the resource must be named default
  name: default
  namespace: istio-system
spec:
  members:
    # a list of projects joined into the service mesh
    - kubeflow
We also added a new kfctl configuration file for the service mesh, kfdef/kfctl_openshift_servicemesh.yaml, which kfctl uses to deploy and manage Kubeflow within OpenShift Service Mesh.
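To give a sense of the file's shape, here is an abbreviated, hypothetical KfDef sketch; the application name and path below are placeholders, so refer to kfdef/kfctl_openshift_servicemesh.yaml in the repository for the real contents:

# Abbreviated, hypothetical sketch of a KfDef configuration file
apiVersion: kfdef.apps.kubeflow.org/v1
kind: KfDef
metadata:
  name: kfctl-openshift-servicemesh
  namespace: kubeflow
spec:
  applications:
    # each application points kfctl at a kustomize path in the manifests repo
    - kustomizeConfig:
        repoRef:
          name: manifests
          path: placeholder/path-to-component   # placeholder path
      name: example-component                   # placeholder name
  repos:
    # the sed command in Step 4 below rewrites this uri to the local clone
    - name: manifests
      uri: https://github.com/opendatahub-io/manifests/tarball/v0.7.0-branch-openshift
  version: v0.7.0-branch-openshift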
Installing Kubeflow 0.7 with Red Hat OpenShift Service Mesh
We described the prerequisites for installing Kubeflow 0.7 in our previous article. Mainly, you will need an OpenShift 4.2 or higher cluster and the kfctl command-line tool. This section guides you through the installation.
Step 1: Install Red Hat OpenShift Service Mesh
It is crucial to install the version of Red Hat OpenShift Service Mesh that matches your OpenShift cluster. For this example, we'll use an OpenShift Container Platform 4.2 installation. Installing OpenShift Service Mesh from OperatorHub involves installing the following operators:
- Elasticsearch Operator
- Jaeger Operator
- Kiali Operator
- Red Hat Service Mesh Operator
Be sure to wait for confirmation of each successful installation before continuing.
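You can also confirm the installations from the CLI. Assuming the operators were installed into the openshift-operators namespace (the default for cluster-wide operators), each ClusterServiceVersion should eventually report a Succeeded phase:

$ oc get csv -n openshift-operators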
Step 2: Create an instance of ServiceMeshControlPlane
After installing all of the required operators, create an instance of ServiceMeshControlPlane. The default example provided in the console will work. You do not need to create a ServiceMeshMemberRoll instance; that will be created automatically when you install Kubeflow.
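If you prefer the CLI over the console, the following is a minimal sketch of a ServiceMeshControlPlane resource, roughly equivalent to accepting the console's default example; exact fields vary between Service Mesh versions, so treat it as a starting point rather than a definitive configuration:

# Minimal ServiceMeshControlPlane sketch (field names may differ by Service Mesh version)
apiVersion: maistra.io/v1
kind: ServiceMeshControlPlane
metadata:
  name: basic-install
  namespace: istio-system
spec:
  istio:
    # Kiali and Jaeger tracing are enabled by default in OpenShift Service Mesh
    kiali:
      enabled: true
    tracing:
      enabled: true

Save it as, for example, smcp.yaml and create it with oc apply -f smcp.yaml.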
Step 3: Clone the opendatahub-manifest fork repo
From a terminal, log in to the OpenShift Container Platform cluster and clone the opendatahub-manifest fork repo, which defaults to the v0.7.0-branch-openshift branch, as shown:
$ git clone https://github.com/opendatahub-io/manifests.git
$ cd manifests
Step 4: Install Kubeflow
Next, use the OpenShift Service Mesh configuration file and the locally downloaded manifests to install Kubeflow:
$ sed -i 's#uri: .*#uri: '$PWD'#' ./kfdef/kfctl_openshift_servicemesh.yaml
  (on macOS, try: sed -i "" 's#uri: .*#uri: '$PWD'#' ./kfdef/kfctl_openshift_servicemesh.yaml)
$ kfctl build --file=kfdef/kfctl_openshift_servicemesh.yaml -V
$ kfctl apply --file=./kfdef/kfctl_openshift_servicemesh.yaml -V
Note that at the time of this writing, we are addressing a Kubeflow installation bug that prevents the manifests from being downloaded during the build process, which is why these commands point kfctl at the locally cloned manifests.
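Once kfctl apply completes, you can watch the Kubeflow components start up before moving on to the next step:

$ oc get pods -n kubeflow

All pods should eventually reach the Running state.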
Step 5: Check the virtual services
Check the virtual services that were created by Kubeflow components:
$ oc get virtualservices -n kubeflow
NAME                         GATEWAYS             HOSTS                       AGE
argo-ui                      [kubeflow-gateway]   [*]                         4m20s
centraldashboard             [kubeflow-gateway]   [*]                         4m19s
google-api-vs                                     [www.googleapis.com]        4m23s
google-storage-api-vs                             [storage.googleapis.com]    4m23s
grafana-vs                   [kubeflow-gateway]   [*]                         4m22s
jupyter-web-app              [kubeflow-gateway]   [*]                         4m12s
katib-ui                     [kubeflow-gateway]   [*]                         4m1s
kfam                         [kubeflow-gateway]   [*]                         3m52s
metadata-grpc                [kubeflow-gateway]   [*]                         4m9s
metadata-ui                  [kubeflow-gateway]   [*]                         4m9s
ml-pipeline-tensorboard-ui   [kubeflow-gateway]   [*]                         3m56s
ml-pipeline-ui               [kubeflow-gateway]   [*]                         3m55s
tensorboard                  [kubeflow-gateway]   [*]                         4m6s
Also, check the Kubeflow gateway:
$ oc get gateways -n kubeflow
NAME               AGE
kubeflow-gateway   5m35s
That completes the installation. Next, let's get started with Kubeflow on OpenShift Service Mesh.
Access the Kubeflow portal
To access the Kubeflow portal, go to the istio-system namespace and click the istio-ingressgateway route under the Networking menu item (you can also get the URL from the command line, as shown below). Kubeflow will ask you to specify a namespace where you can run pipelines and Jupyter Notebook servers. Enter a namespace, as shown in Figure 1.
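As mentioned above, you can fetch the portal URL directly from the route, assuming the default route name istio-ingressgateway in the istio-system namespace:

$ oc get route istio-ingressgateway -n istio-system -o jsonpath='{.spec.host}{"\n"}'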
Kubeflow creates this namespace, so we have to add it to the ServiceMeshMemberRoll. (The ServiceMeshMemberRoll was created when we installed Kubeflow, but at the moment we have to add the namespace manually. We're working on automating this process.)
To add the namespace, go to the istio-system namespace --> Installed Operators --> Red Hat OpenShift Service Mesh --> Istio Service Mesh Member Roll --> default --> YAML. Add your namespace under spec.members, as shown in Figure 2. Note that the figure shows this for the namespace nakfour.
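If you prefer the command line, a JSON patch achieves the same result; for example, to add the nakfour namespace shown in Figure 2 (oc edit servicemeshmemberroll default -n istio-system also works if you'd rather modify the YAML by hand):

$ oc patch servicemeshmemberroll default -n istio-system --type=json \
    -p '[{"op": "add", "path": "/spec/members/-", "value": "nakfour"}]'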
Create a Jupyter Notebook server
At this point, you should be able to create a Jupyter Notebook server. Be sure to select the namespace you added to the ServiceMeshMemberRoll, and select the custom notebook image from our previous article. These steps are shown in Figure 3.
Figure 3. Create a Jupyter Notebook server.
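After the notebook server starts, you can check its pod from the CLI; for example, for the nakfour namespace used earlier:

$ oc get pods -n nakfour

The notebook server pod should reach the Running state.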
Conclusion and next steps
The Open Data Hub team is currently working on enhancements and features for Open Data Hub. These include using the kfctl operator to co-install integrated Open Data Hub and Kubeflow components, automating the Red Hat OpenShift Service Mesh installation, and automating the process of adding namespaces created by the Kubeflow profile controller to the ServiceMeshMemberRoll. You can find all of our work in the Open Data Hub GitHub repository and the Open Data Hub GitLab repository. We also invite you to join us for our bi-weekly community meetings.