
Machine learning and artificial intelligence applications require substantial resources to run in production scenarios. But you can develop and test these applications on a cluster environment that runs on your laptop. In this article, you'll learn how to properly customize Red Hat OpenShift and Red Hat CodeReady Containers so that you can quickly set up a clustering environment where you can run open source machine learning tools from Open Data Hub.

An AI environment on your laptop

It may seem surprising to say that you can do serious work in the AI/ML space on your home computer. But these days, laptops are no longer just for internet surfing, and a well-equipped model has substantial resources. For instance, the specifications of my laptop are on par with those of a typical server:

  • CPU: Intel 10th gen Comet Lake H Series Core i7 (12 cores)
  • Memory: 64GB
  • Disk: 1TB

With such a great computer, you can now easily try hot technologies such as Kubernetes and deploy machine learning tools locally. This article discusses how to install Red Hat OpenShift, which is a Kubernetes-based environment, on a laptop, and how to prepare Red Hat OpenShift to install AI tools.

Red Hat OpenShift provides a command-line interface (oc), a versatile graphical interface, and other conveniences. Red Hat offers a fully managed cloud environment based on OpenShift, but you can also download OpenShift to your local system and get essentially the same environment for development and testing. CodeReady Containers provides the easiest way to run a local version of Red Hat OpenShift. If you haven't already installed CodeReady Containers, check out Red Hat's Getting Started guide to learn how; it includes information on libvirt and other software packages you'll need to install to get CodeReady Containers up and running.

To obtain machine learning tools, you can visit Open Data Hub, an open source project based on Kubeflow. Open Data Hub provides open source tools that can run large, distributed AI workloads on Red Hat OpenShift.

What resources do you need to run Open Data Hub tools?

By default, CodeReady Containers reserves the following system resources for Red Hat OpenShift:

  • 4 virtual CPUs
  • 8GB of memory
  • 35GB of storage space

These allocations are insufficient for running tools from Open Data Hub, which require a minimum of 6 CPUs and 16GB of RAM. Beyond that minimum, you need some additional resources to deploy popular AI tools. My testing has determined that the following system resources will allow you to deploy the Open Data Hub Operator, the Open Data Hub dashboard, the NFS Provisioner, JupyterHub, Ceph Nano, and Pachyderm:

  • 8 virtual CPUs
  • 20GB of memory
  • 70GB of storage space
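Before changing any settings, it can help to confirm that the laptop itself can spare these amounts. The following is a minimal sketch, assuming a Linux host with `nproc`, `awk`, and `df` available, and assuming the virtual machine's disk is stored under your home directory. The helper names `meets_minimum` and `report` are hypothetical and exist only for this illustration.

```shell
#!/bin/sh
# Targets from the list above: 8 vCPUs, 20GB of memory, 70GB of storage.
REQUIRED_CPUS=8
REQUIRED_MEM_MIB=20000
REQUIRED_DISK_GIB=70

# meets_minimum AVAILABLE REQUIRED: succeeds when AVAILABLE >= REQUIRED.
meets_minimum() {
  [ "$1" -ge "$2" ]
}

# report LABEL AVAILABLE REQUIRED: print one status line per resource.
report() {
  if meets_minimum "$2" "$3"; then
    echo "$1: ok ($2 available, $3 wanted)"
  else
    echo "$1: short ($2 available, $3 wanted)"
  fi
}

host_cpus=$(nproc)
# /proc/meminfo reports MemTotal in KiB; convert it to MiB.
host_mem_mib=$(( $(awk '/^MemTotal:/ {print $2}' /proc/meminfo) / 1024 ))
# Free space under the home directory in KiB (POSIX df format), as GiB.
# Assumption: the VM disk image is created under the home directory.
home_free_gib=$(( $(df -Pk "${HOME:-/}" | awk 'NR==2 {print $4}') / 1024 / 1024 ))

report "CPUs" "$host_cpus" "$REQUIRED_CPUS"
report "Memory (MiB)" "$host_mem_mib" "$REQUIRED_MEM_MIB"
report "Disk (GiB)" "$home_free_gib" "$REQUIRED_DISK_GIB"
```

The script compares the host totals against what the virtual machine will reserve. The host operating system needs resources of its own, so an "ok" that only just meets a target still leaves little headroom.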

You'll need to configure CodeReady Containers to make sure your development environment has access to these resources. Now let's get our hands dirty and set that all up properly.

Configure CodeReady Containers

If you previously deployed a CodeReady Containers image, remove it with the following command:

$ crc delete

Update the CPU, memory, disk space, and kubeadmin password configuration for CodeReady Containers as follows (memory is given in MiB and disk size in GiB):

$ crc setup
$ crc config set memory 20000
$ crc config set cpus 8
$ crc config set disk-size 70
$ crc config set kubeadmin-password kubeadmin
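Before starting the cluster, you may want to read the stored values back. The sketch below uses `crc config get` for the three resource keys set above. The wrapper function `show_crc_settings` is a hypothetical name for this illustration, and the exact output format depends on your CodeReady Containers version.

```shell
#!/bin/sh
# show_crc_settings: print the stored value of each resource key set above.
# Returns 1 if the crc binary is not on PATH.
show_crc_settings() {
  if ! command -v crc >/dev/null 2>&1; then
    echo "crc not found on PATH" >&2
    return 1
  fi
  for key in cpus memory disk-size; do
    crc config get "$key"
  done
}

# Do not abort the shell if crc is missing; the message above is enough.
show_crc_settings || true
```

If a value is not what you expect, rerun the matching `crc config set` command before running `crc start`.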

Start CodeReady Containers. The information displayed should show that Red Hat OpenShift is installed:

$ crc start
WARN A new version (2.0.1) has been published on https://developers.redhat.com/content-gateway/file/pub/openshift-v4/clients/crc/2.0.1/crc-linux-amd64.tar.xz 
INFO A CodeReady Containers VM for OpenShift 4.9.15 is already running 
Started the OpenShift cluster.

The server is accessible via web console at:

Log in as administrator:
  Username: kubeadmin
  Password: kubeadmin

Log in as user:
  Username: developer
  Password: developer

Use the 'oc' command line interface:
  $ eval $(crc oc-env)
  $ oc login -u developer https://api.crc.testing:6443
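Once the cluster is running, you can also ask OpenShift what capacity its node reports. The following is a rough sketch, assuming `oc` is on your PATH (for example after `eval $(crc oc-env)`) and that the kubeadmin password is the one configured earlier. The function name `node_capacity` is hypothetical, and the sketch reads only the first node in the list.

```shell
#!/bin/sh
# node_capacity: log in as the administrator, then print the CPU and memory
# capacity reported by the first (and, in a local cluster, only) node.
node_capacity() {
  oc login -u kubeadmin -p kubeadmin https://api.crc.testing:6443 || return 1
  oc get nodes -o jsonpath='{.items[0].status.capacity.cpu}{" CPUs, "}{.items[0].status.capacity.memory}{" memory\n"}'
}

if command -v oc >/dev/null 2>&1; then
  node_capacity
else
  echo "oc not found on PATH; try: eval \$(crc oc-env)" >&2
fi
```

The administrator account is used here because the developer account is not normally allowed to list cluster nodes.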

The following screenshots illustrate how you can verify that you've correctly configured your OpenShift environment using the virt-manager package, which is a desktop user interface for managing virtual machines through libvirt. You can install virt-manager with the following command:

$ sudo dnf install virt-manager -y

These screenshots from the virt-manager UI show that the virtualized cluster has 8 virtual CPUs (Figure 1), 20,000MiB of allocated memory (Figure 2), and 70GiB of virtual storage (Figure 3).

Figure 1: Red Hat OpenShift has 8 virtual CPUs.
Figure 2: Red Hat OpenShift has 20,000MiB of allocated memory.
Figure 3: Red Hat OpenShift has 70GiB of virtual storage.
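If you prefer the command line to a graphical tool, the same CPU and memory figures can be read with `virsh`, the libvirt command-line client. This is a minimal sketch, assuming the virtual machine is a system-level libvirt domain named `crc`. Both the domain name and the `qemu:///system` connection are assumptions about a default Linux installation, and `vm_summary` is a hypothetical helper.

```shell
#!/bin/sh
# vm_summary: read `virsh dominfo` output on stdin and print the vCPU count
# and the maximum memory, converted from KiB to MiB.
vm_summary() {
  awk '
    /^CPU\(s\):/   { cpus = $2 }
    /^Max memory:/ { mem_kib = $3 }
    END { printf "vCPUs: %s, memory: %d MiB\n", cpus, mem_kib / 1024 }
  '
}

# Assumption: the VM is a system-level libvirt domain named "crc".
if command -v virsh >/dev/null 2>&1; then
  virsh -c qemu:///system dominfo crc | vm_summary
fi
```

With the settings used in this article, the summary should agree with the screenshots: 8 virtual CPUs and 20,000MiB of memory.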

Congratulations! You now have an OpenShift cluster that can run Open Data Hub. Have fun exploring the possibilities of machine learning and artificial intelligence! For help getting started with AI/ML and data science on Red Hat OpenShift, check out the learning paths on Red Hat OpenShift Data Science from Red Hat Developer.

Last updated: November 8, 2023