Classify interactive images with Jupyter Notebook on Red Hat OpenShift AI

Jupyter Notebook works with OpenShift AI to interactively classify images. In this learning path, you will use TensorFlow and ipywidgets to simulate real-time data streaming and visualization, and to interact directly with AI models.

Try it in our Developer Sandbox

This first lesson walks you through the first two steps: launching your workbench and preparing the dataset.

In this lesson, you will:

  • Launch a workbench in OpenShift AI.
  • Prepare a dataset.
  • Automate the fetching of datasets using Kaggle's API.
  • Enable direct access to large datasets for machine learning.

Launch a workbench in OpenShift AI

To launch an OpenShift AI workbench, follow these steps:

  1. Navigate to the OpenShift AI dashboard, as shown in Figure 1.

    The Developer Sandbox selection screen shows the OpenShift AI component.
    Figure 1: Selecting a component of the Developer Sandbox.
  2. Click Launch for the OpenShift AI component of the sandbox.
    Once you select the OpenShift AI sandbox instance, it will take you to the OpenShift AI console dashboard. The interface includes a navigation menu on the left with options like Home, Applications, Data Science Projects, Data Science Pipelines, Experiments, Model Serving, and Resources. 
  3. Navigate to Home (Figure 2).

    A screenshot of the OpenShift AI dashboard.
    Figure 2: OpenShift AI dashboard.
  4. Click Data Science Projects. Here, you can choose or create a project where you'll conduct your experiments (Figure 3). 

    The OpenShift AI dashboard shows the data science projects.
    Figure 3: Dashboard of data science projects in OpenShift AI.

Create a new workbench

  1. Click on your project name from the list of project names, then navigate to the workbenches section in the main content area (Figure 4).
  2. Click Create a workbench to initiate a new workbench, as shown in Figure 4. 
      

    The workbench interface shows the “Create a workbench” button.
    Figure 4: Create a workbench interface.
  3. Configure your workbench, as shown in Figure 5:
    1. Name: Give it a descriptive name (e.g., interactive_classification).
    2. Notebook image: Select TensorFlow, as our model will use this framework.
    3. Deployment size: Choose Medium for a balanced performance.
    4. Cluster storage: When you add a name, the system will automatically create storage for your notebook. If you chose interactive_classification, it will provision a storage component called interactive_classification-storage with a maximum of 20GB.

      The workbench configuration page.
      Figure 5: Configure your workbench.
  4. Launch the workbench by selecting Create workbench. Wait for the status to indicate the workbench is running (Figure 6).    

    The status column shows the workbench is running.
    Figure 6: The workbench is running.

Obtain and prepare the dataset

This step involves two phases: Downloading the API key and setting it up on your workbench.

Download your Kaggle API key

  1. Navigate to the Kaggle website and log in with your account credentials to access your account (Figure 7).

    The Kaggle sign-in page lists multiple sign-in options.
    Figure 7: The Kaggle sign-in page.
  2. Go to your account settings. Click your profile icon at the top right corner of the page, then select Account from the dropdown menu (Figure 8).

    The Kaggle API download page.
    Figure 8: The Kaggle API download page.

Download the API key

  1. Scroll down to the section labeled API.
  2. Select Create New Token.
  3. A file named kaggle.json will automatically download to your local machine. This file contains your API credentials. Check your downloads directory for kaggle.json (Figure 9). 

    The Kaggle API is set up in the JupyterLab.
    Figure 9: Kaggle API setup in JupyterLab.
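The kaggle.json file has a simple, documented format: a JSON object with a username field and a key field. If you want to confirm the download is intact before uploading it to your workbench, a quick check like the following works (a sketch; the sample values are placeholders, not real credentials):

```python
import json

# Placeholder contents; your real kaggle.json holds your own account values.
sample = '{"username": "your-kaggle-username", "key": "0123456789abcdef"}'

creds = json.loads(sample)

# The Kaggle CLI expects exactly these two fields.
assert {"username", "key"} <= creds.keys()
print(f"Credentials found for user: {creds['username']}")
```

If the file fails to parse or is missing either field, re-download the token from the API section of your Kaggle account settings.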

Set up the API key on your workbench

  1. Access your workbench environment by clicking the name of the workbench in your Developer Sandbox OpenShift AI dashboard. When the workbench opens (a Jupyter notebook environment), you will see an interface comprising a control panel on the left with a file browser, and a content panel on the right, which initially contains the Launcher set of tiles. Above the file browser is a set of file action icons.
  2. Select the Upload icon (a horizontal line with an upward arrow).
  3. Navigate to where you downloaded the kaggle.json file, and click on it. The file will upload and appear on the left-hand file browser list.
  4. Click the Git pulldown at the top of the workbench. Select Clone a repository.
  5. Set the URI for the Git repo to https://github.com/redhat-developer-demos/openshift-ai and click Clone.
  6. At this point, you will have a new directory in the file browser, openshift-ai. Click the openshift-ai directory in the file browser.
  7. Now click the 2_interactive_classification directory in the file browser.
  8. Now click the Interactive_Image_Classification_Notebook.ipynb file. This will load a pre-defined notebook into the content panel with extensive self-documentation.
  9. Scroll down to the first executable cell of the notebook. Executable cells are prefixed with [ ]. In this case, the cell starts with !pip install kaggle. Make sure this cell is selected, then click the run icon at the top of the workbench (the solid sideways triangle) to execute it. This will download the test dataset, which will appear in the file browser.

Good job. The documentation in the notebook's cells matches the lessons to the steps, and you've launched JupyterLab. Now we're ready to move on to real-time data streaming.

Previous resource
Overview: Classify interactive images with Jupyter Notebook on Red Hat OpenShift AI
Next resource
Real-time data streaming models with AI applications