Classify interactive images with Jupyter Notebook on Red Hat OpenShift AI

Jupyter Notebook works with OpenShift AI to interactively classify images. In this learning path, you will use TensorFlow and ipywidgets to simulate real-time data streaming and visualization and to interact directly with AI models.


Now that you’ve launched JupyterLab and prepared your dataset, it's time to use real-time data streaming models with AI applications.

Prerequisites:

  • You have launched JupyterLab and prepared your dataset, as covered in the previous resource in this learning path.

In this lesson, you will:

  • Understand model performance dynamics.
  • Simulate real-time data streaming.
  • Assess your model’s accuracy.
  • Enhance the interactivity of your AI applications.

Build and train the model

This step can be broken down into two phases:

Data preparation 

  • Set up data generators that preprocess images through scaling, augmentation, or normalization to enhance training efficiency.
  • Architect the CNN with convolutional, pooling, and dense layers to extract and classify image features (see the sketch below).
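
For reference, here is a minimal sketch of what these two phases might look like in Keras. The directory path, image size, batch size, and layer sizes are illustrative assumptions; use the values already defined in the workbench notebook.

# Minimal sketch only: the path, image size, and layer sizes are assumptions
from tensorflow.keras.preprocessing.image import ImageDataGenerator
from tensorflow.keras.layers import Input, Conv2D, MaxPooling2D, Flatten, Dense
from tensorflow.keras.models import Sequential

# Rescale pixel values to [0, 1]; augmentation options could also be added here
train_datagen = ImageDataGenerator(rescale=1.0 / 255)

# Stream batches of labeled images from a directory with one subfolder per class
train_generator = train_datagen.flow_from_directory(
    "data/train",              # assumed path; match the notebook's dataset location
    target_size=(150, 150),    # assumed input size
    batch_size=32,
    class_mode="binary",       # two classes: cats and dogs
)

# A small CNN: convolution and pooling blocks extract features,
# then dense layers classify them
model = Sequential([
    Input(shape=(150, 150, 3)),
    Conv2D(32, (3, 3), activation="relu"),
    MaxPooling2D((2, 2)),
    Conv2D(64, (3, 3), activation="relu"),
    MaxPooling2D((2, 2)),
    Flatten(),
    Dense(64, activation="relu"),
    Dense(1, activation="sigmoid"),  # single probability: cat vs. dog
])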

Model training

  • Compile: Use an optimizer, loss function, and metrics like accuracy.
  • Train: The model adjusts its weights through backpropagation to reduce loss over 5 epochs, each organized into 50 steps in the output. The line "Found 20000 images belonging to 2 classes." confirms that the dataset contains 20,000 images split between cats and dogs, which gives you enough data to train on and keeps the classes balanced, a property that is crucial for model fairness. A sketch of the compile and train steps follows this list.
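
Here is a hedged sketch of the compile and train calls, assuming the train_generator defined above. The optimizer and loss are common defaults for binary classification, and the epoch and step counts mirror the values described in the training output; check the notebook cell for its exact configuration.

# Sketch only: check the notebook cell for the exact configuration it uses
model.compile(
    optimizer="adam",                 # assumed optimizer
    loss="binary_crossentropy",       # standard loss for a two-class problem
    metrics=["accuracy"],
)

# 5 epochs of 50 steps each, matching the training output described above
history = model.fit(train_generator, steps_per_epoch=50, epochs=5)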

Step-by-step analysis equips learners with the tools to interpret training outputs, troubleshoot issues such as overfitting or underfitting, and refine their models to achieve better performance. Summaries and analysis drawn from the training logs are essential for understanding model performance dynamics, diagnosing issues, and planning improvements in machine learning projects.

  1. In the workbench, scroll down past the text for Build and train the model. Highlight the next executable cell, prefixed with [ ]. It should start with “from tensorflow.keras.layers import Input”.

  2. With this cell highlighted, click the Play icon to execute the code. 

  3. You will see CUDA errors because the image tries to take advantage of a GPU, but the underlying container in the sandbox does not have access to GPUs. You can safely ignore these errors.

  4. After 5 epochs, you should see an accuracy of around 0.5624 and a loss of around 0.6891. The model is now trained.

Interactive real-time data streaming and visualization

Now it’s time to simulate real-time data interaction and demonstrate, in a simplified context, how you can use AI interactively by leveraging TensorFlow and ipywidgets for real-time data streaming and visualization. Follow these steps:

  1. Create interactive dropdown menus. Implement dropdown menus that allow users to select and visually interact with predictions on cat or dog images (a sketch of this pattern appears at the end of this section).
  2. Utilize widgets for simulation. Use Jupyter Notebook widgets to simulate real-time data streaming. This approach is particularly useful for educational and demonstration purposes, showing how to dynamically process and visualize data in an interactive environment.
  3. Scroll down to and highlight the next executable cell, which starts with # Import necessary libraries for handling images.
  4. Click the Play icon again.
  5. If all is well, a random image from the dataset will appear with the model’s prediction, which should ideally match the animal shown.
  6. Scroll past the result and highlight the next executable cell. The code should read predict_random_image(model, dog_path). The previous cell picked a random image from the set of cat photos; this one picks from the dogs.
  7. Click the play icon to see if the model correctly identifies the picture as a dog (it will be a dog). If it doesn’t, re-highlight the cell and click play again. 

The advantage of using OpenShift AI workbenches and Jupyter is that you can execute any cell multiple times. The model should identify the picture as a dog in the majority of executions of this cell.
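
The dropdown interaction described in step 1 could be wired up along the following lines. This is a sketch only: it assumes the model trained earlier, the notebook's predict_random_image helper, and hypothetical cat_path and dog_path variables pointing at the class directories.

# Sketch only: predict_random_image, cat_path, and dog_path are assumed to exist
import ipywidgets as widgets
from IPython.display import display

class_dropdown = widgets.Dropdown(options=["cats", "dogs"], description="Class:")
output = widgets.Output()

def on_select(change):
    # Run a prediction against the chosen class's directory whenever the selection changes
    path = cat_path if change["new"] == "cats" else dog_path
    with output:
        output.clear_output(wait=True)
        predict_random_image(model, path)

class_dropdown.observe(on_select, names="value")
display(class_dropdown, output)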

Test the dataset with random image prediction

This step is crucial for evaluating the trained model's performance by testing it on unseen data in a real-world-like scenario. By randomly selecting images and predicting their classes, you can visually assess the model's accuracy and reliability. Run the cell in Step 5. The process, sketched in code after the following list, is as follows:

  1. Random image selection: The function predict_random_image takes a directory path as input, checks if the path exists, and randomly selects an image from this directory, ensuring that each test is unbiased and represents a realistic use case.
  2. Image loading and preprocessing: The randomly chosen image file is then loaded and preprocessed to match the input format expected by the model. This involves resizing the image and normalizing its pixel values.
  3. Model prediction: The preprocessed image tensor is fed into the model to predict whether the image is of a cat or a dog. This step directly utilizes the neural network to interpret the image data.
  4. Visualization: The image along with its predicted class is displayed. This visual feedback is crucial for understanding the model's decision-making process and immediately seeing the result of the prediction.
  5. Interactive testing: By running the function with different directories (e.g., cats and dogs), users can interactively test how the model performs across varied inputs, making this a dynamic tool for demonstration and educational purposes. Testing the model with a random selection of images simulates how the model might perform in a production environment where inputs are not predetermined. It helps to identify potential biases, underfitting, or overfitting issues in the model. Additionally, visual feedback from test predictions is an excellent way to demonstrate the model's capabilities to a non-technical audience, making complex machine learning concepts more accessible and understandable.
  6. Now scroll down to the next executable cell. This code should start with import matplotlib.pyplot as plt.
  7. Highlight this cell and click the Play icon again. The prediction may be correct or incorrect because the model is not well trained.
  8. Repeat the execution to see the distribution of correct and incorrect responses.
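
The numbered points above map to a function along these lines. This is only a sketch of what such a helper might look like; the notebook's actual cell may differ in details such as the input size and how the class label is derived.

# Sketch only: the image size and label mapping are assumptions
import os
import random
import numpy as np
import matplotlib.pyplot as plt
from tensorflow.keras.preprocessing import image

def predict_random_image(model, directory):
    # 1. Random image selection: verify the path and pick an unbiased sample
    if not os.path.exists(directory):
        print(f"Directory not found: {directory}")
        return
    img_path = os.path.join(directory, random.choice(os.listdir(directory)))

    # 2. Loading and preprocessing: resize and normalize to match the model's input
    img = image.load_img(img_path, target_size=(150, 150))  # assumed input size
    tensor = np.expand_dims(image.img_to_array(img) / 255.0, axis=0)

    # 3. Model prediction: with a sigmoid output, > 0.5 is read here as "dog"
    #    (the actual mapping depends on the generator's class indices)
    probability = model.predict(tensor)[0][0]
    label = "dog" if probability > 0.5 else "cat"

    # 4. Visualization: show the image together with its predicted class
    plt.imshow(img)
    plt.title(f"Prediction: {label}")
    plt.axis("off")
    plt.show()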

Interactive real-time image prediction with widgets

This step integrates interactive web widgets to provide a user-friendly interface for real-time image prediction, showcasing how you can use TensorFlow and Jupyter Notebook widgets to enhance the interactivity and accessibility of AI applications.

  1. Scroll down to the next executable cell. This code will render a single button that allows you to repeat the prediction action via a widget in the workbench (a sketch of this pattern follows these steps).

  2. Execute the code, and an HTML button titled Predict will appear below the cell.

  3. Click this button as many times as you like.
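
The Predict button could be built with ipywidgets roughly as follows. This is a sketch under the assumption that the predict_random_image helper and a dog_path variable are already defined; the actual notebook cell may wire things up differently.

# Sketch only: predict_random_image and dog_path are assumed to exist
import ipywidgets as widgets
from IPython.display import display

predict_button = widgets.Button(description="Predict")
output = widgets.Output()

def on_predict_clicked(button):
    # Clear the previous result and run a fresh prediction on each click
    with output:
        output.clear_output(wait=True)
        predict_random_image(model, dog_path)

predict_button.on_click(on_predict_clicked)
display(predict_button, output)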

Address misclassification in your AI model

You will notice that the model is not very accurate, so we will look at improving that accuracy.

Misclassification in machine learning models can significantly hinder your model's accuracy and reliability. To combat this, it's crucial to verify dataset balance, align preprocessing methods, and tweak model parameters. These steps are essential for ensuring that your model not only learns well but also generalizes to new, unseen data.
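
A quick way to check the first of these points, dataset balance, is to count the images in each class directory. This sketch assumes a data/train layout with one subfolder per class; substitute the path your notebook actually uses.

# Sketch only: adjust the path to match your dataset layout
import os

train_dir = "data/train"  # assumed dataset location
for class_name in sorted(os.listdir(train_dir)):
    class_dir = os.path.join(train_dir, class_name)
    if os.path.isdir(class_dir):
        print(f"{class_name}: {len(os.listdir(class_dir))} images")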

Why you should experiment with training adjustments

Before making these changes, go back to Step 3: Build and train the model. Adjusting the training process, such as the number of epochs and steps per epoch, can provide quicker feedback on model performance, which allows you to iteratively improve your model in a more controlled and informed manner. Here’s what you can do:

  • Adjust the number of epochs to optimize training speed. Changing the number of epochs can help you find the sweet spot where your model learns enough to perform well without overfitting. This is crucial for building a robust model that performs consistently.
     
  • Try different values for steps per epoch. Modifying steps_per_epoch affects how many batches of samples are used in one epoch. This can influence the granularity of the model updates and can help in dealing with imbalanced datasets or overfitting.

Example code to modify your training process

Make these modifications in your notebook or another Python environment. Here’s how you might modify the training call to see how these changes impact your model’s learning curve and overall performance:

# Adjust the number of epochs and steps per epoch
model.fit(train_generator, steps_per_epoch=100, epochs=10)

Summary

By following this learning path, you are not only gaining practical experience in image classification but also developing the skills necessary to extend these methods to more complex and diverse datasets. Whether you are doing this for educational purposes or professional projects, the knowledge you acquire here will set a solid foundation for future exploration and innovation in the field of artificial intelligence.
