Configure a Jupyter notebook to use GPUs for AI/ML modeling

Graphics processing units (GPUs) have become the foundation of artificial intelligence. Before GPUs, machine learning was slow, inaccurate, and inadequate for many of today's applications. Adopting GPUs made a remarkable difference to large neural networks. Deep learning unlocked solutions for image and video processing, putting technologies like autonomous driving and facial recognition into the mainstream.

The connection between GPUs and Red Hat OpenShift does not stop at data science. High-performance computing is one of the hottest trends in enterprise tech. With cloud computing, workloads once reserved for supercomputers can run on demand, saving time and money.

How GPUs work

Let’s back up and make sure we understand how GPUs do what they do.

The term graphics processing unit was popularized in 1999, when NVIDIA marketed its GeForce 256 with capabilities for graphics transformation, lighting, and triangle clipping. These are math-heavy computations that ultimately render three-dimensional scenes. Because the hardware is engineered for exactly these operations, they can be heavily optimized and accelerated. The work amounts to millions of repetitive floating-point computations, which is the perfect scenario for running tasks in parallel.

Thanks to extra cores and caching, a single GPU can outperform dozens of CPUs. Imagine we are processing high-resolution images. If one CPU takes one minute to process a single image, and a video requires nearly a million images, the job would take almost two years on a single CPU.

Scaling to more CPUs speeds up the process linearly, but even with 100 CPUs the job would take about a week, not to mention an expensive bill. A few GPUs processing images in parallel can finish within a day. This hardware makes otherwise impossible tasks possible.
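
To make the back-of-the-envelope math concrete, here is a quick sketch of the arithmetic. The one-minute-per-image rate is this article's illustrative figure, and the roughly 1,000x effective parallelism assumed for a few GPUs is only a stand-in chosen to match the "within a day" estimate, not a benchmark:

```python
# Back-of-the-envelope estimate for the image-processing example above.
# Assumed figures: 1 million images, 1 minute per image on one CPU,
# and ~1,000x effective parallelism for a few GPUs (illustrative only).
MINUTES_PER_DAY = 60 * 24

images = 1_000_000
single_cpu_days = images * 1.0 / MINUTES_PER_DAY  # 1 minute per image

print(f"1 CPU:    {single_cpu_days / 365:.1f} years")  # ~1.9 years
print(f"100 CPUs: {single_cpu_days / 100:.1f} days")   # ~6.9 days, about a week
print(f"Few GPUs: {single_cpu_days / 1000:.1f} days")  # ~0.7 days, within a day
```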

The evolution of GPUs

Eventually, the capabilities of GPUs expanded to other workloads, such as artificial intelligence, which often requires running computations over gigabytes of data. Thanks to complementary software packages, users can tap this high-speed computing through simple calls to APIs and coding libraries.
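
For example, with a GPU-enabled library such as PyTorch (one of the frameworks discussed later in this article), offloading a computation to the GPU is a one-line change. A minimal sketch:

```python
# Minimal sketch: offloading a matrix multiplication to a GPU with PyTorch.
import torch

# Fall back to the CPU when no GPU is present, so the snippet runs anywhere.
device = "cuda" if torch.cuda.is_available() else "cpu"

a = torch.randn(4096, 4096, device=device)  # allocate directly on the device
b = torch.randn(4096, 4096, device=device)
c = a @ b  # the multiplication runs on the GPU when one is available
print(c.device)
```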

In November 2006, NVIDIA introduced CUDA, a parallel computing platform and programming model. CUDA lets developers use GPUs efficiently by exposing the parallel compute engine in NVIDIA GPUs and guiding them to partition complex problems into smaller, more manageable sub-problems, each independent of the others' results.
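
To make that model concrete, here is a minimal sketch of a CUDA-style kernel written from Python with Numba's CUDA bindings (one convenient option; the article does not prescribe a library). Each GPU thread handles one array element, so every sub-problem is independent of the others:

```python
# Minimal sketch: a CUDA kernel via Numba, adding two vectors in parallel.
import numpy as np
from numba import cuda

@cuda.jit
def vector_add(a, b, out):
    i = cuda.grid(1)          # this thread's global index
    if i < out.size:          # guard threads past the end of the array
        out[i] = a[i] + b[i]  # each element is an independent sub-problem

n = 1_000_000
a = np.random.rand(n).astype(np.float32)
b = np.random.rand(n).astype(np.float32)
out = np.zeros_like(a)

threads_per_block = 256
blocks = (n + threads_per_block - 1) // threads_per_block
vector_add[blocks, threads_per_block](a, b, out)  # Numba moves the arrays to and from the GPU
print(out[:3], a[:3] + b[:3])  # the two should match
```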

NVIDIA further spread its roots by partnering with Red Hat to bring CUDA to Kubernetes, allowing customers to develop and deploy applications more efficiently. Before this partnership, customers who wanted to run Kubernetes on GPUs had to hand-write containers for CUDA and the software integrating Kubernetes with GPUs, a time-consuming and error-prone process. Red Hat OpenShift simplifies this: the GPU Operator automatically containerizes CUDA and the other required software when a customer deploys OpenShift on a GPU server.
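
Once the GPU Operator has prepared a node, workloads ask for GPUs through the standard nvidia.com/gpu resource it advertises to Kubernetes. Here is a minimal sketch of a pod spec; the pod name and container image are placeholders:

```yaml
# Minimal sketch: a pod requesting one GPU from the NVIDIA GPU Operator.
apiVersion: v1
kind: Pod
metadata:
  name: gpu-test                            # placeholder name
spec:
  restartPolicy: Never
  containers:
    - name: cuda-container
      image: nvidia/cuda:12.2.0-base-ubi8   # placeholder CUDA base image
      command: ["nvidia-smi"]               # lists the GPUs visible to the container
      resources:
        limits:
          nvidia.com/gpu: 1                 # number of GPUs to allocate
```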

Red Hat OpenShift Data Science (RHODS) expanded this mission of leveraging and simplifying GPU usage for data science workflows. When customers start a Jupyter notebook server on RHODS, they can choose the number of GPUs required for their workflow and select PyTorch and TensorFlow GPU-enabled notebook images. Depending on the GPU machine pool added to your cluster, you may be able to select one or more GPUs. Customers then have the power to use GPUs in their data mining and model processing tasks.
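
Once the notebook server starts, a quick way to confirm that the GPU is actually visible is to query the framework from a notebook cell; run whichever half matches your notebook image:

```python
# In a PyTorch GPU-enabled notebook image:
import torch
print(torch.cuda.is_available())   # True when a GPU was allocated
print(torch.cuda.device_count())   # how many GPUs the notebook can see

# In a TensorFlow GPU-enabled notebook image:
import tensorflow as tf
print(tf.config.list_physical_devices("GPU"))
```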

GPUs in RHODS

Interested in hearing more about using GPUs in RHODS? Then check out our new learning path, Configure a Jupyter notebook to use GPUs for AI/ML modeling, which can be found under the Getting Started section in our RHODS public sandbox.
