Learn how to build, train, and run a PyTorch model

We hope you have enjoyed the first four Red Hat OpenShift Data Science learning paths.

To complement these resources, we have released a new data science learning path that guides you through developing a PyTorch model to predict the onset of diabetes. This article describes the PyTorch learning path and provides an overview of OpenShift Data Science.

Note: Visit the OpenShift Data Science page to see our complete library of learning paths and other resources for developers and data scientists collaborating on intelligent applications.

Build, train, and run a PyTorch model

In How to create a PyTorch model, you will perform the following tasks:

Start your Jupyter notebook server for PyTorch.
Explore the diabetes data set.
Build, train, and run your PyTorch model.

This learning path is the first in a three-part series about working with PyTorch models. It shows you how to explore your data set and create a basic PyTorch model that predicts whether a person might have diabetes based on current medical readings. You will work with a data set that contains diagnostic readings for female patients with and without diabetes.
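As a rough sketch of what "build, train, and run" can look like in PyTorch, the example below trains a small binary classifier on synthetic data. It is not the code from the learning path: the random tensors, the invented labeling rule, the layer sizes, and the name DiabetesClassifier are all assumptions made for illustration. The only detail borrowed from the real data set is that each patient is described by eight numeric measurements.

```python
import torch
from torch import nn

torch.manual_seed(0)  # make the sketch reproducible

# Synthetic stand-in for the real data: 200 rows, 8 numeric features.
# The labeling rule is invented purely so the model has something to learn.
features = torch.randn(200, 8)
labels = (features[:, 1] + 0.5 * features[:, 5] > 0).float().unsqueeze(1)


class DiabetesClassifier(nn.Module):
    """Hypothetical two-layer network that outputs one logit per patient."""

    def __init__(self, n_features: int = 8, hidden: int = 16):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(n_features, hidden),
            nn.ReLU(),
            nn.Linear(hidden, 1),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.net(x)


# Build: model, loss, and optimizer.
model = DiabetesClassifier()
loss_fn = nn.BCEWithLogitsLoss()  # binary cross-entropy applied to raw logits
optimizer = torch.optim.Adam(model.parameters(), lr=0.01)

# Train: full-batch updates for a fixed number of epochs.
losses = []
for epoch in range(200):
    optimizer.zero_grad()
    loss = loss_fn(model(features), labels)
    loss.backward()
    optimizer.step()
    losses.append(loss.item())

# Run: convert logits to probabilities, then to 0/1 predictions.
model.eval()
with torch.no_grad():
    probabilities = torch.sigmoid(model(features))
    predictions = (probabilities > 0.5).float()
    accuracy = (predictions == labels).float().mean().item()

print(f"final loss {losses[-1]:.3f}, training accuracy {accuracy:.2f}")
```

The accuracy here is measured on the same rows used for training, so it says nothing about how the model generalizes. With real patient data you would hold out a separate test split and typically scale the features first.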

The Diabetes data set

The Diabetes data set can be used to predict the onset of diabetes based on medical diagnostic measurements. The data set is available on Kaggle and is described as follows:

“This data set is originally from the National Institute of Diabetes and Digestive and Kidney Diseases. The objective of the data set is to diagnostically predict whether a patient has diabetes based on diagnostic measurements included in the data set. Several constraints were placed on selecting these instances from a larger database. In particular, all patients here are females at least 21 years old of Pima Indian heritage.”

The data set consists of 768 examples of various medical readings for female patients who are members of an indigenous nation. Some of the patients have diabetes. Knowing what medical readings look like for a person with diabetes, can we predict which people might have diabetes based on the medical readings we have gathered?
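A first look at data like this often starts by comparing readings between the two groups. The sketch below uses pandas on a handful of made-up rows, not records from the real data set. The column names Glucose, BMI, Age, and Outcome follow a subset of the Kaggle CSV, and treating Outcome = 1 as "has diabetes" is the convention assumed here.

```python
import io

import pandas as pd

# Made-up rows for illustration only; these are NOT records from the real data set.
sample_csv = io.StringIO(
    "Glucose,BMI,Age,Outcome\n"
    "150,34.0,48,1\n"
    "90,25.0,29,0\n"
    "170,31.5,41,1\n"
    "100,27.0,24,0\n"
    "140,38.5,35,1\n"
    "110,23.5,30,0\n"
)
df = pd.read_csv(sample_csv)

# How many patients fall into each class?
class_counts = df["Outcome"].value_counts().sort_index()

# Average reading per class: a first hint at which features separate the groups.
means_by_outcome = df.groupby("Outcome").mean()

print(class_counts)
print(means_by_outcome)
```

If one group's average for a reading is clearly higher than the other's, that reading is a plausible input feature for the model. On the real data set you would also check for missing or implausible values before training.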

Let's dive in and see if we can create a PyTorch model to achieve this. Start the learning path now.

What is OpenShift Data Science?

OpenShift Data Science is a platform that makes it easier for developers and data scientists to develop, deploy, and monitor machine learning models. As a comprehensive environment built on top of Red Hat OpenShift, OpenShift Data Science integrates Jupyter notebooks—the core IDE where data scientists train models—with model development frameworks such as TensorFlow and PyTorch.

You can think of OpenShift Data Science as a meta-operator that sits above other Kubernetes Operators and combines them into a coherent, integrated environment. Currently, OpenShift Data Science partner technologies include:

Anaconda Commercial Edition for secure distribution and package management
IBM Watson Studio for building and managing models at scale and for AutoML
Intel OpenVINO and oneAPI AI analytics toolkits for optimizing and tuning models
Seldon Deploy for deploying, managing, and monitoring models
Starburst Galaxy for data integration

Support for NVIDIA accelerated computing is also coming soon.

Note: You can also try OpenShift Data Science in the Developer Sandbox for Red Hat OpenShift.

Where can I learn more?

Visit the OpenShift Data Science landing page to learn more about how data scientists, data engineers, and application developers use this service to collaborate across the intelligent application life cycle.

Last updated: September 20, 2023
