We hope you enjoyed the first two Red Hat OpenShift Data Science learning paths: Launch Red Hat OpenShift Data Science and OpenShift Data Science documentation and resources.
This week, we've released two new learning paths that address two common data science challenges: accessing Amazon S3 data and creating a TensorFlow model. Developers and data scientists can use these hands-on courses to learn how to access data and create machine learning models. You'll also see how OpenShift Data Science simplifies these common data science procedures.
This article introduces the learning paths and provides an overview of OpenShift Data Science, including where to find more information.
Accessing Amazon S3 data with OpenShift Data Science
In this learning path you will learn how to:
- Set up your JupyterHub image to use Amazon S3.
- Access and download Amazon S3 data from Amazon Web Services (AWS).
- Analyze your Amazon S3 data using Python data frames.
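As a rough illustration of the download-and-analyze steps above, here is a minimal sketch. It is not taken from the learning path itself: it assumes that "Python data frames" means pandas, that the data is a CSV object, and that the S3 client is a boto3 client. The helper names `read_csv_from_s3` and `summarize`, as well as the bucket and key in the usage comment, are hypothetical.

```python
import io

import pandas as pd


def read_csv_from_s3(s3_client, bucket, key):
    """Download one CSV object from S3 and parse it into a pandas DataFrame.

    `s3_client` is expected to behave like a boto3 S3 client, i.e. to offer
    get_object(Bucket=..., Key=...) returning a dict with a readable "Body".
    """
    response = s3_client.get_object(Bucket=bucket, Key=key)
    # The body is a streaming, file-like object; read it fully into memory.
    raw_bytes = response["Body"].read()
    return pd.read_csv(io.BytesIO(raw_bytes))


def summarize(frame):
    """Return a small summary (row count and column names) of a DataFrame."""
    return {"rows": len(frame), "columns": list(frame.columns)}


# Example usage (assumes boto3 is installed and AWS credentials are available,
# for instance through the AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY
# environment variables; the bucket and key below are placeholders):
#
#   import boto3
#   s3_client = boto3.client("s3")
#   frame = read_csv_from_s3(s3_client, "my-example-bucket", "data/sample.csv")
#   print(summarize(frame))
#   print(frame.describe())
```

Passing the client in as a parameter keeps the credentials handling outside the helper, which also makes it easy to try the function against a stand-in object in a notebook before pointing it at a real bucket.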
Creating TensorFlow models in OpenShift Data Science
In this learning path you will learn how to:
- Set up your JupyterHub image to use TensorFlow.
- Explore a large public data set (MNIST).
- Build, train, and test a TensorFlow model.
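The build, train, and test steps above might look roughly like the following sketch. It is an illustration under assumptions, not the model used in the learning path: the layer sizes, the optimizer, and the helper names `build_model`, `train_and_test`, and `predict_digits` are all hypothetical choices. It uses the Keras API bundled with TensorFlow 2 and the MNIST loader in `tf.keras.datasets`.

```python
import numpy as np
import tensorflow as tf


def build_model():
    """Build and compile a small dense classifier for 28x28 grayscale digits."""
    model = tf.keras.Sequential([
        tf.keras.Input(shape=(28, 28)),
        tf.keras.layers.Flatten(),
        tf.keras.layers.Dense(128, activation="relu"),
        tf.keras.layers.Dense(10, activation="softmax"),
    ])
    model.compile(
        optimizer="adam",
        loss="sparse_categorical_crossentropy",
        metrics=["accuracy"],
    )
    return model


def train_and_test(epochs=1):
    """Download MNIST, train the model, and return it with its test accuracy."""
    (x_train, y_train), (x_test, y_test) = tf.keras.datasets.mnist.load_data()
    # Scale pixel values from [0, 255] to [0, 1].
    x_train = x_train.astype("float32") / 255.0
    x_test = x_test.astype("float32") / 255.0

    model = build_model()
    model.fit(x_train, y_train, epochs=epochs, verbose=2)
    _, accuracy = model.evaluate(x_test, y_test, verbose=0)
    return model, accuracy


def predict_digits(model, images):
    """Return the most likely digit (0-9) for each image in a batch."""
    probabilities = model.predict(images, verbose=0)
    return np.argmax(probabilities, axis=1)
```

In a notebook, calling `model, accuracy = train_and_test(epochs=1)` downloads the data set on first use and trains for one pass; `predict_digits(model, images)` then expects a float array of shape `(n, 28, 28)` scaled the same way as the training data.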
Remember that you can also try OpenShift Data Science in the Developer Sandbox for Red Hat OpenShift.
Note: Visit the OpenShift Data Science page to see our complete library of learning paths and other resources for developers and data scientists collaborating on intelligent applications.
What is OpenShift Data Science?
OpenShift Data Science is a platform that makes it easier for developers and data scientists to develop, deploy, and monitor machine learning models. As a comprehensive environment built on top of Red Hat OpenShift, OpenShift Data Science integrates Jupyter notebooks (the core IDE where data scientists train models) with model development frameworks such as TensorFlow and PyTorch.
You can think of OpenShift Data Science as a meta-operator that sits above other Kubernetes Operators and combines them into a coherent, integrated environment. Currently, OpenShift Data Science partner technologies include:
- Anaconda Commercial Edition for secure distribution and package management
- IBM Watson Studio for building and managing models at scale and for AutoML
- Intel OpenVINO and oneAPI AI analytics toolkits for optimizing and tuning models
- Seldon Deploy for deploying, managing, and monitoring models
- Starburst Galaxy for data integration
Support for NVIDIA accelerated computing is coming very soon!
Where can I learn more?
Visit the OpenShift Data Science landing page to learn more about how data scientists, data engineers, and application developers use this service to collaborate across the intelligent application life cycle.
Last updated: June 3, 2024