Machine learning with Apache Spark on Kubernetes | DevNation Tech Talk

By Erik Erlandson, Burr Sutter
May 19, 2020

The first challenge for an AI/ML practitioner is to gather the data inputs needed to feed a learning model. This is where Apache Spark helps: its unified DataFrame API and scale-out compute model let you execute parallelized queries against SQL, Kafka, and S3. In this session, we explore the use of Apache Spark on top of Kubernetes/OpenShift to demonstrate a dynamically scalable ETL pipeline for federated data ingestion.
