Deep Dive from KubeCon 2018: Big Data SIG - Erik Erlandson, Red Hat & Yinan Li, Google

This presentation will cover two projects from sig-big-data: Apache Spark on Kubernetes and Apache Airflow on Kubernetes. Kubernetes became a native scheduler backend for Spark in Spark 2.3, and we have been expanding the feature set and hardening the integration since then. Apache Airflow on Kubernetes reached a major milestone with two additions: the new Kubernetes Operator, which natively launches arbitrary Pods, and the Kubernetes Executor, a Kubernetes-native scheduler for Airflow. We will give an overview of the current state of both projects, present their roadmaps, and give attendees the opportunity to ask questions and provide feedback on those roadmaps.

Last updated: February 5, 2024