Implement MLOps with Kubeflow Pipelines

An approach to implementing machine learning operations in production environments

January 25, 2024
Tarcisio Oliveira
Related topics: Artificial intelligence, Kubernetes
Related products: Red Hat OpenShift


    MLOps, short for machine learning operations, is a set of practices and tools that applies DevOps principles to the development cycle of artificial intelligence applications.

    Kubeflow Pipelines is an open source platform for implementing MLOps, providing a framework for building, deploying, and managing machine learning workflows in a scalable, repeatable, secure, and cloud-oriented manner on Kubernetes.

    With the ability to drive agility and efficiency in the development and deployment of machine learning models, MLOps with Kubeflow Pipelines can also improve collaboration between data scientists and machine learning engineers, ensuring consistency and reliability throughout every step of the workflow.

    MLOps

    Derived from the DevOps discipline, MLOps can be defined as a combination of cultural philosophies, practices, and processes, supported by platforms and tools, whose goal is to increase an organization's ability to deliver artificial intelligence and machine learning applications and services with greater speed, without compromising security or quality.

    In general, a machine learning model development cycle has these steps, as shown in Figure 1 and described below:

    • Gather and prepare data
    • Develop model
    • Deploy models in an application
    • Monitor and manage models
    Figure 1: Machine learning workflow

    Gather and prepare data

    Gather and prepare structured and/or unstructured data from data stores, data lakes, and databases, as well as real-time data from streaming platforms such as Apache Kafka.
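
    As a brief, hedged illustration (the file path, topic name, and broker address are placeholders, and the snippet assumes the pandas and kafka-python packages are installed), gathering batch and streaming data might look like this:

    import pandas as pd
    from kafka import KafkaConsumer  # kafka-python package

    # Batch data: load a prepared dataset (the path is a placeholder)
    features = pd.read_parquet('data/features.parquet')

    # Streaming data: consume real-time events from a Kafka topic
    consumer = KafkaConsumer('events', bootstrap_servers = 'my-kafka:9092')
    for message in consumer:
        print(message.value)  # hand off to your data preparation logic here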

    Develop model

    Develop the model using libraries such as TensorFlow and PyTorch.
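
    For illustration only, a minimal PyTorch sketch might look like this (the architecture, data, and hyperparameters are placeholders, not part of this article's pipeline):

    import torch
    from torch import nn

    # Placeholder network for a toy classification problem
    model = nn.Sequential(
        nn.Linear(4, 16),
        nn.ReLU(),
        nn.Linear(16, 2),
    )

    optimizer = torch.optim.Adam(model.parameters(), lr = 1e-3)
    loss_fn   = nn.CrossEntropyLoss()

    # One illustrative training step on random data
    inputs = torch.randn(8, 4)
    labels = torch.randint(0, 2, (8,))

    loss = loss_fn(model(inputs), labels)
    loss.backward()
    optimizer.step()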

    Deploy models in an application

    Automate the model build and deployment process, implementing CI/CD, using tools such as Kubeflow Pipelines.

    Monitor and manage models

    Use observability tools such as Prometheus and Grafana to implement logging and monitoring, and use this feedback to improve model performance and to detect bias and model drift.
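
    As a sketch of the idea (the metric name and value are hypothetical), a monitoring job could expose a drift metric for Prometheus to scrape using the prometheus_client package:

    from prometheus_client import Gauge, start_http_server

    # Hypothetical drift metric; Prometheus scrapes it from /metrics on port 8000
    model_drift = Gauge('model_drift_score', 'Distance between training and live feature distributions')

    start_http_server(8000)
    model_drift.set(0.12)  # in a real job, updated continuously from live data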

    Kubeflow Pipelines

    In the context of MLOps, an AI/ML pipeline refers to the end-to-end process of deploying and managing machine learning models in a production environment. It encompasses various stages from data preparation and model training to deployment, monitoring, and continuous improvement. An AI/ML pipeline is crucial for automating and streamlining the workflow of machine learning projects, ensuring efficiency, reproducibility, and scalability.

    Kubeflow Pipelines is an open source machine learning platform designed to orchestrate and automate the building, deployment, and management of machine learning workflows on Kubernetes, making it possible to implement every step of the machine learning operations (MLOps) lifecycle.

    By leveraging Kubeflow Pipelines, organizations can streamline their machine learning operations and enable collaboration among team members such as data scientists and machine learning engineers. Figure 2 shows an example of a pipeline run.

    Figure 2: Kubeflow Pipelines example

    Creating a pipeline

    As a prerequisite for creating your pipeline, you need access to a Kubernetes environment with Kubeflow Pipelines installed. In addition to the infrastructure, you will use Python (R is also possible) and, optionally, a Jupyter notebook.

    There are no restrictions, but we suggest using Red Hat OpenShift AI, where all of these components are installed by default. You can create a sandbox environment at no cost through the Red Hat Developer website.

    The pipeline to be created will have three tasks: generating a greeting message for a given name, generating a random number within a given range, and generating a final odds-or-evens game message based on the previous message and the random number.

    All the code can be found in the ml_pipelines repository.

    Creating the tasks

    The tasks will be functions implemented in Python and executed by the pipeline. There are other ways to implement tasks, but this format keeps the code simple to maintain and reuse, scales well, and makes it easier for different teams to work together.

    You can find the implementation for all tasks in the components folder of the repository.

    Create Hello World message

    Creates a personalized greeting message for the given name. Here is the implementation of this task, which you can find in the script create_hello_world_message.py:

    def create_hello_world_message(name : str) -> str:
        """
        Creates a personalized greeting message for the given name.

        Parameters:
            - name (str) : The name for which the message is created.

        Returns:
            - hello_world_message (str) : A personalized greeting message for the given name.

        Raises:
            - ValueError : If the given name is empty or None.
        """

        if not name:
            raise ValueError('name must not be empty or None')

        hello_world_message = f'Hello World, {name}!'

        print(f'name                : { name }')
        print(f'hello_world_message : { hello_world_message }')

        return hello_world_message

    Create random number

    Creates a random integer within the specified range; the minimum and maximum bounds are both included.
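
    The actual implementation is in the repository; a minimal sketch in the same style might look like the following. Note the import inside the function body, which keeps the function self-contained when it is later packaged as a Kubeflow Pipelines function-based component:

    def create_random_number(minimum : int, maximum : int) -> int:
        """
        Creates a random integer within the specified range, both bounds included.
        """

        # Imported inside the function so the component stays self-contained
        import random

        if minimum > maximum:
            raise ValueError('minimum must not be greater than maximum')

        random_number = random.randint(minimum, maximum)

        print(f'random_number : { random_number }')

        return random_number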

    Create odds or evens message

    Creates a game result message based on the given random number, containing the given hello world message.
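
    Again, the repository holds the real implementation; a sketch of the idea (the exact message wording is illustrative):

    def create_odds_or_evens_message(hello_world_message : str, random_number : int) -> str:
        """
        Creates an odds-or-evens game result message based on the given random number.
        """

        odds_or_evens = 'evens' if random_number % 2 == 0 else 'odds'

        odds_or_evens_message = f'{ hello_world_message } The number was { random_number }: { odds_or_evens } wins!'

        print(f'odds_or_evens_message : { odds_or_evens_message }')

        return odds_or_evens_message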

    Create the pipeline using Kubeflow Pipelines SDK for Tekton

    There is an SDK that makes creating pipelines easier: the Kubeflow Pipelines SDK. In this example, we will use the Kubeflow Pipelines SDK for Tekton to compile, upload, and run Kubeflow Pipelines DSL scripts on a Kubeflow Pipelines back end with Tekton.

    The code for this pipeline implementation can be found in the kfp_tekton folder of the repository, specifically in the notebook 01_hello_world.ipynb.

    Install kfp-tekton package

    !pip install kfp-tekton==1.5.9
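
    The leading ! runs the command from a Jupyter notebook cell; drop it if you are installing the package from a regular shell.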

    Import the necessary packages, including task functions

    import os
    import sys

    # Make the repository root importable so the components package can be found
    sys.path.append(os.path.dirname(os.getcwd()))
    
    import kfp
    import kfp_tekton
    
    from components.create_hello_world_message   import create_hello_world_message
    from components.create_odds_or_evens_message import create_odds_or_evens_message
    from components.create_random_number         import create_random_number

    Create the task components

    Build Python function-based components from the task functions:

    # UBI 9 image with Python 3.11, used as the base image for every task container
    task_base_image = 'registry.access.redhat.com/ubi9/python-311'
    
    create_hello_world_message_op = kfp.components.create_component_from_func(
        func       = create_hello_world_message,
        base_image = task_base_image
    )
    
    create_random_number_op = kfp.components.create_component_from_func(
        func       = create_random_number,
        base_image = task_base_image
    )
    
    create_odds_or_evens_message_op = kfp.components.create_component_from_func(
        func       = create_odds_or_evens_message,
        base_image = task_base_image
    )

    Create the pipeline

    pipeline_name        = '01_hello_world'
    pipeline_description = 'Hello World Pipeline'
    
    @kfp.dsl.pipeline(
        name        = pipeline_name,
        description = pipeline_description
    )
    def pipeline(
        name    : str,
        minimum : int,
        maximum : int
    ):
    
        create_hello_world_message_task = create_hello_world_message_op(
            name = name
        )
    
        create_random_number_task = create_random_number_op(
            minimum = minimum,
            maximum = maximum
        )
    
        create_odds_or_evens_message_op(
            hello_world_message = create_hello_world_message_task.output,
            random_number       = create_random_number_task.output
        )
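
    Note that the execution order does not have to be declared explicitly: because create_odds_or_evens_message_op consumes the outputs of the other two tasks, Kubeflow Pipelines infers that it must run after both of them, while the first two tasks are free to run in parallel.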

    Create pipeline YAML

    In this step, a Tekton-compatible PipelineRun YAML file will be created.

    pipeline_package_path = os.path.join('yaml', f'{ pipeline_name }.yaml')
    
    # Compile the pipeline function into a Tekton PipelineRun definition
    kfp_tekton.compiler.TektonCompiler().compile(
        pipeline_func = pipeline,
        package_path  = pipeline_package_path
    )
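
    Because the generated file is a plain Tekton PipelineRun definition, in principle you could also apply it to the cluster directly (for example, with oc create -f yaml/01_hello_world.yaml), assuming the Kubeflow Pipelines back end is set up in the target namespace. In this article, we will import it through the graphical interface instead.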

    Using the Kubeflow Pipelines graphical interface, you can use the generated YAML file to import the pipeline, as shown in Figure 3.

    Figure 3: Importing the pipeline YAML

    The newly imported pipeline should now be visible, showing the flow of tasks, and is ready to be executed (Figure 4).

    Figure 4: Viewing the imported pipeline through the YAML file

    Once imported, we can run the pipeline through the graphical interface and monitor its progress (Figure 5).

    Figure 5: Running the pipeline

    Create pipeline run

    It is possible to execute a pipeline via code, either through the function annotated with @kfp.dsl.pipeline or by executing the pipeline definition YAML file.

    # Fill in the Kubeflow Pipelines endpoint and an authentication token for your cluster
    kubeflow_host  = ''
    kubeflow_token = ''
    
    pipeline_arguments = {
        'name'    : 'ML Pipelines',
        'minimum' : 1,
        'maximum' : 100
    }
    
    kfp_tekton.TektonClient(host = kubeflow_host, existing_token = kubeflow_token).create_run_from_pipeline_package(
        pipeline_file = pipeline_package_path,
        arguments     = pipeline_arguments
    )
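
    Alternatively, you can skip the compile step and run the pipeline function directly. The client inherits create_run_from_pipeline_func from the Kubeflow Pipelines SDK (a sketch, assuming your kfp-tekton version exposes it):

    kfp_tekton.TektonClient(host = kubeflow_host, existing_token = kubeflow_token).create_run_from_pipeline_func(
        pipeline_func = pipeline,
        arguments     = pipeline_arguments
    )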

    As before, we can observe the pipeline execution through the Kubeflow Pipelines graphical interface (Figure 6).

    Figure 6: Running the pipeline from a Jupyter notebook

    Creating the pipeline using Elyra

    Elyra is an open source set of extensions to JupyterLab focused on AI/ML development. It provides a Pipeline Visual Editor for building AI pipelines from notebooks, Python scripts, and R scripts, simplifying the conversion of multiple notebooks or script files into batch jobs or workflows.

    Pipelines created by this plug-in are JSON files with a .pipeline extension. You can check the Hello World pipeline in the elyra folder of the repository; it can be opened directly in JupyterLab (make sure you have the extension installed).

    Before creating the pipeline, configure the runtime and the runtime images used to run the tasks. If you are using Red Hat OpenShift AI, this configuration is already in place by default.

    Drag and drop the components to create the tasks

    You can also connect the outputs and inputs of the tasks (see Figure 7).

    Figure 7: Elyra pipeline tasks

    Configure tasks

    Create the pipeline parameters and inject them into the corresponding tasks. Also configure the runtime image and output file for each task. Everything can be done from the graphical interface:

    • create_hello_world_message.py
        • Runtime image: Python 3
        • Pipeline parameter: name (str)
        • Output: create_hello_world_message.json
    • create_random_number.py
        • Runtime image: Python 3
        • Pipeline parameters: minimum (int), maximum (int)
        • Output: create_random_number.json
    • create_odds_or_evens_message.py
        • Runtime image: Python 3
        • Output: create_odds_or_evens_message.json

    Warning: Elyra Pipeline

    Using the Elyra Pipeline plug-in, components are executed as scripts, so you need to write a wrapper to handle the execution. Here is an example from create_hello_world_message.py:

    if __name__ == '__main__':
        # Elyra Pipelines wrapper: Elyra runs this file as a script and
        # injects the pipeline parameters as environment variables

        import os
        import json

        name = os.getenv('name')

        hello_world_message = create_hello_world_message(
            name = name
        )

        output = {
            'name'                : name,
            'hello_world_message' : hello_world_message
        }

        # Persist the result so downstream tasks can consume it
        with open('create_hello_world_message.json', 'w', encoding = 'utf-8') as output_file:
            json.dump(output, output_file, ensure_ascii = False, indent = 4)
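
    The other scripts need the same kind of wrapper. For create_random_number.py, note that the pipeline parameters arrive as environment variable strings, so a sketch of its wrapper (the details here are illustrative; the real code is in the repository) must cast them to integers:

    if __name__ == '__main__':

        import os
        import json

        # Pipeline parameters arrive as environment variable strings
        minimum = int(os.getenv('minimum'))
        maximum = int(os.getenv('maximum'))

        random_number = create_random_number(
            minimum = minimum,
            maximum = maximum
        )

        output = {
            'minimum'       : minimum,
            'maximum'       : maximum,
            'random_number' : random_number
        }

        with open('create_random_number.json', 'w', encoding = 'utf-8') as output_file:
            json.dump(output, output_file, ensure_ascii = False, indent = 4)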

    If everything is configured correctly, you will be able to run the pipeline. Figure 8 shows the Hello World pipeline from the repository 01_hello_world.pipeline:

    Figure 8: Elyra Hello World pipeline

    When running the pipeline, you should receive a job submission confirmation message, as illustrated in Figure 9.

    Figure 9: Elyra Hello World pipeline job submission

    You can follow the execution through the Kubeflow Pipelines graphical interface (Figure 10).

    Figure 10: Elyra Hello World pipeline running

    Red Hat OpenShift AI

    If you are using OpenShift AI, you can import PipelineRun YAML files (as we did earlier with kfp-tekton) or check imported pipelines directly through the OpenShift console (Figure 11).

    Figure 11: OpenShift AI pipeline

    You can also search and review previous runs of pipelines executed through kfp-tekton or Elyra, as shown in Figure 12.

    Figure 12: OpenShift AI pipeline runs

    Finally, you can create new runs and experiments (Figure 13).

    Figure 13: OpenShift AI pipeline run

    Conclusion

    There are several tools for automating machine learning workflows. In this article, we presented two ways to create and run pipelines: in code (kfp-tekton) or visually (Elyra). Choose the approach that best suits your process and that you and your team feel most comfortable with.

    In conclusion, implementing MLOps with Kubeflow Pipelines represents a fundamental advancement in establishing a robust and efficient machine learning lifecycle.

    As organizations increasingly recognize the importance of seamless collaboration between data scientists, machine learning engineers, and operations teams, Kubeflow Pipelines emerges as a powerful orchestration platform that simplifies deployment, monitoring, and management of machine learning workflows.

    Last updated: April 1, 2024
