Open source edge detection with OpenCV and Pachyderm

June 1, 2022
JooHo Lee
Related topics:
Artificial intelligence, CI/CD, Kubernetes
Related products:
Red Hat OpenShift


    Edge detection is central to image recognition, which is one of the most common applications of machine learning. This article introduces a Jupyter notebook for creating a Pachyderm pipeline that performs edge detection. For convenience, the article uses a Red Hat OpenShift cluster configuration described in an earlier Red Hat Developer article, How to install an open source tool for creating machine learning pipelines, but you can use the notebook on any Kubernetes cluster.

     

    Edge detection with OpenCV

    A good way to understand edge detection is to look at Figure 1. Compare the picture my son drew of the cartoon character Shrek, on the left, with the image produced by an edge detection algorithm on the right. Edge detection is one of the first steps in many machine learning processes that alter images or identify their content.

    Figure 1: Edge detection, performed here on a child's picture, is the first step in identifying the elements of an image.

    OpenCV is one of the most popular open source tools for image and video manipulation. I use its edge detection library along with Pachyderm to create the machine learning pipeline in the Jupyter notebook.
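
To make the idea of edge detection concrete, here is a minimal sketch in plain Python. It is not the notebook's code, which relies on OpenCV (OpenCV ships more robust detectors such as cv2.Canny). The function name sobel_edges, the threshold value, and the tiny test image are assumptions chosen for illustration. The sketch marks a pixel as an edge when the brightness gradient around it, estimated with Sobel kernels, is large.

```python
def sobel_edges(image, threshold=2.0):
    """Return a binary edge map (1 = edge) for a 2D list of grayscale values.

    Illustrative only: border pixels are left as 0, and the default
    threshold is an arbitrary assumption.
    """
    height, width = len(image), len(image[0])
    # Sobel kernels estimate the horizontal (kx) and vertical (ky) gradient.
    kx = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]
    ky = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]
    edges = [[0] * width for _ in range(height)]
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            gx = gy = 0
            for dy in range(3):
                for dx in range(3):
                    pixel = image[y + dy - 1][x + dx - 1]
                    gx += kx[dy][dx] * pixel
                    gy += ky[dy][dx] * pixel
            # A large gradient magnitude means brightness changes sharply here.
            magnitude = (gx * gx + gy * gy) ** 0.5
            edges[y][x] = 1 if magnitude >= threshold else 0
    return edges


# A 5x6 test image: dark (0) on the left half, bright (1) on the right half.
image = [[0, 0, 0, 1, 1, 1] for _ in range(5)]
edge_map = sobel_edges(image)
for row in edge_map:
    print("".join("#" if value else "." for value in row))
```

Only the two columns that straddle the dark-to-bright boundary are marked, which is the same effect, in miniature, as the outline extracted from the drawing in Figure 1.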

    There are several advantages to using Pachyderm for this task. Like Git, Pachyderm's data versioning allows you to manage your data and iterate over it using repositories and commits. Pachyderm is not limited to text files and structured data, but can version any data (image, audio, video, text). Pachyderm's version control system is optimized to scale to large datasets of all types, providing consistent reproducibility.

    Pachyderm's pipelines allow you to connect your code to data repositories. They can be used to automate many components of the machine learning lifecycle, such as data preparation, testing, and model training, by rerunning the pipeline when new data is committed. Pachyderm's pipelines and version control capabilities work together to visualize the end-to-end flow of your machine learning workflow.

    The notebook in this article creates two repositories. The first, named images, receives the input images. The second, named edges, stores the results of the Pachyderm pipeline (Figure 2). Once the flow is configured, execution of the pipeline is triggered by committing an image to the images source repository. The Python source code I use for edge detection is in a GitHub repository.

    Figure 2: Images in an input Pachyderm repository pass through the edges.py pipeline to generate edges in an output Pachyderm repository.
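
The flow in Figure 2 is described to Pachyderm by a pipeline specification. The following sketch builds one as a Python dictionary and serializes it to JSON. It is not copied from the notebook: the container image name and the /edges.py script path are placeholder assumptions, while the repository name images and the pipeline name edges follow the article. Pachyderm names a pipeline's output repository after the pipeline, which is why the results land in edges.

```python
import json

# Hypothetical pipeline spec. The container image and the script path are
# assumptions; substitute the values used in your own environment.
pipeline_spec = {
    "pipeline": {"name": "edges"},
    "description": "Detect edges in every image committed to the images repo.",
    "input": {
        "pfs": {
            "repo": "images",  # input repository watched by the pipeline
            "glob": "/*",      # treat each top-level file as its own unit of work
        }
    },
    "transform": {
        "image": "example.com/opencv-edges:latest",  # placeholder image name
        "cmd": ["python3", "/edges.py"],
    },
}

spec_json = json.dumps(pipeline_spec, indent=2)
print(spec_json)
```

Inside the running container, Pachyderm mounts the input repository under /pfs/images and collects whatever the script writes to /pfs/out as the pipeline's output.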

    Pachyderm's pipeline monitors the images source repository and detects when a new image is pushed to it. Once the pipeline pod is running, you can reuse it by pushing other images to the source repository. The notebook includes detailed explanations for each of its cells.
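
Outside the notebook, the same trigger-by-commit flow can be driven with the pachctl command-line tool. The sketch below only builds and prints the commands rather than running them. The helper name pachctl_commands, the file drawings/shrek.png, and the spec file edges.json are hypothetical; the notebook itself may use a different client.

```python
import shlex


def pachctl_commands(image_path, spec_path="edges.json"):
    """Build the pachctl invocations for one run of the images -> edges flow."""
    file_name = image_path.rsplit("/", 1)[-1]
    return [
        # Create the input repository.
        ["pachctl", "create", "repo", "images"],
        # Register the pipeline that reads from images and writes to edges.
        ["pachctl", "create", "pipeline", "-f", spec_path],
        # Committing a file to images is what triggers the pipeline.
        ["pachctl", "put", "file", f"images@master:/{file_name}", "-f", image_path],
        # Inspect the resulting job and the generated output files.
        ["pachctl", "list", "job"],
        ["pachctl", "list", "file", "edges@master"],
    ]


commands = pachctl_commands("drawings/shrek.png")
for argv in commands:
    print(shlex.join(argv))
```

To execute the commands against a cluster, each argument list could be passed to subprocess.run(argv, check=True). Repeating only the put file step with another image reuses the existing pipeline.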

    Obtaining and running the Jupyter notebook

    The environment in my previous article used Open Data Hub to install Pachyderm and JupyterHub on an OpenShift instance. The steps in this article start with that environment. Using Open Data Hub, you can also deploy Pachyderm and JupyterHub on any Kubernetes cluster you have.

    First, visit your installed instance of JupyterHub. If you installed JupyterHub through Open Data Hub, you can find a jupyterhub route in the Open Data Hub project (Figure 3).

    Figure 3: Choose Routes in the left-hand menu and then click the jupyterhub route.

    OpenShift's OAuth proxy is integrated with JupyterHub, so you can get into JupyterHub after you enter your username and password in OpenShift (Figure 4). The web page prompts you to give JupyterHub access to your account (Figure 5).

    Figure 4: Log in to OpenShift Container Platform to get access to JupyterHub.
    Figure 5: Click "Allow selected permissions" to give JupyterHub access to your information.

    We turn now to the JupyterHub web interface, Swan. On the Start a notebook server page, choose the Standard Data Science notebook (Figure 6).

    Figure 6: Start the server for the Standard Data Science notebook.

    The image takes a few minutes to load (Figure 7).

    Figure 7: A pop-up shows the progress while the image is being loaded.

    Using the menu at the top of the interface (Figure 8), clone the GitHub repository at https://github.com/Jooho/pachyderm-operator-manifests.git (Figure 9).

    Figure 8: The icon to clone the GitHub repository is the rightmost icon on the top menu on the left of the screen.
    Figure 9: A dialog allows you to paste in the URL of the Git repository.

    Run the notebook by opening the file on your system (Figure 10). The notebook is at /pachyderm-operator-manifests/notebooks/pachyderm-opencv.ipynb in the repository you downloaded from GitHub (Figure 11). You can now interact with the cells in the OpenCV Edge Detection Jupyter notebook.

    Figure 10: From the File menu, choose "Open from Path."
    Figure 11: In the dialog box, paste in the path to the local notebook.

    Congratulations: You're now ready to start experimenting with image recognition, which has a number of use cases in machine learning. The following two-minute video outlines the steps in this article so you can see how it works in action!

    Last updated: November 6, 2023
