Run Red Hat OpenShift Container Platform 4.13 on VMware Cloud Foundation 5.1 with NVIDIA AI Enterprise

February 22, 2024
Vivien Wang, Victor (Shi) Chen - Broadcom
Related topics:
Artificial intelligence, Kubernetes
Related products:
Red Hat OpenShift Container Platform

    Red Hat OpenShift Container Platform 4.13 is now available on VMware Cloud Foundation 5.1 with NVIDIA AI Enterprise. The combination pairs automated, consistent, and repeatable infrastructure with NVIDIA's latest tooling for data science and AI application deployment. For details, see VMware's official solution brief and blog announcement.

    What does this mean?

    OpenShift on VMware Cloud Foundation provides you with the advantages of a contemporary private cloud, leveraging the established VMware Software-Defined Data Center architecture:

    • A standardized, repeatable methodology for deploying infrastructure.
    • Enhanced automation that reduces manual errors and boosts administrative efficiency.
    • Cloud flexibility and scalability allowing for rapid expansion in line with business growth.
    • Integration with VMware's proven networking solution, VMware NSX.
    • Integration with VMware's proven storage solution, vSAN.
    • Use of VMware vSphere capabilities, including VMware vSphere Distributed Resource Scheduler (DRS), vSphere vMotion, and Fault Tolerance.
    • Integration with NVIDIA AI Enterprise (NVAIE) and the use of vGPU for fine-grained GPU resource allocation.

    NVIDIA AI Enterprise

    NVIDIA AI Enterprise (NVAIE) streamlines AI application development, from data science processes to deploying advanced AI, including generative AI. Certified for both enterprise data centers and cloud environments, the platform improves infrastructure efficiency, simplifies workload management, and keeps workloads compatible across multi-cloud and hybrid-cloud setups. NVAIE receives regular security updates, maintains API stability, and comes with NVIDIA Enterprise Support, which makes it suitable for AI-dependent enterprises that need a secure, stable, and supported path from pilot projects to full-scale production.

    Virtualization brings operational savings to containerized workloads

    OpenShift, integrated with vSphere, delivers an infrastructure primed for developers, supporting both OpenShift container platforms and traditional VMs. vSphere enables management of diverse workloads through familiar vCenter tools across hybrid clouds. It incorporates proven features like vSphere High Availability (HA) and policy-driven management for workload resilience. vSphere bolsters container security by leveraging VM isolation and simplifies operations with its lifecycle management and enterprise-grade resilience, reducing the effort needed for bare metal maintenance and recovery.

    Higher container pod density results in lower capex

    OpenShift 4.x supports at most 500 pods per worker node, which often leaves bare-metal hardware underused. When worker nodes run as virtual machines on vSphere, a single physical server can host several worker nodes, so the number of pods per server can exceed 500. The higher pod density means fewer physical servers for the same container volume, which reduces capital expenses. In this respect, running OpenShift on vSphere improves resource utilization over bare metal.
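
    The pod density argument can be made concrete with a little arithmetic. The following is a minimal sketch, not a sizing tool: the function name, the 6,000-pod workload, and the figure of four worker VMs per host are hypothetical values chosen for illustration, and the calculation ignores CPU, memory, and failover headroom, which limit real deployments.

```python
import math

MAX_PODS_PER_NODE = 500  # per-worker-node pod cap cited in the text


def hosts_needed(total_pods: int, worker_vms_per_host: int = 1) -> int:
    """Physical hosts needed if each host runs `worker_vms_per_host` worker nodes.

    Only the per-node pod cap is considered; CPU and memory are ignored.
    """
    if total_pods < 0 or worker_vms_per_host < 1:
        raise ValueError("total_pods must be >= 0 and worker_vms_per_host >= 1")
    pods_per_host = MAX_PODS_PER_NODE * worker_vms_per_host
    return math.ceil(total_pods / pods_per_host)


# Bare metal: one worker node per physical server.
bare_metal = hosts_needed(6000)
# Virtualized (hypothetical ratio): four worker VMs per ESXi host.
virtualized = hosts_needed(6000, worker_vms_per_host=4)
print(bare_metal, virtualized)  # prints "12 3"
```

    Under these assumptions the same 6,000 pods fit on 3 hosts instead of 12, provided each host has the CPU and memory to carry four worker nodes.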

    VMware Cloud Foundation 5.1

    VMware Cloud Foundation is a cohesive SDDC platform that integrates VMware vSphere, vSAN, NSX, and, optionally, vRealize Suite components into a unified stack, providing ready-to-use cloud infrastructure for both private and public clouds. This solution uses VMware Cloud Foundation 5.1, although subsequent minor versions are compatible as well.

    See Preparing to install on vSphere for VMware vSphere infrastructure requirements.

    VMware Cloud Foundation installation

    The key steps for VMware Cloud Foundation installation are as follows:

    1. Establish a management domain.
    2. Integrate ESXi hosts into the system.
    3. Create a workload domain using the unused ESXi hosts.

    Refer to Deployment Overview of VMware Cloud Foundation for comprehensive instructions on VMware Cloud Foundation setup procedures.

    After the installation, you will need to verify that NSX and vSAN are activated within this workload domain, as shown in Figure 1. NSX and vSAN are integral to this solution and will be utilized in the subsequent OpenShift configuration steps.

    Figure 1: Verifying the setup and status of the VMware Cloud Foundation deployment.

    Workload domain preparation

    After the workload domain for OpenShift is created, NSX Managers are deployed in the management domain. The NSX Edge cluster, however, is not deployed automatically: you must add an edge cluster after creating the workload domain, as illustrated in Figure 2. Once the edge cluster is in place, its status should be "active," and the names of its edge nodes are displayed underneath.

    Figure 2: Workload domain preparation, adding the edge cluster.

    vSAN configuration

    The validation of the solution utilized a 4-node vSAN cluster as the foundational unit. Testing for validation was performed with the vSAN datastore's standard storage policy, which includes RAID 1 FTT=1 and activated checksums. This vSAN cluster operated with both deduplication and compression turned off, and without any encryption. The subsequent sections detail the specific configurations of the vSAN cluster and certain aspects of the Storage Policy Based Management (SPBM).
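
    To illustrate what the RAID 1 FTT=1 policy implies for sizing, here is a rough sketch. It relies on the general vSAN mirroring rule that RAID 1 keeps FTT + 1 full copies of each object and needs at least 2 × FTT + 1 hosts; the function names are hypothetical, and the small witness components and any slack space are deliberately ignored.

```python
def raid1_raw_capacity_gb(usable_gb: float, ftt: int = 1) -> float:
    """Approximate raw vSAN capacity consumed by `usable_gb` under RAID 1.

    Mirroring stores FTT + 1 full copies; witness overhead is ignored.
    """
    if usable_gb < 0 or ftt < 0:
        raise ValueError("usable_gb and ftt must be non-negative")
    return usable_gb * (ftt + 1)


def raid1_min_hosts(ftt: int = 1) -> int:
    """Minimum number of hosts required for RAID 1 with the given FTT."""
    if ftt < 0:
        raise ValueError("ftt must be non-negative")
    return 2 * ftt + 1


# A 100 GB persistent volume under the tested policy (RAID 1, FTT=1).
print(raid1_raw_capacity_gb(100))  # prints "200"
print(raid1_min_hosts(1))          # prints "3"
```

    With FTT=1 the minimum is three hosts, so the 4-node cluster used for validation leaves one host of headroom for rebuilds after a failure.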

    Deduplication and compression

    The "Deduplication and Compression" feature is configured at the cluster level, so it is either enabled or disabled for the entire vSAN cluster. In our experiments, we disabled it. Enabling this feature can reduce vSAN storage consumption, at the cost of increased latency for OpenShift applications.

    Failures to Tolerate (FTT)

    Failures to Tolerate (FTT) is an option within vSAN's storage policy settings. For a StorageClass in OpenShift and its corresponding vSAN storage policy, we recommend setting FTT to 1, which is the value we used throughout testing. Avoid FTT=0 in an OpenShift with vSAN setup: with FTT=0, data from replicas of the same pod could end up on a single physical disk, and a failure of that disk would then risk data loss.
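
    As an illustration of how an OpenShift StorageClass can point at a vSAN storage policy, the sketch below builds a StorageClass manifest as a Python dictionary and prints it as JSON, a format that `oc apply -f` also accepts. The names here are assumptions, not values from the tested setup: "vsan-ftt1" is a hypothetical class name, "vSAN Default Storage Policy" stands in for whichever FTT=1 policy exists in your vCenter, and the provisioner and `storagepolicyname` parameter follow the vSphere CSI driver conventions, which you should verify against the driver version in your cluster.

```python
import json


def vsan_storage_class(name: str, policy_name: str) -> dict:
    """Build a StorageClass manifest that references a vSAN storage policy."""
    return {
        "apiVersion": "storage.k8s.io/v1",
        "kind": "StorageClass",
        "metadata": {"name": name},
        # Assumption: the cluster uses the vSphere CSI driver.
        "provisioner": "csi.vsphere.vmware.com",
        # Assumption: the named policy is defined in vCenter with FTT=1.
        "parameters": {"storagepolicyname": policy_name},
    }


# Hypothetical class name and policy name, for illustration only.
manifest = vsan_storage_class("vsan-ftt1", "vSAN Default Storage Policy")
print(json.dumps(manifest, indent=2))
```

    The FTT value itself lives in the vSAN storage policy, not in the StorageClass, so the policy referenced here must be created or checked in vCenter first.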

    Network configuration

    Figure 3 illustrates the network setup for the OpenShift cluster within the workload domain of VMware Cloud Foundation, using the VMware vSphere Distributed Switch.

    Figure 3: Network setup for the OpenShift cluster.

    The underlying network for this vSphere infrastructure is provided by NSX, which is essential for the OpenShift cluster's networking. Deployment of an NSX Edge cluster is necessary to facilitate external access to the OpenShift cluster. Configuring BGP peering and route distribution with the upstream network is mandatory. 

    • VMware Cloud Foundation automatically generates the "VCF-xxxx-External-1" and "External-2" port groups for NSX use. 
    • The "xxxx-management", "xxxx-vsan", and "xxxx-vmotion" port groups are likewise created automatically by VMware Cloud Foundation. They carry management, vSAN, and vMotion traffic, respectively.
    • The "ocp-segment" represents a logical switch that is manually configured within NSX and designated for use by OpenShift VM nodes.

    Dual 100 GbE vmnics are employed and set up with teaming policies. This management domain can then be utilized by various workloads.

    We’ll cover more about configuring Red Hat OpenShift Container Platform on NVIDIA AI Enterprise in part 2—stay tuned. 

    References

    • Red Hat OpenShift Container Platform 4.13 on VMware Cloud Foundation 5.1 with NVIDIA AI Enterprise
    • NVIDIA AI Software Platform for Enterprise
    • Preparing to install on vSphere
    • Deployment Overview of VMware Cloud Foundation  
    Last updated: March 15, 2024
