5 Things I wish I knew before deploying OpenShift hosted clusters on bare metal

November 19, 2025
Andre Rocha
Related topics:
Developer Productivity, Kubernetes, Virtualization
Related products:
Red Hat OpenShift

    Deploying Red Hat OpenShift hosted clusters on bare metal (often referred to as HyperShift) is a game-changer for infrastructure efficiency. By decoupling control planes from worker nodes, you can dramatically reduce hardware costs and dedicate more physical resources to running the virtual machines that matter.

    As with any powerful technology, the official documentation provides the “what,” but the “how” and “why” are often discovered in the field. What follows is a collection of practical knowledge: the non-obvious truths and critical procedures you only learn through hands-on deployment, troubleshooting, and long hours at the command line. This is my guide to the five most surprising and impactful lessons from the field; things I wish someone had told me before I started.

    1. The network is your foundation

    While any OpenShift installation has its complexities, I've learned that the most delicate, highest-impact component of a hosting cluster deployment is the network setup. If you get this wrong, nothing else matters.

    A common scenario we encounter is migrating from VMware environments where the underlying physical switches are not configured for link aggregation protocols like LACP. This is because VMware has its own sophisticated load balancing that doesn't require it. Replicating this behavior in OpenShift without forcing a reconfiguration of the client's physical switches requires a very specific network configuration. 

    The solution is to use a balance-xor bond combined with a specific transmit hash policy. This policy analyzes L2 information (the source MAC address and the VLAN tag) to distribute traffic across interfaces, effectively mimicking VMware's default behavior without needing any switch-side changes.

    This advanced configuration isn't straightforward to apply, and for virtualization workloads it's non-negotiable that you also set your MTU to 9000 for performance. By far, the most efficient and reliable way to implement it is through the agent-based installation method. It allows you to inject these precise network settings into the discovery ISO, ensuring the nodes come online correctly from the very beginning.
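
    To make this concrete, below is a minimal sketch of what such a host entry might look like in agent-config.yaml, using the nmstate-style networkConfig the agent-based installer accepts. The interface names, MAC addresses, and IP are illustrative placeholders, and the xmit_hash_policy value should be validated against your kernel and OpenShift version:

        hosts:
          - hostname: worker-0
            interfaces:
              - name: eno1
                macAddress: "00:11:22:33:44:01"
              - name: eno2
                macAddress: "00:11:22:33:44:02"
            networkConfig:
              interfaces:
                - name: bond0
                  type: bond
                  state: up
                  mtu: 9000                  # jumbo frames for virtualization traffic
                  link-aggregation:
                    mode: balance-xor
                    options:
                      # hash on source MAC + VLAN tag to mimic VMware's default behavior
                      xmit_hash_policy: vlan+srcmac
                    port:
                      - eno1
                      - eno2
                  ipv4:
                    enabled: true
                    dhcp: false
                    address:
                      - ip: 192.0.2.20
                        prefix-length: 24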

    Getting the network right from the start isn't just a task; it's the foundation for the entire platform's stability and performance. This is where you should focus the majority of your initial attention.

    2. The Kube API endpoint is an IP address, not a name

    In a standard OpenShift deployment, you interact with the cluster's API via a fully qualified domain name (FQDN), like api.mycluster.example.com. It's intuitive and standard practice. With a hosted cluster, this expectation can lead to a common roadblock.

    One of the first issues you'll encounter is that the API server for a hosted cluster is exposed via a raw IP address, not an FQDN. This is the expected and designed behavior. A common pitfall is trying to force the use of an FQDN, which inevitably leads to certificate validation errors when you try to access the cluster. 

    If you try to reach the API using the cluster's domain name, the TLS handshake fails with an x509 certificate error, and you won't be able to connect.

    While it's technically possible to work around this, it requires a significant amount of extra effort, such as setting up a custom Certificate Authority and managing a complex chain of custom certificates. In almost every case, it's not worth the trouble. This isn't just an inconvenience; it's a fundamental shift in how you must think about cluster identity and access, especially when designing automation that needs to be portable between standard and hosted environments. My advice is simple: accept it, configure your tools to use the IP address, and move on. For now, that's how it is.
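
    For example, following the common HyperShift convention where the hosted cluster's admin kubeconfig is stored as a secret on the hosting cluster (the cluster name, namespace, and IP below are placeholders):

        # On the hosting cluster: extract the hosted cluster's kubeconfig
        oc get secret -n clusters mycluster-admin-kubeconfig \
          -o jsonpath='{.data.kubeconfig}' | base64 -d > mycluster-kubeconfig

        # The server field inside points at a raw IP, e.g. https://192.0.2.10:6443.
        # Use it as-is; substituting an FQDN triggers the x509 error described above.
        oc --kubeconfig mycluster-kubeconfig get nodes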

    3. It's not “batteries included”

    In a typical OpenShift installation, Ingress just works. The necessary load balancing services are provisioned automatically. When deploying a hosted cluster on bare metal, this is not the case. It's a manual, multi-step process that is important to anticipate.

    Here's what happens. After the agent nodes for your hosted cluster are provisioned and ready, the deployment will appear to stall. The console and ingress operators will show persistent errors in the operator list. This isn't a failure. It's your cue to intervene.

    The solution is to manually install and configure MetalLB (not on the hosting cluster where you've been working, but directly on the new, partially deployed hosted cluster). This involves defining its entire operational context through a series of YAML manifests (sketched after this list):

    1. Creating a namespace for MetalLB.
    2. Creating an OperatorGroup.
    3. Creating a subscription to install the operator. This is a critical step: you must copy the startingCSV version from the MetalLB instance on your hosting cluster to ensure version alignment (version drift here is a classic problem).
    4. Creating the MetalLB instance.
    5. Defining an IPAddressPool with the IP address for your Ingress.
    6. Creating an L2Advertisement.

    The final step (configuring the L2Advertisement) is particularly delicate. It requires you to reference the br-ex bridge interface, a core component that you should never modify or reference under any other circumstances. An error here can take your entire cluster offline.
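
    To tie the steps together, here is a hedged sketch of those manifests, applied to the hosted cluster. The names, channel, IP range, and startingCSV value are placeholders; copy the real startingCSV from the Subscription on your hosting cluster, and double-check the br-ex reference before applying:

        apiVersion: v1
        kind: Namespace
        metadata:
          name: metallb-system
        ---
        apiVersion: operators.coreos.com/v1
        kind: OperatorGroup
        metadata:
          name: metallb-operator
          namespace: metallb-system
        spec:
          targetNamespaces:
            - metallb-system
        ---
        apiVersion: operators.coreos.com/v1alpha1
        kind: Subscription
        metadata:
          name: metallb-operator-sub
          namespace: metallb-system
        spec:
          channel: stable
          name: metallb-operator
          source: redhat-operators
          sourceNamespace: openshift-marketplace
          startingCSV: metallb-operator.v4.y.z   # copy the exact version from the hosting cluster
        ---
        apiVersion: metallb.io/v1beta1
        kind: MetalLB
        metadata:
          name: metallb
          namespace: metallb-system
        ---
        apiVersion: metallb.io/v1beta1
        kind: IPAddressPool
        metadata:
          name: ingress-pool
          namespace: metallb-system
        spec:
          addresses:
            - 192.0.2.50/32        # the IP reserved for the hosted cluster's Ingress
        ---
        apiVersion: metallb.io/v1beta1
        kind: L2Advertisement
        metadata:
          name: ingress-l2
          namespace: metallb-system
        spec:
          ipAddressPools:
            - ingress-pool
          interfaces:
            - br-ex                # delicate: a mistake here can take the cluster offline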

    This manual step reveals a core design principle of bare metal HyperShift: the platform provides the control plane, but you own the network integration entirely. Understanding this division of responsibility is the key to successfully managing the platform long term.

    4. The installation is not broken

    The user experience during hosted cluster creation follows a specific flow that can be surprising at first. You'll watch as many components in the installation UI turn green, showing progress. Then, suddenly, everything will stop. The progress bar won't advance, and you'll see errors related to the console and Ingress.

    Your first instinct will be to assume the installation has failed. It hasn't. The process is simply paused, waiting for you to perform the manual MetalLB setup on the hosted cluster, as described in the previous point. The system is waiting for you to provide the Ingress layer it can't create for itself.
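
    A quick way to confirm you're in this paused state rather than a genuine failure is to inspect the hosted cluster's operators directly (a hedged sketch, reusing the kubeconfig extracted earlier):

        # Only console and ingress should report as degraded or unavailable while
        # the Ingress layer is missing; widespread failures point to a real problem.
        oc --kubeconfig mycluster-kubeconfig get clusteroperators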

    Another important detail is a status check in the UI labeled external-dns-reachable. This check is mainly relevant for public cloud deployments, so for on-premise, bare metal deployments it's safe to disregard an indeterminate status here. Knowing to ignore it separates experienced engineers from those who will lose hours chasing a non-existent problem.

    This phase of the process, where the official steps run out just as things start to get complicated, is a perfect example of where field knowledge bridges the gap and guides you to the next move.

    Understanding the rhythm of the installation is key. You have to learn when the system has truly failed versus when it is simply waiting for your next move. This knowledge separates a smooth deployment from a challenging one.

    5. The installer deletes its blueprints

    This last point is a simple but critical piece of practical advice that can save you from a world of pain. The agent-based installer has a peculiar but important behavior: after it successfully generates the discovery ISO, it deletes the configuration files you used to create it.

    The problem this creates is obvious. If the installation fails at a later stage, or if you simply need to make a small change and regenerate the ISO, your original configuration files are gone. You're forced to recreate them from scratch, which, given their complexity, is both time-consuming and prone to error.

    The solution is incredibly simple, as captured in this piece of advice from the field.

    Save your files with a .bkp extension, both agent-config and install-config. Why? Because these files can be deleted during the ISO generation process.

    Before you run the command to generate the ISO, just make a backup of your configuration files. It's the small, practical tips like this one that are often the most valuable. They represent the distilled experience of engineers who have already navigated this process, and they can save you hours of rework.
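
    In practice, that's just a couple of copy commands before you generate the ISO (the working directory layout here is illustrative):

        # Back up the installer inputs; the agent-based installer consumes
        # (deletes) them while generating the discovery ISO.
        cp install-config.yaml install-config.yaml.bkp
        cp agent-config.yaml agent-config.yaml.bkp

        openshift-install agent create image --dir .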

    Final thoughts

    Successfully deploying OpenShift hosted clusters on bare metal is as much about navigating these unwritten rules and operational problems as it is about following the official steps. The technology offers unparalleled efficiency for managing OpenShift at scale, but it rewards those who are willing to dive deep into its practical realities.

    The lessons learned from the field passed between engineers are what transform a complex process into a repeatable and reliable one. What piece of knowledge has been a lifesaver in your most complex infrastructure projects?
