5 tips for developing Kubernetes Operators with the new Operator SDK

September 11, 2020
Laurent Broudoux
Related topics:
DevOps, Go, Kubernetes, Operators
Related products:
Red Hat OpenShift

    Kubernetes Operators are all the rage this season, and the fame is well deserved. Operators are evolving from being used primarily by technical-infrastructure gurus to becoming more mainstream, Kubernetes-native tools for managing complex applications. Kubernetes Operators today are important for cluster administrators and ISV providers, and also for custom applications developed in house. They provide the base for a standardized operational model that is similar to what cloud providers offer. Operators also open the door to fully portable workloads and services on Kubernetes.

    The new Kubernetes Operator Framework is an open source toolkit that lets you manage Kubernetes Operators in an effective, automated, and scalable way. The Operator Framework consists of three components: the Operator SDK, the Operator Lifecycle Manager, and OperatorHub. In this article, I introduce tips and tricks for working with the Operator SDK. The Operator SDK 1.0.0 release shipped in mid-August, so it's a great time to have a look at it.

    Note: Started by CoreOS and continued by Red Hat over the past year, the Operator Framework initiative entered incubation with the Cloud Native Computing Foundation in July 2020.

    Operator SDK tips and tricks

    Exploring the Kubernetes Operator SDK

    I took advantage of the summer holidays to explore the new Operator SDK 1.0.0 release. For my experimentation, I developed Operators using Helm, Ansible, and Go, and deployed them on both vanilla Kubernetes and Red Hat OpenShift. These are the three approaches proposed by the Operator SDK, and they cover a range of capabilities, from simple to very sophisticated Operators. Of course, you can use other technologies to develop your Operator as well, like Python or Quarkus. I found good resources to guide me, namely the 'Hello, World' tutorial with Kubernetes Operators, Operator best practices, and Kubernetes Operators best practices for Go. Still, I am not that familiar with Go or Ansible, so I scratched my head a lot. The tips I'm sharing are all things that I wish I had known before I started. I hope that they will also help you.

    Note: All of the code examples and resources we'll use are available in the GitHub repository for this article.

    Tip 1: Handling default CRD values

    Every Kubernetes Operator comes with its own custom resource definition (CRD), which is the grammar used to describe high-level resource specifications in a Kubernetes cluster. From a first-time user perspective, a simpler CRD is better; however, experienced users will appreciate the advanced tweaking options. Handling default values for all of your custom resource instances is crucial for keeping things simple and configurable, but each tool does it a little differently.

    As an example, let's say that we want to deploy an application made of two components: a web application and a database. First-time users would deploy it using a simple custom resource like the one below:

    apiVersion: redhat.com/v1beta1
    kind: FruitsCatalog
    metadata:
      name: fruitscatalog-sample
    spec:
      appName: my-fruits-catalog
    

    We will also need advanced options for the number of replicas, persistent storage, ingress, and so on.
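    To illustrate, a more advanced user could override just the options they care about in the same custom resource. The attribute names below mirror the default values files shown in the following sections; this particular combination is a hypothetical example:

```yaml
apiVersion: redhat.com/v1beta1
kind: FruitsCatalog
metadata:
  name: fruitscatalog-sample
spec:
  appName: my-fruits-catalog
  webapp:
    replicaCount: 2
  mongodb:
    persistent: true
    volumeSize: 5Gi
```

    Everything not specified here should fall back to the defaults, which is exactly the behavior each approach below has to provide.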

    Custom resource default values with Helm

    A Helm chart defines a values.yaml file for handling custom resource default values. Using the Helm-based Operator SDK, it's pretty easy to add the defaults for our example to this file:

    # Default values for fruitscatalog.
    appName: fruits-catalog-helm
    webapp:
      replicaCount: 1
      image: quay.io/lbroudoux/fruits-catalog:latest
      [...]
    mongodb:
      install: true
      image: centos/mongodb-34-centos7:latest
      persistent: true
      volumeSize: 2Gi
      [...]
    

    Custom resource default values with Ansible

    The Ansible-based Operator SDK does not provide an out-of-the-box way to handle custom resource default values. The trick I've found requires three modifications to your Operator project.

    First, create a roles/fruitscatalog/defaults/main.yml file for handling default values. Be aware of Ansible's use of snake case, which differs from the camel case normally used for custom resource attributes. For example, Ansible transforms replicaCount into replica_count, so you have to use this form in your Operator:

    ---
    # defaults file for fruitscatalog
    name: fruits-catalog-ansible
    webapp:
      replica_count: 1
      image: quay.io/lbroudoux/fruits-catalog:latest
      [...]
    mongodb:
      install: true
      image: centos/mongodb-34-centos7:latest
      persistent: true
      volume_size: 2Gi
      [...]
    

    Once this file is present in your role, the Operator SDK will use it to initialize the missing parts of the user-supplied custom resource. The limit of this approach is that the SDK performs only a first-level merge. If a user only puts webapp.replicaCount into the custom resource, the other default child attributes will not be merged into the webapp variable. Basically, you have to handle the merge process explicitly, using Ansible's combine() filter.

    So, at the very beginning of the role, we need to ensure that we have a complete resource based on what the user provided, merged with the defaults:

    - name: Load default values from defaults/main.yml
      include_vars:
        file: ../defaults/main.yml
        name: default_cr
    
    - name: Complete Custom Resource spec with default values
      set_fact:
        webapp_full: "{{ default_cr.webapp|combine(webapp, recursive=True) }}"
        mongodb_full: "{{ default_cr.mongodb|combine(mongodb, recursive=True) }}"
    

    The trick here is that the webapp and mongodb variables initialized by the SDK cannot be overwritten; you have to create new variables like webapp_full and base your Ansible templates on them instead. What's nice is that this approach is fully functional when running your Kubernetes Operator locally using make run or ansible-operator run.
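    To make this concrete, the templates referenced later in the role then read the merged webapp_full variable rather than the raw webapp variable injected by the SDK. The deployment template below is a hypothetical sketch, not part of the SDK scaffolding:

```yaml
# roles/fruitscatalog/templates/webapp-deployment.yml (excerpt)
# Reference the merged variable, not the raw 'webapp' variable.
spec:
  replicas: {{ webapp_full.replica_count }}
  template:
    spec:
      containers:
        - name: webapp
          image: {{ webapp_full.image }}
```

    With this in place, a custom resource that sets only webapp.replicaCount still renders a complete Deployment, because webapp_full carries the defaults for every other attribute.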

    Custom resource default values with Go

    The Go-based Operator SDK requires yet another approach. You can define an initialization method in the controller (as described in Kubernetes Operators best practices), but I believe there's a better way of handling it.

    Using the Kubernetes apiextensions.k8s.io/v1 API, it is now possible to define default values directly within the CRD. In Helm and Ansible, you can complete the OpenAPI part of the CRD manually. For a Go-based Operator, you can use the +kubebuilder comments in your Go code:

    // WebAppSpec defines the desired state of WebApp
    // +k8s:openapi-gen=true
    type WebAppSpec struct {
        // +kubebuilder:default:=1
        ReplicaCount int32 `json:"replicaCount,omitempty"`
        // +kubebuilder:default:="quay.io/lbroudoux/fruits-catalog:latest"
        Image   string      `json:"image,omitempty"`
        [...]
    }
    

    To enable this option, you have to tweak the project's Makefile to force the SDK to generate apiextensions.k8s.io/v1 manifests:

    CRD_OPTIONS ?= "crd:trivialVersions=true,crdVersions=v1"
    

    Running the make manifests command in your project generates a full CRD with default values for future custom resource instances:

    ---
    apiVersion: apiextensions.k8s.io/v1
    kind: CustomResourceDefinition
    metadata:
      annotations:
        controller-gen.kubebuilder.io/version: v0.3.0
      creationTimestamp: null
      name: fruitscatalogs.redhat.com
    spec:
      group: redhat.com
      names:
        kind: FruitsCatalog
        listKind: FruitsCatalogList
        plural: fruitscatalogs
        singular: fruitscatalog
      scope: Namespaced
      versions:
        - name: v1beta1
          schema:
            openAPIV3Schema:
            [...]
              [...]
              spec:
                description: FruitsCatalogSpec defines the desired state of FruitsCatalog
                properties:
                  [...]
                  webapp:
                    description: WebAppSpec defines the desired state of WebApp
                    properties:
                      image:
                        default: quay.io/lbroudoux/fruits-catalog:latest
                        type: string
                      replicaCount:
                        format: int32
                        default: 1
                        type: integer
                      [...]
    

    That's pretty neat.

    Tip 2: Preparing your Operator for OpenShift

    One nice thing about the Operator SDK is that it scaffolds a huge part of your project from operator-sdk init or operator-sdk create api. That scaffold is much of what you need to deploy your Operator to OpenShift, but it's not everything. During my experiments, I found one missing piece related to role-based access control (RBAC) permissions. Essentially, the Operator should be granted just enough permissions to do its job without having full access to the cluster.

    When generating Kubernetes resources, an Operator should register the custom resource as the owner of each resource it creates. That makes it easier to watch those resources and implement finalizers. Typically, each created resource includes an ownerReferences field that points back to the custom resource:

    ownerReferences:
      - apiVersion: redhat.com/v1beta1
        blockOwnerDeletion: true
        controller: true
        kind: FruitsCatalog
        name: fruitscatalog-sample
        uid: c5d7e996-013f-40ca-bd19-14ba73728eaf
    

    The default scaffolding works well on vanilla Kubernetes. On OpenShift, however, the Operator needs to be able to set finalizers on the custom resource after it's been created in order to set the ownerReferences block. So you have to add extra permissions for your Operator, as described below.

    Adding RBAC permissions with Helm and Ansible

    Using Helm and Ansible-based Operators, you can configure the RBAC permissions within the config/rbac/role.yaml file. You would typically add something like this:

    - apiGroups:
      - redhat.com
      resources:
      - fruitscatalogs
      - fruitscatalogs/status
      - fruitscatalogs/finalizers  # Missing line that is not added by the SDK
      verbs:
      - create
      - delete
      - get
      - list
      - patch
      - update
      - watch

    Adding RBAC permissions with Go

    Using Go-based Operators, you can use a +kubebuilder:rbac comment to set the RBAC permissions directly into the controller source code. Just add something like this to your Reconcile function comments:

    [...]
    // +kubebuilder:rbac:groups=redhat.com,resources=fruitscatalogs/finalizers,verbs=get;create;update;patch;delete
    
    // Reconcile the state for a FruitsCatalog object and makes changes based on the state read and what is in the FruitsCatalogSpec.
    func (r *FruitsCatalogG1Reconciler) Reconcile(req ctrl.Request) (ctrl.Result, error) {
    	[...]
    }

    Note: These permissions might be added by default in a future release. See Pull Request #3779: Helm Operator: add finalizers permission for created APIs for details and tracking.

    Tip 3: Discovering the cluster you're running on

    Operators are expected to be adaptable, which means that they have to be able to change their actions and the resources they manage depending on the environment. A straightforward illustration is using route capabilities instead of ingress when running on OpenShift. To make that change, a Kubernetes Operator should be able to discover the Kubernetes distribution that it is deployed on and any extensions that have been installed on the cluster. Currently, such advanced discovery can only be done using the Ansible- and Go-based Operators.

    Advanced discovery with Ansible

    In Ansible-based Operators, we use a k8s lookup to request the api_groups present on the cluster. Then, we can detect that we're running on OpenShift and create a Route only when appropriate:

    - name: Get information about the cluster
      set_fact:
        api_groups: "{{ lookup('k8s', cluster_info='api_groups') }}"
    
    [...]
    
    - name: The Webapp Route is present if OpenShift
      when: "'route.openshift.io' in api_groups"
      k8s:
        state: present
        definition: "{{ lookup('template', 'webapp-route.yml') | from_yaml  }}"
    
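    Conversely, you can keep a vanilla-Kubernetes fallback in the same role. This hypothetical task (the webapp-ingress.yml template name is an assumption, mirroring the Route example above) creates an Ingress only when the cluster does not expose the OpenShift route API:

```yaml
- name: The Webapp Ingress is present if not OpenShift
  when: "'route.openshift.io' not in api_groups"
  k8s:
    state: present
    definition: "{{ lookup('template', 'webapp-ingress.yml') | from_yaml }}"
```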

    Advanced discovery with Go

    This type of discovery is a little more complex using Go-based Operators. In this case, we use a specific DiscoveryClient from the discovery package. Once retrieved, you can make a request to retrieve the API groups and detect that you are on OpenShift:

    import (
        [...]
        "k8s.io/client-go/discovery"
    )
    
    // getDiscoveryClient returns a discovery client for the current reconciler
    func getDiscoveryClient(config *rest.Config) (*discovery.DiscoveryClient, error) {
        return discovery.NewDiscoveryClientForConfig(config)
    }
    
    // Reconcile the state for a FruitsCatalog object and makes changes based on the state read and what is in the FruitsCatalogSpec.
    func (r *FruitsCatalogG1Reconciler) Reconcile(req ctrl.Request) (ctrl.Result, error) {
    	[...]
        // The discovery package is used to discover APIs supported by a Kubernetes API server.
        config, err := ctrl.GetConfig()
        if err == nil && config != nil {
            dclient, err := getDiscoveryClient(config)
            if err == nil && dclient != nil {
                apiGroupList, err := dclient.ServerGroups()
                if err != nil {
                    reqLogger.Info("Error while querying ServerGroups, assuming we're on Vanilla Kubernetes")
                } else {
                    for i := 0; i < len(apiGroupList.Groups); i++ {
                        if strings.HasSuffix(apiGroupList.Groups[i].Name, ".openshift.io") {
                            isOpenShift = true
                            reqLogger.Info("We detected being on OpenShift! Wouhou!")
                            break
                        }
                    }
                }
            } else {
                reqLogger.Info("Cannot retrieve a DiscoveryClient, assuming we're on Vanilla Kubernetes")
            }
        }
        [...]
    }
    

    You could use this mechanism to detect other installed Operators, Ingress classes, storage capabilities, and so on.
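    The suffix check in the loop above can be factored into a small, cluster-independent helper, which also makes it easy to probe for other extensions. The function name and the sample group names below are mine, not part of the SDK; in a real controller, you would feed it the names from dclient.ServerGroups():

```go
package main

import (
	"fmt"
	"strings"
)

// hasAPIGroupSuffix reports whether any discovered API group name
// ends with the given suffix (for example ".openshift.io").
func hasAPIGroupSuffix(groups []string, suffix string) bool {
	for _, g := range groups {
		if strings.HasSuffix(g, suffix) {
			return true
		}
	}
	return false
}

func main() {
	// Hypothetical flattened result of a ServerGroups() call.
	groups := []string{"apps", "route.openshift.io", "networking.k8s.io"}
	fmt.Println(hasAPIGroupSuffix(groups, ".openshift.io"))
	fmt.Println(hasAPIGroupSuffix(groups, ".istio.io"))
}
```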

    Tip 4: Using extensions APIs in Go-based Operators

    This tip is specific to Go-based Operators. Because Ansible and Helm treat everything as YAML, you can freely describe any kind of resource you need. Validation only occurs on the cluster API side, through OpenAPI v3 validation or an admission webhook.

    Go is a strongly typed language, which is clearly an advantage when you are dealing with complex Operators and data structures. With Go, you can rely on tools like an integrated development environment (IDE) for code completion and inline documentation, so you can validate the Kubernetes resources you are using before actually submitting them to the cluster. However, when you want to build something with API extensions, you'll have to integrate them as Go dependencies and register them within your own client runtime. I'll show you how to do that.

    Note: While the following discussion might seem obvious to developers familiar with Go and Kubernetes, my background is in Java, and it took me a moment to figure it out.

    Integrating API extensions as Go dependencies

    First, you have to include the new API extension dependencies within your go.mod file at the project root. For this, Go modules use either a Git tag or branch name (in the latter case, it appears to translate the branch name into the latest commit hash). Following my previous example, if I want to use an OpenShift-specific data structure for a Route resource, I have to add the following:

    require (
       [...]
       github.com/openshift/api v3.9.0+incompatible   // v3.9.0 is the last tag. New releases are managed as branches
       // github.com/openshift/client-go release-4.5  // As an example of integrating the OpenShift-specific client lib
    )
    

    The next step is to register one or more packages into the supported runtime schemes. This allows you to use Route Go objects with the standard Kubernetes Go client. To do so, modify the main.go file generated at the project root: add a new import and register the scheme in the init() function:

    import (
        [...]
        routev1 "github.com/openshift/api/route/v1"
    )
    
    func init() {
        utilruntime.Must(clientgoscheme.AddToScheme(scheme))
        utilruntime.Must(redhatv1beta1.AddToScheme(scheme))
        utilruntime.Must(routev1.AddToScheme(scheme))
        // +kubebuilder:scaffold:scheme
    }
    

    Finally, within your Go Reconcile() function or another Operator package, you'll be able to manipulate the Route structure in a strongly typed fashion that helps keep you on track. You can then create this object using the standard client present in your controller:

    return &routev1.Route{
        [...]
        Spec: routev1.RouteSpec{
            To: routev1.RouteTargetReference{
                Name:   spec.AppName + "-webapp",
                Kind:   "Service",
                Weight: &weight,
            },
            Port: &routev1.RoutePort{
                TargetPort: intstr.IntOrString{
                    Type:   intstr.String,
                    StrVal: "http",
                },
            },
            TLS: &routev1.TLSConfig{
                Termination:                   routev1.TLSTerminationEdge,
                InsecureEdgeTerminationPolicy: routev1.InsecureEdgeTerminationPolicyNone,
            },
            WildcardPolicy: routev1.WildcardPolicyNone,
        },
    }

    Tip 5: Adjusting Operator resource consumption

    My final tip is to watch your resources, a sentence with a double meaning.

    In the specific context of Kubernetes Operators, the first meaning is that the Operator controller watches the custom resource and the dependent resources it manages (often called the operand). It is important to configure your Operator to watch dependent resources. While there is some really good documentation on this topic (see the docs for dependent watches, resources watched by the controller, and using predicates for event filtering), there's no need to dive into it right now. What's important to know is that watching more software resources impacts your physical resources, namely CPU and memory.

    That is the second meaning: once your Operator starts growing, which can happen very quickly, you should think carefully about the resources it consumes. The default requests and limits are set to low values that should be adapted to your needs. This is especially true for Helm- and Ansible-based Operators.

    Before you start raising CPU and memory, pay attention to the number of concurrent reconciles that your Operator should manage. Simply put: How many custom resources should your Operator handle simultaneously? By default, the Operator SDK sets this value to the number of cores on the node running the Operator. If you're watching many resources and you have big nodes, this setting can act as a multiplier on the consumed resources. Moreover, if your Kubernetes Operator is scoped to a specific namespace, there is little chance that you'll need 16 concurrent reconciles for one or two custom resources in that namespace.

    Managing concurrent reconciles

    You can use the --max-concurrent-reconciles flag to set the maximum number of concurrent reconciles. The new Operator SDK project layout takes advantage of Kustomize, so you'll have to change the config/default/manager_auth_proxy_patch.yaml file like this:

    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: controller-manager
      namespace: system
    spec:
      template:
        spec:
          containers:
          - name: kube-rbac-proxy
            [...]
          - name: manager
            args:
            - "--metrics-addr=127.0.0.1:8080"
            - "--enable-leader-election"
            - "--leader-election-id=fruits-catalog-operator"
            - "--max-concurrent-reconciles=4"

    After that, you can set up the resource requests and limits in the usual Kubernetes way.
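    For instance, the same Kustomize patch could also set explicit requests and limits on the manager container. The numbers below are purely illustrative starting points, not recommendations; you should size them from your Operator's observed consumption:

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: controller-manager
  namespace: system
spec:
  template:
    spec:
      containers:
      - name: manager
        resources:
          requests:
            cpu: 100m
            memory: 128Mi
          limits:
            cpu: 500m
            memory: 256Mi
```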

    Wrap up

    In this article, I've shared five tips that made my life easier while developing Operators with the newly released Kubernetes Operator SDK 1.0.0. Coming from a strong background in Java, but not Ansible or Go, each of the issues I discussed made me scratch my head for a few hours. The tips might be obvious to experienced developers, but I hope that they will save other developers time.

    What about you? What are your tricks for working with Kubernetes Operators?

    Last updated: January 12, 2024
