Create a Kubernetes Operator in Golang to automatically manage a simple, stateful application

A Kubernetes Operator acts as an automated site reliability engineer for its application, encoding the skills of an expert administrator in software. For example, an Operator can install a database cluster of a declared software version with a designated number of members, and then configure and manage that cluster.

The Operator continues to monitor its application while it runs, and can automatically back up data, recover from failures, and upgrade the application over time. Cluster users employ kubectl and other standard tools to work with Operators and their applications, thereby extending Kubernetes services.

Operators make use of custom resources (CRs) to manage applications and their components. They follow Kubernetes principles, notably the controller pattern (the control loop).
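
To make the control loop idea concrete, here is a minimal sketch in plain Go with no Kubernetes libraries. The state type and the reconcileOnce function are hypothetical names used only for illustration; a real controller observes and changes cluster objects through the Kubernetes API rather than in-memory values.

```go
package main

import "fmt"

// state is a hypothetical, simplified stand-in for an application's state:
// here, just the number of running members.
type state struct {
	Replicas int
}

// reconcileOnce compares the observed state with the desired state and moves
// the observed state one step closer. It reports whether it changed anything.
func reconcileOnce(desired state, observed *state) bool {
	switch {
	case observed.Replicas < desired.Replicas:
		observed.Replicas++ // start one missing member
		return true
	case observed.Replicas > desired.Replicas:
		observed.Replicas-- // stop one surplus member
		return true
	default:
		return false // already converged, nothing to do
	}
}

func main() {
	desired := state{Replicas: 3}
	observed := &state{Replicas: 0}

	// Keep looping until a pass makes no change.
	for reconcileOnce(desired, observed) {
		fmt.Println("observed replicas:", observed.Replicas)
	}
	fmt.Println("converged at", observed.Replicas, "replicas")
}
```

The loop never assumes how the observed state got out of step. It only compares and corrects, which is why the same logic covers initial installation, scaling, and recovery.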

In this article, we demonstrate how to deploy a stateful application using a Kubernetes Operator. In this case, the Operator is built with the operator-sdk project and deploys WordPress with a MySQL database using a custom resource. If you need to do this without an Operator, follow the link in the conclusion.

Note: New to Operators and Operator patterns? Check out the Kubernetes documentation and the Kubernetes Operators e-Book to learn more.

Prerequisites

To create a Kubernetes Operator and use this demo, first install the following:

Build and initialize the Operator

To build the Operator, start in the $GOPATH/src directory. Then, to initialize the Operator, run the command:

operator-sdk new wordpress-operator --type go --repo github.com/<github-user-name>/<github-repo-name>

The output is represented as follows:

INFO[0000] Creating new Go operator 'wordpress-operator'.
INFO[0000] Created go.mod
INFO[0000] Created tools.go
INFO[0000] Created cmd/manager/main.go
INFO[0000] Created build/Dockerfile
INFO[0000] Created build/bin/entrypoint
INFO[0000] Created build/bin/user_setup
INFO[0000] Created deploy/service_account.yaml
INFO[0000] Created deploy/role.yaml
INFO[0000] Created deploy/role_binding.yaml
INFO[0000] Created deploy/operator.yaml
INFO[0000] Created pkg/apis/apis.go
INFO[0000] Created pkg/controller/controller.go
INFO[0000] Created version/version.go
INFO[0000] Created .gitignore
INFO[0000] Validating project

After the Operator is initialized, run the cd wordpress-operator command.

Create custom resource definitions

Custom resource definitions (CRDs) define our resource for interacting with the Kubernetes API. This is similar to how existing resources are defined, such as pods, deployments, services, persistent volume claims (PVCs), and so on. In this case, we specify the api-version in the format <group>/<version> and the kind of the custom resource to create.
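
As a small illustration of the <group>/<version> format, the following sketch splits an api-version string into its two parts. The splitAPIVersion function is a hypothetical helper written for this article, not part of operator-sdk or the Kubernetes libraries.

```go
package main

import (
	"fmt"
	"strings"
)

// splitAPIVersion splits an api-version string of the form <group>/<version>
// into its two parts. A value without a slash (for example "v1") belongs to
// the core group, which is represented here by an empty group name.
func splitAPIVersion(apiVersion string) (group, version string) {
	if i := strings.LastIndex(apiVersion, "/"); i >= 0 {
		return apiVersion[:i], apiVersion[i+1:]
	}
	return "", apiVersion
}

func main() {
	group, version := splitAPIVersion("example.com/v1")
	fmt.Printf("group=%q version=%q\n", group, version)
}
```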

To create a CRD for a WordPress Operator, run the command:

operator-sdk add api --kind Wordpress --api-version example.com/v1

Now examine the file:

deploy/crds/example.com_v1_wordpress_cr.yaml

This is an example CR of the generated type. It is prepopulated with the appropriate api-version, kind, and resource name. Make sure that the spec section is filled in with values relevant to the CRD we created.

Examine the following file:

deploy/crds/example.com_wordpresses_crd.yaml

This file is the beginning of a CRD manifest. The SDK generates many of the fields related to the resource type’s name.

In the pkg/apis/example/v1/*_types.go file, there are two struct objects to address: the spec object and the status object.

Examine the following:

// WordpressSpec defines the desired state of WordPress
type WordpressSpec struct {
    // SQLRootPassword is the MySQL root password that the user provides through the CR.
    SQLRootPassword string `json:"sqlrootpassword"`
}

Update the CRD with these changes by running the operator-sdk generate crds and operator-sdk generate k8s commands.

This adds the spec specified in *_types.go to *crd.yaml.
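
The json tag on the field determines the key that a user writes under spec in the CR. The sketch below shows that mapping with only the standard encoding/json package. It uses a local copy of the spec struct, a hypothetical decodeSpec helper, and a placeholder password value, none of which come from the generated project.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// WordpressSpec is a local copy of the spec struct from *_types.go.
type WordpressSpec struct {
	SQLRootPassword string `json:"sqlrootpassword"`
}

// decodeSpec parses the JSON form of a CR's spec section.
func decodeSpec(raw []byte) (WordpressSpec, error) {
	var spec WordpressSpec
	err := json.Unmarshal(raw, &spec)
	return spec, err
}

func main() {
	// "changeme" is a placeholder value used only for this illustration.
	spec, err := decodeSpec([]byte(`{"sqlrootpassword": "changeme"}`))
	if err != nil {
		fmt.Println("decode failed:", err)
		return
	}
	fmt.Println("root password from CR:", spec.SQLRootPassword)
}
```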

Set the controller

Set a controller inside the Operator pod to watch for changes to the CRs and react accordingly. To start, generate the controller skeleton code using operator-sdk, for example:

operator-sdk add controller --api-version=example.com/v1 --kind=Wordpress

Next, edit the file to include the controller logic:

pkg/controller/wordpress/wordpress_controller.go

The main reconcile function is called each time there are changes to the CR.

For more clarity, let’s review the code, line by line. Initially, the controller needs to add watches for the resources, so that Kubernetes can tell the controller about changes needed for the resources. The initial watch is created for the primary resource, WordPress (in our case), that is monitored by the controller.

For example:

// Watch for changes to primary resource WordPress
err = c.Watch(&source.Kind{Type: &examplev1.Wordpress{}}, &handler.EnqueueRequestForObject{})
if err != nil {
    return err
}

Next, we can create subsequent watches for child resources, such as pod, deployment, service, PVC, and so on. The Operator uses this watch to support the primary resource. Create the watch for a child resource by specifying the value of OwnerType as a primary resource.

For example:

// Watch Deployments owned by the primary resource
err = c.Watch(&source.Kind{Type: &appsv1.Deployment{}}, &handler.EnqueueRequestForOwner{
    IsController: true,
    OwnerType:    &examplev1.Wordpress{},
})
if err != nil {
    return err
}

// Watch Services owned by the primary resource
err = c.Watch(&source.Kind{Type: &corev1.Service{}}, &handler.EnqueueRequestForOwner{
    IsController: true,
    OwnerType:    &examplev1.Wordpress{},
})
if err != nil {
    return err
}

// Watch PersistentVolumeClaims owned by the primary resource
err = c.Watch(&source.Kind{Type: &corev1.PersistentVolumeClaim{}}, &handler.EnqueueRequestForOwner{
    IsController: true,
    OwnerType:    &examplev1.Wordpress{},
})
if err != nil {
    return err
}
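
To show roughly what a watch on a child resource does with OwnerType, here is a simplified sketch in plain Go. The ownerRef, object, and request types and the requestForOwner function are hypothetical stand-ins, not the controller-runtime implementation, which also compares the owner's API group.

```go
package main

import "fmt"

// ownerRef, object, and request are simplified, hypothetical stand-ins for
// metav1.OwnerReference, a Kubernetes object, and reconcile.Request.
type ownerRef struct {
	Kind       string
	Name       string
	Controller bool
}

type object struct {
	Namespace string
	Name      string
	Owners    []ownerRef
}

type request struct {
	Namespace string
	Name      string
}

// requestForOwner returns the reconcile request for the controlling owner of
// the given kind, if the child has one.
func requestForOwner(child object, ownerKind string) (request, bool) {
	for _, ref := range child.Owners {
		if ref.Controller && ref.Kind == ownerKind {
			// The owner is looked up in the child's namespace.
			return request{Namespace: child.Namespace, Name: ref.Name}, true
		}
	}
	return request{}, false
}

func main() {
	deployment := object{
		Namespace: "default",
		Name:      "wordpress-mysql",
		Owners:    []ownerRef{{Kind: "Wordpress", Name: "example-wordpress", Controller: true}},
	}
	if req, ok := requestForOwner(deployment, "Wordpress"); ok {
		fmt.Printf("enqueue %s/%s\n", req.Namespace, req.Name)
	}
}
```

The point of the mapping is that a change to a child, such as a deleted Deployment, triggers a reconcile of the parent rather than of the child itself.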

Now turn to the reconcile function, also called the reconcile loop. This is the main part, where the actual logic resides. This function returns a reconcile.Result{}, which indicates whether or not the reconcile loop needs to execute another pass.

The possible outcomes based on the reconcile.Result{} return value are:

Outcome | Description
return reconcile.Result{}, nil | The reconcile process finished with no errors, so another iteration through the reconcile loop is not needed.
return reconcile.Result{}, err | The reconcile failed due to an error and Kubernetes needs to re-queue it and run it again.
return reconcile.Result{Requeue: true}, nil | The reconcile did not encounter an error; however, Kubernetes needs to re-queue it and run another iteration.
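
The three outcomes can be sketched as a single decision in plain Go. The Result type below is a simplified, local stand-in that keeps only the Requeue field, and needsAnotherPass and errExample are hypothetical names used for illustration; the real library also applies back-off between retries, which this sketch ignores.

```go
package main

import (
	"errors"
	"fmt"
)

// Result is a simplified, local stand-in for reconcile.Result that keeps only
// the Requeue field.
type Result struct {
	Requeue bool
}

// errExample is a placeholder error used to demonstrate the failure case.
var errExample = errors.New("example failure")

// needsAnotherPass reports whether the request goes back on the work queue
// after a reconcile pass returned res and err.
func needsAnotherPass(res Result, err error) bool {
	if err != nil {
		return true // failed passes are retried
	}
	return res.Requeue // successful passes are retried only on request
}

func main() {
	fmt.Println(needsAnotherPass(Result{}, nil))
	fmt.Println(needsAnotherPass(Result{}, errExample))
	fmt.Println(needsAnotherPass(Result{Requeue: true}, nil))
}
```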

Take the example:

return reconcile.Result{RequeueAfter: time.Second*5}, nil

This example is a variation of the table’s last entry. The controller waits for the specified amount of time (five seconds in this example) before rerunning the request. This approach is useful when we are serially running multiple steps, although it might take longer to complete.

If a backend service needs a running database prior to starting, we can use this example to re-queue the reconcile function with a delay to give the database time to start. Once the database is running, the Operator does not re-queue the reconcile request, and the rest of the steps continue.
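
The database scenario might look like the following sketch. It is plain Go with a local stand-in for the RequeueAfter field of reconcile.Result; the reconcileBackend function and its databaseReady parameter are hypothetical, and a real Operator would determine readiness by querying the cluster.

```go
package main

import (
	"fmt"
	"time"
)

// Result is a local stand-in for the RequeueAfter field of reconcile.Result.
type Result struct {
	RequeueAfter time.Duration
}

// reconcileBackend is a hypothetical step that starts the backend only once
// the database reports ready; otherwise it asks to be called again later.
func reconcileBackend(databaseReady bool) (Result, string) {
	if !databaseReady {
		return Result{RequeueAfter: 5 * time.Second}, "waiting for database"
	}
	return Result{}, "backend started"
}

func main() {
	// Simulate two passes: the database is not ready on the first one.
	for _, ready := range []bool{false, true} {
		res, msg := reconcileBackend(ready)
		fmt.Printf("%s (requeue after %s)\n", msg, res.RequeueAfter)
	}
}
```

Returning early with a delay keeps each pass short and leaves the waiting to the work queue instead of blocking inside the reconcile function.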

I recommend you review the Kubernetes API documentation, especially the core/v1 and apps/v1 directories, for more details.

Next, let’s understand the code for the reconcile function, step by step. Initially, the reconcile function retrieves the primary resource. For example:

// Fetch the WordPress instance
wordpress := &examplev1.Wordpress{}
err := r.client.Get(context.TODO(), request.NamespacedName, wordpress)
if err != nil {
    if errors.IsNotFound(err) {
        // Request object not found, could have been deleted after reconcile request.
        // Owned objects are automatically garbage collected. For additional cleanup logic use finalizers.
        // Return and don't requeue
        return reconcile.Result{}, nil
    }
    // Error reading the object - requeue the request.
    return reconcile.Result{}, err
}

// Ensure that the child resources are running (see the next snippet for an example).

// If everything goes fine, do not requeue.
return reconcile.Result{}, nil

The function checks whether the WordPress resource already exists. The variable r is the reconciler object on which the reconcile function is called. client is the client for the Kubernetes API.

Next, we create a child resource. Similar to the primary resource, the reconciler checks if the child resource is present by calling Get() for the Kubernetes client. If not, it creates the child resource in the target namespace.

For example:

// Note: the return values below are (*reconcile.Result, error), as in a helper called from the reconcile function.
found := &appsv1.Deployment{}
err := r.client.Get(context.TODO(), types.NamespacedName{
    Name:      dep.Name,
    Namespace: instance.Namespace,
}, found)
if err != nil && errors.IsNotFound(err) {
    // Create the deployment
    log.Info("Creating a new Deployment", "Deployment.Namespace", dep.Namespace, "Deployment.Name", dep.Name)
    err = r.client.Create(context.TODO(), dep) // (1)
    if err != nil {
        // Deployment failed
        log.Error(err, "Failed to create new Deployment", "Deployment.Namespace", dep.Namespace, "Deployment.Name", dep.Name)
        return &reconcile.Result{}, err
    }
    // Deployment was created successfully
    return nil, nil
} else if err != nil {
    // Error that isn't due to the deployment not existing
    log.Error(err, "Failed to get Deployment")
    return &reconcile.Result{}, err
}
// Deployment already exists
return nil, nil
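
The get-then-create pattern above can be tried without a cluster. The following sketch replaces the Kubernetes client with a hypothetical in-memory fakeClient, and errNotFound plays the role of the API's NotFound error; these names are assumptions made for illustration only.

```go
package main

import (
	"errors"
	"fmt"
)

// errNotFound plays the role of the API's NotFound error in this sketch.
var errNotFound = errors.New("not found")

// fakeClient is a hypothetical in-memory stand-in for the Kubernetes client.
// It stores objects under a "namespace/name" key.
type fakeClient struct {
	objects map[string]string
}

func (c *fakeClient) Get(key string) (string, error) {
	obj, ok := c.objects[key]
	if !ok {
		return "", errNotFound
	}
	return obj, nil
}

func (c *fakeClient) Create(key, obj string) error {
	if _, exists := c.objects[key]; exists {
		return errors.New("already exists")
	}
	c.objects[key] = obj
	return nil
}

// ensure creates the object only when Get reports that it is missing.
// It returns true when it created the object.
func ensure(c *fakeClient, key, obj string) (bool, error) {
	_, err := c.Get(key)
	if errors.Is(err, errNotFound) {
		if err := c.Create(key, obj); err != nil {
			return false, err
		}
		return true, nil
	}
	return false, err
}

func main() {
	c := &fakeClient{objects: map[string]string{}}
	for pass := 1; pass <= 2; pass++ {
		created, err := ensure(c, "default/wordpress-mysql", "deployment")
		fmt.Printf("pass %d: created=%v err=%v\n", pass, created, err)
	}
}
```

Because the second pass finds the object and does nothing, the function is safe to call on every reconcile, which is what makes the loop idempotent.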

(1) Here is the code snippet for the MySQL deployment instance (dep):

labels := map[string]string{
    "app": cr.Name,
}
matchlabels := map[string]string{
    "app":  cr.Name,
    "tier": "mysql",
}

dep := &appsv1.Deployment{
    ObjectMeta: metav1.ObjectMeta{
        Name:      "wordpress-mysql",
        Namespace: cr.Namespace,
        Labels:    labels,
    },
    Spec: appsv1.DeploymentSpec{
        Selector: &metav1.LabelSelector{
            MatchLabels: matchlabels,
        },
        Template: corev1.PodTemplateSpec{
            ObjectMeta: metav1.ObjectMeta{
                Labels: matchlabels,
            },
            Spec: corev1.PodSpec{
                Containers: []corev1.Container{{
                    Image: "mysql:5.6",
                    Name:  "mysql",
                    Env: []corev1.EnvVar{
                        {
                            Name:  "MYSQL_ROOT_PASSWORD",
                            Value: cr.Spec.SQLRootPassword, // (1)
                        },
                    },
                    Ports: []corev1.ContainerPort{{
                        ContainerPort: 3306,
                        Name:          "mysql",
                    }},
                    VolumeMounts: []corev1.VolumeMount{
                        {
                            Name:      "mysql-persistent-storage",
                            MountPath: "/var/lib/mysql",
                        },
                    },
                }},
                Volumes: []corev1.Volume{
                    {
                        Name: "mysql-persistent-storage",
                        VolumeSource: corev1.VolumeSource{
                            PersistentVolumeClaim: &corev1.PersistentVolumeClaimVolumeSource{
                                ClaimName: "mysql-pv-claim",
                            },
                        },
                    },
                },
            },
        },
    },
}

controllerutil.SetControllerReference(cr, dep, r.scheme) // (2)

(1) It’s important to note that the value for MYSQL_ROOT_PASSWORD is taken from cr.Spec.

(2) This is the most critical line in the definition, as it establishes the parent-child relationship between the primary resource, WordPress, and the child, the deployment. We can write similar code for other child resources, such as pod, service, PVC, and so on. I recommend you also review WordPress-Operator for more details.
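
As a rough illustration of what a controller reference records on the child, here is a simplified sketch in plain Go. The ownerRef and object types and the setControllerRef function are hypothetical stand-ins and not the controllerutil implementation. Note that the real controllerutil.SetControllerReference also returns an error, so it is worth checking that return value in the controller code.

```go
package main

import (
	"errors"
	"fmt"
)

// ownerRef and object are simplified, hypothetical stand-ins for
// metav1.OwnerReference and a Kubernetes object.
type ownerRef struct {
	Kind       string
	Name       string
	Controller bool
}

type object struct {
	Kind   string
	Name   string
	Owners []ownerRef
}

// setControllerRef records owner as the controlling owner of child. It fails
// if a different object already controls the child.
func setControllerRef(owner object, child *object) error {
	for _, ref := range child.Owners {
		if !ref.Controller {
			continue
		}
		if ref.Kind == owner.Kind && ref.Name == owner.Name {
			return nil // already set, nothing to do
		}
		return errors.New("child is already controlled by " + ref.Kind + "/" + ref.Name)
	}
	child.Owners = append(child.Owners, ownerRef{Kind: owner.Kind, Name: owner.Name, Controller: true})
	return nil
}

func main() {
	cr := object{Kind: "Wordpress", Name: "example-wordpress"}
	dep := &object{Kind: "Deployment", Name: "wordpress-mysql"}

	if err := setControllerRef(cr, dep); err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("owners of %s: %+v\n", dep.Name, dep.Owners)
}
```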

Run the Operator

Now to run the Operator. First, make sure that the minikube cluster is running. Then create the CRD and run the Operator locally using these commands:

kubectl create -f ./deploy/crds/example.com_wordpresses_crd.yaml
operator-sdk run --local

In another terminal, create the custom resource using this command:

kubectl apply -f ./deploy/crds/example.com_v1_wordpress_cr.yaml

Once complete, the following logs display:

INFO[0000] Running the operator locally in namespace default. 
{"level":"info","ts":1598973876.2819793,"logger":"cmd","msg":"Operator Version: 0.0.1"}
{"level":"info","ts":1598973876.2820053,"logger":"cmd","msg":"Go Version: go1.13.10"}
{"level":"info","ts":1598973876.282011,"logger":"cmd","msg":"Go OS/Arch: linux/amd64"}
{"level":"info","ts":1598973876.2820172,"logger":"cmd","msg":"Version of operator-sdk: v0.15.2"}
{"level":"info","ts":1598973876.285575,"logger":"leader","msg":"Trying to become the leader."}
{"level":"info","ts":1598973876.285611,"logger":"leader","msg":"Skipping leader election; not running in a cluster."}
{"level":"info","ts":1598973876.5921307,"logger":"controller-runtime.metrics","msg":"metrics server is starting to listen","addr":"0.0.0.0:8383"}
{"level":"info","ts":1598973876.596543,"logger":"cmd","msg":"Registering Components."}
{"level":"info","ts":1598973876.5967476,"logger":"cmd","msg":"Skipping CR metrics server creation; not running in a cluster."}
{"level":"info","ts":1598973876.5967603,"logger":"cmd","msg":"Starting the Cmd."}
{"level":"info","ts":1598973876.5973437,"logger":"controller-runtime.controller","msg":"Starting EventSource","controller":"wordpress-controller","source":"kind source: /, Kind="}
{"level":"info","ts":1598973876.5975914,"logger":"controller-runtime.controller","msg":"Starting EventSource","controller":"wordpress-controller","source":"kind source: /, Kind="}
{"level":"info","ts":1598973876.5977812,"logger":"controller-runtime.controller","msg":"Starting EventSource","controller":"wordpress-controller","source":"kind source: /, Kind="}
{"level":"info","ts":1598973876.5979419,"logger":"controller-runtime.controller","msg":"Starting EventSource","controller":"wordpress-controller","source":"kind source: /, Kind="}
{"level":"info","ts":1598973876.5980544,"logger":"controller-runtime.controller","msg":"Starting Controller","controller":"wordpress-controller"}
{"level":"info","ts":1598973876.598183,"logger":"controller-runtime.manager","msg":"starting metrics server","path":"/metrics"}
{"level":"info","ts":1598973876.6982796,"logger":"controller-runtime.controller","msg":"Starting workers","controller":"wordpress-controller","worker count":1}
{"level":"info","ts":1598973876.6983802,"logger":"controller_wordpress","msg":"Reconciling WordPress","Request.Namespace":"default","Request.Name":"example-wordpress"}
{"level":"info","ts":1598973876.6984997,"logger":"controller_wordpress","msg":"Creating a new PVC","PVC.Namespace":"default","PVC.Name":"wp-pv-claim"}
{"level":"info","ts":1598973876.7138047,"logger":"controller_wordpress","msg":"Creating a new Deployment","Deployment.Namespace":"default","Deployment.Name":"wordpress"}
{"level":"info","ts":1598973876.736821,"logger":"controller_wordpress","msg":"Creating a new Service","Service.Namespace":"default","Service.Name":"wordpress"}
{"level":"info","ts":1598973876.8298655,"logger":"controller_wordpress","msg":"Reconciling WordPress","Request.Namespace":"default","Request.Name":"example-wordpress"}
{"level":"info","ts":1598973876.8301716,"logger":"controller_wordpress","msg":"Creating a new Service","Service.Namespace":"default","Service.Name":"wordpress"}

You can also list the pods, services, deployments, and PVCs that the Operator created, as follows:

[pjiandan@pjiandan crds]$ kubectl get po
NAME                               READY   STATUS    RESTARTS   AGE
wordpress-6d5b4988ff-dcxfj         1/1     Running   0          16h
wordpress-mysql-59d5d89ff8-qj92r   1/1     Running   0          17h
[pjiandan@pjiandan crds]$ kubectl get svc
NAME              TYPE        CLUSTER-IP      EXTERNAL-IP   PORT(S)        AGE
kubernetes        ClusterIP   10.96.0.1       <none>        443/TCP        19h
wordpress         NodePort    10.100.123.86   <none>        80:31881/TCP   16h
wordpress-mysql   ClusterIP   None            <none>        3306/TCP       17h
[pjiandan@pjiandan crds]$ kubectl get deploy
NAME              READY   UP-TO-DATE   AVAILABLE   AGE
wordpress         1/1     1            1           16h
wordpress-mysql   1/1     1            1           17h
[pjiandan@pjiandan crds]$ kubectl get pvc
NAME             STATUS   VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS   AGE
mysql-pv-claim   Bound    pvc-9ee52dce-b7b7-433d-8596-22392033e55e   10Gi       RWO            standard       17h
wp-pv-claim      Bound    pvc-8674f3fa-acb3-4cd7-9283-5ecec8305945   10Gi       RWO            standard       16h

Next, run the following command to return the URL for the WordPress service:

minikube service wordpress --url

An example of the returned URL follows:

http://192.168.99.101:31881

Last, copy the URL and load the page in the browser to view the site, as shown in Figure 1.

Figure 1. Copy the IP address and load the WordPress page to the browser

Conclusion

This completes my demo and presentation! In this article, I demonstrated how to use a Kubernetes Operator to deploy a stateful application. This Operator is built with the operator-sdk project and deploys WordPress with MySQL using a custom resource.

Again, if you need to deploy a stateful application without an Operator, see: Example: Deploying WordPress and MySQL with Persistent Volumes.
