Featured image for Kubernetes topics.

This article describes how to write, configure, and install a simple Kubernetes validating admission webhook. The webhook intercepts and validates PrometheusRule object creation requests to prevent users from creating rules with invalid fields.

A key benefit of this approach is that your clusters will only contain prevalidated user-defined rules, resulting in uncluttered configuration across environments. Additionally, imagine there is an external alerting system that leverages fields in these customer-provided rules to make alerting decisions. It is important to ensure the rules are properly formatted, so the alerts are forwarded to the appropriate teams with the correct information.

The example here is quite simple, but it can serve as a starting point to cleaner Prometheus installations with minimal errors.

Overview of Kubernetes admission control

When operating clusters on Red Hat OpenShift or Kubernetes, we often perform checks in order to mutate, validate, or convert REST API requests for object creation.

Such operations occur at admission time—that is, right after the request is authorized, but before the object is registered in etcd. This process is important in order to implement control decisions in the kube-apiserver admission chain.

A common example used in existing documentation is the LimitRanger admission controller. This is a great example because the end user immediately understands that their configuration must be changed to schedule new workloads. Once pods are denied admission for a specific reason, there is no further need to progress into the pod scheduling phase.

For our example, we will focus on ValidatingAdmissionWebhook admission controllers and the steps required to write and deploy one using the Go-based controller-runtime webhook package.

An example validating admission webhook

Before diving into the code, it might be useful to explain further and visualize our example. We are going to create a validating admission webhook that will be invoked each time a new PrometheusRule is created.

To configure the API server to forward requests to our validation code, it is necessary to create a ValidatingWebhookConfiguration with a specified service to forward requests to, as defined in webhooks.clientConfig (see Figure 1). This service will be called every time a PrometheusRule is created or updated, as seen in the rules section. Also noteworthy is the namespaceSelector, which only considers AdmissionReview objects for namespaces with the prom-webhook: enabled label.

The REST handler call flow: validatingwebhook --> service --> pod { /validate-promrule }.
Figure 1: The REST handler call flow.

Validating webhooks are executed sequentially and are independent of each other, even when acting on the same resource.

Because the test cluster has Red Hat OpenShift 4.6 installed, it is important to mention that a webhook for PrometheusRules already exists by default: validatingwebhookconfiguration/prometheusrules.openshift.io, which is part of the cluster monitoring stack. This is useful to know so that we don't choose the same REST endpoint path for our handler function. We also won't be allowed to name our webhook resource the same way.

Bootstrap with the Operator SDK

We will use the Operator SDK to initialize our project with a base structure that we can use to start coding.

To start, we can use the operator-sdk init command, passing in an example domain and a repository as parameters. The goal is to keep things as simple as possible since we won't be using much of the generated code, even though the generated manifests and scaffolding are very useful to have. A go.mod file and a go.sum file are also generated for us in the process:

operator-sdk init --domain=example.com --repo=github.com/example/promrule-webhook
Writing kustomize manifests for you to edit...
Writing scaffold for you to edit...
Get controller runtime:
$ go get sigs.k8s.io/controller-runtime@v0.8.3
go: downloading k8s.io/apimachinery v0.20.2
go: downloading k8s.io/client-go v0.20.2
go: downloading k8s.io/api v0.20.2
go: downloading k8s.io/apiextensions-apiserver v0.20.1
go: downloading k8s.io/component-base v0.20.2
Update dependencies:
$ go mod tidy

Given that we will be implementing a small REST handler, we don't need to generate any of the additional boilerplate code that is created by the operator-sdk create api and create webhook commands. What matters is that we now have a base starting point to implement our logic.

Set up a controller manager

The following code snippet contains the basics for setting up a controller manager, which is then used to instantiate a webhook server:

package main

import (
	"os"

	_ "k8s.io/client-go/plugin/pkg/client/auth/gcp"
	"sigs.k8s.io/controller-runtime/pkg/client/config"
	"sigs.k8s.io/controller-runtime/pkg/log"
	"sigs.k8s.io/controller-runtime/pkg/log/zap"
	"sigs.k8s.io/controller-runtime/pkg/manager"
	"sigs.k8s.io/controller-runtime/pkg/manager/signals"
	"sigs.k8s.io/controller-runtime/pkg/webhook"

	webhookadmission "github.com/rflorenc/prometheusrule-validating-webhook/admission"
)

func init() {
	log.SetLogger(zap.New())
}

func main() {
	setupLog := log.Log.WithName("entrypoint")

	// setup a manager
	setupLog.Info("setting up manager")
	mgr, err := manager.New(config.GetConfigOrDie(), manager.Options{})
	if err != nil {
		setupLog.Error(err, "unable to setup controller manager")
		os.Exit(1)
	}

	// +kubebuilder:scaffold:builder

	setupLog.Info("setting up webhook server")
	hookServer := mgr.GetWebhookServer()

	setupLog.Info("registering PrometheusRule validating webhook endpoint")	
	hookServer.Register("/validate-promerule", &webhook.Admission{Handler: &webhookadmission.PrometheusRuleValidator{Client: mgr.GetClient()}})

	setupLog.Info("starting manager")
	if err := mgr.Start(signals.SetupSignalHandler()); err != nil {
		setupLog.Error(err, "unable to run manager")
		os.Exit(1)
	}
}

Once the hookServer is created, we just have to register the previously configured REST endpoint. We do this by passing it a Handler{} and a client, which the manager provides. The remaining code was generated from the initial bootstrap via the Operator SDK and starts the manager with the default configuration, logging, and signal handling.

A Prometheus rule validator

The PrometheusRuleValidator is a helper struct that receives a client to talk with the kube API and (more importantly) defines a pointer to an admission decoder:

package admission

import (
	"context"
	"fmt"
	"net/http"

	monitoringv1 "github.com/prometheus-operator/prometheus-operator/pkg/apis/monitoring/v1"

	"sigs.k8s.io/controller-runtime/pkg/client"
	"sigs.k8s.io/controller-runtime/pkg/webhook/admission"
)

// +kubebuilder:webhook:path=/validate-v1-prometheusrule,mutating=false,failurePolicy=fail,groups="",resources=pods,verbs=create;update,versions=v1,name=prometheusrule-validating-webhook.example.com

// PrometheusRuleValidator validates PrometheusRules
type PrometheusRuleValidator struct {
	Client  client.Client
	decoder *admission.Decoder
}

// PrometheusRuleValidator admits a PrometheusRule if a specific set of Rule labels exist
func (v *PrometheusRuleValidator) Handle(ctx context.Context, req admission.Request) admission.Response {
	prometheusRule := &monitoringv1.PrometheusRule{}

	err := v.decoder.Decode(req, prometheusRule)
	if err != nil {
		return admission.Errored(http.StatusBadRequest, err)
	}

	for _, group := range prometheusRule.Spec.Groups {
		for _, rule := range group.Rules {
			_, found_severity := rule.Labels["severity"]
			_, found_example_response_code := rule.Labels["example_response_code"]
			_, found_example_alerting_email := rule.Labels["example_alerting_email"]

			if !found_severity || !found_example_response_code || !found_example_alerting_email {
				return admission.Denied(fmt.Sprintf("Missing one or more of minimum required labels. severity: %v, example_response_code: %v, example_alerting_email: %v", found_severity, found_example_response_code, found_example_alerting_email))
			}
		}
	}
	return admission.Allowed("Rule admitted by PrometheusRule validating webhook.")
}

func (v *PrometheusRuleValidator) InjectDecoder(d *admission.Decoder) error {
	v.decoder = d
	return nil
}

As the name suggests, the InjectDecoder() function injects an admission decoder into the PrometheusRuleValidator. But why do we need decoding in the first place?

When a target object is received for admission, it is wrapped or referenced in an AdmissionReview object, as shown in the following code. Before this object can be accessed, it has to be decoded so its fields can be accessed for the operations we allow:

{
    "kind": "AdmissionReview",
    "apiVersion": "admission.k8s.io/v1beta1",
    "request": {
        "uid":"800ab934-1343-01e8-b8cd-45010b400002",
        "resource": {
            "version":"v1",
            "resource":"prometheusrules"
        },
        "name": "foo",
        "namespace": "test",
        "operation": "CREATE",
        "object": {
            "apiVersion": "v1",
            "kind": "PrometheusRule",
            "metadata": {
                "annotations" : {
                    "abc": "[]"
                },
                "labels": {
                    "def": "true"
                },
                "name": "foo",
                "namespace": "test"
            },
            "spec": {
                ...
                ...
            }
        }
    }
}

So, to decode a PrometheusRule from the "parent" object, we first instantiate it as a raw object and then pass it on to the Decode() function together with the AdmissionReview in the request:

err := v.decoder.Decode(req, prometheusRule)
if err != nil {
    return admission.Errored(http.StatusBadRequest, err)
}

Once this JSON body decoding step is done, we can access the PrometheusRule fields directly. We then allow or deny object admission based on simple validations. If it is allowed, the object will be persisted to etcd.

Validating webhook authorization

Because a ValidatingWebhook is a construct that sits close to the Kubernetes API server, certificate-based authentication and authorization are mandatory.

In Red Hat OpenShift, we can annotate a ValidatingWebhookConfiguration object with service.beta.openshift.io/inject-cabundle=true to have the webhook's clientConfig.caBundle field populated with the service CA bundle. This allows the Kubernetes API server to validate the service CA certificate used to secure the targeted endpoint.

Note in the example ValidatingWebhookConfiguration, we specify a dummy clientConfig.caBundle. The value doesn't matter here because it will be injected based on the annotation.

For the container running our code to access the service CA certificate, it has to mount a ConfigMap object annotated with service.beta.openshift.io/inject-cabundle=true. Once annotated, the cluster automatically injects the service CA certificate into the service-ca.crt key on the ConfigMap:

volumeMounts:
# default path if manager.Options.CertDir is not overriden
# /tmp/k8s-webhook-server/serving-certs/tls.{key,crt} must exist.
  - mountPath: /tmp/k8s-webhook-server/serving-certs
    name: serving-cert

You can specify an alternate path and mount point by altering the CertDir controller manager Options struct in main.go.

When writing tests locally, for example, we need to be able to either port-forward to our pod or authenticate to the API with a valid token and a valid TLS key and certificate combination before POSTing our payload with the AdmissionReview and reference the JSON-encoded object.

Conclusion

Validating, mutating, and converting webhooks can be of great use in organizations looking for Kubernetes policy enforcement and object validation. On the other hand, having too many registered webhooks can become hard to manage. In edge cases, they can potentially cause scalability issues if not coded correctly, because they will be called every time a new target object is created. Therefore, webhooks should be used in moderation.

Before the operator-sdk and controller-runtime webhook packages were available, there were multiple nonstandard ways to implement such a project. Requirements such as coding the web server, REST handling routing, JSON object decoding from AdmissionReview, and certificate handling could be done in all sorts of different ways. This article presented an alternative.

See the GitHub repository associated with this learning project for a full example with a Helm installer.

Comments