I recently assisted a client in deploying Elastic Cloud on Kubernetes (ECK) on Red Hat OpenShift 4.x. They had run into an issue where Elasticsearch would throw an error similar to:
```
Max virtual memory areas vm.max_map_count [65530] likely too low, increase to at least [262144]
```
According to the official documentation, Elasticsearch uses a `mmapfs` directory by default to store its indices. The default operating system limits on mmap counts are likely to be too low, which may result in out-of-memory exceptions. Usually, administrators would just increase the limit by running:
```
sysctl -w vm.max_map_count=262144
```
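To see what a node is currently set to before (or after) making a change, the read-only form of the same command works; this is standard Linux `sysctl`, not anything OpenShift-specific:

```
sysctl vm.max_map_count
```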
However, OpenShift uses Red Hat Enterprise Linux CoreOS (RHCOS) for its worker nodes and, because it is an automatically updating, minimal operating system for running containerized workloads, you shouldn't manually log on to worker nodes and make changes. That approach doesn't scale and leaves worker nodes in a modified, unmanaged state. Instead, OpenShift provides an elegant and scalable way to achieve the same result via its Node Tuning Operator.
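If you want to confirm the Node Tuning Operator is present and running before relying on it (it ships with OpenShift 4.x by default), a quick check like the following should list its operator and tuned daemon pods:

```
oc get pods -n openshift-cluster-node-tuning-operator
```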
The default tuned configuration contains a profile for Elasticsearch. The tuned operator on a given node looks for a pod running on that node with the `tuned.openshift.io/elasticsearch` label set (the `match` section). If found, it applies the `sysctl` settings (the `data` section).
You can view the default configuration by logging into your OpenShift cluster and running:
```
bastion $ oc get Tuned/default -o yaml -n openshift-cluster-node-tuning-operator
apiVersion: tuned.openshift.io/v1alpha1
kind: Tuned
metadata:
  name: default
  namespace: openshift-cluster-node-tuning-operator
spec:
  profile:
  ...
  ...
  - name: "openshift-node-es"
    data: |
      [main]
      summary=Optimize systems running ES on OpenShift nodes
      include=openshift-node
      [sysctl]
      vm.max_map_count=262144
  recommend:
  ...
  ...
  - profile: "openshift-node-es"
    priority: 20
    match:
    - label: "tuned.openshift.io/elasticsearch"
      type: "pod"
```
The trick is to ensure that the Elasticsearch operator tags its pods with the `tuned.openshift.io/elasticsearch` label. Below is an example of how to achieve this.
```
---
apiVersion: elasticsearch.k8s.elastic.co/v1alpha1
kind: Elasticsearch
metadata:
  name: elasticsearch-tst
spec:
  version: "7.2.0"
  setVmMaxMapCount: false
  nodes:
  - config:
      node.master: true
      node.data: true
    nodeCount: 1
    podTemplate:
      metadata:
        labels:
          tuned.openshift.io/elasticsearch: ""
```
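Assuming you save the manifest above as elasticsearch-tst.yaml (the filename here is just an example), you would apply it in the project where you want Elasticsearch to run:

```
oc apply -f elasticsearch-tst.yaml
```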
The tuned operator will read the `tuned.openshift.io/elasticsearch` pod label and apply `vm.max_map_count=262144` on the node running the pod. This is useful because pods can be terminated and rescheduled on different nodes across the cluster; you no longer have to worry about manually managing the sysctl configuration of whichever nodes happen to be running a particular workload.
Thanks to James Ryles for helping solve this problem.
Let me know if you run into any issues.