How to get more value out of your application logs on OpenShift with the EFK stack

Logs are like gold dust. Taken alone they may not be worth much, but put together and worked by a skillful goldsmith they may become very valuable. OpenShift comes with the EFK stack: Elasticsearch, Fluentd, and Kibana. Applications running on OpenShift get their logs automatically aggregated to provide valuable information on their state and health during tests and in production.

The only requirement is that the application sends its logs to the standard output. OpenShift does the rest. Simple enough!

In this blog post I cover a few points that may help you turn your logs from raw material into a more valuable product.

Configuration file externalization

This first step aims to provide flexibility by making it easy to change logging properties and levels when required. Therefore, the log configuration needs to be externalized. Most Java applications use a logging framework like Log4j or Logback, which can be configured through an XML or properties file.

When running an application on OpenShift, you don't want to rebuild a container image just to change the logging configuration. This can easily be avoided by using a configMap, which gets mounted into the container file system along with the log configuration file.

In a Spring Boot application you would just specify the location of the log configuration file in application.properties:

logging.config=file:/opt/configuration/logback.xml

Creating the configMap is as simple as a single command line:

$ oc create configmap logconfig --from-file=./src/main/resources/logback.xml

The deployment configuration then needs to be edited to mount the configMap into the container file system (this can also be done with a single oc command, as shown after the snippet):

[...]
volumeMounts:
- name: config-volume
  mountPath: /opt/configuration
volumes:
- name: config-volume
  configMap:
    name: logconfig
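
If you prefer not to edit the YAML by hand, the same result can be achieved with oc set volume. Here is a minimal sketch, assuming a deployment configuration named spring-boot-app (replace it with your own):

$ oc set volume dc/spring-boot-app --add --name=config-volume \
    --type=configmap --configmap-name=logconfig \
    --mount-path=/opt/configuration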

You may also want your log configuration changes to be applied without having to restart the application or container. Changes made to a configMap in OpenShift are automatically reflected in the container file system; you simply need to tell your application to check for changes. With Logback, this is done by setting scan="true" in logback.xml. The scan period can be set with scanPeriod="30 seconds"; the default is one minute.
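
For example, one way to push an updated logback.xml into the existing configMap, without restarting anything, is to regenerate its definition and replace it (the file path matches the earlier example):

$ oc create configmap logconfig --from-file=./src/main/resources/logback.xml \
    --dry-run -o yaml | oc replace -f -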

Structuring logs

Fluentd forwards logs to Elasticsearch using the index name "project.{project_name}.{project_uuid}.YYYY.MM.DD", as per the documentation. The index gets created automatically, and the log content is enhanced with useful metadata, such as the container, pod, and project names and a collection timestamp.

This is all good and takes a lot of work away from the application team. However, you may also want to structure the log messages you write so that additional fields can be indexed, queried, and used in filters.

For this purpose the logstash-logback-encoder can be used with Logback. It enables you to create logs in JSON format. With Log4j you just need to use the JSON layout.

The JSON encoder library for Logback needs to be added to the project pom. If you use Spring Boot, the versions of the Logback dependencies imported by the Spring Boot starters need to be overridden in dependencyManagement, as they are older versions:

<properties>
[...]
  <ch.qos.logback.version>1.2.3</ch.qos.logback.version>
</properties>

<dependencyManagement>
  <dependencies>
[...]
    <!-- Use the latest versions compatible with logstash-logback-encoder 4.11 -->
    <!-- Older versions may be imported by Spring Boot starters that haven't been tested with 4.11 -->
    <dependency>
      <groupId>ch.qos.logback</groupId>
      <artifactId>logback-core</artifactId>
      <version>${ch.qos.logback.version}</version>
    </dependency>
    <dependency>
      <groupId>ch.qos.logback</groupId>
      <artifactId>logback-classic</artifactId>
      <version>${ch.qos.logback.version}</version>
    </dependency>
    <dependency>
      <groupId>ch.qos.logback</groupId>
      <artifactId>logback-access</artifactId>
      <version>${ch.qos.logback.version}</version>
    </dependency>
[...]

The Logback configuration can then use the JSON encoder:

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE xml>
<configuration scan="true" scanPeriod="30 seconds" >
  <appender name="STDOUT" class="ch.qos.logback.core.ConsoleAppender">
    <encoder class="net.logstash.logback.encoder.LogstashEncoder"/>
  </appender>
  <root level="info">
    <appender-ref ref="STDOUT" />
  </root>
</configuration>

Below is the result you get in Elasticsearch by default. _source fields can be added or removed by amending the default index template (see the documentation for details):

{
  "_index": "project.test.ac2664e3-1398-11e7-acfe-5254005a078f.2018.01.14",
  "_type": "com.redhat.viaq.common",
  "_id": "AWDz2VVS0wWjxf1Icmz5",
  "_score": null,
  "_source": {
    "@timestamp": "2018-01-14T08:46:33.146088894Z",
    "@version": 1,
    "message": "Connector vm://localhost stopped",
    "logger_name": "org.apache.activemq.broker.TransportConnector",
    "thread_name": "http-nio-0.0.0.0-8081-exec-9",
    "level": "info",
    "level_value": 20000,
    "docker": {
      "container_id": "10ba060d704c1153165b0db0b38f2d294e3a2df9d7e5447023b710ac79c22d72"
    },
    "kubernetes": {
      "container_name": "spring-boot",
      "namespace_name": "test",
      "pod_name": "spring-boot-camel-amq-te-5-pphb6",
      "pod_id": "3063a039-f907-11e7-a821-5254005a078f",
      "labels": {
        "deployment": "spring-boot-camel-amq-te-5",
        "deploymentconfig": "spring-boot-camel-amq-te",
        "group": "org.jboss.fuse.fis.archetypes",
        "project": "spring-boot-camel-amq-testing",
        "provider": "fabric8",
        "version": "2.2.195.redhat-000013"
      },
      "host": "ocpnode3.sandbox.com",
      "master_url": "https://kubernetes.default.svc.cluster.local",
      "namespace_id": "ac2664e3-1398-11e7-acfe-5254005a078f"
    },
    "hostname": "ocpnode3.sandbox.com",
    "pipeline_metadata": {
      "collector": {
        "ipaddr4": "10.129.0.51",
        "ipaddr6": "fe80::858:aff:fe81:33",
        "inputname": "fluent-plugin-systemd",
        "name": "fluentd",
        "received_at": "2018-01-14T08:46:34.054763+00:00",
        "version": "0.12.39 1.6.0"
      }
    }
  },
  "fields": {
    "@timestamp": [
      1515919593146
    ],
    "pipeline_metadata.collector.received_at": [
      1515919594054
    ]
  },
  "sort": [
   1515919593146
  ]
}

Going further, you can add structured fields with the logstash encoder by using StructuredArguments or Markers. Here is an example using Markers:

import static net.logstash.logback.marker.Markers.*;

[...]

LOG.info(append("MyFirstField", myFirstValue).and(append("MySecondField", mySecondValue)), "my message");

Indexing

If you send structured logs, you may want to fine-tune the index. How to access the Elasticsearch API is explained in the OpenShift documentation. The first step is to log into an Elasticsearch container with "oc rsh". Administrative operations can then be performed, for instance, configuring field mappings:

curl -XPUT 'localhost:9200/my_index/_mapping/type_one?update_all_types&pretty' \
    -H 'Content-Type: application/json' -d '{
  "properties": {
    "text": {
      "type": "text",
      "analyzer": "standard",
      "search_analyzer": "whitespace"
    }
  }
}'

As a new index is created every day, it is best to apply modifications through an index template rather than directly to an existing index. This means that the modifications won't be visible until the next day, but they will also be applied to the indices created on subsequent days. Experimenting with the current index and then moving to an index template once you are satisfied with the result is a sound approach.

It is best to create a new index template rather than modifying the template provided with OpenShift (which can also be retrieved by querying Elasticsearch), since Elasticsearch supports multiple templates matching the same index. A sketch of such a template is shown below.
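
As an illustration, here is a minimal sketch of a custom template that maps the field added with the Marker example above. It assumes the Elasticsearch version in your logging stack accepts the same mapping syntax as the example in the previous section; the template name, index pattern, order, and field settings are placeholders to adapt to your needs (the mapping type matches the _type shown in the earlier output):

curl -XPUT 'localhost:9200/_template/my_custom_fields?pretty' \
    -H 'Content-Type: application/json' -d '{
  "template": "project.*",
  "order": 10,
  "mappings": {
    "com.redhat.viaq.common": {
      "properties": {
        "MyFirstField": {
          "type": "text",
          "analyzer": "standard"
        }
      }
    }
  }
}'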

You should now have nice data available in Elasticsearch and can start creating searches, visualizations, and dashboards in Kibana.

Last updated: October 18, 2018