The Monitoring aspects of Eclipse MicroProfile 1.2

Eclipse MicroProfile (MP) aims to bring microservices to Enterprise Java by developing common standards that MP-compliant vendors then implement [1]. This applies not only to developer APIs but also to interfaces for running, configuring, and managing the servers.

The more classical specifications have often left many details vendor-specific, especially around setting up and running applications and servers. For the Java Enterprise Edition, there are standards like JMX and JSR-77, but most of the time they were not used, or access from management stations was not specified. In practice, this made it harder than necessary to monitor the health of an application or to port it from one application server vendor to another.

The MicroProfile community has decided that operational aspects such as telemetry and health checks should not be vendor-specific but should be part of the base specifications.

I will now show two aspects of monitoring that are included in the MicroProfile 1.2 release. Further aspects like distributed tracing may follow in subsequent releases.

Health Checks

Health checks answer the binary question “is my application running well or should it be restarted?” This has always been an important question for operations, and it was pretty common to just restart an application that had memory leaks. Health checks have recently become a lot more relevant with schedulers like Kubernetes, where they are a core concept. Kubernetes, and thus OpenShift, regularly checks the health of a running container. If the container reports being unhealthy (or doesn’t answer at all), Kubernetes will kill the container and start a new instance.

Describing the health state via a Java API

Health checks have two access points: the HTTP side and the Java API side. Let’s first have a look at the Java side, which exposes the system health:

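import javax.enterprise.context.ApplicationScoped;

import org.eclipse.microprofile.health.Health;
import org.eclipse.microprofile.health.HealthCheck;
import org.eclipse.microprofile.health.HealthCheckResponse;
import org.eclipse.microprofile.health.HealthCheckResponseBuilder;

@Health
@ApplicationScoped
public class HealthDemo implements HealthCheck {
    @Override
    public HealthCheckResponse call() {
        // Create a builder for a check named "alive"
        HealthCheckResponseBuilder alive = HealthCheckResponse.named("alive");
        // Further data could be attached to the builder here
        return alive.up().build();
    }
}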

To expose the data, you have to implement the HealthCheck interface. Within the call() method, you retrieve a HealthCheckResponseBuilder with a name for the check and supply the status (up/down). It is also possible to supply more data to the HealthCheckResponseBuilder, which is then exposed on the REST interface.

It is possible to provide more than one such health check provider. The results of all those checks will then be aggregated to form the final outcome.
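The aggregation can be pictured with a few lines of plain Java. This is only a sketch under the assumption that the overall outcome is UP exactly when no individual check reports DOWN; the class HealthAggregation and the check name "database" are made up for illustration and are not part of the MicroProfile Health API.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper for illustration; not part of the MicroProfile Health API.
final class HealthAggregation {

    enum State { UP, DOWN }

    // Assumed rule: the overall outcome is DOWN as soon as one check is DOWN.
    // With no checks installed, the outcome is UP.
    static State aggregate(Map<String, State> checks) {
        for (State state : checks.values()) {
            if (state == State.DOWN) {
                return State.DOWN;
            }
        }
        return State.UP;
    }

    public static void main(String[] args) {
        Map<String, State> checks = new LinkedHashMap<>();
        checks.put("alive", State.UP);
        checks.put("database", State.DOWN); // hypothetical second check
        System.out.println(aggregate(checks)); // prints DOWN
    }
}
```

A real implementation collects the HealthCheckResponse objects of all registered checks; the map here merely stands in for that collection.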

Fetching the health state via HTTP

Systems can query health data via an HTTP GET operation on the /health endpoint. Kubernetes considers the application healthy if the response code is in the range 200-399.

Thus, to report a healthy system state, MP-Health implementations respond with “200 OK” and a payload that specifies the outcome as UP along with the list of checks executed. The checks array will be empty unless specific health checks have been installed.

$ curl http://localhost:8080/health
{
"outcome": "UP",
"checks": [
    {
        "name": "alive",
        "state": "UP"
    }
  ]
}

The individual checks provide their status within the checks section. This is helpful when more than one check is configured and the overall outcome is DOWN. The individual checks then help to pinpoint the cause of unhealthiness. Additional information given to the HealthCheckResponseBuilder in the Java code is passed on to the individual check result.
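To make the status code rule more tangible, here is a minimal sketch of a probe written with the JDK HTTP client (Java 11 or later). It only mimics the 200-399 rule described above and is not how Kubernetes itself is implemented; the class HealthProbe, the timeouts, and the choice to treat I/O errors as unhealthy are my assumptions.

```java
import java.io.IOException;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.time.Duration;

// Hypothetical probe for illustration; not part of any MicroProfile API.
final class HealthProbe {

    // Response codes from 200 to 399 count as healthy.
    static boolean isHealthy(int statusCode) {
        return statusCode >= 200 && statusCode < 400;
    }

    // Sends a GET to the given URL. No answer or an I/O error counts as unhealthy.
    static boolean probe(String url) {
        HttpClient client = HttpClient.newBuilder()
                .connectTimeout(Duration.ofSeconds(2)) // assumed timeout
                .build();
        HttpRequest request = HttpRequest.newBuilder(URI.create(url))
                .timeout(Duration.ofSeconds(2)) // assumed timeout
                .GET()
                .build();
        try {
            HttpResponse<Void> response =
                    client.send(request, HttpResponse.BodyHandlers.discarding());
            return isHealthy(response.statusCode());
        } catch (IOException e) {
            return false;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }

    public static void main(String[] args) {
        // URL taken from the curl example above
        System.out.println(probe("http://localhost:8080/health"));
    }
}
```

The probe discards the response body on purpose: for the restart decision only the status code matters, while the JSON payload is there for humans and tools that want to find the failing check.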

Telemetry aka MicroProfile Metrics

Telemetry exposes metrics of the running server like CPU and memory usage, thread count, and others. These are often fed into charting systems to visualize the metrics over time, or used for capacity planning.

The Java Virtual Machine has long had a way to expose data via MBeans and the MBeanServer. Since Java SE 6, there is even an (RMI-based) remote protocol defined for all VMs for accessing the MBean server from remote processes.

Dealing with this protocol is difficult and does not fit into today’s HTTP-based interactions. The other pain point is that many existing servers expose different properties under different names, so it is not easy to set up monitoring for different kinds of servers.

MicroProfile has created a monitoring specification that addresses these two points via an HTTP-based API for access by monitoring agents and a Java API that allows exporting application-specific metrics.

There are three scopes for metrics within the specification:

  • Base: those are metrics, mostly JVM statistics, that every compliant vendor has to support.
  • Vendor: optional vendor-specific metrics that are not portable.
  • Application: optional metrics from deployed applications. I will present the Java-API for those below.
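The JVM statistics of the base scope are data that the JVM already offers locally through its platform MXBeans. The following sketch reads a few of them with JDK classes only. The class JvmSnapshot is made up, and the keys are chosen by me to resemble the metric names in the JSON output shown later; the exact mapping used by an implementation is an assumption here.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryMXBean;
import java.lang.management.ThreadMXBean;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper for illustration; not part of the MicroProfile Metrics API.
final class JvmSnapshot {

    // Reads a handful of values from the platform MXBeans of the running JVM.
    static Map<String, Number> read() {
        ThreadMXBean threads = ManagementFactory.getThreadMXBean();
        MemoryMXBean memory = ManagementFactory.getMemoryMXBean();

        Map<String, Number> values = new LinkedHashMap<>();
        // Key names are assumptions modeled on the JSON output of the base scope
        values.put("thread.count", threads.getThreadCount());
        values.put("thread.daemon.count", threads.getDaemonThreadCount());
        values.put("thread.max.count", threads.getPeakThreadCount());
        values.put("memory.usedHeap", memory.getHeapMemoryUsage().getUsed());
        values.put("jvm.uptime", ManagementFactory.getRuntimeMXBean().getUptime());
        values.put("cpu.availableProcessors",
                ManagementFactory.getOperatingSystemMXBean().getAvailableProcessors());
        return values;
    }

    public static void main(String[] args) {
        read().forEach((name, value) -> System.out.println(name + " = " + value));
    }
}
```

This works only inside the same JVM. Getting at the same values from another process is where the remote protocol, or the HTTP endpoint of MicroProfile Metrics, comes in.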

Retrieve telemetry data via HTTP

Let’s have a look at the way monitoring agents can retrieve data from the server.

MicroProfile Metrics exposes data in two formats by default. The Prometheus text format is used if no format is requested explicitly. The specification also defines a JSON encoding, which can be requested by sending an Accept header with the media type ‘application/json’.

$ curl -Haccept:application/json http://localhost:8080/metrics/base
{
  "classloader.totalLoadedClass.count" : 12304,
  "cpu.systemLoadAverage" : 2.029296875,
  "thread.count" : 53,
  "classloader.currentLoadedClass.count" : 12262,
  "jvm.uptime" : 6878170,
  "gc.PS MarkSweep.count" : 3,
  "memory.committedHeap" : 1095237632,
  "thread.max.count" : 66,
  "gc.PS Scavenge.count" : 11,
  "cpu.availableProcessors" : 4,
  "thread.daemon.count" : 11,
  "classloader.totalUnloadedClass.count" : 42,
  "memory.maxHeap" : 3817865216,
  "memory.usedHeap" : 427363088,
  "gc.PS MarkSweep.time" : 322,
  "gc.PS Scavenge.time" : 244
}

In the previous example, we retrieve only the metrics of the base scope, in JSON format. Next, we retrieve metrics from all scopes in the Prometheus format. I have trimmed the output to show only one metric from each scope.

$ curl http://localhost:8080/metrics
# TYPE application:de_bsd_swarmdemo_rest_hello_world_endpoint_a_counter counter
application:de_bsd_swarmdemo_rest_hello_world_endpoint_a_counter{tier="integration"} 52.0
# TYPE base:classloader_total_loaded_class_count counter
base:classloader_total_loaded_class_count{tier="integration"} 12304.0
# TYPE vendor:memory_pool_metaspace_usage_max gauge
vendor:memory_pool_metaspace_usage_max_bytes{tier="integration"} 6.47796E7
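Comparing the two outputs shows how the names relate: classloader.totalLoadedClass.count from the JSON output becomes base:classloader_total_loaded_class_count in the Prometheus output. The sketch below reproduces just this visible pattern (scope prefix, underscores instead of dots, camelCase turned into snake_case). It is not the complete set of rules from the specification; unit suffixes such as _bytes and the tier label are left out, and the class PrometheusNames is made up for illustration.

```java
import java.util.Locale;

// Hypothetical helper for illustration; covers only the pattern visible in the output above.
final class PrometheusNames {

    static String toPrometheusName(String scope, String jsonName) {
        String snake = jsonName
                // dots, spaces and other non-alphanumeric characters become underscores
                .replaceAll("[^A-Za-z0-9]+", "_")
                // a lower-case letter or digit followed by an upper-case letter marks a word boundary
                .replaceAll("([a-z0-9])([A-Z])", "$1_$2")
                .toLowerCase(Locale.ROOT);
        return scope + ":" + snake;
    }

    public static void main(String[] args) {
        System.out.println(
                toPrometheusName("base", "classloader.totalLoadedClass.count"));
        // prints base:classloader_total_loaded_class_count
    }
}
```

Under the assumption that the injected counter is registered as de.bsd.swarmdemo.rest.HelloWorldEndpoint.aCounter, the same function also yields the application metric name from the output above.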

Java API

Let’s now have a look at the Java API. I am using a single JAX-RS endpoint for this illustration. Users of DropWizard Metrics may find some of the following familiar: the API is deliberately modeled after DropWizard Metrics and has been enhanced so that CDI does the heavy lifting.

@ApplicationScoped
@Path("/hello")
public class HelloWorldEndpoint {

    @Inject
    Counter aCounter;

    @GET
    @Produces("text/plain")
    @Counted(description = "Counting of the Hello call", absolute = true)
    @Timed(name="helloTime", description = "Timing of the Hello call", absolute = true)
    @Metered(absolute = true, name = "helloMeter")
    public Response doGet() {
        aCounter.inc();
        return Response.ok("Hello from WildFly Swarm! " 
                               + aCounter.getCount())
               .build();
    }
}

Using and exposing the metrics happens via the magic of CDI. In most cases, it is enough to put one of the annotations @Counted, @Timed, and so on from the package org.eclipse.microprofile.metrics.annotation on the method or field you want to expose, and the implementation will do the rest for you.

The example contains one counter (aCounter) that is made available for exposition simply by injecting it with @Inject (line 6).

The counter is explicitly incremented in line 14. Its value is retrieved in line 16 and included in the REST response of our JAX-RS endpoint.

If you look at the annotations, you can see that additional metadata, such as the description, can be provided. Supplying metadata is strongly encouraged because it makes it easier for operators to understand what a certain metric means.

If you do not provide an explicit name on the annotation, the implementation computes a base name from the annotated item. In the case of the @Counted in line 10, the resulting metric name is doGet, which is the method name. The fully qualified class name is not prepended because the parameter absolute is true.
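The naming rule can be written down in a few lines. This is a sketch of the rule as described in the paragraph above, not code from an implementation; the class MetricNames and the package de.bsd.swarmdemo.rest (guessed from the Prometheus output earlier) are assumptions.

```java
// Hypothetical helper for illustration; not part of the MicroProfile Metrics API.
final class MetricNames {

    // explicitName may be null or empty if the annotation carries no name.
    static String metricName(String className, String memberName,
                             String explicitName, boolean absolute) {
        // Without an explicit name, the name of the annotated member is used
        String base = (explicitName == null || explicitName.isEmpty())
                ? memberName
                : explicitName;
        // With absolute = true, the fully qualified class name is not prepended
        return absolute ? base : className + "." + base;
    }

    public static void main(String[] args) {
        String cls = "de.bsd.swarmdemo.rest.HelloWorldEndpoint"; // assumed package
        System.out.println(metricName(cls, "doGet", "", true));          // prints doGet
        System.out.println(metricName(cls, "doGet", "helloTime", true)); // prints helloTime
        System.out.println(metricName(cls, "doGet", "", false));
        // prints de.bsd.swarmdemo.rest.HelloWorldEndpoint.doGet
    }
}
```

Relative names avoid clashes between classes that happen to use the same method name, while absolute names are shorter and stay stable when a class is moved to another package.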

Contributing

MicroProfile specifications follow an agile, “good enough” approach with feedback cycles. Once a specification has been released, it will be included in a subsequent MicroProfile umbrella release. It is possible that a newer version of a specification breaks compatibility with previous versions.

If you are interested in contributing to future versions of the specification, you can join the MicroProfile Google Group.

References

Prometheus text format https://prometheus.io/docs/instrumenting/exposition_formats/#text-format-details

MicroProfile Health Specification repository https://github.com/eclipse/microprofile-health

MicroProfile Metrics Specification repository https://github.com/eclipse/microprofile-metrics

[1] https://developers.redhat.com/blog/2016/06/27/microprofile-collaborating-to-bring-microservices-to-enterprise-java/

