
This article introduces a special rule definition in Cryostat 2.0 that lets you access JDK Flight Recorder (JFR) data on the fly, without waiting for your application's normally scheduled archive process. We'll introduce Cryostat's new POST rule definition and show you how to use it to quickly diagnose performance problems in containerized applications running on Kubernetes and Red Hat OpenShift.

This article is part of a series of hands-on guides to using Cryostat 2.0, which is JDK Flight Recorder (JFR) for Kubernetes.

Note: The Red Hat build of Cryostat 2.0 is now widely available in technology preview. The Red Hat build includes the Cryostat Operator to simplify and automate Cryostat deployment on OpenShift.

Continuous monitoring for service disruptions

Consider a scenario where a cluster administrator has deployed Cryostat in a project namespace and enabled continuous application monitoring using Cryostat's automated rules. Such a rule definition might look like this:


{
  "name": "continuousMonitoring",
  "description": "Enable the Continuous template on all parts of our service",
  "matchExpression": "target.annotations.platform['app']=='myService'",
  "eventSpecifier": "template=Continuous,type=TARGET",
  "archivalPeriodSeconds": 3600,
  "preservedArchives": 4
}

This rule, called continuousMonitoring, creates a recording using the Continuous event template on all targets with the app=myService label. It then copies the recording data to archives once every hour and maintains four of these archived copies.
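As a quick check on what those two numbers imply, here is a small shell sketch. The variable names are illustrative only, and it assumes that Cryostat discards the oldest archived copy once the limit is exceeded, which the rule description suggests but this article does not spell out.

```shell
# Values taken from the continuousMonitoring rule definition.
archival_period_seconds=3600
preserved_archives=4

# Longest gap between an incident and the next scheduled copy, in minutes.
max_wait_minutes=$(( archival_period_seconds / 60 ))

# Approximate upper bound on the age of the oldest retained copy, in hours
# (assumption: the oldest copy is pruned when a newer one is archived).
oldest_copy_max_age_hours=$(( archival_period_seconds * preserved_archives / 3600 ))

echo "next scheduled copy is at most ${max_wait_minutes} minutes away"
echo "oldest retained copy is at most about ${oldest_copy_max_age_hours} hours old"
```

In other words, an admin relying only on the schedule could wait up to an hour for fresh data after an incident.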

In this scenario, the project contains many instances of microservices that together provide a larger external service to customers. The microservices make internal requests to each other while servicing customer traffic. Occasionally, for reasons the cluster administrator doesn't yet understand, a hiccup occurs, and the time to service a customer’s request suddenly spikes.

Using the POST rule for OpenJDK flight data

Let's say an admin receives notification that a spike has just occurred, but the continuousMonitoring rule is not set to copy the JDK Flight Recorder data into archives for another 40 minutes. The admin doesn't want to wait that long: They need to discover what caused the service latency spike and fix it right away. For that, they need an immediate way to capture the JDK Flight Recorder data.

Cryostat 2.0 lets cluster admins capture OpenJDK flight data immediately by POSTing a rule definition with the following form:


{
  "matchExpression": "target.annotations.platform['app']=='myService'",
  "eventSpecifier": "archive"
}

Notice the eventSpecifier: archive property. This is a special case that instructs Cryostat that, rather than creating a new recording in all of the matching targets, it should immediately copy recording data from all of the matching targets to the Cryostat archives. Using the same matchExpression as the original rule definition ensures that the archival action applies to the same targets.

If the admin suspected that the problem lived in a particular subset of the microservice targets, they could fine-tune the matchExpression to suit. For example, they could focus their search on the login service:


{
  "matchExpression": "target.annotations.platform['app']=='myService' && /^my-app-login-service-/.test(target.alias)",
  "eventSpecifier": "archive"
}
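Narrowing the expression by hand is easy to get wrong under pressure, so it can help to script it. The following is a minimal sketch and not part of Cryostat: build_match_expression and post_archive_rule are hypothetical helper names, and the CRYOSTAT base URL and /api/v2/rules path are assumed to match the curl commands used elsewhere in this article. The alias prefix is inserted into the regular expression verbatim, so any regex metacharacters in it would need escaping by the caller.

```shell
# Hypothetical helper: build a matchExpression for one app, optionally
# narrowed to targets whose alias starts with the given prefix.
build_match_expression() {
  app="$1"
  alias_prefix="${2:-}"
  expr="target.annotations.platform['app']=='${app}'"
  if [ -n "$alias_prefix" ]; then
    expr="${expr} && /^${alias_prefix}/.test(target.alias)"
  fi
  printf '%s\n' "$expr"
}

# Hypothetical helper: POST an archive rule for the given expression.
# Assumes CRYOSTAT holds the base URL of the Cryostat instance.
# --form-string sends the value literally, so curl does not treat a
# leading @ or an embedded ;type= as special.
post_archive_rule() {
  curl --form-string "matchExpression=$1" \
       -F eventSpecifier=archive \
       "${CRYOSTAT}/api/v2/rules"
}

# Example usage (not run here):
#   post_archive_rule "$(build_match_expression myService my-app-login-service-)"
```

Keeping the expression builder separate from the network call makes it possible to print and review the expression before anything is sent.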

After POSTing the new rule definition to Cryostat, the admin will immediately have access to a copy of each microservice replica’s recording data in the Cryostat archives. Let’s take a look:


$ CRYOSTAT=https://cryostat.example.com
$ curl -F name="continuousMonitoring" -F description="Enable the Continuous template on all parts of our service" -F matchExpression="target.annotations.platform['app']=='myService'" -F eventSpecifier="template=Continuous,type=TARGET" -F archivalPeriodSeconds=3600 -F preservedArchives=4 $CRYOSTAT/api/v2/rules
$ # some time passes
$ curl -F matchExpression="target.annotations.platform['app']=='myService' && /^my-app-login-service-/.test(target.alias)" -F eventSpecifier=archive $CRYOSTAT/api/v2/rules
$ curl $CRYOSTAT/api/v1/recordings

From here, the admin can process the JSON response and take various actions. The response is an array of objects, each describing one recording in the archives. The recording names can be filtered to isolate the ones produced by the archive rule. The admin can then use each object's reportUrl property to get an HTML document containing an automated analysis, or its downloadUrl property to get the full JDK Flight Recorder data dump.
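One way to do that filtering from the command line is sketched below. It assumes that jq is installed and that each object in the response carries a name property alongside downloadUrl; the name property is an assumption, since this article only says that recording names can be filtered. archived_download_urls is a hypothetical helper name, and authentication is left out.

```shell
# Hypothetical helper: read the recordings array on stdin and print the
# downloadUrl of every entry whose name contains the given substring.
# Assumes each entry has "name" and "downloadUrl" properties.
archived_download_urls() {
  jq -r --arg needle "$1" \
    '.[] | select(.name | contains($needle)) | .downloadUrl'
}

# Example usage against a live Cryostat instance (not run here):
#   curl "$CRYOSTAT/api/v1/recordings" | archived_download_urls my-app-login-service
```

Each printed URL could then be fetched with curl to collect the recordings in a batch. Swapping .downloadUrl for .reportUrl in the filter would list the analysis reports instead.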

Conclusion

Regularly archived JDK Flight Recorder recordings are adequate for diagnostic purposes most of the time. But sometimes you need more immediate access to data. In this article, you've learned how to use Cryostat's new POST rule format to access JDK Flight Recorder data on the fly. As a cluster admin, you can respond to system disruptions by POSTing a specially formulated rule, then retrieve the data in a batch to diagnose disruptions in your containerized application's performance. Visit Cryostat.io for more about this and other new features in Cryostat 2.0.
