With the release of Cluster Observability Operator 0.4.0, a technology preview of the OpenShift web console Tracing plug-in is now available. In addition to the ScatterPlot provided with the developer preview release in July, you can now use a Gantt chart to explore trace details. This is a significant enhancement, as our mission is to provide Red Hat OpenShift users with an "observability as one" experience (see Figure 1), allowing you to analyze metrics, alerts, logs, network traffic, and now traces in a single console.
Note that—as highlighted in our previous article (see developer preview link above)—Perses is the open source project of choice powering the Traces UI plug-in: from the ScatterPlot and the trace table to the Gantt chart. Thanks to this work, Perses now supports Tempo as a data store. Learn more about Perses, which was recently accepted as a CNCF Sandbox project.
With this background in mind, key new functionalities of the Traces UI plug-in technology preview release include:
- New TraceQL querier.
- Improved and interactive trace table.
- Brand new mini Gantt chart facilitating user navigation.
- Gantt Chart with trace details and side panel to investigate trace attributes and events.
- Multi-tenancy is now supported.
Disclaimer:
Instances without multi-tenancy are no longer supported, in order to prevent potential security issues.
Not yet sure how traces can help you? A series of blog posts dives deep into the topic. Read the latest ones to learn how to get the most out of this observability pillar, which is particularly essential for complex applications and microservice systems. And now, let’s explore what’s new!
Technology preview: Everything you need to know
To use the technology preview of the Traces UI plug-in, you need to send traces to an existing distributed tracing platform (Tempo) instance, optionally via the Red Hat build of OpenTelemetry. The plug-in is deployed by the Cluster Observability Operator, so that operator must be installed as well. Once the relevant operators are installed, you can use the existing Observability UI plug-ins, including the brand new Tracing one.
Now you are ready to explore distributed traces, which are crucial for monitoring the lifecycle of requests across distributed systems. When you open Observe > Traces in the OpenShift web console, first select a Tempo instance and tenant, then enter a TraceQL query that selects the services relevant to your use case. You can then view a scatter plot of the matching traces, sized by span count, along with the related trace table (see Figure 2).
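As a rough illustration of the kind of TraceQL query you might enter in the query field, the sketch below assembles one from a service name and a couple of optional filters. The helper `build_traceql` and the service name `frontend` are hypothetical and not part of the plug-in; only the TraceQL syntax itself (`resource.service.name`, `status = error`, `duration > ...`) comes from Tempo's query language.

```python
from typing import Optional


def build_traceql(service: str,
                  only_errors: bool = False,
                  min_duration: Optional[str] = None) -> str:
    """Assemble a TraceQL span selector (hypothetical helper, for illustration).

    service      -- value matched against the resource.service.name attribute
    only_errors  -- if True, keep only spans whose status is error
    min_duration -- optional TraceQL duration literal such as "500ms"
    """
    conditions = [f'resource.service.name = "{service}"']
    if only_errors:
        conditions.append("status = error")
    if min_duration is not None:
        conditions.append(f"duration > {min_duration}")
    # All conditions must hold for the same span, so join them with &&.
    return "{ " + " && ".join(conditions) + " }"


if __name__ == "__main__":
    # The printed string is what you would paste into the query field.
    print(build_traceql("frontend"))
    print(build_traceql("frontend", only_errors=True, min_duration="500ms"))
```

Starting with a broad selector on the service name and then narrowing by status or duration tends to keep the scatter plot readable.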
What does the scatter plot display?
- Bubble size: The bigger the bubble, the higher the number of spans in the trace.
- Color coding: Blue means that all spans in the trace were successful, and red means that an error was detected in one or more spans of the trace.
- The X-axis represents the timeline of when the traces were generated.
- The Y-axis represents the duration of the traces.
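The encoding listed above can be summarized in a few lines of Python. This is only a sketch of the mapping, not the plug-in's actual implementation (which is built on Perses); the `TraceSummary` type and its field names are assumptions made for this example.

```python
from dataclasses import dataclass


@dataclass
class TraceSummary:
    """Hypothetical per-trace summary, roughly one row of the trace table."""
    trace_id: str
    start_time_s: float   # when the trace started (Unix seconds)
    duration_ms: float    # total duration of the trace
    span_count: int       # number of spans in the trace
    error_count: int      # number of spans that reported an error


def to_scatter_point(trace: TraceSummary) -> dict:
    """Map one trace to the visual properties of its bubble."""
    return {
        "x": trace.start_time_s,      # X-axis: when the trace was generated
        "y": trace.duration_ms,       # Y-axis: how long the trace took
        "size": trace.span_count,     # more spans -> bigger bubble
        # Red as soon as at least one span failed, blue otherwise.
        "color": "red" if trace.error_count > 0 else "blue",
    }


if __name__ == "__main__":
    healthy = TraceSummary("a1", 1_700_000_000.0, 42.0, 7, 0)
    failing = TraceSummary("b2", 1_700_000_030.0, 950.0, 31, 2)
    for t in (healthy, failing):
        print(t.trace_id, to_scatter_point(t))
```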
Moreover, in the enhanced trace table, positioned below the scatter plot, you can analyze: 1) the specific trace name and related services, 2) the number of spans and (if detected) related errors, 3) the total span duration, and 4) the start time. Note that at the top right of the page, you can select your preferred time range, which makes the data easier to navigate. Additionally, the table described here is now interactive: clicking a trace takes you to the relevant trace detail, displayed in the brand new Gantt chart (Figure 3).
Gantt chart: Observing your requests with context
When we added the scatter plot in our previous release, we knew there was something very important yet to come: the ability to navigate through all the spans generated by a request. This feature is critical to close the observability loop, and that is why we are so happy to see it become a reality. Stay with us and let’s continue navigating the OpenShift web console!
First of all, at the top of the page (Figure 3 above) you can make use of the brand new mini Gantt chart. It helps you navigate trace details: for example, you can select a time window to zoom in on the chosen time frame. The selection is reflected in the main Gantt chart (positioned below the mini Gantt chart), where you can investigate parent-child spans and thus drill down into more granular levels of information.
As displayed in Figure 4, users can also click on a specific span in the Gantt chart and access—via a side panel on the right—the attributes and events related to the chosen span, including resource attributes.
This is very important to users, as it is now possible to check, with context:
- Which spans take more time.
- The relationship between spans.
- The identity of problematic and healthy spans.
- Status codes and relevant data.
Together, the Gantt chart and side panel give users both a high-level view and the granular information needed to understand where an issue is and identify what may be causing it.
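To make the parent-child layout concrete, here is a minimal sketch of how a flat list of spans could be turned into Gantt rows: one row per span, indented by depth and positioned by its offset from the start of the trace. The `Span` type, its field names, and the sample data are assumptions made for this example and do not reflect the plug-in's internals.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass
class Span:
    """Hypothetical, simplified span record."""
    span_id: str
    parent_id: Optional[str]   # None for the root span
    name: str
    start_ms: int              # absolute start time in milliseconds
    duration_ms: int
    is_error: bool = False


# One Gantt row: (depth, name, offset from trace start, duration, is_error)
Row = Tuple[int, str, int, int, bool]


def gantt_rows(spans: List[Span]) -> List[Row]:
    """Order spans depth-first so children appear under their parent."""
    known_ids = {s.span_id for s in spans}
    children: Dict[Optional[str], List[Span]] = {}
    for s in spans:
        # Spans whose parent is missing from the list are treated as roots.
        key = s.parent_id if s.parent_id in known_ids else None
        children.setdefault(key, []).append(s)
    for siblings in children.values():
        siblings.sort(key=lambda s: s.start_ms)

    trace_start = min(s.start_ms for s in spans)
    rows: List[Row] = []

    def visit(span: Span, depth: int) -> None:
        rows.append((depth, span.name, span.start_ms - trace_start,
                     span.duration_ms, span.is_error))
        for child in children.get(span.span_id, []):
            visit(child, depth + 1)

    for root in children.get(None, []):
        visit(root, 0)
    return rows


if __name__ == "__main__":
    sample = [
        Span("1", None, "GET /checkout", 1000, 120),
        Span("3", "1", "db query", 1030, 80, is_error=True),
        Span("2", "1", "auth", 1005, 20),
        Span("4", "3", "connect", 1032, 10),
    ]
    for depth, name, offset, duration, is_error in gantt_rows(sample):
        flag = " (error)" if is_error else ""
        print(f"{'  ' * depth}{name}: +{offset}ms for {duration}ms{flag}")
```

Reading the rows top to bottom answers the questions listed above: indentation shows the relationship between spans, duration shows which ones take the most time, and the error flag separates problematic spans from healthy ones.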
All in all, this technology preview release provides the heart of any Tracing exploration experience: the Gantt chart. Thanks to this visualization tool, OpenShift users are empowered to easily detect areas of improvement for their services (e.g., performance inefficiencies or problematic code), prioritize tasks, and prevent major incidents.
What’s next?
The ability to make use of a native Traces UI in the OpenShift web console is a game-changing feature that improves the overall observability visualization experience and promotes additional data exploration and troubleshooting opportunities. Note that the Jaeger User Interface will still be shipped with the Tempo operator while we continue working to provide all important features and updates in the OpenShift web console based on your feedback.
With the latest Cluster Observability Operator 0.4.0 release, a technology preview of observability signal correlation has also been made available. Trace support is currently being worked on so that OpenShift users will be able to navigate through a greater range of correlated observability signals/resources and investigate issues in a timely manner. For example, given a specific alert, users will be able to quickly identify its correlated metric(s), log(s), netflows, and soon, traces.
Moreover, as we plan to make the Traces UI plug-in generally available to OpenShift users, and in parallel to improve usability with advanced filtering, we look forward to receiving your input and ideas. Feel free to share your questions and recommendations with us using the Red Hat OpenShift feedback form.