A beginner's guide to the Shenandoah garbage collector

This article is a quick introduction to Red Hat's Shenandoah, a high-performance low-pause-time garbage collector. It covers Shenandoah's basic features, use cases, garbage collection (GC) logging, and basic troubleshooting.

This post is meant to be a straightforward overview. For a deeper dive, I recommend going straight to the upstream documentation page. The upstream email list is shenandoah-dev@openjdk.org.

About Shenandoah GC

Here is a very quick summary of Shenandoah's key characteristics:

Concurrent: This means the application runs together with the garbage collection.
Location-based GC: Forwarding pointers enable Shenandoah to collect each region independently without remembered sets.
Non-generational: The heap has no Young, Tenured, or Old divisions, so do not look for young or old generations in the GC logs.
Operates in two or three concurrent phases: The former traversal phase (or traversal mode) has been deprecated.
Supported on Red Hat's OpenJDK 8 (1.8), OpenJDK 11, OpenJDK 17, and OpenJDK 21.
Compacts concurrently: Shenandoah compacts concurrently, thus avoiding fragmentation issues.
Goal of <10ms pauses: This is a soft goal, similar to Oracle's Z Garbage Collector (ZGC). Note that ZGC and GenZGC are production-ready.

The concepts above can seem a bit overwhelming if you are new to garbage collection, but Shenandoah's process is in many ways similar to the behavior of Oracle's G1GC. See the details below:

Like G1GC, Shenandoah is region-based, so the memory is divided into regions like a grid. However, no generation is associated with a region, unlike in G1GC, where each region is assigned a role such as eden, survivor, or old.

Shenandoah's concurrent collection uses snapshot-at-the-beginning (SATB) marking, which is also used in G1. Concurrent compaction relies on a forwarding pointer (which, in the original design, adds one more word to the two already in the OpenJDK object header). Shenandoah compacts to avoid fragmentation, a known problem with CMS (deprecated in JDK 9 and removed in JDK 14).
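
To make the forwarding-pointer idea more concrete, here is a toy model in Java. It is only a sketch of the concept and not how HotSpot implements it: the `Cell` class, its `forwardee` field, and the `resolve` and `evacuate` helpers are names invented for this illustration.

```java
import java.util.concurrent.atomic.AtomicReference;

/** Toy model of a forwarding pointer; every name here is hypothetical. */
final class ForwardingDemo {

    /** A heap "object" whose forwarding pointer initially refers to itself. */
    static final class Cell {
        final int payload;
        final AtomicReference<Cell> forwardee = new AtomicReference<>();

        Cell(int payload) {
            this.payload = payload;
            forwardee.set(this); // not moved yet: forward to self
        }
    }

    /** Accesses go through the forwarding pointer to find the current copy. */
    static Cell resolve(Cell ref) {
        return ref.forwardee.get();
    }

    /**
     * Copies the object to a "new location" and installs the forwarding
     * pointer with compare-and-set, so that if several threads race to
     * move the same object, exactly one copy wins and all agree on it.
     */
    static Cell evacuate(Cell old) {
        Cell copy = new Cell(old.payload);
        if (old.forwardee.compareAndSet(old, copy)) {
            return copy;
        }
        return old.forwardee.get(); // lost the race: use the winner's copy
    }

    public static void main(String[] args) {
        Cell original = new Cell(42);
        Cell moved = evacuate(original);
        // A stale reference to the old location still reaches the new copy.
        System.out.println(resolve(original) == moved); // prints "true"
        System.out.println(resolve(original).payload);  // prints "42"
    }
}
```

The point of the indirection is that an object can be moved while the application is running: references that still point at the old location are redirected until they are updated.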

In this context, concurrent means that some of the cleaning and compacting work happens at the same time as the application runs, instead of pausing the whole application in stop-the-world (STW) GC operations. The more work is done concurrently, the less remains to be done during pauses. Concurrent compaction is a major part of the Shenandoah algorithm: the three main concurrent phases are marking (bracketed by the init mark and final mark STW pauses), evacuation, and updating references.

Shenandoah's original goal was pauses under 10ms, so that the application is not affected by long pauses. Later, as more work was made concurrent and other improvements landed, it reached sub-millisecond pauses. Shenandoah's performance has improved considerably over more than ten years of development, and it is considered a mature product supported in production environments.
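
The split between pauses and concurrent work can be written down as a small table in code. The sketch below is illustrative only: the `Phase` enum and its constant names are made up for this article, and it models just the steps named in this section (the real collector has additional short steps that are not modeled here).

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative model of one collection cycle; names are hypothetical. */
final class CycleSketch {

    enum Phase {
        INIT_MARK(true),               // short STW pause that starts marking
        CONCURRENT_MARK(false),        // marks live objects while the app runs
        FINAL_MARK(true),              // short STW pause that finishes marking
        CONCURRENT_EVACUATION(false),  // copies live objects out of chosen regions
        CONCURRENT_UPDATE_REFS(false); // repoints references to the new copies

        final boolean stopTheWorld;

        Phase(boolean stopTheWorld) {
            this.stopTheWorld = stopTheWorld;
        }
    }

    /** Returns the phases that run alongside the application. */
    static List<Phase> concurrentPhases() {
        List<Phase> result = new ArrayList<>();
        for (Phase p : Phase.values()) {
            if (!p.stopTheWorld) {
                result.add(p);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        for (Phase p : Phase.values()) {
            System.out.println(p + (p.stopTheWorld ? " (pause)" : " (concurrent)"));
        }
    }
}
```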

The generational part will be discussed in a separate section below.

Finally, be aware that the Shenandoah algorithm has evolved. Shenandoah1, the initial version, had a larger footprint and is the one described in most of the papers online. Shenandoah2, the second version of the algorithm, is what the code currently implements.

Using and tuning Shenandoah GC

Simply add the flag -XX:+UseShenandoahGC, and the application should run it.
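
To confirm that the flag took effect, one option is to ask the JVM which collectors it registered. The sketch below uses the standard `java.lang.management` API; the `WhichGc` class and its helper names are invented for this example, and the check assumes that Shenandoah's management beans carry "Shenandoah" in their names.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;

/** Prints the collectors registered in the running JVM. */
final class WhichGc {

    /** Collects the names of all garbage collector beans in this JVM. */
    static List<String> collectorNames() {
        List<String> names = new ArrayList<>();
        for (GarbageCollectorMXBean bean : ManagementFactory.getGarbageCollectorMXBeans()) {
            names.add(bean.getName());
        }
        return names;
    }

    /** Assumption: Shenandoah's beans contain "Shenandoah" in their names. */
    static boolean looksLikeShenandoah(List<String> names) {
        for (String name : names) {
            if (name.contains("Shenandoah")) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        List<String> names = collectorNames();
        System.out.println("Collectors: " + names);
        System.out.println("Shenandoah active: " + looksLikeShenandoah(names));
    }
}
```

Compile it and run it once with `-XX:+UseShenandoahGC` and once without to see the difference.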

For tuning, see our main solution Shenandoah Collector Tuning, which covers weak reference tuning and heuristic choices.

Regions cleaning and allocation

Similar to G1GC, Shenandoah divides the heap into regions, like a grid. However, unlike in G1GC, regions are not assigned a specific generation; instead, they are processed by GC threads that run concurrently with the application. Shenandoah is location-based, so there are regions with a given region size, and therefore humongous allocation (objects larger than a region) is a special case to be handled.
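
The region arithmetic behind that special case is simple enough to sketch. The snippet below is an illustration, not Shenandoah's code: `RegionMath` and its methods are hypothetical names, the 512 KB region size is just an example value, and treating "larger than one region" as the humongous rule is an assumption based on the threshold that the GC log reports.

```java
/** Back-of-the-envelope region arithmetic; names and values are illustrative. */
final class RegionMath {

    /** Example region size (512 KB); the real value depends on the heap size. */
    static final long REGION_BYTES = 512L * 1024;

    /** Assumption: an object is humongous when it does not fit in one region. */
    static boolean isHumongous(long objectBytes) {
        return objectBytes > REGION_BYTES;
    }

    /** Number of regions an allocation of this size would span (rounded up). */
    static long regionsNeeded(long objectBytes) {
        return (objectBytes + REGION_BYTES - 1) / REGION_BYTES;
    }

    public static void main(String[] args) {
        long twoMegabytes = 2L * 1024 * 1024;
        System.out.println(isHumongous(twoMegabytes));   // prints "true"
        System.out.println(regionsNeeded(twoMegabytes)); // prints "4"
    }
}
```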

In terms of specific comparison, see Shenandoah vs G1GC in OpenJDK.

Non-generational (so far)

Red Hat's implementation of Shenandoah is non-generational, whereas Amazon's Corretto provides generational support. At the time of writing, Generational Shenandoah (GenShen) is experimental, whereas Generational ZGC is production-ready. Later OpenJDK releases (21+) will likely bring GenShen as production-ready.

GC logging interpretation

OpenJDK 64-Bit Server VM (25.302-b08) for linux-amd64 JRE (1.8.0_302-b08), built on Jul 17 2021 18:13:18 by "mockbuild" with gcc 4.8.5 20150623 (Red Hat 4.8.5-44)
Memory: 4k page, physical 15908268k(2468964k free), swap 8388604k(8186876k free)
CommandLine flags: -XX:CompressedClassSpaceSize=260046848 -XX:GCLogFileSize=3145728 -XX:InitialHeapSize=1366294528 -XX:MaxHeapSize=1366294528 -XX:MaxMetaspaceSize=268435456 -XX:MetaspaceSize=100663296 -XX:NumberOfGCLogFiles=5 -XX:+PrintGC -XX:+PrintGCDateStamps -XX:+PrintGCDetails -XX:+PrintGCTimeStamps -XX:-TraceClassUnloading -XX:+UseCompressedClassPointers -XX:+UseCompressedOops -XX:+UseGCLogFileRotation -XX:+UseShenandoahGC
Regions: 2606 x 512K
Humongous object threshold: 512K
Max TLAB size: 65536B
GC threads: 4 parallel, 2 concurrent
Heuristics ergonomically sets -XX:+ExplicitGCInvokesConcurrent
Heuristics ergonomically sets -XX:+ShenandoahImplicitGCInvokesConcurrent
Shenandoah GC mode: Snapshot-At-The-Beginning (SATB)
Shenandoah heuristics: Adaptive
Pacer for Idle. Initial: 26685K, Alloc Tax Rate: 1.0x
Initialize Shenandoah heap: 1303M initial, 1303M min, 1303M max
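
One way to sanity-check this header is to confirm that the region count times the region size matches the heap size: 2606 regions of 512K give 1303M, which agrees with the "Initialize Shenandoah heap" line and with -XX:MaxHeapSize=1366294528. The sketch below does that arithmetic; the `LogHeader` class is a hypothetical helper written for this article, and its pattern only handles the "N x SK" form shown in the log above.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Tiny parser for the "Regions: N x SK" line of the log header. */
final class LogHeader {

    // Matches, for example, "Regions: 2606 x 512K" (kilobyte suffix only).
    private static final Pattern REGIONS = Pattern.compile("Regions: (\\d+) x (\\d+)K");

    /** Returns the heap size in MB implied by the line, or -1 if it does not match. */
    static long heapMegabytes(String line) {
        Matcher m = REGIONS.matcher(line);
        if (!m.find()) {
            return -1;
        }
        long regionCount = Long.parseLong(m.group(1));
        long regionKilobytes = Long.parseLong(m.group(2));
        return regionCount * regionKilobytes / 1024;
    }

    public static void main(String[] args) {
        System.out.println(heapMegabytes("Regions: 2606 x 512K")); // prints "1303"
        System.out.println(1366294528L / (1024 * 1024));           // prints "1303"
    }
}
```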


Shenandoah GC use cases

Throughout years of using and testing Shenandoah, I've encountered plenty of situations where using it improves performance considerably; in other scenarios, not so much. Below, I've outlined my recommendations for when to use Shenandoah and when not to use it.

When to use Shenandoah GC

Non-generational Shenandoah works well for large workloads whose allocation pattern is neither random nor strongly generational.

One example is an application with a constant rate of object allocation (and of object usage in general).

Another example is an application where most objects are long-lived and newly created objects are a small percentage relative to the size and number of regions to be cleaned.

Shenandoah can also be a good fit for containers, and is even recommended in some instances, but that naturally depends on the use case.

When not to use Shenandoah GC

Shenandoah (in its non-generational form) does not suit every situation and workload; some workloads will be hurt more than helped by it because it is not generational. Being non-generational is a core part of the algorithm and helps considerably in several respects, but it can hurt in other, more specific ones.

One example is when a high number of very short-lived objects is created at random periods, which causes all the GC threads to kick in at the same time and can lead to several full pauses in a row. In those cases, a generational collector such as G1GC or ParallelGC would likely handle the situation better by splitting the collection into phases. For those generational workloads, Amazon's Corretto has developed a generational Shenandoah, GenShen.

Consequently, the development team needs to do its due diligence and verify how a non-generational collector behaves in terms of latency, throughput, and, last but not least, footprint, which is the metric most often sacrificed when developing in Java or any other garbage-collected language.

However, this non-generational aspect will likely change given the support of Generational Shenandoah to be introduced in later releases, likely OpenJDK 21+.

In any case, the main recommendation is to benchmark the application with Shenandoah and with other collectors, such as G1GC and ParallelGC, for a more direct comparison. I would recommend doing that even before tuning Shenandoah.

Generational Shenandoah GC

Amazon started contributing a generational version of Shenandoah to OpenJDK, announced it in 2021, and made it available in Amazon's Corretto. To use it, download Corretto and set:

-XX:+UseShenandoahGC -XX:+UnlockExperimentalVMOptions
-XX:ShenandoahGCMode=generational
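
A quick way to verify that a running JVM actually received these options is to inspect its input arguments. The following is a minimal sketch that uses the standard `RuntimeMXBean`; the `GenShenFlags` class and the `requestsGenerational` helper are hypothetical names, and the check only looks for the three exact flag strings shown above.

```java
import java.lang.management.ManagementFactory;
import java.util.List;

/** Checks whether the JVM was started with the generational Shenandoah flags. */
final class GenShenFlags {

    /** True when all three flags from the article are present in the argument list. */
    static boolean requestsGenerational(List<String> jvmArgs) {
        return jvmArgs.contains("-XX:+UseShenandoahGC")
                && jvmArgs.contains("-XX:+UnlockExperimentalVMOptions")
                && jvmArgs.contains("-XX:ShenandoahGCMode=generational");
    }

    public static void main(String[] args) {
        // Arguments passed to the JVM itself (not to the main method).
        List<String> jvmArgs = ManagementFactory.getRuntimeMXBean().getInputArguments();
        System.out.println("JVM arguments: " + jvmArgs);
        System.out.println("Generational mode requested: " + requestsGenerational(jvmArgs));
    }
}
```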


As noted above, the motivating case is a workload that creates a high number of very short-lived objects at random periods, which a generational collector such as G1GC or ParallelGC would likely handle better. Amazon's Generational Shenandoah is meant to cover those generational workloads.

On a side note, Oracle's generational ZGC (GenZGC) is already production-ready.

Notes on benchmarking applications

Benchmarking is an important way to compare the performance of specific settings or changes in an environment, and it applies to garbage collection changes as well. In this case, the user deploys the application with different GCs (for instance, Shenandoah versus G1GC), puts each deployment under stress, and compares three main metrics: footprint, throughput, and latency. Most of the time, footprint is sacrificed for throughput or latency.

Just by deploying the application with Shenandoah and with G1GC, the user should see some signs of a performance increase or decrease. This is especially true if the application has a strong generational character (it relies heavily on young or old generation behavior) or random patterns of allocation and deallocation, which are cases where Shenandoah (as currently implemented) may underperform.
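
As a rough illustration of the three metrics, here is a toy harness. It is a sketch only: the `BenchMetrics` class, the allocation loop, and the chosen sizes are invented for this example, and a synthetic loop like this is no substitute for stressing the real application (or for a dedicated harness such as JMH).

```java
import java.util.Arrays;

/** Toy harness that reports latency, throughput, and footprint. */
final class BenchMetrics {

    /** Nearest-rank percentile of the samples (same unit as the input). */
    static long percentile(long[] samples, double pct) {
        long[] sorted = samples.clone();
        Arrays.sort(sorted);
        int rank = (int) Math.ceil(pct / 100.0 * sorted.length);
        return sorted[Math.max(0, rank - 1)];
    }

    /** Operations per second for a run that took elapsedNanos. */
    static double throughputPerSecond(long operations, long elapsedNanos) {
        return operations * 1_000_000_000.0 / elapsedNanos;
    }

    public static void main(String[] args) {
        int operations = 200_000;
        long[] latencies = new long[operations];
        long sink = 0; // keeps the allocations observable

        long start = System.nanoTime();
        for (int i = 0; i < operations; i++) {
            long t0 = System.nanoTime();
            byte[] garbage = new byte[1024]; // short-lived allocation to exercise the GC
            garbage[0] = (byte) i;
            sink += garbage[0];
            latencies[i] = System.nanoTime() - t0;
        }
        long elapsed = System.nanoTime() - start;

        Runtime rt = Runtime.getRuntime();
        System.out.println("p99 latency (ns): " + percentile(latencies, 99));
        System.out.println("max latency (ns): " + percentile(latencies, 100));
        System.out.println("throughput (ops/s): " + throughputPerSecond(operations, elapsed));
        System.out.println("used heap (bytes): " + (rt.totalMemory() - rt.freeMemory()));
        System.out.println("checksum: " + sink);
    }
}
```

Running the same class once per collector (for instance with `-XX:+UseShenandoahGC` and then with `-XX:+UseG1GC`) and the same heap settings gives numbers that can be compared side by side.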

Evidently, at the end of the day, for a vast majority of the cases, what will matter for applications will be more about throughput and latency rather than footprint, except in specific instances where one is paying for memory/CPU per usage.

Additional resources

To learn more about the Shenandoah garbage collector, I strongly recommend checking out the upstream documentation, which provides more details and some relevant diagrams.

For any other specific inquiries, please open a case with Red Hat support. Our global team of experts can help you with any issues, including help with JVM and Garbage Collection specific issues, and we are glad to do so.

Acknowledgments

I would like to thank Roman Kennke, my mentor at Red Hat (from August 2021 until the end of his tenure). Thank you also to Zhengyu Gu and Aleksey Shipilev, who have always been open to questions about GC usage and troubleshooting over the several years we have worked together since Shenandoah became supported in production.

Finally, thanks to Alexander Barbosa and Will Russell for their detailed review of this article.
