Using Unsafe safely in GraalVM Native Image

The perils of using the Unsafe class in Java applications are well documented. Although Unsafe has historically offered access to low-level programming features, it exposes internal details of the implementation and its use is therefore highly discouraged. If you submit code that uses the Unsafe API to the GraalVM Native Image compiler, you can encounter even more problems that don't exist on dynamic JVMs such as HotSpot.

This article looks at the issues that GraalVM Native Image can potentially introduce with Unsafe, and offers coding techniques that can help you achieve some of the goals you might want to use Unsafe for. Among other things, we'll look at the VarHandles API, which was introduced in the JDK as an alternative to some of the Unsafe APIs.

What is GraalVM Native Image?

To begin with, let's go through a quick overview of how GraalVM Native Image works.

Native Image is an ahead-of-time compilation technology that performs static analysis of the complete application to find reachable components, and then statically compiles the code to produce a native executable. The static analysis includes points-to analysis that, in addition to identifying unused methods, also identifies the unused fields in a class and allows such fields to be excluded from the class definition (also called class metadata) stored in the image.

To illustrate how fields are removed from the metadata, you can use the GDB debugger to inspect the internal state when the native image is running. Please refer to the GraalVM documentation on the Debug Info feature for details on debugging a native image.

Let's consider the following code, stored in a file named FieldDCETest.java. Note that field2 is never used: it is declared, and getField2() refers to it, but that method is never called:

class MyClass {
        private int field1;
        private int field2;
        private int field3;

        MyClass(int val1, int val2) {
                field1 = val1;
                field3 = val2;
        }

        int getField1() { return field1; }
        int getField2() { return field2; }
        int getField3() { return field3; }
}

public class FieldDCETest {
        public static void main(String args[]) {
                MyClass obj = new MyClass(10, 20);
                System.out.println("field1: " + obj.getField1());
                System.out.println("field3: " + obj.getField3());
                return;
        }
}

If we ask GDB to print the definition of the MyClass type after it passes through the Native Image builder, the output shows:

(gdb) ptype MyClass
type = class MyClass : public java.lang.Object {
  private:
    int field1;
    int field3;
  public:
    MyClass(int, int);
}

field2 is not present in the type definition, because points-to analysis in Native Image builder eliminated it as an unused field.
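
As a rough intuition for why getField2() and field2 drop out, here is a toy worklist traversal over a hand-written call graph. It is only a sketch: the class ToyReachability and the graph are hypothetical, and the real points-to analysis in Native Image is considerably more sophisticated (it also tracks types and field accesses).

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Toy reachability over a call graph (hypothetical; not Native Image's algorithm). */
final class ToyReachability {

    /** Returns every method name reachable from the entry point by following call edges. */
    static Set<String> reachableFrom(String entry, Map<String, List<String>> callGraph) {
        Set<String> reachable = new LinkedHashSet<>();
        Deque<String> worklist = new ArrayDeque<>();
        worklist.push(entry);
        while (!worklist.isEmpty()) {
            String method = worklist.pop();
            if (reachable.add(method)) {
                // First visit: schedule everything this method calls.
                worklist.addAll(callGraph.getOrDefault(method, List.of()));
            }
        }
        return reachable;
    }

    /** Call graph loosely modelled on FieldDCETest; nothing calls getField2. */
    static Set<String> demo() {
        Map<String, List<String>> callGraph = Map.of(
                "main", List.of("MyClass.<init>", "getField1", "getField3"),
                "getField2", List.of());
        return reachableFrom("main", callGraph);
    }

    public static void main(String[] args) {
        System.out.println("reachable: " + demo());
    }
}
```

Because getField2 never enters the reachable set, nothing reachable touches field2, which is the situation in which the builder can drop the field.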

In addition to static compilation, Native Image builder adds build-time initialization. As the name suggests, build-time initialization runs class initializers at build time to eliminate the need for that activity at run time. However, Native Image builder actually relies on the JVM it is running on to execute the static class initializers. Keep this detail in mind, as you'll see its implications later.

The builder also creates a heap snapshot containing the objects created as part of running the class initializers.
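
To make the effect of build-time initialization concrete, consider this minimal sketch. The class name BuildStamp is hypothetical. On a JVM, every launch runs the initializer again and prints a fresh value. If the class were initialized at build time (for example with --initialize-at-build-time=BuildStamp), the initializer would run inside the builder's JVM and the resulting value would be expected to end up in the heap snapshot, so every run of the executable would print the same number.

```java
/** Hypothetical class whose static state is fixed when its initializer runs. */
final class BuildStamp {
    // Assigned exactly once, when the class initializer runs.
    static final long INIT_NANOS = System.nanoTime();

    static long initNanos() {
        return INIT_NANOS;
    }

    public static void main(String[] args) {
        System.out.println("class initialized at nanoTime: " + initNanos());
    }
}
```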

Once the Native Image builder has identified all the reachable code and executed the class initializers, it compiles the reachable methods and generates a native executable.

Unsafe proves to be unsafe

Now let's see how the use of Unsafe can create really "unsafe" situations when used in a native image with build-time initialization. The following file, UnsafeTest.java, performs a typical operation that uses the Unsafe API to get the offset of field3 within the class:

  1 import sun.misc.Unsafe;
  2 import java.lang.reflect.Field;
  3 
  4 public class UnsafeTest {
  5         public int field1;
  6         public int field2;
  7         public int field3;
  8 
  9         static long field3Offset;
 10         static Unsafe unsafe;
 11 
 12         static {
 13                 try {
 14                         Field f = Unsafe.class.getDeclaredField("theUnsafe");
 15                         f.setAccessible(true);
 16                         unsafe = (Unsafe) f.get(null);
 17                         field3Offset = unsafe.objectFieldOffset(UnsafeTest.class.getField("field3"));
 18                 } catch (Exception e) {
 19                         throw new RuntimeException(e);
 20                 }
 21         }
 22         public static void main(String args[]) throws Exception {
 23                 System.out.println("field3Offset (from class initializer): " + field3Offset);
 24                 System.out.println("field3 offset: " + unsafe.objectFieldOffset(UnsafeTest.class.getField("field3")));
 25         }
 26 }

The class initializer block for UnsafeTest caches the offset of field3 using the Unsafe API at line 17. Line 24 in main() uses Unsafe again to get the offset of that field.

Let's build the native image for this example and run it:

$ javac UnsafeTest.java
$ native-image UnsafeTest

The output is:

$ ./unsafetest
field3Offset (from class initializer): 12
field3 offset: 12

By default, the native image runs class initializers for application classes at runtime, which can be verified using the -H:+PrintClassInitialization option when building the native image. This option prints a report indicating the type of class initialization and the reason behind it for each class. The following command generates a class initialization report for the previous example:

$ native-image -H:+PrintClassInitialization UnsafeTest

This generates a file with the name reports/class_initialization_report_<timestamp>.csv. For the previous example, the following entry can be found in the report:

UnsafeTest, RUN_TIME, classes are initialized at run time by default

Since the class initializer for UnsafeTest and UnsafeTest::main are both executed at run time, the offset of the field is the same whether it is computed at line 17 or at line 24.

Now let's ask the image builder to initialize the UnsafeTest class at build time using --initialize-at-build-time=UnsafeTest and run the test again:

$ native-image --initialize-at-build-time=UnsafeTest UnsafeTest

This time the output is:

$ ./unsafetest
field3Offset (from class initializer): 20
field3 offset: 12

Hmm, that's unexpected. Why did the build-time initialization of UnsafeTest result in a different offset for the same field in the class initializer? Let's dissect the image creation process.

At build time, the image builder interleaves points-to reachability analysis with the execution of class initializers, over multiple iterations. During this phase, the builder executes the class initializer of the UnsafeTest class, which computes the offset of field3. As mentioned previously, the class initializer is executed by the JVM running the image builder.

For the JVM, the shape of UnsafeTest instances looks like Figure 1. The JVM allocates 12 bytes for the object header and 4 bytes for each of the integer fields, including field1 and field2. Therefore, the offset of field3 computed in the class initializer and stored in the static UnsafeTest::field3Offset variable is 20.

Figure 1. The JVM includes the unused field1 and field2 when laying out UnsafeTest instances.

However, after the Native Image builder has completed the points-to analysis and executed the class initializers, it determines that field1 and field2 of UnsafeTest have never been read or written to and are essentially dead fields. We already saw this in the previous example of FieldDCETest.java, where field2 was removed from the class definition. So the image builder eliminates field1 and field2, thus changing the shape of the UnsafeTest instances in the native image to the structure shown in Figure 2.

Figure 2. The Native Image builder notices unused fields and removes them, leaving only field3.

This is the final shape of the UnsafeTest instances, which is used at run time to compute the offset of field3 within the class. Accordingly, field3's offset is computed as 12.

In short, the difference in the field offset computed at build time and run time is due to different views of the class held by the Native Image builder and the JVM.
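
The arithmetic behind the two offsets can be reproduced with a toy layout calculator. This is a sketch under simplifying assumptions: ToyLayout is a hypothetical helper, it uses the 12-byte header quoted above for both views, and it ignores alignment and field reordering, which real runtimes are free to apply.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Toy model: int fields are placed consecutively after a fixed-size header. */
final class ToyLayout {
    static final int HEADER_BYTES = 12; // header size quoted in the article
    static final int INT_BYTES = 4;

    /** Assigns consecutive offsets to the given int fields, in declaration order. */
    static Map<String, Long> offsets(String... intFields) {
        Map<String, Long> result = new LinkedHashMap<>();
        long next = HEADER_BYTES;
        for (String field : intFields) {
            result.put(field, next);
            next += INT_BYTES;
        }
        return result;
    }

    /** Offset of one field within the layout formed by the given fields. */
    static long offsetOf(String field, String... intFields) {
        return offsets(intFields).get(field);
    }

    public static void main(String[] args) {
        // Shape seen by the JVM running the builder: all three fields are present.
        System.out.println("JVM view:          " + offsetOf("field3", "field1", "field2", "field3"));
        // Shape in the native image: field1 and field2 were eliminated.
        System.out.println("native image view: " + offsetOf("field3", "field3"));
    }
}
```

With all three fields present, field3 lands at 12 + 4 + 4 = 20; with only field3 left, it lands directly after the header at 12.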

This example shows how the use of Unsafe in the context of a native image can compound problems for developers, in addition to the usual concerns of being an unsupported API. Build-time initialization and points-to analysis can create situations where Unsafe provides inconsistent results, thus becoming a source of subtle bugs in the application. These inconsistent results occur only in the native executable, so tests on the dynamic JVM wouldn't expose them.

How to fix unsafe offset computations

The GraalVM Native Image documentation mentions the problem with field offsets illustrated in the previous section and describes a couple of ways to work around it. We'll explore workarounds briefly before looking at a more reliable solution.

An automatic fix

One of the mechanisms employed by GraalVM Native Image is automatic detection of the code patterns that access Unsafe.objectFieldOffset(). This mechanism tracks the fields that store the field offsets and rewrites them according to the final class shape. Logic for this process is in the UnsafeAutomaticSubstitutionProcessor class. However, the automatic detection has a couple of constraints:

The argument passed to Unsafe.objectFieldOffset() should be a constant, so that static analysis is able to identify the field for which the offset is being computed.
The field in which the offset is being stored should be declared static final.

Our previous example doesn't conform to the second constraint, and therefore generates a warning message such as:

Warning: RecomputeFieldValue.FieldOffset automatic substitution failed. The automatic substitution registration was attempted because a call to sun.misc.Unsafe.objectFieldOffset(Field) was detected in the static initializer of UnsafeTest. Detailed failure reason(s): The field UnsafeTest.field3Offset, where the value produced by the field offset computation is stored, is not final.

To allow the image builder to automatically detect and handle the field offset for the previous example, all we need to do is declare field3Offset as static final. With this change, the offset in field3Offset is the same as the offset computed at run time in the main() method:

$ ./unsafetest
field3Offset (from class initializer): 12
field3 offset: 12
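
A version of the class that follows both constraints might look like the sketch below. The name UnsafeTestFinal is hypothetical (chosen to avoid clashing with the original UnsafeTest), and whether the automatic substitution actually fires depends on the GraalVM version in use, so watch the build output for the warning shown earlier.

```java
import java.lang.reflect.Field;
import sun.misc.Unsafe;

/** Variant of UnsafeTest whose offset field is static final (hypothetical name). */
final class UnsafeTestFinal {
    public int field1;
    public int field2;
    public int field3;

    static final Unsafe UNSAFE;
    // static final, so the builder's automatic detection can rewrite the stored offset.
    static final long FIELD3_OFFSET;

    static {
        try {
            Field f = Unsafe.class.getDeclaredField("theUnsafe");
            f.setAccessible(true);
            UNSAFE = (Unsafe) f.get(null);
            // The field is named by a constant, so static analysis can identify it.
            FIELD3_OFFSET = UNSAFE.objectFieldOffset(UnsafeTestFinal.class.getField("field3"));
        } catch (ReflectiveOperationException e) {
            throw new ExceptionInInitializerError(e);
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println("FIELD3_OFFSET (from class initializer): " + FIELD3_OFFSET);
        System.out.println("field3 offset: "
                + UNSAFE.objectFieldOffset(UnsafeTestFinal.class.getField("field3")));
    }
}
```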

Substitution and annotations

In real-world use cases, you might not be able to change a field declaration as easily: the code might come from a third-party library, for instance, or the field might genuinely not be final. In that case automatic detection fails, and the image builder needs some hand-holding to correctly recompute the fields that store field offsets. This is done using the RecomputeFieldValue annotation, which depends on another powerful feature of GraalVM Native Image: substitution.

Substitution allows you to replace parts of the target code with code that you provide. The main use case for this feature is to handle JDK or third-party code that trips over some of the constraints imposed by the Native Image builder. Because the code cannot be modified (unless you are ready to maintain your own fork of the source code of the library or JDK), you can use substitutions to provide compatible code.

Let's see this in action with our example, assuming that UnsafeTest.field3Offset cannot be declared final. To mark this field with the RecomputeFieldValue annotation, you need to add a substitution class for UnsafeTest. We call this class Target_UnsafeTest:

@TargetClass(UnsafeTest.class)
public final class Target_UnsafeTest {
    // user code here
}

Target_UnsafeTest is annotated with TargetClass, which specifies the original class it is replacing, in this case UnsafeTest. The purpose of this substitution class is to mark the field UnsafeTest::field3Offset with the RecomputeFieldValue annotation. So we add a field with the same name and type and annotate it as Alias. In addition, we annotate the field with RecomputeFieldValue and specify the kind of recomputation as TranslateFieldOffset:

@Alias @RecomputeFieldValue(kind = Kind.TranslateFieldOffset)
static long field3Offset;

These annotations tell the Native Image builder that the field field3Offset in class UnsafeTest holds a field offset and needs to be recomputed according to the final class shape, using the same field as before.

The complete code for Target_UnsafeTest is:

import com.oracle.svm.core.annotate.RecomputeFieldValue;
import com.oracle.svm.core.annotate.RecomputeFieldValue.Kind;
import com.oracle.svm.core.annotate.TargetClass;
import com.oracle.svm.core.annotate.Alias;

@TargetClass(UnsafeTest.class)
public final class Target_UnsafeTest {
        /* UnsafeTest::field3Offset stores the field offset. Annotate it for recomputation.
         * Recomputation is of type TranslateFieldOffset.
         */
        @Alias @RecomputeFieldValue(kind = Kind.TranslateFieldOffset)
        static long field3Offset;
}

The RecomputeFieldValue annotation supports many other kinds of recomputation for different scenarios. For example, instead of TranslateFieldOffset, we can use FieldOffset by explicitly specifying the class and name of the field for which the offset is to be stored, as in:

@Alias @RecomputeFieldValue(kind = Kind.FieldOffset, declClassName="UnsafeTest", name="field3")
static long field3Offset;

If your application is using third-party libraries that use Unsafe APIs that can cause the issues discussed in this article, substitution is the only way to make such code compatible with Native Image builder.

However, if you are able to modify such code, then you should avoid using Unsafe APIs as much as possible. How to do that is the topic for the next section.

VarHandles

Variable handles were added in Java 9 to provide safe and supported alternatives to some of the Unsafe APIs, with the aim of helping Java developers move away from Unsafe. A VarHandle provides read and write access to instance fields, static fields, and array elements under various access modes. A comprehensive explanation of VarHandles can be found in JEP 193, the JDK Enhancement Proposal that introduced them.
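
As a quick illustration of those three kinds of variables and a few access modes, here is a minimal sketch using only the standard java.lang.invoke API. The class VarHandleModes and its fields are hypothetical names made up for this example.

```java
import java.lang.invoke.MethodHandles;
import java.lang.invoke.VarHandle;

/** Hypothetical demo of VarHandles for an instance field, a static field, and an array. */
final class VarHandleModes {
    int instanceField;
    static int staticField;

    static final VarHandle INSTANCE;
    static final VarHandle STATIC;
    static final VarHandle ARRAY = MethodHandles.arrayElementVarHandle(int[].class);

    static {
        try {
            MethodHandles.Lookup lookup = MethodHandles.lookup();
            INSTANCE = lookup.findVarHandle(VarHandleModes.class, "instanceField", int.class);
            STATIC = lookup.findStaticVarHandle(VarHandleModes.class, "staticField", int.class);
        } catch (ReflectiveOperationException e) {
            throw new ExceptionInInitializerError(e);
        }
    }

    /** Writes through each handle with a different access mode, then sums what it reads back. */
    static int demo() {
        VarHandleModes obj = new VarHandleModes();
        int[] values = new int[4];

        INSTANCE.set(obj, 1);            // plain write to obj.instanceField
        STATIC.setVolatile(2);           // volatile write to staticField
        ARRAY.setRelease(values, 3, 4);  // release write to values[3]

        return (int) INSTANCE.get(obj)
                + (int) STATIC.getVolatile()
                + (int) ARRAY.getAcquire(values, 3);
    }

    public static void main(String[] args) {
        System.out.println("sum read back through VarHandles: " + demo());
    }
}
```

Note that no offsets appear anywhere in this code; the handle itself identifies the variable.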

What is relevant in the current context is that VarHandles in GraalVM Native Image are implemented using the Unsafe APIs to get the offset of the fields and to access the fields using these offsets.

Doesn't that mean VarHandles would suffer from the same problems that we saw earlier when using Unsafe to get field offsets at build time? They don't, because the JDK implementation for VarHandles has a fixed set of classes that hold the field offsets in particular fields, and GraalVM Native Image has annotated such fields with the RecomputeFieldValue annotation using the substitution mechanism that you saw in the previous section.

Let's write a program that accesses the field of an object. We will create two versions of this program: one using Unsafe and one using VarHandles to retrieve the value of a field. The first file is named FieldAccessTestUnsafe.java:

import sun.misc.Unsafe;
import java.lang.reflect.Field;

class FieldAccessor {
        public static Object getFieldValue(Object obj, long offset) {
                return UnsafeAccessor.unsafe.getInt(obj, offset);
        }
}

class UnsafeAccessor {
        public static Unsafe unsafe;

        static {
                try {
                        Field theUnsafe = Unsafe.class.getDeclaredField("theUnsafe");
                        theUnsafe.setAccessible(true);
                        unsafe = (Unsafe) theUnsafe.get(null);
                } catch (Exception e) {
                        throw new RuntimeException(e);
                }
        }
}

public class FieldAccessTestUnsafe {
        public int field1;
        public int field2;
        public int field3;

        static long field3Offset;

        FieldAccessTestUnsafe() {
                field3 = 100;
        }

        static {
                try {
                        field3Offset = UnsafeAccessor.unsafe.objectFieldOffset(FieldAccessTestUnsafe.class.getField("field3"));
                } catch (Exception e) {
                        throw new RuntimeException(e);
                }
        }

        public static void main(String args[]) throws Exception {
                FieldAccessTestUnsafe unsafeTest = new FieldAccessTestUnsafe();
                System.out.println("field3: " + unsafeTest.field3);
                System.out.println("field value using precomputed offset: " + FieldAccessor.getFieldValue(unsafeTest, field3Offset));
        }
}

The FieldAccessTestUnsafe class is similar to the UnsafeTest class from the previous example. Its class initializer computes field3Offset through the UnsafeAccessor helper class, and main() then passes that offset to the FieldAccessor helper class, which uses the Unsafe API to read the value of field3.

Compile this class and create a native image by initializing FieldAccessTestUnsafe and UnsafeAccessor at build time using the following commands:

$ javac FieldAccessTestUnsafe.java
$ native-image --initialize-at-build-time=FieldAccessTestUnsafe,UnsafeAccessor FieldAccessTestUnsafe fieldaccesstestunsafe

Running the native image generates the following output:

$ ./fieldaccesstestunsafe 
field3: 100
field value using precomputed offset: 0

The field value obtained using the offset computed during build time is clearly incorrect. As mentioned in the previous section, to get the correct value at run time, we would need to create a substitution class to recompute the field offset stored in field3Offset.

Now we will rewrite the previous example using the VarHandle API. The file is named FieldAccessTestVarHandle.java:

  1 import java.lang.invoke.MethodHandles;
  2 import java.lang.invoke.VarHandle;
  3 
  4 class FieldAccessor {
  5         public static Object getFieldValue(Object obj, VarHandle fieldHandle) {
  6                 return fieldHandle.get(obj);
  7         }
  8 }
  9 
 10 public class FieldAccessTestVarHandle {
 11         public int field1;
 12         public int field2;
 13         public int field3;
 14 
 15         private static final VarHandle field3Handle;
 16 
 17         FieldAccessTestVarHandle() {
 18                 field3 = 100;
 19         }
 20 
 21         static {
 22                 try {
 23                         field3Handle = MethodHandles.lookup().findVarHandle(FieldAccessTestVarHandle.class, "field3", int.class);
 24                 } catch (Exception e) {
 25                         throw new RuntimeException(e);
 26                 }
 27         }
 28 
 29         public static void main(String args[]) throws Exception {
 30                 FieldAccessTestVarHandle unsafeTest = new FieldAccessTestVarHandle();
 31                 System.out.println("field3: " + unsafeTest.field3);
 32                 System.out.println("field value using varhandle: " + FieldAccessor.getFieldValue(unsafeTest, field3Handle));
 33         }
 34 }

Here, we have updated FieldAccessor::getFieldValue() to use a VarHandle. Notice that on line 23, in the class initializer block of FieldAccessTestVarHandle, we now create a VarHandle for field3 instead of computing an offset.

Compile this class and create a native image by initializing FieldAccessTestVarHandle at build time using the following commands:

$ javac FieldAccessTestVarHandle.java
$ native-image --initialize-at-build-time=FieldAccessTestVarHandle FieldAccessTestVarHandle fieldaccesstestvarhandles

Running the native image generates the following output:

$ ./fieldaccesstestvarhandles 
field3: 100
field value using varhandle: 100

This time the value obtained using VarHandles is correct.

This example demonstrates how VarHandles can help developers avoid the need to use GraalVM's substitution mechanism to recompute field offsets. This approach is much cleaner and more easily maintained than adding substitution classes.
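
The same approach extends beyond plain reads. Code that previously used an offset with a call such as Unsafe.compareAndSwapInt can usually express the atomic update through a VarHandle instead. The following is a minimal sketch of that idea; VarHandleCounter is a hypothetical class, not taken from any library.

```java
import java.lang.invoke.MethodHandles;
import java.lang.invoke.VarHandle;

/** Hypothetical lock-free counter built on VarHandle.compareAndSet, with no field offsets. */
final class VarHandleCounter {
    private volatile int count;

    private static final VarHandle COUNT;

    static {
        try {
            COUNT = MethodHandles.lookup().findVarHandle(VarHandleCounter.class, "count", int.class);
        } catch (ReflectiveOperationException e) {
            throw new ExceptionInInitializerError(e);
        }
    }

    /** Atomically adds one and returns the new value, retrying if another thread interfered. */
    int increment() {
        int current;
        do {
            current = (int) COUNT.getVolatile(this);
        } while (!COUNT.compareAndSet(this, current, current + 1));
        return current + 1;
    }

    int get() {
        return (int) COUNT.getVolatile(this);
    }

    /** Increments a fresh counter the given number of times and returns the final value. */
    static int countTo(int times) {
        VarHandleCounter counter = new VarHandleCounter();
        for (int i = 0; i < times; i++) {
            counter.increment();
        }
        return counter.get();
    }

    public static void main(String[] args) {
        System.out.println("count after 1000 increments: " + countTo(1000));
    }
}
```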

Conclusion

We looked at how the use of Unsafe APIs can result in potential problems when using GraalVM Native Image builder. We also looked at how the image builder tries to address the issues by identifying common patterns of accessing field offsets using Unsafe. However, if the application is using Unsafe in complex patterns, manual intervention is required in the form of substitution classes and annotations. All these problems can be avoided if the application or library is rewritten to use VarHandles.

Last updated: February 5, 2024
