The infamous sun.misc.Unsafe explained

The biggest competitor to the Java virtual machine might be Microsoft's CLR, which hosts languages such as C#. The CLR allows you to write unsafe code as an entry gate to low-level programming, something that is hard to achieve on the JVM. If you need such advanced functionality in Java, you might be forced to use the JNI, which requires you to know some C and quickly leads to code that is tightly coupled to a specific platform. With sun.misc.Unsafe, there is however another alternative for low-level programming on the Java platform using a Java API, even though this alternative is discouraged. Nevertheless, several applications rely on sun.misc.Unsafe, for example objenesis and with it all libraries that build on it, such as kryo, which is in turn used by, for example, Twitter's Storm. Therefore, it is time to have a look, especially since the functionality of sun.misc.Unsafe is being considered for inclusion in Java's public API in Java 9.

Getting hold of an instance of sun.misc.Unsafe


The sun.misc.Unsafe class is intended to be used only by core Java classes, which is why its authors made its only constructor private and added an equally private singleton instance. The public getter for this instance performs a security check in order to prevent its public use:

public static Unsafe getUnsafe() {
  Class cc = sun.reflect.Reflection.getCallerClass(2);
  if (cc.getClassLoader() != null)
    throw new SecurityException("Unsafe");
  return theUnsafe;
}

This method first looks up the calling Class from the current thread's method stack. This lookup is implemented by another internal class named sun.reflect.Reflection, which basically walks down the given number of call stack frames and then returns the class that defines the method found at that frame. This security check is however likely to change in a future version. When walking the stack, the first class found (index 0) will obviously be the Reflection class itself, and the second (index 1) will be the Unsafe class, such that index 2 holds your application class that was calling Unsafe#getUnsafe().

This looked-up class is then checked for its ClassLoader, where a null reference is used to represent the bootstrap class loader on a HotSpot virtual machine. (This is documented in Class#getClassLoader(), where it says that “some implementations may use null to represent the bootstrap class loader”.) Since no non-core Java class is normally ever loaded with this class loader, you will never be able to call this method directly but will receive a SecurityException instead. (Technically, you could force the VM to load your application classes using the bootstrap class loader by adding them to the -Xbootclasspath, but this would require some setup outside of your application code which you might want to avoid.) Thus, the following test will succeed:

@Test(expected = SecurityException.class)
public void testSingletonGetter() throws Exception {
  Unsafe.getUnsafe();
}

However, the security check is poorly designed and should be seen as a warning against the singleton anti-pattern. As long as the use of reflection is not prohibited (which is hard since it is so widely used in many frameworks), you can always get hold of an instance by inspecting the private members of the class. From the Unsafe class's source code, you can learn that the singleton instance is stored in a private static field called theUnsafe. This is at least true for the HotSpot virtual machine. Unfortunately for us, other virtual machine implementations sometimes use other names for this field. Android's Unsafe class, for example, stores its singleton instance in a field called THE_ONE. This makes it hard to provide a “compatible” way of obtaining the instance. However, since we already left the safe territory of compatibility by using the Unsafe class, we should not worry about this more than we should worry about using the class at all. To get hold of the singleton instance, you simply read the singleton field's value:

Field theUnsafe = Unsafe.class.getDeclaredField("theUnsafe");
theUnsafe.setAccessible(true);
Unsafe unsafe = (Unsafe) theUnsafe.get(null); // static field, so no target instance

Alternatively, you can invoke the private constructor. I personally prefer this way since it works, for example, with Android, while extracting the field does not:

Constructor<Unsafe> unsafeConstructor = Unsafe.class.getDeclaredConstructor();
unsafeConstructor.setAccessible(true);
Unsafe unsafe = unsafeConstructor.newInstance();

The price you pay for this minor compatibility advantage is a minimal amount of heap space. The security checks performed when using reflection on fields or constructors are however similar.
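
The two lookups above can also be combined into a single helper. The following is only a minimal sketch: the class name UnsafeAccess and its method get are made up for illustration, and the only field names it tries are the two mentioned above (theUnsafe and THE_ONE). If neither field is found, it falls back to the private constructor.

```java
import java.lang.reflect.Constructor;
import java.lang.reflect.Field;
import sun.misc.Unsafe;

// Hypothetical helper, not part of any library.
final class UnsafeAccess {

  private UnsafeAccess() { }

  static Unsafe get() {
    // Field names mentioned in the text: HotSpot first, then Android.
    for (String name : new String[] {"theUnsafe", "THE_ONE"}) {
      try {
        Field field = Unsafe.class.getDeclaredField(name);
        field.setAccessible(true);
        Object value = field.get(null);
        if (value instanceof Unsafe) {
          return (Unsafe) value;
        }
      } catch (ReflectiveOperationException | RuntimeException e) {
        // Field missing or inaccessible on this VM: try the next option.
      }
    }
    try {
      // Fallback: create a fresh instance through the private constructor.
      Constructor<Unsafe> constructor = Unsafe.class.getDeclaredConstructor();
      constructor.setAccessible(true);
      return constructor.newInstance();
    } catch (ReflectiveOperationException | RuntimeException e) {
      throw new IllegalStateException("sun.misc.Unsafe is not available", e);
    }
  }
}
```

A call such as UnsafeAccess.get().addressSize() then reports the pointer width in bytes on the running VM.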

 

Create an instance of a class without calling a constructor


The first time I made use of the Unsafe class was for creating an instance of a class without calling any of the class's constructors. I needed to proxy an entire class which only had a rather noisy constructor, but I only wanted to delegate all method invocations to a real instance which I did not know at the time of construction. Creating a subclass was easy, and if the class had been represented by an interface, creating a proxy would have been a straightforward task. With the expensive constructor, I was however stuck. By using the Unsafe class, I was able to work my way around it. Consider a class with an artificially expensive constructor:

class ClassWithExpensiveConstructor {

  private final int value;

  private ClassWithExpensiveConstructor() {
    value = doExpensiveLookup();
  }

  private int doExpensiveLookup() {
    try {
      Thread.sleep(2000); // simulates the expensive part
    } catch (InterruptedException e) {
      e.printStackTrace();
    }
    return 1;
  }

  public int getValue() {
    return value;
  }
}

Using Unsafe, we can create an instance of ClassWithExpensiveConstructor (or any of its subclasses) without having to invoke the above constructor, simply by allocating an instance directly on the heap:

@Test
public void testObjectCreation() throws Exception {
  ClassWithExpensiveConstructor instance = (ClassWithExpensiveConstructor)
      unsafe.allocateInstance(ClassWithExpensiveConstructor.class);
  assertEquals(0, instance.getValue()); // the constructor never ran
}

Note that the final field was not initialized by the constructor but holds its type's default value. Other than that, the constructed instance behaves like a normal Java object. It will, for example, be garbage collected when it becomes unreachable.
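
The same holds for field initializers, which the Java compiler moves into the constructor. Here is a small sketch of that effect, assuming a HotSpot-style VM where the singleton is stored in theUnsafe; the classes Greeter and AllocateInstanceDemo are made up for illustration:

```java
import java.lang.reflect.Field;
import sun.misc.Unsafe;

// Hypothetical example class: both fields are assigned by initializers,
// which the compiler moves into the constructor.
class Greeter {
  int count = 42;
  String greeting = "hello";
}

final class AllocateInstanceDemo {

  // Reads the singleton from the theUnsafe field (HotSpot naming assumed).
  static Unsafe unsafe() {
    try {
      Field field = Unsafe.class.getDeclaredField("theUnsafe");
      field.setAccessible(true);
      return (Unsafe) field.get(null);
    } catch (ReflectiveOperationException e) {
      throw new IllegalStateException(e);
    }
  }

  // Allocates a Greeter without running its constructor or initializers.
  static Greeter allocateWithoutConstructor() {
    try {
      return (Greeter) unsafe().allocateInstance(Greeter.class);
    } catch (InstantiationException e) {
      throw new IllegalStateException(e);
    }
  }
}
```

An instance obtained this way has count equal to 0 and greeting equal to null, whereas new Greeter() yields 42 and "hello".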

The Java runtime itself creates objects without calling a constructor, for example when creating objects for deserialization. Therefore, the ReflectionFactory offers even more control over individual object creation:

@Test
public void testReflectionFactory() throws Exception {
  @SuppressWarnings("unchecked")
  Constructor<ClassWithExpensiveConstructor> silentConstructor = ReflectionFactory.getReflectionFactory()
      .newConstructorForSerialization(ClassWithExpensiveConstructor.class, Object.class.getConstructor());
  silentConstructor.setAccessible(true);
  // Object's constructor sets no fields, so value keeps its default of 0.
  assertEquals(0, silentConstructor.newInstance().getValue());
}

Note that the ReflectionFactory class only requires a RuntimePermission called reflectionFactoryAccess for obtaining its singleton instance, so no reflection is required here. The obtained instance of ReflectionFactory allows you to define any constructor to become a constructor for the given type. In the example above, I used the default constructor of java.lang.Object for this purpose. You can however use any constructor:

class OtherClass {

  private final int value;
  private final int unknownValue;

  private OtherClass() {
    System.out.println("test");
    this.value = 10;
    this.unknownValue = 20;
  }
}

@Test
public void testStrangeReflectionFactory() throws Exception {
  @SuppressWarnings("unchecked")
  Constructor<ClassWithExpensiveConstructor> silentConstructor = ReflectionFactory.getReflectionFactory()
      .newConstructorForSerialization(ClassWithExpensiveConstructor.class,
          OtherClass.class.getDeclaredConstructor());
  silentConstructor.setAccessible(true);
  ClassWithExpensiveConstructor instance = silentConstructor.newInstance();
  assertEquals(10, instance.getValue());
  assertEquals(ClassWithExpensiveConstructor.class, instance.getClass());
  assertEquals(Object.class, instance.getClass().getSuperclass());
}

Note that value was set by this constructor even though the constructor of a completely different class was invoked. Fields that do not exist in the target class are however ignored, as is also obvious from the above example. Note that OtherClass does not become part of the constructed instance's type hierarchy; the OtherClass constructor is simply borrowed for the "serialized" type.

Not mentioned in this blog entry are other methods such as Unsafe#defineClass, Unsafe#defineAnonymousClass or Unsafe#ensureClassInitialized. Similar functionality is however also defined in the public API's ClassLoader.

Native memory allocation


Did you ever want to allocate an array in Java that should have had more than Integer.MAX_VALUE entries? Probably not, because this is not a common task, but if you ever need this functionality, it is possible. You can create such an array by allocating native memory. Native memory allocation is used, for example, by direct byte buffers that are offered in Java's NIO packages. Unlike heap memory, native memory is not part of the heap area and can be used non-exclusively, for example for communicating with other processes. As a result, Java's heap space is in competition with the native space: the more memory you assign to the JVM's heap, the less native memory is left.

Let us look at an example of using native (off-heap) memory in Java by creating the mentioned oversized array:

class DirectIntArray {

  private final static long INT_SIZE_IN_BYTES = 4;

  private final long startIndex;

  public DirectIntArray(long size) {
    startIndex = unsafe.allocateMemory(size * INT_SIZE_IN_BYTES);
    // Native memory is not zeroed on allocation, so clear it explicitly.
    unsafe.setMemory(startIndex, size * INT_SIZE_IN_BYTES, (byte) 0);
  }

  public void setValue(long index, int value) {
    unsafe.putInt(index(index), value);
  }

  public int getValue(long index) {
    return unsafe.getInt(index(index));
  }

  // Translates an array index into an absolute native address.
  private long index(long offset) {
    return startIndex + offset * INT_SIZE_IN_BYTES;
  }

  public void destroy() {
    unsafe.freeMemory(startIndex);
  }
}

@Test
public void testDirectIntArray() throws Exception {
  long maximum = Integer.MAX_VALUE + 1L;
  DirectIntArray directIntArray = new DirectIntArray(maximum);
  directIntArray.setValue(0L, 10);
  // The last valid index is maximum - 1; writing at maximum would be out of bounds.
  directIntArray.setValue(maximum - 1, 20);
  assertEquals(10, directIntArray.getValue(0L));
  assertEquals(20, directIntArray.getValue(maximum - 1));
  directIntArray.destroy();
}

First, make sure that your machine has sufficient memory for running this example! You need at least (2147483647 + 1) * 4 bytes = 8192 MB of native memory for running the code. If you have worked with other programming languages such as C, direct memory allocation is something you do every day. By calling Unsafe#allocateMemory(long), the virtual machine allocates the requested amount of native memory for you. After that, it is your responsibility to handle this memory correctly.
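
The memory figure can be double-checked with a few lines of arithmetic. This sketch relies only on Integer.BYTES from the standard library (available since Java 8); the class and method names are illustrative and not part of the original examples:

```java
// Hypothetical helper for the size calculation in the text.
final class NativeArraySize {

  private NativeArraySize() { }

  // Bytes needed for an int array with the given number of entries.
  static long bytesFor(long entries) {
    return entries * Integer.BYTES;
  }

  // Converts a byte count to megabytes, counting 1 MB as 1024 * 1024 bytes.
  static long toMegabytes(long bytes) {
    return bytes / (1024L * 1024L);
  }
}
```

Note the 1L in Integer.MAX_VALUE + 1L: with plain int arithmetic, the sum would overflow to Integer.MIN_VALUE before it is ever multiplied.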

The amount of memory that is required for storing a specific value depends on the type's size. In the above example, I used an int type which represents a 32-bit integer. Consequently, a single int value consumes 4 bytes. For primitive types, size is well-documented. It is however more complex to compute the size of object types since it depends on the number of non-static fields that are declared anywhere in the type hierarchy. The most canonical way of computing an object's size is using the Instrumentation interface of the java.lang.instrument package (handed to Java agents, which can for example be loaded through the attach API), which offers a dedicated method for this purpose called getObjectSize. I will however evaluate another (hacky) way of dealing with objects at the end of this section.

Be aware that directly allocated memory is always native memory and therefore not garbage collected. You therefore have to free memory explicitly, as demonstrated in the above example by a call to Unsafe#freeMemory(long). Otherwise you have reserved some memory that can never be used for something else as long as the JVM instance is running, which is a memory leak and a common problem in non-garbage-collected languages. Alternatively, you can also directly reallocate memory at a certain address by calling Unsafe#reallocateMemory(long, long), where the second argument describes the new number of bytes to be reserved. The method returns the address of the resized block, which may differ from the address that was passed in.

Also, note that the directly allocated memory is not initialized with a certain value. In general, you will find garbage from old usages of this memory area, such that you have to explicitly initialize your allocated memory if you require a default value. This is something that is normally done for you when you let the Java runtime allocate the memory. In the above example, the entire area is overwritten with zeros with the help of the Unsafe#setMemory method.
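
One way to make freeing and zeroing harder to forget is to wrap the allocation in an AutoCloseable, so that try-with-resources releases it. The following is a sketch under assumptions: NativeIntBuffer is a made-up name, the singleton lookup assumes HotSpot's theUnsafe field, and the index check is an addition of this sketch, not something Unsafe does for you.

```java
import java.lang.reflect.Field;
import sun.misc.Unsafe;

// Hypothetical wrapper around a zeroed block of native ints.
final class NativeIntBuffer implements AutoCloseable {

  private static final Unsafe UNSAFE = lookup();

  private final long address;
  private final long length;
  private boolean freed;

  NativeIntBuffer(long length) {
    this.length = length;
    long bytes = length * Integer.BYTES;
    address = UNSAFE.allocateMemory(bytes);
    // Freshly allocated native memory may contain garbage: zero it.
    UNSAFE.setMemory(address, bytes, (byte) 0);
  }

  int get(long index) {
    return UNSAFE.getInt(addressOf(index));
  }

  void set(long index, int value) {
    UNSAFE.putInt(addressOf(index), value);
  }

  // Validates the index before computing the absolute address.
  private long addressOf(long index) {
    if (freed) {
      throw new IllegalStateException("memory already freed");
    }
    if (index < 0 || index >= length) {
      throw new IndexOutOfBoundsException("index " + index);
    }
    return address + index * Integer.BYTES;
  }

  @Override
  public void close() {
    if (!freed) {
      freed = true;
      UNSAFE.freeMemory(address);
    }
  }

  // Reads the singleton from the theUnsafe field (HotSpot naming assumed).
  private static Unsafe lookup() {
    try {
      Field field = Unsafe.class.getDeclaredField("theUnsafe");
      field.setAccessible(true);
      return (Unsafe) field.get(null);
    } catch (ReflectiveOperationException e) {
      throw new IllegalStateException(e);
    }
  }
}
```

Used as try (NativeIntBuffer buffer = new NativeIntBuffer(4)) { ... }, the memory is released when the block exits, even if an exception is thrown inside it.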

When using directly allocated memory, the JVM will not perform range checks for you either. It is therefore possible to corrupt your memory, as this example shows:

@Test
public void testMaliciousAllocation() throws Exception {
  long address = unsafe.allocateMemory(2L * 4);
  unsafe.setMemory(address, 8L, (byte) 0);
  assertEquals(0, unsafe.getInt(address));
  assertEquals(0, unsafe.getInt(address + 4));
  // Write 4 bytes at an offset of 1 byte, straddling both int slots.
  unsafe.putInt(address + 1, 0xffffffff);
  assertEquals(0xffffff00, unsafe.getInt(address));
  assertEquals(0x000000ff, unsafe.getInt(address + 4));
}

Note that we wrote a value into space that was partly reserved for the first number and partly for the second. This picture might clear things up. Be aware that the values in memory run from "right to left" (little-endian byte order), but this is machine dependent.

The first row shows the initial state after writing zeros to the entire allocated native memory area. Then we overwrite 4 bytes at an offset of a single byte with 32 ones. The last row shows the result of this write operation.
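
The byte-order dependence can be made visible without native memory. The sketch below swaps in java.nio.ByteBuffer for Unsafe so that the byte order can be chosen explicitly instead of being dictated by the machine; the class name OverlappingWrite is made up for illustration.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Hypothetical demo: the same overlapping write on an 8-byte heap buffer.
final class OverlappingWrite {

  private OverlappingWrite() { }

  // Writes 32 ones at byte offset 1 and returns the two ints
  // read back at byte offsets 0 and 4.
  static int[] writeAtOffsetOne(ByteOrder order) {
    ByteBuffer buffer = ByteBuffer.allocate(8).order(order); // zero-filled
    buffer.putInt(1, 0xffffffff);                            // covers bytes 1..4
    return new int[] {buffer.getInt(0), buffer.getInt(4)};
  }
}
```

With ByteOrder.LITTLE_ENDIAN the result is 0xffffff00 and 0x000000ff, matching the test above; with ByteOrder.BIG_ENDIAN the same write reads back as 0x00ffffff and 0xff000000.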

Finally, we want to write an entire object into native memory. As mentioned above, this is a difficult task since we first need to compute the size of the object in order to know how much memory we need to reserve. The Unsafe class does however not offer such functionality, at least not directly: we can use the Unsafe class to find the offset of an instance's field, which is used by the JVM when it allocates objects on the heap itself. This allows us to find the approximate size of an object:

public long sizeOf(Class<?> clazz) {
  long maximumOffset = 0;
  do {
    for (Field f : clazz.getDeclaredFields()) {
      if (!Modifier.isStatic(f.getModifiers())) {
        maximumOffset = Math.max(maximumOffset, unsafe.objectFieldOffset(f));
      }
    }
  } while ((clazz = clazz.getSuperclass()) != null);
  // Add 8 bytes for the last field: no field is wider than a long or double.
  return maximumOffset + 8;
}

This might at first look cryptic, but there is no big secret behind this code. We simply iterate over all non-static fields that are declared in the class itself or in any of its superclasses. We do not have to worry about interfaces since those cannot define instance fields and will therefore never alter an object's memory layout. Each of these fields has an offset which represents the first byte that is occupied by this field's value when the JVM stores an instance of this type in memory, relative to the first byte that is used for this object. We simply have to find the maximum offset in order to find the space that is required for all fields but the last field. Since a field will never occupy more than 64 bits (8 bytes), as for a long or double value or for an object reference when run on a 64-bit machine, we have at least found an upper bound for the space that is used to store an object. Therefore, we simply add these 8 bytes to the maximum offset and do not run into danger of having reserved too little space. This idea of course wastes some bytes, and a better algorithm should be used for production code.
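
One possible refinement is to add the actual width of each field instead of a flat 8 bytes and then round up to an alignment boundary. The sketch below is still only an estimate and rests on assumptions that are not part of the original code: objects are assumed to be aligned to 8 bytes, the reference width is taken from Unsafe#arrayIndexScale for Object[], and the names ObjectSizeEstimate, widthOf and IntLongPair are made up. A class without instance fields yields 0 because the header is never measured directly.

```java
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;
import sun.misc.Unsafe;

// Hypothetical example type with an int and a long field.
class IntLongPair {
  int first;
  long second;
}

final class ObjectSizeEstimate {

  private static final Unsafe UNSAFE = lookup();

  // Estimates the shallow size: end of the last field, rounded up
  // to a multiple of 8 (an assumed alignment, not a guarantee).
  static long sizeOf(Class<?> type) {
    long end = 0;
    for (Class<?> c = type; c != null; c = c.getSuperclass()) {
      for (Field f : c.getDeclaredFields()) {
        if (!Modifier.isStatic(f.getModifiers())) {
          long fieldEnd = UNSAFE.objectFieldOffset(f) + widthOf(f.getType());
          end = Math.max(end, fieldEnd);
        }
      }
    }
    return (end + 7) / 8 * 8;
  }

  // Width in bytes of a field of the given type.
  static long widthOf(Class<?> fieldType) {
    if (fieldType == long.class || fieldType == double.class) return 8;
    if (fieldType == int.class || fieldType == float.class) return 4;
    if (fieldType == short.class || fieldType == char.class) return 2;
    if (fieldType == byte.class || fieldType == boolean.class) return 1;
    // References: use the element scale of Object[] as the reference width.
    return UNSAFE.arrayIndexScale(Object[].class);
  }

  // Reads the singleton from the theUnsafe field (HotSpot naming assumed).
  private static Unsafe lookup() {
    try {
      Field field = Unsafe.class.getDeclaredField("theUnsafe");
      field.setAccessible(true);
      return (Unsafe) field.get(null);
    } catch (ReflectiveOperationException e) {
      throw new IllegalStateException(e);
    }
  }
}
```

The result is never larger than what the flat "maximum offset plus 8" rule would give after the same rounding, but it remains VM dependent.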

In this context, it is best to think of a class definition as a form of heterogeneous array. Note that the minimum field offset is not 0 but a positive value. The first few bytes contain meta information. The graphic below visualizes this principle for an example object with an int and a long field where both fields have an offset. Note that we do not normally write meta information when writing a copy of an object into native memory, so we could further reduce the amount of used native memory. Also note that this memory layout might be highly dependent on the implementation of the Java virtual machine.


With this overly careful estimate, we can now implement some stub methods for writing shallow copies of objects directly into native memory. Note that native memory does not really know the concept of an object. We are basically just setting a given number of bytes to values that reflect an object's current values. As long as we remember the memory layout for this type, these bytes contain however enough information to reconstruct this object.

public void place(Object o, long address) throws Exception {
  Class clazz = o.getClass();
  do {
    for (Field f : clazz.getDeclaredFields()) {
      if (!Modifier.isStatic(f.getModifiers())) {
        long offset = unsafe.objectFieldOffset(f);
        if (f.getType() == long.class) {
          unsafe.putLong(address + offset, unsafe.getLong(o, offset));
        } else if (f.getType() == int.class) {
          unsafe.putInt(address + offset, unsafe.getInt(o, offset));
        } else {
          throw new UnsupportedOperationException();
        }
      }
    }
  } while ((clazz = clazz.getSuperclass()) != null);
}

public Object read(Class clazz, long address) throws Exception {
  Object instance = unsafe.allocateInstance(clazz);
  do {
    for (Field f : clazz.getDeclaredFields()) {
      if (!Modifier.isStatic(f.getModifiers())) {
        long offset = unsafe.objectFieldOffset(f);
        if (f.getType() == long.class) {
          unsafe.putLong(instance, offset, unsafe.getLong(address + offset));
        } else if (f.getType() == int.class) {
          // putInt, not putLong: an 8-byte write would spill into the next field.
          unsafe.putInt(instance, offset, unsafe.getInt(address + offset));
        } else {
          throw new UnsupportedOperationException();
        }
      }
    }
  } while ((clazz = clazz.getSuperclass()) != null);
  return instance;
}

// Container is assumed to declare an int and a long field and to implement equals().
@Test
public void testObjectAllocation() throws Exception {
  long containerSize = sizeOf(Container.class);
  // Reserve room for two objects, since c2 is placed right behind c1.
  long address = unsafe.allocateMemory(containerSize * 2);
  Container c1 = new Container(10, 1000L);
  Container c2 = new Container(5, -10L);
  place(c1, address);
  place(c2, address + containerSize);
  Container newC1 = (Container) read(Container.class, address);
  Container newC2 = (Container) read(Container.class, address + containerSize);
  assertEquals(c1, newC1);
  assertEquals(c2, newC2);
}

Note that these stub methods for writing and reading objects in native memory only support int and long field values. Of course, Unsafe supports all primitive values and can even write values without hitting thread-local caches by using the volatile forms of the methods. The stubs were only used to keep the examples concise. Be aware that these "instances" would never get garbage collected since their memory was allocated directly. (But maybe this is what you want.) Also, be careful when precalculating sizes since an object's memory layout might be VM dependent and might also differ between a 64-bit and a 32-bit machine. The offsets might even change between JVM restarts.

For reading and writing primitives or object references, Unsafe provides the following type-dependent methods:
  • getXXX(Object target, long offset): Will read a value of type XXX from target's address at the specified offset.
  • putXXX(Object target, long offset, XXX value): Will place value at target's address at the specified offset.
  • getXXXVolatile(Object target, long offset): Will read a value of type XXX from target's address at the specified offset and not hit any thread-local caches.
  • putXXXVolatile(Object target, long offset, XXX value): Will place value at target's address at the specified offset and not hit any thread-local caches.
  • putOrderedXXX(Object target, long offset, XXX value): Will place value at target's address at the specified offset and might not hit all thread-local caches.
  • putXXX(long address, XXX value): Will place the specified value of type XXX directly at the specified address.
  • getXXX(long address): Will read a value of type XXX from the specified address.
  • compareAndSwapXXX(Object target, long offset, XXX expectedValue, XXX value): Will atomically read a value of type XXX from target's address at the specified offset and set the given value if the current value at this offset equals the expected value.

Be aware that you are copying references when writing or reading object copies in native memory by using the getObject(Object, long) method family. You are therefore only creating shallow copies of instances when applying the above method. You could however always read object sizes and offsets recursively and create deep copies. Watch out however for cyclic object references, which would cause infinite loops when applying this principle carelessly.
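
As an illustration of the compareAndSwapXXX family, here is a sketch of a lock-free counter built on compareAndSwapInt. The class CasCounter and its methods are made up for this example, and the singleton lookup again assumes HotSpot's theUnsafe field.

```java
import java.lang.reflect.Field;
import sun.misc.Unsafe;

// Hypothetical lock-free counter built on compareAndSwapInt.
final class CasCounter {

  private static final Unsafe UNSAFE = lookup();
  private static final long VALUE_OFFSET = valueOffset();

  private volatile int value;

  // Retries until no other thread changed the value in between.
  int incrementAndGet() {
    int current;
    do {
      current = value; // volatile read
    } while (!UNSAFE.compareAndSwapInt(this, VALUE_OFFSET, current, current + 1));
    return current + 1;
  }

  int get() {
    return value;
  }

  // Starts the given number of threads, each incrementing perThread times.
  static int countConcurrently(int threads, int perThread) {
    CasCounter counter = new CasCounter();
    Thread[] workers = new Thread[threads];
    for (int i = 0; i < threads; i++) {
      workers[i] = new Thread(() -> {
        for (int j = 0; j < perThread; j++) {
          counter.incrementAndGet();
        }
      });
      workers[i].start();
    }
    for (Thread worker : workers) {
      try {
        worker.join();
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
        throw new IllegalStateException(e);
      }
    }
    return counter.get();
  }

  private static long valueOffset() {
    try {
      return UNSAFE.objectFieldOffset(CasCounter.class.getDeclaredField("value"));
    } catch (NoSuchFieldException e) {
      throw new IllegalStateException(e);
    }
  }

  // Reads the singleton from the theUnsafe field (HotSpot naming assumed).
  private static Unsafe lookup() {
    try {
      Field field = Unsafe.class.getDeclaredField("theUnsafe");
      field.setAccessible(true);
      return (Unsafe) field.get(null);
    } catch (ReflectiveOperationException e) {
      throw new IllegalStateException(e);
    }
  }
}
```

Because a failed swap simply retries with a fresh read, no increment is lost even when several threads update the counter at once.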

Not mentioned here are existing utilities in the Unsafe class that allow manipulation of static field values, such as staticFieldOffset, and utilities for handling array types. Finally, both methods named Unsafe#copyMemory allow you to instruct a direct copy of memory, either relative to a specific object offset or at an absolute address, as the following example shows:

@Test
public void testCopy() throws Exception {
  long address = unsafe.allocateMemory(4L);
  unsafe.putInt(address, 100);
  long otherAddress = unsafe.allocateMemory(4L);
  unsafe.copyMemory(address, otherAddress, 4L); // source, destination, byte count
  assertEquals(100, unsafe.getInt(otherAddress));
}

Throwing checked exceptions without declaration


There are some other interesting methods to find in Unsafe. Did you ever want to throw a specific exception to be handled in a lower layer, but your high-layer interface type did not declare this checked exception? Unsafe#throwException allows you to do so:

@Test(expected = Exception.class)
public void testThrowChecked() throws Exception {
  throwChecked();
}

// No throws clause, yet a checked exception escapes.
public void throwChecked() {
  unsafe.throwException(new Exception());
}

Native concurrency


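The park and unpark methods allow you to pause a thread and to resume it. The first argument of park states whether the given time is an absolute deadline (in milliseconds since the epoch) or a relative timeout (in nanoseconds); a relative timeout of 0 parks the thread until it is unparked: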

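@Test
public void testPark() throws Exception {
  final boolean[] run = new boolean[1];
  Thread thread = new Thread() {
    @Override
    public void run() {
      // Relative timeout of 0: park until unparked (park may also return spuriously).
      unsafe.park(false, 0L);
      run[0] = true;
    }
  };
  thread.start();
  // unpark grants a permit, so it works even if the thread has not parked yet.
  unsafe.unpark(thread);
  thread.join(100L);
  assertTrue(run[0]);
}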
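
In application code, the same primitives are reachable through the public java.util.concurrent.locks.LockSupport class, which exposes park and unpark without requiring an Unsafe instance. The sketch below uses LockSupport rather than Unsafe and guards against spurious wakeups with a flag; the class ParkDemo and its method are made up for illustration.

```java
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.locks.LockSupport;

// Hypothetical demo of park/unpark through the public LockSupport API.
final class ParkDemo {

  private ParkDemo() { }

  // Parks a worker thread until the calling thread releases it.
  // Returns true if the worker resumed and finished within the timeout.
  static boolean parkAndResume(long timeoutMillis) {
    AtomicBoolean released = new AtomicBoolean(false);
    AtomicBoolean resumed = new AtomicBoolean(false);
    Thread worker = new Thread(() -> {
      // park may return spuriously, so re-check the condition in a loop.
      while (!released.get()) {
        LockSupport.park();
      }
      resumed.set(true);
    });
    worker.start();
    released.set(true);         // publish the condition first
    LockSupport.unpark(worker); // then hand the worker its permit
    try {
      worker.join(timeoutMillis);
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
    }
    return resumed.get();
  }
}
```

Setting the flag before calling unpark matters: whichever thread runs first, the worker either sees the flag and skips parking or finds a permit waiting for it.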

Also, monitors can be acquired directly by using Unsafe with monitorEnter(Object), monitorExit(Object) and tryMonitorEnter(Object).

A file containing all the examples of this blog entry is available as a gist.


Reposted from: http://mydailyjava.blogspot.com/2013/12/sunmiscunsafe.html
