Memory consumption of popular Java data types

本文概述了Java中常见数据类型的内存消耗情况,重点介绍了枚举、位集、列表和队列等集合类型的实现原理及其对内存使用的影响。

摘要生成于 C知道 ,由 DeepSeek-R1 满血版支持, 前往体验 >

This article will give you an overview of some popular Java data types memory consumption. This article follows An overview of memory saving techniques in Java article earlier published in this blog. I strongly recommend you to read the previous article before reading this one.

Enum, EnumMap, EnumSet

Enums were added to Java 1.5. Technically speaking, each enum is an Object and has all memory consumption properties of objects described in the previous article. Luckily, they have several useful properties:

All enum objects are singletons. This is guaranteed by Java language – you can create an instance of enum only by its definition. This means that you pay for enum object once and then you only spend 4 bytes per its reference (in this article I will assume that object references occupy 4 bytes). But what is the benefit of enums if they consume as much memory as int variables and 4 times more than byte variables, besides type safety and better support in IDE?

The answer is the ordinal() method implemented by every enum. It is an increasing number starting from 0 and ending at the number of values in the given enum minus 1. Such enumproperty allows us to use arrays for enum to any other object mapping: a value related to the first enum value will be stored in array[0], second enum will be mapped to array[1] and so on according to Enum.ordinal() method result. By the way, it was a short description of JDK EnumMapclass implementation.

If you need to implement a map from Object into enum, you won’t have a tailored JDK implementation at hand. Of course, you can use any JDK Object to Object map, but it wouldn’t be that efficient. The easy way here is to use Trove TObjectByteMap (or TObjectIntMap in rare cases when you have more than 128 values in your enum) and map from Object key intoEnum.ordinal() value. You will need a decoding method for getters in order to convert byte into an enum. This method will require 1 byte per entry, which is the best we can do without paying CPU algorithmic penalty (of course, we can use less than 1 byte per enum, if there are less than 128, 64, 32, etc elements in the enum, but it may make your code more complicated for a very little memory gain).

With all this knowledge at hand, you may now realize that EnumSet is implemented similarly toBitSet. There are 2 EnumSet implementations in JDK: RegularEnumSet and JumboEnumSet. The former is used for enums having less than 65 values (it covers 99,9999% of real-world enums), the latter is used for larger enumerations.

RegularEnumSet utilizes the knowledge of the fact that there are less or equal to 64 values in theenum. It allows RegularEnumSet to use a single long to store all “enum present in the set” flags, rather than using long[] (utilized by JumboEnumSet or BitSet). Using a single long instead of long[1]allows to save 4 bytes on long[] reference and 16 bytes on long value (long occupies 8 bytes,long[1] needs 12 bytes for header, 8 bytes for a single long and 4 bytes for alignment).

BitSet

As you have noticed from a previous section, a BitSet is built on top of the long[]. The minor disadvantage of a BitSet is that its minimal allowed value is fixed to zero. Thus, if you need aBitSet starting at some offset, you will have to manually subtract your offset from all your values in BitSet methods. Besides that, a BitSet is limited to int values. It also means a need of some sort of envelope for storing real long values in it. The LongBitSet class described in my Bit sets article may give you some useful hints.

Dense set of integers as a BitSet

Details of EnumSet implementation may give you a hint on efficient storage of a dense set of integer values: find out the minimal allowed value in your set (call it minVal) and then use aBitSet for integer values storage: call BitSet.set( value - minVal ) to store your value andBitSet.get( value - minVal ) to check if a value is present in the set. The dense set is a set with more than 1 value per 64 entries.

ArrayList

ArrayList is one of the most efficient storage structures in JDK: it has an Object[] for storage plus an int field for tracking the list size (which is different from the array/list capacity =array.length). As long as you keep the actual objects in the ArrayList, it offers you near-optimal storage. All you need to remember about ArrayList is that it is better not to allocate a largeArrayList in advance if you are not sure that its capacity will be fully utilized. This is especially important for collections of such arrays (or collections of objects having array fields).

Another important property of ArrayList to remember is that there is only one ArrayListmethod which shrinks its capacity – ArrayList.trimToSizeclear/remove* methods decreaseArrayList size instead of capacity. So, if you use an ArrayList as a buffer of some sort and you expect some high peaks in your data, but rather low average buffer size, then it may make sense for you to trim your ArrayList “manually” after that.

If you have a list of numeric wrappers, you’d better use Trove T[Type]ArrayList, which allows you to store (and most important – keep) primitive values in those collections.

LinkedList

LinkedList is the only JDK collection which implements the two following interfaces: List andDeque, which makes it useful for iteration operations on queue/dequeue/stack structures.

Due to being a deque, each LinkedList node contains references to the previous and next elements as well as a reference to the data value. Such node occupies 24 bytes (12 bytes header + 3*4 bytes for references), which is 6 times more than ArrayList in terms of per-node overhead. This structure is also CPU cache-unfriendly due to its spatial non-locality.

ArrayDeque

If all you need is a FIFO/LIFO buffer and you don’t need to access anything except its head or tail, you’d better prefer ArrayDeque to LinkedList.

ArrayDeque is based on an Object[], which length is a power of 2. Besides the array, there are inthead and tail pointers – they allow the array to wrap around its edges if required. Because ofArrayDeque internal memory management, its storage array length is always equal to the size ofArrayDeque rounded to the next power of 2. It means that you spend less than 8 bytes per value in the worst case and 4 bytes per value in the best case.

Summary

The following table summarizes the storage occupied per stored value assuming that a reference occupies 4 bytes. Note that you must spend 4 byte per Object reference in any case, so subtract 4 bytes from the values in the following table to find out the storage overhead.

EnumSetBitSet 1 bit per value
EnumMap 4 bytes (for value, nothing for key)
ArrayList 4 bytes (but may be more if ArrayList capacity is seriously more than its size)
LinkedList 24 bytes (fixed)
ArrayDeque 4 to 8 bytes, 6 bytes on average

See also

Recommended reading

If you want to better understand computer memory system and its implications on your software, I’d recommend you to read Memory Systems: Cache, DRAM, Disk by Bruce Jacob et al.

Reference: http://java-performance.info/memory-consumption-of-java-data-types-1/
评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值