SparseArray
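SparseArray maps ints to Objects, much like a Map<Integer, Object>. For that use case it is more memory-efficient than HashMap, because it avoids auto-boxing the keys and does not allocate a separate Entry object for each mapping. SparseArray keeps its mappings in plain arrays and finds keys with a binary search. It is not a good fit for large collections (more than a few hundred items): every add or remove first binary-searches for the index of the key, and then has to insert the entry into, or delete it from, the array, which can shift other elements.
We look at the constructors first, then go straight to the implementation of put().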
/**
* SparseArrays map integers to Objects. Unlike a normal array of Objects,
* there can be gaps in the indices. It is intended to be more memory efficient
* than using a HashMap to map Integers to Objects, both because it avoids
* auto-boxing keys and its data structure doesn't rely on an extra entry object
* for each mapping.
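* In short: a SparseArray maps ints to Objects and, unlike a plain array, its indices may have gaps. Because keys are not
* auto-boxed and no extra per-mapping object (such as an Entry) is needed, it uses memory more efficiently than a HashMap.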
*
* <p>Note that this container keeps its mappings in an array data structure,
* using a binary search to find keys. The implementation is not intended to be appropriate for
* data structures
* that may contain large numbers of items. It is generally slower than a traditional
* HashMap, since lookups require a binary search and adds and removes require inserting
* and deleting entries in the array. For containers holding up to hundreds of items,
* the performance difference is not significant, less than 50%.</p>
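* In short: keys are located by binary search, so this implementation does not suit large data sets. Lookups need a binary
* search, and adds and removes move other array elements, so SparseArray is usually slower than HashMap. For up to hundreds of items the difference is under 50%.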
*
* <p>To help with performance, the container includes an optimization when removing
* keys: instead of compacting its array immediately, it leaves the removed entry marked
* as deleted. The entry can then be re-used for the same key, or compacted later in
* a single garbage collection step of all removed entries. This garbage collection will
* need to be performed at any time the array needs to be grown or the map size or
* entry values are retrieved.</p>
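* In short: to speed up removals, remove does not compact the array right away; it only marks the entry as deleted.
* A marked entry is either reused, or squeezed out later when one gc pass compacts all removed entries.
* gc must run before any of the following: growing the array, reading the size, or reading entry values (see the code comments for details).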
*
* <p>It is possible to iterate over the items in this container using
* {@link #keyAt(int)} and {@link #valueAt(int)}. Iterating over the keys using
* <code>keyAt(int)</code> with ascending values of the index will return the
* keys in ascending order, or the values corresponding to the keys in ascending
* order in the case of <code>valueAt(int)</code>.</p>
*/
public class SparseArray<E> implements Cloneable {
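// Sentinel that marks a removed slot in the values array
private static final Object DELETED = new Object();
// Removal optimization: set when a gc (compaction) pass is pending
private boolean mGarbage = false;
// Key array, kept in ascending order
private int[] mKeys;
// Value array; each value sits at the same index as its key
private Object[] mValues;
// Number of slots in use (still counts DELETED slots until gc runs)
private int mSize;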
/**
* Creates a new SparseArray containing no mappings.
*/
public SparseArray() {
// The default capacity is 10 elements
this(10);
}
/**
* Creates a new SparseArray containing no mappings that will not
* require any additional memory allocation to store the specified
* number of mappings. If you supply an initial capacity of 0, the
* sparse array will be initialized with a light-weight representation
* not requiring any additional array allocations.
*/
public SparseArray(int initialCapacity) {
if (initialCapacity == 0) {
// As the EmptyArray implementation shows, mKeys starts out as new int[0]; mValues is analogous
mKeys = EmptyArray.INT;
mValues = EmptyArray.OBJECT;
} else {
// newUnpaddedObjectArray ends up in a native VMRuntime method and returns an array of at least initialCapacity elements
mValues = ArrayUtils.newUnpaddedObjectArray(initialCapacity);
// Allocate a key array of the same length
mKeys = new int[mValues.length];
}
mSize = 0;
}
}
The two classes used in the constructor, EmptyArray and ArrayUtils, are implemented as follows:
public final class EmptyArray {
public static final int[] INT = new int[0];
public static final Object[] OBJECT = new Object[0];
}
public class ArrayUtils {
private static final int CACHE_SIZE = 73;
private static Object[] sCache = new Object[CACHE_SIZE];
public static Object[] newUnpaddedObjectArray(int minLen) {
return (Object[])VMRuntime.getRuntime().newUnpaddedArray(Object.class, minLen);
}
}
put
/**
* Adds a mapping from the specified key to the specified value,
* replacing the previous mapping from the specified key if there
* was one.
* Note: if the key is already mapped, the old object is simply overwritten in place. Watch for the gc() call.
*/
public void put(int key, E value) {
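// Binary-search first to find the insertion point; this keeps the key array sorted
int i = ContainerHelpers.binarySearch(mKeys, mSize, key);
if (i >= 0) {
// A non-negative index means the key already exists, so just overwrite the value
mValues[i] = value;
} else {
// A negative result means the key is absent.
// The sign tells us whether the key was found; the bitwise complement of a negative result is the insertion point.
i = ~i;
// If i is within mSize and that slot happens to be marked DELETED, reuse it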
if (i < mSize && mValues[i] == DELETED) {
mKeys[i] = key;
mValues[i] = value;
return;
}
if (mGarbage && mSize >= mKeys.length) {
gc();
// Search again because indices may have changed.
i = ~ContainerHelpers.binarySearch(mKeys, mSize, key);
}
mKeys = GrowingArrayUtils.insert(mKeys, mSize, i, key);
mValues = GrowingArrayUtils.insert(mValues, mSize, i, value);
mSize++;
}
}
/**
* Puts a key/value pair into the array, optimizing for the case where
* the key is greater than all existing keys in the array.
*/
// Adds a key/value pair to the SparseArray
public void append(int key, E value) {
// If the key is less than or equal to the current largest key, fall back to put()
if (mSize != 0 && key <= mKeys[mSize - 1]) {
put(key, value);
return;
}
if (mGarbage && mSize >= mKeys.length) {
gc();
}
// The key is greater than every existing key, so skip put()'s binary search and append directly
mKeys = GrowingArrayUtils.append(mKeys, mSize, key);
mValues = GrowingArrayUtils.append(mValues, mSize, value);
mSize++;
}
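The slot-reuse branch of put() is easy to overlook, so here is a minimal standalone sketch of it. PutReuseSketch is a hypothetical class written for this note, not the Android one: it has a fixed capacity, omits gc(), and uses java.util.Arrays.binarySearch, whose miss value -(insertionPoint) - 1 equals ~insertionPoint.

```java
import java.util.Arrays;

// Hypothetical, simplified model of SparseArray.put(); not the Android class.
// Fixed capacity and no gc(): it only illustrates how a DELETED slot is reused.
class PutReuseSketch {
    static final Object DELETED = new Object();
    final int[] keys = new int[8];
    final Object[] values = new Object[8];
    int size = 0;

    void put(int key, Object value) {
        // On a miss, Arrays.binarySearch returns -(insertionPoint) - 1,
        // which is the same number as ~insertionPoint.
        int i = Arrays.binarySearch(keys, 0, size, key);
        if (i >= 0) {                 // key present: overwrite the value
            values[i] = value;
            return;
        }
        i = ~i;                       // decode the insertion point
        if (i < size && values[i] == DELETED) {
            keys[i] = key;            // reuse the dead slot, nothing is shifted
            values[i] = value;
            return;
        }
        if (size == keys.length) {
            throw new IllegalStateException("sketch has a fixed capacity");
        }
        // Shift the tail one slot to the right so the keys stay sorted.
        System.arraycopy(keys, i, keys, i + 1, size - i);
        System.arraycopy(values, i, values, i + 1, size - i);
        keys[i] = key;
        values[i] = value;
        size++;
    }

    void delete(int key) {
        int i = Arrays.binarySearch(keys, 0, size, key);
        if (i >= 0) {
            values[i] = DELETED;      // mark only; the key stays in place
        }
    }

    public static void main(String[] args) {
        PutReuseSketch s = new PutReuseSketch();
        s.put(1, "a");
        s.put(5, "b");
        s.put(9, "c");
        s.delete(5);
        s.put(3, "x");   // insertion point is index 1, which holds DELETED
        System.out.println(Arrays.toString(Arrays.copyOf(s.keys, s.size))); // [1, 3, 9]
    }
}
```

Note that reuse only happens when the insertion point lands exactly on a DELETED slot. Inserting 6 instead of 3 in this sketch would shift the tail as usual, because index 2 holds a live value.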
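put() calls a binary search to determine whether the key is already in the array. When the value is not found, the search returns the bitwise complement of lo, which is always negative.
// Binary search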
static int binarySearch(int[] array, int size, int value) {
int lo = 0;
int hi = size - 1;
while (lo <= hi) {
// Unsigned right shift (zero-filled on the left): equivalent to dividing by 2, and still correct if lo + hi overflows
final int mid = (lo + hi) >>> 1;
final int midVal = array[mid];
if (midVal < value) {
lo = mid + 1;
} else if (midVal > value) {
hi = mid - 1;
} else {
return mid; // value found
}
}
// Not found: lo is where value should be inserted (a non-negative index). Return its bitwise complement, which is negative
return ~lo; // value not present
}
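To make the return-value encoding concrete, the sketch below copies the search into a standalone class (BinarySearchDemo is a name made up for this note) and decodes a hit, a miss in the middle, and a miss before the first element.

```java
// Standalone copy of the search for experimentation; the class name is hypothetical.
class BinarySearchDemo {
    static int binarySearch(int[] array, int size, int value) {
        int lo = 0;
        int hi = size - 1;
        while (lo <= hi) {
            int mid = (lo + hi) >>> 1;
            int midVal = array[mid];
            if (midVal < value) {
                lo = mid + 1;
            } else if (midVal > value) {
                hi = mid - 1;
            } else {
                return mid;          // found: a non-negative index
            }
        }
        return ~lo;                  // not found: complement of the insertion point
    }

    public static void main(String[] args) {
        int[] keys = {10, 20, 30, 0, 0};   // only the first 3 slots are in use
        int hit = binarySearch(keys, 3, 20);
        int miss = binarySearch(keys, 3, 25);
        int front = binarySearch(keys, 3, 5);
        System.out.println(hit);     // 1
        System.out.println(miss);    // -3
        System.out.println(~miss);   // 2
        System.out.println(front);   // -1
        System.out.println(~front);  // 0
    }
}
```

The last case shows why the complement is used rather than plain negation: an insertion point of 0 still encodes to a negative number (-1), so a miss can never be confused with a hit at index 0.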
The GrowingArrayUtils helper class:
public static int[] insert(int[] array, int currentSize, int index, int element) {
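// Assert that the current size does not exceed the array length
assert currentSize <= array.length;
// No growth needed
if (currentSize + 1 <= array.length) {
// Shift the elements from index onward one slot to the right, preserving ascending order
System.arraycopy(array, index, array, index + 1, currentSize - index);
// Store the element at index
array[index] = element;
// Return the same array
return array;
}
// Growth needed:
// allocate a new, larger array
int[] newArray = new int[growSize(currentSize)];
// Copy the elements before index into the new array
System.arraycopy(array, 0, newArray, 0, index);
// Store the element at index
newArray[index] = element;
// Copy the elements from index onward into the new array, one slot to the right
System.arraycopy(array, index, newArray, index + 1, array.length - index);
// Return the new array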
return newArray;
}
/**
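* Returns a suitable new capacity for the given current size.
* If the current size is 4 or less, the new capacity is 8; otherwise it is twice the current size.
* This differs from ArrayList and ArrayMap (which grow by half) and matches Vector's default (which doubles).
* Growth is still carried out by copying into a new array, much like ArrayList.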
*/
public static int growSize(int currentSize) {
// Return 8 if the current size is 4 or less, otherwise twice the current size
return currentSize <= 4 ? 8 : currentSize * 2;
}
final class GrowingArrayUtils {
// insert() and growSize() shown above are members of this class as well.
// Appends element at position currentSize, growing the array first if it is full
public static int[] append(int[] array, int currentSize, int element) {
assert currentSize <= array.length;
if (currentSize + 1 > array.length) {
int[] newArray = ArrayUtils.newUnpaddedIntArray(growSize(currentSize));
System.arraycopy(array, 0, newArray, 0, currentSize);
array = newArray;
}
array[currentSize] = element;
return array;
}
}
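As a rough illustration of the growth policy, the sketch below records every capacity that would be allocated while appending to an initially empty array. GrowthSequenceDemo and capacitiesFor are hypothetical names, and plain `new int[]` stands in for Android's ArrayUtils allocators, which may return slightly larger arrays.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for observing the growth policy; not part of Android.
class GrowthSequenceDemo {
    // Same rule as growSize() above.
    static int growSize(int currentSize) {
        return currentSize <= 4 ? 8 : currentSize * 2;
    }

    // Returns every capacity allocated while appending `count` elements
    // to an array that starts with length 0.
    static List<Integer> capacitiesFor(int count) {
        List<Integer> capacities = new ArrayList<>();
        int[] array = new int[0];
        for (int size = 0; size < count; size++) {
            if (size + 1 > array.length) {            // full: grow before appending
                int[] grown = new int[growSize(size)];
                System.arraycopy(array, 0, grown, 0, size);
                array = grown;
                capacities.add(array.length);
            }
            array[size] = size;
        }
        return capacities;
    }

    public static void main(String[] args) {
        System.out.println(capacitiesFor(100)); // [8, 16, 32, 64, 128]
    }
}
```

Because the capacity doubles, appending 100 elements triggers only five reallocations in this sketch, at the cost of up to half of the final array sitting unused.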
remove
/**
* Removes the mapping from the specified key, if there was any.
*/
public void delete(int key) {
// Binary-search for the index of the key to delete
int i = ContainerHelpers.binarySearch(mKeys, mSize, key);
// i >= 0 means the key exists. Note that delete only touches the values array and never the key array
if (i >= 0) {
if (mValues[i] != DELETED) {
mValues[i] = DELETED;
mGarbage = true; // flag that a gc pass is now pending
}
}
}
/**
* @hide
* Removes the mapping from the specified key, if there was any, returning the old value.
* Variant of delete that returns the removed value.
*/
public E removeReturnOld(int key) {
int i = ContainerHelpers.binarySearch(mKeys, mSize, key);
if (i >= 0) {
if (mValues[i] != DELETED) {
final E old = (E) mValues[i];
mValues[i] = DELETED;
mGarbage = true;
return old;
}
}
return null;
}
/**
* Alias for {@link #delete(int)}.
*/
public void remove(int key) {
delete(key);
}
/**
* Removes the mapping at the specified index.
*
* <p>For indices outside of the range <code>0...size()-1</code>,
* the behavior is undefined.</p>
* Goes straight to the slot at the given index and deletes the mapping there.
*/
public void removeAt(int index) {
if (mValues[index] != DELETED) {
mValues[index] = DELETED;
mGarbage = true;
}
}
/**
* Remove a range of mappings as a batch.
*
* @param index Index to begin at
* @param size Number of mappings to remove
*
* <p>For indices outside of the range <code>0...size()-1</code>,
* the behavior is undefined.</p>
*/
public void removeAtRange(int index, int size) {
final int end = Math.min(mSize, index + size);
for (int i = index; i < end; i++) {
removeAt(i);
}
}
get
/**
* Gets the Object mapped from the specified key, or <code>null</code>
* if no such mapping has been made.
*/
public E get(int key) {
return get(key, null);
}
/**
* Gets the Object mapped from the specified key, or the specified Object
* if no such mapping has been made.
*/
@SuppressWarnings("unchecked")
public E get(int key, E valueIfKeyNotFound) {
int i = ContainerHelpers.binarySearch(mKeys, mSize, key);
// If the key is not found, or its value is marked as deleted, return the default value
if (i < 0 || mValues[i] == DELETED) {
return valueIfKeyNotFound;
} else {
return (E) mValues[i];
}
}
// Returns the key stored at the given index; note that gc() may run first.
public int keyAt(int index) {
if (mGarbage) {
gc();
}
return mKeys[index];
}
@SuppressWarnings("unchecked")
public E valueAt(int index) {
if (mGarbage) {
gc();
}
return (E) mValues[index];
}
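The interplay between delete() and get() can be tried out in isolation. The following is a minimal sketch under simplifying assumptions: LazyDeleteSketch is a hypothetical class with pre-filled arrays that never grow, and it uses java.util.Arrays.binarySearch in place of ContainerHelpers.

```java
import java.util.Arrays;

// Hypothetical miniature of the delete()/get() pair; not the Android class.
class LazyDeleteSketch {
    static final Object DELETED = new Object();
    final int[] keys = {1, 5, 9};
    final Object[] values = {"a", "b", "c"};
    final int size = 3;
    boolean garbage = false;

    void delete(int key) {
        int i = Arrays.binarySearch(keys, 0, size, key);
        if (i >= 0 && values[i] != DELETED) {
            values[i] = DELETED;   // the key array is left untouched
            garbage = true;        // remember that compaction is pending
        }
    }

    Object get(int key, Object fallback) {
        int i = Arrays.binarySearch(keys, 0, size, key);
        // A DELETED slot is treated exactly like a missing key.
        return (i < 0 || values[i] == DELETED) ? fallback : values[i];
    }

    public static void main(String[] args) {
        LazyDeleteSketch s = new LazyDeleteSketch();
        s.delete(5);
        System.out.println(s.get(5, "none"));         // none
        System.out.println(s.get(9, "none"));         // c
        System.out.println(Arrays.toString(s.keys));  // [1, 5, 9]
        System.out.println(s.garbage);                // true
    }
}
```

Leaving the key in place is what keeps the key array sorted and searchable between a delete and the next compaction.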
gc
private void gc() {
// Log.e("SparseArray", "gc start with " + mSize);
int n = mSize;
int o = 0;
int[] keys = mKeys;
Object[] values = mValues;
for (int i = 0; i < n; i++) {
Object val = values[i];
if (val != DELETED) {
if (i != o) {
keys[o] = keys[i];
values[o] = val;
values[i] = null;
}
o++;
}
}
mGarbage = false;
// Record the new size after compaction
mSize = o;
// Log.e("SparseArray", "gc end with " + mSize);
}
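A worked example may help when reading the loop above. The sketch below restates the same two-index sweep as a static method on plain arrays and runs it on four slots, two of which are marked deleted. CompactionDemo and compact are hypothetical names used only for this illustration.

```java
import java.util.Arrays;

// Hypothetical standalone version of the compaction loop; not the Android class.
class CompactionDemo {
    static final Object DELETED = new Object();

    // i reads every slot in use, o is the next position for a live entry.
    // Returns the number of live entries.
    static int compact(int[] keys, Object[] values, int size) {
        int o = 0;
        for (int i = 0; i < size; i++) {
            Object val = values[i];
            if (val != DELETED) {
                if (i != o) {
                    keys[o] = keys[i];
                    values[o] = val;
                    values[i] = null;   // drop the stale reference
                }
                o++;
            }
        }
        return o;
    }

    public static void main(String[] args) {
        int[] keys = {1, 5, 9, 12};
        Object[] values = {"a", DELETED, "c", DELETED};
        int newSize = compact(keys, values, 4);
        System.out.println(newSize);                                         // 2
        System.out.println(Arrays.toString(Arrays.copyOf(keys, newSize)));   // [1, 9]
        System.out.println(Arrays.toString(Arrays.copyOf(values, newSize))); // [a, c]
    }
}
```

Slots at or beyond the new size may still hold old keys or the DELETED marker afterwards. That is harmless, because every read and write is bounded by the size.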
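Summary
1. remove and get never call gc. This is the optimization for remove: what could have been many compactions is collapsed into a single pass. In the code shown here, gc runs in put and append (only when garbage is pending and the arrays are full) and in keyAt and valueAt; according to the class documentation it also runs when the size is read.
2. Calling remove does not reclaim the slot immediately. It sets the mGarbage flag to true and sets mValues[index] = DELETED, which marks the slot as reusable and collectable. A later put reuses such a slot when the new key's insertion point lands on it. Repeated removes only set markers; when gc is eventually triggered, a single loop compacts the arrays.
3. gc moves the live entries to the front of mKeys and mValues, sets the vacated value slots to null, and updates mSize to the actual number of mappings.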