Understanding Java Containers

License: CC 4.0 BY-SA

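This article describes Java's container classes in detail, covering four types: List, Set, Queue, and Map. It explains the characteristics, typical uses, and internal implementation of each container, such as how ArrayList differs from LinkedList and how HashSet differs from TreeSet.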

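I have recently been reading the book 《深入理解java》 and am taking notes along the way, so they may be somewhat fragmentary.
I will try to tie the notes to the Java source code, which makes things easier to understand.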

I have reached Chapter 11, "Holding Your Objects", and am writing this post as a record.

1. Basic concepts

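Programs usually create objects based on conditions that are known only at run time; before that, neither the number of objects nor their exact types is known. Java provides container classes to solve this problem.
Containers hold objects and come in four types: List, Set, Queue, and Map. These types are also called collection classes.
The figure below shows the container types and how they relate.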

[Figure: diagram of the Java container types and their relationships]

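In the figure, green boxes mark the commonly used containers, black arrows mark interface implementation, and blue arrows mark inheritance.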

The Java container library is built around two concepts.

Collection
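A sequence of individual elements with one or more rules applied to them. A List must hold its elements in insertion order, a Set cannot contain duplicate elements, and a Queue produces its elements in the order set by a queuing discipline (usually the same as insertion order).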

Map
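A group of key-value pairs that lets you look up a value by its key. An ArrayList lets you look up an object by a number, so in a sense it associates numbers with objects. A Map lets you look up an object using another object; it is also called an associative array or a dictionary.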
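The four behaviors described above can be seen in a few lines. This is only a small sketch: the class name ContainerBasics and its helper methods are made up for illustration, and HashSet and LinkedList are simply convenient implementations of Set and Queue.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.LinkedList;
import java.util.List;
import java.util.Map;
import java.util.Queue;
import java.util.Set;

// Hypothetical demo class, not part of the JDK.
class ContainerBasics {
    // A List keeps insertion order and allows duplicates.
    static List<String> listOf(String... items) {
        List<String> list = new ArrayList<>();
        for (String s : items) list.add(s);
        return list;
    }

    // A Set silently ignores elements it already contains.
    static Set<String> setOf(String... items) {
        Set<String> set = new HashSet<>();
        for (String s : items) set.add(s);
        return set;
    }

    // A Queue hands elements back in first-in, first-out order.
    static String firstOut(String... items) {
        Queue<String> queue = new LinkedList<>();
        for (String s : items) queue.offer(s);
        return queue.poll();
    }

    // A Map looks a value up by a key object rather than by an integer index.
    static Map<String, Integer> wordLengths(String... words) {
        Map<String, Integer> map = new HashMap<>();
        for (String w : words) map.put(w, w.length());
        return map;
    }

    public static void main(String[] args) {
        System.out.println(listOf("b", "a", "b").size());   // three elements, duplicate kept
        System.out.println(setOf("b", "a", "b").size());    // two elements, duplicate dropped
        System.out.println(firstOut("b", "a", "b"));        // the first element offered
        System.out.println(wordLengths("list", "queue").get("queue"));
    }
}
```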

2. List

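A List maintains its elements in a particular sequence. As the declaration below shows, the List interface extends Collection and adds position-based methods such as get(), set(), add(int, E), remove(int), and addAll(int, Collection), along with replaceAll() and sort() (added in Java 8), so that elements can be read, inserted, and removed by index.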

public interface List<E> extends Collection<E> {

List has two basic implementations: ArrayList and LinkedList.

ArrayList
ArrayList is backed by an array. Its strength is fast random access, but inserting and removing elements in the middle is comparatively slow.

public class ArrayList<E> extends AbstractList<E>
        implements List<E>, RandomAccess, Cloneable, java.io.Serializable
{
    /**
     * The array buffer into which the elements of the ArrayList are stored.
     * The capacity of the ArrayList is the length of this array buffer. Any
     * empty ArrayList with elementData == DEFAULTCAPACITY_EMPTY_ELEMENTDATA
     * will be expanded to DEFAULT_CAPACITY when the first element is added.
     */
    transient Object[] elementData; // non-private to simplify nested class access
}

The source shows that ArrayList extends AbstractList. Opening that class, we see:

public abstract class AbstractList<E> extends AbstractCollection<E> implements List<E> {

}

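This class extends AbstractCollection and is the skeletal base class for List implementations. It supplies methods such as add(), set(), remove(int), and addAll(int, Collection); removeAll() is inherited from AbstractCollection.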

An ArrayList can be initialized through the following three constructors:

private transient Object[] elementData;

public ArrayList() {
    this(10);
}

public ArrayList(int initialCapacity) {
    super();
    if (initialCapacity < 0)
        throw new IllegalArgumentException("Illegal Capacity: "+
                                           initialCapacity);
    this.elementData = new Object[initialCapacity];
}

public ArrayList(Collection<? extends E> c) {
    elementData = c.toArray();
    size = elementData.length;
    // c.toArray might (incorrectly) not return Object[] (see 6260652)
    if (elementData.getClass() != Object[].class)
        elementData = Arrays.copyOf(elementData, size, Object[].class);
}

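As the code shows, the default capacity of the backing array is 10. The constructors quoted here come from an older version of the source, in which the no-argument constructor allocates the array immediately; in the newer version quoted earlier, the list starts with an empty array that is expanded to DEFAULT_CAPACITY when the first element is added. Because the elements live in an array, reading by index is as efficient as an array access: the time complexity is O(1).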

The insertion code is shown below:

 /**
     * Appends the specified element to the end of this list.
     *
     * @param e element to be appended to this list
     * @return <tt>true</tt> (as specified by {@link Collection#add})
     */
    public boolean add(E e) {
        ensureCapacityInternal(size + 1);  // Increments modCount!!
        elementData[size++] = e;
        return true;
    }

    /**
     * Inserts the specified element at the specified position in this
     * list. Shifts the element currently at that position (if any) and
     * any subsequent elements to the right (adds one to their indices).
     *
     * @param index index at which the specified element is to be inserted
     * @param element element to be inserted
     * @throws IndexOutOfBoundsException {@inheritDoc}
     */
    public void add(int index, E element) {
        rangeCheckForAdd(index);

        ensureCapacityInternal(size + 1);  // Increments modCount!!
        System.arraycopy(elementData, index, elementData, index + 1,
                         size - index);
        elementData[index] = element;
        size++;
    }

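Without an index, the element is appended at the end. To insert at a position in the middle, look at System.arraycopy(elementData, index, elementData, index + 1, size - index). Its arguments are, in order, the source array, the start position in the source, the destination array, the start position in the destination, and the number of elements to copy. The call therefore moves every element from index onward one slot to the right, which frees slot index so that element can be assigned to it. Up to size - index elements have to be moved, so the time complexity of inserting in the middle is O(n).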
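To make the shifting step concrete, here is a minimal sketch of the same System.arraycopy call on a plain int array. The class ArrayInsertSketch and the method insertAt are hypothetical names, and the sketch assumes the caller has already made sure the array has one spare slot (the job that ensureCapacityInternal does in ArrayList).

```java
import java.util.Arrays;

// Hypothetical illustration of the shift performed by ArrayList.add(int, E).
class ArrayInsertSketch {
    // Inserts value at index among the first `size` slots of data.
    // Assumes data.length > size, i.e. there is room for one more element.
    static void insertAt(int[] data, int size, int index, int value) {
        // Move data[index .. size-1] one slot to the right (size - index elements).
        System.arraycopy(data, index, data, index + 1, size - index);
        data[index] = value;
    }

    public static void main(String[] args) {
        int[] data = {10, 20, 30, 40, 0}; // four used slots plus one spare
        insertAt(data, 4, 1, 15);
        System.out.println(Arrays.toString(data)); // [10, 15, 20, 30, 40]
    }
}
```

Inserting at index 1 moves three elements; inserting at index 0 would move all four, which is where the O(n) cost comes from.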

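remove() works the same way in reverse: the elements from index + 1 onward are moved one slot to the left and the last slot is set to null, so the time complexity is also O(n).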

public E remove(int index) {
        rangeCheck(index);

        modCount++;
        E oldValue = elementData(index);

        int numMoved = size - index - 1;
        if (numMoved > 0)
            System.arraycopy(elementData, index+1, elementData, index,
                             numMoved);
        elementData[--size] = null; // clear to let GC do its work

        return oldValue;
    }
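From the caller's side the shift is visible as a change of indices. The following is a small usage sketch; ArrayListRemoveDemo and sample() are names invented for this example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical demo class showing the observable effect of remove(int).
class ArrayListRemoveDemo {
    static List<String> sample() {
        return new ArrayList<>(Arrays.asList("a", "b", "c", "d"));
    }

    public static void main(String[] args) {
        List<String> list = sample();
        String removed = list.remove(1);  // returns the element that was at index 1
        System.out.println(removed);      // b
        System.out.println(list);         // [a, c, d]: "c" and "d" each moved one index down
    }
}
```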

LinkedList
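LinkedList inserts and removes elements in the middle of the list at low cost and provides optimized sequential access, but it is relatively slow for random access. It also has a larger feature set than ArrayList: it can be used as a Queue and as a Stack.
Here is the source: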

public class LinkedList<E>
    extends AbstractSequentialList<E>
    implements List<E>, Deque<E>, Cloneable, java.io.Serializable
{
    transient int size = 0;

    transient Node<E> first;
    transient Node<E> last;

    private static class Node<E> {
        E item;
        Node<E> next;
        Node<E> prev;

        Node(Node<E> prev, E element, Node<E> next) {
            this.item = element;
            this.next = next;
            this.prev = prev;
        }
    }
}

It is clear from this that LinkedList implements List as a doubly linked list.
So how is a LinkedList initialized?

/**
     * Constructs a list containing the elements of the specified
     * collection, in the order they are returned by the collection's
     * iterator.
     *
     * @param  c the collection whose elements are to be placed into this list
     * @throws NullPointerException if the specified collection is null
     */
    public LinkedList(Collection<? extends E> c) {
        this();
        addAll(c);
    }
    public boolean addAll(Collection<? extends E> c) {
        return addAll(size, c);
    }
    public boolean addAll(int index, Collection<? extends E> c) {
        checkPositionIndex(index);

        Object[] a = c.toArray();
        int numNew = a.length;
        if (numNew == 0)
            return false;

        Node<E> pred, succ;
        if (index == size) {
            succ = null;
            pred = last;
        } else {
            succ = node(index);
            pred = succ.prev;
        }

        for (Object o : a) {
            @SuppressWarnings("unchecked") E e = (E) o;
            Node<E> newNode = new Node<>(pred, e, null);
            if (pred == null)
                first = newNode;
            else
                pred.next = newNode;
            pred = newNode;
        }

        if (succ == null) {
            last = pred;
        } else {
            pred.next = succ;
            succ.prev = pred;
        }

        size += numNew;
        modCount++;
        return true;
    }

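As shown, constructing a LinkedList from a collection calls addAll(), which walks the incoming elements, links them into a doubly linked list, and records the head and the tail.
With initialization understood, insertion is straightforward: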

public void add(int index, E element) {
        checkPositionIndex(index);

        if (index == size)
            linkLast(element);
        else
            linkBefore(element, node(index));
    }
    void linkBefore(E e, Node<E> succ) {
        // assert succ != null;
        final Node<E> pred = succ.prev;
        final Node<E> newNode = new Node<>(pred, e, succ);
        succ.prev = newNode;
        if (pred == null)
            first = newNode;
        else
            pred.next = newNode;
        size++;
        modCount++;
    }
    void linkLast(E e) {
        final Node<E> l = last;
        final Node<E> newNode = new Node<>(l, e, null);
        last = newNode;
        if (l == null)
            first = newNode;
        else
            l.next = newNode;
        size++;
        modCount++;
    }

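add() finds the node currently at the given position and makes it the successor of the new node. The relinking itself is O(1), but locating the node with node(index) requires a traversal, so add(int, E) is O(n) in general; only insertion at the end (linkLast) is O(1) overall.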

Finally, let us look at the get method:

public E get(int index) {
        checkElementIndex(index);
        return node(index).item;
    }

Node<E> node(int index) {
        // assert isElementIndex(index);

        if (index < (size >> 1)) {
            Node<E> x = first;
            for (int i = 0; i < index; i++)
                x = x.next;
            return x;
        } else {
            Node<E> x = last;
            for (int i = size - 1; i > index; i--)
                x = x.prev;
            return x;
        }
    }

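get() first checks whether index falls in the first or the second half of the list and then walks from the nearer end. Its time complexity is O(n), which is higher than the O(1) of ArrayList.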
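The cost of the traversal can be made visible with a stripped-down doubly linked list that counts how many links node(index) follows. This is a sketch written for this note, not the JDK class: TinyLinkedList, addLast, and lastSteps are hypothetical names, and the list stores int values only to keep the code short.

```java
// Hypothetical, simplified doubly linked list used only to count traversal steps.
class TinyLinkedList {
    private static final class Node {
        final int item;
        Node prev;
        Node next;

        Node(Node prev, int item, Node next) {
            this.prev = prev;
            this.item = item;
            this.next = next;
        }
    }

    private Node first;
    private Node last;
    private int size;
    private int steps; // links followed by the most recent node() call

    // Same idea as linkLast: append after the current tail.
    void addLast(int item) {
        Node l = last;
        Node n = new Node(l, item, null);
        last = n;
        if (l == null) first = n; else l.next = n;
        size++;
    }

    int get(int index) {
        if (index < 0 || index >= size)
            throw new IndexOutOfBoundsException("Index: " + index + ", Size: " + size);
        return node(index).item;
    }

    int lastSteps() {
        return steps;
    }

    // Walk from whichever end is closer to index.
    private Node node(int index) {
        steps = 0;
        if (index < (size >> 1)) {
            Node x = first;
            for (int i = 0; i < index; i++) { x = x.next; steps++; }
            return x;
        } else {
            Node x = last;
            for (int i = size - 1; i > index; i--) { x = x.prev; steps++; }
            return x;
        }
    }

    public static void main(String[] args) {
        TinyLinkedList list = new TinyLinkedList();
        for (int i = 0; i < 10; i++) list.addLast(i);
        System.out.println(list.get(8) + " after " + list.lastSteps() + " step(s)"); // 8 after 1 step(s)
        System.out.println(list.get(4) + " after " + list.lastSteps() + " step(s)"); // 4 after 4 step(s)
    }
}
```

Starting from the nearer end halves the worst case, but the number of steps still grows linearly with the size of the list.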

3. Set

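A Set holds no duplicate elements, so its most common use is testing membership: it is easy to ask whether an object is in the Set. Set has exactly the same interface as Collection; only the behavior differs.
The two most commonly used implementations are HashSet and TreeSet, described in turn below.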

HashSet
HashSet offers fast lookup, but its elements are kept in no particular order. Its implementation is shown below:

public class HashSet<E>
    extends AbstractSet<E>
    implements Set<E>, Cloneable, java.io.Serializable
{
    private transient HashMap<E,Object> map;

    // Dummy value to associate with an Object in the backing Map
    private static final Object PRESENT = new Object();

    /**
     * Constructs a new set containing the elements in the specified
     * collection. The <tt>HashMap</tt> is created with default load factor
     * (0.75) and an initial capacity sufficient to contain the elements in
     * the specified collection.
     */
    public HashSet(Collection<? extends E> c) {
        map = new HashMap<>(Math.max((int) (c.size()/.75f) + 1, 16));
        addAll(c);
    }
}

Interestingly, HashSet is implemented internally with a HashMap. So how does HashSet keep its elements unique?

 public boolean add(E e) {
        return map.put(e, PRESENT)==null;
    }

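That settles it: because HashMap keys are unique, each element is stored as a key in the HashMap. add() returns true only when put() returns null, that is, when the key was not already present.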
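The same trick fits in a dozen lines. The sketch below is a toy set built on a HashMap in the style of the excerpt above; MapBackedSet is a made-up name, and only add, contains, and size are implemented.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical minimal set that delegates to a HashMap, mirroring the idea in HashSet.
class MapBackedSet<E> {
    // Shared dummy value stored against every key.
    private static final Object PRESENT = new Object();

    private final Map<E, Object> map = new HashMap<>();

    // put() returns the previous value, so null means the key is new.
    boolean add(E e) {
        return map.put(e, PRESENT) == null;
    }

    boolean contains(Object o) {
        return map.containsKey(o);
    }

    int size() {
        return map.size();
    }

    public static void main(String[] args) {
        MapBackedSet<String> set = new MapBackedSet<>();
        System.out.println(set.add("a")); // true: first insertion
        System.out.println(set.add("a")); // false: already present
        System.out.println(set.size());   // 1
    }
}
```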

TreeSet
TreeSet stores its elements in a red-black tree, so they are kept in sorted order (choose TreeSet if you need a sorted set).
Its data structure is shown below:

public class TreeSet<E> extends AbstractSet<E>
    implements NavigableSet<E>, Cloneable, java.io.Serializable
{
    /**
     * The backing map.
     */
    private transient NavigableMap<E,Object> m;

    TreeSet(NavigableMap<E,Object> m) {
        this.m = m;
    }
}

TreeSet holds a NavigableMap internally; digging further shows that the map is actually a TreeMap.

So how does TreeSet avoid storing duplicates? See the code:

public boolean add(E e) {
        return m.put(e, PRESENT)==null;
    }

Again the element is passed to the TreeMap as a key. TreeMap uses a comparator to decide whether two keys are equal, which is covered below.

When a TreeSet is built with the no-argument constructor it uses the elements' natural ordering; to use a custom ordering, call the constructor that takes a Comparator.
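A short sketch of the two constructors follows. TreeSetOrdering and its methods are names chosen for this example; the by-length comparator is just one possible custom ordering.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.TreeSet;

// Hypothetical demo class comparing natural ordering with a custom comparator.
class TreeSetOrdering {
    // No-argument constructor: elements are ordered by their own compareTo.
    static List<String> natural(List<String> words) {
        TreeSet<String> set = new TreeSet<>();
        set.addAll(words);
        return new ArrayList<>(set);
    }

    // Comparator constructor: the comparator decides the order
    // and also which elements count as duplicates.
    static List<String> byLength(List<String> words) {
        Comparator<String> byLen = Comparator.comparingInt(String::length);
        TreeSet<String> set = new TreeSet<>(byLen);
        set.addAll(words);
        return new ArrayList<>(set);
    }

    public static void main(String[] args) {
        List<String> words = Arrays.asList("pear", "fig", "apple", "kiwi");
        System.out.println(natural(words));  // [apple, fig, kiwi, pear]
        System.out.println(byLength(words)); // [fig, pear, apple]
    }
}
```

With the by-length comparator, "kiwi" is dropped because it compares as equal to "pear". This is the comparator acting as the duplicate test, as described above.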

4. Queue

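A queue is typically a first-in, first-out container: elements are put in at one end and taken out at the other, in the same order in which they were put in. LinkedList implements Queue, so a LinkedList can be upcast to a Queue. Queue provides methods such as offer, peek, element, poll, and remove.
The source is shown below: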

public interface Queue<E> extends Collection<E> {

}

public interface Deque<E> extends Queue<E> {
}

public class LinkedList<E>
    extends AbstractSequentialList<E>
    implements List<E>, Deque<E>, Cloneable, java.io.Serializable
{
}

The relationships are clear here: the double-ended queue Deque extends Queue, and LinkedList implements Deque.
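Because of this hierarchy the same LinkedList can be used through either interface. The sketch below shows it as a FIFO queue and, through Deque, as a stack; LinkedListAsQueue, fifo, and lifo are illustrative names.

```java
import java.util.Deque;
import java.util.LinkedList;
import java.util.Queue;

// Hypothetical demo class: one implementation, two views.
class LinkedListAsQueue {
    // Queue view: offer at the tail, poll from the head.
    static String fifo() {
        Queue<String> queue = new LinkedList<>();
        queue.offer("a");
        queue.offer("b");
        queue.offer("c");
        StringBuilder out = new StringBuilder();
        while (!queue.isEmpty()) out.append(queue.poll());
        return out.toString();
    }

    // Deque view used as a stack: push and pop both work on the head.
    static String lifo() {
        Deque<String> stack = new LinkedList<>();
        stack.push("a");
        stack.push("b");
        stack.push("c");
        StringBuilder out = new StringBuilder();
        while (!stack.isEmpty()) out.append(stack.pop());
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(fifo()); // abc
        System.out.println(lifo()); // cba
        Queue<String> empty = new LinkedList<>();
        System.out.println(empty.peek()); // null: peek and poll do not throw on an empty queue
    }
}
```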

To have elements come out of a Queue in priority order rather than insertion order, use PriorityQueue.

public class PriorityQueue<E> extends AbstractQueue<E>
    implements java.io.Serializable {
    transient Object[] queue; // non-private to simplify nested class access
    private final Comparator<? super E> comparator;

    public PriorityQueue(int initialCapacity,
                         Comparator<? super E> comparator) {
        // Note: This restriction of at least one is not actually needed,
        // but continues for 1.5 compatibility
        if (initialCapacity < 1)
            throw new IllegalArgumentException();
        this.queue = new Object[initialCapacity];
        this.comparator = comparator;
    }
}

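PriorityQueue stores its elements in an array and accepts a comparator (when it is null, the elements' natural ordering is used). So how is the ordering maintained?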

public boolean offer(E e) {
        if (e == null)
            throw new NullPointerException();
        modCount++;
        int i = size;
        if (i >= queue.length)
            grow(i + 1);
        size = i + 1;
        if (i == 0)
            queue[0] = e;
        else
            siftUp(i, e);
        return true;
    }

 @SuppressWarnings("unchecked")
    private void siftUpComparable(int k, E x) {
        Comparable<? super E> key = (Comparable<? super E>) x;
        while (k > 0) {
            int parent = (k - 1) >>> 1;
            Object e = queue[parent];
            if (key.compareTo((E) e) >= 0)
                break;
            queue[k] = e;
            k = parent;
        }
        queue[k] = key;
    }

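The call chain is fairly involved, so only this part is quoted. The data forms a complete binary tree whose level-order traversal is exactly the array, as shown in the figure below:
[Figure: a complete binary tree and the array that stores it in level order]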

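The new element is compared with its parent repeatedly: if it is smaller than the parent, the parent is moved down and the comparison continues one level up; otherwise the process stops.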
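The sift-up loop is easier to follow on plain int values. The following is a minimal min-heap sketch under that assumption; MinHeapSketch, snapshot, and the doubling growth policy are choices made for this example and are not taken from the JDK.

```java
import java.util.Arrays;

// Hypothetical int-only min-heap that reuses the sift-up idea from siftUpComparable.
class MinHeapSketch {
    private int[] queue = new int[8];
    private int size;

    void offer(int x) {
        if (size == queue.length)
            queue = Arrays.copyOf(queue, size * 2); // simple doubling, an assumption of this sketch
        int k = size++;
        // Sift up: move parents down until x is no smaller than its parent.
        while (k > 0) {
            int parent = (k - 1) >>> 1;
            if (x >= queue[parent])
                break;
            queue[k] = queue[parent];
            k = parent;
        }
        queue[k] = x;
    }

    // The smallest element always sits at index 0.
    int peek() {
        if (size == 0) throw new IllegalStateException("heap is empty");
        return queue[0];
    }

    // Copy of the used part of the array, i.e. the tree in level order.
    int[] snapshot() {
        return Arrays.copyOf(queue, size);
    }

    public static void main(String[] args) {
        MinHeapSketch heap = new MinHeapSketch();
        for (int x : new int[]{5, 3, 8, 1}) heap.offer(x);
        System.out.println(heap.peek());                      // 1
        System.out.println(Arrays.toString(heap.snapshot())); // [1, 3, 8, 5]
    }
}
```

The array is not fully sorted; it only guarantees that every parent is no larger than its children, which is enough to keep the minimum at index 0.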

5. Map

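A Map is a container that maps objects to other objects. Its main implementations are HashMap, TreeMap, and LinkedHashMap. HashMap uses a hash function, so lookups are efficient; TreeMap keeps its keys in sorted order; LinkedHashMap preserves insertion order. They are described one by one below.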

HashMap
The HashMap data structure is as follows:

public class HashMap<K,V> extends AbstractMap<K,V>
    implements Map<K,V>, Cloneable, Serializable {
    static class Node<K,V> implements Map.Entry<K,V> {
        final int hash;
        final K key;
        V value;
        Node<K,V> next;

        Node(int hash, K key, V value, Node<K,V> next) {
            this.hash = hash;
            this.key = key;
            this.value = value;
            this.next = next;
        }

        public final K getKey()        { return key; }
        public final V getValue()      { return value; }
        public final String toString() { return key + "=" + value; }

        public final int hashCode() {
            return Objects.hashCode(key) ^ Objects.hashCode(value);
        }

        public final V setValue(V newValue) {
            V oldValue = value;
            value = newValue;
            return oldValue;
        }

        public final boolean equals(Object o) {
            if (o == this)
                return true;
            if (o instanceof Map.Entry) {
                Map.Entry<?,?> e = (Map.Entry<?,?>)o;
                if (Objects.equals(key, e.getKey()) &&
                    Objects.equals(value, e.getValue()))
                    return true;
            }
            return false;
        }
    }
    transient Node<K,V>[] table;
    }

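HashMap defines a singly linked list node with four fields: hash, key, value, and next. The table itself is an array of these singly linked lists. So how is data inserted and retrieved?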

public V put(K key, V value) {
        return putVal(hash(key), key, value, false, true);
    }
final V putVal(int hash, K key, V value, boolean onlyIfAbsent,
                   boolean evict) {
        Node<K,V>[] tab; Node<K,V> p; int n, i;
        if ((tab = table) == null || (n = tab.length) == 0)
            n = (tab = resize()).length;
        if ((p = tab[i = (n - 1) & hash]) == null)
            tab[i] = newNode(hash, key, value, null);
        else {
            Node<K,V> e; K k;
            if (p.hash == hash &&
                ((k = p.key) == key || (key != null && key.equals(k))))
                e = p;
            else if (p instanceof TreeNode)
                e = ((TreeNode<K,V>)p).putTreeVal(this, tab, hash, key, value);
            else {
                for (int binCount = 0; ; ++binCount) {
                    if ((e = p.next) == null) {
                        p.next = newNode(hash, key, value, null);
                        if (binCount >= TREEIFY_THRESHOLD - 1) // -1 for 1st
                            treeifyBin(tab, hash);
                        break;
                    }
                    if (e.hash == hash &&
                        ((k = e.key) == key || (key != null && key.equals(k))))
                        break;
                    p = e;
                }
            }
            if (e != null) { // existing mapping for key
                V oldValue = e.value;
                if (!onlyIfAbsent || oldValue == null)
                    e.value = value;
                afterNodeAccess(e);
                return oldValue;
            }
        }
        ++modCount;
        if (++size > threshold)
            resize();
        afterNodeInsertion(evict);
        return null;
    }

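In put, the key is hashed first, and then the bucket selected by that hash is checked. If the bucket is empty, the new node is written directly. Otherwise the list is searched: if a node with an equal key is found, its value is replaced and the old value is returned; if not, a new node is created and the next field of the last node is pointed at it. When the list grows to TREEIFY_THRESHOLD nodes, treeifyBin() is called.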
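Both paths through put can be observed from outside with two keys that share a hash code. The strings "Aa" and "BB" both hash to 2112, so they land in the same bucket; HashMapPutDemo and build() are names invented for this sketch.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical demo class: a collision and a replacement seen through the public API.
class HashMapPutDemo {
    static Map<String, Integer> build() {
        Map<String, Integer> map = new HashMap<>();
        map.put("Aa", 1);
        map.put("BB", 2); // same hashCode as "Aa", different key: appended to the same bucket
        return map;
    }

    public static void main(String[] args) {
        System.out.println("Aa".hashCode() == "BB".hashCode()); // true
        Map<String, Integer> map = build();
        System.out.println(map.get("Aa") + " " + map.get("BB")); // 1 2: equals() tells the keys apart
        System.out.println(map.put("Aa", 3)); // 1: existing key, old value returned
        System.out.println(map.size());       // 2: replacing a value adds no entry
    }
}
```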

Source of the get method:

public V get(Object key) {
        Node<K,V> e;
        return (e = getNode(hash(key), key)) == null ? null : e.value;
    }
    final Node<K,V> getNode(int hash, Object key) {
        Node<K,V>[] tab; Node<K,V> first, e; int n; K k;
        if ((tab = table) != null && (n = tab.length) > 0 &&
            (first = tab[(n - 1) & hash]) != null) {
            if (first.hash == hash && // always check first node
                ((k = first.key) == key || (key != null && key.equals(k))))
                return first;
            if ((e = first.next) != null) {
                if (first instanceof TreeNode)
                    return ((TreeNode<K,V>)first).getTreeNode(hash, key);
                do {
                    if (e.hash == hash &&
                        ((k = e.key) == key || (key != null && key.equals(k))))
                        return e;
                } while ((e = e.next) != null);
            }
        }
        return null;
    }

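The key and its hash are passed in. The hash selects the bucket, and the keys of the nodes in that bucket are compared with the given key one by one until a node with an equal key is found or null is returned.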

So what does this hash operation do?

static final int hash(Object key) {
        int h;
        return (key == null) ? 0 : (h = key.hashCode()) ^ (h >>> 16);
    }

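It calls the key's hashCode() and XORs the result with its own upper 16 bits, so that the high bits also influence the bucket index. Object.hashCode is a native method, and classes commonly override it.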
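Why the XOR matters shows up when keys differ only in their high bits. The sketch below copies the hash expression quoted above and the (n - 1) & hash index expression from putVal into two helper methods; HashSpreadSketch, spread, and indexFor are hypothetical names, and a table length of 16 is assumed.

```java
// Hypothetical helpers reproducing the two expressions quoted from HashMap.
class HashSpreadSketch {
    // Same expression as HashMap.hash(): fold the upper 16 bits into the lower 16.
    static int spread(Object key) {
        int h;
        return (key == null) ? 0 : (h = key.hashCode()) ^ (h >>> 16);
    }

    // Same expression as in putVal/getNode; tableLength is assumed to be a power of two.
    static int indexFor(int hash, int tableLength) {
        return (tableLength - 1) & hash;
    }

    public static void main(String[] args) {
        Integer a = 0x10000; // hashCode 65536: the low 16 bits are all zero
        Integer b = 0x20000; // hashCode 131072: the low 16 bits are all zero
        // Without spreading, both keys would fall into bucket 0 of a 16-slot table.
        System.out.println(indexFor(a.hashCode(), 16) + " " + indexFor(b.hashCode(), 16)); // 0 0
        // With spreading, the high bits separate them.
        System.out.println(indexFor(spread(a), 16) + " " + indexFor(spread(b), 16));       // 1 2
    }
}
```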

TreeMap

TreeMap keeps the entries of a Map sorted by key. What does its underlying structure look like?

public class TreeMap<K,V>
    extends AbstractMap<K,V>
    implements NavigableMap<K,V>, Cloneable, java.io.Serializable
{
    private final Comparator<? super K> comparator;

    private transient Entry<K,V> root;
}

static final class Entry<K,V> implements Map.Entry<K,V> {
    K key;
    V value;
    Entry<K,V> left;
    Entry<K,V> right;
    Entry<K,V> parent;
    boolean color = BLACK;
}

The TreeMap data structure consists of a comparator and a red-black tree. The comparator may be null, in which case the keys' own compareTo is used.

The put and get methods of TreeMap are shown below:

public V put(K key, V value) {
        Entry<K,V> t = root;
        if (t == null) {
            compare(key, key); // type (and possibly null) check

            root = new Entry<>(key, value, null);
            size = 1;
            modCount++;
            return null;
        }
        int cmp;
        Entry<K,V> parent;
        // split comparator and comparable paths
        Comparator<? super K> cpr = comparator;
        if (cpr != null) {
            do {
                parent = t;
                cmp = cpr.compare(key, t.key);
                if (cmp < 0)
                    t = t.left;
                else if (cmp > 0)
                    t = t.right;
                else
                    return t.setValue(value);
            } while (t != null);
        }
        else {
            if (key == null)
                throw new NullPointerException();
            @SuppressWarnings("unchecked")
                Comparable<? super K> k = (Comparable<? super K>) key;
            do {
                parent = t;
                cmp = k.compareTo(t.key);
                if (cmp < 0)
                    t = t.left;
                else if (cmp > 0)
                    t = t.right;
                else
                    return t.setValue(value);
            } while (t != null);
        }
        Entry<K,V> e = new Entry<>(key, value, parent);
        if (cmp < 0)
            parent.left = e;
        else
            parent.right = e;
        fixAfterInsertion(e);
        size++;
        modCount++;
        return null;
    }

public V get(Object key) {
        Entry<K,V> p = getEntry(key);
        return (p==null ? null : p.value);
    }
final Entry<K,V> getEntry(Object key) {
        // Offload comparator-based version for sake of performance
        if (comparator != null)
            return getEntryUsingComparator(key);
        if (key == null)
            throw new NullPointerException();
        @SuppressWarnings("unchecked")
            Comparable<? super K> k = (Comparable<? super K>) key;
        Entry<K,V> p = root;
        while (p != null) {
            int cmp = k.compareTo(p.key);
            if (cmp < 0)
                p = p.left;
            else if (cmp > 0)
                p = p.right;
            else
                return p;
        }
        return null;
    }
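The behavior visible in put and getEntry can be checked through the public API. In this sketch TreeMapDemo and its two factory methods are illustrative names; the reversed ordering is just one example of a custom comparator.

```java
import java.util.Comparator;
import java.util.TreeMap;

// Hypothetical demo class: natural ordering versus a comparator, plus the null-key check.
class TreeMapDemo {
    private static void fill(TreeMap<String, Integer> map) {
        map.put("banana", 2);
        map.put("apple", 1);
        map.put("cherry", 3);
    }

    // comparator == null: keys are compared with their own compareTo.
    static TreeMap<String, Integer> natural() {
        TreeMap<String, Integer> map = new TreeMap<>();
        fill(map);
        return map;
    }

    // comparator != null: the comparator alone decides the order.
    static TreeMap<String, Integer> reversed() {
        TreeMap<String, Integer> map = new TreeMap<>(Comparator.<String>reverseOrder());
        fill(map);
        return map;
    }

    public static void main(String[] args) {
        System.out.println(natural().firstKey());       // apple
        System.out.println(reversed().firstKey());      // cherry
        System.out.println(natural().put("apple", 10)); // 1: equal key found, old value returned
        try {
            natural().put(null, 0);                     // natural ordering rejects a null key
        } catch (NullPointerException expected) {
            System.out.println("null key rejected");
        }
    }
}
```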