Java集合类中绝对占有一席之地的List，终于彻底把握了

大厂在职_bdZ

于 2025-02-05 21:16:32 发布

阅读量1.6k

点赞数 15

文章标签： java list python

本文链接：https://blog.youkuaiyun.com/2501_90392300/article/details/145461990

版权

查看 ArrayList 的源码可以看到它就是一个简单的数组，用来数据存储。

/**

The array buffer into which the elements of the ArrayList are stored.
The capacity of the ArrayList is the length of this array buffer. Any
empty ArrayList with elementData == DEFAULTCAPACITY_EMPTY_ELEMENTDATA
will be expanded to DEFAULT_CAPACITY when the first element is added.
*/
transient Object[] elementData; // non-private to simplify nested class access

/**

Shared empty array instance used for default sized empty instances. We
distinguish this from EMPTY_ELEMENTDATA to know how much to inflate when
first element is added.
*/
private static final Object[] DEFAULTCAPACITY_EMPTY_ELEMENTDATA = {};

/**

Default initial capacity.
*/
private static final int DEFAULT_CAPACITY = 10;

通过上面的注释了解到，ArrayList 无参构造时是会共享一个长度为 0 的数组 DEFAULTCAPACITY_EMPTY_ELEMENTDATA. 只有当第一个元素添加时才会第一次扩容，这样也防止了创建对象时更多的内存浪费。

ArrayList 扩容机制

我们都知道数组的大小一但确定是不能改变的，那么 ArrayList 明显可以不断的添加元素，它的底层又是数组，它是怎么实现的呢？从上面的 ArrayList 存储结构以及注释中了解到，ArrayList 在初始化时，是共享一个长度为 0 的数组的，当第一个元素添加进来时会进行第一次扩容，我们可以想像出 ArrayList 每当空间不够使用时就会进行一次扩容，那么扩容的机制是什么样子的呢?

依旧从源码开始，追踪 add() 方法的内部实现。

/**

Appends the specified element to the end of this list.
@param e element to be appended to this list
@return true (as specified by {@link Collection#add})
*/
public boolean add(E e) {
ensureCapacityInternal(size + 1); // Increments modCount!!
elementData[size++] = e;
return true;
}
// 开始检查当前插入位置时数组容量是否足够
private void ensureCapacityInternal(int minCapacity) {
// ArrayList 是否未初始化，未初始化是则初始化 ArrayList ，容量给 10.
if (elementData == DEFAULTCAPACITY_EMPTY_ELEMENTDATA) {
minCapacity = Math.max(DEFAULT_CAPACITY, minCapacity);
}
ensureExplicitCapacity(minCapacity);
}
// 比较插入 index 是否大于当前数组长度，大于就 grow 进行扩容
private void ensureExplicitCapacity(int minCapacity) {
modCount++;
// overflow-conscious code
if (minCapacity - elementData.length > 0)
grow(minCapacity);
}

/**

Increases the capacity to ensure that it can hold at least the
number of elements specified by the minimum capacity argument.
@param minCapacity the desired minimum capacity
*/
private void grow(int minCapacity) {
// overflow-conscious code
int oldCapacity = elementData.length;
// 扩容规则是当前容量 + 当前容量右移1位。也就是1.5倍。
int newCapacity = oldCapacity + (oldCapacity >> 1);
if (newCapacity - minCapacity < 0)
newCapacity = minCapacity;
// 是否大于 Int 最大值，也就是容量最大值
if (newCapacity - MAX_ARRAY_SIZE > 0)
newCapacity = hugeCapacity(minCapacity);
// minCapacity is usually close to size, so this is a win:
// 拷贝元素到扩充后的新的 ArrayList
elementData = Arrays.copyOf(elementData, newCapacity);
}

通过源码发现扩容逻辑还是比较简单的，整理下具体的扩容流程如下：

开始检查当前插入位置时数组容量是否足够
ArrayList 是否未初始化，未初始化是则初始化 ArrayList ，容量给 10.
判断当前要插入的下标是否大于容量
不大于，插入新增元素，新增流程完毕。
如果所需的容量大于当前容量，开始扩充。
扩容规则是当前容量 + 当前容量右移1位。也就是1.5倍。

int newCapacity = oldCapacity + (oldCapacity >> 1);

如果扩充之后还是小于需要的最小容量，则把所需最小容量作为容量。
如果容量大于默认最大容量，则使用最大值 Integer 作为容量。
拷贝老数组元素到扩充后的新数组
插入新增元素，新增流程完毕。

ArrayList 数据新增

上面分析扩容时候已经看到了新增一个元素的具体逻辑，因为底层是数组，所以直接指定下标赋值即可，非常简单。

public boolean add(E e) {
ensureCapacityInternal(size + 1); // Increments modCount!!
elementData[size++] = e; // 直接赋值
return true;
}

但是还有一种新增数据得情况，就是新增时指定了要加入的下标位置。这时逻辑有什么不同呢？

/**

Inserts the specified element at the specified position in this
list. Shifts the element currently at that position (if any) and
any subsequent elements to the right (adds one to their indices).
@param index index at which the specified element is to be inserted
@param element element to be inserted
@throws IndexOutOfBoundsException {@inheritDoc}
*/
public void add(int index, E element) {
rangeCheckForAdd(index);
ensureCapacityInternal(size + 1); // Increments modCount!!
// 指定下标开始所有元素后移一位
System.arraycopy(elementData, index, elementData, index + 1,size - index);
elementData[index] = element;
size++;
}

可以发现这种新增多了关键的一行，它的作用是把从要插入的坐标开始的元素都向后移动一位，这样才能给指定下标腾出空间，才可以放入新增的元素。

比如你要在下标为 3 的位置新增数据100，那么下标为3开始的所有元素都需要后移一位。

由此也可以看到 ArrayList 的一个缺点，随机插入新数据时效率不高。

ArrayList 数据获取

数据下标获取元素值，一步到位，不必多言。

public E get(int index) {
rangeCheck(index);
return elementData(index);
}
E elementData(int index) {
return (E) elementData[index];
}

LinkedList

LinkedList 的底层就是一个链表线性结构了，链表除了要有一个节点对象外，根据单向链表和双向链表的不同，还有一个或者两个指针。那么 LinkedList 是单链表还是双向链表呢？

LinkedList 存储结构

依旧深入 LinkedList 源码一探究竟，可以看到 LinkedList 无参构造里没有任何操作，不过我们通过查看变量 first、last 可以发现它们就是存储链表第一个和最后一个的节点。

transient int size = 0;
/**

Pointer to first node.
Invariant: (first == null && last == null) ||

       (first.prev == null && first.item != null)

*/
transient Node first;

/**

Pointer to last node.
Invariant: (first == null && last == null) ||

       (last.next == null && last.item != null)

*/
transient Node last;

/**

Constructs an empty list.
*/
public LinkedList() {
}

变量 first 和 last 都是 Node 类型，继而查看 Node 源码。

private static class Node {
E item;
Node next;
Node prev;

Node(Node prev, E element, Node next) {
this.item = element;
this.next = next;
this.prev = prev;
}
}

可以看到这就是一个典型的双向链表结构，item 用来存放元素值；next 指向下一个 node 节点，prev 指向上一个 node 节点。

LinkedList 数据获取

链表不像数组是连续的内存地址，链表是通过next 和 prev 指向记录链接路径的，所以查找指定位置的 node 只能遍历查找，查看源码也是如此。

public E get(int index) {
checkElementIndex(index);
return node(index).item;
}
/**

Returns the (non-null) Node at the specified element index.
*/
// 遍历查找 index 位置的节点信息
Node node(int index) {
// assert isElementIndex(index);
// 这里判断 index 是在当前链表的前半部分还是后半部分，然后决定是从
// first 向后查找还是从 last 向前查找。
if (index < (size >> 1)) {
Node x = first;
for (int i = 0; i < index; i++)
x = x.next;
return x;
} else {
Node x = last;
for (int i = size - 1; i > index; i–)
x = x.prev;
return x;
}
}

查找指定位置的 node 对象，这个部分要注意的是，查找会首先判断 index 是在当前链表的前半部分还是后半部分，然后决定是从 first 向后查找还是从 last 向前查找。这样可以增加查找速度。从这里也可以看出链表在查找指定位置元素时，效率不高。

LinkedList 数据新增

因为 LinkedList 是链表，所以 LinkedList 的新增也就是链表的数据新增了，这时候要根据要插入的位置的区分操作。

尾部插入

public boolean add(E e) {
linkLast(e);
return true;
}
void linkLast(E e) {
final Node l = last;
// 新节点，prev 为当前尾部节点，e为元素值，next 为 null，
final Node newNode = new Node<>(l, e, null);
last = newNode;
if (l == null)
first = newNode;
else
// 目前的尾部节点 next 指向新的节点
l.next = newNode;
size++;
modCount++;
}

默认的 add 方式就是尾部新增了，尾部新增的逻辑很简单，只需要创建一个新的节点，新节点的 prev 设置现有的末尾节点，现有的末尾 Node 指向新节点 Node，新节点的 next 设为 null 即可。

中间新增

下面是在指定位置新增元素，涉及到的源码部分。

public void add(int index, E element) {
checkPositionIndex(index);
if (index == size)
// 如果位置就是当前链表尾部，直接尾插
linkLast(element);
else
// 获取 index 位置的节点，插入新的元素
linkBefore(element, node(index));
}

/**

Inserts element e before non-null Node succ.
*/
// 在指定节点处新增元素，修改指定元素的下一个节点为新增元素，新增元素的下一个节点是查找到得 node 的next节点指向，
// 新增元素的上一个节点为查找到的 node 节点，查找到的 node 节点的 next 指向 node 的 prev 修改为新 Node
void linkBefore(E e, Node succ) {
// assert succ != null;
final Node pred = succ.prev;
final Node newNode = new Node<>(pred, e, succ);
succ.prev = newNode;
if (pred == null)
first = newNode;
else
pred.next = newNode;
size++;
modCount++;
}

可以看到指定位置插入元素主要分为两个部分，第一个部分是查找 node 节点部分，这部分就是上面介绍的 LinkedList 数据获取部分，

第二个部分是在查找到得 node 对象后插入元素。主要就是修改 node 的 next 指向为新节点，新节点的 prev 指向为查找到的 node 节点，新节点的 next 指向为查找到的 node 节点的 next 指向。查找到的 node 节点的 next 指向的 node 节点的 prev 修改为新节点。

LinkedList 数据删除

依旧查看源码进行分析，源码中看到如果节点是头结点或者尾节点，删除比较简单。我们主要看删除中间一个节点时的操作

public E remove(int index) {
checkElementIndex(index);
return unlink(node(index));
}
/**

Unlinks non-null node x.
*/
E unlink(Node x) {
// assert x != null;
final E element = x.item;
final Node next = x.next;
final Node prev = x.prev;

if (prev == null) {
first = next;
} else {
prev.next = next;
x.prev = null;
}

if (next == null) {
last = prev;
} else {
next.prev = prev;
x.next = null;
}

x.item = null;
size–;
modCount++;
return element;
}

node(index) 方法依旧是二分查找目标位置，然后进行删除操作。比如要删除的节点叫做 X，删除操作主要是修改 X 节点的 prev 节点的 next 指向为 X 节点的 next 指向，修改 X 节点的 next 节点的 prev 指向为 X 节点的 prev 指向，最后把 X 节点的 prev 和 next 指向清空。如果理解起来有点费劲，可以看下面这个图，可能会比较明白。