Digging into ArrayList Internals Through One Interview Question

佩奇的技术笔记

Last modified 2025-03-17 22:43:14

Tags: java, algorithms

First published 2025-03-13 09:20:30

Article link: https://blog.youkuaiyun.com/weixin_44978801/article/details/146222067

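1. A Pointed Interview Question

Interviewer: When an ArrayList grows to 1.5 times its capacity (note: 1.5x, not 2x), what happens if a thread is iterating over the list at that moment? Which key checks are involved in this process?

This question probes ArrayList's growth mechanism, its iterator, and its lack of thread safety. To answer it, we need to start from the core implementation of the dynamic array.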

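2. Core Data Structures and Underlying Implementation

2.1 Basic Container Structure

ArrayList is backed by a dynamic array with a growth mechanism. Its core fields: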

private static final int DEFAULT_CAPACITY = 10;
transient Object[] elementData; // backing array; may start empty or at a caller-specified capacity
private int size; // number of elements actually stored

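2.2 Capacity Allocation Strategy

Initial capacity: 10 by default, or a caller-specified value. In JDK 8 and later, new ArrayList<>() starts with a shared empty array and allocates the 10 slots only on the first add().
Growth rule: when an insert would exceed the current capacity, the capacity grows to 1.5 times its old value (rounded down):

newCapacity = oldCapacity + (oldCapacity >> 1)

Special initialization: new ArrayList<>(Collections.emptyList()) leaves elementData with a capacity of 0.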

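2.3 Memory Layout

Note that the length of elementData is not necessarily equal to the number of elements:

When size() is less than elementData.length, the array has spare slots at its tail.
null is a legal element, so a null read from the list may be a stored value; the unused slots from index size onward are kept null.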
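
The gap between size() and the backing array's length cannot be observed through the public API, but it can be influenced. The sketch below uses only the documented ArrayList methods ensureCapacity and trimToSize; the class name SpareCapacityDemo and the method build are made up for illustration.

```java
import java.util.ArrayList;

final class SpareCapacityDemo {
    // Builds a small list inside a deliberately oversized backing array,
    // stores a null element, then trims and re-reserves capacity.
    static ArrayList<String> build() {
        ArrayList<String> list = new ArrayList<>(100); // 100 slots reserved up front
        list.add("a");
        list.add(null);           // null is a legal element
        list.add("b");
        list.trimToSize();        // shrink the backing array to exactly size()
        list.ensureCapacity(50);  // reserve room again, e.g. before a bulk insert
        return list;
    }

    public static void main(String[] args) {
        ArrayList<String> list = build();
        System.out.println(list.size());         // 3
        System.out.println(list.contains(null)); // true
        System.out.println(list.indexOf(null));  // 1
    }
}
```

Neither call changes size() or the visible contents; both only change how many spare slots sit behind the list.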

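3. Key Operations in Detail

3.1 The add() Method

Execution flow:

Capacity check: growth is triggered when size == elementData.length
Array copy: grow() allocates a larger array and copies the existing elements into it (O(n))
Element write: the new element is stored at elementData[size], then size is incremented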

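Core code (simplified from the JDK 11 source):

public boolean add(E e) {
    modCount++;                  // every structural modification bumps modCount
    add(e, elementData, size);
    return true;
}

private void add(E e, Object[] elementData, int s) {
    if (s == elementData.length) // the backing array is full
        elementData = grow();
    elementData[s] = e;
    size = s + 1;
}

private Object[] grow() {
    // size + 1 is the minimum capacity needed to fit the new element
    return elementData = Arrays.copyOf(elementData,
        newCapacity(size + 1));
}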
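
To see the check-grow-write flow in isolation, here is a minimal sketch of a growable array. MiniList and its capacity() accessor are hypothetical names, not part of the JDK, and the sketch deliberately skips the overflow handling and modCount bookkeeping that the real class performs.

```java
import java.util.Arrays;

final class MiniList<E> {
    private Object[] elementData = new Object[10]; // assumed fixed starting capacity
    private int size;

    public boolean add(E e) {
        if (size == elementData.length) {          // capacity check
            elementData = grow(size + 1);          // array copy into a larger array
        }
        elementData[size++] = e;                   // element write
        return true;
    }

    // Grows by 1.5x, but never below the capacity the caller needs.
    private Object[] grow(int minCapacity) {
        int oldCapacity = elementData.length;
        int newCapacity = Math.max(oldCapacity + (oldCapacity >> 1), minCapacity);
        return Arrays.copyOf(elementData, newCapacity);
    }

    @SuppressWarnings("unchecked")
    public E get(int index) {
        if (index < 0 || index >= size) {
            throw new IndexOutOfBoundsException("Index: " + index + ", Size: " + size);
        }
        return (E) elementData[index];
    }

    public int size() { return size; }

    public int capacity() { return elementData.length; }

    public static void main(String[] args) {
        MiniList<Integer> list = new MiniList<>();
        for (int i = 0; i < 10; i++) list.add(i);
        System.out.println(list.capacity()); // 10: everything still fits
        list.add(10);
        System.out.println(list.capacity()); // 15: the 11th add grew the array
    }
}
```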

3.2 The get() Method

public E get(int index) {
    rangeCheck(index);
    return (E) elementData[index];
}

O(1) time: direct index access into the backing array
Bounds check: an invalid index throws IndexOutOfBoundsException

3.3 The remove() Method

public E remove(int index) {
    rangeCheck(index);
    modCount++;
    Object[] elementData = this.elementData;
    E oldValue = (E) elementData[index];
    // shift the elements after index one slot to the left
    int numMoved = size - index - 1;
    if (numMoved > 0)
        System.arraycopy(elementData, index+1, 
        elementData, index, numMoved);
    elementData[--size] = null;
    return oldValue;
}

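Key points:

O(n) time: every element after index is shifted left, which makes removal near the head expensive
Memory release: the vacated last slot is set to null so the GC can reclaim the removed object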
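
The shift-and-clear step can be tried on a plain array. This is a sketch under the assumption that the caller tracks size separately; RemoveShiftDemo and removeAt are illustrative names only.

```java
import java.util.Arrays;

final class RemoveShiftDemo {
    // Removes data[index] from the first `size` slots and returns the new size.
    static int removeAt(Object[] data, int size, int index) {
        if (index < 0 || index >= size) {
            throw new IndexOutOfBoundsException("Index: " + index + ", Size: " + size);
        }
        int numMoved = size - index - 1;           // elements to the right of index
        if (numMoved > 0) {
            System.arraycopy(data, index + 1, data, index, numMoved);
        }
        data[size - 1] = null;                     // clear the now-duplicated tail slot
        return size - 1;
    }

    public static void main(String[] args) {
        Object[] data = {"a", "b", "c", "d", null, null};
        int size = removeAt(data, 4, 1);
        System.out.println(size + " " + Arrays.toString(data));
        // prints: 3 [a, c, d, null, null, null]
    }
}
```

Removing the last element moves nothing (numMoved is 0), which is why removal from the tail is cheap while removal from the head is not.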

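4. The Growth Mechanism in Depth

The flow when an insert triggers growth:

Threshold check: if (size == elementData.length)

Compute the new capacity:

int newCapacity = oldCapacity + (oldCapacity >> 1);
// 1.5x growth; JDK 6 used (oldCapacity * 3) / 2 + 1 instead

Capacity bounds: if the computation overflows int or exceeds the maximum array size, the capacity is clamped; an OutOfMemoryError is thrown when even the minimum required capacity cannot be satisfied
Array copy: Arrays.copyOf delegates to System.arraycopy, a native method that HotSpot treats as an intrinsic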

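Typical Growth Sequence

Growth steps starting from an initial capacity of 10:

10 → 15 → 22 → 33 → 49 → ... → near the limit, the capacity is clamped to the maximum array size instead of growing by 1.5x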
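
The sequence is easy to reproduce by applying the formula repeatedly. The sketch below models only oldCapacity + (oldCapacity >> 1); it ignores the minimum-capacity floor and the clamping near Integer.MAX_VALUE. GrowthSequence and capacities are names invented for this example.

```java
import java.util.ArrayList;
import java.util.List;

final class GrowthSequence {
    // Returns the starting capacity followed by `steps` successive 1.5x growths.
    static List<Integer> capacities(int initial, int steps) {
        List<Integer> result = new ArrayList<>();
        int capacity = initial;
        result.add(capacity);
        for (int i = 0; i < steps; i++) {
            capacity = capacity + (capacity >> 1); // the shift rounds the half down
            result.add(capacity);
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(capacities(10, 6));
        // prints: [10, 15, 22, 33, 49, 73, 109]
    }
}
```

Because the shift rounds down, odd capacities grow by slightly less than 1.5x (15 becomes 22, 33 becomes 49).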

5. Iteration and the fail-fast Mechanism

public Iterator<E> iterator() {
    return new Itr();
}

private class Itr implements Iterator<E> {
    private int cursor;     // index of the next element to return
    private int lastRet = -1;
    private int expectedModCount = modCount;

    public boolean hasNext() {
        return cursor != size;
    }

    public E next() {
        checkForComodification();
        int i = cursor;
        if (i >= size)
            throw new NoSuchElementException();
        Object[] elementData = ArrayList.this.elementData;
        if (i >= elementData.length)
            throw new ConcurrentModificationException();
        E next = (E) elementData[lastRet = i];
        cursor = i + 1;
        return next;
    }

    final void checkForComodification() {
        if (modCount != expectedModCount)
            throw new ConcurrentModificationException();
    }
}

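Key logic:

Consistency check: modCount is compared with expectedModCount on every next()
Not thread-safe: any structural modification made outside the iterator (by another thread or by the iterating thread itself) makes the following next() call throw ConcurrentModificationException; across threads this detection is best-effort only
Index bounds protection: the i >= size check guards against reading past the last element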
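
A single thread is enough to trigger the check. The sketch below contrasts removing through the list (which desynchronizes modCount and expectedModCount) with removing through the iterator (which resynchronizes them). FailFastDemo and its two methods are illustrative names.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.ConcurrentModificationException;
import java.util.Iterator;
import java.util.List;

final class FailFastDemo {
    // Returns true if removing through the list inside a for-each loop throws CME.
    static boolean removeViaListThrows() {
        List<Integer> list = new ArrayList<>(Arrays.asList(1, 2, 3, 4, 5));
        try {
            for (Integer n : list) {
                if (n == 2) {
                    list.remove(n);   // remove(Object): bumps modCount behind the iterator
                }
            }
            return false;
        } catch (ConcurrentModificationException e) {
            return true;              // thrown by the next() call after the removal
        }
    }

    // Removes even numbers through the iterator, which keeps expectedModCount in sync.
    static List<Integer> removeViaIterator() {
        List<Integer> list = new ArrayList<>(Arrays.asList(1, 2, 3, 4, 5));
        Iterator<Integer> it = list.iterator();
        while (it.hasNext()) {
            if (it.next() % 2 == 0) {
                it.remove();
            }
        }
        return list;
    }

    public static void main(String[] args) {
        System.out.println(removeViaListThrows()); // true
        System.out.println(removeViaIterator());   // [1, 3, 5]
    }
}
```

One caveat: the exception surfaces in next(), not at the moment of modification. If the removal happens to leave cursor equal to size, hasNext() returns false and the loop ends without an exception, so fail-fast should be treated as a debugging aid rather than a guarantee.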

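6. Common Interview Questions and Design Pitfalls

Back to the opening question. Iterator state when growth happens:

Growth trigger: with a capacity of 10, inserting the 11th element grows the array to 15
Iterator state:
- add() increments modCount whether or not the array grows, so expectedModCount no longer matches and the following next() call throws ConcurrentModificationException (CME)
- The iterator re-reads ArrayList.this.elementData on each next(), so it never holds on to the old array, but the modCount check fails before any element added after the growth can be returned
Threading flaw: modCount is a plain int with no synchronization, so under concurrent modification the fail-fast check can miss changes and must not be relied on for correctness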
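
The claim that growth itself is not the trigger can be checked in a single thread: appending during iteration fails the same way whether or not the backing array is reallocated. The sketch also shows one thread-safe alternative, CopyOnWriteArrayList, whose iterator works on a snapshot. AddDuringIterationDemo, addThrows and snapshotCount are names made up for this example.

```java
import java.util.ArrayList;
import java.util.ConcurrentModificationException;
import java.util.Iterator;
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

final class AddDuringIterationDemo {
    // Fills a list with `count` elements (count >= 2), starts iterating,
    // appends one element, and reports whether the following next() throws CME.
    static boolean addThrows(int initialCapacity, int count) {
        List<Integer> list = new ArrayList<>(initialCapacity);
        for (int i = 0; i < count; i++) list.add(i);
        Iterator<Integer> it = list.iterator();
        it.next();
        list.add(-1);                 // grows the array only if count == initialCapacity
        try {
            it.next();
            return false;
        } catch (ConcurrentModificationException e) {
            return true;
        }
    }

    // Iterates a CopyOnWriteArrayList while appending; the iterator sees a snapshot.
    static int snapshotCount() {
        List<Integer> list = new CopyOnWriteArrayList<>();
        for (int i = 0; i < 10; i++) list.add(i);
        int seen = 0;
        for (Integer n : list) {
            if (n == 0) list.add(-1); // allowed: the write goes to a fresh copy of the array
            seen++;
        }
        return seen;
    }

    public static void main(String[] args) {
        System.out.println(addThrows(10, 10)); // true: the add grew the array
        System.out.println(addThrows(10, 5));  // true: no growth, same exception
        System.out.println(snapshotCount());   // 10: the appended element is not visited
    }
}
```

CopyOnWriteArrayList copies the whole array on every write, so it fits read-heavy workloads; Collections.synchronizedList with manual locking around iteration is the usual choice when writes are frequent.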

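7. Design Philosophy

Space for time: a contiguous pre-allocated array makes get() O(1)
Growth trade-off: the 1.5x factor balances memory overhead against resize frequency (HashMap, by comparison, doubles)
Fail fast: throwing an exception early avoids silent data corruption
Best suited for: workloads dominated by reads of existing elements (cache lookups, traversal)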

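Core Operations at a Glance

Method	Time complexity	Key point
get()	O(1)	Direct index access
add()	Amortized O(1)	One resize is paid for by many cheap appends
remove(int)	O(n)	Elements are shifted left
iterator()	O(1)	The iterator snapshots modCount when it is created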
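
The amortized O(1) entry for add() can be made concrete by counting element copies. The sketch below simulates appends under the 1.5x rule from an assumed starting capacity of 10 and tallies how many elements each resize has to copy; AmortizedCopyCount and copiesFor are hypothetical names, and no real array is allocated.

```java
final class AmortizedCopyCount {
    // Counts the element copies caused by n appends under 1.5x growth from capacity 10.
    static long copiesFor(int n) {
        int capacity = 10;
        long copies = 0;
        for (int size = 0; size < n; size++) {
            if (size == capacity) {
                copies += size;                        // a resize copies every live element
                capacity = capacity + (capacity >> 1); // 1.5x growth, rounded down
            }
        }
        return copies;
    }

    public static void main(String[] args) {
        int n = 1_000_000;
        long copies = copiesFor(n);
        // The copy sizes form a roughly geometric series with ratio 1.5,
        // so the total stays below 3n and the per-append average is a small constant.
        System.out.println(copies < 3L * n);           // true
        System.out.println((double) copies / n);       // average copies per append
    }
}
```

A larger growth factor would lower the copy count at the cost of more unused slots, which is the trade-off the 1.5x factor is meant to balance.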