Java集合框架中List接口有哪些实现？

作者：徐梦旗，发布于：2023年09月25日 20:00，字数：2.7k，预计阅读：11分钟

> CollectionsFramework #源码

1. List详解

1.1. List接口有哪些实现？

Java平台提供了两种常规用途的List接口的实现，分别是ArrayList和LinkedList。

在一般场景下，ArrayList有着更好的性能。
在特定场景下，LinkedList有着更好的性能。

1.2. List接口的实现有什么不同？

	ArrayList	LinkedList	Vector
底层数据结构	数组	链表	数组
初始容量	首次扩容时为10	/	10
扩容	1.5倍	/	2倍
是否支持随机访问	是	否	是
是否线程安全的	不安全（fast-fail）	不安全（fast-fail）	安全（内置锁）
添加新元素	O(1)	O(1)	O(n)
指定下标添加新元素	O(n)	O(n)	O(n)
删除指定下标的元素	O(n)	O(n)	O(n)
修改指定下标的元素	O(1)	O(n)	O(1)
访问指定下标的元素	O(1)	O(n)	O(1)

2. ArrayList详解

2.1. ArrayList和数组有什么区别？

大小不同：数组的大小是固定的，在声明的时候就需要指定；ArrayList的大小会按需动态调整。
存储元素的类型不同：数组可以存储基本类型和引用类型的元素；ArrayList只能存储引用类型的元素。
操作方法不同：ArrayList提供了一系列封装的方法，比直接操作数组更易用。

2.2. ArrayList是如何设计的？

继承关系：ArrayList实现了List接口和RandomAccess接口，支持随机访问。
底层数据结构：ArrayList基于动态数组，当容量不足时会进行扩容。
存储元素上：ArrayList允许存储null元素，以及存储重复的元素。
线程安全上：ArrayList是线程不安全的，采用fast-fail机制。

2.3. 什么是fast-fail机制？

fast-fail机制是指在列表的迭代器被创建后，如果修改了列表（不包括迭代器本身的remove和add操作），在执行迭代器的操作时会抛出ConcurrentModificationException异常。设计意图是当存在并发修改时，通过快速失败机制来避免后续的并发风险，起到提醒开发人员的作用。

final void checkForComodification() {
	if (modCount != expectedModCount)
		throw new ConcurrentModificationException();
}

实现原理是当每次对列表进行修改时，会递增列表的修改次数modCount，创建迭代器时则会记录当前列表的修改次数expectedModCount，在迭代器的每次操作前都会检查modCount是否和expectedModCount一致，如果不一致则代表创建迭代器后对列表进行了修改，故抛出并发修改异常。

2.4. ArrayList是如何扩容的？

在添加新元素前，如果所需最小容量大于当前数组的大小则进行扩容，否者直接添加新元素；
如果是无参构造器创建的ArrayList，首次扩容时，所需最小容量为10；
计算当前数组的大小的1.5倍作为新容量；如果新容量仍小于所需最小容量则取所需最小容量作为新容量；
根据新容量创建新数组并复制原数组中的元素，而后添加新元素。

public boolean add(E e) {
	// 1. 添加新元素前，如果有必要则进行扩容
	ensureCapacityInternal(size + 1);  // Increments modCount!!
	elementData[size++] = e;
	return true;
}


private static int calculateCapacity(Object[] elementData, int minCapacity) {
	// 2. 如果是无参构造器创建的ArrayList，首次扩容时，所需最小容量为10
	if (elementData == DEFAULTCAPACITY_EMPTY_ELEMENTDATA) {
		return Math.max(DEFAULT_CAPACITY, minCapacity);
	}
	return minCapacity;
}

private void ensureCapacityInternal(int minCapacity) {
	ensureExplicitCapacity(calculateCapacity(elementData, minCapacity));
}

private void ensureExplicitCapacity(int minCapacity) {
	modCount++;

	// 3. 如果所需最小容量大于当前数组的大小则进行扩容
	// overflow-conscious code
	if (minCapacity - elementData.length > 0)
		grow(minCapacity);
}


private void grow(int minCapacity) {
	// overflow-conscious code
	int oldCapacity = elementData.length;
	// 4. 计算当前数组的大小的1.5倍作为新容量
	int newCapacity = oldCapacity + (oldCapacity >> 1);
	// 5. 如果新容量仍小于所需最小容量则取所需最小容量作为新容量
	if (newCapacity - minCapacity < 0)
		newCapacity = minCapacity;
	if (newCapacity - MAX_ARRAY_SIZE > 0)
		// 在最小容量不大于MAX_ARRAY_SIZE时，限制新容量不大于MAX_ARRAY_SIZE，避免虚拟机无法创建数组
		newCapacity = hugeCapacity(minCapacity);
	// minCapacity is usually close to size, so this is a win:
	// 6. 根据新容量创建新数组并复制原数组中的元素 
	elementData = Arrays.copyOf(elementData, newCapacity);
}

private static int hugeCapacity(int minCapacity) {
	if (minCapacity < 0) // overflow
		throw new OutOfMemoryError();
	return (minCapacity > MAX_ARRAY_SIZE) ?
		Integer.MAX_VALUE :
		MAX_ARRAY_SIZE;
}

3. LinkedList详解

3.1. LinkedList是如何设计的？

继承关系：LinkedList实现了List接口和Deque接口。
底层数据结构：LinkedList基于双向链表，无需扩容。
存储元素：LinkedList允许存储null元素，以及存储重复的元素。
线程安全：LinkedList是线程不安全的，采用fast-fail机制。

3.2. ArrayList和LinkedList有什么区别？

底层使用的数据结构不同：ArrayList基于动态数组，而LinkedList基于双向链表。
占用的内存大小不同：相比于ArrayList，LinkedList需要额外存储链表节点，会消耗更多的内存空间。
操作的性能不同：ArrayList是常量级访问，线性级修改，LinkedList是线性级访问，常量级修改。
实现的接口不同：LinkedList额外实现了Deque接口，可以当做双向队列使用。

3.3. 什么时候选用LinkedList？

官方建议在选择使用LinkedList前，应该同时测试ArrayList和LinkedList的性能，来验证是否真的有必要选用LinkedList，通常来说ArrayList拥有更好的性能。线程不安全的Deque推荐使用ArrayDeque，线程安全的Deque推荐使用ConcurrentLinkedDeque。

4. Vector详解

4.1. Vector是如何设计的？

继承关系：Vector实现了List接口和RandomAccess接口。
底层数据结构：Vector基于动态数组，当容量不足时会自动扩容。
存储元素：Vector允许存储null元素，以及存储重复的元素。
线程安全：Vector是线程安全的，采用synchronized修饰了所有方法。

4.2. ArrayList和Vector有什么区别？

Vector可以理解为线程安全的ArrayList。线程不安全的List推荐使用ArrayList，线程安全的List推荐使用CopyOnWriteArrayList。

扩容的大小不同：ArrayList扩容为原数组大小的1.5倍，Vector扩容为原数组的2倍。
线程安全性不同：ArrayList是线程不安全的，采用fast-fail机制，Vector是线程安全的，使用了内置锁。
操作的性能不同：Vector中的方法需要加锁，性能要比ArrayList差。

4.3. Stack底层是如何实现的？

Stack继承了Vector，是线程安全的，可以当做堆栈使用。

5. ArrayList和LinkedList的操作时间复杂度

5.1. 添加新元素

ArrayList#add(E)

当需要扩容时，由于需要复制数组，时间复杂度为O(n)。
当不需要扩容时，直接通过下标修改数组，时间复杂度为O(1)。
平均时间复杂度为O(n)。

public boolean add(E e) {
	// 需要扩容时，会发生数组复制
	ensureCapacityInternal(size + 1);  // Increments modCount!!
	elementData[size++] = e;
	return true;
}

LinkedList#add(E)

直接在尾结点后添加新元素，时间复杂度为O(1)。
平均时间复杂度为O(1)。

public boolean add(E e) {
    linkLast(e);
    return true;
}

5.2. 指定下标添加新元素

ArrayList#add(int, E)

由于数组移动的存在，时间复杂度为O(n)。
平均复制度为O(n)。

public void add(int index, E element) {
    rangeCheckForAdd(index);

    ensureCapacityInternal(size + 1);  // Increments modCount!!
    System.arraycopy(elementData, index, elementData, index + 1,
                     size - index);
    elementData[index] = element;
    size++;
}

LinkedList#add(int, E)

当添加的新元素在头尾节点时，时间复杂度为O(1)。
当添加的新元素不在头尾节点时，由于需要遍历链表找到插入位置，时间复杂度为O(n)。
平均时间复杂度为O(n)。

public void add(int index, E element) {
	checkPositionIndex(index);

	// 尾结点添加
	if (index == size)
		linkLast(element);
	// 非尾结点添加元素，需要遍历链表找到插入位置
	else
		linkBefore(element, node(index));
}

5.3. 删除指定下标的元素

ArrayList#remove(int)

当删除的是最后一个元素时，时间复杂度为O(1)。
当删除的不是最后一个元素时，由于需要移动数组，时间复杂度为O(n)。
平均时间复杂度为O(n)。

public E remove(int index) {
	rangeCheck(index);

	modCount++;
	E oldValue = elementData(index);
	// 有且仅有删除最后一个元素时（index为size-1），不满足以下条件
	int numMoved = size - index - 1;
	if (numMoved > 0)
		// 数组移动
		System.arraycopy(elementData, index+1, elementData, index,
						 numMoved);
	elementData[--size] = null; // clear to let GC do its work

	return oldValue;
}

LinkedList#remove(int)

当删除的元素在头节点时，时间复杂度为O(1)。
当删除的元素不在头节点时，由于需要遍历链表找到删除位置，时间复杂度为O(n)。
平均时间复杂度为O(n)。

public E remove(int index) {
	checkElementIndex(index);
	return unlink(node(index));
}

5.4. 修改指定下标的元素

ArrayList#set(int, E)

通过指定下标修改数组中的元素，是随机访问，时间复杂度为O(1)。
平均时间复杂度为O(n)；

public E set(int index, E element) {
    rangeCheck(index);

    E oldValue = elementData(index);
    elementData[index] = element;
    return oldValue;
}

LinkedList#set(int, E)

当要修改的元素在头尾节点时，时间复杂度为O(1)。
当要修改的元素不在头尾节点时，由于需要遍历链表，时间复杂度为O(n)。
平均时间复杂度为O(n)。

public E set(int index, E element) {
    checkElementIndex(index);
    Node<E> x = node(index);
    E oldVal = x.item;
    x.item = element;
    return oldVal;
}

5.5. 访问指定下标的元素

ArrayList#get

通过指定下标访问数组中的元素，是随机访问，时间复杂度为O(1)。
平均时间复杂度为O(1)。

public E get(int index) {
	rangeCheck(index);
	// 下标访问数组
	return elementData(index);
}

LinkedList#get

当要访问的元素在头尾节点时，时间复杂度为O(1)。
当要访问的元素不在头尾节点时，由于需要遍历链表，时间复杂度为O(n)。
平均时间复杂度为O(n)。

public E get(int index) {
	checkElementIndex(index);
	// 遍历链表
	return node(index).item;
}

Node<E> node(int index) {
	// assert isElementIndex(index);
	// 遍历一半的链表
	if (index < (size >> 1)) {
		Node<E> x = first;
		for (int i = 0; i < index; i++)
			x = x.next;
		return x;
	} else {
		Node<E> x = last;
		for (int i = size - 1; i > index; i--)
			x = x.prev;
		return x;
	}
}

6. 参考文档

上一篇：记录一次服务器被植入木马的排查过程下一篇：虚拟机如何调优及案例分析？

Mengqi Xu's Blog