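Nothing in this world is too late to learn, as long as you are willing to put your heart into it. Please never stop; become the person you want to be.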
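Preface
As mentioned in learn from collection framework design, the collection framework has two parts, Collection and Map. Collection is further divided into three kinds: List, Set, and Queue. This article starts by analyzing the implementation of ArrayList.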
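ArrayList inheritance hierarchy
As the class hierarchy diagram shows, ArrayList implements the RandomAccess (random access), Cloneable (can be cloned), and Serializable (supports serialization and deserialization) interfaces as well as the List interface, and it extends AbstractList, the abstract skeleton class for List.
The first three interfaces are all marker interfaces: they declare no methods for the implementing class to implement.
Let's go straight to some of ArrayList's internal implementation mechanisms.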
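Internal implementation
Data structure
Internally it maintains an array of type Object, the elementData member variable; the member variable size records the number of elements in the list.
Initialization
ArrayList has three constructor overloads:
Option 1: initialize the ArrayList with a given initial capacity.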
/**
* Constructs an empty list with the specified initial capacity.
*
* @param initialCapacity the initial capacity of the list
* @throws IllegalArgumentException if the specified initial capacity
* is negative
*/
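public ArrayList(int initialCapacity) {
    if (initialCapacity > 0) { // allocate the backing array with the requested capacity
        this.elementData = new Object[initialCapacity];
    } else if (initialCapacity == 0) {
        this.elementData = EMPTY_ELEMENTDATA; // a shared empty array of length 0, i.e. {}
    } else { // an array length cannot be negative, so throw
        throw new IllegalArgumentException("Illegal Capacity: "+
                                           initialCapacity);
    }
}
Option 2: use the default capacity. The backing array starts out with length 0 and is only grown to the default capacity of 10 when the first element is added.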
/**
* Constructs an empty list with an initial capacity of ten.
*/
public ArrayList() {
    this.elementData = DEFAULTCAPACITY_EMPTY_ELEMENTDATA; // a shared empty array of length 0
}
Option 3: build the ArrayList from a given collection.
/**
* Constructs a list containing the elements of the specified
* collection, in the order they are returned by the collection's
* iterator.
*
* @param c the collection whose elements are to be placed into this list
* @throws NullPointerException if the specified collection is null
*/
public ArrayList(Collection<? extends E> c) {
    elementData = c.toArray(); // note: toArray() builds a new array and copies the elements into it, which costs extra time and memory; a further copy happens below only when that array is not an Object[]
    if ((size = elementData.length) != 0) { // the collection contains elements
        // c.toArray might (incorrectly) not return Object[] (see 6260652)
        if (elementData.getClass() != Object[].class)
            elementData = Arrays.copyOf(elementData, size, Object[].class);
    } else { // the collection is empty: use the predefined internal array of length 0
        // replace with empty array.
        this.elementData = EMPTY_ELEMENTDATA;
    }
}
Backing array growth mechanism
java.util.ArrayList#ensureCapacityInternal is the private method dedicated to growing the backing array:
private void ensureCapacityInternal(int minCapacity) {
    ensureExplicitCapacity(calculateCapacity(elementData, minCapacity));
}
There are two steps: computing the required capacity, and growing the array.
Computing the required capacity
The source of calculateCapacity is as follows:
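private static int calculateCapacity(Object[] elementData, int minCapacity) {
    if (elementData == DEFAULTCAPACITY_EMPTY_ELEMENTDATA) {
        return Math.max(DEFAULT_CAPACITY, minCapacity); // still the default empty array: the first growth goes to max(10, required minimum capacity)
    }
    return minCapacity;
}
Growing the array
private void ensureExplicitCapacity(int minCapacity) {
    modCount++; // counts structural modifications; incremented even when no growth is needed
    // overflow-conscious code
    if (minCapacity - elementData.length > 0)
        grow(minCapacity);
}
Why use subtraction here instead of comparing directly?
Because minCapacity is computed as the current size plus the number of elements to insert, and that addition can overflow into a negative number. A direct comparison (minCapacity > elementData.length) would then be false, and the growth would be skipped silently. With subtraction, the difference wraps back around to a positive value, so grow is still called and hugeCapacity can report the overflow by throwing OutOfMemoryError.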
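The difference between the two comparisons can be checked with plain int arithmetic. The following is only a sketch: the class and method names (OverflowCheckDemo, needsGrowDirect, needsGrowSubtract) and the sample numbers are hypothetical and are not part of ArrayList.

```java
final class OverflowCheckDemo {
    // Direct comparison: a wrapped-around (negative) minCapacity looks "small enough".
    static boolean needsGrowDirect(int minCapacity, int currentLength) {
        return minCapacity > currentLength;
    }

    // Subtraction form, as in ensureExplicitCapacity: the difference wraps back to a positive value.
    static boolean needsGrowSubtract(int minCapacity, int currentLength) {
        return minCapacity - currentLength > 0;
    }

    public static void main(String[] args) {
        int size = Integer.MAX_VALUE - 5;  // hypothetical, very large list
        int numNew = 10;                   // hypothetical number of elements to add
        int minCapacity = size + numNew;   // the addition overflows to a negative int

        System.out.println(minCapacity < 0);                      // true
        System.out.println(needsGrowDirect(minCapacity, size));   // false: growth would be skipped
        System.out.println(needsGrowSubtract(minCapacity, size)); // true: grow() would still run
    }
}
```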
The grow method is as follows:
/**
* Increases the capacity to ensure that it can hold at least the
* number of elements specified by the minimum capacity argument.
*
* @param minCapacity the desired minimum capacity
*/
private void grow(int minCapacity) {
    // overflow-conscious code
    int oldCapacity = elementData.length;
    int newCapacity = oldCapacity + (oldCapacity >> 1);
    if (newCapacity - minCapacity < 0)
        newCapacity = minCapacity;
    if (newCapacity - MAX_ARRAY_SIZE > 0) // subtraction again, because the new array length may have overflowed
        newCapacity = hugeCapacity(minCapacity);
    // minCapacity is usually close to size, so this is a win:
    elementData = Arrays.copyOf(elementData, newCapacity);
}
The source of hugeCapacity is as follows:
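private static int hugeCapacity(int minCapacity) {
    if (minCapacity < 0) // overflow
        throw new OutOfMemoryError();
    return (minCapacity > MAX_ARRAY_SIZE) ?
        Integer.MAX_VALUE :
        MAX_ARRAY_SIZE;
}
The growth factor is 1.5, and the maximum array length is normally MAX_ARRAY_SIZE, i.e. Integer.MAX_VALUE - 8. This value is chosen because some JVMs reserve a few header words in an array, so attempting to allocate a larger array may fail with an OutOfMemoryError.
In other words, if the current array cannot hold the new elements, it grows to 1.5 times its length (or straight to minCapacity if 1.5 times is still too small). The capacity is normally capped at Integer.MAX_VALUE - 8; only when minCapacity itself exceeds that cap does hugeCapacity return Integer.MAX_VALUE.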
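To make the 1.5x rule concrete, the sketch below replays the arithmetic of calculateCapacity, ensureExplicitCapacity, and grow on a plain int, without touching a real ArrayList. The class GrowthSequenceDemo and its methods are hypothetical helpers written for this illustration, and the hugeCapacity branch is deliberately left out.

```java
import java.util.ArrayList;
import java.util.List;

final class GrowthSequenceDemo {
    static final int DEFAULT_CAPACITY = 10;

    // Same arithmetic as grow() for ordinary capacities (no hugeCapacity handling).
    static int nextCapacity(int oldCapacity, int minCapacity) {
        int newCapacity = oldCapacity + (oldCapacity >> 1);
        return (newCapacity - minCapacity < 0) ? minCapacity : newCapacity;
    }

    // Capacities a default-constructed list would pass through
    // while n elements are appended one at a time.
    static List<Integer> capacitiesFor(int n) {
        List<Integer> seen = new ArrayList<>();
        int capacity = 0; // the default constructor starts with an empty array
        for (int size = 0; size < n; size++) {
            int minCapacity = (capacity == 0)
                    ? Math.max(DEFAULT_CAPACITY, size + 1)
                    : size + 1;
            if (minCapacity - capacity > 0) {
                capacity = nextCapacity(capacity, minCapacity);
                seen.add(capacity);
            }
        }
        return seen;
    }

    public static void main(String[] args) {
        System.out.println(capacitiesFor(50)); // prints [10, 15, 22, 33, 49, 73]
    }
}
```

Because oldCapacity >> 1 rounds down, the factor is only approximately 1.5 (15 grows to 22, not 22.5).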
Inserting a single element
There are two ways:
Option 1: append at the end by default:
/**
* Appends the specified element to the end of this list.
*
* @param e element to be appended to this list
* @return <tt>true</tt> (as specified by {@link Collection#add})
*/
public boolean add(E e) {
    ensureCapacityInternal(size + 1);  // Increments modCount!!
    elementData[size++] = e;
    return true;
}
Option 2: insert the element at a given position:
/**
* Inserts the specified element at the specified position in this
* list. Shifts the element currently at that position (if any) and
* any subsequent elements to the right (adds one to their indices).
*
* @param index index at which the specified element is to be inserted
* @param element element to be inserted
* @throws IndexOutOfBoundsException {@inheritDoc}
*/
public void add(int index, E element) {
    rangeCheckForAdd(index); // note: the index is validated against the list's size, not against the capacity of the backing array!
    ensureCapacityInternal(size + 1);  // Increments modCount!!
    // shift everything from index onward (inclusive) one slot to the right
    System.arraycopy(elementData, index, elementData, index + 1,
                     size - index);
    elementData[index] = element;
    size++;
}
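The shift performed by System.arraycopy can be tried on a plain array. This is a minimal sketch that assumes the array already has spare capacity; InsertShiftDemo and its insert method are hypothetical names, not ArrayList code.

```java
import java.util.Arrays;

final class InsertShiftDemo {
    // Inserts value at index into a partially filled array and returns the new size.
    // Assumes data.length > size, i.e. capacity has already been ensured.
    static int insert(Object[] data, int size, int index, Object value) {
        if (index < 0 || index > size)
            throw new IndexOutOfBoundsException("Index: " + index + ", Size: " + size);
        // move elements [index, size) one slot to the right
        System.arraycopy(data, index, data, index + 1, size - index);
        data[index] = value;
        return size + 1;
    }

    public static void main(String[] args) {
        Object[] data = {"a", "b", "d", null}; // size 3, capacity 4
        int size = insert(data, 3, 2, "c");
        System.out.println(size + " " + Arrays.toString(data)); // prints 4 [a, b, c, d]
    }
}
```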
Inserting multiple elements
There are also two ways.
Option 1: append at the end:
/**
* Appends all of the elements in the specified collection to the end of
* this list, in the order that they are returned by the
* specified collection's Iterator. The behavior of this operation is
* undefined if the specified collection is modified while the operation
* is in progress. (This implies that the behavior of this call is
* undefined if the specified collection is this list, and this
* list is nonempty.)
*
* @param c collection containing elements to be added to this list
* @return <tt>true</tt> if this list changed as a result of the call
* @throws NullPointerException if the specified collection is null
*/
public boolean addAll(Collection<? extends E> c) {
    Object[] a = c.toArray();
    int numNew = a.length;
    ensureCapacityInternal(size + numNew);  // Increments modCount
    System.arraycopy(a, 0, elementData, size, numNew);
    size += numNew;
    return numNew != 0;
}
Option 2: insert at a given position:
/**
* Inserts all of the elements in the specified collection into this
* list, starting at the specified position. Shifts the element
* currently at that position (if any) and any subsequent elements to
* the right (increases their indices). The new elements will appear
* in the list in the order that they are returned by the
* specified collection's iterator.
*
* @param index index at which to insert the first element from the
* specified collection
* @param c collection containing elements to be added to this list
* @return <tt>true</tt> if this list changed as a result of the call
* @throws IndexOutOfBoundsException {@inheritDoc}
* @throws NullPointerException if the specified collection is null
*/
public boolean addAll(int index, Collection<? extends E> c) {
    rangeCheckForAdd(index);
    Object[] a = c.toArray();
    int numNew = a.length;
    ensureCapacityInternal(size + numNew);  // Increments modCount
    int numMoved = size - index; // number of elements from index onward (inclusive) that must shift right to make room
    if (numMoved > 0)
        System.arraycopy(elementData, index, elementData, index + numNew,
                         numMoved);
    System.arraycopy(a, 0, elementData, index, numNew);
    size += numNew;
    return numNew != 0;
}
Removing a single element
There are two main ways:
Option 1: remove the element at a given index:
/**
* Removes the element at the specified position in this list.
* Shifts any subsequent elements to the left (subtracts one from their
* indices).
*
* @param index the index of the element to be removed
* @return the element that was removed from the list
* @throws IndexOutOfBoundsException {@inheritDoc}
*/
public E remove(int index) {
    rangeCheck(index); // validate the index against size, the number of stored elements
    modCount++;
    E oldValue = elementData(index); // fetch the element at the given index
    int numMoved = size - index - 1; // number of elements that need to move
    if (numMoved > 0) // shift every element after index one slot to the left
        System.arraycopy(elementData, index+1, elementData, index,
                         numMoved);
    elementData[--size] = null; // clear to let GC do its work: null the vacated slot and decrement the list's size
    return oldValue;
}
Option 2: remove the first (leftmost) occurrence of a given element:
/**
* Removes the first occurrence of the specified element from this list,
* if it is present. If the list does not contain the element, it is
* unchanged. More formally, removes the element with the lowest index
* <tt>i</tt> such that
* <tt>(o==null ? get(i)==null : o.equals(get(i)))</tt>
* (if such an element exists). Returns <tt>true</tt> if this list
* contained the specified element (or equivalently, if this list
* changed as a result of the call).
*
* @param o element to be removed from this list, if present
* @return <tt>true</tt> if this list contained the specified element
*/
public boolean remove(Object o) {
    if (o == null) {
        for (int index = 0; index < size; index++)
            if (elementData[index] == null) {
                fastRemove(index);
                return true;
            }
    } else {
        for (int index = 0; index < size; index++)
            if (o.equals(elementData[index])) {
                fastRemove(index);
                return true;
            }
    }
    return false;
}
Note two things. First, equality is decided by the equals method, so custom classes should override equals to match their own notion of equality. Second, the scan runs left to right and removes only the first object that equals the given one.
The fastRemove method is as follows:
/*
* Private remove method that skips bounds checking and does not
* return the value removed.
*/
private void fastRemove(int index) {
    modCount++; // modification count + 1
    int numMoved = size - index - 1; // number of elements that need to move left
    if (numMoved > 0) // if any, shift the elements after index one slot to the left
        System.arraycopy(elementData, index+1, elementData, index,
                         numMoved);
    elementData[--size] = null; // clear to let GC do its work, and decrement the list's size
}
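The two remove overloads are easy to confuse on a List<Integer>, because an int argument selects remove(int index) while a boxed Integer selects remove(Object). The sketch below illustrates this with hypothetical sample data; RemoveOverloadDemo is not part of the JDK.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class RemoveOverloadDemo {
    static List<Integer> sample() {
        return new ArrayList<>(Arrays.asList(5, 6, 1, 1));
    }

    // remove(int index): removes whatever sits at index 1 (the value 6).
    static List<Integer> removeByIndex() {
        List<Integer> list = sample();
        list.remove(1);
        return list;
    }

    // remove(Object o): removes only the first element that equals 1.
    static List<Integer> removeByValue() {
        List<Integer> list = sample();
        list.remove(Integer.valueOf(1));
        return list;
    }

    public static void main(String[] args) {
        System.out.println(removeByIndex()); // prints [5, 1, 1]
        System.out.println(removeByValue()); // prints [5, 6, 1]
    }
}
```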
Removing multiple elements
Option 1: remove all elements
/**
* Removes all of the elements from this list. The list will
* be empty after this call returns.
*/
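public void clear() {
    modCount++; // modification count + 1
    // clear to let GC do its work
    for (int i = 0; i < size; i++) // null out the reference in every occupied slot
        elementData[i] = null;
    size = 0; // reset the list's size to 0
}
Option 2: remove the elements in a given range, start index inclusive and end index exclusive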
/**
* Removes from this list all of the elements whose index is between
* {@code fromIndex}, inclusive, and {@code toIndex}, exclusive.
* Shifts any succeeding elements to the left (reduces their index).
* This call shortens the list by {@code (toIndex - fromIndex)} elements.
* (If {@code toIndex==fromIndex}, this operation has no effect.)
*
* @throws IndexOutOfBoundsException if {@code fromIndex} or
* {@code toIndex} is out of range
* ({@code fromIndex < 0 ||
* fromIndex >= size() ||
* toIndex > size() ||
* toIndex < fromIndex})
*/
protected void removeRange(int fromIndex, int toIndex) {
    modCount++; // modification count + 1
    int numMoved = size - toIndex; // number of elements that need to move
    System.arraycopy(elementData, toIndex, elementData, fromIndex,
                     numMoved);
    // clear to let GC do its work
    int newSize = size - (toIndex-fromIndex); // the list's new size
    for (int i = newSize; i < size; i++) { // null out the vacated tail slots, from newSize up to size
        elementData[i] = null;
    }
    size = newSize; // update the list's size
}
Note that this method is protected, so it can only be called by ArrayList itself, its subclasses, and classes in the same package (java.util).
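Since removeRange is not public, ordinary callers usually get the same effect through subList(from, to).clear(); the AbstractList documentation describes removeRange as the method called by the clear operation on a list and its sublists. A minimal sketch, with SubListClearDemo as a hypothetical name:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class SubListClearDemo {
    // Removes the elements in [fromIndex, toIndex) through the public API.
    static List<Integer> removeRange(List<Integer> list, int fromIndex, int toIndex) {
        list.subList(fromIndex, toIndex).clear();
        return list;
    }

    public static void main(String[] args) {
        List<Integer> list = new ArrayList<>(Arrays.asList(0, 1, 2, 3, 4, 5));
        System.out.println(removeRange(list, 1, 4)); // prints [0, 4, 5]
    }
}
```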
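Option 3: batch-remove the elements that are in a given collection, or those that are not in it
private boolean batchRemove(Collection<?> c, boolean complement) {
    final Object[] elementData = this.elementData;
    int r = 0, w = 0;
    boolean modified = false;
    try {
        for (; r < size; r++) // scan from front to back
            if (c.contains(elementData[r]) == complement)
                elementData[w++] = elementData[r];
    } finally {
        // Preserve behavioral compatibility with AbstractCollection,
        // even if c.contains() throws.
        if (r != size) { // only when c.contains() threw: move the unscanned remainder forward as a block
            System.arraycopy(elementData, r,
                             elementData, w,
                             size - r);
            w += size - r;
        }
        if (w != size) { // some elements were removed
            // clear to let GC do its work
            for (int i = w; i < size; i++) // null out the slots after the kept elements
                elementData[i] = null;
            modCount += size - w; // modification count + number of removed elements
            size = w; // update the list's size
            modified = true; // set the modified flag to true
        }
    }
    return modified;
}
Removal uses two pointers: w maintains the new list and r walks the old list. A single pass of the loop produces the new list, with w ending up as its size. The loop itself is O(n); each iteration also pays for one c.contains() call, so the total cost depends on the given collection (roughly O(n) overall for a HashSet, O(n*m) for a list of m elements).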
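The read/write pointer idea can be isolated from the exception handling. The following is a simplified sketch on a plain Object[], not the JDK method: it assumes c.contains() never throws and therefore omits the finally block. TwoPointerCompactDemo and compact are hypothetical names.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.HashSet;

final class TwoPointerCompactDemo {
    // Keeps the elements e for which c.contains(e) == complement and returns the new size.
    static int compact(Object[] data, int size, Collection<?> c, boolean complement) {
        int w = 0;                          // write pointer: next free slot of the "new" list
        for (int r = 0; r < size; r++)      // read pointer: walks the "old" list
            if (c.contains(data[r]) == complement)
                data[w++] = data[r];
        for (int i = w; i < size; i++)      // clear the leftover tail
            data[i] = null;
        return w;
    }

    public static void main(String[] args) {
        Object[] data = {1, 2, 3, 4, 5};
        int newSize = compact(data, 5, new HashSet<>(Arrays.asList(2, 4)), false);
        System.out.println(newSize + " " + Arrays.toString(data)); // prints 3 [1, 3, 5, null, null]
    }
}
```

Passing complement = false drops the elements found in c (the removeAll behavior), while complement = true keeps only those elements (the retainAll behavior).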
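Option 4: remove all elements that are in a given collection
public boolean removeAll(Collection<?> c) {
    Objects.requireNonNull(c);
    return batchRemove(c, false);
}
It simply calls the Option 3 method, so there is nothing more to say.
Option 5: remove all elements that are not in a given collection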
public boolean retainAll(Collection<?> c) {
    Objects.requireNonNull(c);
    return batchRemove(c, true);
}
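From the caller's side, removeAll and retainAll are mirror images of each other. A short usage sketch follows; BulkRemoveDemo and the sample values are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

final class BulkRemoveDemo {
    static final Set<Integer> EVENS = new HashSet<>(Arrays.asList(2, 4));

    static List<Integer> sample() {
        return new ArrayList<>(Arrays.asList(1, 2, 3, 4, 5));
    }

    static List<Integer> withoutEvens() {
        List<Integer> list = sample();
        list.removeAll(EVENS);   // drops every element contained in EVENS
        return list;
    }

    static List<Integer> onlyEvens() {
        List<Integer> list = sample();
        list.retainAll(EVENS);   // drops every element NOT contained in EVENS
        return list;
    }

    public static void main(String[] args) {
        System.out.println(withoutEvens()); // prints [1, 3, 5]
        System.out.println(onlyEvens());    // prints [2, 4]
    }
}
```

Using a HashSet as the argument keeps each contains() lookup cheap, which matters because batchRemove calls it once per element.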
Option 6: remove all elements that match a condition
@Override
public boolean removeIf(Predicate<? super E> filter) {
    Objects.requireNonNull(filter);
    // figure out which elements are to be removed
    // any exception thrown from the filter predicate at this stage
    // will leave the collection unmodified
    int removeCount = 0;
    final BitSet removeSet = new BitSet(size);
    final int expectedModCount = modCount;
    final int size = this.size;
    for (int i=0; modCount == expectedModCount && i < size; i++) {
        @SuppressWarnings("unchecked")
        final E element = (E) elementData[i];
        if (filter.test(element)) {
            removeSet.set(i);
            removeCount++;
        }
    }
    if (modCount != expectedModCount) {
        throw new ConcurrentModificationException();
    }
    // shift surviving elements left over the spaces left by removed elements
    final boolean anyToRemove = removeCount > 0;
    if (anyToRemove) {
        final int newSize = size - removeCount;
        for (int i=0, j=0; (i < size) && (j < newSize); i++, j++) {
            i = removeSet.nextClearBit(i);
            elementData[j] = elementData[i];
        }
        for (int k=newSize; k < size; k++) {
            elementData[k] = null;  // Let gc do its work
        }
        this.size = newSize;
        if (modCount != expectedModCount) {
            throw new ConcurrentModificationException();
        }
        modCount++;
    }
    return anyToRemove;
}
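A short usage sketch of removeIf follows; RemoveIfDemo and the sample predicate are hypothetical. The return value reports whether anything was removed, which matches the anyToRemove flag in the source above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class RemoveIfDemo {
    // Removes all even numbers in place and reports whether the list changed.
    static boolean dropEvens(List<Integer> list) {
        return list.removeIf(n -> n % 2 == 0);
    }

    public static void main(String[] args) {
        List<Integer> list = new ArrayList<>(Arrays.asList(1, 2, 3, 4, 5, 6));
        System.out.println(dropEvens(list)); // prints true
        System.out.println(list);            // prints [1, 3, 5]
        System.out.println(dropEvens(list)); // prints false: nothing left to remove
    }
}
```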
Serialization support
/**
* Save the state of the <tt>ArrayList</tt> instance to a stream (that
* is, serialize it).
*
* @serialData The length of the array backing the <tt>ArrayList</tt>
* instance is emitted (int), followed by all of its elements
* (each an <tt>Object</tt>) in the proper order.
*/
private void writeObject(java.io.ObjectOutputStream s)
    throws java.io.IOException{
    // Write out element count, and any hidden stuff
    int expectedModCount = modCount;
    s.defaultWriteObject();
    // Write out size as capacity for behavioural compatibility with clone()
    s.writeInt(size);
    // Write out all elements in the proper order.
    for (int i=0; i<size; i++) {
        s.writeObject(elementData[i]);
    }
    if (modCount != expectedModCount) {
        throw new ConcurrentModificationException();
    }
}
Note that the list must not be structurally modified while it is being serialized (otherwise a ConcurrentModificationException is thrown), and that the list's size is written out along with the elements.
/**
* Reconstitute the <tt>ArrayList</tt> instance from a stream (that is,
* deserialize it).
*/
private void readObject(java.io.ObjectInputStream s)
    throws java.io.IOException, ClassNotFoundException {
    elementData = EMPTY_ELEMENTDATA;
    // Read in size, and any hidden stuff
    s.defaultReadObject();
    // Read in capacity
    s.readInt(); // ignored
    if (size > 0) {
        // be like clone(), allocate array based upon size not capacity
        int capacity = calculateCapacity(elementData, size);
        SharedSecrets.getJavaOISAccess().checkArray(s, Object[].class, capacity);
        ensureCapacityInternal(size);
        Object[] a = elementData;
        // Read in all elements in the proper order.
        for (int i=0; i<size; i++) {
            a[i] = s.readObject();
        }
    }
}
After deserialization, the list's capacity equals its size.
The test code is as follows:
package com.company;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.lang.reflect.Field;
import java.util.ArrayList;
import java.util.List;
public class Main {
    public static void main(String[] args) throws Exception {
        List<Integer> list = new ArrayList<>();
        for (int i = 0; i < 12; i++) {
            list.add(i);
        }
        System.out.println(list.size());
        System.out.println(list);
        // serialize the list into an in-memory byte array
        ByteArrayOutputStream os = new ByteArrayOutputStream();
        ObjectOutputStream oos = new ObjectOutputStream(os);
        oos.writeObject(list);
        oos.flush();
        byte[] bytes = os.toByteArray();
        // deserialize it into a second list
        ObjectInputStream inputStream = new ObjectInputStream(new ByteArrayInputStream(bytes));
        List<Integer> o = (List<Integer>) inputStream.readObject();
        System.out.println(o.size());
        System.out.println(o);
        // read the backing arrays by reflection (works on Java 8; newer JDKs may need --add-opens java.base/java.util=ALL-UNNAMED)
        Field elementData1 = o.getClass().getDeclaredField("elementData");
        elementData1.setAccessible(true);
        Object[] elementData = (Object[]) elementData1.get(list);
        System.out.println(elementData.length); // capacity of the original list
        elementData = (Object[]) elementData1.get(o);
        System.out.println(elementData.length); // capacity of the deserialized list
    }
}
Replacement
Replacement is essentially a transformation (a map), except that it modifies the original data in place.
@Override
@SuppressWarnings("unchecked")
public void replaceAll(UnaryOperator<E> operator) {
    Objects.requireNonNull(operator);
    final int expectedModCount = modCount;
    final int size = this.size;
    for (int i=0; modCount == expectedModCount && i < size; i++) {
        elementData[i] = operator.apply((E) elementData[i]);
    }
    if (modCount != expectedModCount) {
        throw new ConcurrentModificationException();
    }
    modCount++;
}
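A short usage sketch of replaceAll, showing that the list keeps its identity and only the stored elements change. ReplaceAllDemo and the sample strings are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class ReplaceAllDemo {
    // Upper-cases every element in place and returns the same list instance.
    static List<String> upperCaseInPlace(List<String> list) {
        list.replaceAll(String::toUpperCase);
        return list;
    }

    public static void main(String[] args) {
        List<String> list = new ArrayList<>(Arrays.asList("a", "b", "c"));
        System.out.println(upperCaseInPlace(list)); // prints [A, B, C]
    }
}
```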
Sorting
Sorting delegates to the general-purpose sort implementation (the Arrays.sort method); the comparison rule is left for the caller to specify.
@Override
@SuppressWarnings("unchecked")
public void sort(Comparator<? super E> c) {
    final int expectedModCount = modCount;
    Arrays.sort((E[]) elementData, 0, size, c);
    if (modCount != expectedModCount) {
        throw new ConcurrentModificationException();
    }
    modCount++;
}
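A usage sketch of sort with and without an explicit comparator. Passing null asks for the elements' natural ordering, as documented for List.sort. SortDemo and the sample values are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

final class SortDemo {
    static List<Integer> sample() {
        return new ArrayList<>(Arrays.asList(3, 1, 2));
    }

    // null comparator: the elements' natural ordering is used.
    static List<Integer> ascending() {
        List<Integer> list = sample();
        list.sort(null);
        return list;
    }

    // Caller-supplied rule: reverse of the natural ordering.
    static List<Integer> descending() {
        List<Integer> list = sample();
        list.sort(Comparator.reverseOrder());
        return list;
    }

    public static void main(String[] args) {
        System.out.println(ascending());  // prints [1, 2, 3]
        System.out.println(descending()); // prints [3, 2, 1]
    }
}
```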
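Traversal
Itr implements an iterator that supports forward traversal and the remove operation; it is returned by the iterator method. ListItr implements an iterator that supports both backward and forward traversal as well as adding, removing, and modifying elements; it is returned by the listIterator method.
On the subject of traversal, one very well-known exception has to be mentioned: ConcurrentModificationException. In most cases it is thrown because the list was structurally modified during iteration (its size changed), so that modCount != expectedModCount; the iterator also throws it when one of its operations runs into an IndexOutOfBoundsException. The rule to follow: while using an iterator, do not modify the list by calling the list's own methods; use the corresponding methods provided by the iterator instead.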
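The rule can be demonstrated with two versions of the same loop. In this sketch, IteratorRemovalDemo and its methods are hypothetical; the sample list is chosen so that the unsafe variant reliably fails (removing the second-to-last element inside a for-each loop happens to end the loop before the check runs, so that case would not throw).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.ConcurrentModificationException;
import java.util.Iterator;
import java.util.List;

final class IteratorRemovalDemo {
    // Unsafe: modifies the list through list.remove while a for-each iterator is active.
    static boolean failsWithCme(List<Integer> list) {
        try {
            for (Integer n : list)
                if (n % 2 == 0)
                    list.remove(n); // remove(Object): structural change behind the iterator's back
            return false;
        } catch (ConcurrentModificationException e) {
            return true;
        }
    }

    // Safe: removes through the iterator, which keeps expectedModCount in sync.
    static List<Integer> removeEvens(List<Integer> list) {
        for (Iterator<Integer> it = list.iterator(); it.hasNext(); )
            if (it.next() % 2 == 0)
                it.remove();
        return list;
    }

    public static void main(String[] args) {
        System.out.println(failsWithCme(new ArrayList<>(Arrays.asList(1, 2, 3, 4)))); // prints true
        System.out.println(removeEvens(new ArrayList<>(Arrays.asList(1, 2, 3, 4))));  // prints [1, 3]
    }
}
```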
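Advantages and disadvantages of ArrayList
Advantages
- Sequential storage with random access: each element is tied to its position, so access by index is efficient, O(1), and index-based traversal is fast
- Insertion and removal at the tail are fast
Disadvantages
- Not thread-safe
- Insertion and removal anywhere other than the tail must shift all subsequent elements, which is inefficient
- It grows automatically but never shrinks on its own (only an explicit trimToSize() call reduces the capacity); on growth, the existing data has to be copied into the new array, which is inefficient
Summary
This article is relatively simple. In the end, every operation on ArrayList is an operation on the underlying array, so a solid understanding of the array, simple as that data structure is, goes a long way toward understanding each ArrayList operation.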