Java Advanced, Part 5: Data Structures

Posted by ACatSmiling on 2024-10-02

Java Data Structures

Author: ACatSmiling

Since: 2024-07-28

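Overview

In Java, arrays and collections are both structures for storing multiple pieces of data, collectively referred to as Java containers. Storage here means in-memory storage; persistent storage is out of scope.

Characteristics of arrays in terms of in-memory storage:

Once an array is initialized, its length is fixed.
Once an array is declared, its element type is fixed.

Drawbacks of arrays for storing data:

The length cannot be changed after initialization, so an array is hard to extend.
Arrays offer few properties and methods, so adding, removing, or inserting elements is inconvenient and inefficient.
There is no ready-made property or method that returns the number of elements actually stored; only the array length is directly available.
Array storage is ordered and allows duplicates. Requirements such as unordered or duplicate-free storage cannot be met, so arrays offer only one storage model.

Java collection classes can store a varying number of objects, and can also store associative arrays with key-to-value mappings.

The Java Collections Framework is divided into two hierarchies, Collection and Map:

Collection interface: single-column collections that store individual objects.
- List interface: stores ordered, repeatable data. Implementations include ArrayList, LinkedList, and Vector.
- Set interface: stores unordered, non-repeatable data. Implementations include HashSet, LinkedHashSet, and TreeSet.
Map interface: two-column collections that store "key - value" pairs. Implementations include HashMap, LinkedHashMap, TreeMap, Hashtable, and Properties.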
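To make the three container families concrete, here is a minimal sketch. The class name ContainerOverviewDemo and its helper methods are made up for illustration; only the java.util types are real.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical demo class, not part of the original article.
class ContainerOverviewDemo {
    // A List keeps insertion order and allows duplicates.
    static List<String> sampleList() {
        List<String> list = new ArrayList<>();
        list.add("a");
        list.add("b");
        list.add("a");
        return list;
    }

    // A Set keeps only one copy of the duplicate "a".
    static Set<String> sampleSet() {
        return new HashSet<>(sampleList());
    }

    // A Map stores key-value pairs; putting an existing key overwrites its value.
    static Map<String, Integer> sampleMap() {
        Map<String, Integer> map = new HashMap<>();
        map.put("a", 1);
        map.put("b", 2);
        map.put("a", 3);
        return map;
    }

    public static void main(String[] args) {
        System.out.println(sampleList().size());  // 3 elements, duplicate kept
        System.out.println(sampleSet().size());   // 2 elements, duplicate dropped
        System.out.println(sampleMap().get("a")); // 3, the later value wins
    }
}
```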

List Interface

Elements in a List are ordered and may repeat; each element has a corresponding positional index.

ArrayList

Basic fields:

public class ArrayList<E> extends AbstractList<E>
        implements List<E>, RandomAccess, Cloneable, java.io.Serializable
{
    @java.io.Serial
    private static final long serialVersionUID = 8683452581122892189L;

    /**
     * Default initial capacity.
     */
    private static final int DEFAULT_CAPACITY = 10;

    /**
     * Shared empty array instance used for empty instances.
     */
    private static final Object[] EMPTY_ELEMENTDATA = {};

    /**
     * Shared empty array instance used for default sized empty instances. We
     * distinguish this from EMPTY_ELEMENTDATA to know how much to inflate when
     * first element is added.
     */
    private static final Object[] DEFAULTCAPACITY_EMPTY_ELEMENTDATA = {};

    /**
     * The array buffer into which the elements of the ArrayList are stored.
     * The capacity of the ArrayList is the length of this array buffer. Any
     * empty ArrayList with elementData == DEFAULTCAPACITY_EMPTY_ELEMENTDATA
     * will be expanded to DEFAULT_CAPACITY when the first element is added.
     */
    transient Object[] elementData; // non-private to simplify nested class access

    /**
     * The size of the ArrayList (the number of elements it contains).
     *
     * @serial
     */
    private int size;
}

All source code in this article is based on JDK 17.

With new ArrayList(), the underlying Object[] array elementData is initialized to {}, an empty array of length 0:

/**
 * Constructs an empty list with an initial capacity of ten.
 */
public ArrayList() {
    this.elementData = DEFAULTCAPACITY_EMPTY_ELEMENTDATA;
}

On the first call to add(), the underlying elementData array is allocated with length 10 and the element is stored in it:

/**
 * Appends the specified element to the end of this list.
 *
 * @param e element to be appended to this list
 * @return {@code true} (as specified by {@link Collection#add})
 */
public boolean add(E e) {
    modCount++;
    add(e, elementData, size);
    return true;
}

/**
 * This helper method split out from add(E) to keep method
 * bytecode size under 35 (the -XX:MaxInlineSize default value),
 * which helps when add(E) is called in a C1-compiled loop.
 */
private void add(E e, Object[] elementData, int s) {
    // On the first add(), size is 0 and elementData.length is 0
    if (s == elementData.length)
        elementData = grow();
    // Store the element in the array
    elementData[s] = e;
    size = s + 1;
}

private Object[] grow() {
    return grow(size + 1);
}

/**
 * Increases the capacity to ensure that it can hold at least the
 * number of elements specified by the minimum capacity argument.
 *
 * @param minCapacity the desired minimum capacity
 * @throws OutOfMemoryError if minCapacity is less than zero
 */
private Object[] grow(int minCapacity) {
    int oldCapacity = elementData.length;
    if (oldCapacity > 0 || elementData != DEFAULTCAPACITY_EMPTY_ELEMENTDATA) {
        int newCapacity = ArraysSupport.newLength(oldCapacity,
                minCapacity - oldCapacity, /* minimum growth */
                oldCapacity >> 1           /* preferred growth */);
        return elementData = Arrays.copyOf(elementData, newCapacity);
    } else {
        // Allocate the array with length DEFAULT_CAPACITY, i.e. 10
        return elementData = new Object[Math.max(DEFAULT_CAPACITY, minCapacity)];
    }
}

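After that, each add() call stores the element directly at the next position in elementData, until the 11th element is added. At that point the array is full (size equals its length), so it must grow: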

/**
 * This helper method split out from add(E) to keep method
 * bytecode size under 35 (the -XX:MaxInlineSize default value),
 * which helps when add(E) is called in a C1-compiled loop.
 */
private void add(E e, Object[] elementData, int s) {
    // On the 11th add, the condition s == elementData.length holds
    if (s == elementData.length)
        elementData = grow();
    elementData[s] = e;
    size = s + 1;
}

By default, the array grows to 1.5 times its previous capacity, and the existing elements are copied into the new array:

/**
 * Increases the capacity to ensure that it can hold at least the
 * number of elements specified by the minimum capacity argument.
 *
 * @param minCapacity the desired minimum capacity
 * @throws OutOfMemoryError if minCapacity is less than zero
 */
private Object[] grow(int minCapacity) {
    int oldCapacity = elementData.length;
    // The condition oldCapacity > 0 holds
    if (oldCapacity > 0 || elementData != DEFAULTCAPACITY_EMPTY_ELEMENTDATA) {
        // Grow the array to 1.5 times its previous length
        int newCapacity = ArraysSupport.newLength(oldCapacity,
                minCapacity - oldCapacity, /* minimum growth */
                oldCapacity >> 1           /* preferred growth */);
        // Copy the old array's data into the new array
        return elementData = Arrays.copyOf(elementData, newCapacity);
    } else {
        return elementData = new Object[Math.max(DEFAULT_CAPACITY, minCapacity)];
    }
}

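Conclusions:

ArrayList creates its backing array, with an initial capacity of 10, only when the first element is added; array creation is deferred.
When adding a single element requires the array to grow, it grows to 1.5 times its previous capacity by default, and the existing data is copied into the new array.
In practice, when the expected size is known, specify the capacity when constructing the ArrayList to avoid resizing as much as possible.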
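The growth rule can be replayed without touching ArrayList internals. The sketch below is a simplified model of grow() for one-at-a-time appends: ArrayListGrowthSketch and its methods are hypothetical names, and the overflow handling done by ArraysSupport.newLength is deliberately left out.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical model of the capacity sequence; it does not read ArrayList's private state.
class ArrayListGrowthSketch {
    // Preferred growth is oldCapacity >> 1; a single add() needs a minimum growth of 1.
    static int nextCapacity(int oldCapacity) {
        return oldCapacity + Math.max(1, oldCapacity >> 1);
    }

    // Capacities allocated while appending elementCount elements to a default-constructed list.
    static List<Integer> capacitiesFor(int elementCount) {
        List<Integer> capacities = new ArrayList<>();
        int capacity = 0;
        for (int size = 0; size < elementCount; size++) {
            if (size == capacity) {
                // First allocation jumps to the default capacity of 10.
                capacity = (capacity == 0) ? 10 : nextCapacity(capacity);
                capacities.add(capacity);
            }
        }
        return capacities;
    }

    public static void main(String[] args) {
        // Five allocations are needed for 40 elements: 10, 15, 22, 33, 49.
        System.out.println(capacitiesFor(40));

        // Passing the expected size up front avoids the intermediate copies.
        List<Integer> presized = new ArrayList<>(40);
        for (int i = 0; i < 40; i++) {
            presized.add(i);
        }
        System.out.println(presized.size());
    }
}
```

Each entry in the printed sequence corresponds to one Arrays.copyOf call (apart from the first allocation), which is the cost that presizing avoids.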

LinkedList

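LinkedList is a doubly linked list. It defines an inner class, Node, as the basic unit for storing data.

LinkedList declares no array internally; instead it keeps two Node references, first and last, that record the head and tail elements.

Inserting or removing at either end, or at a position already reached by an iterator, only relinks a few nodes, so LinkedList suits workloads dominated by such operations. Locating an element by index, however, requires traversing the list.

With new LinkedList(), the first and last fields of type Node default to null: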

public class LinkedList<E>
    extends AbstractSequentialList<E>
    implements List<E>, Deque<E>, Cloneable, java.io.Serializable
{
    transient int size = 0;

    /**
     * Pointer to first node.
     */
    transient Node<E> first;

    /**
     * Pointer to last node.
     */
    transient Node<E> last;

    /**
     * Constructs an empty list.
     */
    public LinkedList() {
    }
}

// Node inner class
private static class Node<E> {
    // Data stored in this Node
    E item;
    // Reference to the next element in the list
    Node<E> next;
    // Reference to the previous element in the list
    Node<E> prev;

    Node(Node<E> prev, E element, Node<E> next) {
        this.item = element;
        this.next = next;
        this.prev = prev;
    }
}

Calling add() to append an element:

/**
 * Appends the specified element to the end of this list.
 *
 * <p>This method is equivalent to {@link #addLast}.
 *
 * @param e element to be appended to this list
 * @return {@code true} (as specified by {@link Collection#add})
 */
public boolean add(E e) {
    linkLast(e);
    return true;
}

/**
 * Links e as last element.
 */
void linkLast(E e) {
    // last: the final node of the existing list
    final Node<E> l = last;
    // Create a new Node
    final Node<E> newNode = new Node<>(l, e, null);
    // The new Node becomes the last node of the list
    last = newNode;
    if (l == null)
        // If the list was empty, the new Node is also the first node
        first = newNode;
    else
        // Otherwise, link the previous last node to the new Node
        l.next = newNode;
    size++;
    modCount++;
}
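The relinking done by linkLast() is easier to see in a stripped-down list. The following is a teaching sketch only: TinyLinkedList and its methods are hypothetical and omit modCount, removal, and everything else the real LinkedList provides.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical teaching class, not the JDK implementation.
class TinyLinkedList<E> {
    private static final class Node<E> {
        E item;
        Node<E> prev;
        Node<E> next;

        Node(Node<E> prev, E item, Node<E> next) {
            this.prev = prev;
            this.item = item;
            this.next = next;
        }
    }

    private Node<E> first;
    private Node<E> last;
    private int size;

    // Same steps as linkLast(): create the node, move last, then fix first or the old tail.
    void addLast(E e) {
        Node<E> oldLast = last;
        Node<E> newNode = new Node<>(oldLast, e, null);
        last = newNode;
        if (oldLast == null) {
            first = newNode;
        } else {
            oldLast.next = newNode;
        }
        size++;
    }

    int size() {
        return size;
    }

    // Walk the next references from first to last.
    List<E> forward() {
        List<E> out = new ArrayList<>();
        for (Node<E> n = first; n != null; n = n.next) {
            out.add(n.item);
        }
        return out;
    }

    // Walk the prev references from last to first.
    List<E> backward() {
        List<E> out = new ArrayList<>();
        for (Node<E> n = last; n != null; n = n.prev) {
            out.add(n.item);
        }
        return out;
    }

    public static void main(String[] args) {
        TinyLinkedList<String> list = new TinyLinkedList<>();
        list.addLast("a");
        list.addLast("b");
        list.addLast("c");
        System.out.println(list.forward());
        System.out.println(list.backward());
    }
}
```

Because every node carries both prev and next, the list can be traversed in either direction, which is what lets LinkedList also implement Deque.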

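Set Interface

A Set stores unordered, non-repeatable data. If an element equal to one already present is added to the same Set, the add operation fails (add() returns false).

Unordered: this does not mean random. Taking HashSet as an example, elements are not placed in the underlying array in index order; their positions are determined by their hash values.
Non-repeatable: no two elements in the set may return true when compared with equals(). In other words, equal elements can be added only once.

Set is a sub-interface of Collection. It adds no instance methods of its own; the ones it uses are all declared in Collection.

A Set decides whether two objects are the same using equals(), not the == operator. Classes whose objects are stored in a Set (mainly HashSet and LinkedHashSet) must override equals() and hashCode() to define object equality.

Requirement: the overridden hashCode() and equals() must be consistent, i.e. equal objects must have equal hash codes.
- If the element's class does not override hashCode(), Object's hashCode() is used. It returns an identity-based hash code that typically differs between distinct objects, so two logically identical elements will most likely have different hashCode values, and both will be added.
Tip for overriding both methods: every field used in the equals() comparison should also be used to compute the hashCode value.
TreeSet does not use equals() and hashCode() to decide whether two elements are the same; it uses the ordering defined for the element class (compareTo() or a Comparator).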
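A small sketch of the TreeSet point: with a comparator that looks only at string length, two different strings of the same length count as duplicates, while a HashSet keeps both. The class and method names are hypothetical.

```java
import java.util.Comparator;
import java.util.HashSet;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical demo class contrasting comparator-based and equals-based uniqueness.
class TreeSetOrderingDemo {
    // TreeSet treats elements as equal when the comparator returns 0.
    static int sizeWithLengthComparator() {
        Set<String> set = new TreeSet<>(Comparator.comparingInt(String::length));
        set.add("aa");
        set.add("bb");  // same length as "aa", so it is rejected
        set.add("ccc");
        return set.size();
    }

    // HashSet relies on hashCode() and equals(), so all three strings are kept.
    static int sizeWithHashSet() {
        Set<String> set = new HashSet<>();
        set.add("aa");
        set.add("bb");
        set.add("ccc");
        return set.size();
    }

    public static void main(String[] args) {
        System.out.println(sizeWithLengthComparator()); // 2
        System.out.println(sizeWithHashSet());          // 3
    }
}
```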

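Basic principles for overriding hashCode():

While the program is running, repeated calls to hashCode() on the same object should return the same value.
When equals() returns true for two objects, their hashCode() return values should also be equal.
Every field used in the equals() comparison should be used to compute the hashCode value.

Basic principles for overriding equals(), taking a custom Customer class as an example of when an override is needed:

If a class has its own notion of "logical equality", then whenever equals() is overridden, hashCode() must be overridden too. Under the overridden equals(), two distinct instances may be logically equal, yet according to Object's hashCode() they are simply two different objects. That violates the rule that equal objects must have equal hash codes.

Conclusion: when overriding equals(), you should generally override hashCode() as well. The fields used to compute hashCode should normally also take part in the equals() comparison.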
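As an illustration, here is one way a Customer class might override both methods. The fields name and age are assumed for the example, and Objects.hash is just one convenient way to combine them.

```java
import java.util.HashSet;
import java.util.Objects;
import java.util.Set;

// Hypothetical demo; the Customer fields are assumptions made for illustration.
class CustomerSetDemo {
    static final class Customer {
        private final String name;
        private final int age;

        Customer(String name, int age) {
            this.name = name;
            this.age = age;
        }

        // Two customers are logically equal when name and age both match.
        @Override
        public boolean equals(Object o) {
            if (this == o) {
                return true;
            }
            if (!(o instanceof Customer)) {
                return false;
            }
            Customer other = (Customer) o;
            return age == other.age && Objects.equals(name, other.name);
        }

        // Uses exactly the fields compared in equals(), so equal objects share a hash code.
        @Override
        public int hashCode() {
            return Objects.hash(name, age);
        }
    }

    static int distinctCount() {
        Set<Customer> set = new HashSet<>();
        set.add(new Customer("Tom", 20));
        set.add(new Customer("Tom", 20)); // logically equal, rejected
        set.add(new Customer("Amy", 20));
        return set.size();
    }

    public static void main(String[] args) {
        System.out.println(distinctCount()); // 2
    }
}
```

Without the hashCode() override, the two "Tom" instances would most likely land in different buckets and both be stored.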

Why the number 31 appears in the hashCode() generated by Eclipse/IDEA:
@Override
public int hashCode() {
    int result = name.hashCode();
    result = 31 * result + age;
    return result;
}
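A reasonably large multiplier is preferred: it spreads the computed hashCode values over a wider range, so collisions are fewer and lookups are faster. ---> fewer collisions

31 occupies only 5 bits, so multiplying by it is less likely to overflow than multiplying by a larger number.

Multiplication by 31 can be written as i * 31 == (i << 5) - i, an optimization that many virtual machines apply. ---> more efficient computation

31 is an odd prime. With an odd multiplier no information is lost when the multiplication overflows, whereas an even multiplier would shift bits out; using a prime is largely a convention that helps spread the results. ---> fewer collisions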
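The shift identity 31 * i == (i << 5) - i can be checked directly. A minimal sketch with hypothetical names; it holds even when the int arithmetic overflows, because both sides wrap around in the same way.

```java
// Hypothetical demo class for the multiply-by-31 identity.
class Multiply31Demo {
    // 32 * i minus i, computed with a shift and a subtraction.
    static int times31ByShift(int i) {
        return (i << 5) - i;
    }

    // Same shape as the IDE-generated hashCode() shown above.
    static int hashOf(String name, int age) {
        int result = name.hashCode();
        result = 31 * result + age;
        return result;
    }

    public static void main(String[] args) {
        int[] samples = {0, 1, 7, -3, 123_456_789, Integer.MAX_VALUE, Integer.MIN_VALUE};
        for (int i : samples) {
            // Prints true for every sample, including the overflowing ones.
            System.out.println(31 * i == times31ByShift(i));
        }
        System.out.println(hashOf("ab", 5));
    }
}
```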

HashSet

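HashSet stores its elements using a hash algorithm, so it offers good performance for access, lookup, and removal.

HashSet has the following characteristics:

It does not guarantee the order of its elements.
It is not thread-safe.
Elements may be null, but only one null is allowed.

Internally, HashSet is backed by a HashMap: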

/**
 * Constructs a new, empty set; the backing {@code HashMap} instance has
 * default initial capacity (16) and load factor (0.75).
 */
public HashSet() {
    map = new HashMap<>();
}
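The listed characteristics can be observed through the return value of add(). A minimal sketch with a hypothetical class name:

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical demo class showing duplicate and null handling in HashSet.
class HashSetBehaviorDemo {
    // Returns the result of adding "a" twice and null twice.
    static boolean[] addResults() {
        Set<String> set = new HashSet<>();
        return new boolean[] {
            set.add("a"),   // true: new element
            set.add("a"),   // false: duplicate rejected
            set.add(null),  // true: a single null is allowed
            set.add(null)   // false: second null is a duplicate
        };
    }

    public static void main(String[] args) {
        for (boolean added : addResults()) {
            System.out.println(added);
        }
    }
}
```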

LinkedHashSet

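LinkedHashSet uses each element's hashCode value to decide where the element is stored, and additionally maintains a doubly linked list that records element order, so the elements appear to be stored in insertion order.

Iterating over a LinkedHashSet visits the elements in the order they were added.

LinkedHashSet inserts slightly more slowly than HashSet, but iterates over all elements efficiently. For frequent traversal, LinkedHashSet is generally more efficient than HashSet.

Internally, LinkedHashSet is backed by a LinkedHashMap: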

public LinkedHashSet() {
    super(16, .75f, true);
}

HashSet(int initialCapacity, float loadFactor, boolean dummy) {
    map = new LinkedHashMap<>(initialCapacity, loadFactor);
}
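A short sketch of the ordering guarantee; the class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical demo class showing that LinkedHashSet iterates in insertion order.
class LinkedHashSetOrderDemo {
    static List<String> insertionOrder() {
        Set<String> set = new LinkedHashSet<>();
        set.add("banana");
        set.add("apple");
        set.add("cherry");
        set.add("apple"); // duplicate: rejected, and the original position is kept
        return new ArrayList<>(set);
    }

    public static void main(String[] args) {
        System.out.println(insertionOrder()); // [banana, apple, cherry]
    }
}
```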

Map Interface

HashMap

Important constants and fields in the HashMap source:

DEFAULT_INITIAL_CAPACITY: the default capacity of a HashMap, 16.

/**
 * The default initial capacity - MUST be a power of two.
 */
static final int DEFAULT_INITIAL_CAPACITY = 1 << 4; // aka 16

MAXIMUM_CAPACITY: the maximum capacity a HashMap supports, $2^{30}$.

/**
 * The maximum capacity, used if a higher value is implicitly specified
 * by either of the constructors with arguments.
 * MUST be a power of two <= 1<<30.
 */
static final int MAXIMUM_CAPACITY = 1 << 30;

DEFAULT_LOAD_FACTOR: the default load factor of a HashMap, 0.75.
```
/**
 * The load factor used when none specified in constructor.
 */
static final float DEFAULT_LOAD_FACTOR = 0.75f;
```
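- Unlike ArrayList, HashMap does not wait until the underlying array is completely full before resizing, because some slots may never receive an element while others accumulate many, producing more linked lists and red-black trees. Resizing therefore happens once the number of elements reaches a certain level, namely when it exceeds the threshold value. If the load factor is too small, the array is resized while many slots are still empty, so array utilization is low; if it is too large, the array is resized only when it already holds many elements, producing too many linked lists and red-black trees. A default load factor of about 0.7 to 0.75 is therefore a reasonable compromise.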

TREEIFY_THRESHOLD: when the linked list in a bucket grows beyond this many Nodes, HashMap checks whether to convert it to a red-black tree. The default is 8.

/**
 * The bin count threshold for using a tree rather than list for a
 * bin.  Bins are converted to trees when adding an element to a
 * bin with at least this many nodes. The value must be greater
 * than 2 and should be at least 8 to mesh with assumptions in
 * tree removal about conversion back to plain bins upon
 * shrinkage.
 */
static final int TREEIFY_THRESHOLD = 8;

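UNTREEIFY_THRESHOLD: when a red-black tree in a bucket shrinks to this many Nodes or fewer during a resize, it is converted back to a linked list. The default is 6.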

/**
 * The bin count threshold for untreeifying a (split) bin during a
 * resize operation. Should be less than TREEIFY_THRESHOLD, and at
 * most 6 to mesh with shrinkage detection under removal.
 */
static final int UNTREEIFY_THRESHOLD = 6;

MIN_TREEIFY_CAPACITY: the minimum hash table capacity at which the Nodes in a bucket may be treeified. The default is 64.
```
/**
 * The smallest table capacity for which bins may be treeified.
 * (Otherwise the table is resized if too many nodes in a bin.)
 * Should be at least 4 * TREEIFY_THRESHOLD to avoid conflicts
 * between resizing and treeification thresholds.
 */
static final int MIN_TREEIFY_CAPACITY = 64;
```
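- When the number of Nodes in a bucket is large enough to require a red-black tree (8) but the hash table capacity is less than MIN_TREEIFY_CAPACITY, resize() is executed instead. The value of MIN_TREEIFY_CAPACITY is at least 4 times TREEIFY_THRESHOLD.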

table: the array that stores the elements; its length is always a power of 2.

/**
 * The table, initialized on first use, and resized as
 * necessary. When allocated, length is always a power of two.
 * (We also tolerate length zero in some operations to allow
 * bootstrapping mechanics that are currently not needed.)
 */
transient Node<K,V>[] table;

entrySet: the cached Set view of the map's entries.

/**
 * Holds cached entrySet(). Note that AbstractMap fields are used
 * for keySet() and values().
 */
transient Set<Map.Entry<K,V>> entrySet;

size: the number of key-value pairs stored in the HashMap.

/**
 * The number of key-value mappings contained in this map.
 */
transient int size;

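modCount: the number of times the HashMap has been structurally modified (for example, by adding or removing mappings, or by rehashing).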

/**
 * The number of times this HashMap has been structurally modified
 * Structural modifications are those that change the number of mappings in
 * the HashMap or otherwise modify its internal structure (e.g.,
 * rehash).  This field is used to make iterators on Collection-views of
 * the HashMap fail-fast.  (See ConcurrentModificationException).
 */
transient int modCount;

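threshold: the resize threshold, generally equal to capacity * load factor, i.e. (int) Math.min(capacity * loadFactor, MAXIMUM_CAPACITY + 1). Resizing does not wait for the underlying array to fill up; it happens on the next insertion after size reaches the threshold.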

/**
 * The next size value at which to resize (capacity * load factor).
 *
 * @serial
 */
// (The javadoc description is true upon serialization.
// Additionally, if the table array has not been allocated, this
// field holds the initial array capacity, or zero signifying
// DEFAULT_INITIAL_CAPACITY.)
int threshold;

loadFactor: the load factor.

/**
 * The load factor for the hash table.
 *
 * @serial
 */
final float loadFactor;

With new HashMap<>(), the load factor loadFactor is set to DEFAULT_LOAD_FACTOR, i.e. 0.75:

/**
 * Constructs an empty {@code HashMap} with the default initial capacity
 * (16) and the default load factor (0.75).
 */
public HashMap() {
    this.loadFactor = DEFAULT_LOAD_FACTOR; // all other fields defaulted
}

On the first call to put(), resize() creates a Node array of length 16:

/**
 * Associates the specified value with the specified key in this map.
 * If the map previously contained a mapping for the key, the old
 * value is replaced.
 *
 * @param key key with which the specified value is to be associated
 * @param value value to be associated with the specified key
 * @return the previous value associated with {@code key}, or
 *         {@code null} if there was no mapping for {@code key}.
 *         (A {@code null} return can also indicate that the map
 *         previously associated {@code null} with {@code key}.)
 */
public V put(K key, V value) {
    // Hash the key
    return putVal(hash(key), key, value, false, true);
}

/**
 * Implements Map.put and related methods.
 *
 * @param hash hash for key
 * @param key the key
 * @param value the value to put
 * @param onlyIfAbsent if true, don't change existing value
 * @param evict if false, the table is in creation mode.
 * @return previous value, or null if none
 */
final V putVal(int hash, K key, V value, boolean onlyIfAbsent,
               boolean evict) {
    Node<K,V>[] tab; Node<K,V> p; int n, i;
    if ((tab = table) == null || (n = tab.length) == 0)
        // On the first put(), table is uninitialized (null), so resize() creates a Node array of length 16
        n = (tab = resize()).length;
    // Then check whether the element p at index i of the Node array is null
    if ((p = tab[i = (n - 1) & hash]) == null)
        // p is null, so slot i is empty and the entry is added ---> case 1
        tab[i] = newNode(hash, key, value, null);
    else {
        Node<K,V> e; K k;
        if (p.hash == hash &&
            ((k = p.key) == key || (key != null && key.equals(k))))
            // The element at index i has the same key as the entry being added
            e = p;
        else if (p instanceof TreeNode)
            e = ((TreeNode<K,V>)p).putTreeVal(this, tab, hash, key, value);
        else {
            // The element at index i has a different key from the entry being added
            for (int binCount = 0; ; ++binCount) {
                // Reached the end of the list at index i without finding the key
                if ((e = p.next) == null) {
                    // Link the new entry after the last node; the entry is added ---> cases 2 and 3
                    p.next = newNode(hash, key, value, null);
                    if (binCount >= TREEIFY_THRESHOLD - 1) // -1 for 1st
                        // If the list length now exceeds 8, check whether to convert it to a red-black tree
                        treeifyBin(tab, hash);
                    break;
                }
                // More nodes remain at index i: compare the next node with the key being added, by hash value and equals()
                if (e.hash == hash &&
                    ((k = e.key) == key || (key != null && key.equals(k))))
                    break;
                p = e;
            }
        }
        if (e != null) { // existing mapping for key
            V oldValue = e.value;
            if (!onlyIfAbsent || oldValue == null)
                e.value = value;
            afterNodeAccess(e);
            return oldValue;
        }
    }
    ++modCount;
    if (++size > threshold)
        resize();
    afterNodeInsertion(evict);
    return null;
}

/**
 * Initializes or doubles table size.  If null, allocates in
 * accord with initial capacity target held in field threshold.
 * Otherwise, because we are using power-of-two expansion, the
 * elements from each bin must either stay at same index, or move
 * with a power of two offset in the new table.
 *
 * @return the table
 */
final Node<K,V>[] resize() {
    Node<K,V>[] oldTab = table;
    int oldCap = (oldTab == null) ? 0 : oldTab.length;
    int oldThr = threshold;
    int newCap, newThr = 0;
    if (oldCap > 0) {
        if (oldCap >= MAXIMUM_CAPACITY) {
            threshold = Integer.MAX_VALUE;
            return oldTab;
        }
        else if ((newCap = oldCap << 1) < MAXIMUM_CAPACITY &&
                 oldCap >= DEFAULT_INITIAL_CAPACITY)
            newThr = oldThr << 1; // double threshold
    }
    else if (oldThr > 0) // initial capacity was placed in threshold
        newCap = oldThr;
    else {               // zero initial threshold signifies using defaults
        // Default array length 16
        newCap = DEFAULT_INITIAL_CAPACITY;
        // Default resize threshold 0.75 * 16 = 12
        newThr = (int)(DEFAULT_LOAD_FACTOR * DEFAULT_INITIAL_CAPACITY);
    }
    if (newThr == 0) {
        float ft = (float)newCap * loadFactor;
        newThr = (newCap < MAXIMUM_CAPACITY && ft < (float)MAXIMUM_CAPACITY ?
                  (int)ft : Integer.MAX_VALUE);
    }
    // Assign the resize threshold (12 on first use with defaults)
    threshold = newThr;
    @SuppressWarnings({"rawtypes","unchecked"})
    // Create the Node array (length 16 on first use with defaults)
    Node<K,V>[] newTab = (Node<K,V>[])new Node[newCap];
    table = newTab;
    if (oldTab != null) {
        for (int j = 0; j < oldCap; ++j) {
            Node<K,V> e;
            if ((e = oldTab[j]) != null) {
                oldTab[j] = null;
                if (e.next == null)
                    newTab[e.hash & (newCap - 1)] = e;
                else if (e instanceof TreeNode)
                    ((TreeNode<K,V>)e).split(this, newTab, j, oldCap);
                else { // preserve order
                    Node<K,V> loHead = null, loTail = null;
                    Node<K,V> hiHead = null, hiTail = null;
                    Node<K,V> next;
                    do {
                        next = e.next;
                        if ((e.hash & oldCap) == 0) {
                            if (loTail == null)
                                loHead = e;
                            else
                                loTail.next = e;
                            loTail = e;
                        }
                        else {
                            if (hiTail == null)
                                hiHead = e;
                            else
                                hiTail.next = e;
                            hiTail = e;
                        }
                    } while ((e = next) != null);
                    if (loTail != null) {
                        loTail.next = null;
                        newTab[j] = loHead;
                    }
                    if (hiTail != null) {
                        hiTail.next = null;
                        newTab[j + oldCap] = hiHead;
                    }
                }
            }
        }
    }
    return newTab;
}

Computing the hash value of a key:

/**
 * Computes key.hashCode() and spreads (XORs) higher bits of hash
 * to lower.  Because the table uses power-of-two masking, sets of
 * hashes that vary only in bits above the current mask will
 * always collide. (Among known examples are sets of Float keys
 * holding consecutive whole numbers in small tables.)  So we
 * apply a transform that spreads the impact of higher bits
 * downward. There is a tradeoff between speed, utility, and
 * quality of bit-spreading. Because many common sets of hashes
 * are already reasonably distributed (so don't benefit from
 * spreading), and because we use trees to handle large sets of
 * collisions in bins, we just XOR some shifted bits in the
 * cheapest possible way to reduce systematic lossage, as well as
 * to incorporate impact of the highest bits that would otherwise
 * never be used in index calculations because of table bounds.
 */
static final int hash(Object key) {
    int h;
    return (key == null) ? 0 : (h = key.hashCode()) ^ (h >>> 16);
}
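To see why the upper 16 bits are folded into the lower ones, compare bucket indexes with and without the spreading step. This sketch copies the hash() expression shown above and the (n - 1) & hash index computation from putVal(); the class and method names are hypothetical.

```java
// Hypothetical sketch reproducing HashMap's bit spreading and index computation.
class HashSpreadSketch {
    // Same expression as HashMap.hash(): XOR the high 16 bits into the low 16 bits.
    static int spread(Object key) {
        int h;
        return (key == null) ? 0 : (h = key.hashCode()) ^ (h >>> 16);
    }

    // Same masking as putVal(): tableLength must be a power of two.
    static int indexFor(int hash, int tableLength) {
        return (tableLength - 1) & hash;
    }

    public static void main(String[] args) {
        Integer k1 = 0x10000; // hashCode() is 65536, low 16 bits are all zero
        Integer k2 = 0x20000; // hashCode() is 131072, low 16 bits are all zero

        // Without spreading, both keys fall into bucket 0 of a 16-slot table.
        System.out.println(indexFor(k1.hashCode(), 16) + " " + indexFor(k2.hashCode(), 16));

        // With spreading, the high bits reach the index: buckets 1 and 2.
        System.out.println(indexFor(spread(k1), 16) + " " + indexFor(spread(k2), 16));
    }
}
```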

Deciding whether a linked list is converted to a red-black tree:

/**
 * Replaces all linked nodes in bin at index for given hash unless
 * table is too small, in which case resizes instead.
 */
final void treeifyBin(Node<K,V>[] tab, int hash) {
    int n, index; Node<K,V> e;
    if (tab == null || (n = tab.length) < MIN_TREEIFY_CAPACITY)
        // If the underlying array is shorter than 64, only resize; do not convert to a red-black tree
        resize();
    else if ((e = tab[index = (n - 1) & hash]) != null) {
        TreeNode<K,V> hd = null, tl = null;
        do {
            TreeNode<K,V> p = replacementTreeNode(e, null);
            if (tl == null)
                hd = p;
            else {
                p.prev = tl;
                tl.next = p;
            }
            tl = p;
        } while ((e = e.next) != null);
        if ((tab[index] = hd) != null)
            hd.treeify(tab);
    }
}

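Summary:

With new HashMap<>(), no underlying array is created; only loadFactor is set to 0.75.
On the first call to put(), a Node array of length 16 is created.
Consider map.put(key1, value1), possibly after many earlier put() calls:
- First, the hash value of key1 is computed from the hashCode() of key1's class plus additional bit spreading (h ^ (h >>> 16)); the position in the Node array is then derived from it as (n - 1) & hash.
- If that slot is empty, key1 - value1 is added. ---> case 1
- If that slot is not empty, it already holds one or more entries, and the hash value of key1 is compared with those of the existing entries:
  - If the hash value of key1 differs from all of them, key1 - value1 is added. ---> case 2
  - If the hash value of key1 equals that of an existing entry (key2 - value2), key1.equals(key2) is called for a further comparison:
    - If equals() returns false, key1 - value1 is added. ---> case 3
    - If equals() returns true, value1 replaces value2.
- Note: in cases 2 and 3, key1 - value1 is stored together with the existing entries in a linked list.
When the number of entries stored as a linked list at one array index exceeds 8 and the array length is at least 64, the entries at that index are converted to a red-black tree.

Storage structure: array + linked list + red-black tree.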
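The put() cases can be observed through its return value. In the sketch below, CollidingKey is a hypothetical key class whose hashCode() is deliberately constant, so every key lands in the same bucket and the outcome is decided by equals() alone.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical demo of the put() cases; CollidingKey exists only to force collisions.
class PutCasesDemo {
    static final class CollidingKey {
        final String id;

        CollidingKey(String id) {
            this.id = id;
        }

        @Override
        public boolean equals(Object o) {
            return o instanceof CollidingKey && ((CollidingKey) o).id.equals(id);
        }

        // Constant hash code: all keys share one bucket (bad for real code, useful here).
        @Override
        public int hashCode() {
            return 42;
        }
    }

    static String[] run() {
        Map<CollidingKey, String> map = new HashMap<>();
        // Empty slot: entry added, put() returns null (case 1).
        String first = map.put(new CollidingKey("k1"), "v1");
        // Same hash, equals() false: entry chained in the same bucket, returns null (case 3).
        String second = map.put(new CollidingKey("k2"), "v2");
        // Same hash, equals() true: value replaced, the old value "v1" is returned.
        String third = map.put(new CollidingKey("k1"), "v1b");
        return new String[] {first, second, third, String.valueOf(map.size())};
    }

    public static void main(String[] args) {
        for (String s : run()) {
            System.out.println(s);
        }
    }
}
```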


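Resizing:

As a HashMap holds more and more elements, hash collisions become more likely, because the length of the underlying array is fixed. To keep lookups efficient, the array must be enlarged, and this is where the most expensive step occurs: every entry in the old array has to be redistributed into the new array. This is resize().
When the number of elements in a HashMap exceeds array length * loadFactor (length is the total size of the array, not the number of stored elements, size), the array is resized. The default loadFactor of 0.75 is a compromise. By default the array length is 16, so once the element count exceeds 16 * 0.75 = 12 (this value is the threshold in the code), the array is doubled to 2 * 16 = 32, and the existing entries are redistributed into the new array.
Resizing is expensive. If the number of elements is known in advance, presizing the HashMap accordingly can effectively improve its performance.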
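The arithmetic behind the threshold, the index split performed in resize(), and presizing can be sketched as follows. ResizeSketch and its helper names are hypothetical; the presizing formula is a common rule of thumb that assumes the default load factor of 0.75.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper class illustrating resize-related arithmetic.
class ResizeSketch {
    // threshold = capacity * loadFactor, truncated to an int.
    static int threshold(int capacity, float loadFactor) {
        return (int) (capacity * loadFactor);
    }

    // After doubling, a node stays at its old index or moves up by oldCap,
    // depending on the single hash bit tested by (hash & oldCap) in resize().
    static int newIndex(int hash, int oldCap) {
        int oldIndex = hash & (oldCap - 1);
        return (hash & oldCap) == 0 ? oldIndex : oldIndex + oldCap;
    }

    // Initial capacity to request so that expectedSize entries fit without a resize,
    // assuming the default load factor of 0.75.
    static int initialCapacityFor(int expectedSize) {
        return (int) Math.ceil(expectedSize / 0.75);
    }

    public static void main(String[] args) {
        System.out.println(threshold(16, 0.75f)); // 12
        System.out.println(newIndex(5, 16));      // 5: bit 16 is clear, index unchanged
        System.out.println(newIndex(21, 16));     // 21: bit 16 is set, index 5 + 16

        // Presize for 100 entries instead of growing step by step from 16.
        Map<String, Integer> map = new HashMap<>(initialCapacityFor(100));
        for (int i = 0; i < 100; i++) {
            map.put("key" + i, i);
        }
        System.out.println(map.size()); // 100
    }
}
```

HashMap rounds the requested capacity up to the next power of two, so the value passed to the constructor is a lower bound rather than the exact table length.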

LinkedHashMap

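LinkedHashMap builds on the storage structure of HashMap and uses a doubly linked list to record the order in which entries were added. For frequent traversal, it is generally more efficient than HashMap.

When iterating, LinkedHashMap visits entries in insertion order.

On top of the HashMap structure, LinkedHashMap adds a pair of references, before and after, that point to the previous and next entries: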

/**
 * HashMap.Node subclass for normal LinkedHashMap entries.
 */
static class Entry<K,V> extends HashMap.Node<K,V> {
    Entry<K,V> before, after;
    Entry(int hash, K key, V value, Node<K,V> next) {
        super(hash, key, value, next);
    }
}
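A brief sketch of the resulting iteration order, using a hypothetical class name. Note that putting an existing key again updates its value without moving it in the default insertion-order mode.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical demo class showing LinkedHashMap's insertion-order iteration.
class LinkedHashMapOrderDemo {
    static List<String> keysInInsertionOrder() {
        Map<String, Integer> map = new LinkedHashMap<>();
        map.put("c", 3);
        map.put("a", 1);
        map.put("b", 2);
        map.put("c", 30); // existing key: value replaced, position unchanged
        return new ArrayList<>(map.keySet());
    }

    public static void main(String[] args) {
        System.out.println(keysInInsertionOrder()); // [c, a, b]
    }
}
```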

Original link

https://github.com/ACatSmiling/zero-to-zero/blob/main/JavaLanguage/java-advanced.md
