HashMap Source Code Notes

Posted by 一枚螺絲釘 on 2021-01-02

HashMap class diagram

[Figure: HashMap class diagram (image not included)]

Important comments

<p>As a general rule, the default load factor (.75) offers a good
 * tradeoff between time and space costs.  Higher values decrease the
 * space overhead but increase the lookup cost (reflected in most of
 * the operations of the {@code HashMap} class, including
 * {@code get} and {@code put}).  The expected number of entries in
 * the map and its load factor should be taken into account when
 * setting its initial capacity, so as to minimize the number of
 * rehash operations.  If the initial capacity is greater than the
 * maximum number of entries divided by the load factor, no rehash
 * operations will ever occur.
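 * 1. A load factor of 0.75 balances time and space costs well. A higher load factor reduces space overhead but makes lookups slower.
2. If the initial capacity given when creating the HashMap is greater than the actual number of entries divided by the load factor, resizing is avoided entirely.
So when using a HashMap, estimate the amount of data up front and pass a suitable size at creation time to improve efficiency.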
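
The sizing advice above can be sketched as follows. This is only an illustration: `PresizedMapSketch` and `capacityFor` are hypothetical names, not part of the JDK, and the figure of 1000 entries is an assumed workload.

```java
import java.util.HashMap;
import java.util.Map;

final class PresizedMapSketch {
    // Hypothetical helper: an initial capacity large enough that
    // `expected` entries never exceed capacity * loadFactor.
    static int capacityFor(int expected, float loadFactor) {
        return (int) Math.ceil(expected / (double) loadFactor);
    }

    public static void main(String[] args) {
        int expected = 1000;                      // assumed number of entries
        int cap = capacityFor(expected, 0.75f);   // 1334
        Map<Integer, String> map = new HashMap<>(cap, 0.75f);
        for (int i = 0; i < expected; i++) {
            map.put(i, "v" + i);                  // no rehash should be needed
        }
        System.out.println(cap + " " + map.size());
    }
}
```

HashMap rounds the requested capacity up to a power of two internally, so the table allocated here is larger than 1334.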
* Because TreeNodes are about twice the size of regular nodes, we
     * use them only when bins contain enough nodes to warrant use
     * (see TREEIFY_THRESHOLD). And when they become too small (due to
     * removal or resizing) they are converted back to plain bins.  In
     * usages with well-distributed user hashCodes, tree bins are
     * rarely used.  Ideally, under random hashCodes, the frequency of
     * nodes in bins follows a Poisson distribution
     * (http://en.wikipedia.org/wiki/Poisson_distribution) with a
     * parameter of about 0.5 on average for the default resizing
     * threshold of 0.75, although with a large variance because of
     * resizing granularity. Ignoring variance, the expected
     * occurrences of list size k are (exp(-0.5) * pow(0.5, k) /
     * factorial(k)). The first values are:
     *
     * 0:    0.60653066
     * 1:    0.30326533
     * 2:    0.07581633
     * 3:    0.01263606
     * 4:    0.00157952
     * 5:    0.00015795
     * 6:    0.00001316
     * 7:    0.00000094
     * 8:    0.00000006
     * more: less than 1 in ten million
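     When a bin's list grows past the threshold, the linked list is converted to a red-black tree to speed up lookups. Under ideal conditions a list of 8 nodes occurs with a probability of about 0.00000006, and anything longer with less than one in ten million, so treeification only guards against extreme hash distributions.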
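
The table in the comment can be reproduced from the formula it quotes, exp(-0.5) * pow(0.5, k) / factorial(k). The class and method names below are hypothetical; this is a small check of the arithmetic, not JDK code.

```java
final class PoissonBinSketch {
    // P(k) = exp(-lambda) * lambda^k / k!
    static double probability(double lambda, int k) {
        double factorial = 1.0;
        for (int i = 2; i <= k; i++) {
            factorial *= i;
        }
        return Math.exp(-lambda) * Math.pow(lambda, k) / factorial;
    }

    public static void main(String[] args) {
        // lambda = 0.5 is the parameter quoted in the HashMap comment
        for (int k = 0; k <= 8; k++) {
            System.out.printf("%d: %.8f%n", k, probability(0.5, k));
        }
    }
}
```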

Internal data structures

static class Node<K,V> implements Map.Entry<K,V> {

        final int hash;
        final K key;
        V value;
        Node<K,V> next;
}
transient Node<K,V>[] table;

Constructors

/**
     * Constructs an empty {@code HashMap} with the specified initial
     * capacity and load factor.
     *
     * @param  initialCapacity the initial capacity
     * @param  loadFactor      the load factor
     * @throws IllegalArgumentException if the initial capacity is negative
     *         or the load factor is nonpositive
     *
     * A constructor that lets the caller set the initial capacity and the load factor
     */
    public HashMap(int initialCapacity, float loadFactor) {
        if (initialCapacity < 0)
            throw new IllegalArgumentException("Illegal initial capacity: " +
                                               initialCapacity);
        // the maximum map capacity is 1 << 30
        if (initialCapacity > MAXIMUM_CAPACITY)
            initialCapacity = MAXIMUM_CAPACITY;

        // When could this be NaN? For example, 0.0f / 0.0f evaluates to NaN.
        if (loadFactor <= 0 || Float.isNaN(loadFactor))
            throw new IllegalArgumentException("Illegal load factor: " +
                                               loadFactor);
        this.loadFactor = loadFactor;
        // round the requested initial size up to a power of two;
        // threshold temporarily holds the table capacity until the table is allocated
        this.threshold = tableSizeFor(initialCapacity);
    }

/**
     * Returns a power of two size for the given target capacity.
     * Why must the size be a power of two?
     * 2^n - 1 is all ones in binary, so it acts as a bit mask for computing the bucket index of a key.
     */
    static final int tableSizeFor(int cap) {
        // Count the leading zero bits of cap - 1 (two's complement).
        // -1 in sign-magnitude:   10000000000000000000000000000001
        // -1 in two's complement: 11111111111111111111111111111111
        // Suppose cap == 16: 00000000000000000000000000010000, 27 leading zeros
        // cap - 1 == 15:     00000000000000000000000000001111, 28 leading zeros
        // n == -1 >>> 28 ==  00000000000000000000000000001111 == 15
        // so n always has the form 2^x - 1
        int n = -1 >>> Integer.numberOfLeadingZeros(cap - 1);
        return (n < 0) ? 1 : (n >= MAXIMUM_CAPACITY) ? MAXIMUM_CAPACITY : n + 1;
    }
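
To see the rounding and the mask in action, here is a small sketch that copies the method above into a hypothetical class `TableSizeSketch` (with `MAXIMUM_CAPACITY` redeclared as 1 << 30, as noted in the constructor) and adds an illustrative `indexFor` helper that is not part of HashMap.

```java
final class TableSizeSketch {
    static final int MAXIMUM_CAPACITY = 1 << 30;

    // Same logic as the method shown above.
    static int tableSizeFor(int cap) {
        int n = -1 >>> Integer.numberOfLeadingZeros(cap - 1);
        return (n < 0) ? 1 : (n >= MAXIMUM_CAPACITY) ? MAXIMUM_CAPACITY : n + 1;
    }

    // Hypothetical helper: bucket index via the (n - 1) mask.
    static int indexFor(int hash, int n) {
        return (n - 1) & hash;
    }

    public static void main(String[] args) {
        for (int cap : new int[] {0, 1, 16, 17, 1000}) {
            System.out.println(cap + " -> " + tableSizeFor(cap));
        }
        // For a power-of-two n the mask agrees with a non-negative modulo.
        System.out.println(indexFor(-7, 16) == Math.floorMod(-7, 16));
    }
}
```

Because the length is a power of two, the mask gives the same result as `Math.floorMod(hash, n)` without a division.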

hash()

	/**
	* XOR the high and low halves of the hashCode so that the high bits also influence the bucket index
	*/
	static final int hash(Object key) {
        int h;
        // XOR the key's hashCode with its own upper 16 bits
        // to spread the bits as much as possible and reduce collisions
        return (key == null) ? 0 : (h = key.hashCode()) ^ (h >>> 16);
    }
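
A quick way to see why the XOR helps: two hash codes that differ only above bit 15 land in the same bucket of a 16-slot table unless the high bits are folded in. The sketch below uses a hypothetical `spread` method that mirrors the non-null branch of `hash()`; the sample values are chosen only for illustration.

```java
final class HashSpreadSketch {
    // Mirrors the non-null branch of hash(): fold the upper 16 bits into the lower 16.
    static int spread(int h) {
        return h ^ (h >>> 16);
    }

    public static void main(String[] args) {
        int mask = 16 - 1;            // table length 16
        int a = 0x10000, b = 0x20000; // differ only in the high bits
        System.out.println((a & mask) + " " + (b & mask));                 // 0 0
        System.out.println((spread(a) & mask) + " " + (spread(b) & mask)); // 1 2
    }
}
```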

put()

/**
     * Associates the specified value with the specified key in this map.
     * If the map previously contained a mapping for the key, the old
     * value is replaced.
     *
     * @param key key with which the specified value is to be associated
     * @param value value to be associated with the specified key
     * @return the previous value associated with {@code key}, or
     *         {@code null} if there was no mapping for {@code key}.
     *         (A {@code null} return can also indicate that the map
     *         previously associated {@code null} with {@code key}.)
     */
    public V put(K key, V value) {
        return putVal(hash(key), key, value, false, true);
    }

    /**
     * Implements Map.put and related methods.
     *
     * @param hash hash for key
     * @param key the key
     * @param value the value to put
     * @param onlyIfAbsent if true, don't change existing value
     * @param evict if false, the table is in creation mode.
     * @return previous value, or null if none
     */
    final V putVal(int hash, K key, V value, boolean onlyIfAbsent,
                   boolean evict) {
        Node<K,V>[] tab; Node<K,V> p; int n, i;
        if ((tab = table) == null || (n = tab.length) == 0)
            // initialize the table
            n = (tab = resize()).length;
        // bucket index = (table length - 1) & hash
        // when key == null, hash is 0
        if ((p = tab[i = (n - 1) & hash]) == null)
            // the bucket is empty: create a node and store it at this index
            tab[i] = newNode(hash, key, value, null);
        else {
            // the bucket is already occupied (hash collision or same key)
            Node<K,V> e; K k;
            // the head node has the same hash and an equal key
            if (p.hash == hash &&
                ((k = p.key) == key || (key != null && key.equals(k))))
                // p and e now refer to the same node
                e = p;
            else if (p instanceof TreeNode)
                // p is a tree node: insert into the tree
                e = ((TreeNode<K,V>)p).putTreeVal(this, tab, hash, key, value);
            else {
                // neither equal to the head node nor a tree bin: walk the linked list
                for (int binCount = 0; ; ++binCount) {
                    if ((e = p.next) == null) {
                        // reached the last node of the list: append the new node
                        // (tail insertion)
                        p.next = newNode(hash, key, value, null);
                        // when binCount is 7, p is the 8th node, so the appended node is the 9th; the bin is then treeified
                        if (binCount >= TREEIFY_THRESHOLD - 1) // -1 for 1st
                            treeifyBin(tab, hash);
                        break;
                    }
                    // check whether the current node has the same key as the one being added
                    if (e.hash == hash &&
                        ((k = e.key) == key || (key != null && key.equals(k))))
                        break;
                    p = e;
                }
            }
            if (e != null) { // existing mapping for key
                // read the old value
                V oldValue = e.value;
                // replace it if overwriting is allowed or the old value is null
                if (!onlyIfAbsent || oldValue == null)
                    e.value = value;
                afterNodeAccess(e);
                return oldValue;
            }
        }
        // structural modification counter
        ++modCount;
        // resize once the new size exceeds the threshold;
        // no resize happens while size == threshold
        if (++size > threshold)
            resize();
        afterNodeInsertion(evict);
        return null;
    }
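
The return-value rules in `putVal` can be observed through the public API. The sketch below assumes, as in the JDK, that `putIfAbsent` is the caller that passes `onlyIfAbsent = true`; the class name `PutSemanticsSketch` is hypothetical.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class PutSemanticsSketch {
    // Collects the return value of each call so the behavior is visible.
    static List<Integer> run() {
        Map<String, Integer> map = new HashMap<>();
        List<Integer> results = new ArrayList<>();
        results.add(map.put("a", 1));          // no previous mapping: null
        results.add(map.put("a", 2));          // replaces 1 and returns it
        results.add(map.putIfAbsent("a", 3));  // keeps 2 and returns it
        map.put("b", null);
        results.add(map.putIfAbsent("b", 9));  // old value is null: returns null, stores 9
        results.add(map.get("b"));             // 9
        return results;
    }

    public static void main(String[] args) {
        System.out.println(run()); // [null, 1, 2, null, 9]
    }
}
```

The fourth call shows the `oldValue == null` branch: even with `onlyIfAbsent` set, a key mapped to null gets the new value.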

get() & remove()

public V get(Object key) {
        Node<K,V> e;
        return (e = getNode(hash(key), key)) == null ? null : e.value;
    }

    /**
     * Implements Map.get and related methods.
     *
     * @param hash hash for key
     * @param key the key
     * @return the node, or null if none
     */
    final Node<K,V> getNode(int hash, Object key) {
        Node<K,V>[] tab; Node<K,V> first, e; int n; K k;
        // check that the table is initialized and non-empty,
        // and that the bucket for this key holds a node
        if ((tab = table) != null && (n = tab.length) > 0 &&
            (first = tab[(n - 1) & hash]) != null) {
            // same hash, and the key is identical (==) or equal (equals)
            if (first.hash == hash && // always check first node
                ((k = first.key) == key || (key != null && key.equals(k))))
                return first;
            // there is more than one node in the bucket
            if ((e = first.next) != null) {
                // red-black tree bin: search the tree
                if (first instanceof TreeNode)
                    return ((TreeNode<K,V>)first).getTreeNode(hash, key);
                do {
                    // e starts at first.next; check whether it matches
                    if (e.hash == hash &&
                        ((k = e.key) == key || (key != null && key.equals(k))))
                        return e;
                } while ((e = e.next) != null);
            }
        }
        return null;
    }

final Node<K,V> removeNode(int hash, Object key, Object value,
                               boolean matchValue, boolean movable) {
        Node<K,V>[] tab; Node<K,V> p; int n, index;

        if ((tab = table) != null && (n = tab.length) > 0 &&
            (p = tab[index = (n - 1) & hash]) != null) {
            // locate the node to remove, same as in get
            Node<K,V> node = null, e; K k; V v;
            if (p.hash == hash &&
                ((k = p.key) == key || (key != null && key.equals(k))))
                node = p;
            else if ((e = p.next) != null) {
                if (p instanceof TreeNode)
                    node = ((TreeNode<K,V>)p).getTreeNode(hash, key);
                else {
                    do {
                        if (e.hash == hash &&
                            ((k = e.key) == key ||
                             (key != null && key.equals(k)))) {
                            node = e;
                            break;
                        }
                        p = e;
                    } while ((e = e.next) != null);
                }
            }


            if (node != null && (!matchValue || (v = node.value) == value ||
                                 (value != null && value.equals(v)))) {
                if (node instanceof TreeNode)
                    ((TreeNode<K,V>)node).removeTreeNode(this, tab, movable);
                else if (node == p)
                    // the node to remove is the first node of the bucket
                    tab[index] = node.next;
                else
                    // node is the node being removed,
                    // p is its predecessor
                    p.next = node.next;
                ++modCount;
                --size;
                afterNodeRemoval(node);
                return node;
            }
        }
        return null;
    }
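
Two consequences of the code above are easy to check from the outside: `get` returns null both for a missing key and for a key mapped to null, and a value-matching removal leaves the entry alone when the value differs. The sketch assumes that the public `remove(key, value)` overload is the one that sets `matchValue`, as in the JDK; `GetRemoveSketch` is a hypothetical name.

```java
import java.util.HashMap;
import java.util.Map;

final class GetRemoveSketch {
    static boolean[] run() {
        Map<String, Integer> map = new HashMap<>();
        map.put("a", 1);
        map.put("n", null);

        // get cannot distinguish "absent" from "mapped to null"; containsKey can.
        boolean nullIsAmbiguous = map.get("n") == null && map.get("missing") == null
                && map.containsKey("n") && !map.containsKey("missing");

        // remove(key, value) only removes when the stored value matches.
        boolean mismatchKept = !map.remove("a", 2) && map.containsKey("a");
        boolean matchRemoved = map.remove("a", 1) && !map.containsKey("a");

        return new boolean[] { nullIsAmbiguous, mismatchKept, matchRemoved };
    }

    public static void main(String[] args) {
        for (boolean b : run()) {
            System.out.println(b);
        }
    }
}
```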

resize()

final Node<K,V>[] resize() {
        Node<K,V>[] oldTab = table;
        // length of the old array
        int oldCap = (oldTab == null) ? 0 : oldTab.length;
        int oldThr = threshold;
        int newCap, newThr = 0;
        if (oldCap > 0) {
            if (oldCap >= MAXIMUM_CAPACITY) {
                // capacity is already at the maximum: set threshold to Integer.MAX_VALUE
                threshold = Integer.MAX_VALUE;
                return oldTab;
            }
            else if ((newCap = oldCap << 1) < MAXIMUM_CAPACITY &&
                     oldCap >= DEFAULT_INITIAL_CAPACITY)
                // TODO: why is the threshold not simply doubled when oldCap < 16?
                newThr = oldThr << 1; // double threshold
        }
        else if (oldThr > 0) // initial capacity was placed in threshold
            // a capacity was given at construction (kept in threshold): use it as newCap
            newCap = oldThr;
        else {               // zero initial threshold signifies using defaults
            // no capacity was given at construction: use the defaults
            newCap = DEFAULT_INITIAL_CAPACITY;
            newThr = (int)(DEFAULT_LOAD_FACTOR * DEFAULT_INITIAL_CAPACITY);
        }
        if (newThr == 0) {
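            // reached when a capacity was given at construction and the table is being
            // created for the first time, or when the doubling above was skipped (e.g. oldCap < 16)
            float ft = (float)newCap * loadFactor;
            // loadFactor is caller-supplied, so ft must also be checked against MAXIMUM_CAPACITY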
            newThr = (newCap < MAXIMUM_CAPACITY && ft < (float)MAXIMUM_CAPACITY ?
                      (int)ft : Integer.MAX_VALUE);
        }
        threshold = newThr;
        @SuppressWarnings({"rawtypes","unchecked"})
        Node<K,V>[] newTab = (Node<K,V>[])new Node[newCap];
        table = newTab;
        if (oldTab != null) {
            for (int j = 0; j < oldCap; ++j) {
                Node<K,V> e;
                if ((e = oldTab[j]) != null) {
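                    // TODO: why? Presumably to drop the old table's reference to the bin so it can be garbage-collected.
                    oldTab[j] = null;
                    if (e.next == null)
                        // only one node in this bucket: place it directly
                        newTab[e.hash & (newCap - 1)] = e;
                    else if (e instanceof TreeNode)
                        // tree bin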
                        ((TreeNode<K,V>)e).split(this, newTab, j, oldCap);
                    else { // preserve order
                        Node<K,V> loHead = null, loTail = null;
                        Node<K,V> hiHead = null, hiTail = null;
                        Node<K,V> next;
                        do {
                            next = e.next;
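                            // AND the hash with the old array length
                            /**
                             * Suppose the old length is 16 (binary 10000).
                             * An element stored at index 0 satisfies hash & 1111 == 0,
                             * so its hash may be 10000, 100000, 110000, ...; the index already fixes the low 4 bits.
                             *
                             * To decide whether the element stays at the current index or moves to index 16,
                             * it is enough to check whether bit 5 is 1.
                             */
                            // tail insertion is used here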
                            if ((e.hash & oldCap) == 0) {
                                if (loTail == null)
                                    loHead = e;
                                else
                                /**
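                                 * The first time this branch runs, loHead and loTail point to the same object,
                                 * so assigning loTail.next is the same as assigning loHead.next.
                                 * The second time, loTail points to loHead.next,
                                 * so assigning loTail.next is the same as loHead.next.next = e.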
                                 */
                                    loTail.next = e;
                                loTail = e;
                            }
                            else {
                                if (hiTail == null)
                                    hiHead = e;
                                else
                                    hiTail.next = e;
                                hiTail = e;
                            }
                        } while ((e = next) != null);
                        if (loTail != null) {
                            // terminate the list, cutting the link left over from the old list
                            loTail.next = null;
                            newTab[j] = loHead;
                        }
                        if (hiTail != null) {
                            hiTail.next = null;
                            newTab[j + oldCap] = hiHead;
                        }
                    }
                }
            }
        }
        return newTab;
    }
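
The lo/hi split rule can be checked in isolation. The sketch below uses a hypothetical helper `newIndex` that applies the same `(hash & oldCap) == 0` test as the loop above and compares the result with a full recomputation of the index for the doubled table.

```java
final class ResizeSplitSketch {
    // Index of a node after the table doubles from oldCap to 2 * oldCap.
    static int newIndex(int hash, int oldCap) {
        int oldIndex = hash & (oldCap - 1);
        // Only the bit at position oldCap decides: stay (lo) or move by oldCap (hi).
        return (hash & oldCap) == 0 ? oldIndex : oldIndex + oldCap;
    }

    public static void main(String[] args) {
        int oldCap = 16;
        for (int hash : new int[] {0b10000, 0b100000, 0b110000}) {
            System.out.println(Integer.toBinaryString(hash) + " -> " + newIndex(hash, oldCap));
        }
        // prints: 10000 -> 16, 100000 -> 0, 110000 -> 16
    }
}
```

The result always equals `hash & (2 * oldCap - 1)`, which is why resize never needs to recompute the full index.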

Treeification

Basic data structure

/**
* TreeNode extends LinkedHashMap.Entry, which in turn extends HashMap.Node
*/
static final class TreeNode<K,V> extends LinkedHashMap.Entry<K,V> {
        // parent node
        TreeNode<K,V> parent;  // red-black tree links
        // left child
        TreeNode<K,V> left;
        // right child
        TreeNode<K,V> right;
        // previous node in the bin's list order
        TreeNode<K,V> prev;    // needed to unlink next upon deletion
        boolean red;
        TreeNode(int hash, K key, V val, Node<K,V> next) {
            super(hash, key, val, next);
        }
}

treeifyBin()

/**
     * Replaces all linked nodes in bin at index for given hash unless
     * table is too small, in which case resizes instead.
     */
    final void treeifyBin(Node<K,V>[] tab, int hash) {
        int n, index; Node<K,V> e;
        // if the table length is below 64 (MIN_TREEIFY_CAPACITY), resize instead of treeifying
        if (tab == null || (n = tab.length) < MIN_TREEIFY_CAPACITY)
            resize();
        else if ((e = tab[index = (n - 1) & hash]) != null) {
            // hd marks the head of the new list,
            // tl is the tail used to link nodes together
            TreeNode<K,V> hd = null, tl = null;
            do {
                // convert to a TreeNode whose next pointer is null
                TreeNode<K,V> p = replacementTreeNode(e, null);
                if (tl == null)
                    hd = p;
                else {
                    p.prev = tl;
                    tl.next = p;
                }
                tl = p;
            } while ((e = e.next) != null);
            // index = (n - 1 & hash)
            if ((tab[index] = hd) != null)
                // build the tree
                hd.treeify(tab);
        }
    }

treeify()

/**
         * Forms tree of the nodes linked from this node.
         */
        final void treeify(Node<K,V>[] tab) {
            TreeNode<K,V> root = null;
            for (TreeNode<K,V> x = this, next; x != null; x = next) {
                next = (TreeNode<K,V>)x.next;
                x.left = x.right = null;
                // no root yet
                if (root == null) {
                    x.parent = null;
                    // red-black property: the root is always black
                    x.red = false;
                    root = x;
                }
                else {
                    K k = x.key;
                    int h = x.hash;
                    Class<?> kc = null;
                    for (TreeNode<K,V> p = root;;) {
                        int dir, ph;
                        K pk = p.key;
                        if ((ph = p.hash) > h)
                            dir = -1;
                        else if (ph < h)
                            dir = 1;
                        // equal hashes: if the key's class implements Comparable, order by compareTo;
                        // if it does not, or compareTo returns 0, fall back to tieBreakOrder (class name, then System.identityHashCode)
                        else if ((kc == null &&
                                  (kc = comparableClassFor(k)) == null) ||
                                 (dir = compareComparables(kc, k, pk)) == 0)
                            dir = tieBreakOrder(k, pk);

                        TreeNode<K,V> xp = p;
                        // dir <= 0: go to the left of the current node, otherwise to the right
                        if ((p = (dir <= 0) ? p.left : p.right) == null) {
                            // set the parent link
                            x.parent = xp;
                            if (dir <= 0)
                                xp.left = x;
                            else
                                xp.right = x;
                            // rebalance the tree
                            root = balanceInsertion(root, x);
                            break;
                        }
                    }
                }
            }
            // fix up the list links so that root is the first node of the bin
            moveRootToFront(tab, root);
        }
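
The tree structure is internal and cannot be inspected through the public API, but its effect can be exercised: with a key type whose hashCode always collides, every entry lands in one bin, and lookups still work once the bin has grown well past the treeify threshold. `CollidingKeysSketch` and its `Key` class are hypothetical and exist only for this illustration; `Key` implements Comparable so that the compareTo ordering described above applies.

```java
import java.util.HashMap;
import java.util.Map;

final class CollidingKeysSketch {
    // Hypothetical key type: every instance has the same hashCode.
    static final class Key implements Comparable<Key> {
        final int id;
        Key(int id) { this.id = id; }
        @Override public int hashCode() { return 42; }
        @Override public boolean equals(Object o) {
            return o instanceof Key && ((Key) o).id == id;
        }
        @Override public int compareTo(Key other) {
            return Integer.compare(id, other.id);
        }
    }

    // Puts n colliding keys into a single bin.
    static Map<Key, Integer> build(int n) {
        Map<Key, Integer> map = new HashMap<>();
        for (int i = 0; i < n; i++) {
            map.put(new Key(i), i);
        }
        return map;
    }

    public static void main(String[] args) {
        Map<Key, Integer> map = build(100);
        System.out.println(map.size() + " " + map.get(new Key(57)));
    }
}
```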

Rebalancing the tree

static <K,V> TreeNode<K,V> balanceInsertion(TreeNode<K,V> root,
                                                    TreeNode<K,V> x) {
            // a newly inserted node is red
            x.red = true;
            for (TreeNode<K,V> xp, xpp, xppl, xppr;;) {
                // root
                if ((xp = x.parent) == null) {
                    x.red = false;
                    return x;
                }
                // the parent is black, or there is no grandparent
                // (the parent is the root): nothing to fix
                else if (!xp.red || (xpp = xp.parent) == null)
                    return root;
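                // the parent is the grandparent's left child
                if (xp == (xppl = xpp.left)) {
                    // the grandparent's right child (the uncle) is non-null and red
                    if ((xppr = xpp.right) != null && xppr.red) {
                        // recolor:
                        // the uncle becomes black
                        xppr.red = false;
                        // the parent (the grandparent's left child) becomes black
                        xp.red = false;
                        // the grandparent becomes red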
                        xpp.red = true;
                        x = xpp;
                    }
                    // otherwise: the uncle is null or black
                    else {
                        // x is its parent's right child
                        if (x == xp.right) {
                            // rotate left around xp
                            root = rotateLeft(root, x = xp);
                            // re-derive xp and xpp after the rotation
                            xpp = (xp = x.parent) == null ? null : xp.parent;
                        }
                        if (xp != null) {
                            // red-black property: two red nodes must not be adjacent,
                            // so xp becomes black (x is red)
                            xp.red = false;
                            if (xpp != null) {
                                xpp.red = true;
                                // rotate right around xpp
                                root = rotateRight(root, xpp);
                            }
                        }
                    }
                }
                // the parent is the grandparent's right child (mirror case)
                else {
                    if (xppl != null && xppl.red) {
                        xppl.red = false;
                        xp.red = false;
                        xpp.red = true;
                        x = xpp;
                    }
                    else {
                        if (x == xp.left) {
                            root = rotateRight(root, x = xp);
                            xpp = (xp = x.parent) == null ? null : xp.parent;
                        }
                        if (xp != null) {
                            xp.red = false;
                            if (xpp != null) {
                                xpp.red = true;
                                root = rotateLeft(root, xpp);
                            }
                        }
                    }
                }
            }
        }

/* ------------------------------------------------------------ */
        // Red-black tree methods, all adapted from CLR
        /**
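         * Left rotation (right rotation is symmetric):
         * 1. p's right child becomes the left subtree of r (r = p.right)
         * 2. r.parent = pp
         * 3. pp's child link now points to r
         * 4. r.left = p, p.parent = r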
         */
        static <K,V> TreeNode<K,V> rotateLeft(TreeNode<K,V> root,
                                              TreeNode<K,V> p) {
            TreeNode<K,V> r, pp, rl;
            // p is non-null and has a non-null right child
            if (p != null && (r = p.right) != null) {
                // p.right = r.left:
                // p's right link takes over r's left subtree;
                // if that subtree is non-null,
                if ((rl = p.right = r.left) != null)
                    // update its parent
                    rl.parent = p;
                // r.parent becomes pp;
                // if pp is null, p was the root
                if ((pp = r.parent = p.parent) == null)
                    // r is now the root, so color it black
                    (root = r).red = false;
                // otherwise point pp's child link at r
                else if (pp.left == p)
                    pp.left = r;
                else
                    pp.right = r;
                // p becomes r's left child
                r.left = p;
                p.parent = r;
            }
            return root;
        }
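
To make the four steps concrete, here is a stripped-down sketch of the same rotation on a three-node tree. The node class `N` is hypothetical and omits colors, so this only illustrates the pointer changes, not the full red-black logic.

```java
final class RotateLeftSketch {
    // Minimal node type for illustration only (no colors, no values).
    static final class N {
        final int key;
        N parent, left, right;
        N(int key) { this.key = key; }
    }

    // Same pointer moves as rotateLeft above, without the color update.
    static N rotateLeft(N root, N p) {
        N r, pp, rl;
        if (p != null && (r = p.right) != null) {
            if ((rl = p.right = r.left) != null)
                rl.parent = p;
            if ((pp = r.parent = p.parent) == null)
                root = r;
            else if (pp.left == p)
                pp.left = r;
            else
                pp.right = r;
            r.left = p;
            p.parent = r;
        }
        return root;
    }

    // Builds 1 -> right 3 -> left 2, then rotates left around 1.
    static N sample() {
        N p = new N(1), r = new N(3), rl = new N(2);
        p.right = r; r.parent = p;
        r.left = rl; rl.parent = r;
        return rotateLeft(p, p);
    }

    public static void main(String[] args) {
        N root = sample();
        // 3 is the new root, 1 its left child, 2 the right child of 1
        System.out.println(root.key + " " + root.left.key + " " + root.left.right.key);
    }
}
```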

Resizing tree bins

 /**
         * Splits nodes in a tree bin into lower and upper tree bins,
         * or untreeifies if now too small. Called only from resize;
         * see above discussion about split bits and indices.
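         * Tree nodes still keep their prev/next references, so the bin can be split just like a linked list;
         * the low and high lists are then each checked to decide whether to treeify.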
         *
         * @param map the map
         * @param tab the table for recording bin heads
         * @param index the index of the table being split (the current index)
         * @param bit the bit of hash to split on (the old capacity)
         */
        final void split(HashMap<K,V> map, Node<K,V>[] tab, int index, int bit) {
            TreeNode<K,V> b = this;
            // Relink into lo and hi lists, preserving order
            TreeNode<K,V> loHead = null, loTail = null;
            TreeNode<K,V> hiHead = null, hiTail = null;
            int lc = 0, hc = 0;
            for (TreeNode<K,V> e = b, next; e != null; e = next) {
                next = (TreeNode<K,V>)e.next;
                e.next = null;
                // low list
                if ((e.hash & bit) == 0) {
                    if ((e.prev = loTail) == null)
                        loHead = e;
                    else
                        loTail.next = e;
                    loTail = e;
                    // count the nodes in the list
                    ++lc;
                }
                // high list
                else {
                    if ((e.prev = hiTail) == null)
                        hiHead = e;
                    else
                        hiTail.next = e;
                    hiTail = e;
                    ++hc;
                }
            }

            if (loHead != null) {
                // 6 or fewer nodes (UNTREEIFY_THRESHOLD): convert back to a linked list
                if (lc <= UNTREEIFY_THRESHOLD)
                    tab[index] = loHead.untreeify(map);
                else {
                    tab[index] = loHead;
                    // if hiHead is null, nothing was split off to the high list, so this is still the original tree
                    if (hiHead != null) // (else is already treeified)
                        // treeify
                        loHead.treeify(tab);
                }
            }
            if (hiHead != null) {
                if (hc <= UNTREEIFY_THRESHOLD)
                    tab[index + bit] = hiHead.untreeify(map);
                else {
                    tab[index + bit] = hiHead;
                    if (loHead != null)
                        hiHead.treeify(tab);
                }
            }
        }

Tree lookup and removal

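Lookup in a tree bin is essentially a binary search down from the root.

Removing a node:
1. Find the node to remove.
2. Find the replacement node:
If the node has no children, remove it directly.
If it has only a left subtree, replace it with its left child.
If it has only a right subtree, replace it with its right child.
If it has both subtrees, replace it with the smallest node greater than it.
3. Rebalance the tree.
4. Fix up the list links (the prev/next links kept between tree nodes).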
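
For the two-subtree case in step 2, the replacement is found by going right once and then left as far as possible. The sketch below shows only that search on a plain binary search tree; the class `SuccessorSketch` and its node type `N` are hypothetical and it does not attempt the rebalancing or list fix-up steps.

```java
final class SuccessorSketch {
    // Minimal binary-search-tree node for illustration only.
    static final class N {
        final int key;
        N left, right;
        N(int key) { this.key = key; }
    }

    // Smallest node greater than `node`; assumes node.right != null.
    static N successor(N node) {
        N s = node.right;
        while (s.left != null) {
            s = s.left;
        }
        return s;
    }

    // Tree: 5 has children 3 and 8; 8 has left child 6; 6 has right child 7.
    static int sample() {
        N root = new N(5);
        root.left = new N(3);
        root.right = new N(8);
        root.right.left = new N(6);
        root.right.left.right = new N(7);
        return successor(root).key;
    }

    public static void main(String[] args) {
        System.out.println(sample()); // 6
    }
}
```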
