ArrayMap in Detail, with Source Code Analysis

Posted by 仰簡 on 2019-02-27

Original URL: https://flycode.co/archives/263529

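I. Introduction

In "SparseArray in Detail, with a Brief Source Analysis" we covered the basic usage, characteristics, and implementation of SparseArray. The same Android SDK utility package contains another equally important data structure, ArrayMap. Its purpose is similar: when the data set is small, say up to a few hundred entries, it can replace HashMap and use memory more efficiently.

If you are interested in how HashMap is implemented, see "HashMap in Detail, with Source Analysis". This article looks at how to use ArrayMap and how it is implemented.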

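II. A Brief Look at the Source

1. Demo and a brief analysis

Before analyzing the source, let's look at a demo first; the analysis of the implementation that follows refers back to it.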

    ArrayMap<String, String> arrayMap = new ArrayMap<>();
    arrayMap.put(null, "張大哥");
    arrayMap.put("abcd", "A大哥");
    arrayMap.put("aabb", "巴大哥");
    arrayMap.put("aacc", "牛大哥");
    arrayMap.put("aadd", "牛大哥");
    arrayMap.put("abcd", "B大哥");
    Set<ArrayMap.Entry<String, String>> sets = arrayMap.entrySet();
    for (ArrayMap.Entry<String, String> set : sets) {
        Log.d(TAG, "arrayMapSample: key = " + set.getKey() + ";value = " + set.getValue());
    }

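The code makes six put() calls, but only five entries are printed, because the key "abcd" appears twice and the second value overwrites the first. Note also that null is allowed as a key. The output is as follows.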

    arrayMapSample: key = null;value = 張大哥
    arrayMapSample: key = aabb;value = 巴大哥
    arrayMapSample: key = aacc;value = 牛大哥
    arrayMapSample: key = aadd;value = 牛大哥
    arrayMapSample: key = abcd;value = B大哥

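With the Debug feature of Android Studio you can also take a quick look at how the map is stored in memory.

2. Source analysis

First, a quick look at the class diagram of ArrayMap.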

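Unlike HashMap, which extends AbstractMap, ArrayMap implements the Map interface directly. It also stores key-value pairs differently: everything lives in two plain arrays. mHashes stores the hash code of a key at position index, while mArray stores that key at index << 1 and its value at (index << 1) + 1. Put simply, keys sit at even positions and their values at the adjacent odd positions.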
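The index arithmetic described above can be tried out in plain Java. The following is only a sketch of the layout, not the Android class itself; the class name ArrayMapLayoutSketch and the helpers keySlot() and valueSlot() are hypothetical names introduced for illustration.

```java
public class ArrayMapLayoutSketch {
    // Slot in the object array that holds the key of entry `index`.
    static int keySlot(int index) {
        return index << 1;
    }

    // Slot that holds the value. The parentheses matter: in Java `+` binds
    // tighter than `<<`, so `index << 1 + 1` would mean `index << 2`.
    static int valueSlot(int index) {
        return (index << 1) + 1;
    }

    public static void main(String[] args) {
        // Two parallel arrays: hash codes in ascending order, and key/value pairs.
        int[] hashes = {"aabb".hashCode(), "aacc".hashCode()};
        Object[] array = {"aabb", "v1", "aacc", "v2"};
        for (int i = 0; i < hashes.length; i++) {
            System.out.println(hashes[i] + " -> "
                    + array[keySlot(i)] + " = " + array[valueSlot(i)]);
        }
    }
}
```

Entry i in the hash array therefore always maps to slots 2i and 2i + 1 in the object array.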

Initializing ArrayMap

    /**
     * Create a new empty ArrayMap.  The default capacity of an array map is 0, and
     * will grow once items are added to it.
     */
    public ArrayMap() {
        this(0, false);
    }

    /**
     * Create a new ArrayMap with a given initial capacity.
     */
    public ArrayMap(int capacity) {
        this(capacity, false);
    }

    /** {@hide} */
    public ArrayMap(int capacity, boolean identityHashCode) {
        mIdentityHashCode = identityHashCode;
        // If this is immutable, use the sentinel EMPTY_IMMUTABLE_INTS
        // instance instead of the usual EmptyArray.INT. The reference
        // is checked later to see if the array is allowed to grow.
        if (capacity < 0) {
            mHashes = EMPTY_IMMUTABLE_INTS;
            mArray = EmptyArray.OBJECT;
        } else if (capacity == 0) {
            mHashes = EmptyArray.INT;
            mArray = EmptyArray.OBJECT;
        } else {
            allocArrays(capacity);
        }
        mSize = 0;
    }

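All three overloaded constructors of ArrayMap are listed above. We normally use the default one, so the default capacity is 0 and the arrays only grow once an element is inserted. The other constructor parameter, identityHashCode, controls whether the hash code comes from System.identityHashCode() or from the key's own hashCode(). System.identityHashCode() returns what the default Object.hashCode() would return, and returns 0 for null. The two therefore agree for classes that do not override hashCode(), but can differ for classes that do, such as String. The null case makes no difference either way, because put() already treats the hash code of a null key as 0. Since the constructor that takes this parameter is marked {@hide}, application code normally ends up with identityHashCode = false.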
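To see how the two hash sources relate, here is a small plain-Java sketch. The class name IdentityHashSketch and the helpers contentHash() and identityHash() are hypothetical; they only mirror the two branches of the expression used in put() and are not part of ArrayMap.

```java
public class IdentityHashSketch {
    // Hash used when mIdentityHashCode is false: the key's own hashCode(),
    // with null mapped to 0 as put() does.
    static int contentHash(Object key) {
        return key == null ? 0 : key.hashCode();
    }

    // Hash used when mIdentityHashCode is true: ignores any hashCode() override.
    // System.identityHashCode(null) is defined to return 0.
    static int identityHash(Object key) {
        return System.identityHashCode(key);
    }

    public static void main(String[] args) {
        String a = new String("abcd");
        String b = new String("abcd");
        // String overrides hashCode(), so equal contents give equal hashes.
        System.out.println(contentHash(a) == contentHash(b)); // true
        // Identity hashes belong to the objects, not their contents,
        // so these two values are usually different.
        System.out.println(identityHash(a) + " vs " + identityHash(b));
        // For a class that does not override hashCode(), both agree.
        Object plain = new Object();
        System.out.println(contentHash(plain) == identityHash(plain)); // true
        System.out.println(identityHash(null)); // 0
    }
}
```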

Inserting an element: put()

    public V put(K key, V value) {
        final int osize = mSize;
        // 1. Compute the hash code and look up the index
        final int hash;
        int index;
        if (key == null) {
            // A null key always uses hash 0
            hash = 0;
            index = indexOfNull();
        } else {
            // Otherwise use hashCode() (or the identity hash code)
            hash = mIdentityHashCode ? System.identityHashCode(key) : key.hashCode();
            index = indexOf(key, hash);
        }
        // 2. index >= 0 means an entry with the same hash code and an equal key
        //    already exists, so simply overwrite its value
        if (index >= 0) {
            index = (index<<1) + 1;
            final V old = (V)mArray[index];
            mArray[index] = value;
            return old;
        }
        // 3. If nothing was found, indexOf() or indexOfNull() returned a negative
        //    number: the bitwise complement of the position to insert at.
        //    Complementing it again yields the insertion position.
        index = ~index;
        // 4. Check whether the arrays need to grow
        if (osize >= mHashes.length) {
            final int n = osize >= (BASE_SIZE*2) ? (osize+(osize>>1))
                    : (osize >= BASE_SIZE ? (BASE_SIZE*2) : BASE_SIZE);
            if (DEBUG) Log.d(TAG, "put: grow from " + mHashes.length + " to " + n);
            final int[] ohashes = mHashes;
            final Object[] oarray = mArray;
            // 5. Allocate the new arrays
            allocArrays(n);
            if (CONCURRENT_MODIFICATION_EXCEPTIONS && osize != mSize) {
                throw new ConcurrentModificationException();
            }
            if (mHashes.length > 0) {
                if (DEBUG) Log.d(TAG, "put: copy 0-" + osize + " to 0");
                // Copy the data into the new arrays
                System.arraycopy(ohashes, 0, mHashes, 0, ohashes.length);
                System.arraycopy(oarray, 0, mArray, 0, oarray.length);
            }
            // 6. Release the old arrays
            freeArrays(ohashes, oarray, osize);
        }
        if (index < osize) {
            // 7. If index lies within the current size, move the data starting at
            //    index to index + 1 to free up the slot at index
            if (DEBUG) Log.d(TAG, "put: move " + index + "-" + (osize-index)
                    + " to " + (index+1));
            System.arraycopy(mHashes, index, mHashes, index + 1, osize - index);
            System.arraycopy(mArray, index << 1, mArray, (index + 1) << 1, (mSize - index) << 1);
        }
        if (CONCURRENT_MODIFICATION_EXCEPTIONS) {
            if (osize != mSize || index >= mHashes.length) {
                throw new ConcurrentModificationException();
            }
        }
        // 8. Store the hash, the key, and the value at the computed index
        mHashes[index] = hash;
        mArray[index<<1] = key;
        mArray[(index<<1)+1] = value;
        mSize++;
        return null;
    }
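The capacity expression in step 4 of put() is easier to read once isolated. The sketch below copies that expression into a hypothetical helper, grownCapacity(). It assumes BASE_SIZE is 4, the value used in the AOSP source; the article itself does not state the value.

```java
public class GrowthSketch {
    // Assumption: BASE_SIZE is 4 as in AOSP; the article does not state it.
    static final int BASE_SIZE = 4;

    // Same expression as in put(): small maps jump to BASE_SIZE, then to
    // BASE_SIZE * 2, and from there on grow by half of the current size.
    static int grownCapacity(int osize) {
        return osize >= (BASE_SIZE * 2) ? (osize + (osize >> 1))
                : (osize >= BASE_SIZE ? (BASE_SIZE * 2) : BASE_SIZE);
    }

    public static void main(String[] args) {
        // put() grows only when the arrays are full, so feed each new
        // capacity back in as the size at the next growth.
        int capacity = 0;
        for (int i = 0; i < 5; i++) {
            capacity = grownCapacity(capacity);
            System.out.print(capacity + " "); // prints: 4 8 12 18 27
        }
        System.out.println();
    }
}
```

Under that assumption a map filled one entry at a time reallocates at sizes 0, 4, 8, 12, 18, and so on, and each reallocation copies all existing entries.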

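put() calls several other internal methods. How the arrays are grown, freed, and reallocated is not important from an algorithmic point of view; it is enough to know that growing copies the data, which costs performance. The methods most relevant to the algorithm are indexOfNull() and indexOf(). Their implementations are nearly identical, so only indexOf() is analyzed here.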

    int indexOf(Object key, int hash) {
        final int N = mSize;
        // Important fast case: if nothing is in here, nothing to look for.
        if (N == 0) {
            return ~0;
        }
        int index = binarySearchHashes(mHashes, N, hash);
        // If the hash code wasn't found, then we have no entry for this key.
        if (index < 0) {
            return index;
        }
        // If the key at the returned index matches, that's what we want.
        if (key.equals(mArray[index<<1])) {
            return index;
        }
        // Search for a matching key after the index.
        int end;
        for (end = index + 1; end < N && mHashes[end] == hash; end++) {
            if (key.equals(mArray[end << 1])) return end;
        }
        // Search for a matching key before the index.
        for (int i = index - 1; i >= 0 && mHashes[i] == hash; i--) {
            if (key.equals(mArray[i << 1])) return i;
        }
        // Key not found -- return negative value indicating where a
        // new entry for this key should go.  We use the end of the
        // hash chain to reduce the number of array entries that will
        // need to be copied when inserting.
        return ~end;
    }

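The original comments are already quite detailed. The steps are:

(1) If the map is empty, return ~0 right away. Note that this is -1, not 0.

(2) Binary-search the mHashes array for the index of hash.

(3) If index < 0, the hash code was not found.

(4) If index >= 0 and the key stored at index << 1 in mArray equals the key being looked up, they are the same key and the entry has been found.

(5) If the keys differ, only the hash codes match. Search forward and then backward through the neighbouring entries with the same hash code and return the index if an equal key turns up. Otherwise, ~end encodes the position at which the new entry should be inserted.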
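The five steps can be exercised outside Android with a standalone version of the lookup. This is a sketch under two assumptions: the fields are passed in as parameters, and java.util.Arrays.binarySearch stands in for the internal binarySearchHashes(). On a miss, Arrays.binarySearch returns -(insertion point) - 1, which is the same value as ~(insertion point). The class name IndexOfSketch is hypothetical.

```java
import java.util.Arrays;

public class IndexOfSketch {
    // hashes: ascending, first `size` entries valid. array: keys at even slots.
    static int indexOf(int[] hashes, Object[] array, int size, Object key, int hash) {
        if (size == 0) {
            return ~0; // -1: not found, insert at position 0
        }
        int index = Arrays.binarySearch(hashes, 0, size, hash);
        if (index < 0) {
            return index; // hash absent; already encodes the insertion point
        }
        if (key.equals(array[index << 1])) {
            return index;
        }
        // Same hash, different key: scan forward through the run of equal hashes.
        int end;
        for (end = index + 1; end < size && hashes[end] == hash; end++) {
            if (key.equals(array[end << 1])) return end;
        }
        // Then scan backward.
        for (int i = index - 1; i >= 0 && hashes[i] == hash; i--) {
            if (key.equals(array[i << 1])) return i;
        }
        return ~end; // not found: insert at the end of the run
    }

    public static void main(String[] args) {
        // "Aa", "BB" and "C#" all have the hash code 2112.
        int[] hashes = {"Aa".hashCode(), "BB".hashCode(), "zz".hashCode()};
        Object[] array = {"Aa", "v1", "BB", "v2", "zz", "v3"};
        System.out.println(indexOf(hashes, array, 3, "Aa", "Aa".hashCode()));  // 0
        System.out.println(indexOf(hashes, array, 3, "BB", "BB".hashCode()));  // 1
        System.out.println(~indexOf(hashes, array, 3, "C#", "C#".hashCode())); // 2
        System.out.println(indexOf(hashes, array, 3, "A", "A".hashCode()));    // -1
    }
}
```

The lookup for "C#" shows step (5): the hash is present but no key matches, so the result is the complement of 2, the slot just past the run of equal hash codes.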

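Now back to put(). Its implementation is annotated in detail in the source above, which is worth reading closely if you are interested. The following conclusions can be drawn from it:

(1) The mHashes array stores all hash codes in ascending order.

(2) The index of a hash code in mHashes determines where the key and the value are stored in mArray: at index << 1 and (index << 1) + 1 respectively, or more simply at index * 2 and index * 2 + 1.

(3) Hash codes can always collide, so how is that handled? Steps 3 and 7 above take care of it. Step 3 produces the index at which to insert; when the hash code is already present, that index is the end of the run of equal hash codes. Step 7 then shifts everything from index onward back by one slot whenever index < osize, that is, whenever the insertion point is not at the very end. As a result, equal hash codes are stored several times in mHashes, always next to each other, and each has its own key and value in mArray. That is how hash collisions are resolved.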

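If these conclusions still seem a little confusing, the figure below should make them clear.

In the figure, the hash codes at index == 0 and index == 1 are the same, which means key1 and key2 have the same hash code: a hash collision. As described above, the solution is to store the hash code twice and to store each key-value pair once.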
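The collision case can be reproduced with a tiny stand-in for the two arrays. The class below is a simplified sketch, not the ArrayMap implementation: it has a fixed capacity, does not handle null keys, and scans a run of equal hash codes from its start instead of scanning forward and backward from the binary-search hit. The name CollisionSketch and its fields are hypothetical.

```java
import java.util.Arrays;

public class CollisionSketch {
    // Fixed capacity keeps the sketch short; the real class grows its arrays.
    final int[] hashes = new int[8];
    final Object[] array = new Object[16];
    int size = 0;

    // Inserts or overwrites; returns the previous value, or null if none.
    Object put(String key, Object value) {
        int hash = key.hashCode();
        int index = Arrays.binarySearch(hashes, 0, size, hash);
        if (index >= 0) {
            // Walk back to the first entry of the run of equal hash codes.
            while (index > 0 && hashes[index - 1] == hash) index--;
            // Scan the run for an equal key.
            for (; index < size && hashes[index] == hash; index++) {
                if (key.equals(array[index << 1])) {
                    Object old = array[(index << 1) + 1];
                    array[(index << 1) + 1] = value;
                    return old;
                }
            }
            // No equal key: index is now the end of the run.
        } else {
            index = ~index; // hash not present: decode the insertion point
        }
        if (size == hashes.length) {
            throw new IllegalStateException("sketch has a fixed capacity");
        }
        // Shift the tail back by one entry and store hash, key, value.
        System.arraycopy(hashes, index, hashes, index + 1, size - index);
        System.arraycopy(array, index << 1, array, (index + 1) << 1, (size - index) << 1);
        hashes[index] = hash;
        array[index << 1] = key;
        array[(index << 1) + 1] = value;
        size++;
        return null;
    }

    public static void main(String[] args) {
        CollisionSketch map = new CollisionSketch();
        map.put("Aa", "v1");
        map.put("BB", "v2"); // same hash code as "Aa" (2112), different key
        map.put("Aa", "v3"); // equal key: overwrites v1
        System.out.println(Arrays.toString(Arrays.copyOf(map.hashes, map.size)));
        // [2112, 2112]
        System.out.println(Arrays.toString(Arrays.copyOf(map.array, map.size * 2)));
        // [Aa, v3, BB, v2]
    }
}
```

The hash code 2112 appears twice, side by side, while each key-value pair is stored once, matching the situation the figure describes.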

The get() method

    public V get(Object key) {
        final int index = indexOfKey(key);
        return index >= 0 ? (V)mArray[(index<<1)+1] : null;
    }

get() mainly computes index through indexOfKey(), which in turn calls indexOfNull() or indexOf(); both were analyzed above. If the returned index >= 0, the key was found, and by the layout rule described earlier the value sits at (index << 1) + 1 in mArray.

The remove() method

    public V remove(Object key) {
        final int index = indexOfKey(key);
        if (index >= 0) {
            return removeAt(index);
        }
        return null;
    }

remove() first computes index through indexOfKey() to check whether the key exists. If it does, removeAt() is called to delete the corresponding hash code and key-value pair.

    public V removeAt(int index) {
        final Object old = mArray[(index << 1) + 1];
        final int osize = mSize;
        final int nsize;
        // If size <= 1 the map is empty after the removal. To save memory,
        // mHashes and mArray are simply replaced with empty arrays.
        if (osize <= 1) {
            // Now empty.
            if (DEBUG) Log.d(TAG, "remove: shrink from " + mHashes.length + " to 0");
            final int[] ohashes = mHashes;
            final Object[] oarray = mArray;
            mHashes = EmptyArray.INT;
            mArray = EmptyArray.OBJECT;
            freeArrays(ohashes, oarray, osize);
            nsize = 0;
        } else {
            // size > 1: the new size is size - 1
            nsize = osize - 1;
            if (mHashes.length > (BASE_SIZE*2) && mSize < mHashes.length/3) {
                // When this condition holds, the arrays are shrunk.
                // Shrunk enough to reduce size of arrays.  We don't allow it to
                // shrink smaller than (BASE_SIZE*2) to avoid flapping between
                // that and BASE_SIZE.
                final int n = osize > (BASE_SIZE*2) ? (osize + (osize>>1)) : (BASE_SIZE*2);
                if (DEBUG) Log.d(TAG, "remove: shrink from " + mHashes.length + " to " + n);
                final int[] ohashes = mHashes;
                final Object[] oarray = mArray;
                allocArrays(n);
                if (CONCURRENT_MODIFICATION_EXCEPTIONS && osize != mSize) {
                    throw new ConcurrentModificationException();
                }
                if (index > 0) {
                    if (DEBUG) Log.d(TAG, "remove: copy from 0-" + index + " to 0");
                    System.arraycopy(ohashes, 0, mHashes, 0, index);
                    System.arraycopy(oarray, 0, mArray, 0, index << 1);
                }
                if (index < nsize) {
                    if (DEBUG) Log.d(TAG, "remove: copy from " + (index+1) + "-" + nsize
                            + " to " + index);
                    System.arraycopy(ohashes, index + 1, mHashes, index, nsize - index);
                    System.arraycopy(oarray, (index + 1) << 1, mArray, index << 1,
                            (nsize - index) << 1);
                }
            } else {
                if (index < nsize) {
                    // If index lies within the new size, move the data forward by one slot
                    if (DEBUG) Log.d(TAG, "remove: move " + (index+1) + "-" + nsize
                            + " to " + index);
                    System.arraycopy(mHashes, index + 1, mHashes, index, nsize - index);
                    System.arraycopy(mArray, (index + 1) << 1, mArray, index << 1,
                            (nsize - index) << 1);
                }
                // Then set the last key/value slots to null
                mArray[nsize << 1] = null;
                mArray[(nsize << 1) + 1] = null;
            }
        }
        if (CONCURRENT_MODIFICATION_EXCEPTIONS && osize != mSize) {
            throw new ConcurrentModificationException();
        }
        mSize = nsize;
        return (V)old;
    }

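In the common case, removing an entry only requires shifting everything after index one slot toward index and clearing the last slot. If, however, the length of mHashes is greater than BASE_SIZE*2 and the actual size is less than one third of that length, the arrays are shrunk. The shrunken arrays are never smaller than BASE_SIZE*2.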
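The shrink decision can also be isolated into two small functions. This sketch copies the condition and the capacity expression from removeAt() into hypothetical helpers, shouldShrink() and shrunkCapacity(). As before, BASE_SIZE = 4 is an assumption taken from the AOSP source; the article does not state the value.

```java
public class ShrinkSketch {
    // Assumption: BASE_SIZE is 4 as in AOSP; the article does not state it.
    static final int BASE_SIZE = 4;

    // True when removeAt() would allocate smaller arrays instead of
    // shifting entries in place. `size` is the size before the removal.
    static boolean shouldShrink(int capacity, int size) {
        return capacity > (BASE_SIZE * 2) && size < capacity / 3;
    }

    // Capacity chosen when shrinking; never below BASE_SIZE * 2.
    static int shrunkCapacity(int osize) {
        return osize > (BASE_SIZE * 2) ? (osize + (osize >> 1)) : (BASE_SIZE * 2);
    }

    public static void main(String[] args) {
        System.out.println(shouldShrink(27, 8));  // true: 8 < 27 / 3
        System.out.println(shouldShrink(27, 9));  // false: 9 is not below 9
        System.out.println(shouldShrink(8, 1));   // false: capacity not above 8
        System.out.println(shrunkCapacity(8));    // 8
        System.out.println(shrunkCapacity(10));   // 15
    }
}
```

Keeping the lower bound at BASE_SIZE * 2 matches the source comment about avoiding flapping between that size and BASE_SIZE.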

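III. Summary

The most important parts of ArrayMap are the put() and removeAt() methods; between them they implement nearly all of its key characteristics. They are repeated here as a summary of the whole article.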

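(1) The mHashes array stores all hash codes in ascending order, and a lookup binary-searches it for the index of a hash code. This is the root cause of its get() being slower than HashMap's.

(2) The index of a hash code in mHashes determines where the key and the value are stored in mArray: at index << 1 and (index << 1) + 1 respectively, or more simply at index * 2 and index * 2 + 1.

(3) Hash codes can always collide, so how is that handled? Put simply, equal hash codes are stored next to each other in mHashes, and the position of each one determines where its key-value pair is stored in mArray.

(4) A remove operation may, under certain conditions, shrink the arrays and so save memory.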

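Finally, thank you for reading this article to the end. My knowledge is limited, so if you find mistakes or have questions, you are welcome to leave a comment. If this post helped you, please give it a like to encourage me to keep writing. Thank you.

Source: https://juejin.im/post/5c3f5e756fb9a049e553e52f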
