An Analysis of the .NET Array.Sort<T> Implementation

Posted by SpringLeee on 2021-09-30

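System.Array.Sort<T> is the sorting method built into .NET. It is flexible and efficient. Most of us have studied sorting algorithms such as bubble sort, insertion sort, and heap sort, but do you know which algorithm sits behind this method?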

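The short answer: Array.Sort does not rely on a single sorting algorithm. To perform well across different input sizes, the implementation combines insertion sort, heap sort, and quicksort. Let's walk through the source code to see what it does.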

Array.Sort

https://source.dot.net/#System.Private.CoreLib/Array.cs,ec5718fae85b7640

public static void Sort<T>(T[] array)
{
    if (array == null)
        ThrowHelper.ThrowArgumentNullException(ExceptionArgument.array);

    if (array.Length > 1)
    {
        var span = new Span<T>(ref MemoryMarshal.GetArrayDataReference(array), array.Length);
        ArraySortHelper<T>.Default.Sort(span, null);
    }
}

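Here we sort an int array. Start with this Sort method: when the array has more than one element, it wraps the array in a Span<T> (a view over the same memory, not a copy) and then calls the Sort method of the internal ArraySortHelper<T>.Default object.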
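For reference, here is a minimal sketch of the call path from user code. SortUsageDemo and SortedCopy are hypothetical names introduced only for this illustration; the one real API used is Array.Sort<T>(T[]).

```csharp
using System;

public static class SortUsageDemo
{
    // Hypothetical helper: returns a sorted copy and leaves the input untouched.
    public static int[] SortedCopy(int[] input)
    {
        int[] copy = (int[])input.Clone();

        // The generic overload discussed in this article, with T = int.
        // It sorts the array in place.
        Array.Sort<int>(copy);
        return copy;
    }
}
```

Calling SortUsageDemo.SortedCopy(new[] { 5, 3, 9, 1 }) should give an array holding 1, 3, 5, 9.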

ArraySortHelper

[TypeDependency("System.Collections.Generic.GenericArraySortHelper`1")]
internal sealed partial class ArraySortHelper<T>
    : IArraySortHelper<T>
{
    private static readonly IArraySortHelper<T> s_defaultArraySortHelper = CreateArraySortHelper();

    public static IArraySortHelper<T> Default => s_defaultArraySortHelper;

    [DynamicDependency("#ctor", typeof(GenericArraySortHelper<>))]
    private static IArraySortHelper<T> CreateArraySortHelper()
    {
        IArraySortHelper<T> defaultArraySortHelper;

        if (typeof(IComparable<T>).IsAssignableFrom(typeof(T)))
        {
            defaultArraySortHelper = (IArraySortHelper<T>)RuntimeTypeHandle.CreateInstanceForAnotherGenericParameter((RuntimeType)typeof(GenericArraySortHelper<string>), (RuntimeType)typeof(T));
        }
        else
        {
            defaultArraySortHelper = new ArraySortHelper<T>();
        }
        return defaultArraySortHelper;
    }
}

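Default creates a different ArraySortHelper depending on whether T implements the IComparable<T> interface. Because we are sorting an int array, and int implements IComparable<int>, the call ends up in the Sort method of GenericArraySortHelper.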
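The same condition can be evaluated from ordinary code to see which branch a given type would take. This is only a sketch: HelperSelectionDemo, ImplementsGenericComparable, and Plain are hypothetical names, and the internal helper classes themselves are not accessible from user code.

```csharp
using System;

public static class HelperSelectionDemo
{
    // Evaluates the same condition as CreateArraySortHelper:
    // does T implement IComparable<T>?
    public static bool ImplementsGenericComparable<T>() =>
        typeof(IComparable<T>).IsAssignableFrom(typeof(T));

    // A hypothetical type that does not implement IComparable<Plain>.
    public sealed class Plain { }
}
```

Under this check, int and string take the first branch (GenericArraySortHelper), while a type such as Plain falls through to the plain ArraySortHelper<T>.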

GenericArraySortHelper

https://source.dot.net/#System.Private.CoreLib/ArraySortHelper.cs,280

internal sealed partial class GenericArraySortHelper<T>
    where T : IComparable<T>
{
    // Do not add a constructor to this class because ArraySortHelper<T>.CreateSortHelper will not execute it

    #region IArraySortHelper<T> Members

    public void Sort(Span<T> keys, IComparer<T>? comparer)
    {
        try
        {
            if (comparer == null || comparer == Comparer<T>.Default)
            {
                if (keys.Length > 1)
                {
                    // For floating-point, do a pre-pass to move all NaNs to the beginning
                    // so that we can do an optimized comparison as part of the actual sort
                    // on the remainder of the values.
                    if (typeof(T) == typeof(double) ||
                        typeof(T) == typeof(float) ||
                        typeof(T) == typeof(Half))
                    {
                        int nanLeft = SortUtils.MoveNansToFront(keys, default(Span<byte>));
                        if (nanLeft == keys.Length)
                        {
                            return;
                        }
                        keys = keys.Slice(nanLeft);
                    }

                    IntroSort(keys, 2 * (BitOperations.Log2((uint)keys.Length) + 1));
                }
            }
            else
            {
                ArraySortHelper<T>.IntrospectiveSort(keys, comparer.Compare);
            }
        }
        catch (IndexOutOfRangeException)
        {
            ThrowHelper.ThrowArgumentException_BadComparer(comparer);
        }
        catch (Exception e)
        {
            ThrowHelper.ThrowInvalidOperationException(ExceptionResource.InvalidOperation_IComparerFailed, e);
        }
    }

    // ... remaining members omitted ...

    #endregion
}

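The method first checks whether the element type is a floating-point type (double, float, or Half). If it is, all NaN values are moved to the front so that the remaining values can be sorted with an optimized comparison. It then calls IntroSort with two arguments. The first, keys, is the Span over the array. The second is an int named depthLimit, computed as 2 * (Log2(keys.Length) + 1). Put simply, it is an upper bound on the recursion depth: IntroSort decrements it at every level of recursion and changes strategy once it reaches 0. With that, we move on to the IntroSort method.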
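To get a feel for the size of depthLimit, the expression can be evaluated on its own. DepthLimitDemo and DepthLimit are hypothetical names for this sketch; BitOperations.Log2 is the public System.Numerics API that the listing above also uses.

```csharp
using System.Numerics;

public static class DepthLimitDemo
{
    // Same expression as in the Sort method: 2 * (floor(log2(length)) + 1).
    // Assumes length >= 1.
    public static int DepthLimit(int length) =>
        2 * (BitOperations.Log2((uint)length) + 1);
}
```

For example, DepthLimit(16) is 10, DepthLimit(1000) is 20, and DepthLimit(1000000) is 40, so the limit grows logarithmically with the array length.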

IntroSort

Things become much clearer once we reach this method. It is the core of Array.Sort<T>, so let's read on.

https://source.dot.net/#System.Private.CoreLib/ArraySortHelper.cs,404

private static void IntroSort(Span<T> keys, int depthLimit)
{
    Debug.Assert(!keys.IsEmpty);
    Debug.Assert(depthLimit >= 0);

    int partitionSize = keys.Length;
    while (partitionSize > 1)
    {
        if (partitionSize <= Array.IntrosortSizeThreshold)
        {
            if (partitionSize == 2)
            {
                SwapIfGreater(ref keys[0], ref keys[1]);
                return;
            }

            if (partitionSize == 3)
            {
                ref T hiRef = ref keys[2];
                ref T him1Ref = ref keys[1];
                ref T loRef = ref keys[0];

                SwapIfGreater(ref loRef, ref him1Ref);
                SwapIfGreater(ref loRef, ref hiRef);
                SwapIfGreater(ref him1Ref, ref hiRef);
                return;
            }

            InsertionSort(keys.Slice(0, partitionSize));
            return;
        }

        if (depthLimit == 0)
        {
            HeapSort(keys.Slice(0, partitionSize));
            return;
        }
        depthLimit--;

        int p = PickPivotAndPartition(keys.Slice(0, partitionSize));

        // Note we've already partitioned around the pivot and do not have to move the pivot again.
        IntroSort(keys[(p+1)..partitionSize], depthLimit);
        partitionSize = p;
    }
}

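On first entry, partitionSize is the length of the array. The method begins with the check shown below. IntrosortSizeThreshold is a constant with the value 16 that acts as a threshold: if the partition holds 16 or fewer elements, it is sorted with insertion sort (InsertionSort). Why 16? According to the comment in the source, the value was chosen empirically: insertion sort is efficient for partitions of 16 or fewer elements.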

if (partitionSize <= Array.IntrosortSizeThreshold)
{
    if (partitionSize == 2)
    {
        SwapIfGreater(ref keys[0], ref keys[1]);
        return;
    }

    if (partitionSize == 3)
    {
        ref T hiRef = ref keys[2];
        ref T him1Ref = ref keys[1];
        ref T loRef = ref keys[0];

        SwapIfGreater(ref loRef, ref him1Ref);
        SwapIfGreater(ref loRef, ref hiRef);
        SwapIfGreater(ref him1Ref, ref hiRef);
        return;
    }

    InsertionSort(keys.Slice(0, partitionSize));
    return;
}
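The three SwapIfGreater calls in the partitionSize == 3 branch are enough to sort any three elements. Below is a minimal sketch for int values; SmallPartitionSketch and SortThree are hypothetical names, and this SwapIfGreater is a simplified stand-in for the runtime's helper.

```csharp
public static class SmallPartitionSketch
{
    // Sorts exactly three elements with three compare-and-swap steps.
    // Assumes keys.Length == 3.
    public static void SortThree(int[] keys)
    {
        SwapIfGreater(ref keys[0], ref keys[1]); // now keys[0] <= keys[1]
        SwapIfGreater(ref keys[0], ref keys[2]); // now keys[0] is the minimum
        SwapIfGreater(ref keys[1], ref keys[2]); // order the remaining two
    }

    // Simplified stand-in: swaps the two values if the first is greater.
    private static void SwapIfGreater(ref int i, ref int j)
    {
        if (i > j)
        {
            int t = i;
            i = j;
            j = t;
        }
    }
}
```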

InsertionSort

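If the partition has 2 or 3 elements, they are simply compared and swapped in place. If it has more than 3 and at most 16 elements, insertion sort (InsertionSort) is used. The method looks like this: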

https://source.dot.net/#System.Private.CoreLib/ArraySortHelper.cs,537

private static void InsertionSort(Span<T> keys)
{
    for (int i = 0; i < keys.Length - 1; i++)
    {
        T t = Unsafe.Add(ref MemoryMarshal.GetReference(keys), i + 1);

        int j = i;
        while (j >= 0 && (t == null || LessThan(ref t, ref Unsafe.Add(ref MemoryMarshal.GetReference(keys), j))))
        {
            Unsafe.Add(ref MemoryMarshal.GetReference(keys), j + 1) = Unsafe.Add(ref MemoryMarshal.GetReference(keys), j);
            j--;
        }

        Unsafe.Add(ref MemoryMarshal.GetReference(keys), j + 1) = t!;
    }
}
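Stripped of the Unsafe and MemoryMarshal calls and the null handling, the loop above is a textbook insertion sort. Here is a sketch of the same loop structure for a plain int array; InsertionSortSketch is a hypothetical name and this is not the runtime's code.

```csharp
public static class InsertionSortSketch
{
    public static void Sort(int[] keys)
    {
        for (int i = 0; i < keys.Length - 1; i++)
        {
            // keys[0..i] is already sorted; t is the next element to insert.
            int t = keys[i + 1];

            int j = i;
            // Shift larger elements one slot to the right to make room for t.
            while (j >= 0 && t < keys[j])
            {
                keys[j + 1] = keys[j];
                j--;
            }

            keys[j + 1] = t;
        }
    }
}
```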

HeapSort

if (depthLimit == 0)
{
    HeapSort(keys.Slice(0, partitionSize));
    return;
}
depthLimit--;

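Each level of recursion decrements depthLimit by 1. If the limit reaches 0 before the sort has finished, the method switches to heap sort (HeapSort) for the current partition. Note that the listing below is the overload that sorts keys together with a parallel values span; the keys-only overload that IntroSort calls above appears to follow the same steps. The method looks like this: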

https://source.dot.net/#System.Private.CoreLib/ArraySortHelper.cs,990


private static void HeapSort(Span<TKey> keys, Span<TValue> values)
{
    Debug.Assert(!keys.IsEmpty);

    int n = keys.Length;
    for (int i = n >> 1; i >= 1; i--)
    {
        DownHeap(keys, values, i, n);
    }

    for (int i = n; i > 1; i--)
    {
        Swap(keys, values, 0, i - 1);
        DownHeap(keys, values, 1, i - 1);
    }
}

private static void DownHeap(Span<TKey> keys, Span<TValue> values, int i, int n)
{
    TKey d = keys[i - 1];
    TValue dValue = values[i - 1];

    while (i <= n >> 1)
    {
        int child = 2 * i;
        if (child < n && (keys[child - 1] == null || LessThan(ref keys[child - 1], ref keys[child])))
        {
            child++;
        }

        if (keys[child - 1] == null || !LessThan(ref d, ref keys[child - 1]))
            break;

        keys[i - 1] = keys[child - 1];
        values[i - 1] = values[child - 1];
        i = child;
    }

    keys[i - 1] = d;
    values[i - 1] = dValue;
}
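For comparison, here is a keys-only sketch of the same heap sort for a plain int array, without the values span and the null checks. HeapSortSketch is a hypothetical name. As in the listing above, heap positions are 1-based, so the element at position i lives at keys[i - 1].

```csharp
public static class HeapSortSketch
{
    public static void Sort(int[] keys)
    {
        int n = keys.Length;

        // Build a max-heap by sifting down every non-leaf node.
        for (int i = n >> 1; i >= 1; i--)
        {
            DownHeap(keys, i, n);
        }

        // Repeatedly move the maximum to the end and restore the heap.
        for (int i = n; i > 1; i--)
        {
            (keys[0], keys[i - 1]) = (keys[i - 1], keys[0]);
            DownHeap(keys, 1, i - 1);
        }
    }

    // Sifts the element at 1-based position i down within the first n elements.
    private static void DownHeap(int[] keys, int i, int n)
    {
        int d = keys[i - 1];

        while (i <= n >> 1)
        {
            int child = 2 * i;

            // Pick the larger of the two children.
            if (child < n && keys[child - 1] < keys[child])
            {
                child++;
            }

            if (!(d < keys[child - 1]))
                break;

            keys[i - 1] = keys[child - 1];
            i = child;
        }

        keys[i - 1] = d;
    }
}
```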

QuickSort

int p = PickPivotAndPartition(keys.Slice(0, partitionSize), values.Slice(0, partitionSize));
 
IntroSort(keys[(p+1)..partitionSize], values[(p+1)..partitionSize], depthLimit);
partitionSize = p;

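This calls another method, PickPivotAndPartition. Pivot and partition: this is quicksort! It also avoids a second recursive call: only the part to the right of the pivot is sorted recursively, while the left part is handled by the next iteration of the while loop (partitionSize = p). The pivot is chosen with the median-of-three method. The snippets in this section come from the overload that also carries a values span. The method looks like this: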

https://source.dot.net/#System.Private.CoreLib/ArraySortHelper.cs,945

private static int PickPivotAndPartition(Span<TKey> keys, Span<TValue> values)
{
    Debug.Assert(keys.Length >= Array.IntrosortSizeThreshold);

    int hi = keys.Length - 1;

    // Compute median-of-three.  But also partition them, since we've done the comparison.
    int middle = hi >> 1;

    // Sort lo, mid and hi appropriately, then pick mid as the pivot.
    SwapIfGreaterWithValues(keys, values, 0, middle);  // swap the low with the mid point
    SwapIfGreaterWithValues(keys, values, 0, hi);   // swap the low with the high
    SwapIfGreaterWithValues(keys, values, middle, hi); // swap the middle with the high

    TKey pivot = keys[middle];
    Swap(keys, values, middle, hi - 1);
    int left = 0, right = hi - 1;  // We already partitioned lo and hi and put the pivot in hi - 1.  And we pre-increment & decrement below.

    while (left < right)
    {
        if (pivot == null)
        {
            while (left < (hi - 1) && keys[++left] == null) ;
            while (right > 0 && keys[--right] != null) ;
        }
        else
        {
            while (GreaterThan(ref pivot, ref keys[++left])) ;
            while (LessThan(ref pivot, ref keys[--right])) ;
        }

        if (left >= right)
            break;

        Swap(keys, values, left, right);
    }

    // Put pivot in the right location.
    if (left != hi - 1)
    {
        Swap(keys, values, left, hi - 1);
    }
    return left;
}
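The partition step can also be written as a keys-only sketch for a plain int array, which makes the median-of-three selection easier to follow. PartitionSketch and its helpers are hypothetical names. The runtime asserts that the partition is at least IntrosortSizeThreshold long; this sketch only assumes at least 3 elements.

```csharp
using System;

public static class PartitionSketch
{
    // Partitions keys around a median-of-three pivot and returns the pivot's final index.
    public static int PickPivotAndPartition(int[] keys)
    {
        if (keys.Length < 3)
            throw new ArgumentException("Need at least 3 elements.", nameof(keys));

        int hi = keys.Length - 1;
        int middle = hi >> 1;

        // Order keys[0], keys[middle], keys[hi]; the median ends up in the middle.
        SwapIfGreater(keys, 0, middle);
        SwapIfGreater(keys, 0, hi);
        SwapIfGreater(keys, middle, hi);

        int pivot = keys[middle];
        Swap(keys, middle, hi - 1); // park the pivot just before the last element
        int left = 0, right = hi - 1;

        while (left < right)
        {
            // keys[0] <= pivot and keys[hi - 1] == pivot act as sentinels,
            // so neither scan can run past the ends of the array.
            while (keys[++left] < pivot) { }
            while (pivot < keys[--right]) { }

            if (left >= right)
                break;

            Swap(keys, left, right);
        }

        // Put the pivot in its final position.
        if (left != hi - 1)
        {
            Swap(keys, left, hi - 1);
        }
        return left;
    }

    private static void SwapIfGreater(int[] keys, int i, int j)
    {
        if (keys[i] > keys[j]) Swap(keys, i, j);
    }

    private static void Swap(int[] keys, int i, int j)
    {
        (keys[i], keys[j]) = (keys[j], keys[i]);
    }
}
```

After the call, every element to the left of the returned index is less than or equal to the pivot, and every element to the right is greater than or equal to it.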

Summary

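This article walked through the internal implementation of System.Array.Sort<T> and found that it combines insertion sort, heap sort, and quicksort, a hybrid known as introspective sort (hence the method name IntroSort). If you are interested, take a look at how Java or Golang implement sorting. I hope you found this useful.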
