Notes on Optimizing WebApiClient Performance

Posted by jiulang on 2020-05-26

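Preface

Development of the netcoreapp version of WebApiClient is nearly finished, and the last thing left to attack is squeezing out performance. Here I walk through the optimization process I followed. You can use it as a template and apply the same steps to your own projects to make them faster.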

Overall results

With MockResponseHandler used to eliminate real HTTP requests, here is a performance reference for native HttpClient, WebApiClientCore, and Refit:

BenchmarkDotNet=v0.12.1, OS=Windows 10.0.18362.836 (1903/May2019Update/19H1)
Intel Core i3-4150 CPU 3.50GHz (Haswell), 1 CPU, 4 logical and 2 physical cores
.NET Core SDK=3.1.202
  [Host]     : .NET Core 3.1.4 (CoreCLR 4.700.20.20201, CoreFX 4.700.20.22101), X64 RyuJIT
  DefaultJob : .NET Core 3.1.4 (CoreCLR 4.700.20.20201, CoreFX 4.700.20.22101), X64 RyuJIT
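|                     Method |      Mean |     Error |    StdDev |
|--------------------------- |----------:|----------:|----------:|
|        HttpClient_GetAsync |  3.945 μs | 0.2050 μs | 0.5850 μs |
|  WebApiClientCore_GetAsync | 13.320 μs | 0.2604 μs | 0.3199 μs |
|             Refit_GetAsync | 43.503 μs | 0.8489 μs | 1.0426 μs |

|                     Method |      Mean |     Error |    StdDev |
|--------------------------- |----------:|----------:|----------:|
|       HttpClient_PostAsync |  4.876 μs | 0.0972 μs | 0.2092 μs |
| WebApiClientCore_PostAsync | 14.018 μs | 0.1829 μs | 0.2246 μs |
|            Refit_PostAsync | 46.512 μs | 0.7885 μs | 0.7376 μs |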

After optimization, WebApiClientCore adds a little under 10 μs per call on top of native HttpClient and is roughly three times faster than Refit.

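Benchmark process

Performance benchmarks let us compare the performance of several methods. Without a benchmarking tool, how would we tell performance changes apart by eye alone?

BenchmarkDotNet is a powerful performance benchmarking library for .NET. It gives every method under test an isolated environment. With BenchmarkDotNet it is easy to write all kinds of performance tests while avoiding many common pitfalls.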
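For readers who have not used it before, here is a minimal sketch of what a BenchmarkDotNet test looks like. The class and method names are hypothetical and are not taken from the WebApiClient repository; only the `[Benchmark]`, `[MemoryDiagnoser]`, and `BenchmarkRunner.Run<T>()` APIs come from BenchmarkDotNet itself.

```csharp
using BenchmarkDotNet.Attributes;
using BenchmarkDotNet.Running;

// Hypothetical benchmark class used only for illustration.
[MemoryDiagnoser] // also report allocated bytes per operation
public class StringCaseBenchmarks
{
    private readonly string text = "WebApiClientCore";

    // Baseline = true adds a Ratio column relative to this method.
    [Benchmark(Baseline = true)]
    public string UpperInvariant() => text.ToUpperInvariant();

    [Benchmark]
    public string LowerInvariant() => text.ToLowerInvariant();
}

public static class BenchmarkEntry
{
    // Call this from Main; BenchmarkDotNet expects an optimized (Release) build.
    public static void RunAll() => BenchmarkRunner.Run<StringCaseBenchmarks>();
}
```

Each `[Benchmark]` method becomes one row in the result table, with the Mean, Error, and StdDev columns seen throughout this article.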

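Total request time comparison

As soon as I had BenchmarkDotNet in hand, I couldn't wait to write a three-way comparison of the old WebApiClient, native HttpClient, and WebApiClientCore, to see whether the new Core version had improved as expected and how much overhead each one adds over native HttpClient.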

|                     Method |       Mean |      Error |     StdDev |
|--------------------------- |-----------:|-----------:|-----------:|
|      WebApiClient_GetAsync | 279.479 us | 22.5466 us | 64.3268 us |
|  WebApiClientCore_GetAsync |  25.298 us |  0.4953 us |  0.7999 us |
|        HttpClient_GetAsync |   2.849 us |  0.0568 us |  0.1393 us |
|     WebApiClient_PostAsync |  25.942 us |  0.3817 us |  0.3188 us |
| WebApiClientCore_PostAsync |  13.462 us |  0.2551 us |  0.6258 us |
|       HttpClient_PostAsync |   4.515 us |  0.0866 us |  0.0926 us |

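At a rough glance I was delighted: the Core version is about twice as fast as the old version and closer to native.
On a closer look I got a shock: why is the old version's Get request so slow? It is probably because the old version uses Json.net. I have been burned before by the sharp performance drop when Json.net creates ContractResolver instances too often, and even a singleton ContractResolver takes a long time on first creation. So I changed the comparison to send one warm-up request first, which is closer to real usage. After warm-up, the old WebApiClient's Get request dropped from 279 us to 39 us.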
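One way to express such a warm-up in BenchmarkDotNet is a `[GlobalSetup]` method, which runs once before the measured iterations of a benchmark. The sketch below is an illustration only: it uses System.Text.Json instead of Json.net, and the `Sample` type and class name are hypothetical. The benchmarks later in this article warm up from the constructor instead.

```csharp
using System;
using System.Text.Json;
using BenchmarkDotNet.Attributes;

// Hypothetical example: pay one-time serializer setup cost outside the measurement.
public class WarmedUpJsonBenchmarks
{
    private static readonly JsonSerializerOptions Options = new JsonSerializerOptions();
    private byte[] payload = Array.Empty<byte>();

    public class Sample
    {
        public string? Id { get; set; }
    }

    // Runs once before the measured iterations start.
    [GlobalSetup]
    public void Warmup()
    {
        payload = JsonSerializer.SerializeToUtf8Bytes(new Sample { Id = "id" }, Options);

        // The first call builds and caches type metadata, so do it here.
        JsonSerializer.Deserialize<Sample>(payload, Options);
    }

    [Benchmark]
    public Sample? Deserialize() => JsonSerializer.Deserialize<Sample>(payload, Options);
}
```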

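WebApiClientCore: Get versus Post

In the data above, WebApiClientCore's Get request is clearly slower than its Post request (25.3 us versus 13.5 us). My interface is defined as follows: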

public interface IWebApiClientCoreApi
{
    [HttpGet("/benchmarks/{id}")]
    Task<Model> GetAsyc([PathQuery]string id);

    [HttpPost("/benchmarks")]
    Task<Model> PostAsync([JsonContent] Model model);
}

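Get only has to turn the id parameter into the request URI, whereas Post has to serialize the model to JSON. That points to the [PathQuery] attribute, which handles the parameter, as the slow part. [PathQuery] relies on the UriEditor helper class: it first tries Replace() and, if that does not succeed, calls AddQuery(). The prototype of UriEditor is: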

class UriEditor
{ 
    public bool Replace(string name, string? value);
    public void AddQuery(string name, string? value);
}
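The flow just described (fill the placeholder first, fall back to a query parameter) can be sketched in a few lines. This is not the library's UriEditor; `PathQuerySketch.Apply` is a hypothetical helper that works on plain strings, replaces only the first placeholder occurrence, ignores URI fragments, and uses `Uri.EscapeDataString` for encoding.

```csharp
using System;

// Hypothetical illustration of "Replace(), else AddQuery()"; not the real UriEditor.
public static class PathQuerySketch
{
    public static string Apply(string url, string name, string? value)
    {
        var encoded = Uri.EscapeDataString(value ?? string.Empty);
        var placeholder = "{" + name + "}";

        // Step 1: try to fill a {name} placeholder, ignoring case.
        var index = url.IndexOf(placeholder, StringComparison.OrdinalIgnoreCase);
        if (index >= 0)
        {
            return url.Substring(0, index) + encoded + url.Substring(index + placeholder.Length);
        }

        // Step 2: no placeholder found, so append name=value as a query parameter.
        var separator = url.IndexOf('?') >= 0 ? "&" : "?";
        return url + separator + Uri.EscapeDataString(name) + "=" + encoded;
    }
}
```

With the benchmark's route, `Apply("http://webapiclient.com/benchmarks/{id}", "id", "a b")` takes the first branch, which is why only the Replace() path matters in the measurements that follow.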

Since the request URI is [HttpGet("/benchmarks/{id}")], this flow never reaches AddQuery(), so the slow method has to be Replace(). The next step is to rework Replace(). Here is its implementation before the change:

/// <summary>
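/// Replaces the value of a parameter wrapped in curly braces
/// </summary>
/// <param name="name">Parameter name, without the curly braces</param>
/// <param name="value">Parameter value</param>
/// <returns>true if a replacement was made</returns>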
public bool Replace(string name, string? value)
{
    if (this.Uri.OriginalString.Contains('{') == false)
    {
        return false;
    }

    var replaced = false;
    var regex = new Regex($"{{{name}}}", RegexOptions.IgnoreCase);
    var url = regex.Replace(this.Uri.OriginalString, m =>
    {
        replaced = true;
        return HttpUtility.UrlEncode(value, this.Encoding);
    });

    if (replaced == true)
    {
        this.Uri = new Uri(url);
    }
    return replaced;
}

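Performance comparison of Replace improvements

Anyone with a bit of experience can see at a glance that Regex is what drags this code down. The requirement is a case-insensitive string replacement, and the only ready-made option I found was Regex. Regex can be used in two ways: create a Regex instance, or call the static Regex methods.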

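Regex instance versus static method

|               Method |       Mean |    Error |   StdDev |
|--------------------- |-----------:|---------:|---------:|
| ReplaceByRegexStatic |   480.9 ns |  5.50 ns |  5.15 ns |
|    ReplaceByRegexNew | 2,615.8 ns | 41.33 ns | 36.63 ns |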

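One run makes the cause obvious: replacing new Regex with the static Regex call makes it more than 5 times faster (2,615.8 ns down to 480.9 ns).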
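The gap is consistent with how .NET treats the two forms: the static Regex methods keep recently used patterns in an internal cache, while `new Regex(...)` parses the pattern on every call. Below is a sketch of what a Replace built on the static call might look like. It is an assumption-laden illustration, not the library's code: `UriTemplateSketch.TryReplace` is a hypothetical name, it works on a plain string instead of a Uri, and it encodes with `Uri.EscapeDataString` rather than `HttpUtility.UrlEncode`.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical sketch: placeholder replacement using the static Regex.Replace.
public static class UriTemplateSketch
{
    public static bool TryReplace(string url, string name, string? value, out string result)
    {
        result = url;

        // Cheap early exit when the url contains no placeholder at all.
        if (url.IndexOf('{') < 0)
        {
            return false;
        }

        var replaced = false;
        var pattern = @"\{" + Regex.Escape(name) + @"\}";

        // Static call: the parsed pattern is served from Regex's internal cache
        // after the first use instead of being rebuilt each time.
        result = Regex.Replace(url, pattern, m =>
        {
            replaced = true;
            return Uri.EscapeDataString(value ?? string.Empty);
        }, RegexOptions.IgnoreCase);

        return replaced;
    }
}
```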

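Static Regex method versus a hand-written Replace function

The static Regex method still didn't feel fast enough, so I decided to implement my own Replace function and compare. What if it turned out faster than the static Regex method? I spent an evening on this Replace function, and yes, a whole evening: performance testing it, unit testing it, and optimizing its memory allocations.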

/// <summary>
/// Replaces a string, ignoring case
/// </summary>
/// <param name="str">The source string</param>
/// <param name="oldValue">The value to be replaced</param>
/// <param name="newValue">The new value</param>
/// <param name="replacedString">The string after replacement</param>
/// <exception cref="ArgumentNullException"></exception>
/// <returns>true if a replacement was made</returns>
public static bool RepaceIgnoreCase(this string str, string oldValue, string? newValue, out string replacedString)
{
    if (string.IsNullOrEmpty(str) == true)
    {
        replacedString = str;
        return false;
    }

    if (string.IsNullOrEmpty(oldValue) == true)
    {
        throw new ArgumentNullException(nameof(oldValue));
    }

    var strSpan = str.AsSpan();
    using var owner = ArrayPool.Rent<char>(strSpan.Length);
    var strLowerSpan = owner.Array.AsSpan();
    var length = strSpan.ToLowerInvariant(strLowerSpan);
    strLowerSpan = strLowerSpan.Slice(0, length);

    var oldValueLowerSpan = oldValue.ToLowerInvariant().AsSpan();
    var newValueSpan = newValue.AsSpan();

    var replaced = false;
    using var writer = new BufferWriter<char>(strSpan.Length);

    while (strLowerSpan.Length > 0)
    {
        var index = strLowerSpan.IndexOf(oldValueLowerSpan);
        if (index > -1)
        {
            // The part on the left that is not replaced
            var left = strSpan.Slice(0, index);
            writer.Write(left);

            // The replacement value
            writer.Write(newValueSpan);

            // Length to slice off
            var sliceLength = index + oldValueLowerSpan.Length;

            // Slice the original and the lowercase spans in step
            strSpan = strSpan.Slice(sliceLength);
            strLowerSpan = strLowerSpan.Slice(sliceLength);

            replaced = true;
        }
        else
        {
            // The remaining original text after earlier replacements
            if (replaced == true)
            {
                writer.Write(strSpan);
            }

            // No more matches, exit
            break;
        }
    }

    replacedString = replaced ? writer.GetWrittenSpan().ToString() : str;
    return replaced;
}

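The code isn't long, but I had to write quite a few Buffers-related types for it, so the total workload was large. Still, it finally got done. Here is a benchmark with a somewhat longer text: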

public class Benchmark : IBenchmark
{
    private readonly string str = "WebApiClientCore.Benchmarks.StringReplaces.WebApiClientCore";
    private readonly string pattern = "core";
    private readonly string replacement = "CORE";

    [Benchmark]
    public void ReplaceByRegexNew()
    {
        new Regex(pattern, RegexOptions.IgnoreCase).Replace(str, replacement);           
    }

    [Benchmark]
    public void ReplaceByRegexStatic()
    {
        Regex.Replace(str, pattern, replacement, RegexOptions.IgnoreCase);
    }

    [Benchmark]
    public void ReplaceByCutomSpan()
    {
        str.RepaceIgnoreCase(pattern, replacement, out var _);
    }
}
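
|               Method |       Mean |     Error |    StdDev |     Median |
|--------------------- |-----------:|----------:|----------:|-----------:|
|    ReplaceByRegexNew | 3,323.7 ns | 115.82 ns | 326.66 ns | 3,223.4 ns |
| ReplaceByRegexStatic |   881.9 ns |  16.79 ns |  43.94 ns |   868.3 ns |
|   ReplaceByCutomSpan |   524.0 ns |   4.78 ns |   4.47 ns |   524.9 ns |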

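A whole evening of heavy lifting for only about a 1.7x gain over the static Regex call (881.9 ns down to 524.0 ns). The payoff doesn't match the effort.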
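For readers who want to try the same idea without the custom Buffers types, here is a sketch that uses only BCL APIs. It is not the RepaceIgnoreCase implementation above and has not been benchmarked here: `StringReplaceSketch.TryReplaceIgnoreCase` is a hypothetical name, it searches with `StringComparison.OrdinalIgnoreCase` instead of lowercasing a copy, and it collects output in a `StringBuilder`.

```csharp
using System;
using System.Text;

// Hypothetical BCL-only variant of a case-insensitive replace.
public static class StringReplaceSketch
{
    public static bool TryReplaceIgnoreCase(string str, string oldValue, string? newValue, out string result)
    {
        if (string.IsNullOrEmpty(oldValue))
        {
            throw new ArgumentNullException(nameof(oldValue));
        }

        result = str;
        if (string.IsNullOrEmpty(str))
        {
            return false;
        }

        ReadOnlySpan<char> remaining = str.AsSpan();
        StringBuilder? builder = null; // allocated only when a match is found

        while (true)
        {
            var index = remaining.IndexOf(oldValue.AsSpan(), StringComparison.OrdinalIgnoreCase);
            if (index < 0)
            {
                break;
            }

            builder ??= new StringBuilder(str.Length);
            builder.Append(remaining.Slice(0, index)); // text before the match
            builder.Append(newValue);                  // the replacement value
            remaining = remaining.Slice(index + oldValue.Length);
        }

        if (builder == null)
        {
            return false; // nothing matched, result stays the original string
        }

        builder.Append(remaining); // text after the last match
        result = builder.ToString();
        return true;
    }
}
```

On targets that have it (.NET Core 2.0 and later), the built-in `string.Replace(string, string, StringComparison)` overload is another candidate worth adding to the same benchmark.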

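Comparison with Refit

Comparing against my own older sibling at home isn't much fun, so I wanted to step outside and compare against Refit, which is very similar in functionality. I was quite confident going in. To keep it fair, both use the default configuration, both are warmed up, and both use the same interface definition:

Configuration and warm-up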

public abstract class BenChmark : IBenchmark
{
    protected IServiceProvider ServiceProvider { get; }

    public BenChmark()
    {
        var services = new ServiceCollection();

        services
            .AddHttpClient(typeof(HttpClient).FullName)
            .AddHttpMessageHandler(() => new MockResponseHandler());

        services
            .AddHttpApi<IWebApiClientCoreApi>()
            .AddHttpMessageHandler(() => new MockResponseHandler())
            .ConfigureHttpClient(c => c.BaseAddress = new Uri("http://webapiclient.com/"));

        services
            .AddRefitClient<IRefitApi>()
            .AddHttpMessageHandler(() => new MockResponseHandler())
            .ConfigureHttpClient(c => c.BaseAddress = new Uri("http://webapiclient.com/"));

        this.ServiceProvider = services.BuildServiceProvider();
        this.PreheatAsync().Wait();
    }

    private async Task PreheatAsync()
    {
        using var scope = this.ServiceProvider.CreateScope();

        var core = scope.ServiceProvider.GetService<IWebApiClientCoreApi>();
        var refit = scope.ServiceProvider.GetService<IRefitApi>();

        await core.GetAsyc("id");
        await core.PostAsync(new Model { });

        await refit.GetAsyc("id");
        await refit.PostAsync(new Model { });
    }
}
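The MockResponseHandler registered above is not shown in this article. A handler of that kind can be sketched as a DelegatingHandler that answers immediately instead of forwarding the request. Everything below is an assumption for illustration: the class name `CannedResponseHandler` and the JSON body are hypothetical, and the real handler may differ.

```csharp
using System.Net;
using System.Net.Http;
using System.Net.Http.Headers;
using System.Text;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical stand-in for a mock handler: returns a fixed JSON body
// without touching the network.
public class CannedResponseHandler : DelegatingHandler
{
    // Hypothetical payload; the real Model's properties are not shown in the article.
    private static readonly byte[] Json = Encoding.UTF8.GetBytes("{\"name\":\"mock\"}");

    protected override Task<HttpResponseMessage> SendAsync(
        HttpRequestMessage request, CancellationToken cancellationToken)
    {
        var content = new ByteArrayContent(Json);
        content.Headers.ContentType = new MediaTypeHeaderValue("application/json");

        var response = new HttpResponseMessage(HttpStatusCode.OK)
        {
            Content = content,
            RequestMessage = request
        };

        // base.SendAsync is deliberately not called, so no real request is sent.
        return Task.FromResult(response);
    }
}
```

Because every client goes through the same short-circuiting handler, the measured time is the client library's own overhead (building the request, serialization, deserialization) rather than network latency.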

Equivalent interface definitions

public interface IRefitApi
{
    [Get("/benchmarks/{id}")]
    Task<Model> GetAsyc(string id);

    [Post("/benchmarks")]
    Task<Model> PostAsync(Model model);
}

public interface IWebApiClientCoreApi
{
    [HttpGet("/benchmarks/{id}")]
    Task<Model> GetAsyc(string id);

    [HttpPost("/benchmarks")]
    Task<Model> PostAsync([JsonContent] Model model);
}

Test functions

/// <summary> 
/// Simulated Get requests that skip the real HTTP round trip
/// </summary>
public class GetBenchmark : BenChmark
{ 
    /// <summary>
    /// Request using native HttpClient
    /// </summary>
    /// <returns></returns>
    [Benchmark]
    public async Task<Model> HttpClient_GetAsync()
    {
        using var scope = this.ServiceProvider.CreateScope();
        var httpClient = scope.ServiceProvider.GetRequiredService<IHttpClientFactory>().CreateClient(typeof(HttpClient).FullName);

        var id = "id";
        var request = new HttpRequestMessage(HttpMethod.Get, $"http://webapiclient.com/{id}");
        var response = await httpClient.SendAsync(request);
        var json = await response.Content.ReadAsByteArrayAsync();
        return JsonSerializer.Deserialize<Model>(json);
    }


    /// <summary>
    /// Request using WebApiClientCore
    /// </summary>
    /// <returns></returns>
    [Benchmark]
    public async Task<Model> WebApiClientCore_GetAsync()
    {
        using var scope = this.ServiceProvider.CreateScope();
        var banchmarkApi = scope.ServiceProvider.GetRequiredService<IWebApiClientCoreApi>();
        return await banchmarkApi.GetAsyc(id: "id");
    }


    /// <summary>
    /// Get request using Refit
    /// </summary>
    /// <returns></returns>
    [Benchmark]
    public async Task<Model> Refit_GetAsync()
    {
        using var scope = this.ServiceProvider.CreateScope();
        var banchmarkApi = scope.ServiceProvider.GetRequiredService<IRefitApi>();
        return await banchmarkApi.GetAsyc(id: "id");
    }
}

Test results

With the physical network request time removed, WebApiClientCore is about 3 times as fast as Refit. I can finally get a good night's sleep!

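Summary

This article is a bit messy because it is a faithful record of how my performance tuning actually went. The real process had even more detours, big and small, and writing them all down would have made the article unreadable. If you need to tune performance, you might as well run some benchmarks; you will get something out of it.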
