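Preface
Development of the netcoreapp version of WebApiClient is nearly complete, and the final push is squeezing out performance. Here I walk through the optimization process I followed; you can apply the same pattern to your own projects to improve their performance.
Overall results
With MockResponseHandler removing the real HTTP requests, the figures below compare native HttpClient, WebApiClientCore, and Refit: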
BenchmarkDotNet=v0.12.1, OS=Windows 10.0.18362.836 (1903/May2019Update/19H1)
Intel Core i3-4150 CPU 3.50GHz (Haswell), 1 CPU, 4 logical and 2 physical cores
.NET Core SDK=3.1.202
[Host] : .NET Core 3.1.4 (CoreCLR 4.700.20.20201, CoreFX 4.700.20.22101), X64 RyuJIT
DefaultJob : .NET Core 3.1.4 (CoreCLR 4.700.20.20201, CoreFX 4.700.20.22101), X64 RyuJIT
Method | Mean | Error | StdDev |
---|---|---|---|
HttpClient_GetAsync | 3.945 μs | 0.2050 μs | 0.5850 μs |
WebApiClientCore_GetAsync | 13.320 μs | 0.2604 μs | 0.3199 μs |
Refit_GetAsync | 43.503 μs | 0.8489 μs | 1.0426 μs |

Method | Mean | Error | StdDev |
---|---|---|---|
HttpClient_PostAsync | 4.876 μs | 0.0972 μs | 0.2092 μs |
WebApiClientCore_PostAsync | 14.018 μs | 0.1829 μs | 0.2246 μs |
Refit_PostAsync | 46.512 μs | 0.7885 μs | 0.7376 μs |
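After optimization, WebApiClientCore comes within roughly 3x of native HttpClient and is about 3x faster than Refit.
Benchmark process
Benchmarks let us compare the performance of several methods. Without a benchmarking tool, how could we tell performance changes apart by eye alone?
BenchmarkDotNet is a powerful .NET benchmarking library. It runs each benchmarked method in an isolated environment, makes it easy to write all kinds of performance tests, and helps avoid many common pitfalls.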
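For readers new to the library, a minimal sketch of a BenchmarkDotNet project might look like the following. The class name `CaseConversionBenchmarks` and its two methods are hypothetical placeholders, not code from WebApiClientCore; only `[Benchmark]` and `BenchmarkRunner.Run<T>()` come from BenchmarkDotNet itself.

```csharp
using System;
using BenchmarkDotNet.Attributes;
using BenchmarkDotNet.Running;

// Hypothetical benchmark class used only to illustrate the shape of a test.
public class CaseConversionBenchmarks
{
    private readonly string text = "WebApiClientCore";

    // The baseline method; other results are reported relative to it.
    [Benchmark(Baseline = true)]
    public string ToLower() => text.ToLowerInvariant();

    [Benchmark]
    public string ToUpper() => text.ToUpperInvariant();
}

public static class Program
{
    public static void Main(string[] args)
    {
        // Discovers the [Benchmark] methods, runs them, and prints a summary table.
        BenchmarkRunner.Run<CaseConversionBenchmarks>();
    }
}
```

BenchmarkDotNet expects an optimized build, so a project like this is normally run with `dotnet run -c Release`.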
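Comparing total request time
As soon as I had BenchmarkDotNet in hand, I eagerly wrote a three-way comparison of the old WebApiClient, native HttpClient, and WebApiClientCore, to see whether the new Core version delivers the expected improvement and how much overhead each library adds over native HttpClient.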
Method | Mean | Error | StdDev |
---|---|---|---|
WebApiClient_GetAsync | 279.479 us | 22.5466 us | 64.3268 us |
WebApiClientCore_GetAsync | 25.298 us | 0.4953 us | 0.7999 us |
HttpClient_GetAsync | 2.849 us | 0.0568 us | 0.1393 us |
WebApiClient_PostAsync | 25.942 us | 0.3817 us | 0.3188 us |
WebApiClientCore_PostAsync | 13.462 us | 0.2551 us | 0.6258 us |
HttpClient_PostAsync | 4.515 us | 0.0866 us | 0.0926 us |
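A quick look at the results made me smile: the Core version performs about twice as well as the old version and is close to native.
A closer look startled me: why is the old version's Get request so slow? It is probably because the old version uses Json.NET. I have been burned before by Json.NET's performance dropping sharply when ContractResolver instances are created frequently, and even a singleton ContractResolver takes noticeable time the first time it is created. So I changed the benchmark to send one warm-up request before the comparison, which is closer to real usage. After warm-up, the old WebApiClient's Get request dropped from 279 us to 39 us.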
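One way to keep one-time costs out of the measured methods is BenchmarkDotNet's `[GlobalSetup]` attribute, which marks a method that runs once before the benchmarks. The sketch below assumes a stand-in `Lazy<string>` in place of a real resolver or client; `WarmupBenchmarks` and its members are hypothetical names, and this is not how the article's own warm-up is wired.

```csharp
using System;
using System.Threading;
using BenchmarkDotNet.Attributes;

public class WarmupBenchmarks
{
    // Stand-in for a resource that is expensive on first use and cached afterwards
    // (for example a contract resolver). The 50 ms sleep simulates the one-time cost.
    private readonly Lazy<string> expensive = new Lazy<string>(() =>
    {
        Thread.Sleep(50);
        return "ready";
    });

    // Runs once before the benchmark methods, so the one-time cost is paid here.
    [GlobalSetup]
    public void Setup()
    {
        _ = expensive.Value;
    }

    // Measures only the steady-state cost of using the already initialized resource.
    [Benchmark]
    public int UseWarmResource() => expensive.Value.Length;
}
```

Paying the initialization cost in a setup step mirrors what a long-running application experiences after its first request.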
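Get vs. Post in WebApiClientCore
Judging from the data above, WebApiClientCore's Get request is clearly slower than its Post request. My interface is defined as follows:

    public interface IWebApiClientCoreApi
    {
        [HttpGet("/benchmarks/{id}")]
        Task<Model> GetAsyc([PathQuery]string id);

        [HttpPost("/benchmarks")]
        Task<Model> PostAsync([JsonContent] Model model);
    }

Get only has to place the id parameter into the request URI, while Post has to serialize model to JSON. Since Get is nevertheless slower, the [PathQuery] attribute that handles the parameter must be performing poorly. [PathQuery] relies on the UriEditor utility class: it first tries Replace() and calls AddQuery() if that fails. UriEditor's prototype is:

    class UriEditor
    {
        public bool Replace(string name, string? value);
        public void AddQuery(string name, string? value);
    }
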
Because the request URI is [HttpGet("/benchmarks/{id}")], this flow never reaches AddQuery(), so the slow method must be Replace(). The next step is to find a way to rework Replace(). Here is its implementation before the change:

    /// <summary>
    /// Replaces the value of a parameter wrapped in curly braces
    /// </summary>
    /// <param name="name">Parameter name, without the curly braces</param>
    /// <param name="value">Parameter value</param>
    /// <returns>true if a replacement was made</returns>
    public bool Replace(string name, string? value)
    {
        if (this.Uri.OriginalString.Contains('{') == false)
        {
            return false;
        }

        var replaced = false;
        var regex = new Regex($"{{{name}}}", RegexOptions.IgnoreCase);
        var url = regex.Replace(this.Uri.OriginalString, m =>
        {
            replaced = true;
            return HttpUtility.UrlEncode(value, this.Encoding);
        });

        if (replaced == true)
        {
            this.Uri = new Uri(url);
        }
        return replaced;
    }

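Performance comparison of Replace improvement options
Anyone with some experience can see at a glance that Regex is the bottleneck in the code above. The requirement is case-insensitive string replacement, and the only ready-made tool I could use for it was Regex. Regex can be used in two ways: by creating a Regex instance, or through Regex's static methods.
Regex instance vs. static method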
Method | Mean | Error | StdDev |
---|---|---|---|
ReplaceByRegexStatic | 480.9 ns | 5.50 ns | 5.15 ns |
ReplaceByRegexNew | 2,615.8 ns | 41.33 ns | 36.63 ns |
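Running it reveals the cause: replacing new Regex with the static Regex call makes the replacement more than 5 times faster (2,615.8 ns down to 480.9 ns)!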
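A likely explanation for the gap is that constructing a Regex parses the pattern on every call, whereas the static Regex methods keep recently used patterns in an internal cache (its size is controlled by `Regex.CacheSize`). A minimal sketch of a static-method replacement follows. `UriTemplateSketch.TryReplace` is a hypothetical helper, not the library's UriEditor; it also adds `Regex.Escape` and uses `Uri.EscapeDataString` in place of `HttpUtility.UrlEncode`, both of which are my assumptions rather than the original code.

```csharp
#nullable enable
using System;
using System.Text.RegularExpressions;

// Hypothetical helper illustrating the static Regex.Replace call.
public static class UriTemplateSketch
{
    public static bool TryReplace(string url, string name, string? value, out string result)
    {
        var replaced = false;

        // Escape the placeholder so characters in the name are matched literally.
        var pattern = Regex.Escape("{" + name + "}");

        // The static method reuses a cached regex for a pattern it has seen recently.
        result = Regex.Replace(url, pattern, match =>
        {
            replaced = true;
            return Uri.EscapeDataString(value ?? string.Empty);
        }, RegexOptions.IgnoreCase);

        return replaced;
    }
}
```

Calling `UriTemplateSketch.TryReplace("http://webapiclient.com/benchmarks/{ID}", "id", "a b", out var url)` returns true and yields a URL ending in `/benchmarks/a%20b`.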
Regex static method vs. a custom Replace function
The static Regex method still did not feel fast enough, so I decided to write my own Replace function and compare, in case it turned out faster. I spent an evening on it, yes, a whole evening, benchmarking it, unit testing it, and optimizing its memory allocations.

    /// <summary>
    /// Replaces a string, ignoring case
    /// </summary>
    /// <param name="str"></param>
    /// <param name="oldValue">The value to replace</param>
    /// <param name="newValue">The new value</param>
    /// <param name="replacedString">The string after replacement</param>
    /// <exception cref="ArgumentNullException"></exception>
    /// <returns></returns>
    public static bool RepaceIgnoreCase(this string str, string oldValue, string? newValue, out string replacedString)
    {
        if (string.IsNullOrEmpty(str) == true)
        {
            replacedString = str;
            return false;
        }

        if (string.IsNullOrEmpty(oldValue) == true)
        {
            throw new ArgumentNullException(nameof(oldValue));
        }

        // ArrayPool.Rent<char> and BufferWriter<char> are Buffers helper types written for this function
        var strSpan = str.AsSpan();
        using var owner = ArrayPool.Rent<char>(strSpan.Length);
        var strLowerSpan = owner.Array.AsSpan();
        var length = strSpan.ToLowerInvariant(strLowerSpan);
        strLowerSpan = strLowerSpan.Slice(0, length);

        var oldValueLowerSpan = oldValue.ToLowerInvariant().AsSpan();
        var newValueSpan = newValue.AsSpan();

        var replaced = false;
        using var writer = new BufferWriter<char>(strSpan.Length);

        while (strLowerSpan.Length > 0)
        {
            var index = strLowerSpan.IndexOf(oldValueLowerSpan);
            if (index > -1)
            {
                // The unreplaced part to the left of the match
                var left = strSpan.Slice(0, index);
                writer.Write(left);

                // The replacement value
                writer.Write(newValueSpan);

                // Length to slice off
                var sliceLength = index + oldValueLowerSpan.Length;

                // Slice the original and lower-cased spans in step
                strSpan = strSpan.Slice(sliceLength);
                strLowerSpan = strLowerSpan.Slice(sliceLength);

                replaced = true;
            }
            else
            {
                // Write the remaining original text after earlier replacements
                if (replaced == true)
                {
                    writer.Write(strSpan);
                }

                // No more matches, exit
                break;
            }
        }

        replacedString = replaced ? writer.GetWrittenSpan().ToString() : str;
        return replaced;
    }

The code is not long, but I wrote quite a few Buffers-related types to support it, so the total effort was substantial. Still, it is finally done, so here is a benchmark with a somewhat longer text:

    public class Benchmark : IBenchmark
    {
        private readonly string str = "WebApiClientCore.Benchmarks.StringReplaces.WebApiClientCore";
        private readonly string pattern = "core";
        private readonly string replacement = "CORE";

        [Benchmark]
        public void ReplaceByRegexNew()
        {
            new Regex(pattern, RegexOptions.IgnoreCase).Replace(str, replacement);
        }

        [Benchmark]
        public void ReplaceByRegexStatic()
        {
            Regex.Replace(str, pattern, replacement, RegexOptions.IgnoreCase);
        }

        [Benchmark]
        public void ReplaceByCutomSpan()
        {
            str.RepaceIgnoreCase(pattern, replacement, out var _);
        }
    }

Method | Mean | Error | StdDev | Median |
---|---|---|---|---|
ReplaceByRegexNew | 3,323.7 ns | 115.82 ns | 326.66 ns | 3,223.4 ns |
ReplaceByRegexStatic | 881.9 ns | 16.79 ns | 43.94 ns | 868.3 ns |
ReplaceByCutomSpan | 524.0 ns | 4.78 ns | 4.47 ns | 524.9 ns |
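A whole evening of heavy lifting, and the custom function is only about 1.7 times as fast as the static Regex call (524.0 ns versus 881.9 ns). The return hardly matches the effort.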
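For comparison, a simpler case-insensitive replacement can be sketched with only `string.IndexOf` and `StringBuilder`, without custom buffer types. This is an illustrative alternative under my own assumptions, not the function benchmarked above: `StringReplaceSketch.TryReplaceIgnoreCase` is a hypothetical name, it has not been measured here, and `StringComparison.OrdinalIgnoreCase` may differ from the `ToLowerInvariant` approach for some non-ASCII characters.

```csharp
#nullable enable
using System;
using System.Text;

// Hypothetical helper; not part of WebApiClientCore.
public static class StringReplaceSketch
{
    public static bool TryReplaceIgnoreCase(string str, string oldValue, string? newValue, out string result)
    {
        if (string.IsNullOrEmpty(oldValue))
        {
            throw new ArgumentNullException(nameof(oldValue));
        }

        result = str;
        if (string.IsNullOrEmpty(str))
        {
            return false;
        }

        // Find the first match; bail out early without allocating if there is none.
        var index = str.IndexOf(oldValue, StringComparison.OrdinalIgnoreCase);
        if (index < 0)
        {
            return false;
        }

        var builder = new StringBuilder(str.Length);
        var start = 0;
        while (index >= 0)
        {
            // Copy the untouched text before the match, then the replacement.
            builder.Append(str, start, index - start);
            builder.Append(newValue);

            // Continue searching after the matched text.
            start = index + oldValue.Length;
            index = str.IndexOf(oldValue, start, StringComparison.OrdinalIgnoreCase);
        }

        // Copy whatever remains after the last match.
        builder.Append(str, start, str.Length - start);
        result = builder.ToString();
        return true;
    }
}
```

On .NET Core 2.0 and later, `string.Replace(string, string, StringComparison)` is also available; it does not report whether a match occurred, so it would need an extra check to serve as a drop-in for a bool-returning Replace.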
與Refit對比
在自家裡和老哥哥比沒意思,所以想跳出來和功能非常相似的Refit做比較看看,在比較之前,我是很有信心的。為了公平,兩者都使用預設配置,都進行預熱,使用相同的介面定義:
配置與預熱
public abstract class BenChmark : IBenchmark
{
protected IServiceProvider ServiceProvider { get; }
public BenChmark()
{
var services = new ServiceCollection();
services
.AddHttpClient(typeof(HttpClient).FullName)
.AddHttpMessageHandler(() => new MockResponseHandler());
services
.AddHttpApi<IWebApiClientCoreApi>()
.AddHttpMessageHandler(() => new MockResponseHandler())
.ConfigureHttpClient(c => c.BaseAddress = new Uri("http://webapiclient.com/"));
services
.AddRefitClient<IRefitApi>()
.AddHttpMessageHandler(() => new MockResponseHandler())
.ConfigureHttpClient(c => c.BaseAddress = new Uri("http://webapiclient.com/"));
this.ServiceProvider = services.BuildServiceProvider();
this.PreheatAsync().Wait();
}
private async Task PreheatAsync()
{
using var scope = this.ServiceProvider.CreateScope();
var core = scope.ServiceProvider.GetService<IWebApiClientCoreApi>();
var refit = scope.ServiceProvider.GetService<IRefitApi>();
await core.GetAsyc("id");
await core.PostAsync(new Model { });
await refit.GetAsyc("id");
await refit.PostAsync(new Model { });
}
}
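The article does not show MockResponseHandler itself. A handler of this kind could be sketched as a DelegatingHandler that returns a canned JSON response without touching the network. Everything below is an assumption for illustration: `MockResponseHandlerSketch`, `SketchModel`, and its properties are hypothetical names, and the real handler and Model type may differ.

```csharp
#nullable enable
using System.Net;
using System.Net.Http;
using System.Net.Http.Headers;
using System.Text.Json;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical model; the article does not show its Model type.
public class SketchModel
{
    public string? A { get; set; }
    public int B { get; set; }
}

// Hypothetical handler that short-circuits the pipeline with a fixed response.
public class MockResponseHandlerSketch : DelegatingHandler
{
    // Serialize the payload once so each request only wraps the same bytes.
    private static readonly byte[] payload =
        JsonSerializer.SerializeToUtf8Bytes(new SketchModel { A = "a", B = 2 });

    protected override Task<HttpResponseMessage> SendAsync(
        HttpRequestMessage request, CancellationToken cancellationToken)
    {
        var content = new ByteArrayContent(payload);
        content.Headers.ContentType = new MediaTypeHeaderValue("application/json");

        // base.SendAsync is never called, so no inner handler or socket is used.
        var response = new HttpResponseMessage(HttpStatusCode.OK)
        {
            Content = content,
            RequestMessage = request
        };
        return Task.FromResult(response);
    }
}
```

Because every client receives the same in-memory response, differences between the measured methods come from the client libraries rather than from network latency.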
Equivalent interface definitions

    public interface IRefitApi
    {
        [Get("/benchmarks/{id}")]
        Task<Model> GetAsyc(string id);

        [Post("/benchmarks")]
        Task<Model> PostAsync(Model model);
    }

    public interface IWebApiClientCoreApi
    {
        [HttpGet("/benchmarks/{id}")]
        Task<Model> GetAsyc(string id);

        [HttpPost("/benchmarks")]
        Task<Model> PostAsync([JsonContent] Model model);
    }

Test functions

    /// <summary>
    /// Simulated Get requests that skip the real HTTP request step
    /// </summary>
    public class GetBenchmark : BenChmark
    {
        /// <summary>
        /// Request using native HttpClient
        /// </summary>
        /// <returns></returns>
        [Benchmark]
        public async Task<Model> HttpClient_GetAsync()
        {
            using var scope = this.ServiceProvider.CreateScope();
            var httpClient = scope.ServiceProvider.GetRequiredService<IHttpClientFactory>().CreateClient(typeof(HttpClient).FullName);

            var id = "id";
            var request = new HttpRequestMessage(HttpMethod.Get, $"http://webapiclient.com/{id}");
            var response = await httpClient.SendAsync(request);
            var json = await response.Content.ReadAsByteArrayAsync();
            return JsonSerializer.Deserialize<Model>(json);
        }

        /// <summary>
        /// Request using WebApiClientCore
        /// </summary>
        /// <returns></returns>
        [Benchmark]
        public async Task<Model> WebApiClientCore_GetAsync()
        {
            using var scope = this.ServiceProvider.CreateScope();
            var benchmarkApi = scope.ServiceProvider.GetRequiredService<IWebApiClientCoreApi>();
            return await benchmarkApi.GetAsyc(id: "id");
        }

        /// <summary>
        /// Get request using Refit
        /// </summary>
        /// <returns></returns>
        [Benchmark]
        public async Task<Model> Refit_GetAsync()
        {
            using var scope = this.ServiceProvider.CreateScope();
            var benchmarkApi = scope.ServiceProvider.GetRequiredService<IRefitApi>();
            return await benchmarkApi.GetAsyc(id: "id");
        }
    }

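Test results
With the physical network request time removed, WebApiClientCore is about 3 times as fast as Refit (see the tables under Overall results). I can finally get a good night's sleep!
Summary
This article is somewhat disorganized because it is a faithful record of my performance-tuning process. The real process included even more detours, large and small, and writing them all down would have made the article unreadable. If you need to tune performance, it is worth running a benchmark; you will learn something from it.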