Exploring: Elegantly Parallelizing Async Methods

Posted by 0611163 on 2023-02-09

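This continues the previous post: "Understanding C# async/await through an example: non-parallel async, parallel async, and concurrency control for parallel async".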

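A few days ago I wrote two blog posts about C# async/await.
The first one had plenty of readers, likes, and comments. I assume everyone understood it, since it was fairly simple.
The second one had few readers, few likes, and no comments.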

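I was puzzled: the second post was the important one, and code that impressive got no comments at all.
I claimed that even the big names in the .NET community have not written code like it. Why say that? It comes down to the C# async/await syntactic sugar:
You can write the same code without the sugar. Java 8 has no such sugar, and high-performance code still gets written in it. But with C# async/await, an ordinary line-of-business programmer of average skill, even a weak one, can write high-performance, high-throughput code. That is the point.
So when I say the top experts have not written it, I mean that they are skilled, sharp, and have many techniques at hand, so they have no need to write it this way. For ordinary programmers, though, complicated code is not only tedious to write but also a steady source of bugs.
I put "exploring" in the title because the question is open: is there a better practice for parallel async code that even beginners can write easily?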

Elasticsearch performance

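An es (Elasticsearch) query workload shows the practical value of parallel async code.
Here is the result from a test of a service deployed in a real environment (originally shown as a screenshot):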

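379 es queries took only 0.185 seconds (the time fluctuates, of course; a few tenths of a second is normal).
What does es fear most? Slow queries: broad fuzzy queries with complex conditions.
My strategy is to issue many exact-match queries instead, which takes advantage of es's very high throughput.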

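How fast is it?

  1. The screenshot above shows just one test, in which the time range analysed is small (a little over a month of data).
  2. Another service endpoint analyses half a year of data, roughly 7.2 billion + 1.8 billion = 9 billion records. It issues a few thousand es requests and produces its result in just a few seconds.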

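Why is it so fast?

  1. The es cluster has many servers with a lot of memory (300 GB, though es is not the only thing running on them), so the cluster itself has high throughput.
  2. Parallel async code performs well and has high throughput, and C#'s syntactic sugar makes it easy to write.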

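Why use parallel async?

With this many queries, a single thread or a synchronous approach is clearly not enough; the queries have to run in parallel.
Parallel code can be written in Python and Java too.
Yet the Python code a former colleague wrote, which queries es many times inside a double loop, is synchronous. Why not query in parallel? It can certainly be done, but people avoid it when they can. Why? Because it is complicated to write, hard to debug, and easy to get wrong.
So what is the point? Not just writing parallel code, but keeping it simple and leaving the original logical structure of the code intact.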
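
To make the difference concrete, here is a minimal sketch in Python's asyncio (the article's own code is C#, so this is only an analogue). `fake_query` is a hypothetical stand-in that sleeps instead of calling es. The sketch contrasts awaiting each query in turn with starting all of them first and awaiting afterwards.

```python
import asyncio
import time


async def fake_query(i: int) -> int:
    """Hypothetical stand-in for one es query (not a real client call)."""
    await asyncio.sleep(0.05)  # simulated network latency
    return i * 2


async def sequential(n: int) -> list[int]:
    # Each query starts only after the previous one has finished.
    return [await fake_query(i) for i in range(n)]


async def concurrent(n: int) -> list[int]:
    # Start every query first, then await them; the waits overlap.
    tasks = [asyncio.create_task(fake_query(i)) for i in range(n)]
    return [await t for t in tasks]


def timed(coro_fn, n: int) -> tuple[list[int], float]:
    # Run one variant to completion and measure the wall-clock time.
    start = time.perf_counter()
    result = asyncio.run(coro_fn(n))
    return result, time.perf_counter() - start


if __name__ == "__main__":
    seq_result, seq_time = timed(sequential, 10)
    con_result, con_time = timed(concurrent, 10)
    print(f"sequential: {seq_time:.2f}s, concurrent: {con_time:.2f}s")
```

Both variants return the same results in the same order; only the total waiting time differs, because the concurrent version overlaps the simulated latency.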

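An ordinary async method

Everyone can write an ordinary async method: just use async/await. Below is one of mine, which makes many async requests inside a double loop. (It is real code rather than a demo, so it is a bit long. Skim it and focus on how the await calls are written.)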

/// <summary>
/// xxx query
/// </summary>
public async Task<List<AccompanyInfo>> Query2(string strStartTime, string strEndTime, int kpCountThreshold, int countThreshold, int distanceThreshold, int timeThreshold, List<PeopleCluster> peopleClusterList)
{
    List<AccompanyInfo> resultList = new List<AccompanyInfo>();
    Stopwatch sw = Stopwatch.StartNew();

    // Build a dictionary from cluster ID to person
    Dictionary<string, PeopleCluster> clusterIdPeopleDict = new Dictionary<string, PeopleCluster>();
    foreach (PeopleCluster peopleCluster in peopleClusterList)
    {
        foreach (string clusterId in peopleCluster.ClusterIds)
        {
            if (!clusterIdPeopleDict.ContainsKey(clusterId))
            {
                clusterIdPeopleDict.Add(clusterId, peopleCluster);
            }
        }
    }

    int queryCount = 0;
    Dictionary<string, AccompanyInfo> dict = new Dictionary<string, AccompanyInfo>();
    foreach (PeopleCluster people1 in peopleClusterList)
    {
        List<PeopleFeatureInfo> peopleFeatureList = await ServiceFactory.Get<PeopleFeatureQueryService>().Query(strStartTime, strEndTime, people1);
        queryCount++;
        foreach (PeopleFeatureInfo peopleFeatureInfo1 in peopleFeatureList)
        {
            DateTime capturedTime = DateTime.ParseExact(peopleFeatureInfo1.captured_time, "yyyyMMddHHmmss", CultureInfo.InvariantCulture);
            string strStartTime2 = capturedTime.AddSeconds(-timeThreshold).ToString("yyyyMMddHHmmss");
            string strEndTime2 = capturedTime.AddSeconds(timeThreshold).ToString("yyyyMMddHHmmss");
            List<PeopleFeatureInfo> peopleFeatureList2 = await ServiceFactory.Get<PeopleFeatureQueryService>().QueryExcludeSelf(strStartTime2, strEndTime2, people1);
            queryCount++;
            if (peopleFeatureList2.Count > 0)
            {
                foreach (PeopleFeatureInfo peopleFeatureInfo2 in peopleFeatureList2)
                {
                    string key = null;
                    PeopleCluster people2 = null;
                    string people2ClusterId = null;
                    if (clusterIdPeopleDict.ContainsKey(peopleFeatureInfo2.cluster_id.ToString()))
                    {
                        people2 = clusterIdPeopleDict[peopleFeatureInfo2.cluster_id.ToString()];
                        key = $"{string.Join(",", people1.ClusterIds)}_{string.Join(",", people2.ClusterIds)}";
                    }
                    else
                    {
                        people2ClusterId = peopleFeatureInfo2.cluster_id.ToString();
                        key = $"{string.Join(",", people1.ClusterIds)}_{string.Join(",", people2ClusterId)}";
                    }

                    double distance = LngLatUtil.CalcDistance(peopleFeatureInfo1.Longitude, peopleFeatureInfo1.Latitude, peopleFeatureInfo2.Longitude, peopleFeatureInfo2.Latitude);
                    if (distance > distanceThreshold) continue;

                    AccompanyInfo accompanyInfo;
                    if (dict.ContainsKey(key))
                    {
                        accompanyInfo = dict[key];
                    }
                    else
                    {
                        accompanyInfo = new AccompanyInfo();
                        dict.Add(key, accompanyInfo);
                    }

                    accompanyInfo.People1 = people1;
                    if (people2 != null)
                    {
                        accompanyInfo.People2 = people2;
                    }
                    else
                    {
                        accompanyInfo.ClusterId2 = people2ClusterId;
                    }

                    AccompanyItem accompanyItem = new AccompanyItem();
                    accompanyItem.Info1 = peopleFeatureInfo1;
                    accompanyItem.Info2 = peopleFeatureInfo2;
                    accompanyInfo.List.Add(accompanyItem);

                    accompanyInfo.Count++;

                    resultList.Add(accompanyInfo);
                }
            }
        }
    }

    resultList = resultList.FindAll(a => (a.People2 != null && a.Count >= kpCountThreshold) || a.Count >= countThreshold);

    // Remove duplicates
    int beforeDistinctCount = resultList.Count;
    resultList = resultList.DistinctBy(a =>
    {
        string str1 = string.Join(",", a.People1.ClusterIds);
        string str2 = a.People2 != null ? string.Join(",", a.People2.ClusterIds) : string.Empty;
        string str3 = a.ClusterId2 ?? string.Empty;
        StringBuilder sb = new StringBuilder();
        foreach (AccompanyItem item in a.List)
        {
            var info2 = item.Info2;
            sb.Append($"{info2.camera_id},{info2.captured_time},{info2.cluster_id}");
        }
        return $"{str1}_{str2}_{str3}_{sb}";
    }).ToList();

    sw.Stop();
    string msg = $"xxx query, elapsed: {sw.Elapsed.TotalSeconds:0.000} s, query count: {queryCount}, dedup: {beforeDistinctCount}-->{resultList.Count}";
    Console.WriteLine(msg);
    LogUtil.Info(msg);

    return resultList;
}

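Running async methods in parallel

The code above is logically correct but has a performance problem. It makes many requests inside a double loop, and although it uses async/await, each request is awaited before the next one starts, so nothing runs in parallel and the total time is long. How can it be optimised? Below is the parallel async version. (Again, it is real code rather than a demo, so it is a bit long. Skim it and focus on how tasks1 and tasks2 are built, how they are awaited, and how the return values are obtained.)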

/// <summary>
/// xxx query
/// </summary>
public async Task<List<AccompanyInfo>> Query(string strStartTime, string strEndTime, int kpCountThreshold, int countThreshold, int distanceThreshold, int timeThreshold, List<PeopleCluster> peopleClusterList)
{
    List<AccompanyInfo> resultList = new List<AccompanyInfo>();
    Stopwatch sw = Stopwatch.StartNew();

    // Build a dictionary from cluster ID to person
    Dictionary<string, PeopleCluster> clusterIdPeopleDict = new Dictionary<string, PeopleCluster>();
    foreach (PeopleCluster peopleCluster in peopleClusterList)
    {
        foreach (string clusterId in peopleCluster.ClusterIds)
        {
            if (!clusterIdPeopleDict.ContainsKey(clusterId))
            {
                clusterIdPeopleDict.Add(clusterId, peopleCluster);
            }
        }
    }

    // Build the first-level loop's tasks
    Dictionary<PeopleCluster, Task<List<PeopleFeatureInfo>>> tasks1 = new Dictionary<PeopleCluster, Task<List<PeopleFeatureInfo>>>();
    foreach (PeopleCluster people1 in peopleClusterList)
    {
        var task1 = ServiceFactory.Get<PeopleFeatureQueryService>().Query(strStartTime, strEndTime, people1);
        tasks1.Add(people1, task1);
    }

    // Await the first-level tasks and cache their results; build the second-level loop's tasks
    Dictionary<string, Task<List<PeopleFeatureInfo>>> tasks2 = new Dictionary<string, Task<List<PeopleFeatureInfo>>>();
    Dictionary<PeopleCluster, List<PeopleFeatureInfo>> cache1 = new Dictionary<PeopleCluster, List<PeopleFeatureInfo>>();
    foreach (PeopleCluster people1 in peopleClusterList)
    {
        List<PeopleFeatureInfo> peopleFeatureList = await tasks1[people1];
        cache1.Add(people1, peopleFeatureList);
        foreach (PeopleFeatureInfo peopleFeatureInfo1 in peopleFeatureList)
        {
            DateTime capturedTime = DateTime.ParseExact(peopleFeatureInfo1.captured_time, "yyyyMMddHHmmss", CultureInfo.InvariantCulture);
            string strStartTime2 = capturedTime.AddSeconds(-timeThreshold).ToString("yyyyMMddHHmmss");
            string strEndTime2 = capturedTime.AddSeconds(timeThreshold).ToString("yyyyMMddHHmmss");
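            string task2Key = $"{strStartTime2}_{strEndTime2}_{string.Join(",", people1.ClusterIds)}";
            // Calling QueryExcludeSelf presumably starts the request right away, so check the key first to avoid sending a duplicate query.
            if (!tasks2.ContainsKey(task2Key))
            {
                var task2 = ServiceFactory.Get<PeopleFeatureQueryService>().QueryExcludeSelf(strStartTime2, strEndTime2, people1);
                tasks2.Add(task2Key, task2);
            }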
        }
    }

    // Read the cached first-level results; await the second-level tasks
    Dictionary<string, AccompanyInfo> dict = new Dictionary<string, AccompanyInfo>();
    foreach (PeopleCluster people1 in peopleClusterList)
    {
        List<PeopleFeatureInfo> peopleFeatureList = cache1[people1];
        foreach (PeopleFeatureInfo peopleFeatureInfo1 in peopleFeatureList)
        {
            DateTime capturedTime = DateTime.ParseExact(peopleFeatureInfo1.captured_time, "yyyyMMddHHmmss", CultureInfo.InvariantCulture);
            string strStartTime2 = capturedTime.AddSeconds(-timeThreshold).ToString("yyyyMMddHHmmss");
            string strEndTime2 = capturedTime.AddSeconds(timeThreshold).ToString("yyyyMMddHHmmss");
            string task2Key = $"{strStartTime2}_{strEndTime2}_{string.Join(",", people1.ClusterIds)}";
            List<PeopleFeatureInfo> peopleFeatureList2 = await tasks2[task2Key];
            if (peopleFeatureList2.Count > 0)
            {
                foreach (PeopleFeatureInfo peopleFeatureInfo2 in peopleFeatureList2)
                {
                    string key = null;
                    PeopleCluster people2 = null;
                    string people2ClusterId = null;
                    if (clusterIdPeopleDict.ContainsKey(peopleFeatureInfo2.cluster_id.ToString()))
                    {
                        people2 = clusterIdPeopleDict[peopleFeatureInfo2.cluster_id.ToString()];
                        key = $"{string.Join(",", people1.ClusterIds)}_{string.Join(",", people2.ClusterIds)}";
                    }
                    else
                    {
                        people2ClusterId = peopleFeatureInfo2.cluster_id.ToString();
                        key = $"{string.Join(",", people1.ClusterIds)}_{string.Join(",", people2ClusterId)}";
                    }

                    double distance = LngLatUtil.CalcDistance(peopleFeatureInfo1.Longitude, peopleFeatureInfo1.Latitude, peopleFeatureInfo2.Longitude, peopleFeatureInfo2.Latitude);
                    if (distance > distanceThreshold) continue;

                    AccompanyInfo accompanyInfo;
                    if (dict.ContainsKey(key))
                    {
                        accompanyInfo = dict[key];
                    }
                    else
                    {
                        accompanyInfo = new AccompanyInfo();
                        dict.Add(key, accompanyInfo);
                    }

                    accompanyInfo.People1 = people1;
                    if (people2 != null)
                    {
                        accompanyInfo.People2 = people2;
                    }
                    else
                    {
                        accompanyInfo.ClusterId2 = people2ClusterId;
                    }

                    AccompanyItem accompanyItem = new AccompanyItem();
                    accompanyItem.Info1 = peopleFeatureInfo1;
                    accompanyItem.Info2 = peopleFeatureInfo2;
                    accompanyInfo.List.Add(accompanyItem);

                    accompanyInfo.Count++;

                    resultList.Add(accompanyInfo);
                }
            }
        }
    }

    resultList = resultList.FindAll(a => (a.People2 != null && a.Count >= kpCountThreshold) || a.Count >= countThreshold);

    // Remove duplicates
    int beforeDistinctCount = resultList.Count;
    resultList = resultList.DistinctBy(a =>
    {
        string str1 = string.Join(",", a.People1.ClusterIds);
        string str2 = a.People2 != null ? string.Join(",", a.People2.ClusterIds) : string.Empty;
        string str3 = a.ClusterId2 ?? string.Empty;
        StringBuilder sb = new StringBuilder();
        foreach (AccompanyItem item in a.List)
        {
            var info2 = item.Info2;
            sb.Append($"{info2.camera_id},{info2.captured_time},{info2.cluster_id}");
        }
        return $"{str1}_{str2}_{str3}_{sb}";
    }).ToList();

    // Sort each list by captured time, newest first
    foreach (AccompanyInfo item in resultList)
    {
        item.List.Sort((a, b) => -string.Compare(a.Info1.captured_time, b.Info1.captured_time));
    }

    sw.Stop();
    string msg = $"xxx query, elapsed: {sw.Elapsed.TotalSeconds:0.000} s, query count: {tasks1.Count + tasks2.Count}, dedup: {beforeDistinctCount}-->{resultList.Count}";
    Console.WriteLine(msg);
    LogUtil.Info(msg);

    return resultList;
}

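Notes on the code above

  1. To parallelize the async calls, the double loop of the business logic has to be written three times. The third pass has the same structure as the double loop in the ordinary async method shown earlier.
  2. The first and second passes are the extra code. The first pass has only one loop level; the second has two. (The third, innermost loop only processes data and issues no requests, so it is not discussed here.)
  3. When writing this, you can first write the ordinary async method and then turn it into the parallel version by copying, pasting, and editing. Those with the brainpower can of course write it directly.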
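
The three passes can be boiled down to a small sketch. The version below uses Python's asyncio rather than C#, and `query_features` and `query_nearby` are hypothetical stand-ins for the two es queries; it is meant only to show the shape of the pattern, not the author's business logic.

```python
import asyncio


async def query_features(person: str) -> list[str]:
    """Hypothetical first-level query: captures for one person."""
    await asyncio.sleep(0.01)
    return [f"{person}-t{i}" for i in range(3)]


async def query_nearby(capture: str) -> list[str]:
    """Hypothetical second-level query: records near one capture."""
    await asyncio.sleep(0.01)
    return [f"{capture}-near"]


async def analyse(people: list[str]) -> dict[str, list[str]]:
    # Pass 1: start all first-level queries, keyed by person.
    tasks1 = {p: asyncio.create_task(query_features(p)) for p in people}

    # Pass 2: await first-level results, cache them, start second-level queries.
    cache1: dict[str, list[str]] = {}
    tasks2: dict[str, asyncio.Task] = {}
    for p in people:
        cache1[p] = await tasks1[p]
        for capture in cache1[p]:
            if capture not in tasks2:  # do not issue the same query twice
                tasks2[capture] = asyncio.create_task(query_nearby(capture))

    # Pass 3: the original business loop, reading each result by its key.
    result: dict[str, list[str]] = {}
    for p in people:
        for capture in cache1[p]:
            nearby = await tasks2[capture]
            result.setdefault(p, []).extend(nearby)
    return result


if __name__ == "__main__":
    print(asyncio.run(analyse(["a", "b"])))
```

The third pass keeps the loop structure of the sequential version; the only change is that each `await` reads from a task that is already running.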

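Why do I say the big names in .NET have not written this?

  1. I really do think nobody has written it this way!
  2. Without a bit of bragging, nobody reads the blog or clicks like, right?!
  3. The impressive thing is C#. Its syntactic sugar makes good code simple to write, and that is what real quality looks like.
  4. I would actually like an expert to publish a better practice and make my approach obsolete, because this is the most controllable way of writing it that I can think of.
  5. Plenty of people can write parallel code, and Java and Python can do it too. The question is how an ordinary line-of-business programmer of average skill can write this kind of parallel code without having to think hard.
  6. The worst option is something like Java's CompletableFuture: once it is combined with complex business logic, the code gets complicated.
  7. The next option, which is in the official documentation and which everyone thinks of, is for example: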
List<PeopleFeatureInfo>[] listArray = await Task.WhenAll(tasks2.Values);

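Inside the double loop, how do you get at each result? The array comes back in the order of tasks2.Values, so every result has to be matched back to its key. It can certainly be done, but now you have to stop and think about how, don't you?
With my approach, the result is available directly inside the double loop: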

List<PeopleFeatureInfo> list = await tasks2[task2Key];
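
The same contrast exists in Python's asyncio, where `asyncio.gather` plays roughly the role of `Task.WhenAll`. The sketch below is an analogue, not the author's code, and `fetch` is a hypothetical query: the first function has to zip keys back onto an ordered result list, while the second awaits the task stored under the key it needs.

```python
import asyncio


async def fetch(key: str) -> str:
    """Hypothetical query identified by a string key."""
    await asyncio.sleep(0.01)
    return key.upper()


async def with_gather(keys: list[str]) -> dict[str, str]:
    # gather returns results in argument order, so the keys must be
    # matched back to the results before the business loop can use them.
    results = await asyncio.gather(*(fetch(k) for k in keys))
    return dict(zip(keys, results))


async def with_keyed_tasks(keys: list[str]) -> dict[str, str]:
    # Tasks are stored under their key; the loop awaits exactly the one it needs.
    tasks = {k: asyncio.create_task(fetch(k)) for k in keys}
    return {k: await tasks[k] for k in keys}


if __name__ == "__main__":
    print(asyncio.run(with_gather(["a", "b"])))
    print(asyncio.run(with_keyed_tasks(["a", "b"])))
```

Both produce the same mapping; the keyed version simply keeps the lookup next to the code that consumes the result.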

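How would the parallel code look in Python?

Showing only C# code is not convincing. I am not good at Python myself, but a colleague writes it very well, and much of his data-mining code is parallel. For example: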

# Assumed imports; the original snippet does not show them.
import multiprocessing as mp
from multiprocessing import Pool

import pandas as pd


def get_es_multiprocess(index_list, people_list, core_percent, rev_clusterid_idcard_dict):
    '''
    Read es data with multiple processes, combine it into one DataFrame, and sort by time.
    get_es and func are defined elsewhere in the colleague's code.
    :return: a large DataFrame plus the grouped data split into chunks, or None if nothing was read
    '''
    col_list = ["cluster_id", "camera_id", "captured_time"]
    pool = Pool(processes=int(mp.cpu_count() * core_percent))
    input_list = [(i, people_list, col_list) for i in index_list]
    res = pool.map(get_es, input_list)
    # Release the pool before the early return so the workers are not leaked.
    pool.close()
    pool.join()
    if not res:
        return None
    df_all = pd.DataFrame(columns=col_list+['longitude', 'latitude'])
    for df in res:
        df_all = pd.concat([df_all, df])
    # Force the key to a string here!
    df_all['cluster_id_'] = df_all['cluster_id'].apply(lambda x: rev_clusterid_idcard_dict[str(x)])
    del df_all['cluster_id']
    df_all.rename(columns={'cluster_id_': 'cluster_id'}, inplace=True)
    df_all.sort_values(by='captured_time', inplace=True)
    print('=' * 100)
    print('Full data (before clustering):')
    print(df_all.info())
    cluster_id_list = [(i, df) for i, df in df_all.groupby(['cluster_id'])]
    cluster_id_list_split = [j for j in func(cluster_id_list, 1000000)]
    # TODO: shrink the dataset here for debugging!
    data_all = df_all.iloc[:, :]
    return data_all, cluster_id_list_split

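Notes on the Python code above

  1. The core code:
res = pool.map(get_es, input_list)
...omitted
pool.join()
...omitted

About the core code: get_es is the method that queries es. It is probably not an async method, but that is not the point.
2. res holds the query results. They are fetched in parallel in one go, collected into res, and then unpacked.
3. Note that this is only a single loop; think about how a double loop would be written.
4. pool.map() blocks the calling thread until every result is ready, and pool.join() blocks again until the workers exit, so the benefit of async is lost. That is not good.
5. The colleague's comment says "multi-process". Is that a mistake? Is it really multi-threaded, or multi-process? (If Pool here is multiprocessing.Pool, the workers are indeed processes.)
6. Python does have async/await, of course, and it is probably no worse than C#'s; the colleague just did not use it.
7. The Python code has too many strings, and strings are the hardest thing to maintain. In my C# code the strings are built from strongly typed variables.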
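
On the blocking point in item 4, one possible way to keep a synchronous query function while still not blocking the caller is `asyncio.to_thread` (Python 3.9+). This is only a sketch under assumptions: `get_es_blocking` is a hypothetical stand-in for the colleague's `get_es`, whose real signature is not shown, and it uses threads rather than the processes of the original.

```python
import asyncio
import time


def get_es_blocking(index_name: str) -> list[str]:
    """Hypothetical stand-in for the synchronous get_es call."""
    time.sleep(0.05)  # simulated blocking I/O
    return [f"{index_name}-row"]


async def get_all(index_list: list[str]) -> dict[str, list[str]]:
    # asyncio.to_thread runs each blocking call in a worker thread, so the
    # event loop (and any other coroutine) keeps running while we wait.
    tasks = {
        name: asyncio.create_task(asyncio.to_thread(get_es_blocking, name))
        for name in index_list
    }
    # Results are read back by index name, as in the keyed-task pattern above.
    return {name: await task for name, task in tasks.items()}


if __name__ == "__main__":
    print(asyncio.run(get_all(["idx1", "idx2", "idx3"])))
```

Threads suit I/O-bound calls such as es queries; CPU-heavy post-processing would still be a case for processes.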

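Turning brain work into grunt work

Following the same recipe, and turning brain work into grunt work, I wrote another parallel async method. (The business logic is still somewhat complex. Focus on how tasks1 and tasks2 are built, how they are awaited, and how the return values are obtained. The xxx comparison code that follows the three task loops has nothing to do with parallel async and can be skipped.)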

/// <summary>
/// xxx query
/// </summary>
public async Task<List<SameVehicleInfo>> Query(string strStartTime, string strEndTime, int kpCountThreshold, int timeThreshold, List<PeopleCluster> peopleClusterList)
{
    List<SameVehicleInfo> resultList = new List<SameVehicleInfo>();
    Stopwatch sw = Stopwatch.StartNew();

    // Build the first-level loop's tasks: query xxx
    Dictionary<PeopleCluster, Task<List<PeopleFeatureInfo>>> tasks1 = new Dictionary<PeopleCluster, Task<List<PeopleFeatureInfo>>>();
    foreach (PeopleCluster people1 in peopleClusterList)
    {
        var task1 = ServiceFactory.Get<PeopleFeatureQueryService>().Query(strStartTime, strEndTime, people1);
        tasks1.Add(people1, task1);
    }

    // Await the first-level tasks and cache their results; build the second-level loop's tasks: exact search for xxx
    Dictionary<string, Task<List<MotorVehicleInfo>>> tasks2 = new Dictionary<string, Task<List<MotorVehicleInfo>>>();
    Dictionary<PeopleCluster, List<PeopleFeatureInfo>> cache1 = new Dictionary<PeopleCluster, List<PeopleFeatureInfo>>();
    foreach (PeopleCluster people1 in peopleClusterList)
    {
        List<PeopleFeatureInfo> peopleFeatureList = await tasks1[people1];
        cache1.Add(people1, peopleFeatureList);
        foreach (PeopleFeatureInfo peopleFeatureInfo1 in peopleFeatureList)
        {
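            string task2Key = $"{peopleFeatureInfo1.camera_id}_{peopleFeatureInfo1.captured_time}";
            // Calling QueryExact presumably starts the request right away, so check the key first to avoid sending a duplicate query.
            if (!tasks2.ContainsKey(task2Key))
            {
                var task2 = ServiceFactory.Get<MotorVehicleQueryService>().QueryExact(peopleFeatureInfo1.camera_id, peopleFeatureInfo1.captured_time);
                tasks2.Add(task2Key, task2);
            }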
        }
    }

    // Read the cached first-level results; await the second-level tasks
    Dictionary<PersonVehicleKey, PersonVehicleInfo> dictPersonVehicle = new Dictionary<PersonVehicleKey, PersonVehicleInfo>();
    foreach (PeopleCluster people1 in peopleClusterList)
    {
        List<PeopleFeatureInfo> peopleFeatureList = cache1[people1];
        foreach (PeopleFeatureInfo peopleFeatureInfo1 in peopleFeatureList)
        {
            string task2Key = $"{peopleFeatureInfo1.camera_id}_{peopleFeatureInfo1.captured_time}";
            List<MotorVehicleInfo> motorVehicleList = await tasks2[task2Key];
            motorVehicleList = motorVehicleList.DistinctBy(a => a.plate_no).ToList();
            foreach (MotorVehicleInfo motorVehicleInfo in motorVehicleList)
            {
                PersonVehicleKey key = new PersonVehicleKey(people1, motorVehicleInfo.plate_no);
                PersonVehicleInfo personVehicleInfo;
                if (dictPersonVehicle.ContainsKey(key))
                {
                    personVehicleInfo = dictPersonVehicle[key];
                }
                else
                {
                    personVehicleInfo = new PersonVehicleInfo()
                    {
                        People = people1,
                        PlateNo = motorVehicleInfo.plate_no,
                        List = new List<PeopleFeatureInfo>()
                    };
                    dictPersonVehicle.Add(key, personVehicleInfo);
                }
                personVehicleInfo.List.Add(peopleFeatureInfo1);
            }
        }
    }

    // Compare xxx
    List<PersonVehicleKey> keys = dictPersonVehicle.Keys.ToList();
    Dictionary<string, SameVehicleInfo> dict = new Dictionary<string, SameVehicleInfo>();
    for (int i = 0; i < keys.Count - 1; i++)
    {
        for (int j = i + 1; j < keys.Count; j++)
        {
            var key1 = keys[i];
            var key2 = keys[j];
            var personVehicle1 = dictPersonVehicle[key1];
            var personVehicle2 = dictPersonVehicle[key2];
            if (key1.PlateNo == key2.PlateNo)
            {
                foreach (PeopleFeatureInfo peopleFeature1 in personVehicle1.List)
                {
                    double minTimeDiff = double.MaxValue;
                    int minIndex = -1;
                    for (int k = 0; k < personVehicle2.List.Count; k++)
                    {
                        PeopleFeatureInfo peopleFeature2 = personVehicle2.List[k];
                        DateTime capturedTime1 = DateTime.ParseExact(peopleFeature1.captured_time, "yyyyMMddHHmmss", CultureInfo.InvariantCulture);
                        DateTime capturedTime2 = DateTime.ParseExact(peopleFeature2.captured_time, "yyyyMMddHHmmss", CultureInfo.InvariantCulture);
                        var timeDiff = Math.Abs(capturedTime2.Subtract(capturedTime1).TotalSeconds);
                        if (timeDiff < minTimeDiff)
                        {
                            minTimeDiff = timeDiff;
                            minIndex = k;
                        }
                    }
                    if (minIndex >= 0 && minTimeDiff <= timeThreshold * 60)
                    {
                        PeopleCluster people1 = key1.People;
                        PeopleCluster people2 = key2.People;
                        PeopleFeatureInfo peopleFeatureInfo2 = personVehicle2.List[minIndex];

                        string key = $"{string.Join(",", people1.ClusterIds)}_{string.Join(",", people2.ClusterIds)}";

                        SameVehicleInfo accompanyInfo;
                        if (dict.ContainsKey(key))
                        {
                            accompanyInfo = dict[key];
                        }
                        else
                        {
                            accompanyInfo = new SameVehicleInfo();
                            dict.Add(key, accompanyInfo);
                        }

                        accompanyInfo.People1 = people1;
                        accompanyInfo.People2 = people2;

                        SameVehicleItem accompanyItem = new SameVehicleItem();
                        accompanyItem.Info1 = peopleFeature1;
                        accompanyItem.Info2 = peopleFeatureInfo2;
                        accompanyInfo.List.Add(accompanyItem);

                        accompanyInfo.Count++;

                        resultList.Add(accompanyInfo);
                    }
                }
            }
        }
    }

    resultList = resultList.FindAll(a => a.Count >= kpCountThreshold);

    // Filter: exclude xxx
    resultList = resultList.FindAll(a =>
    {
        if (string.Join(",", a.People1.ClusterIds) == string.Join(",", a.People2.ClusterIds))
        {
            return false;
        }
        return true;
    });

    // Remove duplicates
    int beforeDistinctCount = resultList.Count;
    resultList = resultList.DistinctBy(a =>
    {
        string str1 = string.Join(",", a.People1.ClusterIds);
        string str2 = string.Join(",", a.People2.ClusterIds);
        StringBuilder sb = new StringBuilder();
        foreach (SameVehicleItem item in a.List)
        {
            var info2 = item.Info2;
            sb.Append($"{info2.camera_id},{info2.captured_time},{info2.cluster_id}");
        }
        return $"{str1}_{str2}_{sb}";
    }).ToList();

    // Sort each list by captured time, newest first
    foreach (SameVehicleInfo item in resultList)
    {
        item.List.Sort((a, b) => -string.Compare(a.Info1.captured_time, b.Info1.captured_time));
    }

    sw.Stop();
    string msg = $"xxx query, elapsed: {sw.Elapsed.TotalSeconds:0.000} s, query count: {tasks1.Count + tasks2.Count}, dedup: {beforeDistinctCount}-->{resultList.Count}";
    Console.WriteLine(msg);
    LogUtil.Info(msg);

    return resultList;
}

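Advantages of C#

  1. Some say: the low-code platform we built is excellent. C#: I am low-code!
  2. Some say: our platform is powerful and supports SQL and scripting. C#: I am a scripting language!
  3. Some say: we use Spark and Flink for distributed processing. C#: parallel async gives high performance and high throughput on a single machine, as long as the services it calls, such as Kafka and es, are clusters.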
