多執行緒C++更新MYSQL

zengmuansha發表於2024-03-15

繼續上次那個需求的最佳化

使用MYSQL CONNECTOR/C++ JDBC介面開發MYSQL 應用程式

MYSQL C++介面
用AI和VSCODE開發MYSQLC++(JDBC)應用指令碼
如何最佳化千倍降低加密耗時?

在這三篇文章介紹使用C++以及介面開發應用程式,或者小工具
最後最佳化是加密,放棄系統呼叫,直接用OPENSSL函式庫.

今天我們繼續最佳化,使用C++多執行緒 去並行更新MYSQL
下面先拿單執行緒資料看看

單執行緒模型

Total Rows:1

TotalRow1: 1538
TotalRow2: 1538
ColName: count(*)
SQLSTRING_VALUE: 1538
Get SQL LIST Elapsed Time:0s, 11ms, 741us, 658ns
ENCRY NEW KEY Elapsed Time:0s, 6ms, 950us, 639ns
UPDATE SQL Elapsed Time:4s, 585ms, 155us, 288ns
TotalUpdateRows=1538
ALL Elapsed Time:4s, 639ms, 383us, 839ns

Performance counter stats for './main.exe':

238.39 msec task-clock # 0.028 CPUs utilized
3,118 context-switches # 0.013 M/sec
31 cpu-migrations # 0.130 K/sec
1,392 page-faults # 0.006 M/sec
564,511,231 cycles # 2.368 GHz
274,603,902 instructions # 0.49 insn per cycle
38,662,453 branches # 162.179 M/sec
2,288,515 branch-misses # 5.92% of all branches

8.484480222 seconds time elapsed
0.041685000 seconds user
0.282866000 seconds sys
產生3866萬個分支; 分支預測失敗率5.92%

多執行緒模型

SQLSTRING_VALUE: 1538

Get SQL LIST Elapsed Time:0s, 15ms, 676us, 971ns
ENCRY NEW KEY Elapsed Time:0s, 9ms, 639us, 807ns
RUN THREAD:0
RUN THREAD:1
RUN THREAD:2
RUN THREAD:3
Update RemainRow:2 Actaul Update Rows:3
Thread 3:TotalUpdateRows=383
Thread 2:TotalUpdateRows=384
Thread 0:TotalUpdateRows=384
0:執行緒已經結束執行。
Thread 1:TotalUpdateRows=384
1:執行緒已經結束執行。
2:執行緒已經結束執行。
3:執行緒已經結束執行。
Thread Update Elapsed Time:3s, 148ms, 616us, 18ns
MAIN FREE:
ALL Elapsed Time:3s, 197ms, 509us, 467ns


Performance counter stats for './main.exe':

163.42 msec task-clock # 0.048 CPUs utilized
1,604 context-switches # 0.010 M/sec
54 cpu-migrations # 0.330 K/sec
1,529 page-faults # 0.009 M/sec
421,362,571 cycles # 2.578 GHz
292,511,736 instructions # 0.69 insn per cycle
39,041,804 branches # 238.906 M/sec
1,662,566 branch-misses # 4.26% of all branches

3.426073087 seconds time elapsed

0.041813000 seconds user
0.187438000 seconds sys

這麼看來多執行緒下只提升了1秒 25% 有點雞骨!

CPU耗時163.42毫秒;

發生1604次上下文切換;

54次CPU遷移;
4.2136億CPU週期;

產生2.9251億指令;

每週期0.69個指令;

產生39041萬個分支

分支預測失敗率4.26%

綠色是提升,紅色代表下降

簡單的C++ 多執行緒


#
include 
<iostream>

# include <thread>
# include <vector>

void threadFunction ( int id)
{
// 子執行緒的程式碼邏輯
std:: cout << "子執行緒 " << id << " 執行完畢" << std:: endl;
}

int main ()
{
std:: vector< std::thread> threads;

// 建立多個子執行緒並將它們新增到向量中
for ( int i = 0; i < 5; ++i)
{
threads.push_back( std::thread(threadFunction, i));
}

// 等待所有子執行緒完成
for ( auto& t : threads)
{
t.join();
}

std:: cout << "所有子執行緒已結束" << std:: endl;

return 0;
}



呼叫執行緒庫<thread.h> 這是C++的庫.
另外還有LINUX系統執行緒庫,一般為POSIX執行緒庫 #include<pthread.h>
定義了 執行緒 陣列
std::vector<std::thread> threads;
建立執行緒並插入到陣列後面
threads.push_back(std::thread(threadFunction, i));
建立執行緒 並分配工作函式,緊跟著函式的引數i
std ::thread(threadFunction, i)

執行緒建立就立馬執行;
下個迴圈就是主程式空轉,等待工作執行緒全部結束.

編寫多執行緒函式的時候遇到的問題如下
1 執行緒工作函式的引數不支援引用 報錯是無法過載
2 執行緒引數 無論是引用,指標 都是值COPY方式
3 執行緒結束後會摧毀一切,包含引數傳來的物件
4 JOIN()和 ThreadArray[i].detach(); 區別

2和1有點矛盾......
3 意味著它會FREE 掉MYSQL的連結,報DUBLE FREE

*** Error in `/home/shark/projects/CPP_Projects/UPDATE_MYSQL_CRYPTO/main.exe': double free or corruption (fasttop): 0x0000000000655150
*** Error in `/home/shark/projects/CPP_Projects/UPDATE_MYSQL_CRYPTO/main.exe': invalid fastbin entry (free): 0x00000000006680a0


透過 GDB 堆疊瞭解到釋放了connect對像

(gdb) bt
#0  0x00007ffff6195989 in raise () from /lib64/libc.so.6
#1  0x00007ffff6197098 in abort () from /lib64/libc.so.6
#2  0x00007ffff61d6197 in __libc_message () from /lib64/libc.so.6
#3  0x00007ffff61dd56d in _int_free () from /lib64/libc.so.6
#4  0x00007ffff75d0bdd in CRYPTO_free () from lib64/libcrypto.so.1.0.0
#5  0x00007ffff76973f7 in X509_VERIFY_PARAM_free () from lib64/libcrypto.so.1.0.0
#6  0x00007ffff79963c9 in SSL_free () from lib64/libssl.so.1.0.0
#7  0x00007ffff6e39fa8 in vio_ssl_delete (vio=0x643c70) at ../../mysql-8.0.11/vio/viossl.cc:350
#8  0x00007ffff6e19807 in end_server (mysql=0x642920) at ../../mysql-8.0.11/sql-common/client.cc:1486
#9  0x00007ffff6e19b0b in cli_safe_read_with_ok (mysql=mysql@entry=0x642920, parse_ok=parse_ok@entry=false, is_data_packet=is_data_packet@entry=0x0) at ../../mysql-8.0.11/sql-common/client.cc:1010
#10 0x00007ffff6e19c9f in cli_safe_read (mysql=mysql@entry=0x642920, is_data_packet=is_data_packet@entry=0x0) at ../../mysql-8.0.11/sql-common/client.cc:1115
#11 0x00007ffff6e1abe9 in cli_read_query_result (mysql=0x642920) at ../../mysql-8.0.11/sql-common/client.cc:5209
#12 0x00007ffff6e24bbf in execute (stmt=stmt@entry=0x743790, packet=packet@entry=0x7ffff0000a30 "", length=length@entry=171) at ../../mysql-8.0.11/libmysql/libmysql.cc:1933
#13 0x00007ffff6e25a15 in cli_stmt_execute (stmt=0x743790) at ../../mysql-8.0.11/libmysql/libmysql.cc:2057
#14 0x00007ffff6e26bb4 in mysql_stmt_execute (stmt=0x743790) at ../../mysql-8.0.11/libmysql/libmysql.cc:2392
#15 0x00007ffff6df9c2a in sql::mysql::MySQL_Prepared_Statement::do_query() () from lib64/libmysqlcppconn.so.7
#16 0x00007ffff6df60de in sql::mysql::MySQL_Prepared_Statement::executeUpdate() () from lib64/libmysqlcppconn.so.7
#17 0x0000000000405379 in ProcessUpdateData (RowLimit=100, prep_stmt=0x7436e0, subDataArray=std::vector of length 384, capacity 384 = {...}) at main.cpp:440
#18 0x000000000040ba35 in std::_Bind_simple<void (*(int, sql::PreparedStatement*, std::vector<TUpdateData, std::allocator<TUpdateData> >))(int, sql::PreparedStatement*, std::vector<TUpdateData, std::allocator<TUpdateData> >)>::_M_invoke<0ul, 1ul, 2ul>(std::_Index_tuple<0ul, 1ul, 2ul>) (this=0x751d70) at /usr/include/c++/4.8.2/functional:1732
#19 0x000000000040b854 in std::_Bind_simple<void (*(int, sql::PreparedStatement*, std::vector<TUpdateData, std::allocator<TUpdateData> >))(int, sql::PreparedStatement*, std::vector<TUpdateData, std::allocator<TUpdateData> >)>::operator()() (this=0x751d70) at /usr/include/c++/4.8.2/functional:1720
#20 0x000000000040b7dd in std::thread::_Impl<std::_Bind_simple<void (*(int, sql::PreparedStatement*, std::vector<TUpdateData, std::allocator<TUpdateData> >))(int, sql::PreparedStatement*, std::vector<TUpdateData, std::allocator<TUpdateData> >)> >::_M_run() (this=0x751d58) at /usr/include/c++/4.8.2/thread:115
#21 0x00007ffff6aedda0 in ?? () from /lib64/libstdc++.so.6
#22 0x00007ffff7bc7df3 in start_thread () from /lib64/libpthread.so.0
#23 0x00007ffff62563dd in clone () from /lib64/libc.so.6


為此不得不NEW一個出來


//傳給執行緒工作函式的必須是NEW出新物件,執行緒退出它會摧毀所有傳過去的物件

sql::mysql::MySQL_Driver *driverThread = new sql::mysql::MySQL_Driver();
sql::Connection *connectThread = driverThread->connect(URL, config[ "DB_USER"], config[ "DB_PASW"]);
connectThread->setSchema(config[ "DB_NAME"]);
sql::PreparedStatement *prep_stmtThread = connectThread->prepareStatement(config[ "UPDATE_SQL"]);
std:: cout << "RUN THREAD:" << i << std:: endl;
ThreadArray[i] = std::thread(ProcessUpdateData, i, prep_stmtThread, subUpdateDataArray); // 值傳給執行緒函式


4 AI 給了錯誤提示 迴圈啟動執行緒,然後馬上join() 導致主程式等待執行緒完成,再建立下個執行緒. 換成detach() 主程式就不等待了,立馬把所有執行緒啟動起來了. 可是這樣主執行緒執行完後面的程式碼就結束了.主程式一結束立馬把所有執行緒都殺了. 為此使用下面語句進行等待一段時間. 這就沒有意思了

std::this_thread::sleep_for(std::chrono::seconds(10));


所以採用雙迴圈模式,先把所有執行緒啟動,執行,然後再另外個迴圈等待
其實這英文單詞取得很繞 join() 應該是wait() detach()應該是nowait()

最後C++相對於C語言來說程式設計非常便利和友好化
1 動態陣列 vercort<>
可以RESIZE() 後插自動RESIZE()
2 陣列動態定義大小

std::thread ThreadArray[WorkThreadNum];

3 型別自動推動
auto i=3; 自動推動i是整型
4 智慧指標
auto_ptr std::auto_ptr<int> ap1; 使用完不用delete,自動釋放。
unique_ptr 獨立指標,禁止指標複製;
shared_ptr 共享指標,透過計數器方式解決釋放問題

主程式 多執行緒呼叫核心原始碼

auto start_time_Thread = 
std::chrono::high_resolution_clock::now();

u_int32_t RemainRows = 0; // max 42,9496,7295
u_int32_t PerThreadProcessRow = 0; // max 42,9496,7295
u_int8_t WorkThreadNum = std::stoi(config[ "WORK_THEAD"]); // max 256
int ROWLIMIT = std::stoi(config[ "ROW_LIMIT"]); // max 65535

if (Total_row1 % WorkThreadNum > 0)
{
PerThreadProcessRow = Total_row1 / WorkThreadNum;
RemainRows = Total_row1 % WorkThreadNum;
}
else
{
PerThreadProcessRow = Total_row1 / WorkThreadNum;
}

// UpdateDataArray.resize(Old_Data_Map.size());
// subUpdateDataArray.resize(PerThreadProcessRow);
// 兩個MAP 物件 合併成一個結構體動態容器
std:: vector<TUpdateData> UpdateDataArray (Old_Data_Map.size()), subUpdateDataArray (PerThreadProcessRow);
auto Old_it = Old_Data_Map.begin();
auto New_it = New_Data_Map.begin();
for ( int i = 0; Old_it != Old_Data_Map.end(); i++, ++Old_it, ++New_it)
{
UpdateDataArray[i].Card_id = Old_it->first;
UpdateDataArray[i].Old_Encry = Old_it->second;
UpdateDataArray[i].New_Encry = New_it->second;
}

//生成執行緒陣列,分配資料,啟動執行緒
std::thread ThreadArray[WorkThreadNum];
for ( int i = 0; i < WorkThreadNum; i++)
{
for ( int j = 0; j < PerThreadProcessRow; j++)
{
int k = i * PerThreadProcessRow + j;
subUpdateDataArray[j] = UpdateDataArray[k];
}
//傳給執行緒工作函式的必須是NEW出新物件,執行緒退出它會摧毀所有傳過去的物件
sql::mysql::MySQL_Driver *driverThread = new sql::mysql::MySQL_Driver();
sql::Connection *connectThread = driverThread->connect(URL, config[ "DB_USER"], config[ "DB_PASW"]);
connectThread->setSchema(config[ "DB_NAME"]);
sql::PreparedStatement *prep_stmtThread = connectThread->prepareStatement(config[ "UPDATE_SQL"]);
std:: cout << "RUN THREAD:" << i << std:: endl;
ThreadArray[i] = std::thread(ProcessUpdateData, i, prep_stmtThread, subUpdateDataArray); // 值傳給執行緒函式
// ThreadArray[i].detach();
}

//使用主執行緒連結語句物件完成剩餘資料的更新

sql::PreparedStatement *prep_stmt = nullptr;
int UpdateCount= 0;
int TotalUpdateRows= 0;
prep_stmt = con->prepareStatement(config[ "UPDATE_SQL"]);
if (RemainRows > 0 )
{
for ( int f=WorkThreadNum*PerThreadProcessRow -1; f < UpdateDataArray.size();f++)
{
prep_stmt->setString( 1, UpdateDataArray[f].New_Encry);
prep_stmt->setString( 2, UpdateDataArray[f].Card_id);
prep_stmt->setString( 3, UpdateDataArray[f].Old_Encry);
UpdateCount = prep_stmt->executeUpdate();
TotalUpdateRows = UpdateCount + TotalUpdateRows;
}
std:: cout << "Update RemainRow:"<<RemainRows<< "\t Actaul Update Rows:"<<TotalUpdateRows<< std:: endl;
RemainRows = 0;
}

//等待所有執行緒工作完成

for ( int i = 0; i < WorkThreadNum; ++i)
{
if (ThreadArray[i].joinable())
{
ThreadArray[i].join();
std:: cout << i << ":執行緒已經結束執行。" << std:: endl;
}
}
auto end_time_Thread = std::chrono::high_resolution_clock::now();
std:: cout << "Thread Update Elapsed Time:" << Calc_time_diff(end_time_Thread, start_time_Thread) << std:: endl;

執行緒工作函式



void 
ProcessUpdateData
(
int RowLimit, sql::PreparedStatement *prep_stmt, 
std::
vector<TUpdateData> subDataArray)

{

u_int64_t UpdateCount, TotalUpdateRows;
UpdateCount = 0;
TotalUpdateRows = 0;
try
{
// prep_stmt->getConnection()->setAutoCommit(false);
for ( auto &pArray : subDataArray)
{
prep_stmt->setString( 1, pArray.New_Encry);
prep_stmt->setString( 2, pArray.Card_id);
prep_stmt->setString( 3, pArray.Old_Encry);
UpdateCount = prep_stmt->executeUpdate();
TotalUpdateRows = UpdateCount + TotalUpdateRows;
}

std:: cout << "Thread " << RowLimit << ":TotalUpdateRows=" << TotalUpdateRows << std:: endl;
}
catch (sql::SQLException &e)
{
std:: cout << "# ERR: SQLException in " << __FILE__; // 列印異常 檔名 函式名 行號 SQL異常 程式碼 狀態等資訊
std:: cout << "(" << __FUNCTION__ << ") on line " << __LINE__ << std:: endl;
std:: cout << "SQLException: " << e.what() << std:: endl;
std:: cout << "MySQL error code: " << e.getErrorCode() << std:: endl;
std:: cout << "SQLState: " << e.getSQLState() << std:: endl;
}
}

執行緒引數不支援引用,否則報錯

    In file included from /usr/include/c++/4.8.2/thread:39:0,from main.cpp:14:
    /usr/include/c++/4.8.2/functional: In instantiation of 
    ‘struct std::_Bind_simple<void (*(int, sql::PreparedStatement*, std::vector<TUpdateData>))
    (int&, sql::PreparedStatement*, std::vector<TUpdateData>)>’:/usr/include/c++/4.8.2/thread:137:47:
      required from ‘std::thread::thread(_Callable&&, _Args&& ...) 
      [with _Callable = void (&)(int&, sql::PreparedStatement*, std::vector<TUpdateData>);
       _Args = {int&, sql::PreparedStatement*&, std::vector<TUpdateData, std::allocator<TUpdateData> >&}]’
       main.cpp:209:99:   required from here/usr/include/c++/4.8.2/functional:1697:61: 
       錯誤:no type named ‘type’
    in ‘class std::result_of<void (*(int, sql::PreparedStatement*, std::vector<TUpdateData>))(int&,
    sql::PreparedStatement*, std::vector<TUpdateData>)>’       
    typedef typename result_of<_Callable(_Args...)>::type result_type;                                                             ^/usr/include/c++/4.8.2/functional:1727:9: 錯誤:no type named ‘type’ 
    in ‘class std::result_of<void (*(int, sql::PreparedStatement*, std::vector<TUpdateData>))(int&, sql::PreparedStatement*, std::vector<TUpdateData>)>’         _M_invoke(_Index_tuple<_Indices...>)

    這樣定義是不對的

    void ProcessUpdateData(int& RowLimit, sql::PreparedStatement *prep_stmt, std::vector<TUpdateData> subDataArray)



    日後有空研究執行緒的同步技術...



    來自 “ ITPUB部落格 ” ,連結:https://blog.itpub.net/680758/viewspace-3009142/,如需轉載,請註明出處,否則將追究法律責任。

    相關文章