python 操作 PostgreSQL 資料庫，執行緒並行修改 5w 條資料，效能優化

娃哈哈店長發表於2019-12-31

原文網址 : https://learnku.com/articles/38998?order_by=created_at&

之前的任務中，我們建立了，日誌系統，操作excel文件和資料庫的介面、高德地圖的介面。但是城市的資料有2000多條，我們每次單執行緒地去存取資料，太慢了。後來我們測試、例如OP文件OP資料庫都屬於IO操作比較密集的操作。子優化運算元據庫的時候，我們選定執行緒池並行，建立新的xls表，接下來讓我們從新的額xls表中讀取資料並存入資料庫POSTGRE吧?

上一個文章地址：部落格：python 操作 Excel 文件 IO 頻繁，多執行緒 / 執行緒池修改 2800 條資料

獲取新xls表中的所有資料並整理為列表形式返回

其實修改的程式碼量不大，但是要考慮保留之前我們用的函式和方法還要繼續能使用。

excel2sql.py中：

在我們建立物件OpExcel時，為檔案url和sheet名新增預設的值：

class OpExcel(object):
    def __init__(
        self, 
        url:"str型別的檔案路徑" = config.src_path+"\\data\\2019最新全國城市省市縣區行政級別對照表(194).xls", 
        sheet:"excel中的表單名" = "全國城市省市縣區域列表"):

當我們不給予引數時，我們開啟的檔案時原始檔案。

當我們開啟新的擁有地域名和所有經緯度的xls檔案時像如下：

test = OpExcel(config.src_path+"\\data\\test.xls","全國城市省市縣區域列表")

則我們可以查詢到新的檔案和資料

之前我們寫過的函式init_SampleViaProvince_name，時通過傳參而返回指定省的第1級別的地點和地域名（list）形式

我們修改為預設不輸入傳輸則返回所有的資料：

# 通過高德地圖API查詢每個地域名的經緯度，並返回為list資料
    def init_SampleViaProvince_name(
        self, 
        Province_name:"省名" = None
        ) ->"insert的資料，列表形式[('地域名1','1','經緯度'),('地域名2','1','經緯度')]":
        geo_app = Geo_mapInterface(config.geo_key)
        all_data = [self.sh_data.row_values(i) for i in range(self.rows)]
        print(all_data)
        if Province_name:# 生成測試用資料

            cities_data=[ [ ["".join(i[0:3]),1,'test'], ["".join(i[1:3]),1,'test']][i[0]==i[1]] for i in all_data if i[0] == Province_name]
        else: # 生成實際資料
            # 生成第1級別地點和經緯度
            cities_data=[[["".join(i[0:3]),1,i[4]],["".join(i[1:3]),1,i[4]]][i[0]==i[1]] for i in all_data[1:]]
            # 繼續新增第0級別的地點和經緯度：
            for i in all_data:
                cities_data.append(["".join(i[0:2-(i[0]==i[1])]),0,i[4]])
        # for i in cities_data:
        #     i.append(geo_app.get_coordinatesViaaddress("".join(i[0])))
        return cities_data

測試一下：

f:/workspace/env/Scripts/python.exe f:/workspace/城市距離爬取/process_data/excel2sql.py
-----
犁哈薩克自治州', 0, 12345.0], ['新疆省伊犁哈薩克自治州', 0, 12345.0], ['新疆省伊犁哈薩
（展示部分資料）
---

成功。

新的函式已經定義好，那麼我們可以在主函式中同樣利用之前的方法將所有的資料傳入到資料庫中。

同樣採用執行緒並行池，來op資料庫，修改之前的程式碼，優化?

#定義在樣本表中插入資料操作
    def insert_sample_data(self, 
    condition : "insert語句" = "insert into sample_data(address,ad_type,coordinates) values (%s,%s,%s)", 
    params : "insert資料，列表形式[('地域名1','1','經緯度'),('地域名2','1','經緯度')]" = [('地域名1','1','經緯度'),('地域名2','1','經緯度')]
    ) -> "字典形式的批量插入資料結果" :
        def test(l):
            # 插入一條資料
            self.cur.execute(condition,l)
        executor = ThreadPoolExecutor(max_workers=17)
        try:
            # self.cur.executemany(condition,params)
            for i in params:
                # 利用執行緒池--concurrent.futures模組來管理多執行緒：
                future = executor.submit(test,i)
            self.conn.commit()
            result={'code':'0000','message':'執行批量插入操作成功','data':len(params)}
            logging.basicConfig(stream=open(config.src_path + '/log/syserror.log', encoding="utf-8", mode="a"), level = logging.DEBUG, format='%(asctime)s %(filename)s[line:%(lineno)d] %(levelname)s %(message)s')
            logger = logging.getLogger(__name__)
            logger.info("在樣本表sample_data中插入資料{}條，操作：{}!".format(result['data'],result['message']))
        except psycopg2.Error as e:
            self.conn.rollback()  # 執行回滾操作
            result={'code':'9999','message':'執行批量插入異常','data':[]}
            print ("資料庫錯誤|insert_data : %s" % (e.args[0]))
            logging.basicConfig(stream=open(config.src_path + '/log/syserror.log', encoding="utf-8", mode="a"), level = logging.DEBUG, format='%(asctime)s %(filename)s[line:%(lineno)d] %(levelname)s %(message)s')
            logger = logging.getLogger(__name__)
            logger.exception(e)
        return result

好的我們先在main.py用測試資料看一下：

test_data = [['新疆省省直轄行政單位', 0, 12345.0], ['新疆省省直轄行政單位', 0, 12345.0], ['新疆省省直轄行政單位', 0, 12345.0], ['新疆省省直轄行政單位', 0, 12345.0]]
    # print("獲取湖北省的資料",data_insert_sample)
    start = datetime.datetime.now()
    result = op_postgre.insert_sample_data("insert into sample_data(address,ad_type,coordinates) values (%s,%s,%s)",test_data)
    print(datetime.datetime.now() - start)
    # 初始化樣表
    # result =op_postgre.ini_table("sample_data")
    if result['code']=='0000':
        print("運算元據{}條，操作：{}!".format(result['data'],result['message']))
    else:
        print(result)

執行成功！：

(env) PS F:\workspace> & f:/workspace/env/Scripts/python.exe f:/workspace/城市距離爬取/main.py
資料總共條數，和列數： 2867 5
0:00:00.003990
運算元據4條，操作：執行批量插入操作成
功!

上所有資料：

 data_insert_sample = excel_data.init_SampleViaProvince_name()
    print("獲取的資料量",len(data_insert_sample))

(env) PS F:\workspace> & f:/workspace/env/Scripts/python.exe f:/workspace/城市距離爬取/main.py
資料總共條數，和列數： 2867 5
獲取的資料 5734

開始執行：

(env) PS F:\workspace> & f:/workspace/env/Scripts/python.exe f:/workspace/城市距離爬取/main.py
資料總共條數，和列數： 2867 5
獲取的資料量 5734
0:00:00.114661
運算元據5734條，操作：執行批量插入操
作成功!
(env) PS F:\workspace>

速度有點離譜的快，我們看看資料是否真的存進去了？

？？為何只有503條資料？

分析一下程式碼：

        def test(l):
            # 插入一條資料
            self.cur.execute(condition,l)
        executor = ThreadPoolExecutor(max_workers=17)
        try:
            # self.cur.executemany(condition,params)
            for i in params:
                # 利用執行緒池--concurrent.futures模組來管理多執行緒：
                future = executor.submit(test,i)
            self.conn.commit()

估計是在我們執行緒池裡的資料還沒執行完畢的時候，主執行緒就已經關閉，導致資料庫連結斷開，經過測試，當我們把池的大小擴大，存入的資料就越多，估計是這個原因。

但是不能無止境的擴大池的大小波。。

我們試試鎖住當前執行緒：

首先我們給opsql物件建立一個任務池：

def __init__(self, 
    host_db : '資料庫服務主機' = 'localhost', 
    user_db: '資料庫使用者名稱' = 'postgres', 
    passwd_db: '資料庫密碼' = '1026shenyang', 
    name_db: '資料庫名稱' = 'linezone', 
    port_db: '埠號，整型數字'=5432):
        self.all_task = [] # 任務池

我們把任務存放到任務池，用wait函式，當我們任務池所有任務完成時，再關閉伺服器：


def test(l):
            # 插入一條資料
            self.cur.execute(condition,l)
            self.conn.commit()
····主要程式碼
 # self.cur.executemany(condition,params)
            for i in params:
                # 利用執行緒池--concurrent.futures模組來管理多執行緒：
                self.all_task.append(executor.submit(test,i))

·····
·····
#資料庫關閉
    def __del__(self):
        wait(self.all_task, return_when=ALL_COMPLETED)
        self.conn.close()

執行成功：

(env) PS F:\workspace> & f:/workspace/env/Scripts/python.exe f:/workspace/城市距離爬取/main.py
資料總共條數，和列數： 2867 5
獲取的資料量 5734
0:00:00.099693
運算元據5734條，操作：執行批量插入操
作成功!
5734

檢視資料庫，確實存在5734條資料！！！nice！！

python 操作 PostgreSQL 資料庫，執行緒並行修改 5w 條資料，效能優化

從資料庫中調取地點資料，並1-1地來通過高德API來獲取距離和路徑，處理少量資料（0級城市）

因為我們需要比較地地點是：

0級別和0級別距離
1級別和1級別距離

那我我們將兩種地點分類：

def test_3():
     # generate_sampledata()
    op_postgre = opsql.OperationDbInterface()
    addList = op_postgre.select_all("sample_data")['data']
    test = geo_map.Geo_mapInterface()
    print("所有地點個數",len(addList))
    # 第0級別的資料
    list_0 =[list(i) for i in addList if i[1] == 0]
    # 第1級別的資料
    list_1 =[list(i) for i in addList if i[1] == 1]

我們檢視一下我們地點地所有資料：

(env) PS F:\workspace> & f:/workspace/env/Scripts/python.exe f:/workspace/城市距離爬取/main.py
所有地點個數 5733
第0級別的資料 2867
第1級別的資料 2866

？？？？有bug，為何兩個級別地資料差不多？，不可能地。

我們分析程式碼：

發現了問題：再excel2sql.py檔案種：

# 繼續新增第0級別的地點和經緯度：
for i in all_data:
    cities_data.append(["".join(i[0:2-(i[0]==i[1])]),0,i[4]])
改為：
for i in all_data:
    if cities_data.count(["".join(i[0:2-(i[0]==i[1])]),0,i[4]]) ==0:
        cities_data.append(["".join(i[0:2-(i[0]==i[1])]),0,i[4]])

通過list地count方法，簡單地防止資料種有重複地存在。

之後得到地結果：

(env) PS F:\workspace> & f:/workspace/env/Scripts/python.exe f:/workspace/城市距離爬取/main.py
資料總共條數，和列數： 2867 5
獲取的資料量 3206
0:00:00.067819
運算元據3206條，操作：執行批量插入操
作成功!
3206
(env) PS F:\workspace>

3206條資料！

通過高德api，，geo/geo_map.py介面來通過處理分析資料，來構造dict_route = {'origin':[],'destination':[],'distance':[],'route':[]}這種字典。
這個字典地意思是

出發點-目的地點-距離-行程來對應地
dict_route[origin][0]- dict_route[destination][0]-dict_route[distance][0]-dict_route[route][0]

我們還是先不用api（三個key都超流量了，先用假的資料來操作一下吧）。
同樣到時候修改程式碼就可以了。

編寫新的函式，傳入目的地和始發地，可以獲得三組資料

def get_disViaCoordinates(self,
    addList:"一個列表存放地址資料"
    ) ->  "{'origin':[],'destination':[],'distance':[],'route':[]}":
        dict_route = {'origin':[],'destination':[],'distance':[],'route':[]}
        def test(origin,destination):
            url2='https://restapi.amap.com/v3/direction/driving?origin='+origin+'&destination='+destination+'&extensions=all&output=json&key=3e2235273dd2c0ca2421071fbb96def4'
            #編碼
            newUrl2 = parse.quote(url2, safe="/:=&?#+!$,;'@()*[]")
            #傳送請求
            response2 = urllib.request.urlopen(newUrl2)
            #接收資料
            data2 = response2.read()
            #解析json檔案
            jsonData2 = json.loads(data2)
            #輸出該json中所有road的值
            # print(jsonData2)
            road=jsonpath.jsonpath(jsonData2,'$..road')
            #從json檔案中提取距離
            distance = jsonData2['route']['paths'][0]['distance']
            return origin, destination, distance, road

執行緒池-將任務池化，並用all_task裝入。。因為到時候u我們需要等所有地資料查詢完畢後，才能返回dict_route字典：

class Geo_mapInterface(object):
    def __init__(self,
    key:"高德地圖apikey值" = '3e2235273dd2c0ca2421071fbb96def4'):
        self.addList = [('湖北省武漢市江岸區', 1, '114.278760,30.592688'), ('湖北省武漢市江漢區', 1, '114.270871,30.601430'), ('湖北省武漢市喬口區', 1, '114.214920,30.582202')] #建立一個列表存放地址資料
        # self.dict = dict(set(self.addList))#建立一個字典用於存放地址經緯度資料
        self.key = key
        self.all_task = []

def get_disViaCoordinates(self,
    addList:"一個列表存放地址資料"
    ) ->  "{'origin':[],'destination':[],'distance':[],'route':[]}":
        dict_route = {'origin':[],'destination':[],'distance':[],'route':[]}
        def test(origin,destination):
            # API炸了，先用假資料吧
            # url2='https://restapi.amap.com/v3/direction/driving?origin='+origin+'&destination='+destination+'&extensions=all&output=json&key=3e2235273dd2c0ca2421071fbb96def4'
            # #編碼
            # newUrl2 = parse.quote(url2, safe="/:=&?#+!$,;'@()*[]")
            # #傳送請求
            # response2 = urllib.request.urlopen(newUrl2)
            # #接收資料
            # data2 = response2.read()
            # #解析json檔案
            # jsonData2 = json.loads(data2)
            # #輸出該json中所有road的值
            # # print(jsonData2)
            # road=jsonpath.jsonpath(jsonData2,'$..road')
            # #從json檔案中提取距離
            # distance = jsonData2['route']['paths'][0]['distance']
            # 假資料
            distance = 999
            road = ['路徑1', '路徑2', '路徑3', ]
            # time.sleep(0.0001)
            # print(origin,destination)
            #字典dict_route中追加資料
            dict_route['origin'].append(origin)
            dict_route['destination'].append(destination)
            dict_route['distance'].append(distance)
            dict_route['route'].append(road)

        # print(addList)
        executor = ThreadPoolExecutor(max_workers=17)
        for m in range(len(addList)):    
            # print(addList[m])
            for n in range(m,len(addList)):
                if m!=n:
                    print('get_tetst',m,n)
                    #從addList中得到地址的名稱，經緯度
                    origin = addList[m][0]
                    destination = addList[n][0]
                    self.all_task.append(executor.submit(test,origin,destination))

        wait(self.all_task, return_when=ALL_COMPLETED)
        print(len(self.all_task))
        return dict_route

再main中地test3中：

 def test_3():
     # generate_sampledata()
    op_postgre = opsql.OperationDbInterface()
    addList = op_postgre.select_all("sample_data")['data']
    test = geo_map.Geo_mapInterface()
    print("所有地點個數",len(addList))
    # 第0級別的資料
    list_0 =[list(i) for i in addList if i[1] == 0]
    print("第0級別的資料",len(list_0))
    # 第1級別的資料
    list_1 =[list(i) for i in addList if i[1] == 1]
    print("第1級別的資料",len(list_1))
    # print(test.key)
    start = datetime.datetime.now()
    all_test = test.get_disViaCoordinates(list_0[:])
    # all_test = {'origin': ['湖北省武漢市江岸區', '湖北省武漢市江岸區', '湖北省武漢市江漢區'], 'destination': ['湖北省武漢市江漢區', '湖北省武漢市喬口區', '湖北省武漢市喬口區'], 'distance': ['1520', '9197', '7428'], 'route': [['臺北一路', '新華路'], ['臺北一路', '臺北路', '解放大道', '解放大道', '解放大道', '解放大道', '解放大道', '解放大道', '解放大道', '解放大道', '二環線輔路', '沿河大道'], ['新華路', '建設大道', '建設大道', '建設大道', '建設大道', '沿河大道']]}
    print("第0級別的資料",len(list_0))
    print("第1級別的資料",len(list_1))

    print(len(all_test))
    print(datetime.datetime.now() - start)
    # for x in range(len(all_test)-1):
    #     print(x)
    #     data = [str(all_test[i][x]) for i in all_test]
    #     print(data)
    # all_data = [[str(all_test[i][x]) for i in all_test] for x in range(len(all_test)-1)]
    # # print(all_data)
    # result = op_postgre.insert_sample_route("insert into sample_route(origin,destination,distance,route) values (%s,%s,%s,%s)",all_data)
    # # 初始化樣表
    # result =op_postgre.ini_table("sample_route")
    # if result['code']=='0000':
    #     print("運算元據{}條，操作：{}!".format(result['data'],result['message']))
    # else:
    #     print(result)

我們嘗試測試0級別地城市：

(env) PS F:\workspace> & f:/workspace/env/Scripts/python.exe f:/workspace/城市距離爬取/main.py
所有地點個數 3206
第0級別的資料 340
第1級別的資料 2866
57630
第0級別的資料 340
第1級別的資料 2866
4
0:00:02.515260
0

握手原則，list_0中1-1對應地關係又57630條，2秒跑完，但是我們還沒存到postgresql資料庫中，讓我們來試一下啊。。。

我們之前已經對opsql地函式進行了重寫，這裡就沒必要了！

直接i修改test3()函式：

def test_3():
     # generate_sampledata()
    op_postgre = opsql.OperationDbInterface()
    addList = op_postgre.select_all("sample_data")['data']
    test = geo_map.Geo_mapInterface()
    print("所有地點個數",len(addList))
    # 第0級別的資料
    list_0 =[list(i) for i in addList if i[1] == 0]
    print("第0級別的資料",len(list_0))
    # 第1級別的資料
    list_1 =[list(i) for i in addList if i[1] == 1]
    print("第1級別的資料",len(list_1))
    # print(test.key)
    # print(list_0)
    start = datetime.datetime.now()
    all_test = test.get_disViaCoordinates(list_0)
    print(len(all_test['origin']))
    # all_test = {'origin': ['湖北省武漢市江岸區', '湖北省武漢市江岸區', '湖北省武漢市江漢區'], 'destination': ['湖北省武漢市江漢區', '湖北省武漢市喬口區', '湖北省武漢市喬口區'], 'distance': ['1520', '9197', '7428'], 'route': [['臺北一路', '新華路'], ['臺北一路', '臺北路', '解放大道', '解放大道', '解放大道', '解放大道', '解放大道', '解放大道', '解放大道', '解放大道', '二環線輔路', '沿河大道'], ['新華路', '建設大道', '建設大道', '建設大道', '建設大道', '沿河大道']]}
    print("第0級別的資料",len(list_0))
    print("第1級別的資料",len(list_1))

    print(len(all_test))
    print(datetime.datetime.now() - start)

    all_data = [[str(all_test[i][x]) for i in all_test] for x in range(len(all_test['origin']))]
    print(len(all_data))
    result = op_postgre.insert_sample_route("insert into sample_route(origin,destination,distance,route) values (%s,%s,%s,%s)",all_data)
    # # 初始化樣表
    # result =op_postgre.ini_table("sample_route")
    if result['code']=='0000':
        print("運算元據{}條，操作：{}!".format(result['data'],result['message']))
    else:
        print(result)

呼叫並執行：

(env) PS F:\workspace> & f:/workspace/env/Scripts/python.exe f:/workspace/城市距離爬取/main.py
所有地點個數 3206
第0級別的資料 340
第1級別的資料 2866
57630
57630
57630
第0級別的資料 340
第1級別的資料 2866
4
0:00:02.049519
57630
運算元據57630條，操作：執行批量插入操
作成功!
57630

檢視資料庫57000多條資料存入完畢！

運算元據57630條，操作：執行批量插入操
作成功!
0:00:03.419934
57630

搞定！！！?

後面看了看程式碼，當處理list_0的時候，我們需要通過握手原則來生成0級城市和0級城市的11對應關係，
在之前的程式碼上我做了優化：

    def get_disViaCoordinates(self,
    addList:"一個列表存放地址資料",
    ) ->  "{'origin':[],'destination':[],'distance':[],'route':[]}":
        dict_route = {'origin':[],'destination':[],'distance':[],'route':[]}
        def test(origin,destination):
            # API炸了，先用假資料吧
            # url2='https://restapi.amap.com/v3/direction/driving?origin='+origin+'&destination='+destination+'&extensions=all&output=json&key=3e2235273dd2c0ca2421071fbb96def4'
            # #編碼
            # newUrl2 = parse.quote(url2, safe="/:=&?#+!$,;'@()*[]")
            # #傳送請求
            # response2 = urllib.request.urlopen(newUrl2)
            # #接收資料
            # data2 = response2.read()
            # #解析json檔案
            # jsonData2 = json.loads(data2)
            # #輸出該json中所有road的值
            # # print(jsonData2)
            # road=jsonpath.jsonpath(jsonData2,'$..road')
            # #從json檔案中提取距離
            # distance = jsonData2['route']['paths'][0]['distance']
            # 假資料
            distance = 999
            road = ['路徑1', '路徑2', '路徑3', ]
            # time.sleep(0.0001)
            # print(origin,destination)
            #字典dict_route中追加資料
            dict_route['origin'].append(origin)
            dict_route['destination'].append(destination)
            dict_route['distance'].append(distance)
            dict_route['route'].append(road)

        def base(a,b=0):
            for i in range(len(a)-1):
                origin = a[b][0]
                destination = a[i+1][0]
                self.all_task.append(executor.submit(test,origin,destination))
                # print(a[b],a[i+1])
                #
                if i+1 == len(a)-1:
                    b += 1
                    a = a[b:]
                    if len(a) == 1:
                        break
                    base(a,b=0)

        # print(addList)
        executor = ThreadPoolExecutor(max_workers=99)
        # 當城市數量級比較小的時候
        if len(addList)<=1000:
            base(addList)
            wait(self.all_task, return_when=ALL_COMPLETED)
            print(len(self.all_task))

        else:
           pass

        # print(len(dict_route['origin']))
        return dict_route

從時間複雜度的角度來看，執行的複雜度比之前少了一個級別。nlogn。之前是n^2

但是當我們想去操作1級別的城市的時候，發現電腦記憶體不足，不過一想也是。

1級別城市2800多個但是0級別的只有300多個。當握手原理去匹配資料之後，數量級肯定不是一個單位。

腦殼疼?

最後丟擲一個討論，我們利用公式（握手原理，滿無向圖的邊為N(N-1)/2）算出，2800多條城市1-1關係為400w條，如何更好的插入資料庫呢？

本作品系原創（除文中說明外），採用《CC 協議》許可，轉載必須註明作者和本文連結

文章來源首發於我的部落格Stray_Camel(＾Ｕ＾)ノ~ＹＯ

Django資料庫效能優化之 - 使用Python集合操作
2022-07-05
Django資料庫優化Python
python使用多執行緒備份資料庫
2021-05-29
Python執行緒資料庫
javascript執行緒及與執行緒有關的效能優化
2019-01-05
JavaScript執行緒優化
資料庫效能優化2
2020-09-12
資料庫優化
Python 連線mysql資料庫進行操作
2021-09-09
PythonMySql資料庫
【PG執行計劃】Postgresql資料庫執行計劃統計資訊簡述
2019-01-10
SQL資料庫
【實戰】利用多執行緒優化查詢百萬級資料
2021-11-19
執行緒優化
postgresql 資料庫基本操作
2018-06-19
SQL資料庫
springboot配置執行緒池使用多執行緒插入資料
2019-01-19
Spring Boot執行緒
Java 執行緒與同步的效能優化
2020-07-19
Java執行緒優化
達夢資料庫執行緒簡單介紹
2020-09-30
資料庫執行緒
11. shell多執行緒備份資料庫
2019-01-29
執行緒資料庫
03—執行時資料區概述及執行緒
2020-10-25
執行緒
基於python的ftp(含資料庫、含多執行緒、含socketserver)
2021-01-12
PythonFTP資料庫執行緒Server
Mysql資料庫的join演算法介紹，優美的執行優化
2019-03-25
MySql資料庫演算法優化
資料庫效能優化有哪些方式
2019-02-25
資料庫優化
Shell多執行緒備份資料庫的指令碼
2019-01-07
執行緒資料庫指令碼
使用python進行Oracle資料庫效能趨勢分析
2018-06-14
PythonOracle資料庫
ABP預設模板修改預設資料庫型別並初始化資料庫資料
2024-08-14
資料庫型別
如何使用python多執行緒有效爬取大量資料？
2021-09-11
Python執行緒
Python 利用pymysql和openpyxl操作MySQL資料庫並插入Excel資料
2023-10-30
PythonMySql資料庫Excel
【python】爬取疫情資料並進行視覺化
2020-09-24
Python視覺化
如何爬取 python 進行多執行緒跑資料的內容
2023-11-09
Python執行緒
資料庫查詢優化：使用explain分析sql語句執行效率
2018-06-05
資料庫優化AISQL
（MFC）子執行緒的資料如何傳遞給主執行緒中？
2018-08-02
執行緒
資料庫效能優化-索引與sql相關優化
2018-08-01
資料庫優化索引SQL
[原始碼解析] PyTorch分散式優化器(2)----資料並行優化器
2021-12-08
原始碼PyTorch分散式優化並行
[譯] 在 Python 中，如何運用 Dask 資料進行並行資料分析
2018-12-24
Python並行
大資料之執行緒高階部分
2020-03-11
大資料執行緒
PostgreSQL DBA(9) - 執行計劃資料結構
2018-09-29
SQL資料結構
Python資料預處理:Dask和Numba並行化加速!
2018-06-06
Python並行
Kettle資料庫資源庫連線執行示例
2020-10-11
資料庫
如何對分散式 NewSQL 資料庫 TiDB 進行效能調優
2018-04-10
分散式SQL資料庫TiDB
資料庫優化
2018-05-27
資料庫優化
mybatis（問題1）：更改操作執行了，資料庫中卻未執行
2020-12-18
MyBatis資料庫
Python 操作 SQLite 資料庫
2018-12-07
PythonSQLite資料庫
Python操作SQLite資料庫
2019-05-14
PythonSQLite資料庫
python操作mongodb資料庫
2024-10-23
PythonMongoDB資料庫

python 操作 PostgreSQL 資料庫，執行緒並行修改 5w 條資料，效能優化

獲取新xls表中的所有資料並整理為列表形式返回

同樣採用執行緒並行池，來op資料庫，修改之前的程式碼，優化?

從資料庫中調取地點資料，並1-1地來通過高德API來獲取距離和路徑，處理少量資料（0級城市）

相關文章