The Cost of Round Trips to the Back-End Database

Posted by 潘文佳 on 2012-04-28

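A database exists above all to serve front-end applications. Of the many factors that determine application performance, how quickly and efficiently data is read from the back end has a large effect on the end result. This article compares several ways of reading data over round trips to the server and summarizes the findings. The results are surprising. Under time pressure we instinctively reach for the simplest way to get the job done, but that coding habit can cost the front-end application a great deal of performance.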

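Returning 15,000 rows: this test reads 15,000 rows from a single table. We write the retrieval in three different ways to see how access to the database can be made more efficient.

The script below creates the table and loads one million rows into it. Each of the 3 tests should read fresh data, so the table needs far more rows than any one test reads, which is why there are a million. The rows are laid out so that every test reads exactly 15,000 rows, which keeps differences in the data from skewing the results.

With minor changes this script can also be run on MS SQL Server: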

create table test000 (
    intpk int primary key
   ,filler char(40)
);

--  BLOCK 1, first 5000 rows
--  pgAdmin3: run as pgScript
--  All others: modify as required
--
declare @x,@y;
set @x = 1;
set @y = string(40,40,1);
while @x <= 5000 begin
    insert into test000 (intpk,filler)
    values ((@x-1)*200 +1,'@y');

    set @x = @x + 1;
end

-- BLOCK 2, put 5000 rows aside
--
select * into test000_temp from test000;

-- BLOCK 3, Insert the 5000 rows 199 more
--          times to get 1 million altogether
--  pgAdmin3: run as pgScript
--  All others: modify as required
--
declare @x;
set @x = 1;
while @x <= 199 begin
    insert into test000 (intpk,filler)
    select intpk+@x,filler from test000_temp;

    set @x = @x + 1;
end
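
The key arithmetic in the script can be replayed outside the database to see what it produces. The sketch below is an illustration in Python rather than part of the original benchmark, and its variable names are assumptions. It rebuilds the keys that blocks 1 and 3 generate and checks that they cover 1 through 1,000,000 with no gaps and no duplicates, so any run of consecutive keys inside that range finds exactly one row per key.

```python
# Block 1: 5,000 seed keys spaced 200 apart (1, 201, 401, ...)
seed_keys = [(x - 1) * 200 + 1 for x in range(1, 5001)]

# Block 3: copy the seed rows 199 times, shifting each key by 1..199
all_keys = list(seed_keys)
for offset in range(1, 200):
    all_keys.extend(key + offset for key in seed_keys)

total_rows = len(all_keys)
has_duplicates = len(set(all_keys)) != total_rows
is_contiguous = sorted(all_keys) == list(range(1, 1_000_001))

print(total_rows, has_duplicates, is_contiguous)
```

A duplicate key would also make the real script fail, because intpk is the primary key.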

Test 1: Basic Code

The simplest code runs a plain query 15,000 times, making one round trip per row.

# Make a database connection
$dbConn = pg_connect("dbname=roundTrips user=postgres");

# Program 1, Individual explicit fetches
# Start at a random multiple of 5000; 0..197 keeps $x2 within 1,000,000
$x1 = rand(0,197)*5000 + 1;
$x2 = $x1 + 14999;
echo "\nTest 1, using $x1 to $x2";
$timeBegin = microtime(true);
# One query, and so one round trip, for every key from $x1 to $x2
for ($x = $x1; $x <= $x2; $x++) {
    $dbResult = pg_query($dbConn, "select * from test000 where intpk=$x");
    $row = pg_fetch_array($dbResult);
}
$elapsed = microtime(true)-$timeBegin;
echo "\nTest 1, elapsed time: ".$elapsed;
echo "\n";

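Test 2: Prepared Statement

This version prepares a statement before the loop. It still makes 15,000 round trips, but each one only changes the parameter passed to the prepared statement.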

# Make a database connection
$dbConn = pg_connect("dbname=roundTrips user=postgres");

# Program 2, Individual fetches with prepared statements
# Start at a random multiple of 5000; 0..197 keeps $x2 within 1,000,000
$x1 = rand(0,197)*5000 + 1;
$x2 = $x1 + 14999;
echo "\nTest 2, using $x1 to $x2";
$timeBegin = microtime(true);
# Prepare once; single quotes keep PHP from touching the $1 placeholder
$dbResult = pg_prepare($dbConn, "test000", 'select * from test000 where intpk=$1');
# Still one round trip per key, but only the parameter changes
for ($x = $x1; $x <= $x2; $x++) {
    $pqResult = pg_execute($dbConn, "test000", array($x));
    $row = pg_fetch_all($pqResult);
}
$elapsed = microtime(true)-$timeBegin;
echo "\nTest 2, elapsed time: ".$elapsed;
echo "\n";

Test 3: One Round Trip

We issue a single query that selects all 15,000 rows and returns them in one round trip.

# Make a database connection
$dbConn = pg_connect("dbname=roundTrips user=postgres");

# Program 3, One fetch, pull all rows
$timeBegin = microtime(true);
# Start at a random multiple of 5000; 0..197 keeps $x2 within 1,000,000
$x1 = rand(0,197)*5000 + 1;
$x2 = $x1 + 14999;
echo "\nTest 3, using $x1 to $x2";
# A single query returns the whole key range in one round trip
$dbResult = pg_query(
    $dbConn,
    "select * from test000 where intpk between $x1 and $x2"
);
$allRows = pg_fetch_all($dbResult);
$elapsed = microtime(true)-$timeBegin;
echo "\nTest 3, elapsed time: ".$elapsed;
echo "\n";

Results

Each test was run 5 times. The averages were:

Basic	Prepared	One round trip
~1.800 s	~1.150 s	~0.045 s

Compared with the basic code, the single round trip was about 40 times faster, and it was about 25 times faster than the prepared statement.
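
These ratios follow directly from the averages in the table. As a worked check, the short Python sketch below recomputes them and also divides the two looped timings by 15,000 to get a rough average cost per round trip. That per-trip figure is only an estimate derived from the numbers above: it includes the query's own execution time, not just the trip itself.

```python
# Average timings reported in the results table, in seconds
basic, prepared, one_trip = 1.800, 1.150, 0.045
rows = 15_000

# How many times faster the single round trip is
speedup_vs_basic = basic / one_trip
speedup_vs_prepared = prepared / one_trip

# Tests 1 and 2 make one round trip per row, so this is the average per trip
ms_per_trip_basic = basic / rows * 1000
ms_per_trip_prepared = prepared / rows * 1000

print(f"one round trip vs basic:    {speedup_vs_basic:.1f}x")
print(f"one round trip vs prepared: {speedup_vs_prepared:.1f}x")
print(f"basic, per trip:    {ms_per_trip_basic:.3f} ms")
print(f"prepared, per trip: {ms_per_trip_prepared:.3f} ms")
```

Each individual trip is cheap, at roughly a tenth of a millisecond. The cost comes from repeating it 15,000 times.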

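Do the server and language affect performance?

This test was run on PHP/PostgreSQL. Would other languages and servers give different results? On the same hardware the absolute numbers may well differ, but the relative gap should be about the same. Reading all the rows you need in one round trip is faster than any approach that makes many round trips.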
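
For readers who want to try the same two access patterns in another stack, here is a minimal sketch in Python using the standard-library sqlite3 module. It is an assumption-laden stand-in, not a rerun of the benchmark: SQLite runs in-process, so there is no network round trip, the table holds only 20,000 rows, and the timings it prints will not match the PHP/PostgreSQL figures. What it does show is that the per-key loop and the single range query return the same rows.

```python
import sqlite3
import time

# In-memory stand-in for test000 (smaller than the article's table)
conn = sqlite3.connect(":memory:")
conn.execute("create table test000 (intpk integer primary key, filler text)")
conn.executemany(
    "insert into test000 (intpk, filler) values (?, ?)",
    ((key, "x" * 40) for key in range(1, 20_001)),
)

x1, x2 = 1, 15_000

# Pattern of tests 1 and 2: one query per key
start = time.perf_counter()
rows_one_by_one = []
for key in range(x1, x2 + 1):
    row = conn.execute(
        "select * from test000 where intpk = ?", (key,)
    ).fetchone()
    rows_one_by_one.append(row)
per_key_seconds = time.perf_counter() - start

# Pattern of test 3: one query for the whole key range
start = time.perf_counter()
rows_at_once = conn.execute(
    "select * from test000 where intpk between ? and ? order by intpk",
    (x1, x2),
).fetchall()
one_query_seconds = time.perf_counter() - start

print(len(rows_one_by_one), len(rows_at_once))
print(f"per key: {per_key_seconds:.3f} s, one query: {one_query_seconds:.3f} s")
```

Against a database reached over a socket or network, each per-key query adds a full round trip, which is the cost the article measures.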

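Putting It Into Practice

The most obvious conclusion from this test is that any retrieval of more than one row should use this method. In fact, we should make it the default and depart from it only for a very good reason. So what would count as a good reason?

I talked this over with our programmers, and one of them said: "Look, our application only ever needs 20 to 100 rows at a time, never more. I can't see how reads of 20 to 100 rows justify reworking all the code." So I went back and tested it. The method only shows a clear difference at around 100 rows and up. At 20 rows there was almost no difference. At 100 rows, the single round trip was 6 times faster than the basic code and 4 times faster than the prepared statement. So whether to use it is a judgment call.

Another factor to consider is how many reads run at the same time. If your system is built around real-time operation, the picture may be different. This test used a single user. When many users read at once, that places extra load on the database, and the comparison needs a broader test environment.

Another objection might be: "We read from several different tables. In some threads we walk through rows one at a time and have to read row by row from different tables." In that case you absolutely need a single round trip, using a few JOINs to fetch everything at once. If reading one table row by row is already 10 times slower, doing it across several tables is dozens of times slower.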
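
The JOIN advice can be sketched as follows. This is an illustrative Python example using the standard-library sqlite3 module; the `customers` and `orders` tables, their columns, and the row counts are all hypothetical and do not come from the original test. It counts the queries issued by the row-by-row approach and confirms that one JOIN returns the same result.

```python
import sqlite3

# Hypothetical two-table schema, held in memory
conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table customers (id integer primary key, name text);
    create table orders (id integer primary key, customer_id integer, total real);
""")
conn.executemany(
    "insert into customers values (?, ?)",
    [(i, f"customer {i}") for i in range(1, 101)],
)
conn.executemany(
    "insert into orders values (?, ?, ?)",
    [(i, (i % 100) + 1, i * 1.5) for i in range(1, 501)],
)

# Row by row: one query for the orders, then one more query per order
row_by_row = []
queries_issued = 1
orders = conn.execute(
    "select id, customer_id, total from orders order by id"
).fetchall()
for order_id, customer_id, total in orders:
    name = conn.execute(
        "select name from customers where id = ?", (customer_id,)
    ).fetchone()[0]
    queries_issued += 1
    row_by_row.append((order_id, name, total))

# One round trip: a single JOIN returns the same rows
joined = conn.execute("""
    select o.id, c.name, o.total
    from orders o join customers c on c.id = o.customer_id
    order by o.id
""").fetchall()

print(queries_issued, len(joined), joined == row_by_row)
```

With 500 orders the row-by-row version issues 501 queries where the JOIN issues one, and each extra table looked up per row adds another 500.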

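Original article: database-programmer. Translation: 伯樂線上, 潘文佳

[If you repost this article, please credit it and keep the links to the original and the translation, along with the translator's details. Thank you!]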