Teradata 之top n與sample n
Teradata取n條樣本資料的方法有兩種:
select top x * from table;
select * from table sample n;
那麼二者有什麼區別呢?說明如下:
TOP N
如果有Order By關鍵字首先要對資料庫的資料進行排序,然後獲取N條資料或者抽樣比率為N;
如果沒有Order By關鍵字,要做一次STAT FUNCTION的全AMP收集,然後選擇某一個或者幾個AMP 來抽取資料。
Sample N
首先要對資料庫的資料進行全表掃描,然後獲取N資料;
採用的是Sampling的形式。
效能比較:
當資料量比較小的時候,TOP N的速度要比Sample的速度快;
當資料量比較大的時候,Sample N的速度要比TOP的速度快。
測試:
Explain select top 10 * from PD_PORTAL.TOPIC_COMP_DETAIL
1) First, we lock a distinct PD_PORTAL."pseudo table" for read on a
RowHash to prevent global deadlock for PD_PORTAL.TOPIC_COMP_DETAIL.
2) Next, we lock PD_PORTAL.TOPIC_COMP_DETAIL for read.
3) We do an all-AMPs STAT FUNCTION step from
PD_PORTAL.TOPIC_COMP_DETAIL by way of an all-rows scan with no
residual conditions into Spool 5, which is redistributed by hash
code to all AMPs. The result rows are put into Spool 1
(group_amps), which is built locally on the AMPs. This step is
used to retrieve the TOP 10 rows. One AMP is randomly selected to
retrieve 10 rows. If this step retrieves less than 10 rows, then
execute step 4. The size is estimated with low confidence to be
10 rows (27,460 bytes).
4) We do an all-AMPs STAT FUNCTION step from
PD_PORTAL.TOPIC_COMP_DETAIL by way of an all-rows scan with no
residual conditions into Spool 5 (Last Use), which is
redistributed by hash code to all AMPs. The result rows are put
into Spool 1 (group_amps), which is built locally on the AMPs.
This step is used to retrieve the TOP 10 rows. The size is
estimated with low confidence to be 10 rows (27,460 bytes).
5) Finally, we send out an END TRANSACTION step to all AMPs involved
in processing the request.
-> The contents of Spool 1 are sent back to the user as the result of
statement 1.
Explain select * from PD_PORTAL.TOPIC_COMP_DETAIL sample 10
1) First, we lock a distinct PD_PORTAL."pseudo table" for read on a
RowHash to prevent global deadlock for PD_PORTAL.TOPIC_COMP_DETAIL.
2) Next, we lock PD_PORTAL.TOPIC_COMP_DETAIL for read.
3) We do an all-AMPs SAMPLING step from PD_PORTAL.TOPIC_COMP_DETAIL
by way of an all-rows scan with no residual conditions into Spool
1 (group_amps), which is built locally on the AMPs. Samples are
specified as a number of rows.
4) Finally, we send out an END TRANSACTION step to all AMPs involved
in processing the request.
-> The contents of Spool 1 are sent back to the user as the result of
statement 1.
來自 “ ITPUB部落格 ” ,連結:http://blog.itpub.net/16723161/viewspace-1040546/,如需轉載,請註明出處,否則將追究法律責任。
相關文章
- TOP N 查詢 SQLSQL
- rownum—top-N分析
- Dynamics CRM實體系列之1:N、N:1以及N:N關係
- 訊號與槽N對N
- 分組求TOP N記錄
- Storm常見模式2——TOP N介紹ORM模式
- 在MySQL中,如何實現Top N及M至N段的記錄查詢?MySql
- MySQL中資料型別(char(n)、varchar(n)、nchar(n)、nvarchar(n)的區別)MySql資料型別
- 階乘之和 輸入n,計算S=1!+2!+3!+…+n!的末6位(不含前導0)。n≤10 6 ,n!表示 前n個正整數之積。
- imp INDEXES=N CONSTRAINTS=NIndexAI
- 推薦系統入門(Top-N recommendation)
- LeetCode之N-Repeated Element in Size 2N Array(Kotlin)LeetCodeKotlin
- Linux 下vi 學習之查詢 - 向前 向後 n 及NLinux
- mysql的varchar(N)和int(N)的含義及其與char區別MySql
- 計算2的N次冪n 可輸入,n為自然數
- 關於宏定義 Bin(n),LongToBin(n),LongToBin(0x##n##L)
- 利用javascript獲取圖片的top N主色值JavaScript
- 降序索引應用在Top N問題上的應用索引
- 獲取top N cpu pid的sql資訊指令碼SQL指令碼
- VARCHAR2(N CHAR)與VARCHAR2(N)的區別[Oracle基礎]Oracle
- ${string::N}和${string:N}字元提取字元
- char(n)和varchar2(n)區別
- hive經典案列--top N(行轉列\列轉行)Hive
- C4top-N個數求和 (分數求和模擬)
- CSS perspective(n)CSS
- SqlServer N字首SQLServer
- NLP入門之N元語法模型模型
- N層結構與中介軟體(zt)
- 360N7對比N6 Pro的區別對比 60N7和360N6 Pro哪個好?
- ASE160N08-ASEMI低壓N溝道MOS管ASE160N08
- ASE180N08-ASEMI低壓N溝道MOS管ASE180N08
- Laravel 5 關聯查詢 —— N 對 N 簡單例子Laravel單例
- 面試官:MySQL 中 varchar(n) 中 n 最大取值為多少?面試MySql
- 【原創】生成n*n蛇形矩陣的演算法矩陣演算法
- 行列式求值,從 $n!$ 最佳化到 $n^3$
- n元語法
- N皇后問題
- NuxtI18nUX