【開發篇sql】 條件和表示式(九) group by語句

yellowlee發表於2010-05-17

9group by語句

Group by主要用來分組統計,這個是開發中經常被使用的語句,先來看幾個語句的執行情況:

例如求各部門內的最大工錢值,根據部門來分組統計:

SQL> select deptno,max(sal) from scott.emp group by deptno;

 

    DEPTNO   MAX(SAL)

---------- ----------

        30       2850

        20       3000

        10       5000

 

 

Execution Plan

----------------------------------------------------------

Plan hash value: 4067220884

 

---------------------------------------------------------------------------

| Id  | Operation          | Name | Rows  | Bytes | Cost (%CPU)| Time     |

---------------------------------------------------------------------------

|   0 | SELECT STATEMENT   |      |     3 |    21 |     4  (25)| 00:00:01 |

|   1 |  HASH GROUP BY     |      |     3 |    21 |     4  (25)| 00:00:01 |

|   2 |   TABLE ACCESS FULL| EMP  |    14 |    98 |     3   (0)| 00:00:01 |

---------------------------------------------------------------------------

 

 

Statistics

----------------------------------------------------------

          1  recursive calls

          0  db block gets

          3  consistent gets

          0  physical reads

          0  redo size

        524  bytes sent via SQL*Net to client

        400  bytes received via SQL*Net from client

          2  SQL*Net roundtrips to/from client

          0  sorts (memory)

          0  sorts (disk)

          3  rows processed

增加一個order by deptno子句

SQL> select deptno,max(sal) from scott.emp group by deptno order by deptno;

 

    DEPTNO   MAX(SAL)

---------- ----------

        10       5000

        20       3000

        30       2850

 

 

Execution Plan

----------------------------------------------------------

Plan hash value: 15469362

 

---------------------------------------------------------------------------

| Id  | Operation          | Name | Rows  | Bytes | Cost (%CPU)| Time     |

---------------------------------------------------------------------------

|   0 | SELECT STATEMENT   |      |     3 |    21 |     4  (25)| 00:00:01 |

|   1 |  SORT GROUP BY     |      |     3 |    21 |     4  (25)| 00:00:01 |

|   2 |   TABLE ACCESS FULL| EMP  |    14 |    98 |     3   (0)| 00:00:01 |

---------------------------------------------------------------------------

 

 

Statistics

----------------------------------------------------------

          1  recursive calls

          0  db block gets

          3  consistent gets

          0  physical reads

          0  redo size

        524  bytes sent via SQL*Net to client

        400  bytes received via SQL*Net from client

          2  SQL*Net roundtrips to/from client

          1  sorts (memory)

          0  sorts (disk)

          3  rows processed

增加一個order by max(sal)子句

SQL> select deptno,max(sal) from scott.emp group by deptno order by max(sal);

 

    DEPTNO   MAX(SAL)

---------- ----------

        30       2850

        20       3000

        10       5000

 

 

Execution Plan

----------------------------------------------------------

Plan hash value: 2664716850

 

----------------------------------------------------------------------------

| Id  | Operation           | Name | Rows  | Bytes | Cost (%CPU)| Time     |

----------------------------------------------------------------------------

|   0 | SELECT STATEMENT    |      |     3 |    21 |     5  (40)| 00:00:01 |

|   1 |  SORT ORDER BY      |      |     3 |    21 |     5  (40)| 00:00:01 |

|   2 |   HASH GROUP BY     |      |     3 |    21 |     5  (40)| 00:00:01 |

|   3 |    TABLE ACCESS FULL| EMP  |    14 |    98 |     3   (0)| 00:00:01 |

----------------------------------------------------------------------------

 

 

Statistics

----------------------------------------------------------

          1  recursive calls

          0  db block gets

          3  consistent gets

          0  physical reads

          0  redo size

        524  bytes sent via SQL*Net to client

        400  bytes received via SQL*Net from client

          2  SQL*Net roundtrips to/from client

          1  sorts (memory)

          0  sorts (disk)

          3  rows processed

上述三個語句略有不同,不過都是從全表掃描emp表開始執行,然後在此基礎上進行group by或者order by,具體方式有所不同,這裡僅僅把執行計劃和統計資訊給出來,具體在效能調整一章中進行詳細測試分析與敘述。

主要來看看group by的增強語句,例如:having 子句,grouping sets,roll up,cube,group_id

Having 子句

改進上述查詢語句,只需要工錢大於等於3000的記錄:

SQL>  select deptno,max(sal) from scott.emp group by deptno having max(sal) >=3000 order by max(sal);

 

    DEPTNO   MAX(SAL)

---------- ----------

        20       3000

        10       5000

 

 

Execution Plan

----------------------------------------------------------

Plan hash value: 3611938775

 

-----------------------------------------------------------------------------

| Id  | Operation            | Name | Rows  | Bytes | Cost (%CPU)| Time     |

-----------------------------------------------------------------------------

|   0 | SELECT STATEMENT     |      |     1 |     7 |     5  (40)| 00:00:01 |

|   1 |  SORT ORDER BY       |      |     1 |     7 |     5  (40)| 00:00:01 |

|*  2 |   FILTER             |      |       |       |            |          |

|   3 |    HASH GROUP BY     |      |     1 |     7 |     5  (40)| 00:00:01 |

|   4 |     TABLE ACCESS FULL| EMP  |    14 |    98 |     3   (0)| 00:00:01 |

-----------------------------------------------------------------------------

Rollup語句

Rollup可以理解為一維多層的統計,往往在需要按層次統計的時候用到。

修改一下需求,現在需要按照部門,分工作職位來統計工錢的總和,並求全公司之和,可以簡單的使用rollup來完成:

SQL> select a.deptno, b.job, sum(b.sal)

  2    from scott.dept a, scott.emp b

  3   where a.deptno = b.deptno

  4   group by rollup(a.deptno, b.job);

 

    DEPTNO JOB       SUM(B.SAL)

---------- --------- ----------

        10 CLERK           1300

        10 MANAGER         2450

        10 PRESIDENT       5000

        10                 8750

        20 CLERK           1900

        20 ANALYST         6000

        20 MANAGER         2000

        20                 9900

        30 CLERK            950

        30 MANAGER         2850

        30 SALESMAN        5600

 

    DEPTNO JOB       SUM(B.SAL)

---------- --------- ----------

        30                 9400

                          28050

 

13 rows selected.

可以看到使用rollup使得sum增加了聚合的級別,即實現了對多層進行分組統計計算。

現在稍稍修改一下需求,不需要對全公司進行統計,只需要計算各部門的總和和部門下不同工種的求和,那麼修改後的語句可以如下:

 

SQL> select a.deptno, b.job, sum(b.sal)

  2    from scott.dept a, scott.emp b

  3   where a.deptno = b.deptno

  4   group by a.deptno,rollup( b.job);

 

    DEPTNO JOB       SUM(B.SAL)

---------- --------- ----------

        10 CLERK           1300

        10 MANAGER         2450

        10 PRESIDENT       5000

        10                 8750

        20 CLERK           1900

        20 ANALYST         6000

        20 MANAGER         2000

        20                 9900

        30 CLERK            950

        30 MANAGER         2850

        30 SALESMAN        5600

 

    DEPTNO JOB       SUM(B.SAL)

---------- --------- ----------

        30                 9400

 

12 rows selected.

 

再增加一個工頭的欄位,需要知道不同的工頭下面工人的情況:

SQL> select a.deptno, b.job,b.mgr, sum(b.sal)

  2    from scott.dept a, scott.emp b

  3   where a.deptno = b.deptno

  4   group by a.deptno,rollup( b.job,b.mgr);

 

    DEPTNO JOB              MGR SUM(B.SAL)

---------- --------- ---------- ----------

        10 CLERK           7782       1300

        10 CLERK                      1300

        10 MANAGER         7839       2450

        10 MANAGER                    2450

        10 PRESIDENT                  5000

        10 PRESIDENT                  5000

        10                            8750

        20 CLERK           7788       1100

        20 CLERK           7902        800

        20 CLERK                      1900

        20 ANALYST         7566       6000

 

    DEPTNO JOB              MGR SUM(B.SAL)

---------- --------- ---------- ----------

        20 ANALYST                    6000

        20 MANAGER         7839       2000

        20 MANAGER                    2000

        20                            9900

        30 CLERK           7698        950

        30 CLERK                       950

        30 MANAGER         7839       2850

        30 MANAGER                    2850

        30 SALESMAN        7698       5600

        30 SALESMAN                   5600

        30                            9400

 

22 rows selected.

可以這樣理解上面rollup語句:

group by col1,rollup(col2,col3)會進行如下計算:

(col1,col2,col3)

(col1,col2)

(col1)

 

Cube語句

Cube的意思是立方,主要是用來進行多維度的統計的,Cube有時也可以代替rollup,比如group by col1,rollup(col2)就可以用group by col1,cube(col2)來代替,可以用下面的語句帶實現前面rollup的實現2

SQL> select a.deptno, b.job, sum(b.sal)

  2    from scott.dept a, scott.emp b

  3   where a.deptno = b.deptno

  4   group by a.deptno,cube( b.job);

 

    DEPTNO JOB       SUM(B.SAL)

---------- --------- ----------

        10                 8750

        10 CLERK           1300

        10 MANAGER         2450

        10 PRESIDENT       5000

        20                 9900

        20 CLERK           1900

        20 ANALYST         6000

        20 MANAGER         2000

        30                 9400

        30 CLERK            950

        30 MANAGER         2850

 

    DEPTNO JOB       SUM(B.SAL)

---------- --------- ----------

        30 SALESMAN        5600

 

12 rows selected.

可以看到除了結果集的排序不同以外,其他均相同。這時候使用rollupcube分別進行了如下的計算:

Rollup:

(col1,col2)

(col1)

Cube:

(col1)

(col1,col2)

注意上述表示式的順序

但是cubegroup by col1,cube(col2,col3)rollup差別較大,事實上它的計算如下:

(col1)

(col1,col3)

(col1,col2)

(col1,col2,col3)

group by cube(col1,col2,col3)則會計算23次方次,即維度為3.計算如下:

(col1),(col2),(col3),(col2,col3),(col1,col2),(col1,col3),(col1,col2,col3),()

 

Grouping sets語句

如果說rollupcubeoracle預定義了的計算維度,那麼grouping sets則可以理解為可以自己設定計算維度的一個表示式,用下面一個例子來看:

SQL> select a.deptno, b.job,b.mgr, sum(b.sal)

  2    from scott.dept a, scott.emp b

  3   where a.deptno = b.deptno

  4   group by a.deptno,cube( b.job,b.mgr)

  5  minus

  6  select a.deptno, b.job,b.mgr, sum(b.sal)

  7    from scott.dept a, scott.emp b

  8   where a.deptno = b.deptno

  9   group by a.deptno,grouping sets((b.job,b.mgr),(b.job),(b.mgr),());

 

no rows selected

說明了group by a.deptno,cube( b.job,b.mgr)

group by a.deptno,grouping sets((b.job,b.mgr),(b.job),(b.mgr),())的結果是完全一樣的,根據對cube的理解則grouping sets也很容易理解,不在累述。

 

Grouping_idgroup_id函式

對於使用cube或者rollup合作而後grouping sets的語句,可以使用group_id()或者grouping_id()來獲得不同的分組:

SQL> select a.deptno, b.job, sum(b.sal),group_id() g_id

  2    from scott.dept a, scott.emp b

  3   where a.deptno = b.deptno

  4   group by a.deptno,cube( a.deptno,b.job);

 

    DEPTNO JOB       SUM(B.SAL)       G_ID

---------- --------- ---------- ----------

        10 CLERK           1300          0

        10 MANAGER         2450          0

        10 PRESIDENT       5000          0

        20 CLERK           1900          0

        20 ANALYST         6000          0

        20 MANAGER         2000          0

        30 CLERK            950          0

        30 MANAGER         2850          0

        30 SALESMAN        5600          0

        10 CLERK           1300          1

        10 MANAGER         2450          1

 

    DEPTNO JOB       SUM(B.SAL)       G_ID

---------- --------- ---------- ----------

        10 PRESIDENT       5000          1

        20 CLERK           1900          1

        20 ANALYST         6000          1

        20 MANAGER         2000          1

        30 CLERK            950          1

        30 MANAGER         2850          1

        30 SALESMAN        5600          1

        10                 8750          0

        20                 9900          0

        30                 9400          0

        10                 8750          1

 

    DEPTNO JOB       SUM(B.SAL)       G_ID

---------- --------- ---------- ----------

        20                 9900          1

        30                 9400          1

 

24 rows selected.

 

SQL> select group_id() gp_id,

  2         grouping_id(a.deptno, b.job,b.mgr) gpp_id,

  3         a.deptno, b.job,b.mgr,sum(b.sal)

  4    from scott.dept a, scott.emp b

  5   where a.deptno = b.deptno

  6   group by a.deptno,grouping sets( a.deptno,b.job,b.mgr);

 

     GP_ID     GPP_ID     DEPTNO JOB              MGR SUM(B.SAL)

---------- ---------- ---------- --------- ---------- ----------

         0          2         20                 7839       2000

         0          2         10                 7839       2450

         0          2         30                 7698       6550

         0          2         20                 7566       6000

         0          2         10                 7782       1300

         0          2         20                 7902        800

         0          2         10                            5000

         0          2         30                 7839       2850

         0          2         20                 7788       1100

         0          1         10 CLERK                      1300

         0          1         10 MANAGER                    2450

 

     GP_ID     GPP_ID     DEPTNO JOB              MGR SUM(B.SAL)

---------- ---------- ---------- --------- ---------- ----------

         0          1         10 PRESIDENT                  5000

         0          3         10                            8750

         0          1         20 CLERK                      1900

         0          1         20 ANALYST                    6000

         0          1         20 MANAGER                    2000

         0          3         20                            9900

         0          1         30 CLERK                       950

         0          1         30 MANAGER                    2850

         0          1         30 SALESMAN                   5600

         0          3         30                            9400

 

21 rows selected.

來自 “ ITPUB部落格 ” ,連結:http://blog.itpub.net/16179598/viewspace-662904/,如需轉載,請註明出處,否則將追究法律責任。

相關文章