Oracle統計分析函式集,over(partition by..) 的運用

rainlover發表於2010-10-10

oracle的分析函式over 及開窗函式
一:分析函式over
Oracle從8.1.6開始提供分析函式,分析函式用於計算基於組的某種聚合值,它和聚合函式的不同之處是
對於每個組返回多行,而聚合函式對於每個組只返回一行。
下面通過幾個例子來說明其應用。                                      
1:統計某商店的營業額。       
     date       sale
     1           20
     2           15
     3           14
     4           18
     5           30
    規則:按天統計:每天都統計前面幾天的總額
    得到的結果:
    DATE   SALE       SUM
    ----- -------- ------
    1      20        20           --1天          
    2      15        35           --1天+2天          
    3      14        49           --1天+2天+3天          
    4      18        67            .         
    5      30        97            .
    
2:統計各班成績第一名的同學資訊
    NAME   CLASS S                        
    ----- ----- ----------------------
    fda    1      80                    
    ffd    1      78                    
    dss    1      95                    
    cfe    2      74                    
    gds    2      92                    
    gf     3      99                    
    ddd    3      99                    
    adf    3      45                    
    asdf   3      55                    
    3dd    3      78             
  
    通過:  
    --
    select * from                                                                      
    (                                                                           
    select name,class,s,rank()over(partition by class order by s desc) mm from t2
    )                                                                           
    where mm=1
    --
    得到結果:
    NAME   CLASS S                       MM                                                                                       
    ----- ----- ---------------------- ----------------------
    dss    1      95                      1                     
    gds    2      92                      1                     
    gf     3      99                      1                     
    ddd    3      99                      1         
  
    注意:
    1.在求第一名成績的時候,不能用row_number(),因為如果同班有兩個並列第一,row_number()只返回一個結果         
    2.rank()和dense_rank()的區別是:
      --rank()是跳躍排序,有兩個第二名時接下來就是第四名
      --dense_rank()l是連續排序,有兩個第二名時仍然跟著第三名
    
    
3.分類統計 (並顯示資訊)
    A   B   C                     
    -- -- ----------------------
    m   a   2                     
    n   a   3                     
    m   a   2                     
    n   b   2                     
    n   b   1                     
    x   b   3                     
    x   b   2                     
    x   b   4                     
    h   b   3
   select a,c,sum(c)over(partition by a) from t2               
   得到結果:
   A   B   C        SUM(C)OVER(PARTITIONBYA)     
   -- -- ------- ------------------------
   h   b   3        3                       
   m   a   2        4                       
   m   a   2        4                       
   n   a   3        6                       
   n   b   2        6                       
   n   b   1        6                       
   x   b   3        9                       
   x   b   2        9                       
   x   b   4        9                       

   如果用sum,group by 則只能得到
   A   SUM(C)                           
   -- ----------------------
   h   3                     
   m   4                     
   n   6                     
   x   9                     
   無法得到B列值      

=====
select * from test

資料:
A B C
1 1 1
1 2 2
1 3 3
2 2 5
3 4 6


---將B欄位值相同的對應的C 欄位值加總
select a,b,c, SUM(C) OVER (PARTITION BY B) C_Sum
from test

A B C C_SUM
1 1 1 1
1 2 2 7
2 2 5 7
1 3 3 3
3 4 6 6

---如果不需要已某個欄位的值分割,那就要用 null

eg: 就是將C的欄位值summary 放在每行後面

select a,b,c, SUM(C) OVER (PARTITION BY null) C_Sum
from test

A B C C_SUM
1 1 1 17
1 2 2 17
1 3 3 17
2 2 5 17
3 4 6 17

求個人工資佔部門工資的百分比

SQL> select * from salary;

NAME DEPT SAL
---------- ---- -----
a 10 2000
b 10 3000
c 10 5000
d 20 4000

SQL> select name,dept,sal,sal*100/sum(sal) over(partition by dept) percent from salary;

NAME DEPT SAL PERCENT
---------- ---- ----- ----------
a 10 2000 20
b 10 3000 30
c 10 5000 50
d 20 4000 100

二:開窗函式          
      開窗函式指定了分析函式工作的資料視窗大小,這個資料視窗大小可能會隨著行的變化而變化,舉例如下:
1:    
   over(order by salary) 按照salary排序進行累計,order by是個預設的開窗函式
   over(partition by deptno)按照部門分割槽
2:
over(order by salary range between 5 preceding and 5 following)
   每行對應的資料視窗是之前行幅度值不超過5,之後行幅度值不超過5
   例如:對於以下列
     aa
     1
     2
     2
     2
     3
     4
     5
     6
     7
     9
  
   sum(aa)over(order by aa range between 2 preceding and 2 following)
   得出的結果是
            AA                       SUM
            ---------------------- -------------------------------------------------------
            1                       10                                                     
            2                       14                                                     
            2                       14                                                     
            2                       14                                                     
            3                       18                                                     
            4                       18                                                     
            5                       22                                                     
            6                       18                                                               
            7                       22                                                               
            9                       9                                                                
            
   就是說,對於aa=5的一行 ,sum為   5-1<=aa<=5+2 的和
   對於aa=2來說 ,sum=1+2+2+2+3+4=14     ;
   又如 對於aa=9 ,9-1<=aa<=9+2 只有9一個數,所以sum=9    ;
             
3:其它:
     over(order by salary rows between 2 preceding and 4 following)
          每行對應的資料視窗是之前2行,之後4行
4:下面三條語句等效:          
     over(order by salary rows between unbounded preceding and unbounded following)
          每行對應的資料視窗是從第一行到最後一行,等效:
     over(order by salary range between unbounded preceding and unbounded following)
           等效
     over(partition by null)

sum(nid) over(partition by v1 order by nid)(2007-05-16 16:22:48)
  分類:sql語句
 

SQL> select n1,v1,nid,sum(nid) over(order by nid) as sum
  2  from t1;

        N1 V1                NID        SUM
---------- ---------- ---------- ----------
         1 aa                 61         61
         2 aa                 62        123
         3 aa                 63        186
         4 aa                 64        250

取nid列的累積和,即下面以emp表為例的按部門“連續”求總和

 ==================================================================

按v1分組取nid的和

SQL> select v1,sum(nid) over (partition by v1 order by v1) as sum_nid from t1;

V1            SUM_NID
---------- ----------
aa                187
aa                187
aa                187
bb                 83

按v1分組取nid的和,並重復行只顯示一行

SQL> select distinct * from (select v1,sum(nid) over (partition by v1) as sum_nid from t1);

V1            SUM_NID
---------- ----------
aa                187
bb                 83

==================================================================

再以emp為例

使用子分割槽查出各部門薪水連續的總和。注意按部門分割槽 over(...)條件的不同
sum(sal) over (partition by deptno order by ename) 按部門“連續”求總和
sum(sal) over (partition by deptno) 按部門求總和
sum(sal) over (order by deptno,ename) 不按部門“連續”求總和
sum(sal) over () 不按部門,求所有員工總和,效果等同於sum(sal)。

sql> break on deptno skip 1  -- 為效果更明顯,把不同部門的資料隔段顯示。

SQL> select deptno,ename,sal,
  2      sum(sal) over (partition by deptno order by ename) 部門連續求和,
  3      sum(sal) over (partition by deptno) 部門總和,
  4      100*round(sal/sum(sal) over (partition by deptno),4) 部門份額,
  5      sum(sal) over () 總和,
  6      sum(sal) over (order by deptno,ename) 連續求和,
  7      100*round(sal/sum(sal) over (),4) 總份額
  8  from emp;

    DEPTNO ENAME             SAL 部門連續求和   部門總和   部門份額       總和   連續求和     總份額

---------- ---------- ---------- ------------ ---------- ---------- ---------- ---------- ----------

        10 CLARK            2450         2450       8750         28      29025       2450       8.44

           KING             5000         7450       8750      57.14      29025       7450      17.23

           MILLER           1300         8750       8750      14.86      29025       8750       4.48


        20 ADAMS            1100         1100      10875      10.11      29025       9850       3.79

           FORD             3000         4100      10875      27.59      29025      12850      10.34

           JONES            2975         7075      10875      27.36      29025      15825      10.25

           SCOTT            3000        10075      10875      27.59      29025      18825      10.34

           SMITH             800        10875      10875       7.36      29025      19625       2.76


        30 ALLEN            1600         1600       9400      17.02      29025      21225       5.51

           BLAKE            2850         4450       9400      30.32      29025      24075       9.82

           JAMES             950         5400       9400      10.11      29025      25025       3.27

           MARTIN           1250         6650       9400       13.3      29025      26275       4.31

           TURNER           1500         8150       9400      15.96      29025      27775       5.17

           WARD             1250         9400       9400       13.3      29025      29025       4.31

已選擇14行。

綜合的例子,求和規則有按部門分割槽的,有不分割槽的例子

SQL> select deptno,ename,sum(sal) over(partition by deptno order by sal) as sum_sal,
  2         sum(sal) over(order by deptno,sal) as sum_dept_sal
  3  from emp;

    DEPTNO ENAME         SUM_SAL SUM_DEPT_SAL
---------- ---------- ---------- ------------
        10 MILLER           1300         1300
           CLARK            3750         3750
           KING             8750         8750

        20 SMITH             800         9550
           ADAMS            1900        10650
           JONES            4875        13625
           SCOTT           10875        19625
           FORD            10875        19625

        30 JAMES             950        20575
           WARD             3450        23075
           MARTIN           3450        23075
           TURNER           4950        24575
           ALLEN            6550        26175
           BLAKE            9400        29025


已選擇14行。

來一個逆序的,即部門從大到小排列,部門裡各員工的薪水從高到低排列,累計和的規則不變。

SQL> select deptno,ename,sal,
  2         sum(sal) over (partition by deptno order by deptno desc,sal desc) as sum_sal_order,
  3         sum(sal) over (order by deptno desc,sal desc) as sum
  4  from emp;

    DEPTNO ENAME             SAL SUM_SAL_ORDER        SUM
---------- ---------- ---------- ------------- ----------
        30 BLAKE            2850          2850       2850
           ALLEN            1600          4450       4450
           TURNER           1500          5950       5950
           WARD             1250          8450       8450
           MARTIN           1250          8450       8450
           JAMES             950          9400       9400

        20 SCOTT            3000          6000      15400
           FORD             3000          6000      15400
           JONES            2975          8975      18375
           ADAMS            1100         10075      19475
           SMITH             800         10875      20275

        10 KING             5000          5000      25275
           CLARK            2450          7450      27725
           MILLER           1300          8750      29025


已選擇14行。

來自 “ ITPUB部落格 ” ,連結:http://blog.itpub.net/11320622/viewspace-675686/,如需轉載,請註明出處,否則將追究法律責任。

相關文章