Oracle和MySQL分組查詢GROUP BY
Oracle和MySQL分組查詢GROUP BY
(一)Oracle和MySQL中的分組(GROUP BY)有什麼區別?
Oracle對於GROUP BY是嚴格的,所有要SELECT出來的欄位必須在GROUP BY後邊出現,否則會報錯:“ORA-00979: not a GROUP BY expression”。而MySQL則不同,如果SELECT出來的欄位在GROUP BY後面沒有出現,那麼會隨機取出一個值,而這樣查詢出來的資料不準確,語義也不明確。所以,作者建議在寫SQL語句的時候,應該給資料庫一個非常明確的指令,而不是讓資料庫去猜測,這也是寫SQL語句的一個非常良好的習慣。
下面給出一個示例。有一張T_MAX_LHR表,資料如下圖所示,有3個欄位ARTICLE、AUTHOR和PRICE。請選出每個AUTHOR的PRICE最高的記錄(要包含所有欄位)。
ARTICLE |
AUTHOR |
PRICE |
0001 |
B |
3.99 |
0002 |
A |
10.99 |
0003 |
C |
1.69 |
0004 |
B |
19.95 |
0005 |
A |
6.96 |
首先給出建表語句:
CREATE TABLE T_MAX_LHR (ARTICLE VARCHAR2(30),AUTHOR VARCHAR2(30),PRICE NUMBER); --Oracle
--CREATE TABLE T_MAX_LHR (ARTICLE VARCHAR(30),AUTHOR VARCHAR(30),PRICE FLOAT); --MySQL oracle通用
INSERT INTO T_MAX_LHR VALUES ('0001','B',3.99);
INSERT INTO T_MAX_LHR VALUES ('0002','A',10.99);
INSERT INTO T_MAX_LHR VALUES ('0003','C',1.69);
INSERT INTO T_MAX_LHR VALUES ('0004','B',19.95);
INSERT INTO T_MAX_LHR VALUES ('0005','A',6.96);
COMMIT;
SELECT * FROM T_MAX_LHR;
在Oracle中的資料:
LHR@orclasm > SELECT * FROM T_MAX_LHR;
ARTICLE AUTHOR PRICE
-------- -------- ----------
0001 B 3.99
0002 A 10.99
0003 C 1.69
0004 B 19.95
0005 A 6.96
在MySQL中的資料:
mysql> SELECT * FROM T_MAX_LHR;
+---------+--------+-------+
| ARTICLE | AUTHOR | PRICE |
+---------+--------+-------+
| 0001 | B | 3.99 |
| 0002 | A | 10.99 |
| 0003 | C | 1.69 |
| 0004 | B | 19.95 |
| 0005 | A | 6.96 |
+---------+--------+-------+
5 rows in set (0.00 sec)
分析資料後,正確答案應該是:
ARTICLE |
AUTHOR |
PRICE |
0002 |
A |
10.99 |
0003 |
C |
1.69 |
0004 |
B |
19.95 |
對於這個例子,很容易想到的SQL語句如下所示:
SELECT T.ARTICLE,T.AUTHOR,MAX(T.PRICE) FROM T_MAX_LHR T GROUP BY T.AUTHOR;
SELECT * FROM T_MAX_LHR T GROUP BY T.AUTHOR;
在Oracle中執行上面的SQL語句報錯:
LHR@orclasm > SELECT T.ARTICLE,T.AUTHOR,MAX(T.PRICE) FROM T_MAX_LHR T GROUP BY T.AUTHOR;
SELECT T.ARTICLE,T.AUTHOR,MAX(T.PRICE) FROM T_MAX_LHR T GROUP BY T.AUTHOR
*
ERROR at line 1:
ORA-00979: not a GROUP BY expression
LHR@orclasm > SELECT * FROM T_MAX_LHR T GROUP BY T.AUTHOR;
SELECT * FROM T_MAX_LHR T GROUP BY T.AUTHOR
*
ERROR at line 1:
ORA-00979: not a GROUP BY expression
在MySQL中執行同樣的SQL語句不會報錯:
mysql> select version();
+-------------------------------------------+
| version() |
+-------------------------------------------+
| 5.6.21-enterprise-commercial-advanced-log |
+-------------------------------------------+
mysql> select @@sql_mode;
+--------------------------------------------+
| @@sql_mode |
+--------------------------------------------+
| STRICT_TRANS_TABLES,NO_ENGINE_SUBSTITUTION |
+--------------------------------------------+
1 row in set (0.00 sec)
mysql> SELECT T.ARTICLE,T.AUTHOR,MAX(T.PRICE) FROM T_MAX_LHR T GROUP BY T.AUTHOR;
+---------+--------+--------------+
| ARTICLE | AUTHOR | MAX(T.PRICE) |
+---------+--------+--------------+
| 0002 | A | 10.99 |
| 0001 | B | 19.95 |
| 0003 | C | 1.69 |
+---------+--------+--------------+
3 rows in set (0.00 sec)
mysql> SELECT * FROM T_MAX_LHR T GROUP BY T.AUTHOR;
+---------+--------+-------+
| ARTICLE | AUTHOR | PRICE |
+---------+--------+-------+
| 0002 | A | 10.99 |
| 0001 | B | 3.99 |
| 0003 | C | 1.69 |
+---------+--------+-------+
3 rows in set (0.00 sec)
mysql> set sql_mode='STRICT_TRANS_TABLES,NO_ENGINE_SUBSTITUTION,ONLY_FULL_GROUP_BY';
Query OK, 0 rows affected (0.00 sec)
mysql> select @@sql_mode;
+---------------------------------------------------------------+
| @@sql_mode |
+---------------------------------------------------------------+
| ONLY_FULL_GROUP_BY,STRICT_TRANS_TABLES,NO_ENGINE_SUBSTITUTION |
+---------------------------------------------------------------+
1 row in set (0.00 sec)
mysql> SELECT T.ARTICLE,T.AUTHOR,MAX(T.PRICE) FROM T_MAX_LHR T GROUP BY T.AUTHOR;
ERROR 1055 (42000): 'lhrdb.T.ARTICLE' isn't in GROUP BY
mysql> SELECT * FROM T_MAX_LHR T GROUP BY T.AUTHOR;
ERROR 1055 (42000): 'lhrdb.T.ARTICLE' isn't in GROUP BY
可以看出,在MySQL 5.6中,雖然執行不報錯,可以查詢出資料,但是從結果來看資料並不是最終想要的結果,甚至資料是錯亂的。需要注意的是,在MySQL 5.7中,執行SQL語句“SELECT T.ARTICLE,T.AUTHOR,MAX(T.PRICE) FROM T_MAX_LHR T GROUP BY T.AUTHOR;”會報錯,因為在MySQL 5.7中,sql_mode的值中包含了ONLY_FULL_GROUP_BY,含義為對於GROUP BY聚合操作,如果在SELECT中的列沒有在GROUP BY中出現,那麼將認為這個SQL是不合法的,會丟擲錯誤。下面給出幾種正確的寫法(在Oracle和MySQL中均可執行):
(1)使用相關子查詢
SELECT *
FROM T_MAX_LHR T
WHERE (T.AUTHOR, T.PRICE) IN (SELECT NT.AUTHOR, MAX(NT.PRICE) PRICE
FROM T_MAX_LHR NT
GROUP BY NT.AUTHOR)
ORDER BY T.ARTICLE;
SELECT *
FROM T_MAX_LHR T
WHERE T.PRICE = (SELECT MAX(NT.PRICE) PRICE
FROM T_MAX_LHR NT
WHERE T.AUTHOR = NT.AUTHOR)
ORDER BY T.ARTICLE;
(2)使用非相關子查詢
SELECT T.*
FROM T_MAX_LHR T
JOIN (SELECT NT.AUTHOR, MAX(NT.PRICE) PRICE
FROM T_MAX_LHR NT
GROUP BY NT.AUTHOR) T1
ON T.AUTHOR = T1.AUTHOR
AND T.PRICE = T1.PRICE
ORDER BY T.ARTICLE;
(3)使用LEFT JOIN語句
SELECT T.*
FROM T_MAX_LHR T
LEFT OUTER JOIN T_MAX_LHR T1
ON T.AUTHOR = T1.AUTHOR
AND T.PRICE < T1.PRICE
WHERE T1.ARTICLE IS NULL
ORDER BY T.ARTICLE;
在Oracle中的執行結果:
LHR@orclasm > SELECT T.*
2 FROM T_MAX_LHR T
3 LEFT OUTER JOIN T_MAX_LHR T1
4 ON T.AUTHOR = T1.AUTHOR
5 AND T.PRICE < T1.PRICE
6 WHERE T1.ARTICLE IS NULL
7 ORDER BY T.ARTICLE;
ARTICLE AUTHOR PRICE
-------- -------- ----------
0002 A 10.99
0003 C 1.69
0004 B 19.95
在MySQL中的執行結果:
mysql> SELECT T.*
-> FROM T_MAX_LHR T
-> LEFT OUTER JOIN T_MAX_LHR T1
-> ON T.AUTHOR = T1.AUTHOR
-> AND T.PRICE < T1.PRICE
-> WHERE T1.ARTICLE IS NULL
-> ORDER BY T.ARTICLE;
+---------+--------+-------+
| ARTICLE | AUTHOR | PRICE |
+---------+--------+-------+
| 0002 | A | 10.99 |
| 0003 | C | 1.69 |
| 0004 | B | 19.95 |
+---------+--------+-------+
3 rows in set (0.00 sec)
& 說明:
有關GROUP BY的更多內容可以參考我的部落格:http://blog.itpub.net/26736162/viewspace-2142345/。
(二)Oracle和MySQL中的分組(GROUP BY)後的聚合函式分別是什麼?
在Oracle中,可以用WM_CONCAT函式或LISTAGG分析函式;在MySQL中可以使用GROUP_CONCAT函式。示例如下:
首先給出建表語句:
CREATE TABLE T_MAX_LHR (ARTICLE VARCHAR2(30),AUTHOR VARCHAR2(30),PRICE NUMBER); --Oracle
--CREATE TABLE T_MAX_LHR (ARTICLE VARCHAR(30),AUTHOR VARCHAR(30),PRICE FLOAT); --MySQL oracle通用
INSERT INTO T_MAX_LHR VALUES ('0001','B',3.99);
INSERT INTO T_MAX_LHR VALUES ('0002','A',10.99);
INSERT INTO T_MAX_LHR VALUES ('0003','C',1.69);
INSERT INTO T_MAX_LHR VALUES ('0004','B',19.95);
INSERT INTO T_MAX_LHR VALUES ('0005','A',6.96);
COMMIT;
SELECT * FROM T_MAX_LHR;
在MySQL中:
mysql> SELECT T.AUTHOR, GROUP_CONCAT(T.ARTICLE), GROUP_CONCAT(T.PRICE)
-> FROM T_MAX_LHR T
-> GROUP BY T.AUTHOR;
+--------+-------------------------+-----------------------+
| AUTHOR | GROUP_CONCAT(T.ARTICLE) | GROUP_CONCAT(T.PRICE) |
+--------+-------------------------+-----------------------+
| A | 0002,0005 | 10.99,6.96 |
| B | 0001,0004 | 3.99,19.95 |
| C | 0003 | 1.69 |
+--------+-------------------------+-----------------------+
3 rows in set (0.00 sec)
在Oracle中:
LHR@orclasm > SELECT T.AUTHOR, WM_CONCAT(T.ARTICLE) ARTICLE, WM_CONCAT(T.PRICE) PRICE
2 FROM T_MAX_LHR T
3 GROUP BY T.AUTHOR;
AUTHOR ARTICLE PRICE
-------- --------------- ---------------
A 0002,0005 10.99,6.96
B 0001,0004 3.99,19.95
C 0003 1.69
LHR@orclasm > SELECT T.AUTHOR,
2 LISTAGG(T.ARTICLE, ',') WITHIN GROUP(ORDER BY T.PRICE) ARTICLE,
3 LISTAGG(T.PRICE, ',') WITHIN GROUP(ORDER BY T.PRICE) PRICE
4 FROM T_MAX_LHR T
5 GROUP BY T.AUTHOR;
AUTHOR ARTICLE PRICE
-------- --------------- ---------------
A 0005,0002 6.96,10.99
B 0001,0004 3.99,19.95
C 0003 1.69
& 說明:
有關GROUP BY的更多內容可以參考我的部落格:http://blog.itpub.net/26736162/viewspace-2142345/。
原作者不知道是誰了,這個圖不是小麥苗畫的。
MySQL分組查詢group by使用示例
(1) group by的含義:將查詢結果按照1個或多個欄位進行分組,欄位值相同的為一組
(2) group by可用於單個欄位分組,也可用於多個欄位分組
select * from employee;
+------+------+--------+------+------+-------------+
| num | d_id | name | age | sex | homeaddr |
+------+------+--------+------+------+-------------+
| 1 | 1001 | 張三 | 26 | 男 | beijinghdq |
| 2 | 1002 | 李四 | 24 | 女 | beijingcpq |
| 3 | 1003 | 王五 | 25 | 男 | changshaylq |
| 4 | 1004 | Aric | 15 | 男 | England |
+------+------+--------+------+------+-------------+
select * from employee group by d_id,sex;
select * from employee group by sex;
+------+------+--------+------+------+------------+
| num | d_id | name | age | sex | homeaddr
|+------+------+--------+------+------+------------+
| 2 | 1002 | 李四 | 24 | 女 | beijingcpq |
| 1 | 1001 | 張三 | 26 | 男 | beijinghdq |
+------+------+--------+------+------+------------+
根據sex欄位來分組,sex欄位的全部值只有兩個('男'和'女'),所以分為了兩組 當group by單獨使用時,只顯示出每組的第一條記錄 所以group by單獨使用時的實際意義不大
group by + group_concat()
(1) group_concat(欄位名)可以作為一個輸出欄位來使用,
(2) 表示分組之後,根據分組結果,使用group_concat()來放置每一組的某欄位的值的集合
select sex from employee group by sex;
+------+
| sex |
+------+
| 女 |
| 男 |
+------+
select sex,group_concat(name) from employee group by sex;
+------+--------------------+
| sex | group_concat(name) |
+------+--------------------+
| 女 | 李四 |
| 男 | 張三,王五,Aric |
+------+--------------------+
select sex,group_concat(d_id) from employee group by sex;
+------+--------------------+
| sex | group_concat(d_id) |
+------+--------------------+
| 女 | 1002 |
| 男 | 1001,1003,1004 |
+------+--------------------+
group by + 集合函式
(1) 通過group_concat()的啟發,我們既然可以統計出每個分組的某欄位的值的集合,那麼我們也可以通過集合函式來對這個"值的集合"做一些操作
select sex,group_concat(age) from employee group by sex;
+------+-------------------+
| sex | group_concat(age) |
+------+-------------------+
| 女 | 24 |
| 男 | 26,25,15 |
+------+-------------------+
分別統計性別為男/女的人年齡平均值
select sex,avg(age) from employee group by sex;
+------+----------+
| sex | avg(age) |
+------+----------+
| 女 | 24.0000 |
| 男 | 22.0000 |
+------+----------+
分別統計性別為男/女的人的個數
select sex,count(sex) from employee group by sex;
+------+------------+
| sex | count(sex) |
+------+------------+
| 女 | 1 |
| 男 | 3 |
+------+------------+
group by + having
(1) having 條件表示式:用來分組查詢後指定一些條件來輸出查詢結果
(2) having作用和where一樣,但having只能用於group by
select sex,count(sex) from employee group by sex having count(sex)>2;
+------+------------+
| sex | count(sex) |
+------+------------+
| 男 | 3 |
+------+------------+
group by + with rollup
(1) with rollup的作用是:在最後新增一行,來記錄當前列裡所有記錄的總和
select sex,count(age) from employee group by sex with rollup;
+------+------------+