概述
在sql的使用中,我們總是碰到需要刪除重複資料的情況,但是又不能全部刪除完,必須要保留至少一個重複的資料。重複的記錄根據兩個欄位a2,a3判斷(實際使用中可以擴充為多個)
實現過程
1. 建立表A,資料如下
a1 | a2 | a3 |
---|---|---|
1 | 1 | 1 |
2 | 1 | 2 |
3 | 2 | 2 |
4 | 2 | 2 |
5 | 3 | 3 |
6 | 2 | 2 |
2. 選擇重複的行:
select *,count(*)
from A group by a2,a3
having count(*)>1;
a1 | a2 | a3 | count(*) |
---|---|---|---|
3 | 2 | 2 | 3 |
使用in來找到我們想要的ID
SELECT *
FROM A
WHERE (a2,a3) IN
(SELECT A.`a2`,A.`a3`
FROM A
GROUP BY A.`a2`,A.`a3`
HAVING COUNT(*)>1)
a1 | a2 | a3 |
---|---|---|
3 | 2 | 2 |
4 | 2 | 2 |
6 | 2 | 2 |
那麼後面就很好辦了:
3. 選出要刪除的值:
SELECT *
FROM A
WHERE (a2, a3) IN
(SELECT `a2`,`a3`
FROM A
GROUP BY A.`a2`,A.`a3`
HAVING COUNT(*) > 1)
AND a1 NOT IN
(SELECT MIN(a1)
FROM A
GROUP BY A.`a2`,A.`a3`
HAVING COUNT(*) > 1) ;
-- 結果是保留a1最小的值,其他選項全部選出,
-- 請注意此時並不是將Select 改為delete就可以了,如果你直接這樣改的話,會報如下錯誤:
-- You can`t specify target table `A` for update in FROM clause
-- 該錯誤提示你,不能先select出同一表中的某些值,再update這個表(在同一語句中)。所以要稍微修改一下。
4. 刪除值
sql語句如下:
//建立中間表
CREATE TABLE F(a1 INTEGER,a2 INTEGER,a3 INTEGER);
//將要刪除的資料插入中間表
INSERT INTO F (
SELECT *
FROM A
WHERE (a2, a3) IN (SELECT `a2`,`a3`
FROM A GROUP BY A.`a2`,A.`a3`
HAVING COUNT(*) > 1)
AND a1 NOT IN
(SELECT MIN(a1) FROM A
GROUP BY A.`a2`,A.`a3`
HAVING COUNT(*) > 1)) ;
//刪除中間表
DELETE FROM A WHERE a1 IN (SELECT a1 FROM F);
SELECT *FROM A;
a1 | a2 | a3 |
---|---|---|
1 | 1 | 1 |
2 | 1 | 2 |
3 | 2 | 2 |
5 | 3 | 3 |
備註
如果說不用保留一行資料的話那麼就簡單多了,只需要一個很簡單的sql語句:
DELETE FROM A WHERE (a2,a3) IN (SELECT a2,a3 FROM A GROUP BY a2,a3 HAVING COUNT(*)>1)