視窗函式 row_number 去重複

張博208發表於2017-02-15

ROW_NUMBER 的使用基本上倒不會遺忘了, 這是很久以前的程式碼庫了, 整理出來對初學者有幫助.

如果去繼續研究研究 SQL Server 2012 中的Window Function 新特性, 就會發現裡面還有更多很強大的東東, 很多內容和這裡的 ROW_NUMBER 的概念或者使用方式有些相似.

理解了ROW_NUMBER 以及 PARTITION BY 的使用再去看 2012的 Windows Function 新特性的話會比較容易些.

-- ROW_NUMBER function
 
DECLARE @DEMO TABLE
(
   Name  VARCHAR(15),
   ADDR1 VARCHAR(50),
   ADDR2 VARCHAR(50),
   CITY  VARCHAR(50),
   ZIP   CHAR(6)
)
-- Insert testing records
INSERT INTO @DEMO VALUES
('Zhang San','Leshan Road 33','','Shanghai','200081'),
('Li Si','Zhaojiabang Road 23','Nan song Street 9002','Shanghai','200083'),
('Wang Wu','Huashan Road 27','Weihai zhong Road 1039','Shanghai','200081'),
('Sun liu','Jiaoda dong Round 12','Gao liangqiao xiejie 19','Beijing','100023'),
('Yang Qi','Zhong guan cun nan A39','','Beijing','100009'),
('Zhu Ba','Cuihua nan 30','haiding road 32','Beijing','100103'),
('Huang Jiu','Lv you Road 33','Huan dong jie 39','Wuhan','420021')
 
-- Original query
SELECT *
FROM @DEMO

-- Only to set row number for each record.
SELECT ROW_NUMBER() OVER(ORDER BY addr.city) AS 'ID',
       addr.Name,
       addr.ADDR1,
       addr.CITY,
       addr.ZIP
FROM @DEMO AS addr

-- Set group first, then set row number for each record under the group - city
SELECT ROW_NUMBER() OVER (PARTITION BY addr.CITY
                          ORDER BY addr.CITY) AS 'ID',
       addr.Name,
       addr.ADDR1,
       addr.CITY,
       addr.ZIP
FROM @DEMO AS addr

-- Set group first, then set row number for each record under the group - city and zip
SELECT ROW_NUMBER() OVER (PARTITION BY addr.CITY,
                                       addr.ZIP
                          ORDER BY addr.CITY) AS 'ID',
       addr.Name,
       addr.ADDR1,
       addr.CITY,
       addr.ZIP
FROM @DEMO AS addr


通過了上面的這個小例子,再看看去重就很簡單了.

-- Remove duplicated records
 
DECLARE @DupDemo TABLE
(
   A VARCHAR(15),
   B VARCHAR(15),
   C VARCHAR(15)
)
 
-- Insert testing records
INSERT INTO @DupDemo VALUES
('1','a','2'),
('1','a','1'),
('2','c','6'),
('2','c','3'),
('1','a','1'),
('3','b','3')
 
-- Original Records
SELECT *
FROM @DupDemo

-- Show the duplicated records, only to check column A and B.
SELECT ROW_NUMBER() OVER(PARTITION BY A,
                                      B
                    ORDER BY C) AS ID,
             A,
             B,
             C
      FROM @DupDemo

-- Delete the duplicated records
DELETE D
FROM(
      SELECT ROW_NUMBER() OVER(PARTITION BY A,
                                            B
                               ORDER BY C) AS ID,
             A,
             B,
             C
      FROM @DupDemo
)D
WHERE D.ID > 1
 
-- After delete operation
SELECT *
FROM @DupDemo

-- Show DEMO 2
DELETE FROM
@DupDemo
 
-- Insert test records
INSERT INTO @DupDemo VALUES
('1','a','2'),
('1','a','1'),
('2','c','6'),
('2','c','3'),
('1','a','1'),
('3','b','3');
 
-- With TEMP table
WITH Dup AS
(
   SELECT ROW_NUMBER() OVER(PARTITION BY A,
                                            B
                               ORDER BY C) AS ID,
             A,
             B,
             C
      FROM @DupDemo
)
 
DELETE FROM
Dup
WHERE ID > 1;
 
-- Show records after delete operation.
SELECT *
FROM @DupDemo


語句:

Delete T From
(Select Row_Number() Over(Partition By [ID],[Name],[Age],[Sex] order By [ID]) As RowNumber,* From #Temp)T
Where T.RowNumber > 1


相關文章