oracle9i以後排重的好辦法

wangzh3發表於2007-02-03

表中排重的常規做法有:

1 distinct 受限於表的資料量的大小和機器配置

2 rowid 方法.

delete from table where rowid in (select a.rowid from table a,table b where

a.col1=b.col1 and a.col2=b.col2 ...and a.rowid

這個相對來說是效率比較高的.

最近看書,發現9i及以後的分析函式,效率更高.

使用如下的例子進行測試.

sms_sys_param_test 表裡面有260萬,找出不重複的只用12s(機器10.40.95.105)

sms_sys_param_test的資料裡面不重複的是91條<wbr>,來源於sms_sys_param.
採用 insert into sms_sys_param_test select * from sms_sys_param_test反覆複製.沒有建任何所引.


select * from
(select sms_sys_param_test.*,row<wbr>_number() over(partition by PARAID order by rowid) rn from sms_sys_param_test)

where rn=1;


如果重複的資料很多,可以採用如下方式排重.
create table newtable_name
as select * from
(select sms_sys_param_test.*,row<wbr>_number() over(partition by PARAID order by rowid) rn from sms_sys_param_test)

where rn=1;

truncate table sms_sys_param_test;

insert into sms_sys_param_test select .... from newtable_name;

drop table newtable_name;


如果很少,可以 把上述的rn=1改為rn>1 .

delete from sms_sys_param_test where rowid in (
select rowid from (select rowid,sms_sys_param_test.*,row<wbr>_number() over(partition by PARAID order by rowid) rn from sms_sys_param_test)
where rn>1)

刪除重複的記錄.

參考文章:

[@more@]

來自 “ ITPUB部落格 ” ,連結:http://blog.itpub.net/47869/viewspace-897149/,如需轉載,請註明出處,否則將追究法律責任。

相關文章