Deleting Duplicate Data in Oracle

Date: 2018-07-17 16:22:21

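Background: there are two databases (a source database and a target database), and every day data from the source is synchronized into the target. For various reasons, and for fear of losing data, each run re-synchronizes the data from around the past 8 days (the table had a primary key, so duplicates were not a concern; the volume is a bit over a hundred thousand rows per day, and the table already holds 60 million rows). At some unknown point, however, a colleague dropped the primary key by mistake.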

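As a result, the figures in the BI reports were absurdly high. After a fair amount of trial and error the problem was solved. The approaches are summarized below: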

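1) Flashback: Oracle has flashback technology, and the recyclebin (recycle bin) can be queried for the dropped primary key. The duplicate rows have to be deleted first, though, because a primary key cannot be put back on a table that still contains duplicates.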
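A minimal sketch of how the recycle bin can be inspected, using the standard USER_RECYCLEBIN view. Whether anything related to the dropped key shows up there depends on how it was dropped, so treat this as a way to look rather than a guaranteed recovery path.

```sql
-- List what the current schema's recycle bin holds, newest first.
SELECT object_name, original_name, type, droptime
  FROM user_recyclebin
 ORDER BY droptime DESC;
```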

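2) Use rowid to find the duplicate rows and delete every copy except the one with the smallest rowid (表 stands for the table name):

delete from 表 a where (a.Id,a.seq) in(select Id,seq from 表 group by Id,seq having count(*)> 1) and rowid not in (select min(rowid) from 表 group by Id,seq having count(*)>1)

This DML statement is a nightmare: because of the "not in", use it with caution on large tables.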
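One common way to avoid the "not in" is to rank the copies with an analytic function and delete everything after the first. This is a different technique from the statement above, not the one used in this post, and it has not been timed on the 60-million-row table. It assumes the same placeholder table 表 and key columns Id and seq.

```sql
-- Keep the row with the smallest rowid in each (Id, seq) group
-- and delete the remaining copies.
DELETE FROM 表
 WHERE rowid IN (
         SELECT rid
           FROM (SELECT rowid AS rid,
                        ROW_NUMBER() OVER (PARTITION BY Id, seq
                                           ORDER BY rowid) AS rn
                   FROM 表)
          WHERE rn > 1
       );

COMMIT;
```

The table is scanned once to number the rows, and only rows with rn > 1 are deleted, so groups without duplicates are untouched.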

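3) This is the method that was actually used in practice. Its performance was acceptable: the deletion took about 5 minutes. The steps are as follows: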

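1. Find the duplicate rows in the table

select * from 表1 a where (a.Id,a.seq) in(select Id,seq from 表1 group by Id,seq having count(*)> 1) (表1 is the table being cleaned; a.Id and a.seq are the primary key columns that now contain duplicates)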

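2. Create a staging table

create table lsb as select * from 表1 a where (a.Id,a.seq) in(select Id,seq from 表1 group by Id,seq having count(*)> 1); commit ; (this gives lsb the same structure as 表1; create table is DDL and commits implicitly, so the explicit commit is harmless but optional)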

3. Delete the duplicated rows from 表1 (all copies; one copy of each is restored from lsb in step 5)

delete from 表1 a where (a.Id,a.seq) in(select Id,seq from 表1 group by Id,seq having count(*)> 1) ;

commit;

4. Select the row with the smallest rowid in each group from lsb

select * from lsb a where a.rowid in(select min(rowid) from lsb group by Id,seq having count(*)> 1)

5. Insert the selected rows back into 表1

insert into 表1 select * from lsb a where a.rowid in(select min(rowid) from lsb group by Id,seq having count(*)> 1) ;

commit;

6. Drop the staging table: drop table lsb;
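Once these steps have run, it may be worth confirming that no duplicates remain and then putting the primary key back so the problem cannot recur. The sketch below assumes the key is (Id, seq), as in the queries above. The constraint name pk_demo is hypothetical.

```sql
-- Should return no rows if the cleanup worked.
SELECT Id, seq, COUNT(*) AS copies
  FROM 表1
 GROUP BY Id, seq
HAVING COUNT(*) > 1;

-- Re-create the primary key (pk_demo is a placeholder name).
-- This fails if any duplicate (Id, seq) pair is still present.
ALTER TABLE 表1 ADD CONSTRAINT pk_demo PRIMARY KEY (Id, seq);
```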

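4) The complete script

create table lsb as select * from 表1 a where (a.Id,a.seq) in(select Id,seq from 表1 group by Id,seq having count(*)> 1); -- a global temporary table could also be used and may be faster, since it generates less redo; it would need ON COMMIT PRESERVE ROWS so that the rows survive the commits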

commit ;

delete from 表1 a where (a.Id,a.seq) in(select Id,seq from 表1 group by Id,seq having count(*)> 1) ;

commit;

insert into 表1 select * from lsb a where a.rowid in(select min(rowid) from lsb group by Id,seq having count(*)> 1) ;

commit;

drop table lsb;
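The temporary-table variant mentioned in the script comment could look roughly like the following. It is a sketch under assumptions, not something measured in this post: lsb_tmp is a hypothetical name, and the delete and insert are kept in one transaction so that a failure in between can be rolled back.

```sql
-- ON COMMIT PRESERVE ROWS keeps the staged rows until the session
-- truncates the table or ends (lsb_tmp is a placeholder name).
create global temporary table lsb_tmp on commit preserve rows as
  select * from 表1 a
   where (a.Id, a.seq) in
         (select Id, seq from 表1 group by Id, seq having count(*) > 1);

-- Remove every copy of the duplicated keys from the main table.
delete from 表1 a
 where (a.Id, a.seq) in
       (select Id, seq from 表1 group by Id, seq having count(*) > 1);

-- Put back exactly one copy per key: the one with the smallest rowid.
insert into 表1
  select * from lsb_tmp a
   where a.rowid in (select min(rowid) from lsb_tmp group by Id, seq);

commit;

-- A session that still holds rows in the temporary table cannot drop it,
-- so empty it first.
truncate table lsb_tmp;
drop table lsb_tmp;
```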


Original article: https://www.cnblogs.com/zengchenri/p/9323105.html
