标签:merge join sort join 排序合并连接
一. sort merge joins连接(排序合并连接) 原理
SQL> CREATE TABLE t1 ( 2 id NUMBER NOT NULL, 3 n NUMBER, 4 pad VARCHAR2(4000), 5 CONSTRAINT t1_pk PRIMARY KEY(id) 6 ); Table created. SQL> CREATE TABLE t2 ( 2 id NUMBER NOT NULL, 3 t1_id NUMBER NOT NULL, 4 n NUMBER, 5 pad VARCHAR2(4000), 6 CONSTRAINT t2_pk PRIMARY KEY(id), 7 CONSTRAINT t2_t1_fk FOREIGN KEY (t1_id) REFERENCES t1 8 ); Table created. SQL> CREATE TABLE t3 ( 2 id NUMBER NOT NULL, 3 t2_id NUMBER NOT NULL, 4 n NUMBER, 5 pad VARCHAR2(4000), 6 CONSTRAINT t3_pk PRIMARY KEY(id), 7 CONSTRAINT t3_t2_fk FOREIGN KEY (t2_id) REFERENCES t2 8 ); Table created. SQL> CREATE TABLE t4 ( 2 id NUMBER NOT NULL, 3 t3_id NUMBER NOT NULL, 4 n NUMBER, 5 pad VARCHAR2(4000), 6 CONSTRAINT t4_pk PRIMARY KEY(id), 7 CONSTRAINT t4_t3_fk FOREIGN KEY (t3_id) REFERENCES t3 8 ); Table created. SQL> execute dbms_random.seed(0) PL/SQL procedure successfully completed. SQL> INSERT INTO t1 SELECT rownum, rownum, dbms_random.string('a',50) FROM dual CONNECT BY level <= 10 ORDER BY dbms_random.random; 10 rows created. SQL> INSERT INTO t2 SELECT 100+rownum, t1.id, 100+rownum, t1.pad FROM t1, t1 dummy ORDER BY dbms_random.random; 100 rows created. SQL> INSERT INTO t3 SELECT 1000+rownum, t2.id, 1000+rownum, t2.pad FROM t2, t1 dummy ORDER BY dbms_random.random; 1000 rows created. SQL> INSERT INTO t4 SELECT 10000+rownum, t3.id, 10000+rownum, t3.pad FROM t3, t1 dummy ORDER BY dbms_random.random; 10000 rows created. SQL> COMMIT; Commit complete.使用 hint 让执行计划以 T3 作为驱动表
SQL> select /*+ leading(t3) use_merge(t4) */ * 2 from t3, t4 3 where t3.id = t4.t3_id and t3.n = 1100; 10 rows selected. SQL> select * from table(dbms_xplan.display_cursor(null,null,'allstats last')); PLAN_TABLE_OUTPUT ---------------------------------------------------------------------------------------------------------------------------- ---------------------------------------------------------------------------------------------------------------------------- SQL_ID g0rdyg9hdh9m0, child number 0 ------------------------------------- select /*+ leading(t3) use_merge(t4) */ * from t3, t4 where t3.id = t4.t3_id and t3.n = 1100 Plan hash value: 3831111046 ----------------------------------------------------------------------------------------------------------------- | Id | Operation | Name | Starts | E-Rows | A-Rows | A-Time | Buffers | OMem | 1Mem | Used-Mem | ----------------------------------------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 1 | | 10 |00:00:00.02 | 119 | | | | | 1 | MERGE JOIN | | 1 | 10 | 10 |00:00:00.02 | 119 | | | | | 2 | SORT JOIN | | 1 | 1 | 1 |00:00:00.01 | 15 | 2048 | 2048 | 2048 (0)| |* 3 | TABLE ACCESS FULL| T3 | 1 | 1 | 1 |00:00:00.01 | 15 | | | | |* 4 | SORT JOIN | | 1 | 10000 | 10 |00:00:00.02 | 104 | 974K| 535K| 865K (0)| | 5 | TABLE ACCESS FULL| T4 | 1 | 10000 | 10000 |00:00:00.01 | 104 | | | | ----------------------------------------------------------------------------------------------------------------- Predicate Information (identified by operation id): --------------------------------------------------- 3 - filter("T3"."N"=1100) 4 - access("T3"."ID"="T4"."T3_ID") filter("T3"."ID"="T4"."T3_ID")使用 hint 让执行计划以 T4 作为驱动表
SQL> select /*+ leading(t4) use_merge(t3) */ * 2 from t3, t4 3 where t3.id = t4.t3_id and t3.n = 1100; 10 rows selected. SQL> select * from table(dbms_xplan.display_cursor(null,null,'allstats last')); PLAN_TABLE_OUTPUT ---------------------------------------------------------------------------------------------------------------------------------- ---------------------------------------------------------------------------------------------------------------------------------- SQL_ID gxuwn06y1c1az, child number 0 ------------------------------------- select /*+ leading(t4) use_merge(t3) */ * from t3, t4 where t3.id = t4.t3_id and t3.n = 1100 Plan hash value: 875334572 ----------------------------------------------------------------------------------------------------------------- | Id | Operation | Name | Starts | E-Rows | A-Rows | A-Time | Buffers | OMem | 1Mem | Used-Mem | ----------------------------------------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 1 | | 10 |00:00:00.04 | 119 | | | | | 1 | MERGE JOIN | | 1 | 10 | 10 |00:00:00.04 | 119 | | | | | 2 | SORT JOIN | | 1 | 10000 | 1001 |00:00:00.04 | 104 | 974K| 535K| 865K (0)| | 3 | TABLE ACCESS FULL| T4 | 1 | 10000 | 10000 |00:00:00.01 | 104 | | | | |* 4 | SORT JOIN | | 1001 | 1 | 10 |00:00:00.01 | 15 | 2048 | 2048 | 2048 (0)| |* 5 | TABLE ACCESS FULL| T3 | 1 | 1 | 1 |00:00:00.01 | 15 | | | | ----------------------------------------------------------------------------------------------------------------- Predicate Information (identified by operation id): --------------------------------------------------- 4 - access("T3"."ID"="T4"."T3_ID") filter("T3"."ID"="T4"."T3_ID") 5 - filter("T3"."N"=1100)从返回的执行计划结果中我们可以看到:
2. 以 T3 为驱动表时, T3 访问一次, T4 也是访问一次; 以 T4 为驱动表时, T4 访问一次, T3 也是访问一次
3. 需要排序, 如果 PGA 空间重足时在 PGA 中排序, 不如果不足则交换到磁盘上排序
SQL> select /*+ leading(t3) use_merge(t4) */ * 2 from t3, t4 3 where t3.id = t4.t3_id and t3.n = 1100 and t4.n = 10034; SQL> select * from table(dbms_xplan.display_cursor(null,null,'allstats last')); PLAN_TABLE_OUTPUT -------------------------------------------------------------------------------------------------------------------------- -------------------------------------------------------------------------------------------------------------------------- SQL_ID bg9h60c7ak3ud, child number 0 ------------------------------------- select /*+ leading(t3) use_merge(t4) */ * from t3, t4 where t3.id = t4.t3_id and t3.n = 1100 and t4.n = 10034 Plan hash value: 3831111046 ----------------------------------------------------------------------------------------------------------------- | Id | Operation | Name | Starts | E-Rows | A-Rows | A-Time | Buffers | OMem | 1Mem | Used-Mem | ----------------------------------------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 1 | | 1 |00:00:00.01 | 119 | | | | | 1 | MERGE JOIN | | 1 | 1 | 1 |00:00:00.01 | 119 | | | | | 2 | SORT JOIN | | 1 | 1 | 1 |00:00:00.01 | 15 | 2048 | 2048 | 2048 (0)| |* 3 | TABLE ACCESS FULL| T3 | 1 | 1 | 1 |00:00:00.01 | 15 | | | | |* 4 | SORT JOIN | | 1 | 1 | 1 |00:00:00.01 | 104 | 2048 | 2048 | 2048 (0)| |* 5 | TABLE ACCESS FULL| T4 | 1 | 1 | 1 |00:00:00.01 | 104 | | | | ----------------------------------------------------------------------------------------------------------------- Predicate Information (identified by operation id): --------------------------------------------------- 3 - filter("T3"."N"=1100) 4 - access("T3"."ID"="T4"."T3_ID") filter("T3"."ID"="T4"."T3_ID") 5 - filter("T4"."N"=10034) SQL> create index t4_n on t4(n); Index created. SQL> select /*+ leading(t3) use_merge(t4) */ * 2 from t3, t4 3 where t3.id = t4.t3_id and t3.n = 1100 and t4.n = 10034; SQL> select * from table(dbms_xplan.display_cursor(null,null,'allstats last')); PLAN_TABLE_OUTPUT ----------------------------------------------------------------------------------------------------------------------------------------- ----------------------------------------------------------------------------------------------------------------------------------------- SQL_ID bg9h60c7ak3ud, child number 0 ------------------------------------- select /*+ leading(t3) use_merge(t4) */ * from t3, t4 where t3.id = t4.t3_id and t3.n = 1100 and t4.n = 10034 Plan hash value: 1501658231 ------------------------------------------------------------------------------------------------------------------------------------ | Id | Operation | Name | Starts | E-Rows | A-Rows | A-Time | Buffers | Reads | OMem | 1Mem | Used-Mem | ------------------------------------------------------------------------------------------------------------------------------------ | 0 | SELECT STATEMENT | | 1 | | 1 |00:00:00.01 | 18 | 1 | | | | | 1 | MERGE JOIN | | 1 | 1 | 1 |00:00:00.01 | 18 | 1 | | | | | 2 | SORT JOIN | | 1 | 1 | 1 |00:00:00.01 | 15 | 0 | 2048 | 2048 | 2048 (0)| |* 3 | TABLE ACCESS FULL | T3 | 1 | 1 | 1 |00:00:00.01 | 15 | 0 | | | | |* 4 | SORT JOIN | | 1 | 1 | 1 |00:00:00.01 | 3 | 1 | 2048 | 2048 | 2048 (0)| | 5 | TABLE ACCESS BY INDEX ROWID| T4 | 1 | 1 | 1 |00:00:00.01 | 3 | 1 | | | | |* 6 | INDEX RANGE SCAN | T4_N | 1 | 1 | 1 |00:00:00.01 | 2 | 1 | | | | ------------------------------------------------------------------------------------------------------------------------------------ Predicate Information (identified by operation id): --------------------------------------------------- 3 - filter("T3"."N"=1100) 4 - access("T3"."ID"="T4"."T3_ID") filter("T3"."ID"="T4"."T3_ID") 6 - access("T4"."N"=10034) SQL> create index t3_n on t3(n); Index created. SQL> select /*+ leading(t3) use_merge(t4) */ * 2 from t3, t4 3 where t3.id = t4.t3_id and t3.n = 1100 and t4.n = 10034; SQL> select * from table(dbms_xplan.display_cursor(null,null,'allstats last')); PLAN_TABLE_OUTPUT --------------------------------------------------------------------------------------------------------------------------------------------- --------------------------------------------------------------------------------------------------------------------------------------------- SQL_ID bg9h60c7ak3ud, child number 0 ------------------------------------- select /*+ leading(t3) use_merge(t4) */ * from t3, t4 where t3.id = t4.t3_id and t3.n = 1100 and t4.n = 10034 Plan hash value: 1827980052 ------------------------------------------------------------------------------------------------------------------------------------ | Id | Operation | Name | Starts | E-Rows | A-Rows | A-Time | Buffers | Reads | OMem | 1Mem | Used-Mem | ------------------------------------------------------------------------------------------------------------------------------------ | 0 | SELECT STATEMENT | | 1 | | 1 |00:00:00.01 | 6 | 1 | | | | | 1 | MERGE JOIN | | 1 | 1 | 1 |00:00:00.01 | 6 | 1 | | | | | 2 | SORT JOIN | | 1 | 1 | 1 |00:00:00.01 | 3 | 1 | 2048 | 2048 | 2048 (0)| | 3 | TABLE ACCESS BY INDEX ROWID| T3 | 1 | 1 | 1 |00:00:00.01 | 3 | 1 | | | | |* 4 | INDEX RANGE SCAN | T3_N | 1 | 1 | 1 |00:00:00.01 | 2 | 1 | | | | |* 5 | SORT JOIN | | 1 | 1 | 1 |00:00:00.01 | 3 | 0 | 2048 | 2048 | 2048 (0)| | 6 | TABLE ACCESS BY INDEX ROWID| T4 | 1 | 1 | 1 |00:00:00.01 | 3 | 0 | | | | |* 7 | INDEX RANGE SCAN | T4_N | 1 | 1 | 1 |00:00:00.01 | 2 | 0 | | | | ------------------------------------------------------------------------------------------------------------------------------------ Predicate Information (identified by operation id): --------------------------------------------------- 4 - access("T3"."N"=1100) 5 - access("T3"."ID"="T4"."T3_ID") filter("T3"."ID"="T4"."T3_ID") 7 - access("T4"."N"=10034)从上面的执行计划中可以看出, 全表扫描后最后使用的 buffer 为 119, 在一个表上建立索引使用索引范围扫描后 buffer 为 18, 在两个表上建立的索引使用索引范围扫描后 buffer 为 6. 由此可以见, 在表的谓词条件上如果有索引的话, 将会提高执行效率.
此外, 由于 sort merge joins 需要先在 PGA 中进行排序,, 如果 PGA 空间不足, 就会将数据交换到磁盘上进行排序。由于, 磁盘相对于内存来说是慢速设备,因此在磁盘上排序会比在内存上排序慢, 另外排序排序消耗的时间还需要加上数据在内存和磁盘上传输的时间,因此尽可能减少磁盘排序的次数也就会提高执行效率, 有两种方法会减少磁盘排序:
1. 增大 PGA 的大小, 如果是 oracle 10g,需要增加参数 pga_aggregate_target 的大小,如果是 oracle 11g,则增加 memory_target 的大小
2. 减少排序的数据量, 一些不需要的字段就不要写在 select 后面
四. 小结
遇到 sql 调优时,如果执行计划显示表的连接方式是 sort merge join:
首先,看看 sql 语句是不是表的连接方式有没有可能转换为 hash join(等值连接条件)
其次,只能使用 sort merge join 时看看表的谓词条件上是不是有索引
最后,看看执行计划排序占用的内存大小是不是在磁盘上有排序, 是不是能够避免在磁盘上排序
参考: <<基于 oracle 的 sql 优化>>
<<收获, 不止 oracle>>
<<Troubleshooting oracle performance>>
oracle 表连接 - sort merge joins 排序合并连接
标签:merge join sort join 排序合并连接
原文地址:http://blog.csdn.net/dataminer_2007/article/details/41907581