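1. Background
What is a semi-join?
A semi-join here means a semi-join subquery. When a row in one table has a match in another table, the semi-join returns the row from the first table. Unlike a conditional join, the left-hand table returns each row only once even if several matching rows are found on the right-hand side, and no rows from the right-hand table are returned at all. Semi-joins are usually written with IN or EXISTS as the join condition. The subquery has the following structure:
SELECT ... FROM outer_tables WHERE expr IN (SELECT ... FROM inner_tables ...) AND ...
That is, it is the subquery inside the IN of the WHERE clause.
What characterizes this kind of query is that we only care about the rows of outer_tables that the semi-join matches.
In other words, the final result set comes from outer_tables, and the semi-join only filters those rows. This is also the basis of semi-join optimization: we only need to obtain from the semi-join the minimum information sufficient to filter the outer_tables rows.
In terms of optimization strategy, "minimum" comes down to how duplicates are removed.
Take the following statement as an example:
select * from Country where Country.Code in (select City.country from City where City.Population>1*1000*1000);
The semi-join in it,
select City.country from City where City.Population>1*1000*1000
may return a result set such as: China(Beijing), China(Shanghai), France(Paris)...
There are two China values here, coming from the two city rows Beijing and Shanghai,
but a single China is actually enough to filter the rows of outer_table.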
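To make the duplicate problem concrete, the sketch below contrasts a plain inner join with the semi-join form of the example query. It assumes the Country and City tables used above, with City.country holding the country code as written in the example (in the MySQL world sample database that column is named CountryCode). The EXISTS and DISTINCT rewrites are only illustrations of the semantics, not a claim about what the optimizer actually executes.

```sql
-- Inner join: a Country row is repeated once per matching city,
-- so China appears twice if both Beijing and Shanghai qualify.
SELECT Country.*
FROM Country
JOIN City ON City.country = Country.Code
WHERE City.Population > 1*1000*1000;

-- Semi-join semantics: each Country row appears at most once,
-- and no City columns are returned.
SELECT *
FROM Country
WHERE EXISTS (SELECT 1
              FROM City
              WHERE City.country = Country.Code
                AND City.Population > 1*1000*1000);

-- Same rows again, obtained by joining first and removing duplicates
-- afterwards (assumes Country.Code uniquely identifies a Country row).
SELECT DISTINCT Country.*
FROM Country
JOIN City ON City.country = Country.Code
WHERE City.Population > 1*1000*1000;
```

The last form shows why deduplication is the central cost: the join produces every match first, and the duplicates then have to be thrown away.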
Semi-join strategies are controlled through the optimizer_switch system variable. The semijoin flag controls whether semi-joins are used. If it is set to on, the firstmatch, loosescan, and materialization flags enable finer control over the permitted semi-join strategies. These flags are on by default.
The use of semi-join strategies is indicated in EXPLAIN output as follows:
Semi-joined tables show up in the outer select. EXPLAIN EXTENDED plus SHOW WARNINGS shows the rewritten query, which displays the semi-join structure. From this you can get an idea about which tables were pulled out of the semi-join. If a subquery was converted to a semi-join, you will see that the subquery predicate is gone and its tables and WHERE clause were merged into the outer query join list and WHERE clause.
Temporary table use for Duplicate Weedout is indicated by Start temporary and End temporary in the Extra column. Tables that were not pulled out and are in the range of EXPLAIN output rows covered by Start temporary and End temporary will have their rowid in the temporary table.
FirstMatch(tbl_name) in the Extra column indicates join shortcutting.
LooseScan(m..n) in the Extra column indicates use of the LooseScan strategy. m and n are key part numbers.
As of MySQL 5.6.7, temporary table use for materialization is indicated by rows with a select_type value of MATERIALIZED and rows with a table value of <subqueryN>.
Before MySQL 5.6.7, temporary table use for materialization is indicated in the Extra column by Materialize if a single table is used, or by Start materialize and End materialize if multiple tables are used. If Scan is present, no temporary table index is used for table reads. Otherwise, an index lookup is used.
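The EXPLAIN indicators listed above can be checked against the Country/City example from the background section. This is a minimal sketch, assuming those two tables exist with the column names used in that example. No output is shown because the strategy chosen depends on the data, the indexes, and the server version. EXPLAIN EXTENDED is the MySQL 5.6 form; later versions deprecated and then removed the EXTENDED keyword, and plain EXPLAIN followed by SHOW WARNINGS serves the same purpose there.

```sql
-- Show the execution plan; look at the Extra column for
-- Start temporary / End temporary, FirstMatch(...), or LooseScan(...),
-- and at select_type for MATERIALIZED.
EXPLAIN EXTENDED
SELECT *
FROM Country
WHERE Country.Code IN (SELECT City.country
                       FROM City
                       WHERE City.Population > 1*1000*1000);

-- Show the rewritten query; if the subquery was converted to a
-- semi-join, the IN predicate is gone and City appears in the join list.
SHOW WARNINGS;
```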
mysql> SELECT @@optimizer_switch\G
*************************** 1. row ***************************
@@optimizer_switch: index_merge=on,index_merge_union=on,
                    index_merge_sort_union=on,
                    index_merge_intersection=on,
                    engine_condition_pushdown=on,
                    index_condition_pushdown=on,
                    mrr=on,mrr_cost_based=on,
                    block_nested_loop=on,batched_key_access=off,
                    materialization=on,semijoin=on,loosescan=on,
                    firstmatch=on,
                    subquery_materialization_cost_based=on,
                    use_index_extensions=on
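The semi-join flags in this output can be changed for the current session, which is a convenient way to compare plans. The statements below are a sketch of typical usage; which combination is worth trying depends on the query, and the choice of flags here is only illustrative.

```sql
-- Turn off all semi-join transformations for this session.
SET SESSION optimizer_switch = 'semijoin=off';

-- Keep semi-joins enabled but rule out the LooseScan strategy;
-- flags not mentioned keep their current values.
SET SESSION optimizer_switch = 'semijoin=on,loosescan=off';

-- Reset every optimizer_switch flag to its default value.
SET SESSION optimizer_switch = 'default';

-- Confirm the resulting settings.
SELECT @@optimizer_switch\G
```

Re-running EXPLAIN on the same query after each change shows how the permitted strategies affect the chosen plan.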
Original article: http://www.cnblogs.com/xiaotengyi/p/3908347.html