Let's start by looking at an execution plan.
(root@localhost) [test]> desc select * from l;
+----+-------------+-------+------------+------+---------------+------+---------+------+------+----------+-------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+-------+------------+------+---------------+------+---------+------+------+----------+-------+
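There is a rule of thumb for reading a plan: rows with the same id are read from top to bottom; rows with different ids are read from bottom to top, i.e. the larger id usually runs first.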
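type: the main optimization targets are index and ALL. There are two cases where keeping index is acceptable: the query reads only indexed columns, so no lookup back to the table is needed, or the index is used for sorting or aggregation.
possible_keys: the indexes the optimizer could use.
key: the index the optimizer actually chose.
key_len: the number of bytes of the index that are used.
rows: the number of rows the optimizer estimates it will examine.
filtered: the estimated percentage of rows that remain after the conditions are applied.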
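The "reads only indexed columns" case can be checked directly with EXPLAIN. The sketch below assumes the dbt3 schema used later in this post, where orders has an index i_o_orderdate on o_orderdate; the exact plans depend on your data and MySQL version, so treat it as an experiment rather than a guaranteed result.

```sql
-- Only the indexed column is selected, so the optimizer can answer the query
-- from the index alone (look for "Using index" in the Extra column).
EXPLAIN SELECT o_orderdate
FROM orders
WHERE o_orderdate BETWEEN '1995-01-01' AND '1995-01-31';

-- Selecting every column means each matching index entry needs a lookup
-- back to the table, so compare the type, key and Extra columns with the plan above.
EXPLAIN SELECT *
FROM orders
WHERE o_orderdate BETWEEN '1995-01-01' AND '1995-01-31';
```

Comparing the two outputs side by side is usually enough to see whether a lookup back to the table is involved.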
(root@localhost) [dbt3]> DESC SELECT
-> *
-> FROM
-> part
-> WHERE
-> p_partkey IN (SELECT
-> l_partkey
-> FROM
-> lineitem
-> WHERE
-> l_shipdate BETWEEN '1997-01-01' AND '1997-02-01')
-> ORDER BY p_retailprice DESC
-> LIMIT 10;
+----+--------------+-------------+------------+--------+----------------------------------------------+--------------+---------+---------------------+--------+----------+----------------------------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+--------------+-------------+------------+--------+----------------------------------------------+--------------+---------+---------------------+--------+----------+----------------------------------+
| 1 | SIMPLE | part | NULL | ALL | PRIMARY | NULL | NULL | NULL | 197706 | 100.00 | Using where; Using filesort |
| 1 | SIMPLE | <subquery2> | NULL | eq_ref | <auto_key> | <auto_key> | 5 | dbt3.part.p_partkey | 1 | 100.00 | NULL |
| 2 | MATERIALIZED | lineitem | NULL | range | i_l_shipdate,i_l_suppkey_partkey,i_l_partkey | i_l_shipdate | 4 | NULL | 138672 | 100.00 | Using index condition; Using MRR |
+----+--------------+-------------+------------+--------+----------------------------------------------+--------------+---------+---------------------+--------+----------+----------------------------------+
3 rows in set, 1 warning (0.01 sec)
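id, execution order, explanation
1 ② The part table (outer table) is joined with <subquery2> (the table of roughly 140,000 rows produced by id=2). Every row of part has to be probed, about 190,000 rows in total (rows=197706), each matched against l_partkey. The final sort shows up as Using filesort.
1 ③ The inner table needs an index, so the MySQL optimizer automatically adds a unique index (<auto_key>) to the data fetched in step ①. The values inside IN are deduplicated (this is in fact a materialization), which is why the index is unique. eq_ref means the join goes through a unique index, matched against p_partkey of the outer table.
2 ① lineitem is queried first, with a range scan on the i_l_shipdate index. key_len is 4: a DATE column itself takes 3 bytes, and the extra byte is presumably the NULL flag of a nullable column. The optimizer estimates about 140,000 rows with filtered at 100%. MATERIALIZED means an actual temporary table is produced and given a unique index on l_partkey (the values inside IN are deduplicated).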
Note one detail: the same query with a narrower date range.
(root@localhost) [dbt3]> DESC SELECT
-> *
-> FROM
-> part
-> WHERE
-> p_partkey IN (SELECT
-> l_partkey
-> FROM
-> lineitem
-> WHERE
-> l_shipdate BETWEEN '1997-01-01' AND '1997-01-07')
-> ORDER BY p_retailprice DESC
-> LIMIT 10;
+----+--------------+-------------+------------+--------+----------------------------------------------+--------------+---------+-----------------------+-------+----------+----------------------------------------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+--------------+-------------+------------+--------+----------------------------------------------+--------------+---------+-----------------------+-------+----------+----------------------------------------------+
| 1 | SIMPLE | <subquery2> | NULL | ALL | NULL | NULL | NULL | NULL | NULL | 100.00 | Using where; Using temporary; Using filesort |
| 1 | SIMPLE | part | NULL | eq_ref | PRIMARY | PRIMARY | 4 | <subquery2>.l_partkey | 1 | 100.00 | NULL |
| 2 | MATERIALIZED | lineitem | NULL | range | i_l_shipdate,i_l_suppkey_partkey,i_l_partkey | i_l_shipdate | 4 | NULL | 29148 | 100.00 | Using index condition; Using MRR |
+----+--------------+-------------+------------+--------+----------------------------------------------+--------------+---------+-----------------------+-------+----------+----------------------------------------------+
3 rows in set, 1 warning (0.00 sec)
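The driving table has now become <subquery2>. With the narrower date range (an estimated 29,148 rows instead of 138,672), the optimizer uses the subquery as the outer table, which shows that the optimizer is fairly smart.
For an IN subquery, the optimizer rewrites it as a join for you and chooses whether the subquery ends up as the inner table or the outer table.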
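One way to see this rewrite for yourself is to run SHOW WARNINGS right after the EXPLAIN (the "1 warning" in the outputs above is the note that carries the rewritten statement). The sketch below is only a way to experiment: semijoin and materialization are flags of the optimizer_switch system variable, and turning one off in your session is an assumption about how you might want to compare plans, not something the original post does.

```sql
-- Show the statement as the optimizer rewrote it (Note 1003).
-- SHOW WARNINGS must be run immediately after the EXPLAIN.
EXPLAIN SELECT *
FROM part
WHERE p_partkey IN (SELECT l_partkey
                    FROM lineitem
                    WHERE l_shipdate BETWEEN '1997-01-01' AND '1997-01-07')
ORDER BY p_retailprice DESC
LIMIT 10;
SHOW WARNINGS;

-- Inspect the current optimizer flags, including semijoin and materialization.
SELECT @@optimizer_switch;

-- Optional experiment: disable materialization for this session only,
-- re-run the EXPLAIN above, then restore the default.
SET SESSION optimizer_switch = 'materialization=off';
SET SESSION optimizer_switch = 'materialization=default';
```

Restoring the flag with =default keeps the experiment from leaking into later statements in the same session.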
(root@localhost) [dbt3]> DESC select
-> a.*
-> from
-> part a,
-> (select distinct
-> l_partkey
-> from
-> lineitem
-> where l_shipdate between '1997-01-01' and '1997-02-01') b
-> where
-> a.p_partkey=b.l_partkey
-> order by a.p_retailprice desc
-> limit 10;
+----+-------------+------------+------------+--------+----------------------------------------------+--------------+---------+-------------+--------+----------+---------------------------------------------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+------------+------------+--------+----------------------------------------------+--------------+---------+-------------+--------+----------+---------------------------------------------------+
| 1 | PRIMARY | <derived2> | NULL | ALL | NULL | NULL | NULL | NULL | 138672 | 100.00 | Using where; Using temporary; Using filesort |
| 1 | PRIMARY | a | NULL | eq_ref | PRIMARY | PRIMARY | 4 | b.l_partkey | 1 | 100.00 | NULL |
| 2 | DERIVED | lineitem | NULL | range | i_l_shipdate,i_l_suppkey_partkey,i_l_partkey | i_l_shipdate | 4 | NULL | 138672 | 100.00 | Using index condition; Using MRR; Using temporary |
+----+-------------+------------+------------+--------+----------------------------------------------+--------------+---------+-------------+--------+----------+---------------------------------------------------+
3 rows in set, 1 warning (0.00 sec)
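With this rewrite, b is always the outer table. The subquery only produces a derived table, and there is no way to build an index on it. If the subquery returns a large result set, performance is worse than with IN, because with IN the optimizer can use the subquery as the inner table.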
(root@localhost) [dbt3]> DESC select max(l_extendedprice)
-> from orders,lineitem
-> where o_orderdate between '1995-01-01' and '1995-01-31'
-> and l_orderkey=o_orderkey;
+----+-------------+----------+------------+-------+--------------------------------------------+---------------+---------+------------------------+-------+----------+--------------------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+----------+------------+-------+--------------------------------------------+---------------+---------+------------------------+-------+----------+--------------------------+
| 1 | SIMPLE | orders | NULL | range | PRIMARY,i_o_orderdate | i_o_orderdate | 4 | NULL | 40696 | 100.00 | Using where; Using index |
| 1 | SIMPLE | lineitem | NULL | ref | PRIMARY,i_l_orderkey,i_l_orderkey_quantity | PRIMARY | 4 | dbt3.orders.o_orderkey | 3 | 100.00 | NULL |
+----+-------------+----------+------------+-------+--------------------------------------------+---------------+---------+------------------------+-------+----------+--------------------------+
2 rows in set, 1 warning (0.00 sec)
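There is an index on l_orderkey (i_l_orderkey), but it is not used; the primary key is used instead. orders is the outer table: its rows are filtered by the date condition and then joined to lineitem through the primary key, with orders.o_orderkey as the join column.
Forcing the l_orderkey index would be expensive because it requires lookups back to the table, whereas going through the primary key does not.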
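To check the cost argument, you can force the secondary index with an index hint and compare the two plans. This is a sketch that assumes the index name i_l_orderkey shown in possible_keys above; the estimated rows and Extra values you get will depend on your data.

```sql
-- Same query, but lineitem is forced onto the secondary index i_l_orderkey.
-- l_extendedprice is not part of that index, so each match needs a lookup
-- back to the table through the primary key.
EXPLAIN SELECT MAX(l_extendedprice)
FROM orders, lineitem FORCE INDEX (i_l_orderkey)
WHERE o_orderdate BETWEEN '1995-01-01' AND '1995-01-31'
  AND l_orderkey = o_orderkey;
```

Running both versions and comparing elapsed time is a more reliable test than the plan alone, since EXPLAIN shows only estimates.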
(root@localhost) [dbt3]> DESC select *
-> from
-> lineitem
-> where
-> l_shipdate <= '1995-12-32'
-> union
-> select
-> *
-> from
-> lineitem
-> where
-> l_shipdate >= '1997-01-01';
+----+--------------+------------+------------+------+---------------+------+---------+------+---------+----------+-----------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+--------------+------------+------------+------+---------------+------+---------+------+---------+----------+-----------------+
| 1 | PRIMARY | lineitem | NULL | ALL | i_l_shipdate | NULL | NULL | NULL | 5409799 | 33.33 | Using where |
| 2 | UNION | lineitem | NULL | ALL | i_l_shipdate | NULL | NULL | NULL | 5409799 | 50.00 | Using where |
|NULL| UNION RESULT | <union1,2> | NULL | ALL | NULL | NULL | NULL | NULL | NULL | NULL | Using temporary |
+----+--------------+------------+------------+------+---------------+------+---------+------+---------+----------+-----------------+
3 rows in set, 3 warnings (0.10 sec)
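UNION RESULT merges the two result sets and shows Using temporary, meaning a temporary table is used. UNION deduplicates, so MySQL builds a temporary table and puts a unique index on it. Two indexes are therefore used here, so the claim that one SQL statement can only use one index is not correct. Note also that '1995-12-32' is not a valid date, which likely accounts for some of the warnings reported.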
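If deduplication is not actually needed, UNION ALL avoids that step. In this query the two date ranges do not overlap and lineitem has a primary key, so no row can appear twice and UNION ALL should return the same rows. The sketch below also uses the valid date '1995-12-31', which is my assumption about what the original query intended.

```sql
-- UNION ALL skips the duplicate-removal step that UNION performs,
-- so compare the Extra column of the result row with the plan above.
EXPLAIN SELECT *
FROM lineitem
WHERE l_shipdate <= '1995-12-31'
UNION ALL
SELECT *
FROM lineitem
WHERE l_shipdate >= '1997-01-01';
```

Whether the temporary table disappears entirely depends on the MySQL version, so check the plan on your own server.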
(root@localhost) [employees]> DESC SELECT
-> emp_no,
-> dept_no,
-> (SELECT
-> COUNT(1)
-> FROM
-> dept_emp t2
-> WHERE
-> t1.emp_no <= t2.emp_no) AS row_num
-> FROM
-> dept_emp t1;
+----+--------------------+-------+------------+-------+----------------+--------+---------+------+--------+----------+------------------------------------------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+--------------------+-------+------------+-------+----------------+--------+---------+------+--------+----------+------------------------------------------------+
| 1 | PRIMARY | t1 | NULL | index | NULL | emp_no | 4 | NULL | 331570 | 100.00 | Using index |
| 2 | DEPENDENT SUBQUERY | t2 | NULL | ALL | PRIMARY,emp_no | NULL | NULL | NULL | 331570 | 33.33 | Range checked for each record (index map: 0x3) |
+----+--------------------+-------+------------+-------+----------------+--------+---------+------+--------+----------+------------------------------------------------+
2 rows in set, 2 warnings (0.00 sec)
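For this SQL, id=1 runs before id=2. id=2 is a DEPENDENT SUBQUERY, meaning it depends on the outer query, so id=1 has to run first: t1 is the outer table and t2 is the inner table. Each execution of the subquery examines about 330,000 × 33% ≈ 110,000 rows, and it is executed about 330,000 times, once per row of t1, so the total is roughly 330,000 × 110,000 row comparisons.
This is the row-number problem, and its performance is very poor.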
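On MySQL 8.0 or later, the same number can be computed with a window function in a single pass instead of one subquery execution per row. This is a sketch, not something from the original post, and it assumes a server version that supports window functions. With an ORDER BY and the default frame, COUNT(*) OVER counts all rows up to and including the peers of the current row, which here means all rows whose emp_no is greater than or equal to the current one, matching the condition t1.emp_no <= t2.emp_no.

```sql
-- Requires MySQL 8.0+ (window functions).
-- For each row, counts the rows in dept_emp with emp_no >= this row's emp_no,
-- which is what the correlated subquery computes.
SELECT emp_no,
       dept_no,
       COUNT(*) OVER (ORDER BY emp_no DESC) AS row_num
FROM dept_emp;
```

If a strict 1..N numbering without ties is wanted instead, ROW_NUMBER() OVER (ORDER BY emp_no DESC) is the usual choice, but it does not return the same values as the original query when emp_no repeats.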
Original article: https://www.cnblogs.com/---wunian/p/9220424.html