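By default, Oracle gathers statistics on each column of a table individually and ignores any relationship between columns. In most cases the optimizer assumes that the columns referenced in a complex query are independent of each other. When a WHERE clause puts conditions on several columns of the same table, the optimizer usually multiplies the selectivities of the individual columns to get the selectivity of the whole WHERE clause. If the columns are in fact correlated, that product is wrong and the optimizer misjudges the cardinality.
Oracle 11g introduces multicolumn statistics (column groups). When the columns in such a predicate are strongly correlated, you can gather statistics on the column group so that the optimizer makes a correct estimate.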
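To make the multiplication concrete, here is a minimal Python sketch. It is only an illustration: the rows, the column values, and the helper names are all made up and are not taken from the SH schema or from Oracle itself. It compares the cardinality obtained by multiplying per-column selectivities with the true row count for two columns where one value fully determines the other.

```python
from collections import Counter

# Hypothetical table of (state, country) pairs. The state fully determines
# the country, so the two columns are strongly correlated.
rows = (
    [("CA", "US")] * 30
    + [("NY", "US")] * 20
    + [("ON", "CANADA")] * 30
    + [("BC", "CANADA")] * 20
)


def selectivity(values, target):
    """Fraction of rows whose value equals target."""
    return Counter(values)[target] / len(values)


def independent_estimate(rows, state, country):
    """Cardinality estimate under the independence assumption."""
    sel_state = selectivity([r[0] for r in rows], state)
    sel_country = selectivity([r[1] for r in rows], country)
    return len(rows) * sel_state * sel_country


def actual_count(rows, state, country):
    """True number of rows that match both predicates."""
    return sum(1 for r in rows if r == (state, country))


estimate = independent_estimate(rows, "CA", "US")
actual = actual_count(rows, "CA", "US")
print(f"independent estimate: {estimate:.1f}, actual: {actual}")
```

In this toy data every 'CA' row is also a 'US' row, so the second predicate filters nothing, yet the independence assumption still halves the estimate. A statistic kept on the pair of columns would avoid that.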
In Oracle 10g, the optimizer took the relationship between columns into account only in a few special cases:
- The optimizer used the number of distinct keys in an index to estimate selectivity provided all columns of a conjunctive predicate match all columns of a concatenated index key. In addition, the predicates must be equalities used in equijoins.
- If you set DYNAMIC_SAMPLING to level 4, the optimizer used dynamic sampling to estimate the selectivity of predicates involving multiple columns from a table. Because the sampling size is quite small, the results are dubious in most cases.
Creating column groups:
DECLARE
  cg_name VARCHAR2(30);
BEGIN
  -- A NULL owner means the current schema
  cg_name := dbms_stats.create_extended_stats(null, 'customers', '(cust_state_province,country_id)');
END;
/
Viewing column groups:
SQL> select extension_name, extension from dba_stat_extensions where table_name='CUSTOMERS';

EXTENSION_NAME                 EXTENSION
------------------------------ --------------------------------------------------------------------------------
SYS_STU#S#WF25Z#QAHIHE#MOFFMM_ ("CUST_STATE_PROVINCE","COUNTRY_ID")

Or:
SQL> select sys.dbms_stats.show_extended_stats_name('sh','customers','(cust_state_province,country_id)') col_group_name from dual;

COL_GROUP_NAME
--------------------------------------------------
SYS_STU#S#WF25Z#QAHIHE#MOFFMM_
Dropping a column group:
SQL> exec dbms_stats.drop_extended_stats('sh','customers','(cust_state_province, country_id)');
Gathering statistics on column groups:
SQL> exec dbms_stats.gather_table_stats('sh','customers',method_opt =>'for all columns size skewonly for columns (cust_state_province,country_id) size skewonly');
Monitoring column groups:
-- Query the multicolumn statistics
SQL> select extension_name, extension from user_stat_extensions where table_name='CUSTOMERS';

EXTENSION_NAME                 EXTENSION
------------------------------ --------------------------------------------------------------------------------
SYS_STU#S#WF25Z#QAHIHE#MOFFMM_ ("CUST_STATE_PROVINCE","COUNTRY_ID")

-- Check the number of distinct values and the histogram type
SQL> select e.extension col_group, t.num_distinct, t.histogram
     from user_stat_extensions e, user_tab_col_statistics t
     where e.extension_name = t.column_name
     and e.table_name = t.table_name
     and t.table_name = 'CUSTOMERS';

COL_GROUP                                                                        NUM_DISTINCT HISTOGRAM
-------------------------------------------------------------------------------- ------------ ---------------
("CUST_STATE_PROVINCE","COUNTRY_ID")                                                      145 FREQUENCY
Experiment:
1) Without multicolumn statistics, the query actually returns 3341 rows, but the execution plan estimates 1132.
SQL> exec dbms_stats.drop_extended_stats('sh','customers','(cust_state_province,country_id)');

PL/SQL procedure successfully completed.

SQL> select count(*) from sh.customers where CUST_STATE_PROVINCE = 'CA' and country_id=52790;

  COUNT(*)
----------
      3341

Execution Plan
----------------------------------------------------------
Plan hash value: 296924608

--------------------------------------------------------------------------------
| Id  | Operation          | Name      | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |           |     1 |    16 |   405   (1)| 00:00:05 |
|   1 |  SORT AGGREGATE    |           |     1 |    16 |            |          |
|*  2 |   TABLE ACCESS FULL| CUSTOMERS |  1132 | 18112 |   405   (1)| 00:00:05 |
--------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - filter("CUST_STATE_PROVINCE"='CA' AND "COUNTRY_ID"=52790)

Statistics
----------------------------------------------------------
        121  recursive calls
          0  db block gets
       1685  consistent gets
          0  physical reads
          0  redo size
        527  bytes sent via SQL*Net to client
        524  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
         15  sorts (memory)
          0  sorts (disk)
          1  rows processed
2) With multicolumn statistics, the query still returns 3341 rows, and the execution plan now estimates 3437. Gathering statistics with the FOR COLUMNS (...) clause also recreates the column group that was dropped in step 1.
SQL> EXEC DBMS_STATS.GATHER_TABLE_STATS('SH','CUSTOMERS',METHOD_OPT =>'FOR ALL COLUMNS SIZE SKEWONLY FOR COLUMNS (CUST_STATE_PROVINCE,COUNTRY_ID) SIZE SKEWONLY');

PL/SQL procedure successfully completed.

SQL> select count(*) from sh.customers where CUST_STATE_PROVINCE = 'CA' and country_id=52790;

  COUNT(*)
----------
      3341

Execution Plan
----------------------------------------------------------
Plan hash value: 296924608

--------------------------------------------------------------------------------
| Id  | Operation          | Name      | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |           |     1 |    16 |   405   (1)| 00:00:05 |
|   1 |  SORT AGGREGATE    |           |     1 |    16 |            |          |
|*  2 |   TABLE ACCESS FULL| CUSTOMERS |  3437 | 54992 |   405   (1)| 00:00:05 |
--------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - filter("CUST_STATE_PROVINCE"='CA' AND "COUNTRY_ID"=52790)

Statistics
----------------------------------------------------------
          8  recursive calls
          0  db block gets
       1460  consistent gets
          0  physical reads
          0  redo size
        527  bytes sent via SQL*Net to client
        524  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
          1  rows processed
3) In the case above, multicolumn statistics bring the optimizer's row estimate much closer to the actual row count.
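As a rough way to quantify that improvement, the short Python sketch below compares the two Rows estimates from the execution plans above with the actual count. The three figures are copied from the output above; the `error_factor` helper is just an illustrative measure chosen here, not something Oracle reports.

```python
# Figures copied from the two execution plans above.
ACTUAL_ROWS = 3341
ESTIMATE_WITHOUT_GROUP = 1132  # Rows estimate after the column group was dropped
ESTIMATE_WITH_GROUP = 3437     # Rows estimate after gathering column group stats


def error_factor(estimate, actual):
    """Ratio between estimate and actual, always >= 1 (1.0 means exact)."""
    return max(estimate / actual, actual / estimate)


without_group = error_factor(ESTIMATE_WITHOUT_GROUP, ACTUAL_ROWS)
with_group = error_factor(ESTIMATE_WITH_GROUP, ACTUAL_ROWS)
print(f"without column group: off by a factor of {without_group:.2f}")
print(f"with column group:    off by a factor of {with_group:.2f}")
```

Without the column group the estimate is low by nearly a factor of three; with it, the estimate is within about three percent of the actual count. The plan is a full table scan in both runs, so here the gain shows up only in the Rows column.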
11g New Feature -- Multicolumn Statistics (Column Groups)
Original article: http://www.cnblogs.com/abclife/p/4745812.html