Index Seek和Index Scan的区别

Index Seek是Sql Server执行查询语句时利用建立的索引进行查找，索引是B树结构，Sql Server先查找索引树的根节点，一级一级向下查找，在查找到相应叶子节点后，取出叶子节点的数据。对于聚集索引，叶子节点是整个表的数据，能够获取到所有列的数据，而对于非聚集索引，叶子节点存储的是索引列的数据，如果索引有包含列，那么叶子节点中存储有包含列的数据，获取的数据是索引列和包含列，如果还需要其他列的数据，那么必须进行key lookup，根据索引叶子节点包含的“行地址”信息到源表中去获取数据，这部属于Index Seek，而是Key Looup（书签查找）的范畴。Index Seek非常适用于从大数据量的表中返回少量记录的查询。

Index Scan是直接遍历索引树的所有叶子节点，类似于Table Scan。

SQL有一个查询优化分析器 Query Optimizer，其在执行查询之前首先会进行分析，当查询中有可以利用的索引时，其会优先分析使用Index Seek进行查询的效率，在使用Index Seek查询效率并不好的情况下，其会使用Index Scan进行查询。那究竟是在什么情况下会造成Index Seek效率比Index Scan还低呢？

1.在要查询的表中数据不是很多的话，使用Index Seek效率不一定高，因此使用Index seek还要先从索引树开始，然后再利用叶子节点去查找相应的行。在行树比较少的情况下，还没有直接进行Index scan快。

2.在返回的数据量大的情况下，在返回的数据量占总数据量的50%或者超过50%则使用Index Seek效率不一定好，在返回的数据量占10%-15%时，利用Index Seek能获得最佳的性能。

3.在建立索引的列的取值很多是一致的情况下，建立索引不一定能获得很好的效率。其实理由很简单，当建立索引的列取值的变化少的情况下，建立的索引二叉树是矮胖型的，树层次不高，很多行的信息都包含在叶子上，这样的查询显然是不能很好的利用到索引。

http://blog.sqlauthority.com/2009/08/24/sql-server-index-seek-vs-index-scan-diffefence-and-usage-a-simple-note/

SQL SERVER – Index Seek vs. Index Scan – Diffefence and Usage – A Simple Note

In this article we shall examine the two modes of data search and retrieval using indexes- index seeks and index scans, and the differences between the two.

Firstly, let us revisit indexes briefly. An index in a SQL Server database is analogous to the index at the start of a book. That is, its function is to allow you to quickly find the data you are searching for inside the book; in the case of a database, the “book” is a table.

An index scan means that SQL Server reads all rows in a table, and then returns only those rows that satisfy the search criteria. When an index scan is performed, all the rows in the leaf level of the index are scanned. This essentially means that all of the rows of the index are examined instead of the table directly. This is sometimes contrasted to a table scan, in which all the table data is read directly. However, there is usually little difference between an index scan and a table scan.

You may wonder why the Query Optimizer may choose to do an index or table scan. Surely it is much faster to first look up data using an index than to go through all the rows in a table? In fact, for small tables data retrieval via an index or table scan is faster than using the index itself for selection. This is because the added overhead of first reading the index, then reading the pages containing the rows returned by the index, does not offer any performance improvement for a table with only a few rows.

Other reasons to use an index scan would be when an index is not selective enough, and when a query will return a large percentage (greater than 50%) of rows from the table. In such cases the additional overhead of first using the index may result in a small degradation of performance.

An index seek, on the other hand, means that the Query Optimizer relies entirely on the index leaf data to locate rows satisfying the query condition. An index seek will be most beneficial in cases where a small percentage (less than 10 or 15%) of rows will be returned. An index seek will only affect the rows that satisfy a query condition and the pages that contain these qualifying rows; this is highly beneficial, in performance terms, when a table has a very large number of rows.

It is also worth noting that it is usually not worthwhile to create indexes on low-cardinality columns as they would rarely be used by the Query Optimizer. A low-cardinality column is one that contains a very small range of distinct values, for example a ‘Gender’ column would have only two distinct values- Male or Female. An example of a high-cardinality column is of course the primary key column, in which each value is distinct.

In summary, the Query Optimizer generally tries to perform an index seek. If this is not possible or beneficial (for example when the total number of rows is very small) then an index scan is used instead.