Choosing Index Keys
From SQLServerPedia
The effectiveness of an index depends largely on how selective its keys are: keys that have only one or a few rows associated with each value are the most efficient. For instance, the primary key of the authors table is au_id. This key identifies each author uniquely, so each author identifier value is associated with a single record. By contrast, each value of the contract column (1 or 0) in the authors table is associated with many authors, so that column is not very selective. The query optimizer is more likely to use an index with higher selectivity to satisfy a query, so you should strive to build indexes on keys with high selectivity.

Index density is the inverse of selectivity: the lower the density, the higher the selectivity. An index with high density is therefore less likely to be used by the query optimizer than an index with low density.

Information about index selectivity and density is stored in distribution statistics. You can retrieve it by executing DBCC SHOW_STATISTICS, as follows:

    USE pubs
    GO
    DBCC SHOW_STATISTICS ('authors', 'aunmind')
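Selectivity and density can also be estimated by hand from the data. The query below is a minimal sketch that assumes the pubs sample database and its authors table; the column aliases are illustrative. It treats density as 1 divided by the number of distinct values in a column, so a unique column such as au_id has a density of 1 / (row count), while a two-valued column such as contract cannot go below 0.5.

```sql
-- Sketch only: assumes the pubs sample database and its authors table.
-- The column aliases are illustrative, not part of any SQL Server output.
USE pubs;
GO

SELECT
    COUNT(*)                                   AS total_rows,
    COUNT(DISTINCT au_id)                      AS distinct_au_id,
    -- contract is a bit column; cast it so it can be counted like a number
    COUNT(DISTINCT CAST(contract AS tinyint))  AS distinct_contract,
    -- Density = 1 / number of distinct values; lower means more selective.
    -- NULLIF avoids a divide-by-zero error on an empty table.
    1.0 / NULLIF(COUNT(DISTINCT au_id), 0)     AS au_id_density,
    1.0 / NULLIF(COUNT(DISTINCT CAST(contract AS tinyint)), 0) AS contract_density
FROM authors;
```

This query scans the table, whereas DBCC SHOW_STATISTICS reads the stored distribution statistics, so for an existing index the DBCC command shown above is usually the cheaper way to get the same information.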
The first row of the DBCC SHOW_STATISTICS output will look similar to the following:

    Updated              Rows     Rows Sampled  Steps  Density   Average Key Length
    Jun 13 2003 11:32PM  7075442  7075442       135    7.45E-07  8

Although individual keys might not be very selective, several columns combined can provide good selectivity. For example, an index on marital status won't be very selective; however, if you combine the marital_status column with job_category and income_level, the selectivity might go up significantly.

Key order also matters in a clustered index built on multiple columns. Because the data in a table is ordered according to the clustered index, an index built on last_name, first_name won't behave the same way as one built on first_name, last_name. The most selective columns should be specified first.

You should attempt to build a clustered index on keys that will NOT change often. Because data is stored in the order of the clustered index key(s), changing a key value might require moving the row from one page to another. Reshuffling clustered index rows can add undue overhead to the system.

Generally, it is recommended to build clustered indexes on columns with narrow data types. For instance, an index built on a column with the INTEGER data type will be more efficient than one built on a VARCHAR column. This is because every non-clustered index on the table stores the clustered index key as its pointer to the data row, so a wide clustered key makes all of those indexes larger. Leaner indexes are easier to read through and can therefore be more effective.

Keep in mind, though, that this advice is rather broad. The effectiveness of your index depends largely on the way your application reads and writes data in the database. Building a clustered index on the column with the smallest data type in the table does not guarantee that the index will be helpful.

SQL Server will let you build multiple indexes on the same key(s). However, doing so will not benefit your application.
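One way to find indexes built on the same key(s) is to compare the key column lists recorded in the catalog views. The query below is a minimal sketch, not a definitive tool: it assumes SQL Server 2017 or later (for STRING_AGG) and the sys.indexes, sys.index_columns, and sys.columns catalog views, none of which are mentioned in the original article. The CTE name and column aliases are illustrative.

```sql
-- Sketch only: lists pairs of indexes on the same table whose key columns
-- match exactly (same columns, same order, same sort direction).
-- Assumes SQL Server 2017 or later (STRING_AGG) and the sys catalog views.
WITH index_keys AS (
    SELECT
        i.object_id,
        i.index_id,
        i.name AS index_name,
        -- Build one ordered, comma-separated key list per index
        STRING_AGG(
            CONVERT(nvarchar(max), c.name)
                + CASE WHEN ic.is_descending_key = 1 THEN N' DESC' ELSE N' ASC' END,
            N', '
        ) WITHIN GROUP (ORDER BY ic.key_ordinal) AS key_list
    FROM sys.indexes AS i
    JOIN sys.index_columns AS ic
        ON ic.object_id = i.object_id AND ic.index_id = i.index_id
    JOIN sys.columns AS c
        ON c.object_id = ic.object_id AND c.column_id = ic.column_id
    WHERE i.type IN (1, 2)          -- clustered and non-clustered indexes
      AND ic.key_ordinal > 0        -- key columns only, not included columns
      AND OBJECTPROPERTY(i.object_id, 'IsUserTable') = 1
    GROUP BY i.object_id, i.index_id, i.name
)
SELECT
    OBJECT_SCHEMA_NAME(a.object_id) AS table_schema,
    OBJECT_NAME(a.object_id)        AS table_name,
    a.index_name,
    b.index_name                    AS same_keys_as,
    a.key_list
FROM index_keys AS a
JOIN index_keys AS b
    ON  b.object_id = a.object_id
    AND b.key_list  = a.key_list
    AND b.index_id  < a.index_id    -- report each pair once
ORDER BY table_schema, table_name, a.index_name;
```

Two indexes with identical keys can still differ in uniqueness, included columns, or filter predicates, so treat the result as a list of candidates to review rather than a list of indexes to drop.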
Duplicate indexes also give the query optimizer more alternatives to consider, which can add overhead when it chooses the optimal execution plan.

More SQLServerPedia Articles on Indexes
How SQL Server Indexes Work
Types of Indexes in SQL Server
Best Practices on How to Design Database Indexes
Maintaining Indexes for Top Performance: Indexes need regular maintenance in order to perform well.