In SQL 2000 I wanted to find unused indexes but found it to be difficult. I had to capture a huge trace that fully represents the workload and “trust” that the ITW knows what it’s talking about. You could also use the scan started trace event filtered by dbid, objectid, indexid. This procedure was very tedious.
In SS2005, a quick query of the dynamic management views lets you know which indexes are not being used. You can do this in a few minutes what had previously taken days or weeks. This alone is a very powerful feature.
I am proposing that you take it a step further. We are going to use a little logic and the missing index DMV’s to combine indexes and remove indexes that are still used but redundant. The steps in this process would look like this:
1. Remove unused indexes with the unused index script
2. Get your list of tables to analyze.
3. Remove redundant but used indexes.
4. Revue missing index DMV’s for mistaken index drops.
5. Combine indexes that where it is logical to do so.
6. Revue missing index DMV’s for mistaken index drops.
This methodology is most effective and viable when these conditions are met:
· The server has been online and thus collecting stats for a long time.
· The server is not pushing a hardware bottleneck. If so, this should be done during maint window.
· The server is enterprise edition and the tables allow online operations (i.e. no LOB data or partitions).
What are the benefits of removing unused indexes?
· Reduced writes during updates
· Reduced space usage
· Reduced backup\restore space and time
· Reduced index maintenance time
What are the benefits of removing unused indexes, used but redundant indexes and combining indexes?
· Less memory footprint for the same amount of data
· A different angle to index tuning
· Indexes are more likely to be covering
· Reduced writes during updates
· Reduced space usage
· Reduced backup\restore space and time
· Reduced index maintenance time
Now, let’s get down to business…
Remove Unused Indexes
This is pretty straightforward and relatively safe. You want to make sure that your server has been up long enough to get good index usage. This includes infrequent operations like month end reporting, etc.
--Unused indexes
declare @dbid int
select @dbid = db_id()
select object_name(s.object_id) as ObjName
, i.name as IndName
, i.index_id
, user_seeks + user_scans + user_lookups as reads
, user_updates as writes
, sum(p.rows) as rows
from sys.dm_db_index_usage_stats s join sys.indexes i on s.object_id = i.object_id and i.index_id = s.index_id
join sys.partitions p on s.object_id = p.object_id and p.index_id = s.index_id
where objectproperty(s.object_id,'IsUserTable') = 1 and s.index_id > 0 and s.database_id = @dbid
group by object_name(s.object_id), i.name, i.index_id, user_seeks + user_scans + user_lookups, user_updates
order by reads, writes desc
You can most likely drop any indexes with zero or close to zero reads. The more rows, the more space you will reclaim. The more writes, the better write performance you get by removing them. If the index is not being read but writes are minimal and rows are low, there is little benefit of dropping it. Keep in mind that not having an index that you need is a lot worse than having an index you don’t need in most situations. I suggest starting off with a conservative approach.
Now that we have gotten rid of the low hanging fruit, we can get deeper. I get a list of tables by size and work my way down.
select object_name(object_id), max(rows)
from sys.partitions
group by object_name(object_id)
order by 2 desc
After running that query we see that tEvent is the largest table and should provide nice gains. This is where we start. Let’s use this schema for our example.
create table tEvent
(EventID int primary key clustered,
EventType int,
EventName varchar(100),
EventDetailID int,
CustomerID int,
CompanyID int,
DateOpen datetime,
DateClose datetime)
create index ind1 on tEvents(EventDetailID);
create index ind2 on tEvents(CustomerID);
create index ind3 on tEvents(EventDetailID, CustomerID, CompanyID);
create index ind4 on tEvents(CompanyID, DateClose, EventType);
create index ind5 on tEvents(CustomerID, DateClose, EventType);
Removing used but redundant indexes
If we do a sp_helpindex, we can see that ind3 should satisfy queries currently using ind1 and ind2. The keyword is “should”. It’s a fairly safe bet so we drop ind1 and ind2. Now we should monitor the missing index DMV to see if there is any negative impact. We should also see the read count substantially increase by using the unused index query filtered by tEvent. Here is the missing index monitoring query:
--Missing indexes
SELECT sys.objects.name
, (avg_total_user_cost * avg_user_impact) * (user_seeks + user_scans) as Impact
, mid.equality_columns
, mid.inequality_columns
, mid.included_columns
FROM sys.dm_db_missing_index_group_stats AS migs
INNER JOIN sys.dm_db_missing_index_groups AS mig ON migs.group_handle = mig.index_group_handle
INNER JOIN sys.dm_db_missing_index_details AS mid ON mig.index_handle = mid.index_handle
INNER JOIN sys.objects WITH (nolock) ON mid.object_id = sys.objects.object_id
WHERE (migs.group_handle IN
(
SELECT TOP (5000) group_handle
FROM sys.dm_db_missing_index_group_stats WITH (nolock)
ORDER BY (avg_total_user_cost * avg_user_impact) * (user_seeks + user_scans) DESC)
)
and objectproperty(sys.objects.object_id, 'isusertable')=1 and name = 'tEvent'
ORDER BY 2 DESC , 3 desc
As long as the indexes do not come back up in this report, we should be ok. Now we are servicing the same queries with 2 less indexes. Those are index pages that are no longer taking up buffer pool space!
Combining indexes
The same concept can be applied to the following scenario but a little more “feel” and understanding of how the app accesses the data is needed.
create index ind4 on tEvents(CompanyID, DateClose, EventType);
create index ind5 on tEvents(CustomerID, DateClose, EventType);
We know a former DBA added these. It is logical for us, based on our knowledge of the app and the cardinality of the data, to try to replace these indexes with this one.
create index ind6 on tEvents(CustomerID, CompanyID, DateClose, EventType) with (online=on, maxdop=8);
This index should satisfy all queries using both indexes. Again, we want go back to the missing index report to see if SQL thinks it needs one of those indexes.
Now work your way down your list of tables by row count. I was able to reduce the size of a 200GB database by 15% in addition to removing the unused indexes. That’s a big gain! The only problem I ran into was changing the name of an index that had a hint. There were a few indexes that I had to add back but since my approach was conservative it was nothing drastic and completely online.
Let me know if you have any other tips or questions regarding this topic.