Archive for November, 2007

I think I need to add a CPU to my SQL Server host. How can I be sure?

Thursday, November 29th, 2007

Great question. Arriving at an accurate answer requires some legwork…

First of all, if you’ve enabled lightweight pooling you might be surprised that this feature might be wreaking havoc on your system; telling SQL Server to run in fiber mode sounds cool, but it’s rarely adviseable and it makes certain applications function erraticaly (SQLXML) or stop working altogether (CLR). Ken Henderson’s the pro in this area, and published a great article on the topic, aptly named The Perils of Fiber Mode.

Now, as for CPU-bound SQL Servers you can determine how your instance is handling requests by looking at DBCC SQLPERF(umsstats) in SQL Server 2000 (again, Ken’s published a highly in-depth article on schedulers called Inside the SQL Server 2000 User Mode Scheduler) or at sys.dm_os_schedulers in SQL Server 2005. For a good blog on how to use sys.dm_os_schedulers, go to Slava Oks’s blog on the topic.

With either version of SQL Server you need to create a baseline, or at least gather information over time to accurately determine whether your system is CPU bound; just determining that there is CPU pressure at a single point in time isn’t a good enough indication to go about upgrading your server. First, you should determine that your application code is tuned and there are no cascading resource bottlenecks freezing up your system. Tools like Quest’s Performance Analysis, together with SQL Tuning can identify inefficient statements over time and allow you to quickly get optimized plan recommendations; you can even link to Benchmark Factory to test the impact of the newly upgraded statements! Once you’ve analyzed and tuned your workload you can use osql, sqlcmd, or even ostress to execute scripts over time and dump their results to a table. Once you have a set of data you should set about determining if, for example in SQL Server 2005, the number of runnable tasks per each scheduler always greater than 1. If you’ve tuned your environment, and the aforementioned statement is still true, you can present a good argument for adding CPU resources to your server.

I haven’t changed anything, but my SQL Server’s response time has slowed considerably…

Thursday, November 29th, 2007

I’ve recently dealt with a number of cases/queries from DBAs making the same claim (above). What gives? All of these professionals were certain there was an internal issue plaguing their SQL Server when in fact physical file fragmentation was to blame.

Diagnosing and dealing with fragmented files shouldn’t be a last attempt at resovling a performance bottleneck. Rather, file configuration, growth and maintenance procedures should be an integral part of every DBAs planning and maintenance process. Why is fragmentation a problem? Check out this article from sql-server-performance.com to see how fragmentation occurs and how it affects your SQL Server (if you’ve got Quest’s Performance Analysis for SQL Server 6 this information will be presented to you in context-sensitive performance advisory topics). There are a number of tools in the marketplace (like Diskeeper) that can help, but good old-fashioned T-SQL can help overcome the problem just as well.

Quest offers a number of products that can help identify and deal with fragmentation (and a slew of other Disk I/O related slowdowns). For more information as your sales / support engineers about Spotlight, Performance Analysis and Foglight for SQL Server. This triple-threat of database analysis and tuning products interface with a number of Quest tools and dovetail nicely to take the guesswork out of even the toughest performance problems.

MultiServer queries in SQL Server 2008

Wednesday, November 28th, 2007

Half of this post is to show off the new multi server query feature and the other half is to show off the new web 2.0 content with the help of the Live Content module from our friends over at http://mandeeps.com .

So if you check out this video, you can see how useful this could be. Let's say you have all of your servers registered in SSMS 2008. Here are a few possibilities that come to mind.

  • Update backup scripts across the farm.
  • Create a new alert across the farm.
  • Check for version or edition.
  • Report on one or all entries in sys.configurations
  • Change a server wide configuration cross the farm.
  • exec sp_msforeachdb 'alter database "?" set parameterization forced" to change a setting across every database in the farm.
ms_thumb.jpg
Like the video content? Let me know!

What’s a mini-dump in SQL Server 2005?

Wednesday, November 28th, 2007

I was recently trying to diagnose some problems in one of my SQL Servers when I ran across a problem that I hadn’t encountered before involving ‘nonyielding worker threads’.  SQL Server also produced a dump for me to review.  Unfortunately, the dump didn’t make a lot of sense to me. 

 

If you encounter one of SQL Server’s mini-dump files and want more information, you’ll be happy to know that there’s a lot of information on-line:

 

http://www.microsoft.com/technet/prodtechnol/sql/2005/diagandcorrecterrs.mspx

 

This URL is also a wealth of diagnostic and internals information that I recommend for any serious SQL Server professional.

 

Enjoy!

 

-Kev

 

P.S. And check out the new CD by Paramore – best alt rock CD of the year IMO.

How can I compare a Date to the DATETIME value stored in SQL Server and how can I perform quarterly calculations?

Monday, November 26th, 2007

There is a TON to consider when dealing with dates. Still, things don’t have to be too difficult. Once you know the business your code will serve you have to then decide which date format you should use (look at the CONVERT function in books online and you’ll see that you can choose any format from USA with or without the century (mm-dd-yy | mm-dd-yyyy) to ODBC Canonical (yyyy-mm-dd hh:mi:ss(24h)) with many in between. The Internet is filled with examples of code that will format date values for you (like the FormatDate function in SQL Server Magazine InstantDoc 96608).

There is a great article that deals with all sorts of date conversions, comparisons and calculations on www.databasejournal.com. This should be right up your alley!

Data Compression in SQL Server 2008 coming next CTP

Monday, November 26th, 2007

The storage engine team reports data compression will be available in CTP6. Woohoo
Read more.

How can I disable all the foreign key constraints defined for a database?

Monday, November 26th, 2007

Let me start by saying that before you do something like this you should definitely determine why you think it’s a good idea. Foreign keys can slow INSERT operations for example, but the odds are they were created for a reason. Still, if you’ve evaluated all the possibilities and this is something you feel is appropriate, you can use a formatted SELECT statement to generate a script, then execute the script.

Formatted SELECT statements can be a DBAs best friend. Put together with the SQLCMD utility, they can be leveraged to create some very robust maintenance operations. In your case, running a formatted SELECT like the following, output to a file, then using SQLCMD to run the file could make for an automated “disable foreign key checks” routine.


SELECT 'ALTER TABLE ' + QUOTENAME(LTRIM(RTRIM([TABLE_SCHEMA]))) + '.' + QUOTENAME(LTRIM(RTRIM([TABLE_NAME]))) + ' NOCHECK CONSTRAINT ALL' FROM [INFORMATION_SCHEMA].[TABLES] WHERE OBJECTPROPERTY(OBJECT_ID(QUOTENAME(LTRIM(RTRIM([TABLE_SCHEMA]))) + '.' + QUOTENAME(LTRIM(RTRIM([TABLE_NAME])))), 'IsMSShipped') = 0 AND [TABLE_TYPE] = 'BASE TABLE'

I have a database that is log-shipped to two reporting servers and it is out of physical space. I need to create a secondary data file but have a few questions about this…

Monday, November 26th, 2007

Question continued…1) When I issue the create file command, will it be log shipped (I think it will)? 2) Will it fail when the drive specification is not replicated on the secondary servers (I think it will)? 3) What will that impact of that be and how can I work around it? 4) How will log-shipping continue when the secondary datafile is not created (or not created the same) as the primary server? And finally 5) What are your recommendations for doing this with minimal interruption.

The situation you’ve encountered is certainly not rare. At 50,000 feet: yes, you can add a file, yes it can be on different paths on different servers, yes the process will be logged, and yes doing so will throw an error. There’s a fifth yes though, and that’s, “Yes, there’s a simple workaround.”

While the process isn’t as “seamless” as Microsoft says it is at the beginning of the article this Microsoft KB article provides the details you’ll need to add a new database file. To sum up what you’re going to see in the article, you’re going to add the new file, SQL Server’s going to throw an error, then you’re going to manually restore the transaction log backup using WITH MOVE. Once you’ve done that, everything really will be seamless.

How can I use SSIS or something similar to export data to a number of worksheets in the same Excel spreadsheet?

Monday, November 26th, 2007

This MSDN article has what you’re looking for.

By leveraging OPENROWSET you can take this very simple code template and use it to create what you’re after:


INSERT INTO OPENROWSET('Microsoft.Jet.OLEDB.4.0',
'Excel 8.0;Database=C:\myreportspreadsheet.xls;','SELECT name FROM [Sheet_1$]')
SELECT [name] FROM master..sysobjects
INSERT INTO OPENROWSET('Microsoft.Jet.OLEDB.4.0',
'Excel 8.0;Database=C:\myreportspreadsheet.xls;','SELECT name FROM [Sheet_2$]')
SELECT [name] FROM AdventureWorks..sysobjects
GO

Before you run the aforementioned queries, you need to first make sure you’ve got an Excel spreadsheet on the C drive called myreportspreadsheet.xls with a column named name on both Sheet1 and Sheet2. If you run this as a scheduled Windows task you can add the file creates to the process, thereby ensuring the files exist before the job runs.

Take it a step further with the unused and missing index DMV’s

Sunday, November 25th, 2007

In SQL 2000 I wanted to find unused indexes but found it to be difficult. I had to capture a huge trace that fully represents the workload and “trust” that the ITW knows what it’s talking about. You could also use the scan started trace event filtered by dbid, objectid, indexid. This procedure was very tedious.

In SS2005, a quick query of the dynamic management views lets you know which indexes are not being used. You can do this in a few minutes what had previously taken days or weeks. This alone is a very powerful feature.

I am proposing that you take it a step further. We are going to use a little logic and the missing index DMV’s to combine indexes and remove indexes that are still used but redundant. The steps in this process would look like this:

1.       Remove unused indexes with the unused index script

2.       Get your list of tables to analyze.

3.       Remove redundant but used indexes.

4.       Revue missing index DMV’s for mistaken index drops.

5.       Combine indexes that where it is logical to do so.

6.       Revue missing index DMV’s for mistaken index drops.

This methodology is most effective and viable when these conditions are met:

·         The server has been online and thus collecting stats for a long time.

·         The server is not pushing a hardware bottleneck. If so, this should be done during maint window.

·         The server is enterprise edition and the tables allow online operations (i.e. no LOB data or partitions).

What are the benefits of removing unused indexes?

·         Reduced writes during updates

·         Reduced space usage

·         Reduced backup\restore space and time

·         Reduced index maintenance time

What are the benefits of removing unused indexes, used but redundant indexes and combining indexes?

·         Less memory footprint for the same amount of data

·         A different angle to index tuning

·         Indexes are more likely to be covering

·         Reduced writes during updates

·         Reduced space usage

·         Reduced backup\restore space and time

·         Reduced index maintenance time

Now, let’s get down to business…

 

Remove Unused Indexes

This is pretty straightforward and relatively safe. You want to make sure that your server has been up long enough to get good index usage. This includes infrequent operations like month end reporting, etc.

--Unused indexes

declare @dbid int 

select @dbid = db_id() 

 

select object_name(s.object_id) as ObjName

, i.name as IndName

, i.index_id 

, user_seeks + user_scans + user_lookups  as reads

, user_updates as writes 

, sum(p.rows) as rows

from sys.dm_db_index_usage_stats s join sys.indexes i on s.object_id = i.object_id  and i.index_id = s.index_id 

join sys.partitions p on s.object_id = p.object_id and p.index_id = s.index_id

where objectproperty(s.object_id,'IsUserTable') = 1  and s.index_id > 0 and s.database_id = @dbid 

group by object_name(s.object_id), i.name, i.index_id, user_seeks + user_scans + user_lookups, user_updates

order by reads, writes desc 

 

You can most likely drop any indexes with zero or close to zero reads. The more rows, the more space you will reclaim. The more writes, the better write performance you get by removing them. If the index is not being read but writes are minimal and rows are low, there is little benefit of dropping it. Keep in mind that not having an index that you need is a lot worse than having an index you don’t need in most situations. I suggest starting off with a conservative approach.

Now that we have gotten rid of the low hanging fruit, we can get deeper. I get a list of tables by size and work my way down.

select object_name(object_id), max(rows)

from sys.partitions

group by object_name(object_id)

order by 2 desc

After running that query we see that tEvent is the largest table and should provide nice gains. This is where we start. Let’s use this schema for our example.

create table tEvent

(EventID int primary key clustered,

EventType int,

EventName varchar(100),

EventDetailID int,

CustomerID int,

CompanyID int,

DateOpen datetime,

DateClose datetime)

 

create index ind1 on tEvents(EventDetailID);

 

create index ind2 on tEvents(CustomerID);

 

create index ind3 on tEvents(EventDetailID, CustomerID, CompanyID);

 

create index ind4 on tEvents(CompanyID, DateClose, EventType);

 

create index ind5 on tEvents(CustomerID, DateClose, EventType);

Removing used but redundant indexes

If we do a sp_helpindex, we can see that ind3 should satisfy queries currently using ind1 and ind2. The keyword is “should”. It’s a fairly safe bet so we drop ind1 and ind2. Now we should monitor the missing index DMV to see if there is any negative impact. We should also see the read count substantially increase by using the unused index query filtered by tEvent. Here is the missing index monitoring query:

--Missing indexes

SELECT  sys.objects.name

, (avg_total_user_cost * avg_user_impact) * (user_seeks + user_scans) as Impact

, mid.equality_columns

, mid.inequality_columns

, mid.included_columns

FROM sys.dm_db_missing_index_group_stats AS migs

INNER JOIN sys.dm_db_missing_index_groups AS mig ON migs.group_handle = mig.index_group_handle

INNER JOIN sys.dm_db_missing_index_details AS mid ON mig.index_handle = mid.index_handle

INNER JOIN sys.objects WITH (nolock) ON mid.object_id = sys.objects.object_id

WHERE     (migs.group_handle IN

(

SELECT     TOP (5000) group_handle

FROM sys.dm_db_missing_index_group_stats WITH (nolock)

ORDER BY (avg_total_user_cost * avg_user_impact) * (user_seeks + user_scans) DESC)

)

and objectproperty(sys.objects.object_id, 'isusertable')=1 and name = 'tEvent'

ORDER BY 2 DESC , 3 desc

As long as the indexes do not come back up in this report, we should be ok. Now we are servicing the same queries with 2 less indexes. Those are index pages that are no longer taking up buffer pool space!

Combining indexes

The same concept can be applied to the following scenario but a little more “feel” and understanding of how the app accesses the data is needed.

create index ind4 on tEvents(CompanyID, DateClose, EventType);

 

create index ind5 on tEvents(CustomerID, DateClose, EventType);

We know a former DBA added these. It is logical for us, based on our knowledge of the app and the cardinality of the data, to try to replace these indexes with this one.

create index ind6 on tEvents(CustomerID, CompanyID,  DateClose, EventType) with (online=on, maxdop=8);

This index should satisfy all queries using both indexes. Again, we want go back to the missing index report to see if SQL thinks it needs one of those indexes.

Now work your way down your list of tables by row count. I was able to reduce the size of a 200GB database by 15% in addition to removing the unused indexes. That’s a big gain! The only problem I ran into was changing the name of an index that had a hint. There were a few indexes that I had to add back but since my approach was conservative it was nothing drastic and completely online.

Let me know if you have any other tips or questions regarding this topic.