Archive for December, 2007

Mery Christmas and Happy Holidays from SQLServerPedia

Monday, December 24th, 2007

Dear SQL Server community – have a great holiday (and few days off!).  We’ll be back with more SQL Server database knowledge and answers when you get back!

Cheers, the SQLServerPedia staff!

We are Microsoft

Thursday, December 20th, 2007

There is a charity coding event coming up here in Dallas. I have volunteered. I recommend you do so as well if you are local. They need .net developers, DBA's and graphic designers.

http://wearemicrosoft.com

"

Are you a .NET developer, database administartor or graphics designer? Do you want to spend 3 fun-filled days competing with your peers to build the best application?

"

SQLDumper.exe

Thursday, December 20th, 2007

You might recall my recent posting about getting minidumps from SQL Server.  Even if you’ve never looked for dumps from SQL Server, you should get familiar with SQLDumper.exe.  SQLDumper is an internally called process that can produce a stack dump in mini, full, and filtered formats.  SQL Server has a default approach to how it does dumps, but you can alter this default behavior by starting SQL Server using trace flags ranging from 2540 to 2559.  (You should do so only under the advice of PSS, btw.)

 Although SQLDumper can crank out a stack dump from any application, I recommend you stick with SQL Server.  Also, don’t plan on using SQLDumper as general purpose dumping utility.  Check out Microsoft’s recommendation for general purpose debugging at http://www.microsoft.com/whdc/devtools/debugging/default.mspx. However, you can learn more about how and when to use SQLDumper with these resources: 

·         http://support.microsoft.com/kb/917825 – How to use SQLDumper to generate dump files in SQL Server 2005

·         http://support.microsoft.com/kb/827690 – How to use SQLDumper to generate dump files for Windows Applications

 

That doesn’t mean that its easy to get SQLDumper working for you.  About the only way I’ve seen that you can get really good with SQLDumper is to have the unfortunate occurrence of needing it – i.e. lots of crashes.

 

If you have any experience with working with SQLDumper, I’d love to hear your feedback here.

 

Best regards,

 

-Kevin

 

My manager asked me to remove unused indexes in our OLTP database. I have two questions, is this really important and how can I tell?

Tuesday, December 18th, 2007

Excellent questions…

Why remove unused indexes? Because you’re creating extra I/O operations keeping unused indexes around, and if [hopefully] you have maintenance plans ensuring you have up-to-date statistics and that clustered indexes are defragmented, you’re wasting precious time in your maintenance window.
How can you detect unused indexes? Well if you’re using SQL Server 2005, you can use the [sys].[dm_db_index_usage_stats] DMV.

The following query will return the indexes defined in a given database, sorted by the most frequently updated index to the least, and will highlight indexes that haven’t been used since the instance was last brought online:

select OBJECT_NAME(idx.[object_id]) "base table"
,CASE WHEN ist.[object_id] IS NULL THEN '!! UNUSED INDEX: ' ELSE '' END+ISNULL([name],'HEAP') "index name"
,idx.[index_id],[type_desc],[is_primary_key]
,ISNULL([user_updates],0) "user updates"
,ISNULL([user_seeks],0) "user seeks"
,ISNULL([user_scans],0) "user scans"
,ISNULL([user_lookups],0) "user lookups"
from [sys].[indexes] idx
left join [sys].[dm_db_index_usage_stats] ist
on idx.[object_id]=ist.[object_id]
and idx.[index_id]=ist.[index_id]
and ist.[database_id]=db_id()
order by [user_updates] desc, [user_seeks], [user_scans], [user_lookups]

Clearly, the longer your instance has been online, the more confident you can be that the unused indexes specified actually are not being used. Also, it’s imperative that you always remain aware that (for example) month-end, quarterly, semi-annual, etc jobs that might rely on some less frequently used indexes. Still, you don’t necessarily have to drop an unused index, especially if it’s been defined in a small table or if you’re not noticing slowdowns in application processing…but you might consider changing your maintenance jobs to focus on the most heavily-used indexes first.

My transaction log backups are taking longer and longer to complete. What can be causing this?

Tuesday, December 18th, 2007

I would bet you’re running into an issue where too many virtual log files (VLFs) have been created for your database. Each physical SQL Server log file is internally divided into a number of VLFs. VLFs have no fixed size, and there is no fixed number of VLFs for a physical log file. SQL Server’s Database Engine chooses the size of the virtual log files dynamically while it is creating or extending log files, but information is available on how this works. VLFs are added as follows when the transaction log file grows:
< 64MB = 4 VLFs
64MB – 1GB = 8 VLFs
> 1GB = 16 VLFs

To understand, in more detail, transaction log architecture read: MSDN Virtual Log Files (SQL Server 2000) or MSDN Transaction Log Physical Architecture (SQL Server 2005)

To see how many VLFs exist in your database issue the [undocumented] DBCC LOGINFO command. The number of rows returned is equivalent to the number of VLFs in your database. The Status column tells you a VLF is still active if the value is 2. Don’t be alarmed when you see the output of DBCC LOGINFO, you can rectify the situation; if you haven’t noticed slowdowns you should still be aware of how your logs are configured and how many VLFs exist in your database. VLFs affect system performance when log files are defined with a small initial size and use small growth_increment values. Too many VLFs can slow down database startup and also log backup and restore operations. To see how following some [egad] proven best practices to ensure good transaction log throughput read: Kimberly Tripp’s 8 Steps to better Transaction Log throughput

In your case you’ll likely have to:
Shrink your transaction log file. In SQL Server 2000 many DBAs probably know that multiple BACKUP LOG or DBCC SHRINKFILE commands were necessary to shrink a transaction log. To understand why and to see how to use DBCC SHRINKFILE in SQL Server 2005 for this condition read: KB Article How to use the DBCC SHRINKFILE statement to shrink the transaction log file in SQL Server 2005
…then address the configured size of your transaction log file. I say file, because there’s little to no benefit to creating multiple transaction log files (read Kimberly Tripp’s 8 Steps article above…). You want to attempt to adequately size the transaction log and allow for an auto-growth rate (to ensure you don’t get transaction log full errors) that will ensure very infrequent growth.  You also want to ensure you understand enough about your workload to ensure log truncation operations don’t burden the system (you don’t want the log truncating every second).

To read a practical case, with examples, of how small increment values and auto-shrink can be detrimental to database performance read: Tibor Karaszi’s Blog on why to be restrictive with DB file shrinking

I’m trying to come up with a database backup strategy, do I need to do more planning than to decide when to take full/differential backups?

Monday, December 17th, 2007

The goal of a good backup / recovery strategy is consistent data availability in the event of a corruption or disaster. So, while it’s helpful to determine how frequently you need full and differential backups, you definitely need to consider other factors. Database backups are only helpful if they can be restored. Also, where and how you store backups is important. What happens if your production database server goes down, or what if the hosting facility loses power in a flood or earthquake? Sure, not all scenarios apply to everyone, but storage, redundancy and consistency checks are all crucial in ensuring a successful backup / recovery strategy.

Disaster recovery strategies are outside the scope of this post, however leveraging the following SQL Server features can help strengthen your backup / recovery strategy and avoid integrity problems when you least expect them while maintaining data availability:

DBCC CHECKDB
You’ll almost certainly want to use DBCC CHECKDB to find corruptions before they become a problem. Paul Randal wrote CHECKDB for SQL Server 2005, and frequently posts in-depth, yet concise information on SQL Server internals and architecture on his blog. His series entitled CHECKDB From Every Angle tackles many of the topics you’ll need to consider and dispels the myth that you don’t have time for consistency checking.

BACKUP DATABASE…COPY_ONLY
If you’re performance tuning your production database or you have developers that need a copy of production data this option might help. You can create a full database backup without affecting the differential bitmap in the database. Consider this scenario: you perform full database backups every Monday, Wednesday, and Friday, and perform differential backups in between. Your development staff needs a full backup on Tuesday, but you know that doing so will affect the differential bitmap of your database. Enter the COPY_ONLY option. Using Copy-Only Backups lets you keep the developers happy and keep your backup strategy sound.

CREATE DATABASE…AS SNAPSHOT
Auditing production data is an industry in itself, but many shops rely on fast, ad-hoc reporting to satisfy management requests. These same shops often cannot afford to risk allowing ad-hoc statement executions for fear of data being accidentally modified. In these situations, database snapshots offer a good solution. A snapshot is a complete, read-only copy of a database that can be created extremely quickly and that consumes very little space on disk. When data is read from the snapshot, it is actually being read from the primary database. As data changes in your primary database, SQL Server writes the previous data to a sparse file. The snapshot will then read the sparse file for the data that has been changed and continue to read the primary database for data that has not changed.

One final note – Quest’s Litespeed for SQL Server product’s low impact, high-performance compression technology can reduce storage costs (data can be compressed up to 95%) and SQL backup and recovery windows while maintaining complete control over the backup and recovery process. It even supports object-level recovery!! It’s definitely worth a look.

I keep reading “Best Practice” information on database design. The theories are great and all, but are they really relevant?

Monday, December 17th, 2007

Best practices are absolutely relevant; they’re not gospel or absolutes, but they should be part of every DBA’s repertoire to help make appropriate, informed decisions. I think of best practices like my mom’s voice in the back of my mind whenever I go to grab a pot on the stove, “Use a towel, that’s probably going to burn you.” My neurosis aside, ignoring best practices can do just that – leave you burned.

Let’s look at two very basic “Best Practices”:

  1. Disabling Auto-shrink
  2. Fully-qualified Table Names

Disabling Auto-shrink
Auto-shrink can lead to serious fragmentation problems and resource contention at inopportune moments. The problem is auto-shrink works really well…at shrinking files. But shrinking files doesn’t mean shrinking seek times, and the “automatic” nature of the feature means it can fire during peaks in production workload processing. What auto-shrink is acutally doing is moving data pages, beginning from the last page, to the first available free slot available. In a previous post I mentioned that DBAs need to develop a plan for dealing with physical and logical fragmentation to help maintain query performance. Diskeeper or T-SQL can be used to deal with physical fragmentation, while logical fragmentation can be handled using DBCC DBREINDEX or ALTER INDEX … REBUILD. The thing is, even with a good maintenance plan, leaving auto-shrink enabled means that the feature could fire during or immediately after your maintenance task, effectively negating the defragmentation effort. Here’s an example of how enabling auto-shrink can take a new index from 0% fragmentation to 100% fragmentation in a single run. The example shows how an entire index is reorganized in opposite order (100% fragmented). Ouch.

Fully-qualified Table Names
It all comes down to plan reuse. Microsoft has published two excellent papers on this topic, the first is Troubleshooting Performance Problems in SQL Server 2005 and the second is Batch Compilation, Recompilation, and Plan Caching Issues in SQL Server 2005. The Cliff’s Notes summary is simply that plan reuse is jeopardized when objects are referenced without their schema prefix. I said that best practices are a guideline, so if you really don’t want to implement fully-qualified object names or you’re doubtful of the impact, look at the uid (user ID) column in sys.syscacheobjects. Only query plans with the same user ID can be reused. So, when the value for uid is -2, the query does not depend on implicit name resolution and can be shared among different user IDs. Of course, you could bypass the extra checking and just fully-qualify the object names (pick the pot up with a towel).

Does a database backup/restore update statistics?

Thursday, December 13th, 2007

Original question: Does a full backup and restore of a database rebuild table indexes and update statistics? We were seeing some slow query response times for a query running on a production server compared to the test server. We ran profiler trace, looked at execution plans and dbcc showcontig between the two databases and everything is pretty much the same except the same query in production was over 10 times slower than the one ran in the test environment with the same hardware. So in a rather desperate measure we did a full backup and then restore of the production database and now the same query is running about the same time as the test server. Hence my question about table indexes and statistics?

To answer the first part of the question, nothing is updated with a database restore; the database backup saves the current database as an as-is image and the restore restores that image. If you had out-of-date statistics before the backup, you’ll still have them afterwards.

With the information provided, it would appear that your backup/restore operations fixed a physical file (extent) fragmentation problem (see a previous post on fragmentation in SQL Server). Assuming your testing environment was created from a production database backup, you wouldn’t have had the physical fragmentation to contend with in that environment because the extents would have been restored to the file system in physical order.

You’re going to have to create a plan to deal with physical file fragmentation in your production environment. In addition, index maintenance should be planned to deal with logical fragmentation. Microsoft SQL Server 2000 Index Defragmentation Best Practices is a very worthwhile article to read on the topic. Your best bet will be to either run DBCC SHOWCONTIG, or used the information in sys.dm_db_index_physical_stats to determine the extent of logical / extent fragmentation.

One more topic of note. The SQL Server Storage Engine Blog ran a great series on fragmentation explaining SQL Server structures that’s worth a read. Gaining an understanding of these structures will go a long way in better understanding what fragmentation is and how it can affect the various structures in SQL Server.

SQL Connections Spring 2008 Orlando

Wednesday, December 12th, 2007

I am speaking SQL Connections in Orlando this April. The session is entitled "Augmenting the DBA toolbox with SSRS" It will be similar to my pass session except it is all about Reporting Services and it will be 100% SQL 2008. However, most reports will be 2005 compatible with minor changes. It will totally be about working smarter not harder. I will have a ton of tools that you can integrate into your environment and I will show you how to put your existing tools to work.

I am really excited about this. I missed this year but I went in 2006 and 2005. For more infomation, check out http://sqlconnections.com You can also read the co-chair blogs here and here.

Beware of CPU bottlenecks on SQL Server on VMWare ESX Server 3.0

Wednesday, December 12th, 2007

I have worked on two applications that were completely virtualized both on the front end and on the data tier. This post will cover my observations and experience.  Both applications were similar. Both were newly developed ASP.Net 2.0 web applications with a SQL2005 x86 standard edition backend.  They went with a virtual environment to avoid CAPEX. They both had 2 VPU, 2GB of RAM and a good HP EVA SAN. The databases were new so they were relatively small. The first one was tuning during a pre-production automated load test. The second was tuning a production work load.

I treated it as a normal tuning gig. I captured a perfmon log to determine which, if any, hardware subsystems were the bottleneck. They both turned up no disk or memory bottleneck. CPU averaged 70-80%. However, slow response times at the front end were reported. Upon further investigation, there was a CPU queue length, SOS_SCHEDULER_YIELD wait types, and runnable tasks averaging five or higher. We definitely had a CPU bottleneck. The only odd thing is we could not hit 100% CPU.

After some research, it appears that the CPU time counter may not be totally accurate. The full details can be found in this white paper.

“CPU usage data collected within virtual machines is not useful for two reasons. First, this data does not always accurately reflect the overhead of virtualization that is incurred by the ESX Server host. Second, because of the way time is kept within virtual machines, the usage data itself may be inaccurate(for details, see “Timekeeping in VMware Virtual Machines” in “References”). For these reasons, we use CPU usage data collected on the ESX Server host in this study. We used the esxtop tool to collect resource utilization statistics for VMware ESX Server. For further details, see “Appendix 2: Data Collection.”

Well, I went on tuning like a normal CPU bottleneck. We made substantial gains mainly through indexing, parameterization and caching at the web tier.  We did nothing different because of virtualization.

Although I have yet to experience it myself, the white paper shows nice gains by using x64 especially if you need a multiprocessor environment so that is another thing to keep in mind. I would love to hear your experience.