Tricks with User Settable Perfmon Counters

July 29th, 2010 by Michael Swart

So there’s this underrated SQL Server feature called User Settable Counters and they help SQL Server developers create our own custom perfmon counters.

You use them by using syntax like the following:

EXEC sp_user_counter1 @newvalue

This updates a counter than can be tracked using windows performance monitor etc…

Trick 1

Unfortunately this is not a counter that can be incremented. You have to know the new value to set. It would be great if we had stored procedures  like sp_increment_counterX so that we could simply increment the existing value of the counter by any given value. Well ask and you shall receive!

USE master
GO

CREATE PROCEDURE sp_increment_counter1
	@VALUE INT
AS

	DECLARE @newValue INT
	SELECT @newValue = @VALUE + cntr_value
	FROM sys.dm_os_performance_counters
	WHERE instance_name = 'User counter 1'
	and [OBJECT_NAME] like '%User Settable%'
	and counter_name = 'Query'

	EXEC sp_user_counter1 @newValue
GO

I like this stored procedure because it’s very light weight:

  • The cpu/IO/memory is negligible.
  • It doesn’t take or hold any significant locks.

Trick 2

If you have a lot of spare time, you can the commands sp_user_counter combined with WAITFOR to make some pretty pictures of your favorite skyline!

Toronto Skyline

Like the Toronto Skyline

Based on this this photo from BriYYZ (at Flickr):

Toronto skyline

Toronto skyline

Finding Key Lookups In Cached Execution Plans

July 29th, 2010 by Kendal Van Dyke

Ask anyone with performance tuning experience to list the most expensive operations in an execution plan and it's a safe bet you'll find key lookups near the top. The good news is they're usually easy to fix - Glenn Berry (Blog | Twitter) blogged about it recently and many others have written about the subject as well. For most people though, it's finding out when and where they're happening that's the tough part.

Background
A key lookup occurs when a nonclustered index is used to satisfy a query but doesn't include all the columns the query is asking for so they have to be retrieved from the corresponding clustered index (or heap if there's no clustered index) that the nonclustered index is based on. It's expensive because it requires fetching additional pages into the buffer which has a double whammy effect: if the pages aren't already in the buffer you have to wait for the disk subsystem to retrieve them and you're filling up the buffer with up to twice the number of data pages you'd need if you could just stick with using the nonclustered index to begin with.

Finding Key Lookups - There's a DMV For That
SQL Server keeps track of how many key lookups occur against every index in sys.dm_db_index_usage_stats (in the user_lookups column). Interesting, but to be useful we need to know what queries are causing the lookups. Enter the DMV sys.dm_exec_query_stats which keeps track of a wealth of information about how many times a query has been executed and the resources (CPU, disk, etc.) its used. Plug the sql_handle and plan_handle columns into the DMFs sys.dm_exec_sql_text and sys.dm_exec_query_plan, respectively, and we get the text and execution plan for the query. Because the execution plan is an XML document we can leverage SQL Server's native XML capabilities to find any key lookup operations that are occurring. Join them all together and - voila! - we can see every cached query that's got a key lookup, the additional columns being retrieved, and the execution plan, ordered by worst offender first - everything we need to know to work on eliminating the key lookups that are draining performance. Happy tuning!

NOTE: As the comments in the header suggest , exercise caution when running this against a production server…executing sys.dm_exec_query_plan can be resource intensive when your server contains a lot of cached plans. You have been warned!

     
/*********************************************************************************************      
Find Key Lookups in Cached Plans v1.00 (2010-07-27)       
(C) 2010, Kendal Van Dyke       
      
Feedback: mailto:kendal.vandyke@gmail.com       
      
License:       
   This query is free to download and use for personal, educational, and internal       
   corporate purposes, provided that this header is preserved. Redistribution or sale       
   of this query, in whole or in part, is prohibited without the author's express       
   written consent.       
   
Note:       
   Exercise caution when running this in production!       
      
   The function sys.dm_exec_query_plan() is resource intensive and can put strain       
   on a server when used to retrieve all cached query plans.       
      
   Consider using TOP in the initial select statement (insert into @plans)       
   to limit the impact of running this query or run during non-peak hours       
*********************************************************************************************/       
DECLARE @plans TABLE      
    (      
      query_text NVARCHAR(MAX) ,      
      o_name SYSNAME ,      
      execution_plan XML ,      
      last_execution_time DATETIME ,      
      execution_count BIGINT ,      
      total_worker_time BIGINT ,      
      total_physical_reads BIGINT ,      
      total_logical_reads BIGINT      
    ) ;      
      
DECLARE @lookups TABLE      
    (      
      table_name SYSNAME ,      
      index_name SYSNAME ,      
      index_cols NVARCHAR(MAX)      
    ) ;       
      
WITH    query_stats      
          AS ( SELECT   [sql_handle] ,      
                        [plan_handle] ,      
                        MAX(last_execution_time) AS last_execution_time ,      
                        SUM(execution_count) AS execution_count ,      
                        SUM(total_worker_time) AS total_worker_time ,      
                        SUM(total_physical_reads) AS total_physical_reads ,      
                        SUM(total_logical_reads) AS total_logical_reads      
               FROM     sys.dm_exec_query_stats      
               GROUP BY [sql_handle] ,      
                        [plan_handle]      
             )      
    INSERT  INTO @plans      
            ( query_text ,      
              o_name ,      
              execution_plan ,      
              last_execution_time ,      
              execution_count ,      
              total_worker_time ,      
              total_physical_reads ,      
              total_logical_reads      
            )      
            SELECT /*TOP 50*/      
                    sql_text.[text] ,      
                    CASE WHEN sql_text.objectid IS NOT NULL      
                         THEN ISNULL(OBJECT_NAME(sql_text.objectid,      
                                                 sql_text.[dbid]),      
                                     'Unresolved')      
                         ELSE CAST('Ad-hoc\Prepared' AS SYSNAME)      
                    END ,      
                    query_plan.query_plan ,      
                    query_stats.last_execution_time ,      
                    query_stats.execution_count ,      
                    query_stats.total_worker_time ,      
                    query_stats.total_physical_reads ,      
                    query_stats.total_logical_reads      
            FROM    query_stats      
                    CROSS APPLY sys.dm_exec_sql_text(query_stats.sql_handle)      
                    AS [sql_text]      
                    CROSS APPLY sys.dm_exec_query_plan(query_stats.plan_handle)      
                    AS [query_plan]      
            WHERE   query_plan.query_plan IS NOT NULL ;      
      
      
;WITH XMLNAMESPACES (      
   DEFAULT 'http://schemas.microsoft.com/sqlserver/2004/07/showplan'      
),      
lookups AS (      
   SELECT  DB_ID(REPLACE(REPLACE(keylookups.keylookup.value('(Object/@Database)[1]',      
                                                            'sysname'), '[', ''),      
                         ']', '')) AS [database_id] ,      
           OBJECT_ID(keylookups.keylookup.value('(Object/@Database)[1]',      
                                                'sysname') + '.'      
                     + keylookups.keylookup.value('(Object/@Schema)[1]',      
                                                  'sysname') + '.'      
                     + keylookups.keylookup.value('(Object/@Table)[1]', 'sysname')) AS [object_id] ,      
           keylookups.keylookup.value('(Object/@Database)[1]', 'sysname') AS [database] ,      
           keylookups.keylookup.value('(Object/@Schema)[1]', 'sysname') AS [schema] ,      
           keylookups.keylookup.value('(Object/@Table)[1]', 'sysname') AS [table] ,      
           keylookups.keylookup.value('(Object/@Index)[1]', 'sysname') AS [index] ,      
           REPLACE(keylookups.keylookup.query('      
for $column in DefinedValues/DefinedValue/ColumnReference       
return string($column/@Column)       
').value('.', 'varchar(max)'), ' ', ', ') AS [columns] ,      
           plans.query_text ,      
           plans.o_name,      
           plans.execution_plan ,      
           plans.last_execution_time ,      
           plans.execution_count ,      
           plans.total_worker_time ,      
           plans.total_physical_reads,      
           plans.total_logical_reads      
   FROM    @plans AS [plans]      
           CROSS APPLY execution_plan.nodes('//RelOp/IndexScan[@Lookup="1"]') AS keylookups ( keylookup )      
)       
SELECT  lookups.[database] ,      
        lookups.[schema] ,      
        lookups.[table] ,      
        lookups.[index] ,      
        lookups.[columns] ,      
        index_stats.user_lookups ,      
        index_stats.last_user_lookup ,      
        lookups.execution_count ,      
        lookups.total_worker_time ,      
        lookups.total_physical_reads ,      
        lookups.total_logical_reads,      
        lookups.last_execution_time ,      
       lookups.o_name AS [object_name],      
        lookups.query_text ,      
        lookups.execution_plan      
FROM    lookups      
        INNER JOIN sys.dm_db_index_usage_stats AS [index_stats] ON lookups.database_id = index_stats.database_id      
                                                              AND lookups.[object_id] = index_stats.[object_id]      
WHERE   index_stats.user_lookups > 0      
        AND lookups.[database] NOT IN ('[master]','[model]','[msdb]','[tempdb]')      
ORDER BY lookups.execution_count DESC      
--ORDER BY index_stats.user_lookups DESC      
--ORDER BY lookups.total_logical_reads DESC       

Return of the 24 Hours of PASS

July 29th, 2010 by Thomas LaRock

In case you missed it I wanted to let everyone know that we are planning on doing our third 24 Hours of PASS. I sent out a call for abstracts recently to the people that have been selected to present at the PASS Summit in November (about 70 people in total). I got back about 40 responses to my initial email and 35 expressed interest in participating. I know, I know, 35 is a lot more than 24. After I figure out the schedule and assign slots then we’ll have more than enough people available to serve as a backup in case someone needs to back out at the last possible minute (hey, it happens!)

I targeted the PASS speaker list because we are looking to promote the PASS Summit and thought it was best to ask people if they were interested in promoting their sessions. This can be a great way to drive attendance for those speakers that are doing a pre/post con session. For others it can be a way for then to practice their session one more time before the Summit, or just some general shameless self-promotion.

The biggest change to our format this time is that we are going to spread the event over two days (September 15th and 16th). We are keeping the same start time(s) as the last event, promptly at 12 noon GMT for both days. We’ll run twelve straight sessions, then take a twelve hour respite, and then crank it back up again the next day. Hopefully that will allow for more people to attend all the sessions they want to see.

Yeah, I know…people on the other side of the Earth have a less than optimal viewing time. When we looked at the numbers for the last event we found that an overwhelming majority of the attendees are from the USA and Europe. So we decided to switch up the format a bit in order to make it easier for those attendees to attend more sessions this time around. If you were one of the few from the Asia/Pacific region and are miffed that you will be forced to spend your overnight hours watching the sessions live, feel free to send me an email and we can discuss everything in more detail.

You should be able to begin registering for the event on or around August 11th. Don’t hesitate to ask any questions about the event, I will do my best to answer them.

Video: Disk Tuning and Optimization for SQL Server

July 29th, 2010 by Kevin Kline

This video focuses on accelerated Disk Optimization and improved productivity with SQL Query Tuning.

Video created Dec. 18, 2006.

Database Backup Disk Space Capacity Planning

July 29th, 2010 by Jeremy Carter

Database Backup Disk Space Capacity Planning. Try saying that three times fast!

Thomas LaRock (aka SQLRockstar) just posted an excellent piece on capacity planning. As I was reading this I was thinking about how true it is that for so many people we just do not spend adequate time planning and there are not alot of resources out there to refer to.  At the same time I had just recently worked with a client planning their recovery strategy when the topic of disk space for the backups came up. Basically we needed to figure out how much disk space they would need for their objectives so that the proper sized san luns could be carved out. I had previously written this script to fullfill that need and thought I would share with everyone. I find it very handy for looking for trends in backup sizes as well as planning disk space/retention requirements.

The script is pulling alot of information from the backup history tables in msdb and then formatting it into a nice report. It will display how much the full, diff, and log backups are taking on a daily basis for each database along with totals. I’ve included a small sample of a result set below (with the db name changed to protect the innocent). In the example you can see the results for the summary of All Databases along with one of the databases.

SELECT
	[Database Name], [Start Date], [All Types],
	ISNULL([Full],'') AS [Full],
	ISNULL([Differential],'') AS [Differential],
	ISNULL([Transaction Log],'') AS [Transaction Log]
FROM(
	SELECT
		ISNULL(bu.database_name,'*All Databases')AS [Database Name],
		CASE bu.type
			WHEN 'D' THEN 'Full'
			WHEN 'I' THEN 'Differential'
			WHEN 'L' THEN 'Transaction Log'
			ELSE 'All Types'
		END as [Backup Type],
		ISNULL(convert(char(10), backup_start_date, 120),'All Dates' )as [Start Date],
		CAST((SUM(bu.backup_size/1024/1024)) AS int) AS [Size in MB]
	FROM msdb.dbo.backupset as bu
	WHERE DATEDIFF(d, backup_start_date, GETDATE()) <= 6
	GROUP BY bu.database_name, convert(char(10), backup_start_date, 120),  bu.type
	WITH CUBE
) AS SourceTable
PIVOT(SUM([Size in MB]) FOR [Backup Type] IN ([All Types], [Full], [Differential], [Transaction Log])) AS PivotTable
ORDER BY [Database Name] ASC, convert(char(10), [Start Date], 120) DESC

SQLServerPedia Status Update – 29th July 2010

July 29th, 2010 by IKick

Firstly, welcome to lots of new bloggers, check them out on our Bloggers page. The rate at which new requests to syndicate have come in has certainly surprised and amazed me! Many thanks.

There have been a few changes to the site you may have noticed recently:

Contact Address

We’ve changed ask@sqlserverpedia.com to hello@sqlserverpedia.com, please use this address from here on.

Blog Categories

Brett Epps has added new sections to the blog menu. A lot of you have been writing articles about Professional Development, so whilst these aren’t specific to SQL Server they are certainly relevant to the users of the site. We have always wanted to keep the main blog focussed on SQL Server content so we have separated these posts to a Professional Development blog page. You can post to this page by creating a ‘Professional Development’ category on your blog and putting the posts in both the normal syndication category you use for the feed and the new one. The site will filter as necessary.

For those using Blogger, which doesn’t support categories, just label your relevant posts ‘Professional Development’.

There is a section for Database Design – category name ‘Database Design’, and a section for Podcasts.

We are looking at creating a Wiki section for these articles as well.

We also have a Site Updates section where you can see this and previous SQLServerPedia Status Updates.

If there are other section topics you feel would be relevant please let us know.

Buzz is renamed to Pulse

For those unfamiliar with Pulse, it searches relevant topics on Twitter and Delicious so users can find the current topics under discussion in the community. You can vote for favourite topics (or just your own!) so other users will see them listed first.

Here is a listing of the current search terms we are using. If there are others you feel are relevant please let us know:

 Twitter:

  • “sql server”
  • “sql 2005″
  • “sql 2008″
  • t-sql
  • tsql
  • #sqlhelp
  • “microsoft dba”
  • @sqlserverpedia
  • to:sqlserverpedia
  • #sqlserverpediatraining

 Delicious:

  • sqlserver
  • tsql
  • sql2000
  • sql2005
  • sql2008

  

Update Statistics

Some interesting stats:

  •  SQLServerPedia has 66 bloggers contributing
  •  SQLServerPedia has over 3,400 blog posts containing over two million words of content.
  •  SQLServerPedia Pulse has nearly 100,000 items in its content database.
  •  Monday-Thursday we average 9,000+ visits and had a day last week with 9,904! Lesson here, don’t post on a Friday or the weekend!

  

Wiki

I have updated the TempDB wiki page, please edit as you feel it appropriate. We are really keen for you guys to update the wiki pages. If you prefer send us the blog post around the subject and we will do the editing for the wiki as necessary.

  • Digg
  • del.icio.us
  • DZone
  • DotNetKicks
  • Google Bookmarks
  • Ping.fm
  • Reddit
  • StumbleUpon
  • TwitThis

Running a query against multiple servers at one

July 29th, 2010 by Denny Cherry

So you’ve got a bunch of machines that you want to run a quick query against.  SQL Server 2008’s Management Studio gives you a quick and easy.  Open up the Registered Servers in Management Studio and select a group of servers.  Then right click on the group and click on the “New Query” option.

This will open a new query window where you can run a query against all the servers that are online in that group. In my sample query shown below you’ll see that I ran SELECT @@VERSION against all the servers.  When I ran this 3 of the 6 servers in the group were online so three servers were able to return data.

Now if you look at the messages tab (look down) you’ll see which servers the query ran against, and which servers it failed against.  It also tells you what accounts the query was run using (based on the connection info for each server).

Now you can’t merge data together in a temp table as everything in the query window will be run against each server.  It simply displays the information together.  You can pull a single value like I showed above, or you can query a table.

Now when querying from a tables on multiple servers you’ll need to make sure that the schema for those tables are identical otherwise it will only return the data for the first table that it queries.  An error will be shown on the messages tab telling that you that the schemas don’t match if this is the problem.

Denny

The Dangers of Sub-queries!!!!

July 28th, 2010 by Wes Brown

Ok, now that I have your attention this really should be titled the danger of not qualifying all objects, but that isn’t quite as sensational enough to make you click through to here :)

Imagine if you will, a developer expects ten records to be inserted into a table and the query has been running for almost an hour.  A quick glance and sp_who2 shows that it is using a lot of IO, I mean a lot. The first thing that happens is a command decision from the boss to kill the query. I took a quick peek and found the plan still in the plan cache while we waited for this thing to roll back. As I look at the plan I see something really wrong. It looks like the table receiving the inserts isn’t getting the ten records we thought but 169 million records, every record from the table in the select clause.

I look at the query and on the surface everything looks good. When I hit the check mark it compiles without error. Looking at the plan in detail I notice a little icon that tipped me off on where to look next.

image

Oh that’s bad. Warnings: No join predicate. Ouch. But this is a sub-query, how can I have no join predicate? Lets take a look at an example.

DROP TABLE #t1
DROP TABLE #t2
GO
CREATE TABLE #t1 (
  t1id    INT   NOT NULL,
  t1name  VARCHAR(50)   NULL,
  t1place VARCHAR(50)   NULL)

ALTER TABLE #t1
 ADD   PRIMARY KEY ( t1id )

CREATE TABLE #t2 (
  t2id    INT   NOT NULL,
  t1name  VARCHAR(50)   NULL,
  t1place VARCHAR(50)   NULL)

ALTER TABLE #t2
 ADD   PRIMARY KEY ( t2id )

INSERT INTO #t1
SELECT
  spid,
  loginame,
  hostname
FROM
  MASTER.dbo.sysprocesses
WHERE  hostname <> ''

INSERT INTO #t2
SELECT
  spid,
  loginame,
  hostname
FROM
  MASTER.dbo.sysprocesses
WHERE  hostname <> ''

This gives us two tables with some data. We want to find all the records in #t1 that also exist in #t2 but instead of a join we will use an IN and a sub-query.

SELECT
  t1name,
  t1place
FROM
  #t1
WHERE  t1id IN (SELECT
                 t1id
                FROM
                 #t2)
 

Looks simple enough and will pass the compile test all day long even though t1id doesn’t exist in #t2. Since you can use columns from the top level query in the sub-query this is a perfectly valid piece of T-SQL.

It gives us this plan when we look at it though.

image

And there you have it, a join without an ON clause as far as the optimizer is concerned. By not qualifying all the columns in the sub-query we opened ourselves up to this error. This isn’t the optimizers fault! If we re-write this as it was mean to be:

SELECT
  t1name,
  t1place
FROM
  #t1
WHERE  t1id IN (SELECT
                 t2id
                FROM
                 #t2)
 

We get a plan that is more to our liking.

image

You could also re-write this as a join instead of using the IN clause and would have avoided this problem as well. Scary way to learn that SQL Server will do exactly what you tell it to even if you are wrong!


Who created that backup file?

July 28th, 2010 by Pradeep Adiga

Today my day started with a drive full issue. That too a drive on which the tempdb was residing! Someone had kept some good old backup files on this drive, may be in an effort to clean up another drive.

The question was whodunit? This was a good question. Even though we may not be able to track who copied it there, at least we can know who took the backup.

The approach is even though very simple, but at times it is very useful. Yes, I am referring to RESTORE HEADERONLY…….

Restore Headeronly command gives the header information for a given backup set. The header contains some useful information like the Backup Type, User Name, Server/Database Name, Backup Start & End Date etc. The syntax is something like

RESTORE HEADERONLY FROM DISK = 'C:\GuessWhoDunit.bak'

and the output will be like

In my case, it turned out that the backup was taken using SQL Server Agent and the user name was the service account !

Even though this exercise did not give the expected results, it was good to put in use one of the DBA friendly commands.

The Right Level of “Right”

July 28th, 2010 by Jen McCown

…or, How I Learned to Stop Worrying and Love the Bomb

I had a SQL dev’s dream come to me via email yesterday: a high-urgency, high-profile request for data that was both interesting and doable. I spent a few hours on it last night and this morning, and handed over a beautiful solution that was well appreciated.  The end. 

Courtesy "Today is a good day" on Flickr

Shiny!

Well, not really the end. This task made me think hard about the right level of “right”. Of course there’s more than one wrong way to retrieve data: for example, anything that wastes massive effort or resources, or comes back incomplete or otherwise incorrect.  But there’s a balance to all things… in this case, a balance between “BEST POSSIBLE SOLUTION” and “slapped-together piece of nearly unusable crap “.  We’re too often pushed toward the latter by time constraints and demanding situations, but we can also be lured into unproductivity by the shiny shinyness of the former. 

What the hell am I talking about? I’ll tell you. 

To take the example of my task: It was a request for two spreadsheets representing data from 2010 and 2009…a moderately complicated pivot of (real-life) entities and events, and which participated in which. The way the data was structured, I had a few options: 

1. Best Solution

The absolute best solution (or, one of the best solutions) would be to 

  1. Set up a user-defined table datatype of event keys, and declare an @Event table of that type (to be passed into a stored procedure).
  2. Write the SP with some dynamic SQL to create and populate a #temp table using participant Entity data JOINed to the @Event parameter table, and a final select out. 
  3. Bust out an SSRS report to use that SP, and set all that up in our user interface to  be called on demand, or schedule it to run monthly and publish to a shared folder.

For this particular solution, the dynamic SQL would’ve kicked my butt for too long, and the report would’ve taken (in my opinion) more time than it’s worth. Plus, we have other considerations: the SP should really be flexible enough to produce this report, or allow for a report on a single Entity’s participation in Events; that’s another layer of complexity. Finally, this is a report that won’t run often: maybe twice a year, in my estimation.  That’s an awful lot of time to spend on a setup that the users need NOW NOW NOW. 

2. Crap Solution

Create a manual pivot SELECT statement, based on the events needed. This is a horrible solution: while it’d probably be significantly faster than the Best Solution above, it’s be a huge pain in the butt, and it would have to be done for the 2010 data, and redone for the 2009 data, AND redone any time someone wants to rerun for 2010 data (events are still being added this year, you know).  The simplest solution, even in a quick-fire situation like this, is not always the best solution. 

3. Middle Ground

My solution was a temporary table of event keys, a manually created “pivot” temp table, and a cursor to update it for each event column. THAT’S RIGHT,  A CURSOR!  Cursors shouldn’t be part of your everyday code, absolutely. But sometimes they are beneficial, and have a minimal impact. This was such a time: it simlpified and quickened my development cycle, had a minimal impact (cycling through 114 repetitions, not millions), and isn’t at all likely to be used often (no more than twice a year max). 

Several of you just blew a gasket, because this isn’t something you would have done. I understand, and I’m with you…but there’s right, there’s wrong, and then there’s the right level of right. 

Happy days, 

Jen McCown 

http://www.MidnightDBA.com