Archive for October, 2008

Want a promotion? Find yourself a junior DBA

Friday, October 31st, 2008

If you want to get ahead in your career, and if you want to move up in the same company you’re at, somebody is going to have to do your job.  You can’t get promoted if you’re irreplaceable.  (And frankly, if you think you’re irreplaceable, your boss is probably already planning to replace you.)

Look around your office and find somebody who would love to have your job. If you’re a DBA, maybe it’s a developer who’s tired of learning new languages every 3-4 years and would like to stick with one (T-SQL) for a while.  Maybe it’s a network admin who’s tired of fixing desktops and servers and wants to make more money.  But either way, make sure it’s someone that you enjoy working with, because you’ll be training them for a while.

Approach them and start by saying, “Have you thought about becoming a DBA?  I’ve seen you in action, and I think you’d make a good DBA.  One of the toughest things about becoming a database administrator is getting your first job as a DBA, though, and I’d like to help you by getting you started with training.  One or two days a week, I’ll show you what I’m doing, and you can take over some of the basic tasks over time if you want.”

If they clap their hands and offer to wash your car, they’re your new junior DBA.  Even if YOU are a junior DBA, there’s somebody who wants this training, and they’ll be thankful for the opportunity.

If you’re paranoid and you’re worried that they’re going to steal your job, relax: we’re not going to give them the fun parts.  Give them the basic, day-to-day keep-the-lights-on tasks that you don’t particularly enjoy – things that won’t get you promoted.  Knowing how to check all of the backups isn’t going to make you famous, and every hour that you free up out of your schedule means an hour you can spend learning the latest & greatest tools and tricks.

Training this person will mean extra time initially on your part, so focus on training them on things that will pay off for you.  You don’t want to train them on something that you only do once a month, because it won’t free up your schedule and it won’t be useful to them either.  Train them on things you do all the time, like:

  • Interpreting sp_who and sp_who2 to find out who’s blocking queries
  • Reading query plans to improve a query
  • Indexing tables or removing unnecessary indexes
  • Backing up and restoring databases
  • Installing the client tools

Or whatever else you find yourself doing more than once a week.

If this concept sounds crazy to you, ask yourself: how many times has your boss come up to you lately and said, “You’re doing a great job.  How would you like to take on this cool new tool we got in?”  Or is he handing you the crappy stuff that nobody wants?  If you’re getting the short end of the stick, or if you’re never getting to play with cool toys, it’s probably because your manager sees you as too overwhelmed with your current responsibilities – and only you can fix that.

5 Quick Tips for the Query using the “Wrong” Index

Thursday, October 30th, 2008

I quote “wrong” because with the info the optimizer has, it thinks it has the right index. :)

1. Look for cardinality problems. You can do this by looking at the actual rows versus estimated rows in the query plan. If they are off, chances are you are not following the best practices listed here.

2. Conversions in the WHERE clause can cause issues. Be sure to look at query plan for implicit conversions

3. Complicated logic in the WHERE clause can also cause problems. Consider indexed persisted computed columns or indexed views.

4. Constraints give the optimizer more info when choosing a plan with the “right” index. It can take a PK or unique constraint into consideration but it does not know about similar logic in a trigger.

5. Large datasets are being returned but the index is not covering. An index with just the columns in the WHERE clause is ok if just a few rows are returned but it does not take much for an index scan to be more efficient.

#1 is the most important one because the link gives 5 or 7 more reason why the “wrong” index is chosen.

P.S. If all else fails, update stats and if that does not work, make the index covering.

This Update Came from SQL Server

Wednesday, October 29th, 2008

Disclaimer:

Just because you can does not mean you should and even if you should, use with care. For example, you will have no friends after a day or so of updating your facebook status every time a tlog backup runs on your farm. There are also many security concerns to consider. That said, I can think of a few uses for SQL Server Web 2.0 mashups and I am not all that creative.

How it works

Sign up at http://ping.fm and then sign up at whatever services you want to update from SQL. Here is a list of what they currently support. From SQL, we could send a message to twitter or even better, yammer via database mail. Here is the obligatory, “Don’t send anything confidential over email”

Ping.fm is a simple service that makes updating your social networks a snap!

Possible Uses

Augmenting your monitoring – The keyword is word is “augmenting” except for the smallest installations. For the crown jewel of the company, you might get a call from the 24/7 staff and an email. Why not a tweet too especially if it can add SMS functionality that you may not currently have?

Turn X into an RSS feed – Where X could be, all job executions, any SEV error messages, any trace event or DDL operations, or [your idea here]. You can funnel the RSS into the intranet, your reader or [insert creative idea here].

An insert trigger on your press release table – Blast it to the world. Some companies are already doing similar things http://dell.com/twitter.

Personal updates from the SQL Editor – I don’t know about you but I have spent weeks on end in tsql. Some deadlines don’t allow for time to come up for air. It would pretty cool in a geeky way to tell the world that I just replaced 500 lines of cursor code with a 10 liner or whatever without leaving SSMS.

 

Example Code:

sp_configure 'show advanced options', 1;       
        
GO       
        
RECONFIGURE;       
        
GO       
        
sp_configure 'Database Mail XPs', 1;       
        
GO       
        
RECONFIGURE       
        
GO       
        
EXECUTE msdb.dbo.sysmail_add_account_sp       
        
@account_name = 'Ping.fm demo',       
        
    @description = 'Mail account',       
        
    @email_address = 'jason@statisticsio.com',       
        
    @replyto_address = 'jason@statisticsio.com',       
        
    @display_name = 'Jason Massie',       
        
    @mailserver_name = 'mail.statisticsio.com' ;       
        
-- Create a Database Mail profile       
        
EXECUTE msdb.dbo.sysmail_add_profile_sp       
        
@profile_name = 'MailAccount',       
        
    @description = 'Profile used for administrative mail.' ;       
        
-- Add the account to the profile       
        
EXECUTE msdb.dbo.sysmail_add_profileaccount_sp       
        
@profile_name = 'MailAccount',       
        
    @account_name = 'Ping.fm demo',       
        
    @sequence_number =1 ;       
        
-- Grant access to the profile to all users in the msdb database       
        
EXECUTE msdb.dbo.sysmail_add_principalprofile_sp       
        
@profile_name = 'MailAccount',       
        
    @principal_name = 'public',       
        
    @is_default = 1 ;       

-- here is the magic blasphemy.
EXEC sp_send_dbmail @profile_name ='MailAccount', @recipients = 'YourUniqueEmail@ping.fm, @subject = 'Blogged:', @body = 'This update came from SQL Server http://ping.fm/XiEkj'

I just updated twitter, linkedin, facebook, plurk, rejaw from SQL!

In closing

One more time, pay attention to security. There are a few commercial products on the horizon. http://ping.fm and most of these service also have API’s that may be a better alternative than database mail. Finally, database mail was meant for administrative alerts and I suggest you stick with that mindset..

32bit SQL Server Gone Too?

Tuesday, October 28th, 2008

They drew the line in the sand at the OS.

First and foremost, 32-bit is done. History. Archives. Windows Server 2008 R2 is the first Windows OS platform to go 64-bit only, and frankly it was high time. Customers have been unable to purchase a 32-bit server CPU for over two years now, and the advancements in CPU architectures really dictated that we squeeze as much performance out of customers’ hardware purchases as possible. The move to 64-bit is a first step.

Read the full post here.

I assume they *could* still support SQL11 running in wow64. We will see. I gave my opinion, as did others, in this post a while back.

Windows 2008 R2 will be 64-bit only

Tuesday, October 28th, 2008

From the horse’s mouth at at the Windows Server Division WebLog:

“First and foremost, 32-bit is done. History. Archives. Windows Server 2008 R2 is the first Windows OS platform to go 64-bit only, and frankly it was high time. Customers have been unable to purchase a 32-bit server CPU for over two years now, and the advancements in CPU architectures really dictated that we squeeze as much performance out of customers’ hardware purchases as possible. The move to 64-bit is a first step.”

SQL Server database administrators everywhere are probably clapping with delight, but that clapping may slow down a little if you use virtual machines on your desktop computer for testing or development.  If you’d like to run Windows 2008 R2 as a virtual server guest, that means you need to be running a 64-bit host – which means your laptop or desktop will need to be running a 64-bit version of Windows.

Update: fellow Questie David Gugick points out that you can still run 64-bit guests on 32-bit host OS’s as long as your underlying hardware is 64-bit compliant.

merry-go-round scans

Tuesday, October 28th, 2008

An often overlooked feature of SQL Server Enterprise Edition is support for advanced scanning, referred to as merry-go-round scans.  Often I'm asked about the differences between enteprise edition and standard, this is one that I often forget about this one.  Recently I was researching some information on read-consistency problems ( nolock etc) and came across this type of scan.  There are many resources about the differences, i've quoted some here below, and also list the reference.

From:

http://blogs.msdn.com/boduff/archive/2008/01/24/why-should-i-use-sql-enterprise-edition.aspx

http://www.microsoft.com/Sqlserver/2005/en/us/compare-features.aspx

http://www.sqlmag.com/Articles/Print.cfm?ArticleID=49285


There are some key enterprise edition only performance benefits across RAM, Parallelism, Query Plans and DISK I/O that will lead to better performance on high end systems, which I will try to list here.

1) Lock Pages in Memory

Lock Pages In Memory" allows SQL Server 2005 to manage its own memory (as opposed to having the operating system do it). It is extremely important to give this right to the SQL Server Service account, especially on 64-bit SQL Server 2005 systems with lots of RAM. It also is required on 32-bit systems to enable AWE.

See http://blogs.msdn.com/psssql/archive/2007/10/18/do-i-have-to-assign-the-lock-privilege-for-local-system.aspx

2) Advanced Scanning (aka Merry-go-round scan)

In SQL Server Enterprise Edition, the advanced scan feature allows multiple tasks to share full table scans. If the execution plan of a Transact-SQL statement requires a scan of the data pages in a table and the Database Engine detects that the table is already being scanned for another execution plan, the Database Engine joins the second scan to the first, at the current location of the second scan. The Database Engine reads each page one time and passes the rows from each page to both execution plans. This continues until the end of the table is reached.

See http://msdn2.microsoft.com/en-us/library/ms191475.aspx and  Merry-Go-Round Culprits for performance variances

3) Larger Read Ahead Buffering on Storage Devices

Determining I/O section mentions that EE does up to 1024k read ahead buffering on a Storage Area Network (std only does 64k). This indicates that EE is more suitable to SAN’s which need more buffering due to increased latency.

http://www.microsoft.com/technet/prodtechnol/sql/bestpractice/pdpliobp.mspx

4) Large Page Extensions

SQL Enterprise Edition retrieves pages up to eight at a go.

http://msdn2.microsoft.com/en-us/library/aa337525.aspx

5) Parallel index Operations

This is particularly useful in data warehouses where indexes may be frequently dropped and re-created.

http://msdn2.microsoft.com/en-us/library/ms189329.aspx


Using Microsoft Operations Manager with SQL Server

Tuesday, October 28th, 2008

Microsoft Operations Manager (formerly MOM, now SCOM) has gained some traction in Windows shops for monitoring servers. I like it a lot for Windows, especially IIS and Sharepoint, but out of the box, it’s not that useful for DBAs. The default set of SQL Server alerts throw a lot of false alarms on things I can’t change, and it ignores the things I really want to know.

I haven’t seen a good howto article that helps with SQL Server monitoring with MOM – until now. Tom LaRock, who moonlights as SQLServerPedia’s Monitoring Editor when he’s not fighting crime, wrote an in-depth article for Simple Talk:

Operations Manager: A Big Tinker Set

Reporting Services-generated MDX, Subselects and Formula Caching

Tuesday, October 28th, 2008

Analysis Services can, of course, cache the results of MDX calculations and share that cache with other users most of the time... but when it can't the result is much unnecessary load on your server as calculations are recomputed again and again, and much longer query times. Unfortunately the details on what caching AS can do when are not properly documented anywhere and I've only learned the bits and pieces that I know about this subject from my own experience and from talking to the likes of Mosha; I've avoided blogging about it because my knowledge is incomplete and this is a complex topic. Happily the upcoming AS2008 Performance Guide should rectify this lack of proper documentation, and I believe Mosha is also going to address this subject in his pre-conf seminar at PASS.

However, there is one specific scenario that I thought I'd write about because I come across it regularly and it affects anyone using a client that generates MDX with subselects in, such as the Reporting Services query builder. Let's say you've got an expensive calculated member on your cube - for example, add the following (deliberately slow) calculated measure to the Adventure Works cube:

CREATE MEMBER CURRENTCUBE.MEASURES.EXPENSIVECALC AS
COUNT(bottomcount(
{[Scenario].[Scenario].&[1], [Scenario].[Scenario].&[2]}
*
[Account].[Account].[Account].members
*
[Date].[Date].[Date].members
*
[Department].[Departments].[Department Level 02].members
, 10,[Measures].[Amount])
);

In Reporting Services you might then use the query builder to create a query that looked like this to select just Calendar Year 2004 on rows and this new calculated measure on columns:

SELECT
NON EMPTY { [Measures].[EXPENSIVECALC] } ON COLUMNS,
NON EMPTY { ([Date].[Calendar Year].[Calendar Year].ALLMEMBERS ) }
DIMENSION PROPERTIES MEMBER_CAPTION, MEMBER_UNIQUE_NAME
ON ROWS
FROM (
SELECT (
{ [Date].[Calendar Year].&[2004] } )
ON COLUMNS
FROM [Adventure Works])
CELL PROPERTIES VALUE, BACK_COLOR, FORE_COLOR, FORMATTED_VALUE,
FORMAT_STRING, FONT_NAME, FONT_SIZE, FONT_FLAGS

On a cold cache, on my laptop, this query takes 33 seconds; on a warm cache it still takes 33 seconds. This is not good - practically all of the query time is taken up evaluating the calculation, and this calculation is clearly not being cached. The reason why is because the query contains a subselect, and when a query contains a subselect it means that the results of any calculations can only be cached for the lifetime of the query. You can easily rewrite the query as follows to avoid the subselect:

SELECT
{ [Measures].[EXPENSIVECALC] } ON COLUMNS,
NON EMPTY { [Date].[Calendar Year].&[2004] }
DIMENSION PROPERTIES MEMBER_CAPTION, MEMBER_UNIQUE_NAME
ON ROWS
FROM [Adventure Works]
CELL PROPERTIES VALUE, BACK_COLOR, FORE_COLOR, FORMATTED_VALUE,
FORMAT_STRING, FONT_NAME, FONT_SIZE, FONT_FLAGS

On a cold cache this query still takes 33 seconds, but on a warm cache it comes back in 1 second because the calculation can now be cached and this cache can be reused for subsequent queries. Clearly this is much better.

I wonder how many SSRS implementations are suffering from needlessly poor performance just because of this? Subselects are a useful addition to MDX and make calculating 'visual totals' type values very easy, but they should be used with care and not just as a means of selecting individual members.

SQLServerPedia Wiki

Tuesday, October 28th, 2008

Yesterday marked the soft launch of a new technical resource, the SQLServerPedia Wiki. The official launch will happen during the PASS conference next month, but I wanted to help spread the word now. Why? Good question. In addition to my crime fighting duties, I also happen to be the Monitoring Editor for the wiki.  Brent asked me a few weeks ago if I would be interested and I said “no”. He then said “please” and I said “hell no”. He then said he would buy me a beer at PASS next month and I said “fine”.

True story.

So, yes, I have an interest in promoting the wiki. I am very excited to be able to contribute to the community in a way that I feel is most appropriate for today’s technical learning. I think many of us find ourselves using wikis and blogs in order to piece together information when researching issues. Getting in on the ground floor for the best SQL wiki is a wonderful opportunity.

Of course, we will go nowhere without getting people to contribute. I know that not everyone wants to be a rock star, not everyone wants to write or contribute their ideas. But unless we can find a way to encourage others to contribute, then this wiki will be nothing more than the ramblings of a dozen Jagermeister infused MS SQL database administrators. And we need it to be more than that. More importantly, we want it to be more than that.

So, contribute. Please. Even if it is just a few lines. Or if it is just a comment you leave for me on this blog, or in an email that says “hey, you forgot this, so open your eyes and add this bit of knowledge”. We need that content in order to get this wiki to where we all think it belongs.

Reporting Services-generated MDX, Subselects and Formula Caching

Tuesday, October 28th, 2008

Analysis Services can, of course, cache the results of MDX calculations and share that cache with other users most of the time... but when it can't the result is much unnecessary load on your server as calculations are recomputed again and again, and much longer query times. Unfortunately the details on what caching AS can do when are not properly documented anywhere and I've only learned the bits and pieces that I know about this subject from my own experience and from talking to the likes of Mosha; I've avoided blogging about it because my knowledge is incomplete and this is a complex topic. Happily the upcoming AS2008 Performance Guide should rectify this lack of proper documentation, and I believe Mosha is also going to address this subject in his pre-conf seminar at PASS.

However, there is one specific scenario that I thought I'd write about because I come across it regularly and it affects anyone using a client that generates MDX with subselects in, such as the Reporting Services query builder. Let's say you've got an expensive calculated member on your cube - for example, add the following (deliberately slow) calculated measure to the Adventure Works cube:

CREATE MEMBER CURRENTCUBE.MEASURES.EXPENSIVECALC AS
COUNT(bottomcount(
{[Scenario].[Scenario].&[1], [Scenario].[Scenario].&[2]}
*
[Account].[Account].[Account].members
*
[Date].[Date].[Date].members
*
[Department].[Departments].[Department Level 02].members
, 10,[Measures].[Amount])
);

In Reporting Services you might then use the query builder to create a query that looked like this to select just Calendar Year 2004 on rows and this new calculated measure on columns:

SELECT
NON EMPTY { [Measures].[EXPENSIVECALC] } ON COLUMNS,
NON EMPTY { ([Date].[Calendar Year].[Calendar Year].ALLMEMBERS ) }
DIMENSION PROPERTIES MEMBER_CAPTION, MEMBER_UNIQUE_NAME
ON ROWS
FROM (
SELECT (
{ [Date].[Calendar Year].&[2004] } )
ON COLUMNS
FROM [Adventure Works])
CELL PROPERTIES VALUE, BACK_COLOR, FORE_COLOR, FORMATTED_VALUE,
FORMAT_STRING, FONT_NAME, FONT_SIZE, FONT_FLAGS

On a cold cache, on my laptop, this query takes 33 seconds; on a warm cache it still takes 33 seconds. This is not good - practically all of the query time is taken up evaluating the calculation, and this calculation is clearly not being cached. The reason why is because the query contains a subselect, and when a query contains a subselect it means that the results of any calculations can only be cached for the lifetime of the query. You can easily rewrite the query as follows to avoid the subselect:

SELECT
{ [Measures].[EXPENSIVECALC] } ON COLUMNS,
NON EMPTY { [Date].[Calendar Year].&[2004] }
DIMENSION PROPERTIES MEMBER_CAPTION, MEMBER_UNIQUE_NAME
ON ROWS
FROM [Adventure Works]
CELL PROPERTIES VALUE, BACK_COLOR, FORE_COLOR, FORMATTED_VALUE,
FORMAT_STRING, FONT_NAME, FONT_SIZE, FONT_FLAGS

On a cold cache this query still takes 33 seconds, but on a warm cache it comes back in 1 second because the calculation can now be cached and this cache can be reused for subsequent queries. Clearly this is much better.

I wonder how many SSRS implementations are suffering from needlessly poor performance just because of this? Subselects are a useful addition to MDX and make calculating 'visual totals' type values very easy, but they should be used with care and not just as a means of selecting individual members.