Archive for August, 2008

Powershell vs. TSQL

Saturday, August 30th, 2008

posh Dan Jones, from the SQL Server manageability team, posted on "PowerShell vs. T-SQL or Why Did We Add PowerShell Support in SQL2K8" and then Jeffery Snover from the powershell team expanded on this topic. Lots of good info and strong words. To quote, "If you work with Microsoft Server products, you are going to be a PowerShell user."

It has been a a long day so I will not add commentary right now although there were a few WTF moments while reading. With that, I will leave you with 21 Characters.

SQL Server Memory Tracking with Windows Task Manager

Friday, August 29th, 2008

Have you ever looked at the Windows Task Manager for a quick read on SQL Server 2005 memory consumption?
 

You can get good information from the Windows Task Manager about SQL Server memory, but you have to know what to look for. When running with AWE memory enabled on a 32-bit system, you may see an unusually low amount of memory in the Windows Task Manager.  If AWE is being used, you will probably get a much better idea of how much AWE memory is being consumed by checking page file usage.  The page file usage value and the amount of AWE memory will correspond.
You can then cross-check this value against sys.dm_os_sys_info.bpool_committed for the exact value.
(Thanks to Geoff N. Hiten, the “SQL Craftsman”, and Erland Sommarskog for the tip.)

Is there a way to find out from system tables when the store procedure was changed last?

Friday, August 29th, 2008

Absolutely. You can use the sys.objects view to determine the create_date and modify_date for an object. The query would be:


select [name],[create_date],[modify_date]
from [sys].[objects]
where [type]=N'P' and [is_ms_shipped]=0

To see the full definition of the procedure you would run:


select [name],[create_date],[modify_date],[definition]
from [sys].[objects] obj
join [sys].[sql_modules] mod
on obj.[object_id]=mod.[object_id]
where [type]=N'P' and [is_ms_shipped]=0

We are trying to implement log shipping to run hourly. If our nightly full backup takes 1 hour and 45 minutes, what is our recovery process? Is it a bad thing to have a transactional backup in the middle of a full backup?

Friday, August 29th, 2008

Yes, this is bad because it can lead to your log shipped database getting out of sync and will probably result in Error 4305; error 4305 states that the log in a backup set is too late to apply, which means your log shipping jobs will fail until you can synchronize the databases. You should schedule your transaction log backup job to stop before your full backup job begins and then to restart once it has completed. See the Microsoft TechNet article on setting up Log Shipping.

When you perform a a full database backup SQL Server stores the ending log sequence number (LSN), which becomes the starting LSN for the next transaction log backup. So, if you are performing a transaction log backup while your full database backup is executing, and that transaction log backup is successful, once it is shipped and applied to the subscriber it will have a different starting LSN than the ending LSN of the full backup. This is not allowed and SQL Server will throw error 4305.

SQL Server Videos

Friday, August 29th, 2008

I ran across SQL Server videos on .netkicks (RSS). I think this is a fairly new site and I must say it is pretty nice. Good clean layout with RSS feeds to let you know when new stuff arrives. You can stream with WMP or silverlight AND you can download in 3 different resolutions plus ipod format. On top of all of that, there is a transcript of the video.

I watched the 400 level video on extra wide covering indexes. It was well thought out and I learned something new. What more could you want? Oh yah, it is free!

Update: You can also follow Michael on twitter here.

Consolidation – Calculating Combined CPU

Friday, August 29th, 2008

I got a consolidation query from Andrey.  He asked:

“I have a question related to SQL Server consolidation. You know that the physical memory on any machine is limited and SQL Server tends to consume all available memory. Let’s say I am consolidation two SQL Servers. The first one consumes 1GB of memory and the second one is consuming 1GB of memory. I tempt to say that we need 2GB when consolidating. BUT what if I have only 1.5GB available on the target server? Will the performance suffer if I consolidate two servers and make then only 1.5 GB available? Do you have any ideas on how to estimate memory requirements in this case? The same applies to CPU and IO as well.”

That question comes up all the time when we talk about consolidation.  First, let’s talk about memory since that one’s the easiest to measure.

Memory is so cheap now (under $500 for 8gb of server-quality memory) that when people consolidate, they should throw a lot of memory into the server.  You’re probably consolidating old servers that had less memory, so it’s easy to take 4 servers with 4gb ram each (16gb total) and consolidate them into one server with 32gb of ram.  When you’re saving tens of thousands of dollars on SQL Server licensing, it’s easy to argue for $2,000 of memory (32gb) to help guarantee the project’s success.

That’s my personal real-world answer to memory issues during consolidation: you throw a lot of memory at the problem, because it’s cheaper to buy memory than it is to spend a lot of time analyzing the applications and the memory use.  And it’s way cheaper to buy this memory than it is to roll back a failed consolidation project.

But that doesn’t answer your question, because a cheap company might not do that, and I wanna make sure I answer your question.

Here’s the problem with consolidation and memory: before, when you had two separate servers, they had their own cache.  When you combine the two servers into one, they will fight over the same cache.  If one of the users runs a big query, they will use ALL of the memory on the server, no matter how much memory the server has.  If you’re really worried about memory requirements when you’re consolidating, you consolidate into two or more instances on the same SQL Server.  That way, you can specify the maximum memory for each instance, and then you guarantee that each application will get enough memory.

Well, that’s not really true either: you’re not guaranteeing that the application will get ENOUGH, you’re just guaranteeing that it will get its own minimum share.  To me, one of the selling points of consolidation is giving each application the chance to grab more peak resources when necessary, pooling resources so that everyone runs faster.  Ask your CIO about utilization percentages, and she’ll say she wants them as high as humanly possible – and you can’t get that if you break up your databases into individual instances and hard-code their memory requirements.

So that’s my memory answer: you either buy more memory than the servers initially had, or you can separate them onto different instances.  I vote for the former.

CPU consolidation is the same kind of problem.  One bad query can still slow down the entire server.  If you write a query that takes 100% CPU for 30 seconds, and you consolidate its databases onto a server that has 5 times as much CPU power, you’re still going to take 100% CPU for 6 seconds.  (That’s not necessarily exactly right – depends on whether the query can be parallelized – but you get the idea.)

In theory, we can fix that by consolidating the servers into separate instances on the same server, and setting the CPU affinity mask for each instance so that they’re guaranteed their own CPU power.

In practice, that’s a huge pain in the butt.  It’s easy for two instances on one server, but as soon as you start talking about 3 and 4 node clusters, it’s a mess.  You have to make a spreadsheet to track which CPUs will be used by which servers on which instances.  When there’s a failover between cluster nodes, you have to stay on top of which instances can fail onto which nodes.  I don’t know about you, but when a node fails, I’ve got enough things on my mind already.

Now that we’ve covered CPU and memory, let’s talk about IO.  You can see that CPU and memory calculations are a mess – once you get to this point, people throw in the towel and don’t even think about IO.  The sad truth is that DBAs treat their storage as a black box: they don’t do much storage benchmarking to see if they’re getting enough IO throughput, and they don’t consider this during consolidation.  Getting more storage throughput means working closely with your SAN team to test multiple HBAs, multipathing software, and array configuration.  It’s a lot of work – it does pay off, but it is a heck of a lot of work.

If any of our readers have done storage throughput benchmarking as part of a consolidation project, I’d love to hear about it in the comments.  (Or if you haven’t, that’s good feedback too.)

Kevin’s vblog – Clock drift in virtualization

Thursday, August 28th, 2008

Hi all, here’s a vblog entry that covers a bit more the topic of clock drift in virtualized environments.  This vblog entry corresponds to my blog post earlier this week

Enjoy and I look forward to your feedback – Kevin

[youtube=http://www.youtube.com/watch?v=BYoL7UPSNvs]

How Can I Determine Which Tables and Indexes Are Not Being Used in My Database?

Wednesday, August 27th, 2008

Q: I have inherited a legacy applications of which I am certain there are many unneccesary tables and indexes.  I am slightly paranoid about ripping them out however as I am not 100% sure that they are not used.  Is there any way that I can determine which indexes and tables are safe to drop?

A: As long as your application is running on SQL Server 2005+, you have access to a Dynamic Management View (DMV) that displays exactlky this information.  Querying from sys.dm_db_index_usage_stats will tell you how many times a given index/table has been scanned, seeked, looked up, updated, and even gives you the last date when an operation happened.  Browsing through this DMV will allow you to determine with confidence any index/table that has never been touched (since SQL Server was last restarted).  There are many exampes of some elegant queries that people have written that can be found by searching the DMV name but its a fairly straight forward one to understand.

One got-chya, the data in sys.dm_db_index_usage_stats will show tables/indexes across all databases where sys.indexes only shows you data from the database you are in.  This creates some challenging hurdles if you are trying to join on those tables to lookup the index name.  It’s best to keep it simple and filter dm_db_index_usage_stats to only show you data from the database you are in.

A fairly simple example with explanation can be found here.

SQL Server Data Mining in the Cloud Review

Wednesday, August 27th, 2008

The SQL Server 2008 Data Mining team is showing off some slick new technology at the KDD 2008 show in Las Vegas.  You didn’t get your trip to Vegas approved, you say?  Well, it pays to read Jamie’s Junk, also known as “The official blog of the SQL Server Analysis Services Development Manager(tm).”  He wrote a blog entry called KDD 2008 and Incredibly Awesome SQL 2008 Data Mining Demos, and I spent some time working with it to find out whether he was blowing things out of proportion.

Install the Free Excel Data Mining Plugin

At http://www.sqlserverdatamining.com/cloud/ you can play with a browser-based SQL Server Analysis Server.  Upload a CSV file or use one of their samples, and analyze away.  You’ll get a quick feel of some amazing technology, but for me, the real winner is their Excel 2007 Table Analysis Tools for the Cloud plugin, and I’ll show you how to use it.  (Go download that and install it fast before they figure out how popular it’s going to get.)

You get all of the power of a SQL Server Analysis Services server under your desk, but without any of the licensing, training or setup costs associated with SSAS.  It’s data mining for the rest of us.

Cleaning Up the Source Data

Step 1: The Perfmon CSV File

Step 1: The Perfmon CSV File

Let’s start with a simple file that I have to analyze all the time: results from Performance Monitor.  We’ve written a lot here about how to analyze Performance Monitor results to see where your SQL Server bottlenecks are, but one of the tough steps is to figure out when your server is undergoing peak loads, and what’s causing those loads.

At right, check out the raw results of Perfmon: a plain CSV file.  We’re going to clean it up a little, but I want to show you the “before” to understand what we need to clean up.  We’re not going to do any heavy lifting, just a couple of tweaks.

We want the headers to be readable by human beings, because we’re going to use these as labels.  In cell A1, change the label to “Date”, and in the rest of the cells in row 1, do a search & replace to strip out the \\MYSERVERNAME\ label.  (If you have multiple servers in your Perfmon data, leave the server names in, of course.)

Step 2 - Added Calculated Date Fields

Step 2 - Added Calculated Date Fields

Format the date column, A, to show the date and time instead of the number format.  That’s the datestamp when the performance data was captured.  To enhance our analysis, we want to add a couple of date calculations.  SQL Server performance often depends on the day of the week and the time of day – weekends might have lower performance needs than weekdays, and we might have daily jobs at a certain time that stress the system.  Add two new columns for Day of Week (using the WeekDay function) and Hour of Day (using the Hour function).

Finally, we need to tell Excel that this huge amount of numbers is actually a table.  Click on any cell that has data (not an empty cell off to the side), click on the Home tab in the Office Ribbon at the top of the screen, and click on the Format as Table button in the ribbon.  Click on any color style that pops up – I’m partial to Table Style Medium 2 myself.  Excel will highlight the entire data range to confirm, and make sure the “My table has headers” box is checked.  Click OK.

Now would be a good time to save our work – save the file as an Excel file, not a CSV file.

Connecting To SQL Server Data Mining in the Cloud

Step 3 - The Ribbon Tools

Step 3 - The Ribbon Tools

Assuming the “Table Analysis Tools for the Cloud” add-in has been installed successfully, the ribbon will change.  Click on the “Analyze (in the Cloud)” tab, and SQL Server Analysis Services options will appear.

Time to make the magic happen.

Let’s find out what kinds of load we have on this server by clicking on the Detect Categories button in the ribbon.  Accept the defaults, and click Run, and a warning will pop up:

“Table Analysis Tools for the Cloud needs to connect to the Internet to process your request. Your spreadsheet data will be uploaded for analysis using a secure protocol, then destroyed as the analysis completes. The operation cannot complete without your permission.  To learn more about the privacy of your data read out Privacy Statement”

Time to take a quick break and talk privacy.  This Excel add-in is going to connect to Microsoft’s servers, upload your data, run some analysis, and then send your data back to you via the internet.

Your data.  Someone else’s servers.  The internet.  Stop to think about that for a second, especially if you’re:

  • A public company or subject to SOX compliance
  • Dealing with health care data
  • Dealing with financial data
  • Analyzing your passwords spreadsheet
  • Slicing and dicing your little black book

This service may be secure in the future, but right now we’re talking about a demo, not an officially sanctioned Microsoft product with a EULA.  Be smart – only use this for dummy demo data for now.

Breaking Your Perfmon Counter Results Into Categories

Step 4 - Performance Categories

Step 4 - Performance Categories

Click Run, and the magic happens: the data is uploaded and analyzed.  It may take a few minutes depending on the size of your data source.  (I have no idea what kind of server horsepower is on the other end.)

The result is a list of categories.  In the screenshot at right, I have 10 categories of load.  About 60% of the performance monitoring times fell into the first two categories.  Let’s go through the categories individually to see what they have in common.  The category details are on the bottom half of the spreadsheet, and you can switch from one category to the next by clicking on the filter icons in the Category header of the spreadsheet.

My interpretation of the results are:

  • Category 1 – Low Activity with <8 Connections – during these times, we have less than 8 SQL connections, a lot of available memory, and low disk activity.  What’s interesting to me is that there’s not a consistent Hour of Day finding for this one.
  • Category 2 – Low Activity 8-13 Connections – during these times, we have a few active users, and we’re seeing some very minimal storage activity, but we’re still in great shape on memory.
  • Category 3 – Low Activity with C Drive Load – for some reason, about 11% of the time our server has low SQL Server activity, but the operating system drive is seeing some activity.  Maybe another application is running on our SQL Server, or maybe someone is remote desktopping into the system to do things.
  • Category 4 – Sundays 2pm-6pm – some kind of job must be running during this time frame, because SQL Server connections stay low but I/O goes up significantly.
  • Category 5 – High Connections, Heavy Load – about 5% of the time, this server has a high number of connections, heavy TempDB drive access, and high CPU loads.

Below category 5, each category represents 3% or less of the server’s time, so those wouldn’t concern me right away.

Bottom line: this server has very peaky demand cycles, only lasting for very short bursts and it doesn’t appear to be focused on a single time of the workday.  My priorities as a DBA would be:

  • Find out what’s happening on Sundays from 2pm-6pm, and ask the application owners if they want help to make that code to run faster
  • Look at the types of queries that are running in Category 5, the 5% of the time when the server is under heavy load.  My guess is that this is a new server with a new application, and not a lot of users are hitting it yet.  When the app catches on and gains users, this server won’t be able to handle the load due to the individual queries being tough on TempDB.  We need to find out why those queries are hammering TempDB, and either optimize them or speed up the TempDB drive.

Using SQL Server data mining in the cloud, I was able to figure this out in a matter of minutes – instead of manually slicing and dicing the numbers in Excel or pushing this data into my own SQL Server Analysis Server.  This would save me time and money, making me more productive and a better DBA.

But wait – there’s more.  That’s just one button in the ribbon.

Analyze Key Influencers of Performance Bottlenecks

Let’s find out what else seems to be happening in the system at the same time our TempDB usage is high.  Back on the source data tab of the spreadsheet, click the Analyze Key Influencers button in the Analyze (in the Cloud) ribbon tab.  Choose “Physical Disk 2_T Disk Time” as the data to analyze, because in this system, TempDB is isolated on the T drive.

Step 5 - TempDB Influencers

Step 5 - TempDB Influencers

Click on the “Choose columns to be used for analysis” link and uncheck the other counters for Physical Disk 2_T” because we know they’ll be high at the same time Disk Time is high.  Those aren’t influencers.  Then click Run.

When the results come back, scroll over to the far right.  You’ll notice several sets of colored bar graph results.  Scroll down, because we want to find what influences TempDB activity to be high – indicated by the “Favors” column being a high number.  We want to find what favors high TempDB activity.

In our particular example, we can see that Category 8 is the highest TempDB load.  Innnnteresting – earlier, I’d discarded categories over 5 because they represented a low amount of time, but it looks like we’ll have to go back and revisit category 8. Category 8 was Friday mornings from midnight to 4am, so we might have something going on then.

Other factors include high activity on our log drive (L), medium CPU load (11-34%), and so on.  I bet TempDB’s logs are also on the L drive, shared with the rest of the databases, and I would have to verify that before further investigation.  If TempDB’s logs are on that L drive, then it would make sense that they’d have high activity when the TempDB data is being written.

In Two Words: Incredibly Awesome

I’ve barely scratched the surface of what this thing can do, and already I can see that Jamie was right – this thing is incredibly awesome.  This demo just showed Perfmon data – something interesting to me personally, but not to most of humanity.  Imagine what it can do in the hands of an accounting analyst or a sales manager with actual financial data, people who could raise sales or cut costs.  Very impressive.

I want to loudly congratulate the SQL Server Data Mining team for building something so cool and really pushing the edge of cloud analytics.  I get genuinely excited about where cloud-based databases are going, and I’m thrilled that Microsoft might bring this to market.  The faster we can analyze data with less setup time and less fixed infrastructure costs, the more powerful the DBA will become.

Reporting on your OLTP system

Wednesday, August 27th, 2008

image Just a quick note on this topic since it really is not a best practice and I am sure no one does it. ;)

Business requirements > best practices?
Even if you have a reporting server dedicated for ...err... reports, the business might dictate real time data in those reports. That may or may no be possible on a reporting server but that is another post. Most of the time, the reporting server is not real time. At the same time, the OLTP data is not optimized reporting and if it is, transactional performance decreases. The solution usually dictates some reporting functionality being added to your production OLTP server. Maybe not at a DSS level but an adhoc query built by a web page. A company's order history, an store inventory, or even an advanced search web page. You know the one:

select * from widgets

where color is null and size is null and flavor is null........ and (title like '%ice cream%' or description like '%ice cream%' or keywords like '%ice cream%')

Is that really a reporting database?

So you have a copy of your OLTP database either through mirroring\snapshots, logshipping or replication that you call a reporting database. What does that buy you? Well, we you are offloading reporting right. Kinda. You separate your reporting queries and your OLTP queries but here are some downsides:

  • The reporting database is not optimized for reporting. This tends to be really apparent if reporting is critical.
  • The OLTP database now has read activity on the log(except for logshipping)
  • The reporting database still takes all of the write traffic in one way or another.
  • The OLTP database may lose resources. For example, less disks get dedicated to it. Or worse, reporting and OLTP end up on the same physical SAN disk basically doubling the IO.

So what is the solution? 

The real solution is to design an incremental ETL process that loads to a report optimized database. If a closer to real time data is required, triggers, modified replication or possibly asynchronous triggers might be the way to go. I am not going to pretend to know how to architect a solution like that. However, it is safe to say that a nightly or real time solution will require some serious dev work. This includes creating the ETL process and rewriting the application to use the report optimized database.

Paint yourself into a corner?

Say you have a home grown database that has gone from megabytes to 100's of gigabytes. Separating reporting functions to a new database is going to take time. However, SQL Server 2008 provides new features that scream consolidation. Not only instance consolidation but functionality consolidation.

  • Resource Governor is the backbone of a consolidation strategy.
  • Filtered indexes - Index the different workloads while impacting writes as little as possible.
  • Data Compression -  combine with partitioning and shrink the reporting data while leaving the really hot data uncompressed.
  • Not just SQL 2008 but don't forget about indexed views and indexed persisted computed columns.

The other assumption of consolidation is larger hardware. If you are not spending money on a copy of OLTP reporting server, you can get a larger OLTP box.

Warning: this is forward looking since I don't have any production SQL Server 2008 servers yet. :)

Conclusion

Not to sound like a Microsoft fan boy but SQL Server 2008 Enterprise Edition provides a lot of benefits for mixed workload boxes. However, the real point of the post is that making a copy of the OLTP database for reporting may not provide gains one would expect.