SQL Server Memory Tracking with Windows Task Manager

Filed under: Administration — KKline at 4:54 pm on Friday, August 29, 2008

Have you ever looked at the Windows Task Manager for a quick read on SQL Server 2005 memory consumption?
 

You can get good information from the Windows Task Manager about SQL Server memory, but you have to know what to look for. When running with AWE memory enabled on a 32-bit system, you may see an unusually low amount of memory in the Windows Task Manager.  If AWE is being used, you will probably get a much better idea of how much AWE memory is being consumed by checking page file usage.  The page file usage value and the amount of AWE memory will correspond.
You can then cross-check this value against sys.dm_os_sys_info.bpool_committed for the exact value.
(Thanks to Geoff N. Hiten, the “SQL Craftsman”, and Erland Sommarskog for the tip.)

Technorati Tags:

Is there a way to find out from system tables when the store procedure was changed last?

Filed under: Administration, I'm a Newbie, Internals and Architecture, SQL Server 2005, Transact-SQL (T-SQL) — Ari Weil at 1:22 pm on Friday, August 29, 2008

Absolutely. You can use the sys.objects view to determine the create_date and modify_date for an object. The query would be:


select [name],[create_date],[modify_date]
from [sys].[objects]
where [type]=N’P’ and [is_ms_shipped]=0

To see the full definition of the procedure you would run:


select [name],[create_date],[modify_date],[definition]
from [sys].[objects] obj
join [sys].[sql_modules] mod
on obj.[object_id]=mod.[object_id]
where [type]=N’P’ and [is_ms_shipped]=0

We are trying to implement log shipping to run hourly. If our nightly full backup takes 1 hour and 45 minutes, what is our recovery process? Is it a bad thing to have a transactional backup in the middle of a full backup?

Filed under: Administration, Backup and Restore, I'm a Newbie, Internals and Architecture, Replication, Transact-SQL (T-SQL) — Ari Weil at 1:06 pm on Friday, August 29, 2008

Yes, this is bad because it can lead to your log shipped database getting out of sync and will probably result in Error 4305; error 4305 states that the log in a backup set is too late to apply, which means your log shipping jobs will fail until you can synchronize the databases. You should schedule your transaction log backup job to stop before your full backup job begins and then to restart once it has completed. See the Microsoft TechNet article on setting up Log Shipping.

When you perform a a full database backup SQL Server stores the ending log sequence number (LSN), which becomes the starting LSN for the next transaction log backup. So, if you are performing a transaction log backup while your full database backup is executing, and that transaction log backup is successful, once it is shipped and applied to the subscriber it will have a different starting LSN than the ending LSN of the full backup. This is not allowed and SQL Server will throw error 4305.

Consolidation - Calculating Combined CPU

Filed under: Administration — Brent Ozar at 7:05 am on Friday, August 29, 2008

I got a consolidation query from Andrey.  He asked:

“I have a question related to SQL Server consolidation. You know that the physical memory on any machine is limited and SQL Server tends to consume all available memory. Let’s say I am consolidation two SQL Servers. The first one consumes 1GB of memory and the second one is consuming 1GB of memory. I tempt to say that we need 2GB when consolidating. BUT what if I have only 1.5GB available on the target server? Will the performance suffer if I consolidate two servers and make then only 1.5 GB available? Do you have any ideas on how to estimate memory requirements in this case? The same applies to CPU and IO as well.”

That question comes up all the time when we talk about consolidation.  First, let’s talk about memory since that one’s the easiest to measure.

Memory is so cheap now (under $500 for 8gb of server-quality memory) that when people consolidate, they should throw a lot of memory into the server.  You’re probably consolidating old servers that had less memory, so it’s easy to take 4 servers with 4gb ram each (16gb total) and consolidate them into one server with 32gb of ram.  When you’re saving tens of thousands of dollars on SQL Server licensing, it’s easy to argue for $2,000 of memory (32gb) to help guarantee the project’s success.

That’s my personal real-world answer to memory issues during consolidation: you throw a lot of memory at the problem, because it’s cheaper to buy memory than it is to spend a lot of time analyzing the applications and the memory use.  And it’s way cheaper to buy this memory than it is to roll back a failed consolidation project.

But that doesn’t answer your question, because a cheap company might not do that, and I wanna make sure I answer your question.

Here’s the problem with consolidation and memory: before, when you had two separate servers, they had their own cache.  When you combine the two servers into one, they will fight over the same cache.  If one of the users runs a big query, they will use ALL of the memory on the server, no matter how much memory the server has.  If you’re really worried about memory requirements when you’re consolidating, you consolidate into two or more instances on the same SQL Server.  That way, you can specify the maximum memory for each instance, and then you guarantee that each application will get enough memory.

Well, that’s not really true either: you’re not guaranteeing that the application will get ENOUGH, you’re just guaranteeing that it will get its own minimum share.  To me, one of the selling points of consolidation is giving each application the chance to grab more peak resources when necessary, pooling resources so that everyone runs faster.  Ask your CIO about utilization percentages, and she’ll say she wants them as high as humanly possible - and you can’t get that if you break up your databases into individual instances and hard-code their memory requirements.

So that’s my memory answer: you either buy more memory than the servers initially had, or you can separate them onto different instances.  I vote for the former.

CPU consolidation is the same kind of problem.  One bad query can still slow down the entire server.  If you write a query that takes 100% CPU for 30 seconds, and you consolidate its databases onto a server that has 5 times as much CPU power, you’re still going to take 100% CPU for 6 seconds.  (That’s not necessarily exactly right – depends on whether the query can be parallelized – but you get the idea.)

In theory, we can fix that by consolidating the servers into separate instances on the same server, and setting the CPU affinity mask for each instance so that they’re guaranteed their own CPU power.

In practice, that’s a huge pain in the butt.  It’s easy for two instances on one server, but as soon as you start talking about 3 and 4 node clusters, it’s a mess.  You have to make a spreadsheet to track which CPUs will be used by which servers on which instances.  When there’s a failover between cluster nodes, you have to stay on top of which instances can fail onto which nodes.  I don’t know about you, but when a node fails, I’ve got enough things on my mind already.

Now that we’ve covered CPU and memory, let’s talk about IO.  You can see that CPU and memory calculations are a mess – once you get to this point, people throw in the towel and don’t even think about IO.  The sad truth is that DBAs treat their storage as a black box: they don’t do much storage benchmarking to see if they’re getting enough IO throughput, and they don’t consider this during consolidation.  Getting more storage throughput means working closely with your SAN team to test multiple HBAs, multipathing software, and array configuration.  It’s a lot of work - it does pay off, but it is a heck of a lot of work.

If any of our readers have done storage throughput benchmarking as part of a consolidation project, I’d love to hear about it in the comments.  (Or if you haven’t, that’s good feedback too.)

Technorati Tags:

Kevin’s vblog - Clock drift in virtualization

Filed under: Administration, Virtualization — KKline at 10:00 am on Thursday, August 28, 2008

Hi all, here’s a vblog entry that covers a bit more the topic of clock drift in virtualized environments.  This vblog entry corresponds to my blog post earlier this week

Enjoy and I look forward to your feedback - Kevin

Technorati Tags: , , , , ,

How Can I Determine Which Tables and Indexes Are Not Being Used in My Database?

Filed under: Administration, Database Design, Internals and Architecture, SQL Server 2005, Transact-SQL (T-SQL), Tuning and Optimization — Jason at 11:33 am on Wednesday, August 27, 2008

Q: I have inherited a legacy applications of which I am certain there are many unneccesary tables and indexes.  I am slightly paranoid about ripping them out however as I am not 100% sure that they are not used.  Is there any way that I can determine which indexes and tables are safe to drop?

A: As long as your application is running on SQL Server 2005+, you have access to a Dynamic Management View (DMV) that displays exactlky this information.  Querying from sys.dm_db_index_usage_stats will tell you how many times a given index/table has been scanned, seeked, looked up, updated, and even gives you the last date when an operation happened.  Browsing through this DMV will allow you to determine with confidence any index/table that has never been touched (since SQL Server was last restarted).  There are many exampes of some elegant queries that people have written that can be found by searching the DMV name but its a fairly straight forward one to understand.

One got-chya, the data in sys.dm_db_index_usage_stats will show tables/indexes across all databases where sys.indexes only shows you data from the database you are in.  This creates some challenging hurdles if you are trying to join on those tables to lookup the index name.  It’s best to keep it simple and filter dm_db_index_usage_stats to only show you data from the database you are in.

A fairly simple example with explanation can be found here.

SQL Server Data Mining in the Cloud

Filed under: Analysis Services — Brent Ozar at 8:06 am on Wednesday, August 27, 2008

The SQL Server 2008 Data Mining team is showing off some slick new technology at the KDD 2008 show in Las Vegas.  You didn’t get your trip to Vegas approved, you say?  Well, it pays to read Jamie’s Junk, also known as “The official blog of the SQL Server Analysis Services Development Manager(tm).”  He wrote a blog entry called KDD 2008 and Incredibly Awesome SQL 2008 Data Mining Demos, and I spent some time working with it to find out whether he was blowing things out of proportion.

At http://www.sqlserverdatamining.com/cloud/ you can play with a browser-based SQL Server Analysis Server.  Upload a CSV file or use one of their samples, and analyze away.  You’ll get a quick feel of some amazing technology, but for me, the real winner is their Excel 2007 Table Analysis Tools for the Cloud plugin, and I’ll show you how to use it.  (Go download that and install it fast before they figure out how popular it’s going to get.)

Cleaning Up the Source Data

Step 1: The Perfmon CSV File

Step 1: The Perfmon CSV File

Let’s start with a simple file that I have to analyze all the time: results from Performance Monitor.  We’ve written a lot here about how to analyze Performance Monitor results to see where your SQL Server bottlenecks are, but one of the tough steps is to figure out when your server is undergoing peak loads, and what’s causing those loads.

At right, check out the raw results of Perfmon: a plain CSV file.  We’re going to clean it up a little, but I want to show you the “before” to understand what we need to clean up.  We’re not going to do any heavy lifting, just a couple of tweaks.

We want the headers to be readable by human beings, because we’re going to use these as labels.  In cell A1, change the label to “Date”, and in the rest of the cells in row 1, do a search & replace to strip out the \\MYSERVERNAME\ label.  (If you have multiple servers in your Perfmon data, leave the server names in, of course.)

Step 2 - Added Calculated Date Fields

Step 2 - Added Calculated Date Fields

Format the date column, A, to show the date and time instead of the number format.  That’s the datestamp when the performance data was captured.  To enhance our analysis, we want to add a couple of date calculations.  SQL Server performance often depends on the day of the week and the time of day - weekends might have lower performance needs than weekdays, and we might have daily jobs at a certain time that stress the system.  Add two new columns for Day of Week (using the WeekDay function) and Hour of Day (using the Hour function).

Finally, we need to tell Excel that this huge amount of numbers is actually a table.  Click on any cell that has data (not an empty cell off to the side), click on the Home tab in the Office Ribbon at the top of the screen, and click on the Format as Table button in the ribbon.  Click on any color style that pops up - I’m partial to Table Style Medium 2 myself.  Excel will highlight the entire data range to confirm, and make sure the “My table has headers” box is checked.  Click OK.

Now would be a good time to save our work - save the file as an Excel file, not a CSV file.

Connecting To The Cloud

Step 3 - The Ribbon Tools

Step 3 - The Ribbon Tools

Assuming the “Table Analysis Tools for the Cloud” add-in has been installed successfully, the ribbon will change.  Click on the “Analyze (in the Cloud)” tab, and SQL Server Analysis Services options will appear.

Time to make the magic happen.

Let’s find out what kinds of load we have on this server by clicking on the Detect Categories button in the ribbon.  Accept the defaults, and click Run, and a warning will pop up:

“Table Analysis Tools for the Cloud needs to connect to the Internet to process your request. Your spreadsheet data will be uploaded for analysis using a secure protocol, then destroyed as the analysis completes. The operation cannot complete without your permission.  To learn more about the privacy of your data read out Privacy Statement”

Time to take a quick break and talk privacy.  This Excel add-in is going to connect to Microsoft’s servers, upload your data, run some analysis, and then send your data back to you via the internet.

Your data.  Someone else’s servers.  The internet.  Stop to think about that for a second, especially if you’re:

  • A public company or subject to SOX compliance
  • Dealing with health care data
  • Dealing with financial data
  • Analyzing your passwords spreadsheet
  • Slicing and dicing your little black book

This service may be secure in the future, but right now we’re talking about a demo, not an officially sanctioned Microsoft product with a EULA.  Be smart - only use this for dummy demo data for now.

Enough Warnings - Show Me The Categories!

Step 4 - Performance Categories

Step 4 - Performance Categories

Click Run, and the magic happens: the data is uploaded and analyzed.  It may take a few minutes depending on the size of your data source.  (I have no idea what kind of server horsepower is on the other end.)

The result is a list of categories.  In the screenshot at right, I have 10 categories of load.  About 60% of the performance monitoring times fell into the first two categories.  Let’s go through the categories individually to see what they have in common.  The category details are on the bottom half of the spreadsheet, and you can switch from one category to the next by clicking on the filter icons in the Category header of the spreadsheet.

My interpretation of the results are:

  • Category 1 - Low Activity with <8 Connections - during these times, we have less than 8 SQL connections, a lot of available memory, and low disk activity.  What’s interesting to me is that there’s not a consistent Hour of Day finding for this one.
  • Category 2 - Low Activity 8-13 Connections - during these times, we have a few active users, and we’re seeing some very minimal storage activity, but we’re still in great shape on memory.
  • Category 3 - Low Activity with C Drive Load - for some reason, about 11% of the time our server has low SQL Server activity, but the operating system drive is seeing some activity.  Maybe another application is running on our SQL Server, or maybe someone is remote desktopping into the system to do things.
  • Category 4 - Sundays 2pm-6pm - some kind of job must be running during this time frame, because SQL Server connections stay low but I/O goes up significantly.
  • Category 5 - High Connections, Heavy Load - about 5% of the time, this server has a high number of connections, heavy TempDB drive access, and high CPU loads.

Below category 5, each category represents 3% or less of the server’s time, so those wouldn’t concern me right away.

Bottom line: this server has very peaky demand cycles, only lasting for very short bursts and it doesn’t appear to be focused on a single time of the workday.  My priorities as a DBA would be:

  • Find out what’s happening on Sundays from 2pm-6pm, and ask the application owners if they want help to make that code to run faster
  • Look at the types of queries that are running in Category 5, the 5% of the time when the server is under heavy load.  My guess is that this is a new server with a new application, and not a lot of users are hitting it yet.  When the app catches on and gains users, this server won’t be able to handle the load due to the individual queries being tough on TempDB.  We need to find out why those queries are hammering TempDB, and either optimize them or speed up the TempDB drive.

Using SQL Server data mining in the cloud, I was able to figure this out in a matter of minutes - instead of manually slicing and dicing the numbers in Excel or pushing this data into my own SQL Server Analysis Server.  This would save me time and money, making me more productive and a better DBA.

But wait - there’s more.  That’s just one button in the ribbon.

Analyze Key Influencers

Let’s find out what else seems to be happening in the system at the same time our TempDB usage is high.  Back on the source data tab of the spreadsheet, click the Analyze Key Influencers button in the Analyze (in the Cloud) ribbon tab.  Choose “Physical Disk 2_T Disk Time” as the data to analyze, because in this system, TempDB is isolated on the T drive.

Step 5 - TempDB Influencers

Step 5 - TempDB Influencers

Click on the “Choose columns to be used for analysis” link and uncheck the other counters for Physical Disk 2_T” because we know they’ll be high at the same time Disk Time is high.  Those aren’t influencers.  Then click Run.

When the results come back, scroll over to the far right.  You’ll notice several sets of colored bar graph results.  Scroll down, because we want to find what influences TempDB activity to be high - indicated by the “Favors” column being a high number.  We want to find what favors high TempDB activity.

In our particular example, we can see that Category 8 is the highest TempDB load.  Innnnteresting - earlier, I’d discarded categories over 5 because they represented a low amount of time, but it looks like we’ll have to go back and revisit category 8. Category 8 was Friday mornings from midnight to 4am, so we might have something going on then.

Other factors include high activity on our log drive (L), medium CPU load (11-34%), and so on.  I bet TempDB’s logs are also on the L drive, shared with the rest of the databases, and I would have to verify that before further investigation.  If TempDB’s logs are on that L drive, then it would make sense that they’d have high activity when the TempDB data is being written.

In Two Words: Incredibly Awesome

I’ve barely scratched the surface of what this thing can do, and already I can see that Jamie was right - this thing is incredibly awesome.  This demo just showed Perfmon data - something interesting to me personally, but not to most of humanity.  Imagine what it can do in the hands of an accounting analyst or a sales manager with actual financial data, people who could raise sales or cut costs.  Very impressive.

I want to loudly congratulate the SQL Server Data Mining team for building something so cool and really pushing the edge of cloud analytics.  I get genuinely excited about where cloud-based databases are going, and I’m thrilled that Microsoft might bring this to market.  The faster we can analyze data with less setup time and less fixed infrastructure costs, the more powerful the DBA will become.

Technorati Tags: , , ,

Testing Centers of Excellence

Filed under: Programming — KKline at 3:59 pm on Tuesday, August 26, 2008

Scott Moore, a friend of mine from my days at Deloitte, is now a noted expert in application development testing, especially on HP and Mercury Interactive products.  Scott has a new article up about “The Performance Authority” of a testing center of excellence.  If you work at all with QA and software testing, then this is worth a read.

This was mainly in response to the many inquiries he got about how to explain the highest level of maturity when it comes to addressing application performance issues in the software development lifecycle. Many of us have been seeing the slides around the maturity model for years, and we always see that the highest level – the Performance Authority – is always listed with a statement that it represents less than 1% of clients for companies like HP (especially Mercury Interactive).

This statistic hasn’t changed in at least four years that anyone has seen or written about. Either companies are just not getting it, or HP and their partners are not telling clients how to get there, and more importantly what it would look like if they had such a thing as a “Performance Authority”. This is Scott’s take from years of studying various real clients of his and doing Center of Excellence assessments – the typical holes that he sees.  Many companies are looking to achieve this level of maturity, and this article might give some clarity to what the “end game’ looks like. You can find the article online now at:

http://www.loadtester.com/performanceauthority.php

Enjoy,

-Kevin

Technorati Tags:

Clock Drift in Virtualization

Filed under: Administration, Virtualization — KKline at 4:07 pm on Monday, August 25, 2008

Last week, I was listening to John Rives, CEO of Amniox, speak about virtualization and clustered VMS (see pictures at http://drsql.spaces.live.com/blog/cns!80677FB08B3162E4!2292.entry.  Thanks Louis!) at the local SQL Server User Group meeting in Nashville.

John gave a really interesting talk about implementing virtualization and many of the benefits and pitfalls that you might encounter.  Most of the content was review for me, but the one concept that was new was a biggee - clock drift.

Clock drift is a situation in which the actual time shown on the physical machine’s (PM) clock, for example 3:53 pm today, is no longer in sync with one or more of the clocks of virtual machines’ (VM) running on the PM, showing 12:12 pm today and 9:30 pm yesterday.  This can happen because the VMs 1) only receive a slice of the PM’s total processing power and cycles, thus getting confused on the time, and 2) the application(s) running on the VM do not manually synchronize the VM clock with the PM’s clock or an external time source.  Naturally, if you’re running SQL Server, this can be a huge data integrity problem for you - especially for any transactions that record the date and time.

This is a fairly old-school problem, it turns out, and there are lots of hits when you google for “clock drift virtualization”.  However, the state of the industry seems to be somewhat immature in that the best hits tended to be blogs and discussion forums rather than vendor documentation or best practices papers.  If you’re developing applications for SQL Server that might be implemented on a VM, do yourself a favor and make sure you include clock syncronization processes as part of the application.  Otherwise, you might have to deal with a nasty clock drift problem.

Regards,

-Kev

Technorati Tags:

DBA Automation Roundtable Discussion

Filed under: Administration — KKline at 8:24 pm on Wednesday, August 20, 2008

My favorite event at Microsoft’s TechEd 2008 conference held in Orlando this year was a roundtable discussion about automating DBA work with several other SQL Server experts.  The session was moderated by Allen White and included myself, Dan Jones (of Microsoft), Steve Jones (of SQL Server Central fame), and Buck Woody (also of Microsoft, though you may have seen his excellent column at http://www.informit.com/).  These guys are smart, funny, and experienced.

Maybe you’d like to take a look?  Microsoft has posted the video at Tech·Ed Online:  Automating DBA Processes.

In addition, I’d like to point out that fellow blogger Jonathan Kehayias has posted a good issue up on Connect for a problem in the SQL 2008 Management Studio tools that I have already been told is a by design issue.  You may not realize it, but Microsoft now allows you to vote on which issues you want tackled.  Let’s build some community support for issues like these - the issue is that SSMS 2008 can’t connect to 2005 Reporting Services Instances by WMI - https://connect.microsoft.com/SQLServer/feedback/ViewFeedback.aspx?FeedbackID=361680.

Thanks so much!

-Kevin

Technorati Tags:

Next Page »