Author Archive

Service Broker sample application from SQLskills Immersion Event

Thursday, August 11th, 2011

You can find my Service Broker sample application which I presented during the SQLskills Immersion Event in Bellevue here for download: http://www.csharp.at/Downloads/SQLskills_ServiceBroker.zip

Thanks

-Klaus

SQL MCM Training – Day 4

Thursday, August 11th, 2011

Today Jonathan has presented on analyzing trace data, and Kimberly has spoken on troubleshooting Plan Cache Issues and Index Consolidation. It's really awesome when you listen to Kimberly, because she is speaking so fast, and delivers such a great quality of content – really awesome!!!

Module 10: Analyzing Trace Data

  • Default Trace
    • Logs everytime when an object is created and deleted
    • Also for Temp tables => Default Trace can get really big
    • Lightweight Trace defined by the Product Support Team
    • Uses DBCC commands
    • DBCC command are also logged in the Default Trace
    • sys.configurations, "default trace enabled" to check if Default Trace is enabled/disabled
    • Can be enabled/disabled through sp_configure
    • Default Trace has always ID of 1
    • Default Trace can't be changed
    • You can create your own Default Trace
      • Disable the original Default Trace
      • Create your own
        • TempDb object creations can be excluded
    • 5x50MB files
      • Server Activity impacts retention period
    • It writes to the error log path
      • Can't be changed
      • If you change the default error log path, you also change indirectly the location where the Default Trace is written
  • sp_get_query_template to normalize statements can be expensive!
    • Cursor must be used, so that you can use it for the result in a trace file
  • When you consolidate several databases onto a new instance, ad hoc workloads should put together, because they both have problems
    • Doesn't make sense to mix these workloads with parametrized workloads
  • SQLDiag can be used for (Remote) SQL Server Health Checks
    • Use Extended Events in it through custom tasks
    • SQLskills runs SQLDiag in the snapshot mode for SQL Server Health Checks
    • /C 2 compresses the output into a CAB file
    • SQLDiag needs sysadmin priviledges on the monitored server
  • Blocked Process Report

Module 11: DMVs – Part 1

Thanks for reading

-Klaus

SQL MCM Training – Day 3

Wednesday, August 10th, 2011

Nothing to say about today, Jonathan did really a great job describing Extended Events and Baselining/Benchmarking to us, and his boss – Paul – did a quick really great overview of Resource Governor. I've done a lot in these areas in the past, so that was a more easy one to me than yesterday J. Here are my study notes for today.

Module 7: Extended Events

  • SQL Trace will be replaced by Extended Events some time
  • SQL Denali CTP3 has more than 500 Extended Events
  • Event Sessions with events with no targets are possible
  • Event Sessions with targets and no events are possible
  • The Event Engine has no idea of Extended Events
    • It only provides an operational environment
  • Event Sessions have memory buffers attached
  • Dispatcher Pool is a worker pool
  • Package is a container about events with metadata
  • package0 is a container about generic things like targets
  • Any event can work with any action with any taraget from any packet
  • ETW (Event Tracing for Windows)
    • Gets information from the Windows Kernel
    • ETW & Extended Events are integrated to each other
  • Fastest first false value should be placed in the first place in a predicate
    • Support for short circuit evaluation
    • First false evaluation prevents events from firing
  • You can bind to Global State Data through Actions
    • Like SPID
  • Predicates can store state
    • You want only to process a event when it is fired the 5th time
  • Actions are accessing Global State Data
  • Actions are executed after a predicate is evaluated
    • When an event will not be fired, the action will be not executed
  • Actions are executed synchronously
    • Can impact performance!
  • By default the Ring Buffer Target uses 4 MB
    • ring_buffer target (DMV) can only return 2 MB, therefore mailformed XML can be returned
    • Limit therefore the Ring Buffer Target to 2 MB
  • Denali only provides an asynchronous Bucketizer
    • Synchronous should not be used anymore in SQL Server 2008 (R2)
  • Event Counter Target can be used to check how often an event is fired
    • For planning more specific Event Sessions
    • With that information you can also redefine your predicates to reduce the events needed to process
  • Event Pairing Target is used to match events
    • Bad example from BOL: Lock Acquired and Lock Released
      • Because of Lock Escalation, Lock Released is not fired anymore
    • Can be used for finding orphant transactions
  • Predicates are working on map keys, and not map values
  • Whitepaper "Using SQL Server 2008 Extended Events" by Jonathan Kehayias
  • Shred XML data in a table and then do analysis from the table
    • It's much faster
  • Event Sessions are isolated from each other
  • It takes 2 microseconds to fire an event, everything else is done asynchronously
  • Dispatch of async targets occurs under 2 conditions
    • The memory buffer becomes full
    • Dispatch Latency exceeds
  • sys.dm_xe_packages.capabilities is a bitmap mask
  • Extended Events uses Multi Page Allocations outside of the Buffer Pool
    • On x32 this is VAS address space
    • This can lead to allocation problems
  • Metadata file in depricated in Denali (Async File Target)
  • greater_than_max_float64
  • less_than_min_float64
  • NETWORK_IO = ASYNC_NETWORK_IO in sys.dm_os_wait_stats
  • MSSQL\Install.u_tables.sql
    • Contains the System Health Event Session
  • Events can't be changed
    • They must be dropped from the Event Sessions and added with the changed definition
  • Actions are executed syncronously
  • Causality tracking should be used for event correlation
    • Don't add actions for event correlation
  • Memory Dumps are placed into the LOG folder
  • Event Sesions can be automatically started when SQL Server is started
    • STARTUP_STATE
  • asynchronous_file_target can be read outside of SQL Server in SQL Server Denai CTP3
  • There can be only one ETW session on the entire Windows Server
  • Troubleshooting Recompiles with Extended Events
    • In the first step trace the Recompiles on the server level
    • In the second step trace the Recompiles in the database, where the recompiles were highest
    • Use the Bucketizer Target for that
  • sys.dm_xe_session_targets is the Ring Buffer target
  • Parameter Sniffing can be troubleshooted through Extended Events
    • use greater_than_max_int64
    • Events gets only captured when the stored procedure execution get's slower
      • When a "bad" parameter value was provided

Module 8: Resource Governor

  • Enterprise Edition only!
  • Can be used to limit MAXDOP on a more granular level
  • MAXDOP
    • Workload Group MAXDOP 4, Server MAXDOP 1
      • You can do control MAXDOP through Query Hints from 1 to 4
      • But you can't do anything more than MAXDOP > 4 through a query hint, because the workload group throtted it down to a maximum MAXDOP 4
  • Resource Governor will override the MAXDOP query hint
    • Query hint
  • It governs per scheduler
  • Resource Governor can't limit existing connections
  • Connections can't be migrated between workload groups
    • Connection must be reset, or disconnect/reconnect
  • I/O, TempDb can't be controlled through Resource Governor
  • You can limit the number of active connections to a workload group/database
  • CPU/IO affinity masks are depricated in Denali
  • Throtteling is only done when there is contention
    • Otherwise CPU limit are ignored
  • The internal pool can be changed
    • The classifier function and login triggers are running in the internal pool
  • Classifier function must return a SYSNAME
    • It can't be a SQLCLR function, it must be a T-SQL function
    • But you can call a SQLCLR inside the T-SQL function
  • CPU for Backup Compression can be limited through Resource Governor
  • Perfmon Counters
    • SQLServer:Resource Pool Stats
    • SQLServer:Workload Group Stats
  • You can't tell if a specific query was throttled
  • Connections are grouped into workload groups/buckets
  • Whitepaper on Resource Governor: http://msdn.microsoft.com/en-us/library/ee151608.aspx
  • Parallel Index Rebuilds/Parallel DBCC CHECKDBs are Enterprise Edition only!
  • You can only have up to 20 resource pools, 18 user defined
    • But the max doesn't really make sense
  • GROUP_MAX_REQUESTS
    • Limits the number of concurrent connections (0: unlimited connections)
    • Configured on the workload group
    • Perfmon Counters
      • Workload Group Stats: Active Requests
      • Workload Group Stats: Queued Requests
  • Extended Events have a resource pool id as action
  • Only for one SQL Server Instance!
    • Governing across SQL Server Instances doesn't work

Module 9: Baselining & Benchmarking

  • Memory Ballooned
    • Because of Balloon drivers
  • Ring Buffer provides the CPU time in sqlservr.exe
  • Memory pressure problems in the Buffer Pool
    • Low PLE
    • Low or no Free Pages
    • High Lazy Writes/sec
    • High Fee List Stalls/sec
  • SQL Load Generator: http://sqlloadgenerator.codeplex.com
  • SQL Trace
    • The traces itself filter out unrelevant events
    • That's a big difference to Extended Events, where events are not fired, if they are not needed
    • When an event is registered inside the system, it is send to EVERY trace, and the trace itself has to filter out, if it is interested in the event, or not
  • SQL Alerts can be used to automatically start a perfmon trace through logman.exe or to start a SQL Trace

Thanks for reading

-Klaus

SQL MCM Training – Day 2

Tuesday, August 9th, 2011

Today's day was Jonathan Kehayias days, and it was pretty impressive. Jonathan covered the following 3 modules, and he can speak really continuously and forever J

  • IO – Storage Area Networks
  • IO – Installation and Testing
  • SQLOS

I've already read few whitepapers in the past about SANs and also watched the MCM Readiness videos about it, but the material that Jonathan was presenting was pretty impressive. There were a lot of concepts new to me (like Multi Pathing), therefore I didn't have taken any notes in this module, because I have listened to Jonathan and tried to follow him. He also did a very great demo about Multi Pathing with the iSCSCI Target and iSCSI Initiator provided by Microsoft. The SAN module was definitely a module, where I learned around 90% new things.

Important note:
Those notes are the things that I have written down for my personal remembering. So please don't take anything for 100% sure and true in the way that I have written it. For some notes you also have to know the context in which we have discussed it, and why I have taken that note for me personally. Today morning we had a discussion about one specific note that I have taken about CXPACKET waits and why Hyper-Threading can be bad, but at always "it depends", and some things/context information I haven't written down, because I have it somewhere in my brain J.

So here are the notes that I have taken during the 2nd day of SQL MCM training.

Module 5: IO – Installation and Testing

  • GPT partitions can manage more than 2 TB of data
    • Max: 256 TB
  • Spanned disk just provide more size, but not more performance
    • Each drive is used sequentially
  • Allocation Unit Size should be the size of a single extent
  • Tools
    • SQLIO
    • IOMeter

Module 6: SQLOS

  • Each NUMA node gets its own Lazywriter
  • MAXDOP: number of physical cores per NUMA node
  • http://en.wikipedia.org/wiki/CPU_cache
  • Soft NUMA usage: http://msdn.microsoft.com/en-us/library/dd537533(v=sql.100).aspx
  • SOS_Worker: Thread
  • SOS_Task: Operation that a SOS_Worker executes
  • Startup parameter -P24: gets you 24 processors in SQLOS
  • Hidden schedulers are processing background tasks
  • Visible schedulers are processing user requests
  • Built-In SQLCLR data types are not supported when Fiber mode is activated
    • SQL Server will gives you an error message when you access a SQLCLR data type column
  • SQL Server Connections are bound to a scheduler
  • Each scheduler has its own waiter list and runnable queue
  • Scheduler Monitor
    • Generates automatically a mini-dump for post-mortem debugging
  • Names instances are running on different ports for the DAC
  • PAE (Physical Address Extension) changes the memory pointer size from 32 to 36 bits
    • 64 GBs are addressable in the user mode
    • Only usable for the Buffer Pool, NOT for the Plan Cache
  • /3 GB Option
    • Uses 1 GB from the Kernel mode
    • That 1 GB stores normally PTEs (Page Table Entries)
    • Therefore it's only possible to use a maximum of 16 GB RAM, because of the reduced PTEs
  • Understanding VAS Reservation
  • Locked Pages is only for Buffer Pool, not for Plan Cache
    • Can be used also by Standard Edition with Trace Flag -T845
    • Can be used for Windows System Cache problems
    • Startup time of SQL Server takes longer because SQL Server allocates all memory up to the Max Server Memory setting
    • The memory must be also contiguous, otherwise SQL Server gets less memory for the Buffer Pool
  • Memory\Available Mbytes > 150 - 300MB
    • Otherwise it means that the Windows OS is under memory pressure
  • Internal Memory Pressure
    • Another Memory Clerk wants to allocate memory
  • Resource Monitor responds to memory pressure
    • Outputs information to the OS ring buffer
    • ring_buffer_type = "RING_BUFFER_RESOURCE_MONITOR"
  • Activity Monitor adds a lot of load to an system which is in performance problems
    • Especially TempDb
  • NUMA Memory Balancing is not shown through DBCC MEMORYSTATUS
  • http://blogs.msdn.com/b/psssql/archive/2009/05/15/how-it-works-dbcc-memorystatus-locked-pages-allocated-and-singlepageallocator-values.aspx
  • http://blogs.msdn.com/b/psssql/archive/2010/02/23/how-it-works-sql-server-2008-numa-and-foreign-pages.aspx
  • Scheduler can be set offline through the Affinity Mask
  • Unloading AppDomain on x32
    • Indication for Memory Pressure
  • Interpreting DBCC MEMORYSTATUS
  • Max Worker Count should be changed to a higher value when you use Database Mirrroing with a large amount of databases
    • Mirroring uses at least 3 threads for principal and at least 5 threads for the mirror per database
  • One worker per level of DOP (e.g. 24)
    • 24 Threads for the Producers
    • 24 Threads for the Consumers
    • 1 Thread for the Coordinator
    • For each executed query!
    • Those threads are bound to parallel queries and can't be used by anything else!
  • How To Diagnose and Correct Errors 17883, 17884, 17887, and 17888

-Klaus

SQL MCM Training – Day 1

Monday, August 8th, 2011

Today was my first day of SQL MCM training here in Bellevue. We started at 08:30am and now (05:30pm) Kimberly is currently talking about how to use the provided VPC J. The main theme for today's day was about learning the fundamental about performance tuning. We have covered the following 3 modules:

  • SQL Server IO
  • Waits & Queues
  • IO – Core Concepts

In the morning Paul and Kimberly provided a foundation about how SQL Server is doing IO and the inner architecture of SQL Server. There was a big overlap with the first day of the first week (which I have attended in London), but this was "by design" J. Afterwards Paul jump into Waits & Queues and he has shown a lot of different demos about the various "well-known" wait types like CXPACKET, ASYNC_NETWORK_IO, WRITELOG, etc. Paul also walked down to Latches and Spinlocks, which was really great. Afterwards Jonathan kicked in and told us a lot of IO Core Concepts like the difference between traditional hard disks and SSDs.

I've written down a lot of personal notes for the first day, and the great thing is that I want to share all those notes (and all notes that are coming within the next 3 weeks) with you, so that you can get a picture about the content that we covered. Also a big thanks to Paul, that he allows that! J

Module 1: SQL Server IO

  • File Header Page in the LDF is a mirror of the Boot Page of the MDF
    • SQL Server can write the File Header Page in parallel but NOT the VLFs
    • SQL Server has to read all File Header Pages during startup, therefore it is not recommended to have too much databases on a SQL Server instance
  • DEFAULT file group should be changed so that new objects go automatically to another file group instead of PRIMARY file group
  • When IFI (Instant File Initialization) is enabled you can have a larger Auto Growth setting instead of when IFI is not enabled
  • When Auto Growth for the Log File kicks in, NO additional log records can be flushed to disk and therefore have to wait
    • But only for the database where Auto Growth kicked in, and NOT for the WHOLE instance
  • FGCB_ADD_REMOVE Latch (File Group Control Block)
    • Needed for add/remove files, and Auto Growth
    • Can be checked if Auto Growth is a performance issue
  • The IO subsystem can intersect the partition alignment of Windows Server 2008
    • The partition can be therefore ALSO misaligned!!!
  • File System Level Defragmentation
    • Shutdown SQL Server first - corruption can occur!
    • Gains maybe 1 - 2% performance (when doing large range scans)
  • Reads from the TxLog can be sequential or random
    • Sequential
      • Log Backup
      • Log Reader Agent for Replication
    • Random
      • Crash Recovery
      • Rollback
    • Log Records are chained together backwards through their LSN
    • Rollback always goes backward, therefore Random IO
    • Redo goes forward
    • Rollback goes backward
  • Minimally Logged Operation
    • Changed Data Pages are flushed to disk during the COMMIT
    • The flush is Extent based
  • Allocation Weightings
    • Weightings are per filegroup
    • One file always has a weighting of 1
    • Weightings are recalculated
      • Grown, Shrunk, Added, Removed or 8192 allocations took place
    • During an allocation the weighting is reduced by 1, e.g. 8192 to 8191
    • FGCB_ADD_REMOVE Latch when weightings are recalculated
    • Proportional Fill doesn't work on TempDb
    • Can be tracked through Extended Events - see Jonathans Blog
  • TF 1117
  • Backups/Restores
    • Backups are doing sequential IO
    • Backups are not going through the Buffer Pool
      • They are going directly to the data files
    • Fast Recovery
      • Only available on Enterprise Edition
      • Database is accessible after the REDO phase, and before the UNDO phase begins
    • Separate threads when you do backup striping
  • TempDb
  • Perfmon counter "Disk Queue Length" is not relevant
    • Outstanding IOs are ok, just look at the latency times from DMV sys.dm_io_virtual_file_stats

Module 2: Waits & Queues

  • Waits are just symthoms
  • LCK_M_S: Lock Mode Shared
  • APPEND_ONLY_STORAGE_FIRST_ALLOC
    • Latch on the Version Store
  • FGCB_PRP_FILL: proportional fill latch
  • Dead Latches are possible
    • Bugs inside SQL Server
  • Dead Spins are not possible
    • You can't acquire a Spin, when it is already acquired
  • When a Spinlock is acquired, nothing else can't be latched or locked
  • SUSPENDED threads are not a ordered list, anyone can be signaled to go into the RUNNABLE state
  • RUNNABLE is a FIFO queue, expect when Resoure Governor is used
    • Workload starvation could be theoretical possible for low priority workload
  • Propagated Page Split
    • Scans from the Virtual Root down to ensure that the Shared Latches are present
  • CXPACKET
    • Parallel Index Rebuild uses the statistics to determine how to divide the work
    • Resource Governor can be used to set MAXDOP for resource pools
      • DBCC CHECKDB can't use MAXDOP query hint, so Resource Governor can be used with a dedicated resource pool
  • OLEDB
    • DBCC CHECKDB uses OLEDB internally
    • Microsoft Office (Access, Excel)
  • WRITELOG
    • SSD: 1 - 2ms
    • < 10ms Traditional Hard Disk: >
  • Reasons for CXPACKET
    • Physical Cores are a little bit faster than logical cores, therefore CXPACKET waits also can occur
      • Therefore Hyper Threading is not recommend for SQL Server
    • Scan of partitioned tables, when one partition is on a faster storage system than another partition
    • Out-Of-State Statistics
  • Service Broker Wait Types
  • Preemptive Wait Types of Win32 IO are indicating that FILESTREAM is used
  • System table contentation can be removed if you use multiple insertion points (like in TempDb)
    • Create a dedicated table for each SPID in TempDb (not a temp table)

Module 3: IO Core Concepts

  • "General" SQL Server Latency recommendation
    • < 8ms: Excellent >
    • < 12ms: Good >
    • < 20ms: Fair >
    • > 20ms: Poor
  • Parallel disk head positions are not possible
    • All disk heads are positioned on the same position
  • Short Stroking
    • Storing data in the outer tracks because of the higher data density and lower seek times
    • Disks are storing data from the outer side to the inner side
  • SSD Seek Times
    • From 0.0ms to 0.16ms
  • RAID1 can improve read performance by reading from each disk
  • RAID10 and RAID01 provides the same performance, but a different redundancy
  • RAID01
    • 1 drive failure fails the whole RAID 0 array

Thanks for reading

-Klaus

SQL MCM Training – Day 0

Sunday, August 7th, 2011

For the next 3 weeks I'm in Bellevue/WA for the SQL MCM training provided by Paul Randal, Kimberly Tripp, Jonathan Kehayias, and Bob Beauchemin of SQLskills.com. So don't expect any technical weblog posting from me within that time. I'm sitting in the following 3 training classes:

  • Performance Tuning (5 days)
  • High Availability & Disaster Recovery (4 days)
  • Security, Powershell & Development Support (5 days)

I have already planned to attend the first week of training in London (Internals & Performance), but I had to cancel it after the first day, because of health issues…L. I arrived yesterday evening from Vienna/Austria, which is a very long trip across half of the world. It took me around 24 hours from door to door until I reached Bellevue. During my 2 long flights (VIE – IAD – SEA) I had plenty of time to go through the SQL MCM pre-reading list and read a lot of different whitepapers. Today's day was also about reading, working, and learning about SQL Server – always with a big jetlag in the back…

Tomorrow is Day 1, and I'm really looking forward to it. I'm trying to blog every day about the training, what I have learned, and how I feel during those 3 weeks of powerful SQL Server training!

Thanks for reading

-Klaus

My SQLbits sessions

Wednesday, August 3rd, 2011

Since today its official, I'm presenting 2 sessions at the SQLbits conference in Liverpool (http://www.sqlbits.com) from September 30 – October 1. Here are both of my session abstracts:

Advanced SQL Server 2008 Troubleshooting

It's Friday, 05:00pm. You are just receiving an email that informs you that your SQL Server has enormous performance problems! What can you do? How can you identify the problem and resolve it fast? Which tools provides you SQL Server for this task? In this session you will see a lot of different tools with which you can do effective SQL Server troubleshooting. You will learn how to identify performance bottlenecks with those tools, and how to resolve the identified bottlenecks.

Understanding SQL Server Execution Plans

Do you know SQL Server Execution Plans? Yes! - and can you read/analyze them? No... For the beginner it is not very easy to understand and explain execution plans generated by SQL Server for your queries. Therefore this session gives you a good overview and understanding about SQL Server execution plans and how you can read them without reading through complete books. In this session you will see the basic constructs of an execution plan, and how SQL Server uses them to translate your SQL based query to an internal format which is executed by the execution engine of SQL Server. After attending this session you are able to understand and analyze execution plans generated by your queries, and how you can tune them with the adoption of additional indices. Finally you will also see which additional information about your queries an execution plan will expose to you.

I'm looking forward to see you in one or both of my sessions J

-Klaus

My upcoming speaking engagements

Tuesday, July 26th, 2011

A few days ago I had an interesting conversation on Twitter about my upcoming speaking engagements. I really realized at this time that I have a lot of different SQL Server conferences in my pipeline where I'm speaking over the next months. Here are the conferences where you can meet me, and talk about SQL Server related topics.

  • September 12 - 14, London/United Kingdom:
    I'm doing my own "Advanced SQL Server Performance Troubleshooting Workshop", which is aimed for the senior DBA/DEV with experienced know-how in SQL Server. You can find further information about the workshop at http://www.csharp.at/events.html where you can also place your registration. If you are registering by the end of this month, you will get a discount of GBP 200!
  • September 23, Solid Quality Summit, Budapest/Hungary, http://www.sqlu.com
    I'm doing 3 sessions about SQL Server performance troubleshooting (what else?). I'm talking about Database Internals (yes, that's THE session from SQLbits 8 J), Troubleshooting TempDb Performance, and Advanced SQL Server Performance Troubleshooting Techniques
  • September 26 – 28, Vienna/Austria
    I'm doing again my own "Advanced SQL Server Performance Troubleshooting Workshop" in Vienna, which is again aimed for the senior DBA/DEV. I'm also able to give you a EUR 100 discount, if you register by the end of this month at http://www.csharp.at/events.html
  • September 29, SQLdays, Rosenheim/Germany, http://www.sqldays.net
    SQLdays is a nice and niche SQL Server conference in South Germany (in Bavaria) where they have really good beer J. I'm again doing 3 performance troubleshooting sessions about "Maintaining VLDBs", "Troubleshooting TempDb", and "Advanced SQL Server 2008 Troubleshooting Techniques".
  • September 30 – October 1, SQLbits 9, Liverpool/United Kingdom, http://www.sqlbits.com
    I've submitted several advanced sessions about troubleshooting SQL Server related performance problems. I don't know if I can speak, but I'm speaking regularly the last years, so if I get accepted, I will be of course there!
  • October 10 – 14, SQLPASS, Seattle/USA, http://www.sqlpass.com
    That's the really great one! I'm speaking at the SQLPASS Community Summit regularly since 2006 (expect 2008 because of personal reasons), so this is now my 5th contribution to the best SQL Server conference on earth! I was one of the first European speakers that got a spotlight session in 2007! I'm again doing my famous session about "Advanced SQL Server Troubleshooting Techniques".
  • November 7 – 9, Zurich/Switzerland
    I'm doing my own "SQL Server Performance Tuning & Troubleshooting Workshop" in Zurich/Switzerland, which is aimed for professional DBAs/DEVs. This workshop is very new, because I've announced it yesterday evening J. You can find further information about the workshop at http://www.csharp.at/events.html
  • November 10, TBD
    On November 10 I'm speaking on another SQL Server conference in Europe, but at this time I'm not allowed to tell you any more things about it… ;-(

As you can see from this list, there a lot of opportunities, where you can meet me face-2-face and talk to me about SQL Server related topics. As you can also see I'm doing a lot in the area of performance tuning and troubleshooting. If you need help in these areas of SQL Server, don't hesitate to contact me.

The cool thing is that I'm doing my session about "Advanced SQL Server Troubleshooting Techniques" several times within the next months, so you can expect here a really awesome content. And trust me, this session will differ from everything you have seen in the past, I'm trying to deliver that session in a really new way with some fun for you!

See you soon & thanks for reading!

-Klaus

How my SQL Server trainings helped Hannes and his company to improve their SQL Server based software

Tuesday, July 5th, 2011

In the last months I have done a lot of SQL Server consulting engagements where I have seen the craziest performance problems and learned how to solve them effectively. Sometimes these problems occurred because of misconfigured hardware (mostly because of I/O subsystems), but almost every problem also had some source in the development, maybe a developer hasn't tested his/her code against a production workload, maybe the developers haven't thought carefully about indexing strategies or even locking and blocking scenarios with concurrent users.

Because I have a strong background in development in combination with SQL Server I know exactly WHERE and WHY those problems are introduced. The short answer is very simple: for developers SQL Server is just a simple black box: you feed SQL Server with a query, and sometime later SQL Server returns you the result of the query. The key word here is sometime: sometimes it takes a few milliseconds to execute a query, sometimes it takes a few seconds or even minutes to execute the same query! For me, SQL Server was also a big black box for several years. I didn't have understood the inner workings of SQL Server, but I was able to write SQL Server based applications, because the tool support that Microsoft provides for developers is just amazing. But that's only the half of the truth. There is more that you HAVE to know about SQL Server, so that you are able to write better, scalable, and great performing SQL Server based applications.

For that reason I'm also offering custom SQL Server trainings, where I'm showing developers (and also DBAs) how SQL Server really works, and what are the pitfalls when you are working and developing with SQL Server. I have done such a training for HiCo (see http://www.hico.com) in Austria. They are developing and selling a large .NET application that is based on SQL Server, and in the last months they wanted to improve the performance and scalability of their application, so we have organized a 5-days long training where we went through all the most important core concepts of the relational engine of SQL Server, and how you can troubleshoot them, when you experience serious performance problems.

But by now I want to stop writing, because I few days ago I had the chance to meet again with Hannes – one of the Senior Software Developers of HiCo – who attended the training. We had a little chat about the training, and how it helped him and his company to improve their SQL Server know how, and how they can now relate this know how when they develop new features for their software. Here are some of the questions that I have asked Hannes, along with his answers.

In which area your company is using SQL Server, and what was your background on it, before you attended the training?

Our usage of SQL Server is primary as a repository for storing structured documents (XML, SGML), in our application. Our application uses current technology like WCF (Windows Communication Foundation) and WPF (Windows Presentation Foundation) and supports a large number of users and workflows to create technical publications, according to standards like S1000D (see http://en.wikipedia.org/wiki/S1000D). It's a requirement of our customers to support the Oracle database engine too, so we use nHibernate as an O/R-Mapper. Our initial background on SQL Server was "install it and use it to store our data". We did not care about internal mechanism of SQL Server – in fact we used it as a "black box to store data".

How would you rate your knowledge about SQL Server after attending the training?

The training gave us a "different look" at SQL Server. After the training it is no longer a "black box" for us.

Why your company has chosen to take an advanced SQL Server training?

We wanted to gain knowledge about how to make performance optimizations and how to solve "database problems" – to be more specific: we had some serious locking problems in our application.

Was it worth enough for you and your company to attend the SQL Server training?

For me, as a developer, it was more than worth enough, because I gained a deeper understanding and know "where to start searching", if I encounter problems according to the database. Four our company – as far as I can tell – it was worth too. Our mission is to provide high quality software for our customers and this naturally includes great performance and responsiveness of our applications. Since the database has a great impact on performance, it is important for us to be able to solve performance issues and locking problems.

What was the biggest "wow" effect you had during the training?

This in fact – for me - was something, not related to "performance or locking problems". It was, how important it is to set up a stable backup strategy and what happens, if you set up no backup strategy at all, because that IS, what you do when you use SQL Server as a "repository to store my data" and that's all you care about. Another very interesting point was the internal handling of indexes and what you should not do, when you want to "optimize" database indexes.

How often you had in your day-to-day work referred back to the provided training material and the notes you have taken during the course?

Because Klaus gave us his training materials, we are able to review them whenever required. Since the training, I did not required to review the training materials, but I know where to look, when I have to. The training definitely had influence on how I design database tables now. In contrast to designing tables before the training, I no longer use VARCHAR(MAX) columns for fields, that store a maximum of 500 bytes J

Which module was the one, that helped you the most in your day-2-day work?

The reason why we took the SQL server training, as already mentioned, was because we encountered locking and performance problems. The recipes how to solve these problems helped us most in our day to day work.

Would you recommend the training to other .NET/SQL Server professionals, and if yes, why?

I would definitely recommend the training to all developers that use SQL Server in their applications. It changes your point of view from "something to store my data in" to "I know how my table design affects performance" and now I can read and understand execution plans, and know how to do troubleshooting with them.

What do you want to tell other blog readers about the training?

If you use SQL Server in your applications – this is the course you have to take J

Wow, that sounds pretty good! Hannes and his co-workers really have enjoyed my SQL Server training, and they have learned a lot that they can now use in their day-2-day job. If you also want to have the experience that Hannes has enjoyed, I'm providing my "Advanced SQL Server Performance Troubleshooting Workshop" to you, starting with September across Europe. See http://www.csharp.at/events.html for further information.

This workshop is a "compact" 3-days long advanced training about serious performance problems/bottlenecks that you WILL encounter in your SQL Server production systems. We will have a look on how to find and identify those problems, and – of course – how to solve them. I'm currently providing the public course training in Austria and UK, but other countries like Switzerland, Germany, Norway, etc. are already in my pipeline.

If you are not able to attend the training, because of time constraints or other reasons, you can also drop me an email, so that we can arrange a customized in-house SQL Server training for your company.

Thanks

-Klaus

The mystery of the NULL bitmap mask

Wednesday, June 29th, 2011

In today's weblog post I want to talk about the NULL bitmap mask and some mysteries that you can encounter with it. The NULL bitmap mask encodes at the Storage Engine level which column stores a NULL value and which not. It's a simple bitmap mask, where a set bit (1) means that the column has a NULL value, and a not set bit (0) means that the column has an actual value (not a NULL value).

At the first look this sounds very straightforward, but when you look into the details, there are some implications with this approach. Let's have in the first step a look into the concrete structure of a data row that is used by the Storage Engine to store physically a record on a disk. The following picture shows that structure:

This format is called the FixedVar format, because SQL Server always stores fixed length columns (like INT, CHAR) before variable length columns (like VARCHAR) are stored. As you can see SQL Server stores at the beginning 2 bytes with some status bits, and then in 2 bytes the offset to which the fixed portion of the record is stored. Afterwards you find 2 bytes that stores the column count, which is followed by the actual NULL bitmap mask.

When you look at this in the first step, everything makes sense to you, but after a second look, you start thinking and you may ask, WHY SQL Server stores the actual column count in EVERY data row??? The column count MUST be identical for every data row! Why SQL Server actually stores redundant data?

The first (short) answer is very easy and logical: SQL Server needs the column count to calculate how many bytes are used for the actual NULL bitmap mask. Up to 8 columns in a table needs 1 byte, 9 – 16 columns need 2 bytes, 17 – 24 columns need 3 bytes, and so on. Makes sense, but again: the column count for each record in a table MUST BE THE SAME!

Let's have now a more detailed look into the technical correct answer, along with some examples. First of all, you need to know that the NULL bitmap mask stores the number of columns that are CURRENTLY stored in the record at the Storage Engine level. This means that SQL Server can store a different amount of columns in a physical data row – ok, now it gets really confusing… So the column count in the physical data row does not have to match with the column count in the table metadata layer (sys.columns). Those are also really different layers inside the architecture of SQL Server.

So under which circumstances those layers are not matching to each other? It's very simple: when you are adding columns to an existing table! SQL Server makes a big difference if you add a NULL or NOT NULL column to a table. When you just add a new NULL column to a table, SQL Server only updates the table metadata layer, WITHOUT touching the Storage Engine layer. This means that none of your records gets physically changed, when you add a NULL column. On the other hand, when you add a NOT NULL column, SQL Server updates the table metadata layer, and ALSO the Storage Engine layer, which means that SQL Server has to touch and rewrite EACH of the records in the table, where you have added the NOT NULL column. This makes a huge performance different! For that reason SQL Server has to store the actual column count in each data record, because the column count must not be in synch with the table metadata layer.

Let's have now a more detailed look into a concrete example. For this example I have created a simple table with 8 columns, which means that SQL Server uses 1 byte for the NULL bitmap mask:

CREATE TABLE TestTable

(

Column1 INTIDENTITY(1, 1)NOTNULL,

Column2 CHAR(600)NOTNULL,

Column3 CHAR(600)NOTNULL,

Column4 CHAR(600)NOTNULL,

Column5 CHAR(600)NOTNULL,

Column6 VARCHAR(600)NOTNULL,

Column7 VARCHAR(600)NOTNULL,

Column8 VARCHAR(600)NOTNULL,

)

GO

Afterwards I have inserted 2 records into the previous created table:

INSERT INTO TestTable VALUES

(

REPLICATE('2', 600),

REPLICATE('3', 600),

REPLICATE('4', 600),

REPLICATE('5', 600),

REPLICATE('6', 600),

REPLICATE('7', 600),

REPLICATE('8', 600)

),

(

REPLICATE('2', 600),

REPLICATE('3', 600),

REPLICATE('4', 600),

REPLICATE('5', 600),

REPLICATE('6', 600),

REPLICATE('7', 600),

REPLICATE('8', 600)

)

GO

When you dump out both data pages through the DBCC PAGE command, you can see that each record has a length of 4219 bytes (4204 bytes data + 7 bytes row overhead + 2 bytes variable length column count + 3 x 2 bytes variable column offset array entry).

DBCC TRACEON(3604)

GO

DBCC IND(InternalStorageFormat, TestTable,-1)

GO

DBCC PAGE(InternalStorageFormat, 1, 79, 1)

GO

DBCC PAGE(InternalStorageFormat, 1,89, 1)

GO

Now just add a new NULL column to the existing table:

ALTER TABLE TestTable ADD Column9 CHAR(600)NULL

GO

This is the 9th column in the table, which means SQL Server needs now 2 bytes for the column count. But SQL Server DOES NOT changes the physical data row at the Storage Engine level, because you are just adding a NULL column. SQL Server doesn't have to do anything on the Storage Engine level. You can prove that just by dumping out the 2 data pages again:

DBCC PAGE(InternalStorageFormat, 1, 79, 1)

GO

DBCC PAGE(InternalStorageFormat, 1,89, 1)

GO

The records have the same length of 4219 bytes, but you have added a column (logically) to the table. Let's try now to update one record of the table, so that the previous added column gets an actual value:

UPDATE TestTable SET Column9 =REPLICATE('9', 600)

WHERE Column1 = 1

GO

When you now dump out the page where the first record of the table is located, you can see that the record size is now 4820 bytes. SQL Server has now expanded the NULL bitmap mask to 2 bytes (1 additional byte overhead) and has written out the actual value of the column (600 additional bytes). SQL Server increased the record size from 4219 bytes to 4820 (4219 + 1 + 600 = 4820).

When you dump out the data page where the 2nd record of the table is stored, the record size is the old one of 4219! You have now created a scenario where SQL Server stores a different length of the NULL bitmap mask inside the data row. This also means that you can have scenarios where a table with only fixed length columns has different row sizes on the Storage Engine level – sounds interesting, isn't it? J

Let's now drop the table, recreate it, and insert the 2 records into it. Now we are adding a NOT NULL column to the table:

ALTER TABLE TestTable ADD Column9 CHAR(600)NOTNULL DEFAULTREPLICATE('9', 600)

GO

Now, SQL Server has to change EVERY record on the Storage Engine Level, because the default value of the new column must be added (you must define a default value for the new column when you already store records in a table), and SQL Server also has to expand the NULL bitmap mask.

This phenomenon can lead to serious performance problems when you are dealing with large tables, and you want to add a NOT NULL column. To give you an idea about the performance degradation I have inserted 1 million records into that table. When I have added a NULL column, SQL Server needed a few milliseconds, because it was just a metadata operation. But when I have added to that table a NOT NULL column, it took around 40 seconds (!) until the ALTER TABLE was completed. So as you can see this is a serious performance degradation when dealing with adding NOT NULL columns to big tables!

I hope that you now understand why SQL Server has to store the actual column count in each record on the Storage Engine level, and that you can have serious performance problems when you are adding NOT NULL columns to large tables in SQL Server.

Thanks for reading!

-Klaus