Archive for November, 2009

SQL Saturday #34/New England Data Camp v2.0

Monday, November 30th, 2009

It’s official sports fans. Well, it’s been official since last week since Adam Machanic set up the web site. The New England SQL Server Users Group and the Southern New England SQL Server Users Group are again jointly hosting a full day of SQL Server goodness on January 30th. The call for sponsors and speakers is open. We already have several local luminaries lined up to present including Aaron Bertrand on Management Studio Tips & Tricks and Scott Abrants on Automating Database Deployments with Visual Studio.

Please register to spend a day with your peers, learning and networking. It’s being held at the Microsoft Waltham office, a great facility. We should have a full day with lots to do and learn.


How do -you- install SQL Server? (Part 3)

Monday, November 30th, 2009

Welcome to part 3 in this series on a thought process around SQL Server Installation. In Part 1, we started talking about some basic steps and a thought process for a repeatable success story with your SQL Server installations. Part 2 discussed security, playing around before doing a first install and smoke testing afterwards.

Today we’ll close the discussion with some talk about maintenance, defaults and some post installation tips.

“The install is done, here you go”

Wait! You clicked finish, you reviewed the logs and tested connectivity like we talked about in Part 2. You aren’t ready to deliver yet though. Let’s cover a few things first. This, again, is not meant to be exhaustive but to help you think about the full process. Check out the documentation and blogs and explore in a dev environment.

Defaults

If you just did the installation and changed a few things like some of the directories and options, you don’t have all of the defaults but there are things to look at after the installation. There are defaults to change, or at least understand and consider. Let’s hit a few of them here.

Recovery Model

As the model database goes, so goes each new database you create (without manually changing it). The Model database is just that – a model database. Add a table to model and each new database you create will have that table in it. The same goes for recovery model. On all new installations of SQL Server the recovery model for the model database is “FULL” out of the box. This means that every new database you create will start out in FULL recovery model and the log file will continue to grow (after you take your first full backupuntil you take a full backup you are not truly in full recovery mode either when first starting out or when switching to full from simple). This is important to consider because if you are only planning on ever doing full backups and your recovery plans only have you recovering to a daily full backup, for example, you will see your log file grow to a point that can get out of control. (Then you have to start playing with -ugh- shrinking)

If you don’t want this then change your recovery model. If you don’t know what it all means, start with books online.

Database Sizes

By default new database files are set to a small size (2MB for data, 1MB for log). Don’t accept the defaults here. How big will your database grow to? What kind of transaction log space do you expect to need? Plan for that growth, add an overage percentage and “right-size” your files at the beginning. This prevents file fragmentation from frequent growths each time the file needs to grow and it helps give you peace of mind in knowing your database is right sized from the beginning.

For the Log File, you may want to check out this post by Kimberly L. Tripp on some tips for right sizing the transaction log.

Database Locations

Hopefully you setup a separate drive each for data files, log files and tempdb. That is generally a best practice though in lighter workloads it may not be absolutely necessary to split out tempdb. I say for log files it is always good from a performance and recovery standpoint to split log files from data files.

Where is your tempdb file? Do you need to move it to the drive you requested and setup for tempdb? (Check out this books online article on Alter Database. There is an example there for moving tempdb, in fact)

Are your default directories for new database log and data files setup right? (Right click on the server in SQL Server Management Studio –> Go to Database Settings –> What’s specified for the default directories?)

Data and Log File Growth

The default growth options for data and log files are likely not what you want to see. While it’s important to “right-size”, I am also of the opinion that you should allow for autogrowth as an insurance policy.  I just don’t like the default growth sizes (1MB for data files and 10% for log files). Instead I like to specify an amount that allows a “healthy” growth for that particular database and workload while looking to minimize file fragmentation from small chunks grown at a time. This varies from database to database. On an environment I just configured for a database that started at 25GB data and 5GB log file I chose 500MB and 250MB respectively.

I like to setup alerts for autogrowth operations with a monitoring tool watching my SQL environments. I like the insurance policy but in my mind using the insurance means there was a problem with the right-sizing.

Backups (or lack thereof)

We just talked above about understanding your recovery model. That is the first step in the game. Step two is setting up a backup strategy that matches your recovery strategy. My philosophy is that people focus on their backups too much.  What I mean by that is – remember why we take backups – we take them to have a way of recovering our databases in a crisis. If you can remember that, your backup plan should come together and you should have backups setup and running in your new instance sooner than most other tasks.

Maintenance

There isn’t any being done yet. You can change the defaults above and deliver, but you’ll be getting to know this newly created instance at some point in production. Could it be a need to recover from failure? Maybe performance was fine and just started getting worse all of the sudden? SQL Server databases, like most other tools and applications, require regular maintenance. Let’s briefly talk about some of the quick hits I throw on most instances as a basic rule.

Backups and history maintenance

Can’t have too many reminders about backups. Make sure you are taking backups and have them running in an automated fashion.

Also make sure you are cleaning up your backup history. There is a great blog post by Jonathan Kehayias here. Rather than regurgitate or try my own attempt, check it out.

Index and Statistics Maintenance

Your databases should have indexes for performance minded querying of data. They also help in the performance of updates (which are essentially making changes to data based on a query of that data) and having a clustered index can sometimes mean an improvement to inserts and deletes as well.

They grow less useful if your indexes are heavily fragmented and more work is required when working with an index than would otherwise be necessary. Look at rebuilding or reorganizing indexes as time allows. A great script by Michelle Ufford for this is here.

Statistics also help your database performance. Without them, the optimizer has a lot less to work with. Statistics help show the optimizer how unique your data is, how many qualifying rows could be found for queries, etc. They can get out of data even with auto update statistics set on (the default behavior). Updating statistics is a good practice to get into with maintenance as well. Paul Randal blogs about the debate on what to do first (statistics update or index rebuild).

Error Log Recycles

Look up sp_cycle_errorlog and sp_cycle_agent_errorlog in books online. They are system stored procedures that do what they sounds like. They cycle the error logs in SQL Server and the SQL Server Agent. If you don’t do this from time to time the error logs can get a little too large to easily scroll through if you don’t restart your instance a lot (the error logs are reset each time your instance is restarted).

I typically recycle them weekly, especially in environments where I don’t have a monitoring tool and/or I do daily log checks on production environments ensuring no issues.

I also increase the number of error logs to a new number. If someone were to compromise a SQL environment they could theoretically run the sp_cycle_errorlog command the right number of times and some of their tracks are gone. I increase this to help alleviate that. It’s not ironclad but it is one more way to help harden SQL Server or at least provide more of a forensics ability.

DBCC CHECKDB

I don’t like database corruption. It doesn’t happen much and I take frequent backups to help alleviate potential data loss. Database corruption is rare nowadays. With each new release of SQL it seems to become rarer and rarer. But… It happens.

I like the comfort of knowing I check my databases for corruption as frequently as I can. I will typically setup a job to perform a “DBCC CHECKDB with no_infomsgs” on all of my databases as frequently as I can. While DBCC CHECKDB can be run concurrently with users in the system, I like to pick either low use or no use times to run it, especially for larger databases. I try to run it at least weekly and the more frequently, the better.

This will help my chances of detecting corruption before a user does.

Wake me up, as you go-go

(Is it wrong to paraphrase a wham song in public? I grew up in part of the 80s, but I digress…)

What kind of monitoring do you use at your company? There are a lot of great tools out there to use. There are even homegrown approaches you can find on SQLServerPedia or in articles at SQL Server Central.

There are pros and cons to many of the tools and some quick web searches will show you a lot of these. The point is – use something!

Don’t rely on the users to tell you when you have a problem or when a job fails. Before I hand an environment over to a project team or users, I like to know that I will find out if a job runs too long, a job finishes too quickly, the CPU is pegged for a substantial amount of time, the SQL Services become unavailable, etc.

It bears restating: Discover your problems before your users report them to you. (Or at least have a chance to do so!)

—–

Those are the basics I like to focus on when installing SQL Server. I am sure there are items that others look at as well. Please feel free to share them in the comments below. Thanks for reading the series and happy installations!

If you have any problems, feel free to leave a comment – I check regularly and have no problem helping you out with an issue during installation with some advice, clarification or a point in the right direction.

Some Related Thoughts

Want more content like this? Subscribe to this blog’s feed in your favorite feed reader and stay up to date with tips and tricks. I like to share techniques and free tools I use to help me with my day job.

Share/Bookmark

How do -you- install SQL Server? (Part 3)

Monday, November 30th, 2009

Welcome to this series on a thought process around SQL Server Installation. In Part 1, we started talking about some basic steps and a thought process for a repeatable success story with your SQL Server installations. Part 2 discussed security, playing around before doing a first install and smoke testing afterwards.

Today we'll close the discussion with some talk about maintenance, defaults and some post installation tips.

"The install is done, here you go"

 

Wait! You clicked finish, you reviewed the logs and tested connectivity like we talked about in Part 2. You aren't ready to deliver yet though. Let's cover a few things first. This, again, is not meant to be exhaustive but to help you think about the full process. Check out the documentation and blogs and explore in a dev environment.

Defaults

 

If you just did the installation and changed a few things like some of the directories and options, you don't have all of the defaults but there are things to look at after the installation. There are defaults to change, or at least understand and consider. Let's hit a few of them here.

Recovery Model

As model goes, so goes each new database you create (without manually changing). The Model database is just that - a model database. Add a table to model and each new database you create will have that table in it. The same goes for recovery model. On all new installations of SQL Server the recovery model for the model database is "FULL". This means that every new database you create will start out in FULL recovery model and the log file will continue to grwo (after you take your first full backup -- until you take a full backup you are not truly in full recovery mode either when first starting out or when switching to full from simple). This is important to consider because if you are only planning on ever doing full backups and your recovery plans only have you recovering to a daily full backup, for example, you will see your log file grow to a point that can get out of control. (Then you have to start playing with -ugh- shrinking)

If you don't want this then change your recovery model. If you don't know, start with books online.

Database Sizes

By default new database files are set to a small size (2MB for data, 1MB for log). Don't accept the defaults here. How big will your database grow to? What kind of transaction log space do you expect to need? Plan for that growth, add an overage percentage and "right-size" your files at the beginning. This prevents file fragmentation from frequent growths each time the file needs to grow and it helps give you peace of mind in knowing your database is right sized from the beginning.

For the Log File, you may want to check out this post by Kimberly L. Tripp on some tips for right sizing the transaction log.

Database Locations

Hopefully you setup a separate drive each for data files, log files and tempdb. That is generally a best practice though in lighter workloads it may not be absolutely necessary to split out tempdb. I say for log files it is always good from a performance and recovery standpoint to split log files from data files.

Where is your tempdb file? Do you need to move it to the drive you requested and setup for tempdb? (Check out this books online article on Alter Database. There is an example there for moving tempdb, in fact)

Are you default directories for new database log and data files setup right? (Right click on the server in SQL Server Management Studio --> Go to Database Settings --> What's specified for the default directories?)

Data and Log File Growth

The default growth options for data and log files are likely not what you want to see. While it's important to "right-size", I am also of the opinion that you should allow for autogrowth as an insurance policy.  I just don't like the default growth sizes (1MB for data files and 10% for log files). Instead I like to specify an amount that allows a "healthy" growth for that particular database and workload while looking to minimize file fragmentation from small chunks grown at a time. This varies from database to database. On an environment I just configured for a database that started at 25GB data and 5GB log file I chose 500MB and 250MB respectively.

I like to setup alerts for autogrowth operations with a monitoring tool watching my SQL environments. I like the insurance policy but in my mind using the insurance means there was a problem with the right-sizing.

Backups (or lack thereof)

We just talked above about understanding your recovery model. That is the first step in the game. Step two is setting up a backup strategy that matches your recovery strategy. My philosophy is that people focus on their backups too much. I blogged about it here. What I mean by that is - remember why we take backups - we take them to have a way of recovering our databases in a crisis. If you can remember that, your backup plan should come together and you should have backups setup and running in your new instance sooner than most other tasks.

 

Maintenance

 

There isn't any being done yet. You can change the defaults above and deliver, but you'll be getting to know this newly created instance at some point in production. Could it be a need to recover from failure? Maybe performance was fine and just started getting worse all of the sudden? SQL Server databases, like most other tools and applications, require regular maintenance. Let's briefly talk about some of the quick hits I throw on most instances as a basic rule.

Backups and history maintenance

Can't have too many reminders about backups. Make sure you are taking backups and have them running in an automated fashion.

Also make sure you are cleaning up your backup history. There is a great blog post by Jonathan Kehayias here. Rather than regurgitate or try my own attempt, check it out.

Index and Statistics Maintenance

Your databases should have indexes for performance minded querying of data. They also help in the performance of updates (which are essentially making changes to data based on a query of that data) and having a clustered index can sometimes mean an improvement to inserts and deletes as well.

They grow less useful if your indexes are heavily fragmented and more work is required when working with an index than would otherwise be necessary. Look at rebuilding or reorganizing indexes as time allows. A great script by Michelle Ufford for this is here.

Statistics also help your database performance. Without them, the optimizer has a lot less to work with. Statistics help show the optimizer how unique your data is, how many qualifying rows could be found for queries, etc. They can get out of data even with auto update statistics set on (the default behavior). Updating statistics is a good practice to get into with maintenance as well. Paul Randal blogs about the debate on what to do first (statistics update or index rebuild).

Error Log Recycles

Look up sp_cycle_errorlog and sp_cycle_agent_errorlog in books online. They are system stored procedures that do what it sounds like. They cycle the error logs in SQL Server and the SQL Server Agent. If you don't do this from time to time the error logs can get a little too large to easily scroll through if you don't restart your instance a lot (the error logs are reset each time your instance is restarted).

I typically recycle them weekly, especially in environments where I don't have a monitoring tool and/or I do daily log checks on production environments ensuring no issues.

I also increase the number of error logs to a new number. If someone were to compromise a SQL environment they could theoretically run the sp_cycle_errorlog command the right number of times and some of their tracks are gone. I increase this to help alleviate that. It's not ironclad but it is one more way to help harden SQL Server or at least provide more of a forensics ability.

DBCC CHECKDB

I don't like database corruption. It doesn't happen much and I take frequent backups to help alleviate that. Database corruption is rare nowadays. With each new release of SQL it seems to become rarer and rarer.

Even still, I like the comfort of knowing I check my databases for corruption as frequently as I can. I will typically setup a job to perform a "DBCC CHECKDB with no_infomsgs" on all of my databases as frequently as I can. While DBCC CHECKDB can be run concurrently with users in the system, I like to pick either low use or no use times to run it, especially for larger databases. I try to run it at least weekly and the more frequently, the better.

This will help my chances of detecting corruption before a user does.

Wake me up, as you go-go

(Is it wrong to paraphrase a wham song in public? I grew up in part of the 80s, but I digress...)

What kind of monitoring do you use at your company? There are a lot of great tools out there to use. There are even homegrown approaches you can find on SQLServerPedia or in articles at SQL Server Central.

There are pros and cons to many of the tools and some quick web searches will show you a lot of these. The point is - use one.

Don't rely on the users to tell you when you have a problem or when a job fails. Before I hand an environment over to a project team or users, I like to know that I will find out if a job runs too long, a job finishes too quickly, the CPU is pegged for a substantial amount of time, the SQL Services become unavailable, etc.

It bears restating: Discover your problems before your users report them to you. (Or at least have a chance to do so!)

-----

Those are the basics I like to focus on when installing SQL Server. I am sure there are items that others look at as well. Please feel free to share them in the comments below. Thanks for reading the series and happy installations!

If you have any problems, feel free to leave a comment - I check regularly and have no problem helping you out with an issue during installation with some advice, clarification or a point in the right direction.

SQL 2008 R2 Utility Control Point (UCP) Videos

Monday, November 30th, 2009

While Playing around with the UCP inside of SQL 2008 R2 I found some videos to help you get started. The November CTP has six videos listed but only five of the links are currently working.

Here is the splash screen you will see to help you get started:

UCP_splash

The link to “Enroll instances…” does not seem to be working. The other five are working just fine. Since I am not able to play the videos directly from my server, so I had to save the files to a network share and then play them from my desktop.

Here are the actual links:

Create a Utility Control Point (UCP)

Connect to an existing UCP

Create a data-tier application

Set Resource Health Policies

Modify UCP Users

Now, maybe it is just me, but I think there is some inconsistencies with the use of capitalization. But this is only a CTP, so I will refrain from filing a Connect item for now. But I may file one for the broken link.

The videos are hosted by Buck Woody (blog | @buckwoody), who is all business. I kept hoping to start making a list of people or things but he was professional throughout. In other words, the videos have little entertainment value, but are a wonderful resource to help you get started using UCP.

SQL 2008 R2 Utility Control Point (UCP) Videos

Monday, November 30th, 2009

While Playing around with the UCP inside of SQL 2008 R2 I found some videos to help you get started. The November CTP has six videos listed but only five of the links are currently working.

Here is the splash screen you will see to help you get started:

UCP_splash

The link to “Enroll instances…” does not seem to be working. The other five are working just fine. Since I am not able to play the videos directly from my server, so I had to save the files to a network share and then play them from my desktop.

Here are the actual links:

Create a Utility Control Point (UCP)

Connect to an existing UCP

Create a data-tier application

Set Resource Health Policies

Modify UCP Users

Now, maybe it is just me, but I think there is some inconsistencies with the use of capitalization. But this is only a CTP, so I will refrain from filing a Connect item for now. But I may file one for the broken link.

The videos are hosted by Buck Woody (blog | @buckwoody), who is all business. I kept hoping to start making a list of people or things but he was professional throughout. In other words, the videos have little entertainment value, but are a wonderful resource to help you get started using UCP.

SQL Server XQuery For Idiots

Monday, November 30th, 2009

I’m still struggling with learning XQuery. My latest little revelation was small, but vital.

The difference between:

@inXML.nodes('/rss/channel/item')

and

@inXML.nodes('/rss[1]/channel[1]/item')

Is the difference between a 10 minute query and a 4 second query. Kind of shocking really. My understanding is that XQuery assumes there are multiple possible paths that look like “/rss/channel” so it searches over and over again through the XML to see if it can find them. But by identifying it as a root, showing that only one possible path is available, it stops stumbling around in the dark and simply reads the data.

It’s a little thing, but it made an enormous difference. I’ve still got a long way to go in learning how to use XPath within XQuery.

UPDATED: I modified the title so that it’s clear I’m talking about SQL Server here.

TechNet WebCast: SQL Server Locking & Blocking

Monday, November 30th, 2009

Next Monday, I’m presenting a TechNet WebCast for Microsoft Thrive Live! The subject: SQL Server Locking & Blocking Made Simple. Here are the details:

Event Overview

A good working knowledge of how Microsoft SQL Server makes use of locking and transaction isolation levels can help you greatly improve an application’s performance. In this webcast, we explore the SQL Server locking methodology and share techniques for enhancing query response times.

Language(s): English
Product(s): Microsoft SQL Server
Audience(s): IT Generalist
Duration: 60 Minutes
Start Date: Monday, December 07, 2009 9:00 AM Pacific Time (US & Canada)
Event ID: 1032432983

I’d love to have you join me for this free session. Click here to register.

OPENXML can be a beast sometimes.

Monday, November 30th, 2009

Our application includes some search functionality which is pretty complex to deal with.  In a nutshell the user can select multiple values from a couple of lists on the website and use those listed to filter down the rows which are being searched.  These lists are passed into the SQL Server as a couple of XML documents.  We recently had a larger customer call and complain that the search was slow.  I fired up profiler and grabbed the query.  They were right, 6 minutes is a long time for a query to take.

There's some full text searching going on so it's never going to really scream as there are 70 million records in one table, 57 million records in another table (with a one to many between them), and another 90 million records in the third table (this is a one to many to the table with 70 million records, and this table has the full text index on it).

Needless to say there was some tuning that I had to do.  The basic jist of the query was...

SQL:
  1. ...
  2. WHERE EXISTS (SELECT *
  3. FROM OPENXML(@hDoc_Computer, '//computer')
  4. WITH (ComputerId INT '@id') a
  5. WHERE a.ComputerId = Application.ComputerId)
  6. AND EXISTS (SELECT *
  7. FROM OPENXML(@hDoc_Logon, '//login')
  8. WITH (LogonId INT '@id') a
  9. WHERE a.LogonId = Application.LogonId)

After getting no where working on indexes and tweaking things here and there (and actually making the query take 16 minutes to run for this customer's data) I put a couple of table variables in the procedure and loaded those table variables up with the values from the XML Documents.

SQL:
  1. DECLARE @Computer TABLE
  2. (ComputerId INT)
  3.  
  4. DECLARE @Logon TABLE
  5. (LogonId INT)
  6.  
  7. ...
  8.  
  9. INSERT INTO @Computer
  10. SELECT ComputerId
  11. FROM OPENXML(@hDoc_Computer, '//Computer', 2)
  12. WITH (ComputerId INT '@ComputerId')
  13.  
  14. INSERT INTO @Logon
  15. SELECT LogonId
  16. FROM OPENXML(@hDoc_Logon, '//Logon', 2)
  17. WITH (LogonId INT '@LogonId')
  18.  
  19. ...

And I changed the WHERE clause to use the table variables instead.

SQL:
  1. WHERE EXISTS (SELECT * FROM @Computer a WHERE a.ComputerId = Application.computerId)
  2. AND EXISTS (SELECT * FROM @Logon a WHERE a.LogonId = Application.LogonId)

This got my query run time down to about 1 minute with the execution plan showing that ~90% of the time spent is being spent on the full text search. So while I wouldn't normally consider a query run time on 1 minute to be good, in this case it is. (This particular part of the application also goes out to the file server and uses Microsoft Search service to search millions of files for text string matches so this is now the fastest part of the search process.

Now don't take this post the wrong way. I love OPENXML, it's a great tool and I use it all over the place so that we can pass in multiple values in a single variable (all our code has to be able to run on SQL 2005 so table input parameters aren't an option for me). OPENXML just wasn't the write tool here, sort of.

I wish we could have found this performance problem in QA, but we just have no way to generate enough data to find these kinds of performance problems. But the problem is fixed and the customer is hopefully happy (for now).

Denny

SQLBits V – Conference & Session Recap (Part 1)

Friday, November 27th, 2009
Better late than never

I’ve been a little under the weather since getting back from SQLBits V so this post has been days in the making instead of just s a few hours. But here it is. This post focuses on the conference sessions, I am going to follow up with another post on people, conference feedback and things I might do differently next time.

Even the birds aren’t up this early…

Friday started swimmingly. Up at the crack o’ dawn,  taxi to a mainline station (my local didn’t have an early enough train to get me into London), only to find that the train has been delayed by 18 minutes. Pants. There goes all my hard work ‘being early’. The good news is that I made it on to the Newport Train (from Paddington). With just 2 minutes to spare. And I got a seat with a power outlet. Bonus! All my good intentions at getting a head start on an article have gone out the window as I am a little tired, so I try to catch up on some much needed shut eye.

Keynote by Donald Farmer

I arrived at the Celtic Manor, having shared a taxi, just in time for registration and a quick bacon bap before the welcome started. The keynote was given by Donald Farmer (Blog | Twitter). He gave us a great preview of the simply awesome PowerPivot and a few features in SQL Server 2008 R2. See this recap by Brent Ozar.

Below is a recap of the sessions that I found I got the most out of.

Friday Session 1 – New Development Features in SQL Server 2008 [Simon Sabin]

The first session of the day, that I attended, was Simon Sabin’s (Blog | Twitter) overview of the new development features that can be found in SQL Server 2008. Simon favours the demo approach for sessions and so had relatively few slides. Not a bad thing. Among other features available in SQL Server 2008, the following were discussed/demoed:

  • Variable declaration and initialisation – this can now be done in the same line (Declare @i Int = 1)
  • Named regions - You can create these using BEGIN & END blocks & naming it with a comment. Named regions will enable you to collapse the blocks for readability.
  • Intellisense – as has been available to .NET programmers for years. My only query here is, why not use the syntax that can be found in C# when writing LINQ queries [where you specify the FROM TABLENAME line first and then the SELECT statement). As such, you find yourself writing the FROM line first and then going back up to the SELECT statement….
  • Highlighted Syntax errors - think MS Word red squiggly lines to highlight that something is the matter.
  • Code validation at design time -
  • Date & Time datatypes - new Date datatype allows dates before 01/01/1753, and only takes up 4 bytes of storage. great for when you don’t actually need time precision. Time datatype handles precision down to the nano second, for when you do. The Time datatype also only refers to the time of day and not a time interval.
  • No longer allowed implicit calculations on dates - you now must use Date functions to perform calculations on Date datatypes (i.e. DateAdd(), Not “MyDatetime + 1”).
  • DateTime Offset - Allows you to store the date and time that has time zone awareness (based on the 24-hour clock)
  • Debugging - You can now debug your code! breakpoints, watch lists, view locals, etc
  • MERGE Statement - enables you to perform updates/deletes/inserts all within one statement.

Whilst this list is not conclusive, it certainly highlighted a number of new features that are now available in SQL Server 2008 and incentivises you to at least find out what else is available that could simplify you SQL development.

This was a great session, and if you find out that Simon is doing a presentation near you (UK User Groups), I suggest going.

Friday Session 2 - SQL Server 2008 R2: What’s new in Reporting Services? [Donald Farmer]

Donald Farmer (Twitter | Blog) blew my mind. Not once, but twice. First, in his keynote, he talked about the awesomeness of PowerPivot. Then in this session, he did it again. I loved this session and quickly realised that there’s going to be quite a learning curve and an altering of mindset of how reporting ‘gets done’ in the future. Why? Because the lines between the business & IT are blurring.

With the new Self Service Analysis and Reporting tools, there’s a new approach to finding out and tracking how the business is performing. So, if you are a Reporting Services Developer or Data/MI(S) Analyst. Things are going to change:

  • PowerPivot & Excel 2010 bring a lot more functionality ( & power) to business users. Enabling them to analyse data at length straight from the data source.
  • The arrival of report components in report builder, means that reports can be built quicker using existing parts of reports or report items. Think sub-reports, but on a more granular scale. Each report item can be a reusable component, which can be tweaked by the user for use in their own report.This includes the reuse of datasets, as components. These components are centrally managed by IT (or a reporting team).

So what did we learn? Here’s a bulleted summary:

  • Business users analyse data through tools like PowerPivot & Excel 2010, digging for answers to business questions. This analysis is an iterative process and gives rise to metrics that can be monitored. Enter reporting services. Reports are built to monitor the ‘answers’ discovered by the business analysts.
  • Reports can be created using components. Components are pre-built report objects that reside in a library that anyone can reuse. Think of sub-reports, but in a centrally managed way.
  • Created reports that are saved to the Server are saved as a report in its entirety, as well as individual report components. These components are then available to other users to include/tweak in their own reports.
  • When Publishing reports, you can choose which components of the reports you want to publish.
  • Report Data can be subscribed to via feeds (rss anyone?). This can be used to in PowerPivot to run Excel-based analyses using Pivot tables
  • New features of Reporting Services:
    • Spark lines – think inline charts. No x/y axes. Just used to illustrate a trend.
    • Data bars – no more having to use expressions to calculate cell padding and the applying a background colour. Pick your element & configure.
    • KPI’s – like those in SSAS, you can now add a KPI and configure it’s value, goal, status & trend. KPI’s can be ‘componentised’ and reused in other reports.
    • Lookup functions
    • Aggregates of Aggregates
    • New data sources (Project Madison, SQL Azure)
    • Ability to reset page numbers based on groups
    • Expression-based page breaks
    • Expression-based chart headers
    • Naming Excel Worksheet tabs (on render to excel)
    • Bottom-top text rotation
    • Session variables
  • R2 to be released in first half of 2010 (CTP is already out).
Saturday Session 2 – Jamie Thomson’s Whistle Stop Tour of SSIS Add-ins

In this session Jamie took us through the fact that there are a number of SSIS custom tasks available on the intertubez. Many of them are free and can be found on CodePlex. There were only one or two slides (Jamie’s contact info), as the whole session was taken up demoing each of the tasks he had chosen.

  • XMLify Task
  • Kimball SCD (Slowly Changing Dimensions) Task
  • TwitterTask
  • Term Extraction Task
  • Normaliser Task
  • Rank Transform Task
  • Compression & Decompression Task
  • FileWatcher Task
  • Trace File Task
  • Dynamic Data Flow Task
  • XMLify (http://xmlify.codeplex.com) – takes columns that you specify (from a query/view) and loads them into an xml block. Jamie demoed this for capturing error outputs.

    Kimball SCD (http://kimballscd.codeplex.com) – For slowly changing dimensions. Takes all data in the destination and for each column you can define it’s usage ( type 1 change, type 2 change, business key, surrogate key, etc). Type 1 changes will update the destination data based on the source data. This task is a monster. Loads of options for configurability.

    Twitter task (http://ssistwittersuite.codeplex.com/) I think this is awesome. You can automate sending your tweets. Careful though, password is in full view in design mode. You can also get all replies/mentions and then data mine them using the Term Extraction task. I can see this being used as a substitute for text (SMS) messaging notifications, for whatever you need to be notified about (if you’re always on Twitter).

    Term Extraction – A data mining task to pick out nouns, verbs, phrases. Although not an add-in (comes with SSIS) this is still a neat task that I had not used previously. Jamie is using it to sift through the Twitter feed to his @ssisdemo Twitter account. Very cool use to see what people are saying (words & phrases), and how often each occurs. Could be useful, I guess.

    Normaliser (http://normaliser.codeplex.com) – Normalises the data from an incoming data stream. Pure. Simple. Awesome. Data must be sorted as a prerequisite. This task only handles one to many relationships.

    Rank Transform (http://ranktransform.codeplex.com) – Ranks a dataset by Partition. Enables you to view a dataset ordered and ranked (by specified column), as well as being able to specify a partition for the ranking (i,e, month). The dataset must be sorted before use. Rank & Dense Rank have their differences, specifically where you have two ranked values that are equal (a ranking ‘tie’). So, if you have two values that are equal and ranked ‘2’, then for the column RANK the next value will be ranked as ‘4’ whereas the column DENSE RANK the next value will be ranked as ‘3’. Still confused? Go here.

    Zip task (http://cozyroc.com/ssis/zip-task)- this task takes a source file and compresses it into a gzip file (.gz). That’s it. Decompression decompresses the file. This will only work with a single file and not for a folder or multiple files. I think I would prefer to use 7zip and an Execute

    FileWatcher (http://www.sqlis.com/post/File-Watcher-Task.aspx) – This task ‘watches’ a directory, when a file is detected it will progress the flow to the next task. Placed in an endless loop this will continually execute, and will pick up files as they arrive. Neat.

    Trace File (http://www.sqlis.com/post/Trace-File-Source-Adapter.aspx) – understands a sql trace file and enables you to handle the trace file contents.

    Data Flow Plus (http://www.cozyroc.com/ssis/data-flow-task) – Enables you to populate two different tables with the same data flow. Neat.

    Trash Destination (http://www.sqlis.com/post/Trash-Destination-Adapter.aspx) – Enables you to send any data flow to a trash destination. No configuration required, so ideal for a number development scenarios.

    This session, along with those by Donald Farmer, was the most enjoyable and one that I probably took the most away from. It opened my eyes to the world of custom SSIS tasks and that fact that so many of them are freely available (with source code so that you can tweak at leisure). Jamie definitely seems to be the go-to guy for SSIS related  info. If you haven’t checked out his site yet, I suggest doing so: http://sqlblog.com/blogs/jamie_thomson/default.aspx

    Other SSIS Add-in Links:

    Go to Part II of the SQLBits recap.

    Have You Seen This Man?

    Friday, November 27th, 2009

    Bwoody

    EDMPASS WANTS BUCK WOODY!!!

    When talking to Summit attendees the one name that comes up constantly or the one question that comes up is…

    “Did you get a chance to see Buck Woody present?”

    I did… and it was my first time.  It was the end of the day I believe on the Thursday… I would tell you EXACTLY when but….

    dnsum209lulz

     

    Anyways… back to the story… so I go into Buck Woody’s session on Policy Based Management, Data Collector, and Central management Servers and I was pretty tired as it was and the room was cooking but as soon as I walked in the door it was entertainment from start to finish…. and not just entertainment but VERY informative.

    I remember back in the days of junior high and high school the teachers that stand out in my mind are the ones that were able to educate the class while adding a bit of fun and their own flavor to the material.  This is Buck Woody. 

    We’re all at the Summit, we’re all there to learn, and we’re all excited about SQL Server BUT if you had the pleasure of seeing a Buck Woody session you should know exactly what I mean when I say this guy takes presentations to the next level. 

    I heard a rumor on Twitter that if the Summit was NOT in Seattle that there is a chance that Buck Woody might not be able to attend and speak due to the travel costs:

    twebw

    … and this is exactly why I for one DO NOT want to move the Summit out of Seattle.  PASS has come right out and said that if we move the Summit from Seattle that we will NOT be able to take advantage of the all the local Microsoft folks (like Buck Woody).   Now some might say I’m a little biased in my opinion because Seattle is only an hour and 45 minute flight for me… but I would feel the same way if it was in Chicago, New York, or Boston. 

    Anyways the goal of this post was 2 things:

    1) To spread the good word about a great speaker – Buck Woody

    2) To get Buck Woody to speak at EDMPASS

    I’ve accomplished #1…. let’s see how I do with #2

    Enjoy!!

    Post to Twitter Post to Delicious Post to Digg Post to StumbleUpon