Archive for March, 2009

SQL SERVER – Reseed Identity of Table – Table Missing Identity Values – Gap in Identity Column

Tuesday, March 31st, 2009

Some time ago I was helping one of my Junior Developers who presented me with an interesting situation. He had a table with Identity Column. Because of some reasons he was compelled to delete few rows from the table. On inserting new rows in the table he noticed that the rows started from the next identity value which created gap in the identity value. His application required all the identities to be in sequence, so this was certainly not a small issue for him.

The solution to this issue regarding gap in identity column is very simple. Let us first take a look at his application’s situation wherein there is missing identity and then we will move on to the solution.

Developers can easily deter the above issue by avoiding gap in sequence of identity column through two additional SQL Tricks of reseeding identity.

We will now see the same example with the solution to the above gap issue. On deleting records, table was reseeded with identity, which was deleted. Download complete SQL Script here.

USE AdventureWorks
GO
/* Create a table with one identity column */
CREATE TABLE TableID (ID INT IDENTITY(1,1), Col VARCHAR(10))
GO
/* Insert 10 records with first value */
INSERT INTO TableID (Col)
VALUES ('First')
GO 10
/* Check the records in table */
SELECT *
FROM TableID
GO
/* Delete last few records */
DELETE
FROM
TableID
WHERE ID IN (8,9,10)
GO
/* Check the records in table */
SELECT *
FROM TableID
GO
/* Get current Max Value and reseed table */
DECLARE @MaxID INT
SELECT
@MaxID = MAX(ID)
FROM TableID
DBCC CHECKIDENT('TableID', RESEED, @MaxID)
GO
/* Insert 10 records with second value */
INSERT INTO TableID (Col)
VALUES ('Second')
GO 5
/* Check the records in table */
SELECT *
FROM TableID
GO
/* Clean Database */
DROP TABLE TableID
GO

I hope is solution is clear to all my readers and they will use it to avoid problems related to gap in identity column. Do send me your feedback on this article and let me know if you all need further explanation.

Reference : Pinal Dave (http://blog.SQLAuthority.com)

kick it on DotNetKicks.com

Posted in Pinal Dave, SQL, SQL Authority, SQL Index, SQL Query, SQL Scripts, SQL Server, SQL Server DBCC, SQL System Table, SQL Tips and Tricks, SQLServer, T SQL, Technology Tagged: Identity

Dynamic SQL Presentation Materials

Tuesday, March 31st, 2009

First off, thanks to everyone who attended today’s Dynamic SQL presentation. I appreciate everyone taking the time out of their busy day to attend the first AppDev SIG Live Meeting in quite some time. Hopefully there will be many more.

I’ve attached a copy of the Dynamic SQL presentation that I gave today. Here’s a run down of the contents of the zip file and an explanation of the database setup:

  • The DynamicSQL database referenced in the SQL is a copy of the AdventureWorks 2005 database renamed to DynamicSQL.
  • A copy of the dbo.Customers table has been imported from the Northwind database.
  • The Dynamic T-SQL Support.sql file contains SQL to create a calendar table and a numbers table that are used in the example SQL as well as the Helper_LongPrint stored procedure that is used to help debug longer stored procedures.
  • The Dynamic T-SQL.sql file contains the example SQL used in the presentation as demo code.
  • QueryStress1.sqlstress and QueryStress2.sqlstress contain the settings for the two SQL Query Stress demonstrations that were shown. QueryStress1 shows a positive baseline for using parameterized SQL whereas QueryStress2 shows a less performant option using a query build in an ad hoc fasion.

If you have any questions, feel free to email me at jeremiah.peschka@gmail.com.

Adding Reliability to Your Infrastructure

Tuesday, March 31st, 2009

I’ll never fly in a single-engine plane.  Never.  Not gonna happen.  Carve that one in stone.

Call me chicken, call me scaredy-cat, but I’m not going to get into an airplane that will kill me if an engine fails.  The next step up is a twin-engine plane - but I don’t get on all of those either.

I like my engines to be RAID 10.

I like my engines to be RAID 10.

If a single engine averages a failure once in every 10,000 hours of operation, then a plane with just one of those engines will experience a failure once every 10,000 hours.  What if we equip our plane with two of those engines - how often will we experience a failure?

  • Once in every 20,000 hours of operation, or
  • Once in every 5,000 hours of operation

The correct answer is once in every 5,000 hours of operation.  All other things being equal, two-engine planes are twice as likely to have an engine failure in the same span of time.

The only way a twin-engine plane is more reliable is if just one of the two engines is enough to power the airplane safely. If the airplane requires both engines in order to maneuver and land, then the second engine didn’t add reliablity: it just added complexity, expense and maintenance woes.

If one engine fails, the other engine might suddenly be running at full capacity.  In day-to-day operations, we’d only be using around 50% of each engine’s power (because we got twice as much power as we needed in order to cover our disaster recovery plan).  This engine would have to suddenly go from 50% utilized to 100% utilized - and that’s when things really start to get tested.  This means we probably shouldn’t take our time to land if one engine fails: we should get our plane on the ground as fast as possible to minimize the risks of overworking the remaining engine.  It’s working much harder than normal, and it isn’t used to that kind of load.

The only way a twin-engine plane is more reliable is if the one remaining engine can last long enough to get us to the ground. If it can’t handle the stress of running at 100% capacity, we’re not much better off than we were in the first place.  Therefore, it probably makes sense to build in even more capacity; either using more powerful engines so that they each only need 80% of their power to handle our plane, or using three engines instead of two.

But we can’t just go bolting on engines like crazy: engines cost money, add complexity, and add weight, which makes the plane harder to get off the ground.

Now Replace “Engines” with “Servers”

Some disaster recovery plans call for two database servers: a primary server used for production, and then a secondary disaster recovery server at another site.  That secondary server is constantly refreshed with data from production - might be with log shipping, replication, database mirroring, etc.  So far, so good: we’ve improved the reliability of our production site, even though we’ve added complexity.

Later, management looks at that server sitting idle and says, “We can’t leave those resources lying around. Let’s use those for reporting purposes.  We’ll have reports run against the DR server, and that’ll make our production server much faster.”  Query loads grow over time, and before you know it, both of those servers are now production.  If even just the disaster recovery system goes down, we suddenly have a problem.

The only way a two-server disaster recovery plan is more reliable is if just one of the two servers is enough to power your application safely. Otherwise, you don’t have a disaster recovery plan: you have a pending disaster.  You have the insinuation of protection without enough actual protection.  Sure, your data will still be around if one server dies, but you won’t have enough horsepower to actually service your users.  In the users’ minds, that’s a failure.

To prepare for that disaster, do some basic documentation ahead of time.  Make a list of your environments, and note whether each DR server is purely DR, or if it’s actually turned into production over time.  Before disaster strikes, make a list of which user-facing services will need to be disabled, and which can remain standing.  Decide ahead of time whether to shut down reporting queries, for example, in order to continue to service other end user activities.

Now Replace “Engines” with “Drives”

RAID 5 protects your data by striping it across multiple drives and storing parity information too.  If any one drive fails in a RAID 5 array, you’re completely fine.  Pull out the failed drive, swap in a brand new one, and the RAID card will automatically begin rebuilding the missing data from parity data on the blank drive.  For more about this process, check out the Wikipedia article on RAID.

Hard drives have moving parts, and moving parts fail.  The more drives we add, the more likely we are to experience a failure.  We’re distributing the work across more drives, which increases performance, but it simultaneously increases risk.

When there’s a drive failure, the clock starts ticking.  We have to get a new drive in as fast as possible.  In order to reduce the failure window, enterprise systems use hot spare hard drives: blank drives that sit around idle doing nothing.  When there’s a failure on Saturday night at midnight (the universally agreed-upon standard time for drive failures), the raid array automatically uses these hot spare drives as their replacement and start rebuilding the array automatically. SAN administrators like hot spares, because they like doing other things on Saturday nights instead.

On Saturday nights, SAN administrators like to do karaoke at The Arbitrated Loop.

On Saturday nights, SAN administrators like to do karaoke at The Arbitrated Loop.

When they finally return to the datacenter on Monday to replace the dead drive with a fresh one, that fresh one becomes the new hot spare.  (Not all arrays work this way - I’m generalizing.  I can hear SAN admins typing their replies already.)

While the drive array rebuilds, the remaining drives are working harder than they normally would.  Not only are they handling their regular load, but they’re also simultaneously reading data to write it onto the fresh drive.  This means our hard drives are working overtime - just like the remaining engines in our plane scenario.

This becomes a tricky balance:

  • The more drives we add, the easier they can handle normal load from end users
  • The more drives we add, the more likely we are to have failures
  • But when we have failures, the more drives we add, the easier of a time we’ll have keeping up with the rebuilds
  • The larger the drives, the longer rebuilds take, which lengthens our time window for recovery

It’s just like planes: adding more stuff means managing a balance between cost, complexity and reliability.

The next time someone asks you to add more gear into your scenario or asks to take advantage of the disaster recovery gear that’s “sitting around idle”, it’s time to recalculate your risks and reliabilities.

           

Registered Servers in SSMS

Tuesday, March 31st, 2009

In my last blog post, I discussed changing the color of the status bar in SSMS 2008. I received a couple of comments and even an e-mail discussing how this doesn’t seem to always work. After playing with it for a little bit, I’ve found that the status bar color needs to be set in both Query->Connection->Connect/Change Connection… (here-in referred to as simply the Query menu) and Registered Servers.

Let’s run through this. First, connect to an instance with any color using the Query menu.

Connecting via the Query menu


Now, create a new registered server. Make sure to use the same server.


Create a New Registered Server


New Server Registration

New Server Registration




Pick a color, but make sure that it’s different than the previous color. This is just for demonstration purposes only. Since the whole point is to have a consistent color, you would normally use the same color in both connection methods for the same server.


Pick a color

Pick a color




Open a new query window via Registered Servers.


New Query Window

New Query Window




Registered Server Query Window

Registered Server Query Window




Here’s what happens when I connect to the same server using both Registered Servers (left) and another window using the Query menu (right).


SSMS - Same Server, Different Colors

SSMS - Same Server, Different Colors

For anyone who’s using both the Query menu and Registered Servers to connect to servers, then you should walk through the process of connecting to each server via both means and changing the colors to ensure consistency. I did this for 22 servers and it took me less than 10 minutes.

I hope that helps clear up some of the confusion. :)

Source: http://sqlfool.com/2009/03/registered-servers-in-ssms/

SQL Solutions OLAP Heartbeat and OLAP Performance Advisor

Tuesday, March 31st, 2009

Hmm, you wait for years for commercial tools for monitoring Analysis Services (the only one I’d ever seen before was Companion for Analysis Services from SQLMinds) and then two come out at once. One of these tools I’ll be blogging about towards the end of this week, hopefully – I’ve had a sneak preview and it looks very cool – but today I found out the following from SQL Solutions:

I’ll download both and give them a thorough test as soon as I can.

Mixed Workloads Part 3

Tuesday, March 31st, 2009

In part 1, I talk about what I consider feeble attempts at implementing a reporting server through log shipping, mirroring\snapshots and, to a lesser extent, replication. Unless you invest in a real ETL solution, I argued that it is better to run a mixed workload. I talked about the architectural advantages of running mixed workloads in part 2. In a nutshell, doubling the hardware and cutting the data in half.

In this final post, we will talk about new features in SQL Server 2008 and some features that have been around a while that can help with mixed workloads.There are also some bad practices that could be the right answer that we won’t  talk about but let us mention triggers, table valued functions, 20 table outer joins, some correlated subqueries and table variables. These are options but usually not good ones. In the right circumstances, they could be right like an end of year report.

Here are the main tools in your arsenal:

  • Resource governor
  • Covering filtered Indexes
  • Indexed views
  • Partitioning + data compression
  • Persisted computed columns
  • Archiving on the same server
  • After hours denormalization(indexes views with deferred updates)

Resource Governor

Chances are you do not want to limit throughput of your OLTP queries. The resource governor does not do a good job with these queries anyway because their duration is usually so short. However, let’s say you have reports that run by executives. You can put them in a workload group that gives them as much resources as possible without affecting OLTP traffic. You may also have a less important group of reports from the marketing or sales teams that you can limit further. One caveat to the resource governor is it cannot limit, disk IO so if that is your bottleneck, this will not help much.

Covering Filtered indexes

Filtered indexes are a great new feature in SQL Server 2008. When optimizing for reporting queries on your OLTP system, you are probably going to be touching a lot of rows so covering the query is important. For example, the order fulfillment team works off a report of unfulfilled orders that pulls in order data, customer data, shipping data etc. In this case, you would add covering filtering indexes on each of those tables. The filtered indexes reduce write overhead on your OLTP writes and reduces read overhead of your reporting.

Indexed Views

Indexed views take filtered indexed view a step further. You can create indexes on multiple tables. Think of it as denormalization alongside your OLTP optimized schema. In the previous order fulfillment example, we can basically persist that report and have it updated in real time. There is more overhead to your OLTP transactions so weigh the pro’s and con’s. Test if possible.Unfortunately, you cannot defer changes to your indexed views but I believe there is a feature request for this on Connect and I will tell you about a workaround shortly.

Partitioning and Compression

This is the dynamic duo when mixing workloads. Unfortunately, the nitty gritty details would require their own post. For example, one mixed workload may benefit from compression on the hottest partition while the older data should be uncompressed. However, another workload may benefit from the opposite. The key here is really understanding your workload, data and hardware limitations. Most importantly, plan then TEST, TEST, TEST! Once you partition, you loose online operations so if you do it wrong, you are stuck.

 Persisted Computed Columns

This is an easy one. It is a simple trade off. Writes take a little more CPU and space in exchange for reduced CPU time when you report. Take your orders table, for example. You could calculate and save shipping costs when you insert the rows. If it adds a few milliseconds to the insert but shaves seconds off the hourly open orders report that the execs are looking at, it may be an easy decision.

Archival

This might not always be possible depending on your data. It may not be necessary if you have finely tuned indexes. However, it could make a night and day difference. If you need the data, UNION ALL’ing the production table with the archive table has little overhead. I do suggest you keep the archive database on the same server unless it will rarely be accessed. Trying to do this with linked servers is bad.

After hours denormalization

This is basically precreating reports during off hours. Think of it as indexed view with deferred updates. You can UNION with the OLTP tables if you need realtime data in your report. In an ideal world, touching less rows in the OLTP table and then UNIONing with the denormalized data will result in the best of both worlds if you need real time data.

The final word

As the concurrency and size of data scales, both a pseudo reporting database and a mixed work load scenario will not meet business requirements. A business requirement of real time data may dictate a mixed workload. There may be plenty of workloads where scaling out and scaling up both meet performance demands. I just wanted to play devil’s advocate and  let you know there is another option when planning reporting.

EXISTING AND NONEMPTY

Tuesday, March 31st, 2009

Here’s an interesting MDX nugget. The other day I was working on something similar to the following query, that finds the count of the number of customers who bought something in the current month as well as the previous month:

WITH
MEMBER MEASURES.TEST AS
COUNT(NONEMPTY(
NONEMPTY(
EXISTING [Customer].[Customer].[Customer].MEMBERS
, [Measures].[Internet Sales Amount])
, ([Measures].[Internet Sales Amount], [Date].[Calendar].CURRENTMEMBER.PREVMEMBER))
)
SELECT MEASURES.TEST
*
[Product].[Subcategory].[Subcategory].MEMBERS ON 0,
[Date].[Calendar Year].[Calendar Year].MEMBERS
*
[Customer].[Country].[Country].MEMBERS ON 1
FROM [Adventure Works]

On my laptop, this runs in almost 16 seconds – not good. What I was doing here was making the assumption that it was a good idea to do the EXISTING first, to filter the set of customers down as much as possible, then do the two NONEMPTYs. In fact, what I found was that by moving the EXISTING to after the NONEMPTYs had been evaluated the query ran much faster:

WITH
MEMBER MEASURES.TEST AS
COUNT(EXISTING 
NONEMPTY(
NONEMPTY(
[Customer].[Customer].[Customer].MEMBERS
, [Measures].[Internet Sales Amount])
, ([Measures].[Internet Sales Amount], [Date].[Calendar].CURRENTMEMBER.PREVMEMBER))
)
SELECT MEASURES.TEST
*
[Product].[Subcategory].[Subcategory].MEMBERS ON 0,
[Date].[Calendar Year].[Calendar Year].MEMBERS
*
[Customer].[Country].[Country].MEMBERS ON 1
FROM [Adventure Works]

On my laptop now, the above query executes in 3 seconds. Why is this? To be honest, I’m not completely sure… looking in Profiler and Perfmon I see differences, but not many; these things are better left to the likes of Mosha who know what’s going on inside in the formula engine. But anyway I thought I’d post this up because it’s a common form of calculation – so if you are using NonEmpty and Existing together, do the NonEmpty first and then the Existing.

EXISTING AND NONEMPTY

Tuesday, March 31st, 2009

Here’s an interesting MDX nugget. The other day I was working on something similar to the following query, that finds the count of the number of customers who bought something in the current month as well as the previous month:

WITH
MEMBER MEASURES.TEST AS
COUNT(NONEMPTY(
NONEMPTY(
EXISTING [Customer].[Customer].[Customer].MEMBERS
, [Measures].[Internet Sales Amount])
, ([Measures].[Internet Sales Amount], [Date].[Calendar].CURRENTMEMBER.PREVMEMBER))
)
SELECT MEASURES.TEST
*
[Product].[Subcategory].[Subcategory].MEMBERS ON 0,
[Date].[Calendar Year].[Calendar Year].MEMBERS
*
[Customer].[Country].[Country].MEMBERS ON 1
FROM [Adventure Works]

On my laptop, this runs in almost 16 seconds – not good. What I was doing here was making the assumption that it was a good idea to do the EXISTING first, to filter the set of customers down as much as possible, then do the two NONEMPTYs. In fact, what I found was that by moving the EXISTING to after the NONEMPTYs had been evaluated the query ran much faster:

WITH
MEMBER MEASURES.TEST AS
COUNT(EXISTING 
NONEMPTY(
NONEMPTY(
[Customer].[Customer].[Customer].MEMBERS
, [Measures].[Internet Sales Amount])
, ([Measures].[Internet Sales Amount], [Date].[Calendar].CURRENTMEMBER.PREVMEMBER))
)
SELECT MEASURES.TEST
*
[Product].[Subcategory].[Subcategory].MEMBERS ON 0,
[Date].[Calendar Year].[Calendar Year].MEMBERS
*
[Customer].[Country].[Country].MEMBERS ON 1
FROM [Adventure Works]

On my laptop now, the above query executes in 3 seconds. Why is this? To be honest, I’m not completely sure… looking in Profiler and Perfmon I see differences, but not many; these things are better left to the likes of Mosha who know what’s going on inside in the formula engine. But anyway I thought I’d post this up because it’s a common form of calculation – so if you are using NonEmpty and Existing together, do the NonEmpty first and then the Existing.

Nail in the Coffin for AMD?

Monday, March 30th, 2009

Intel released the Nehalem processor family. The Intel® Xeon® Processor 5500 series family. This could be the nail in the coffin for AMD. I hope not. Without competition, Intel can rest on its laurels.

Lets look at the goodies. On Glenn Berry’s blog, he points out the SQL specific benchmarks. This comes from the Anandtech benchmarks. This is what caught my eye from that review.

The memory controller has up to three channels. A dual CPU configuration has access to 35GB/s of memory bandwidth (measured with stream) if you use DDR3-1333. The latest dual Opteron achieves 19.4GB/s with DDR2-800

Think about it. If you have a SQL box with 32GB of RAM and a VLDB, you could theoretically churn the buffer pool once every second. Of course, you will probably hit a disk bottleneck first. In addition the the proc specific improvements, DDR3 with NUMA support in a server is a huge leap.

Today, HP also introduced the DL3X0 G6. Here is a link to the the DL 360 G6 specs. I am speculating but I bet  it would beat a dual socket 6 core DL 580 G5. Most definitely on IO bound workloads like a database server. Hopefully, the DL580 G6’s are coming soon. Maybe an 8 socket DL 780 G6. :)

Mix that with VMWare ESX 4.0 that is in RC and virtualization of the database server may have come of age.

The Opteron 1up’ed Intel in 2004. Now the ball is back in AMD’s court. I am rooting for you!

SQL SERVER – 2008 – IntelliSense Does Not Work – Enable IntelliSense

Monday, March 30th, 2009

While I was working with SQL Server 2008 IntelliSense, I realized that it was not functioning as I expected. Even after I had enabled IntelliSense it was still not opening any suggestions at all. After a while, I figured out some vital information regarding how to make sure IntelliSense smoothly works all the time without you giving any trouble.

Given below are four different ways through which you can enable IntelliSense.

1) Make sure you are connected to SQL Server 2008 Edition.
IntelliSense does not work with the previous versions of SQL Server.

2) IntelliSense should be enabled.
There are two ways to verify whether IntelliSense is enabled or not.
a) From Toolbar

b) Go to Tools >> Options >> Text Editor >> Transact-SQL >> IntelliSense

3) IntelliSense should be refreshed with the latest changes in database.
a) Press CTRL+SHIFT+R
b) Go to Edit >> IntelliSense >> Refresh Local Cache

4) Go to Tools >> Options >> Text Editor >> Transact-SQL >> General >> IntelliSense
Select Auto List Members and Check Parameter Information.

Reference : Pinal Dave (http://blog.sqlauthority.com)

kick it on DotNetKicks.com

Posted in Pinal Dave, SQL, SQL Add-On, SQL Authority, SQL Error Messages, SQL Query, SQL Server, SQL Tips and Tricks, SQL Utility, SQLServer, T SQL, Technology Tagged: IntelliSense