New Tip Posted

February 9th, 2010 by Thomas LaRock

I had another tip posted over at MSSQLTips today, you can find it at http://www.mssqltips.com/tip.asp?tip=1938

The tip helps you configure a policy inside of PBM that will help you to determine how badly your transaction logs are fragmented.

MCM Prep Week: Microsoft Exams 70-433 and 70-451

February 9th, 2010 by Brent Ozar

A few weeks ago, I glowed about how much I thought the MCITP exams had improved since the old braindump days.  I had a really good experience with the 70-432 and 70-450 exams on SQL Server database administration, and I was happy that the exams focused on good decisions, not mindless memorization of syntax.

Now, about the 70-433 and 70-451 exams for SQL Server database developers….

I’m totally bummed out.

Just Because You Can….

The exams were a syntax-fest that focused on features that, quite frankly, I don’t want my SQL Server developers ever using, let alone memorizing.

XMLambada - The Other Forbidden Dance

XMLambada - The Other Forbidden Dance

Need to shred XML on a regular basis?  I’d like to introduce you to a spiffy concept: application servers.  Get your CPU-intensive queries the hell off my SQL Server, because Enterprise Edition costs $30,000 per socket.  Do your XML processing on an app server that costs less to maintain and that scales easier.  What’s that, you say?  You desperately need to access it often and quickly in day-to-day queries?  Here’s another novel concept: store it natively as relational data.  Other developers, views, stored procs, and reports will be able to access it without doing the XMLambada.

Storing XML in the database is a crutch for a bad schema design.  If you like regularly querying XML columns in the database, you’ll probably love the NoSQL movement.  (Let me know how that works out for you.  (On second thought, don’t.))

I half-expected the next question to ask about the new scissors designed with pistol grips so that they’re easier to run with.  This is probably a personal gripe on my part about some of SQL Server’s new features lately, not the exam, but it felt like the exam focused a little too much on things I’d rather not see in production.

Where are the questions asking when it’s appropriate to use cursors?  Triggers?  UDFs?  Don’t tell me that good SQL developers already know this stuff, because I see way too many people abusing features that come back to haunt them later.

What’s the Best Way to Punch Yourself in the Junk?

Several times, I read the question, shook my head, and wanted to meet the person who wrote the question.  They must be wearing an armor-plated codpiece.  I can’t repeat the questions here since the test involves an NDA, but here’s one I made up:

In your high-volume, mission-critical database, you need to create a series of nested triggers to call web services hosted in South Africa when any record is inserted.  Which of the following commands will work best:

I’ve got your command right here, and it starts with UPDATE dbo.Employees SET IsActive = 0 WHERE LastName = ‘Ozar’.  Reading questions like this, I wanted to run to the bathroom for some Calgon soap.  Here’s another made-up question:

You’re a database developer and all of your queries against a certain table are slow.  Should you create a new index, update the statistics, force a query execution plan, or use an index hint?

None of the above – you should get a DBA involved.  Forcing a fix for one table or one query at the developer level is a really bad idea.  Zoom out, get more information about the bigger picture, and avoid making hard-coded problems for yourself down the road.  Forcing a query to use a specific index makes the DBA’s life hell, because he can’t do big-picture performance tuning.  He might add or remove indexes at the database level to make overall performance better, but your query won’t improve.

The more questions I answered, the more I started to understand the mindset of some of the shops where I’ve done performance tuning.  So often, I’ve walked in, taken a look at the environment, and said, “Yes, this will technically work, but it won’t scale.  We need to take a step back and ask why the application is trying to (insert crazy feature here) hundreds of times per second.  Let’s look at the business need behind what we’re doing, and find the most efficient way of accomplishing that goal, because this isn’t it.”  During the exam, I found myself arguing with the screen because sometimes, all of the answers were bad ideas or jumped to conclusions too quickly.

How to Ignore Distracting Information

What are you looking at? The question is over there.

What are you looking at? The question is over there.

Question: Some people are Republicans, some people are Democrats, and some are independent.  Independence doesn’t mean that they’re financially successful like a pop star or a CEO – it just means they haven’t made up their political mind.  Which of the following people live in Illinois?

  1. Barack Obama, 1600 Pennsylvania Ave, Washington, DC
  2. Brent Ozar, 1845 S. Michigan Ave, Chicago, IL
  3. Lady Gaga, 3 Legs Drive, New York City, NY
  4. Steve Jobs, 2108 Waverly St, Palo Alto, CA

Some tests bombard you with unrelated information in an attempt to confuse you.  It’s a lot like working with a junior project manager who just found the fridge with the free Red Bull.

Anytime a question has more than one sentence in it, I use the same technique I use to handle work emails.  I jump down to the last sentence, read it, and then decide whether I have any real work to do.  In the example above, the last sentence asks, “Which of the following people live in Illinois?”  I would scan quickly through the answers, rule out anyone who doesn’t live in Illinois, and that’s my answer.

If more than one answer matches, then I scan back through the question very quickly looking for the additional criteria, but at that point it’s easier because I’ve already ruled out half of the answers.

So anyway, I passed, but until the score screens came up, I was positively convinced I failed.  Want to take an exam yourself?  Microsoft is running a “second shot” promotion – fail an exam, and you get a free retake.  Read the details here.

MCM Prep Week: Microsoft Exams 70-433 and 70-451 is a post from: Brent Ozar - Too Much Information.
If you'd like to quote this content on other sites, please read how to use my blog's content.

MCM Prep Week: Microsoft Exams 70-433 and 70-451

February 9th, 2010 by Brent Ozar

A few weeks ago, I glowed about how much I thought the MCITP exams had improved since the old braindump days.  I had a really good experience with the 70-432 and 70-450 exams on SQL Server database administration, and I was happy that the exams focused on good decisions, not mindless memorization of syntax.

Now, about the 70-433 and 70-451 exams for SQL Server database developers….

I’m totally bummed out.

Just Because You Can….

The exams were a syntax-fest that focused on features that, quite frankly, I don’t want my SQL Server developers ever using, let alone memorizing.

XMLambada - The Other Forbidden Dance

XMLambada – The Other Forbidden Dance

Need to shred XML on a regular basis?  I’d like to introduce you to a spiffy concept: application servers.  Get your CPU-intensive queries the hell off my SQL Server, because Enterprise Edition costs $30,000 per socket.  Do your XML processing on an app server that costs less to maintain and that scales easier.  What’s that, you say?  You desperately need to access it often and quickly in day-to-day queries?  Here’s another novel concept: store it natively as relational data.  Other developers, views, stored procs, and reports will be able to access it without doing the XMLambada.

Storing XML in the database is a crutch for a bad schema design.  If you like regularly querying XML columns in the database, you’ll probably love the NoSQL movement.  (Let me know how that works out for you.  (On second thought, don’t.))

I half-expected the next question to ask about the new scissors designed with pistol grips so that they’re easier to run with.  This is probably a personal gripe on my part about some of SQL Server’s new features lately, not the exam, but it felt like the exam focused a little too much on things I’d rather not see in production.

Where are the questions asking when it’s appropriate to use cursors?  Triggers?  UDFs?  Don’t tell me that good SQL developers already know this stuff, because I see way too many people abusing features that come back to haunt them later.

What’s the Best Way to Punch Yourself in the Junk?

Several times, I read the question, shook my head, and wanted to meet the person who wrote the question.  They must be wearing an armor-plated codpiece.  I can’t repeat the questions here since the test involves an NDA, but here’s one I made up:

In your high-volume, mission-critical database, you need to create a series of nested triggers to call web services hosted in South Africa when any record is inserted.  Which of the following commands will work best:

I’ve got your command right here, and it starts with UPDATE dbo.Employees SET IsActive = 0 WHERE LastName = ‘Ozar’.  Reading questions like this, I wanted to run to the bathroom for some Calgon soap.  Here’s another made-up question:

You’re a database developer and all of your queries against a certain table are slow.  Should you create a new index, update the statistics, force a query execution plan, or use an index hint?

None of the above – you should get a DBA involved.  Forcing a fix for one table or one query at the developer level is a really bad idea.  Zoom out, get more information about the bigger picture, and avoid making hard-coded problems for yourself down the road.  Forcing a query to use a specific index makes the DBA’s life hell, because he can’t do big-picture performance tuning.  He might add or remove indexes at the database level to make overall performance better, but your query won’t improve.

The more questions I answered, the more I started to understand the mindset of some of the shops where I’ve done performance tuning.  So often, I’ve walked in, taken a look at the environment, and said, “Yes, this will technically work, but it won’t scale.  We need to take a step back and ask why the application is trying to (insert crazy feature here) hundreds of times per second.  Let’s look at the business need behind what we’re doing, and find the most efficient way of accomplishing that goal, because this isn’t it.”  During the exam, I found myself arguing with the screen because sometimes, all of the answers were bad ideas or jumped to conclusions too quickly.

How to Ignore Distracting Information

What are you looking at? The question is over there.

What are you looking at? The question is over there.

Question: Some people are Republicans, some people are Democrats, and some are independent.  Independence doesn’t mean that they’re financially successful like a pop star or a CEO – it just means they haven’t made up their political mind.  Which of the following people live in Illinois?

  1. Barack Obama, 1600 Pennsylvania Ave, Washington, DC
  2. Brent Ozar, 1845 S. Michigan Ave, Chicago, IL
  3. Lady Gaga, 3 Legs Drive, New York City, NY
  4. Steve Jobs, 2108 Waverly St, Palo Alto, CA

Some tests bombard you with unrelated information in an attempt to confuse you.  It’s a lot like working with a junior project manager who just found the fridge with the free Red Bull.

Anytime a question has more than one sentence in it, I use the same technique I use to handle work emails.  I jump down to the last sentence, read it, and then decide whether I have any real work to do.  In the example above, the last sentence asks, “Which of the following people live in Illinois?”  I would scan quickly through the answers, rule out anyone who doesn’t live in Illinois, and that’s my answer.

If more than one answer matches, then I scan back through the question very quickly looking for the additional criteria, but at that point it’s easier because I’ve already ruled out half of the answers.

So anyway, I passed, but until the score screens came up, I was positively convinced I failed.  Want to take an exam yourself?  Microsoft is running a “second shot” promotion – fail an exam, and you get a free retake.  Read the details here.

MCM Prep Week: Microsoft Exams 70-433 and 70-451 is a post from: Brent Ozar – Too Much Information.
If you’d like to quote this content on other sites, please read how to use my blog’s content.

  • Digg
  • del.icio.us
  • DZone
  • DotNetKicks
  • Google Bookmarks
  • Ping.fm
  • Reddit
  • StumbleUpon
  • TwitThis

Fun with SSIS – Part 1 – Troubleshooting

February 9th, 2010 by Rod Colledge

Although I still consider myself a DBA, the last few years have seen me spend more and more time in the Business Intelligence development space (SSIS development in particular). In this the first of a 2-part SSIS series, I’ll be focusing on a number of SSIS “issues” I’ve discovered the hard way.

1: Case Sensitivity

I’ve never been particularly good at using case consistently, the perfect example of which was my publisher asking me to edit hundreds of figure captions in my recently published book. Like most people of my age who took a college course in programming, I studied C, and spent countless hours debugging problems traced back to the incorrect case of variable names (and using “=” instead of “==” !).

When I decided to specialize in SQL Server administration, I rejoiced at the lack of case sensitivity involved, although I occasionally see databases with the case sensitive option, in most cases (pardon the pun) installed my accident.

Imagine my angst when I discovered that a tool I’m spending more and more time with is rampantly case sensitive. Almost everything about SSIS is case sensitive; the expression language, variable names and one that gets me every time; lookups.

A very common SSIS data flow transformation is lookup. As the name implies, it’s most often used to lookup a key value based on a text value in the data flow. For example, in the figure below, we’re returning the supplierKey based on the supplierName value. The first screen defines the lookup source, and the second defines the join condition and what value is to be returned.

1

This, for all intents and purposes, in a simple join condition with a column from the joined table being returned. However, unlike T-SQL, the SSIS lookup is case sensitive.

The obvious way to address this is to ensure both the lookup column (vendorName in this example) and the matching data flow field (Supplier) are set to the same case using the UPPER (or LOWER) function. For the lookup, that’s a simple case of changing the lookup source T-SQL to use one of those functions. For the data flow field, we could use the derived column transformation using the same function, and use that column in the lookup process.

2: Strong Typing

If there was any remaining doubt that SSIS was designed for developers (and not classic DBAs) it’s confirmed in its (very) strong typing. As an example of that, let’s revisit the lookup example from above. A previous version of the database had the Supplier data type set to varchar(50) and vendorName set to nvarchar(255). When we try and connect these in the lookup, we’ll get the following error;

2
 

Another common issue with mismatched data types is when inserting rows into a table at the end of a data flow task. If the destination table’s column length is shorter than the source, you’ll get a warning such as this one;

3

Unless you have the luxury of an enterprise wide data dictionary (that every one adheres to), data type variances such as these are common, with the most frequent SSIS workarounds being to either CAST the columns in the data source selection (e.g.; using a view) and/or using type casts in a derived column transformation (more on that in part 2 of this blog series)

3: 64-bit Trickery

There will presumably come a day when everything works on 64-bit, and that day can’t come soon enough. In the meantime, we have to deal with a variety of annoying problems, one of which is 64-bit driver support in SSIS.

A common data source for SSIS packages are Excel files. The current (and all previous) versions of Excel do not include 64-bit drivers. This presents a problem when SSIS packages run in 64-bit environments, and the error messages returned are far from helpful (unless you consider “The AcquireConnection method call to the connection manager blah failed with error code blah” helpful).

Fortunately, there’s an easy workaround for this issue, via a property called Run64BitRuntime accessed through the Debugging page of the project’s properties window. As per the figure below, setting this property to False (it’s true by default) will use the 32-bit driver.

4

Note that there are a whole range of other considerations for 64-bit mode, for example when calling child packages from a parent with a different setting for this value. For a great blog post on these issues visit Todd McDermid’s post  from last year.

4: Package Configurations

One of the real SSIS horror stories that I was very close to (but not responsible for!) was an SSIS package that was run against a production database by accident (blowing away gigabytes of production data). The package used a production configuration file instead of a test/development file. There’s a couple of aspects to this (all too common) problem; configuration technique and security.

Firstly, configuration technique. SSIS allows a number of ways of configuring settings such as server and database name. The most common one is to use an XML configuration file, the path to which is either stored within the package itself, or set through an environment variable.

I’ve seen problems with both of these techniques. Using a configuration file location (as per the example below) assumes that all servers on which the package runs has the same path available, which is quite often not the case.

5

The environment variable option can also present some problems, notably the need to reboot the server for the variable to come into effect. Depending on the environment, rebooting production servers is not the easiest thing to do.

Another alternative to both of the above is one that I discovered while listening to Greg Low’s SQLDownUnder podcast with Jamie Thompson  in which Greg spoke of the idea of connecting to a special configuration database which stored the configuration settings. The settings returned were based on the calling machine’s context and then used to set the value of package variables for subsequent use.

Regardless of the configuration technique, one of the really important considerations is security. In each of the above techniques, mistakes can still be made. For example, a production configuration file can be copied over the top of a development file, or the incorrect settings stored in a configuration database.

To prevent these types of configuration errors from killing a production database, we need to ensure that the security context of an SSIS package prevents it from accessing the wrong environment. There are at least two ways of achieving this; using separate service accounts in each environment (and making sure the account only has database permissions in the appropriate environment), or better still, having the production database in a separate domain with trust permissions removed to the other domain(s).

With the appropriate security setup, even if the wrong configuration file is used, the package will fail to run due to the lack of database permissions.

5: Miscellaneous Silly Business

Finally, there’s a number of really silly things I continue to do that I shouldn’t (I’m a slow learner). Firstly, the evaluate as expression property.

A common design pattern in SSIS packages is to use a variable as the source for an Execute SQL task. The variable’s value is then set using an expression which references another variable. A common example of this is a variable that contains an update command with a where clause containing the value of a BatchID variable. The expression property contains the code that references the other variable.

If you use this type of arrangement, make sure the variable's evaluate as expression property is set to TRUE, as per the example below. I’ve spent many hours debugging problems where I’ve simply forgotten to set this properly.

6

To wrap up this post, I frequently want to add an existing .dtsx package (e.g.; from another project) to the current project. By default I right click the SSIS Packages node in solution explorer and choose the .dtsx package that I’ve already copied into the project directory. I’m then confused as to why the package is renamed e.g.; from Package.dtsx to Package(1).dtsx.

It turns out that the Add Existing Package menu is really designed for adding a package from a different location (i.e.; remote server), and if you use it locally, it makes a copy (and renames) the file.

For the situation I described, using the Add > Existing Item option (instead of Add Existing Package from the SSIS Packages menu) is the correct option. The figure below compares these 2 options side by side.

7
In the next post, I’ll talk about some important SSIS performance tuning tips.

Cheers

  • Digg
  • del.icio.us
  • DZone
  • DotNetKicks
  • Google Bookmarks
  • Ping.fm
  • Reddit
  • StumbleUpon
  • TwitThis

Fun with SSIS – Part 1 – Troubleshooting

February 9th, 2010 by Rod Colledge

Although I still consider myself a DBA, the last few years have seen me spend more and more time in the Business Intelligence development space (SSIS development in particular). In this the first of a 2-part SSIS series, I’ll be focusing on a number of SSIS “issues” I’ve discovered the hard way.

1: Case Sensitivity

I’ve never been particularly good at using case consistently, the perfect example of which was my publisher asking me to edit hundreds of figure captions in my recently published book. Like most people of my age who took a college course in programming, I studied C, and spent countless hours debugging problems traced back to the incorrect case of variable names (and using “=” instead of “==” !).

When I decided to specialize in SQL Server administration, I rejoiced at the lack of case sensitivity involved, although I occasionally see databases with the case sensitive option, in most cases (pardon the pun) installed my accident.

Imagine my angst when I discovered that a tool I’m spending more and more time with is rampantly case sensitive. Almost everything about SSIS is case sensitive; the expression language, variable names and one that gets me every time; lookups.

A very common SSIS data flow transformation is lookup. As the name implies, it’s most often used to lookup a key value based on a text value in the data flow. For example, in the figure below, we’re returning the supplierKey based on the supplierName value. The first screen defines the lookup source, and the second defines the join condition and what value is to be returned.

1

This, for all intents and purposes, in a simple join condition with a column from the joined table being returned. However, unlike T-SQL, the SSIS lookup is case sensitive.

The obvious way to address this is to ensure both the lookup column (vendorName in this example) and the matching data flow field (Supplier) are set to the same case using the UPPER (or LOWER) function. For the lookup, that’s a simple case of changing the lookup source T-SQL to use one of those functions. For the data flow field, we could use the derived column transformation using the same function, and use that column in the lookup process.

2: Strong Typing

If there was any remaining doubt that SSIS was designed for developers (and not classic DBAs) it’s confirmed in its (very) strong typing. As an example of that, let’s revisit the lookup example from above. A previous version of the database had the Supplier data type set to varchar(50) and vendorName set to nvarchar(255). When we try and connect these in the lookup, we’ll get the following error;

2
 

Another common issue with mismatched data types is when inserting rows into a table at the end of a data flow task. If the destination table’s column length is shorter than the source, you’ll get a warning such as this one;

3

Unless you have the luxury of an enterprise wide data dictionary (that every one adheres to), data type variances such as these are common, with the most frequent SSIS workarounds being to either CAST the columns in the data source selection (e.g.; using a view) and/or using type casts in a derived column transformation (more on that in part 2 of this blog series)

3: 64-bit Trickery

There will presumably come a day when everything works on 64-bit, and that day can’t come soon enough. In the meantime, we have to deal with a variety of annoying problems, one of which is 64-bit driver support in SSIS.

A common data source for SSIS packages are Excel files. The current (and all previous) versions of Excel do not include 64-bit drivers. This presents a problem when SSIS packages run in 64-bit environments, and the error messages returned are far from helpful (unless you consider “The AcquireConnection method call to the connection manager blah failed with error code blah” helpful).

Fortunately, there’s an easy workaround for this issue, via a property called Run64BitRuntime accessed through the Debugging page of the project’s properties window. As per the figure below, setting this property to False (it’s true by default) will use the 32-bit driver.

4

Note that there are a whole range of other considerations for 64-bit mode, for example when calling child packages from a parent with a different setting for this value. For a great blog post on these issues visit Todd McDermid’s post  from last year.

4: Package Configurations

One of the real SSIS horror stories that I was very close to (but not responsible for!) was an SSIS package that was run against a production database by accident (blowing away gigabytes of production data). The package used a production configuration file instead of a test/development file. There’s a couple of aspects to this (all too common) problem; configuration technique and security.

Firstly, configuration technique. SSIS allows a number of ways of configuring settings such as server and database name. The most common one is to use an XML configuration file, the path to which is either stored within the package itself, or set through an environment variable.

I’ve seen problems with both of these techniques. Using a configuration file location (as per the example below) assumes that all servers on which the package runs has the same path available, which is quite often not the case.

5

The environment variable option can also present some problems, notably the need to reboot the server for the variable to come into effect. Depending on the environment, rebooting production servers is not the easiest thing to do.

Another alternative to both of the above is one that I discovered while listening to Greg Low’s SQLDownUnder podcast with Jamie Thompson  in which Greg spoke of the idea of connecting to a special configuration database which stored the configuration settings. The settings returned were based on the calling machine’s context and then used to set the value of package variables for subsequent use.

Regardless of the configuration technique, one of the really important considerations is security. In each of the above techniques, mistakes can still be made. For example, a production configuration file can be copied over the top of a development file, or the incorrect settings stored in a configuration database.

To prevent these types of configuration errors from killing a production database, we need to ensure that the security context of an SSIS package prevents it from accessing the wrong environment. There are at least two ways of achieving this; using separate service accounts in each environment (and making sure the account only has database permissions in the appropriate environment), or better still, having the production database in a separate domain with trust permissions removed to the other domain(s).

With the appropriate security setup, even if the wrong configuration file is used, the package will fail to run due to the lack of database permissions.

5: Miscellaneous Silly Business

Finally, there’s a number of really silly things I continue to do that I shouldn’t (I’m a slow learner). Firstly, the evaluate as expression property.

A common design pattern in SSIS packages is to use a variable as the source for an Execute SQL task. The variable’s value is then set using an expression which references another variable. A common example of this is a variable that contains an update command with a where clause containing the value of a BatchID variable. The expression property contains the code that references the other variable.

If you use this type of arrangement, make sure the variable's evaluate as expression property is set to TRUE, as per the example below. I’ve spent many hours debugging problems where I’ve simply forgotten to set this properly.

6

To wrap up this post, I frequently want to add an existing .dtsx package (e.g.; from another project) to the current project. By default I right click the SSIS Packages node in solution explorer and choose the .dtsx package that I’ve already copied into the project directory. I’m then confused as to why the package is renamed e.g.; from Package.dtsx to Package(1).dtsx.

It turns out that the Add Existing Package menu is really designed for adding a package from a different location (i.e.; remote server), and if you use it locally, it makes a copy (and renames) the file.

For the situation I described, using the Add > Existing Item option (instead of Add Existing Package from the SSIS Packages menu) is the correct option. The figure below compares these 2 options side by side.

7
In the next post, I’ll talk about some important SSIS performance tuning tips.

Cheers

  • Digg
  • del.icio.us
  • DZone
  • DotNetKicks
  • Google Bookmarks
  • Ping.fm
  • Reddit
  • StumbleUpon
  • TwitThis

Using Perfmon and Profiler

February 9th, 2010 by Brent Ozar

At the SQLBits conference in Great Britain last fall, Brent Ozar gave a presentation called “Using Perfmon and Profiler.”  He covered:

  • Why Windows Task Manager is nothing more than a check-engine light
  • His performance tuning lifecycle
  • Why you need to capture baselines (and how to do it)
  • Several sample tuning scenarios

You can watch this 52-minute presentation free courtesy of SQLBits:

Get the Flash Player to see the wordTube Media Player.

If you liked this presentation, you can watch any of the SQLBits presentations at their web site, including:

Subscribing or Downloading the Podcast

If you have an MP3 player or a portable video player and you want to download our videos automatically, you can subscribe to our podcast feeds here:

You can also download this video to watch it later:

  • Digg
  • del.icio.us
  • DZone
  • DotNetKicks
  • Google Bookmarks
  • Ping.fm
  • Reddit
  • StumbleUpon
  • TwitThis

#TSQL2sDay 003: Maslow and relational design

February 9th, 2010 by Stuart Ainsworth

Rob Farley is hosting the third installment of TSQL Tuesday, and it’s a fun one: relationships (in honor of Valentine’s Day).   While I’m not currently in much of a mood to opine on the virtues of love and databases, I did think I wanted to post something a bit more esoteric this time.  Not many of you may know that I don’t have a formal background in Information Technology (some of my more sarcastic friends just held their tongues at that one); I actually have a Master of Arts in Communication, and a Master’s of Education in Instructional Technology.  I tripped into IT when I failed my comprehensive exams for the doctoral program in Speech Communication at the University of Georgia.  Awful time, but ultimately one of the best things to ever happen to me.

Anyway, why is this relevant?  Because the goal of this post is to attempt to extend one of the more famous models of social psychology and communication to database design; bear with me (I’m assuming that many of you either have no background in social psych or slept through it), but I’m hoping that this extension to the metaphor will benefit you in terms of your application design.

Maslow: the crash course.

The following is a BRIEF introduction to the theory; if you want more details, Wikipedia is your friend. In a nutshell, Abraham Maslow proposed that humans, as a social animal, were driven to fulfill certain basic needs in a quest for self-actualization or enlightenment.  He proposed a pyramidic model of five sets (or stages) of these needs, with the four lowest ones being required to achieve before attempting the fifth; few people ever attain the fifth level, but the quest to reach that is part of our collective experience.  I’ve defined the five stages below:

maslows_hierarchy_of_needssvg Physiological:

The basic requirements for human existence; food, water, etc.

Safety:

This often translates into security, but it’s different than the term we use in information technology careers; safety is the ability to acquire and maintain goods for ongoing existence.  The Physiological elements are immediate needs; Safety elements are the ability to fulfill those immediate needs at a future date.

Social:

Where do we belong?  How do we interact with others who need us (and we need)?  What is our role, and how does that affect our definition of the self?

Esteem:

Esteem stems from the social need; once our relationship with others has been established, we can truly begin to define ourselves and the virtue of our importance in the world.

Self-Actualization:

Self-actualization is the ultimate fulfillment of one’s potential; to be what one is, without need for constant reinforcement from other beings, yet able to exist in harmony with purpose.  Few people have ever attained this stage, but as stated before, the quest to reach the top of the pyramid drives human development.

So what does this mean to the database designer?

Why is all of this important?  This is not a perfect analogy, but if we extend Maslow’s model to the area of database design, some interesting questions arise (particularly in the third and fourth stages, which is why I felt like this point would be relevant to the TSQL Tuesday challenge of relationships).  Let’s take each stage, and step through them again.

Physiological:

While applications don’t have physiological needs, they DO require certain basic elements for long term survival.  Questions to consider at this stage are things like: How much space will I need?  What are the server requirements?  Can my database live in cloud or a mobile device?   What sort of I/O concerns do I have?

Safety:

Recall that safety is NOT security (in terms of who has access to the data), but it is security in terms of long-term survival of the application.  Is the database you’re designing intended for a long-term project, or is it “throw-away” code?  Have you designed it in such a way so that it’s easy to replace without impacting the dependent application?

Social:

Speaking of dependent applications (and herein lies the relationship aspect of this post), is your database application designed so that it is loosely related and decoupled from the application?  Does the database fulfill the needed role within the relationship (data storage), without treading too far into business logic?  Can the database handle multiple relationships with various applications (UI/reporting/business services).

Esteem:

Closely related to the social nature of the database within the application stack is the need for self-esteem within the database; can the database meet the the needs of the dependent applications WHILE retaining enough information to establish new relationships?  A classic example of this is the lookup table; a database with low self-esteem will only store the enumerated values provided to it by some other application. 

Without the enabling application, the database lacks sufficient internal definition to validate meaning; in practical terms, this means that the database is not decoupled from the application enough to enable the development of alternate accessing applications.  For example, my day job is to reverse engineer vendor databases; few things in the world are more disturbing than a table full of numbers without any sort of category associated with that number.  The application designer decided to store that enumeration in the application; security through obfuscation IS a method of securing your database, but not a very effective one.

A high-self esteem database will store all of the appropriate lookup values (complete with constraints) in order to provide complete validity within the structure.  The database can then be reused by several different applications, without requiring a complete set of business rules to determine those relationships.    The data layer is definitional; the business layer should be procedural.

hal[1] Self-Actualization:

I have to admit that discussing self-actualization in regards to application design makes me think of HAL.  “I’m sorry, Dave….”

To try and stay on track with this metaphor, self-actualization is the basic premise of BI; when your database can start providing you knowledge instead of just data, it has attained the highest level of potential.  Few apps make it that far without requiring substantial redesign, but the ones that do are invaluable to the enterprise they support.

So where are we?

Dunno.  I hope this little exercise made your brain hurt just a bit, and opened up a new metaphor for understanding database design issues within the application stack.   If you have more questions than answers, that’s a good place to be.

Unrelated Relationship Ramblings

February 9th, 2010 by Mike Walsh

It’s Tuesday somewhere when this post goes up. In fact when this goes live it will be sometime around Tuesday afternoon or evening in Adelaide Australia. Why there? This month’s “T-SQL Tuesday” is being hosted by Rob Farley over at his place. His topic is “relationships”. He talks about the history of T-SQL Tuesday over at his post and I encourage you to check it out.

In honor of Valentine’s day, he wanted us to all chat about relationships with no other specificity other than having some sort of a relationship to our day jobs with SQL Server.  A few directions came into mind, come along on a journey through the rambling thoughts in my mind. Let me know what you think in the comments, I’d love to hear your opinion.

Customer Relationships

This past weekend I experienced some customer service that made me think back to my own abilities in this area (and where I can improve). You might read about another such lesson I learned at my town dump a few months back when posting about the importance of attitude at work. Anyway, I ordered from a smaller pizza store in town. They don’t get a lot of business (new kid on the block) but I wanted a calzone. The former owner (and older Greek man) made a good one with Spinach, Feta, Mushrooms and Kalamata Olives with olive oil and garlic. Called and ordered that. No questions/complaints (Until I got there). To continue the point let me contrast how I came upon this idea of a calzone with the former owner…

A Positive Example

Now the former owner used to have to chop the non-pitted olives up but he offered to do that. In fact when I was first in there and ordered the calzone without the olives he got excited and said something like, “no one up this way ever wants something like that.. they want sauce and pizza cheese.. How about we put some olives in there too?” He made it up, we chatted while it cooked and he took an interest in me. Asked me questions about my family, about living in town, etc. Nothing annoying, nothing prying. He just seemed to care about me. I asked him about business, about Greece and we just chatted between customers picking up orders (Did I mention I live in a small town? Smaller shop, not a lot of business unfortunately.. The popular pizza place has been established by another Greek family for some time and they dominate the “market”). Great Calzone and I ordered it from time to time…

Under New Management

Flash forward to this past weekend. I called up, ordered this calzone for the first time with the new owners (I knew there was new ownership because we ordered a pizza a while back and the young owner was quick to ask “have you been here recently? we have new ownership… I just wanted to let you know our prices went up so you aren’t surprised”). No questions/comments over the phone, I even said “the old owner used to cut up the olives, if you guys can that’s great otherwise don’t worry about it”) When I went to pick it up the owner, his mother and a female employee were lazing about and I was greeted not with a hello but with, “Just so you know… Those olives are expensive, we normally use regular black olives on pizza. Plus there are pits, my mother had to cut them up with a knife.. If we do this for you next time we will have to charge you 1 dollar.” I had already paid and didn’t offer to pay the difference (price was already astronomical ) but I did say, “oh.. sorry I used to get them like that with the other guy”… The response? “Yeah.. That old guy used to always do special requests for customers, that’s one of the reasons he isn’t here anymore, you know?” I didn’t comment on the fact that the “old guy” was just one person working, here is 3 people with one customer.. I just nodded and left, likely for good until this guy goes out of business and new management comes in…

The Day Job Lesson?

Yeah, I know. Conciseness is key but the story above illustrates a few points to remember with the relationship to our “customers”, managers, vendors, etc. As a DBA, the developers are my customers. To a developer, perhaps a DBA can be seen as a consultant or vendor. No matter how you define it, a relationship exists. Customer satisfaction is a good goal. Some thoughts:

  • Going the extra mile isn’t always bad – At the pizza place doing extra for each customer may not be wise but every once in awhile? Might make a repeat customer. As a DBA, always doing something extra for every customer may mean a week that never ends. But… There is a happy medium. Where you can do so safely, a little more is a good goal.
  • Why are you telling someone the impact to you? I am not saying don’t tell them but ask why. If you are just whining to brag about how you went above and beyond and complain at how horrible it was for you, save it. Save your breath. Just quiet down and go and beat the expectations for the sake of doing a good job.
  • Happy Customer – Happy Provider- At the pizza place a happy customer means I’ll go back. As a DBA? Maybe a developer will be more likely to listen to some advice or guidelines if you aren’t always whining, complaining and doing the bare minimum.

Relationships Work

I used a lot of my words in the above post. Just a reminder here that set based logic and normalized databases generally work out just fine. I still hear people come to job interviews and say things like “normalization is a great concept academically” or “I never normalize too far, then there are too many tables”. Blah. Blech, even. I am not saying every database needs to get to the highest level of normalization but go for it and try a higher bit of normalization than you are currently willing to. Performance and Integrity will thank you. Sure, in some cases the read performance might even take a slight, barely noticeable hint. But your maintainability, data integrity and update/insert cost may just improve at the same time for a couple ms here or there.

The rules of normalization still apply (40 years later! That is pretty neat… I work with in 2010 in IT with a technology that is by and large based on the same bedrock that was there before I was born.. not a lot of .net developers can say that ;-) and it isn’t “legacy”) and good database design eliminates many performance problems. In fact taking the time to properly design your database today will mean there is a much slimmer chance that you’ll ever need to hire the likes of me to come in on a tuning or “what the heck is happening to data quality!?” project. Louis Davidson (and others) wrote a good book on this subject, Pro SQL Server 2008 Relational Database Design and Implementation , is a great resource (so are his earlier versions). The first section of the MVP Deep Dives book which covers design decisions is worth the price of the book alone (then you get a bonus of a multitude of other, post-design concerns in the book)

Go and read some of the early papers in the life of the database as we mostly know it today, like this paper from E.F. Codd. Heck – spend some time clicking around in Wikipedia even, start in normalization and click on what you don’t know. Then apply some basic principles:

  • Normalize your database. Go through the motions and understand normalization and ask the right questions of your data model (joins are not inherently evil, especially in 2010 with DBMS’ and hardware where they are…)
  • Define your relationships explicitly at the database level if you care about integrity (sure there are some cases where you may be better served not to but in most databases I have worked with, they would have benefited from integrity and possibly join elimination for some outer joins to dimension tables)
  • Model – No, wipe that zoolander face off please… Model your database, define your relationships before you start working. Some seem to have the, “real data professionals don’t do data models” mentality -  That’s bull. (though, I’ve had quiche from time to time, so your mileage may vary).

Check out the rest of the links that will go up on Rob Farley’s SQL Blog at the conclusion of this event. I am sure there will be some great posts (and then there will be some like this random relationship rambling but it will be worth it when you find the good ones)

What do you think? Should us DBAs send Valentines to our developer friends like our kids do at school?

Share/Bookmark

Join me at the Memphis SQL Server User Group meeting

February 8th, 2010 by Joe Webb

memphis_pyramid_2010_02_09.jpg

For almost a year now, I’ve exchanged emails with the leaders of the Memphis SQL Server User Group, trying to find a time when our schedules would align themselves and I’d be able to join them for a meeting. After many attempts and several near misses, we’ve finally set a date!

This Thursday, February 11, 2010, I will be speaking on a SQL Server Locking & Blocking. I’ll talk about:

  • Why databases lock resources
  • The consequences of blocking
  • How SQL Server manages locks
  • Techniques for influencing locking behavior
  • Using Transaction Isolation Levels

You can download the presentation materials here.

During my presentation, I’ll demonstrate how to analyze locks that SQL Server is holding using the sys.dm_tran_locks Dynamic Management View (DMV) as shown below.


–examine the resources
SELECT
resource_type
,(CASE
WHEN resource_type = ‘OBJECT’ THEN object_name(resource_associated_entity_id)
WHEN resource_type IN (‘DATABASE’, ‘FILE’, ‘METADATA’) THEN ‘N/A’
WHEN resource_type IN (‘KEY’, ‘PAGE’, ‘RID’) THEN (
SELECT
object_name(object_id)
FROM
sys.partitions
WHERE
hobt_id=resource_associated_entity_id)
ELSE
‘Undefined’
END) AS resource_name
,request_mode as lock_type
,resource_description
,request_status
,request_session_id
,request_owner_id AS transaction_id
FROM
sys.dm_tran_locks
WHERE
resource_type <> ‘DATABASE’;

If you live in or near Memphis, I hope you come out and join us for the meeting. I’d love to meet you. For more information, visit mem-pass.org.

Get the Server level default connection properties for SQL Server using T-SQL

February 8th, 2010 by Sankar Reddy

Recently I have been doing an audit on our enterprise SQL Servers including production and non-production systems and wanted to check the server level configurations. One of the audit items I specifically wanted to look at was the server level default connection options. Unless there is a strong reason to do so, In general it is recommended NOT to tinker with the server level configuration settings [exceptions apply and ymmv] and just accept the default values.

SQL Server server level default connection options

This information can be retrieved from sp_configure or sys.configurations (2005 & above) by looking at the “user options” property.

EXEC sp_configure 'User Options'

GO

SELECT * FROM sys.configurations WHERE name = 'User Options'

GO

Let’s take a look at the data exposed by the above methods. We need to look at either run_vale from sp_configure or value_in_use from sys.configurations to look at the current value set for this property.

SQL Server sp_configure sys.configurations

What we get from the above is one value and NOT detailed information for each of those individual properties. Using SSMS to retrieve this information manually on several servers is rather boring & tedious and definitely NOT a good use of my or anyone else’s time. This led me to write a T-SQL script to fetch the details of each one of the properties listed under the default connection options.

IF OBJECT_ID('tempdb..#ConnectionOptions', 'U') IS NOT NULL DROP TABLE #ConnectionOptions

IF OBJECT_ID('tempdb..#UserOptions', 'U') IS NOT NULL DROP TABLE #UserOptions
--Create the table and store the configuration values

CREATE TABLE #ConnectionOptions (property VARCHAR(100), value INT)
INSERT #ConnectionOptions

SELECT 'implicit transactions', 2 UNION ALL

SELECT 'cursor close on commit', 4 UNION ALL

SELECT 'ansi warnings', 8 UNION ALL

SELECT 'ansi padding', 16 UNION ALL

SELECT 'ansi nulls', 32 UNION ALL

SELECT 'arithmetic abort', 64 UNION ALL

SELECT 'arithmetic ignore', 128 UNION ALL

SELECT 'quoted identifier', 256 UNION ALL

SELECT 'no count', 512 UNION ALL

SELECT 'ansi null default on', 1024 UNION ALL

SELECT 'ansi null default off', 2048 UNION ALL

SELECT 'concat null yields null', 4096 UNION ALL

SELECT 'numeric round abort', 8192 UNION ALL

SELECT 'xact_abort on', 16384 
--define the table structure

CREATE TABLE #UserOptions ( name VARCHAR(100), minimum INT, maximum INT, config_vale INT, run_value INT)
--Retrieve the data

INSERT #UserOptions

EXEC sp_configure 'User options'
DECLARE @currentvalue INT

SELECT @currentvalue = run_value FROM #UserOptions
SELECT property, CASE WHEN (@currentvalue & value) = value

                       THEN 'has been SET' 

                       ELSE 'NOT SET'

               END CurrentValue

 FROM #ConnectionOptions

SQL Server server level default connection options results

The above results are easy to read and the above code is self explanatory. It uses bitwise operations on the retrieved value from the sp_configure/sys.configurations value .

Recommeded Reading:

How to use SP_CONFIGURE in SQL Server
T-SQL Bitwise Operations

Digg This  Reddit This  Stumble Now!  Buzz This  Vote on DZone  Share on Facebook  Bookmark this on Delicious  Kick It on DotNetKicks.com  Shout it  Share on LinkedIn  Bookmark this on Technorati  Post on Twitter