Execute T-SQL in Parallel

From SQLServerPedia

Jump to: navigation, search

See Also: Transact SQL Code Library

Contents

Overview

This page describes how to create a CLR stored procedure which will allow you to execute T-SQL on a set of databases in parallel. Various issues are addressed such as limiting parallel threads and collecting errors return by each.

Requirements

  • SQL Server 2005 or later with CLR enabled
  • Deployment database with compatibility mode 90 or greater and Trustworthy flag on
  • Visual Studio 2005 or later (support for SQL Server project template under language of choice)
  • Microsoft .NET Framework 2.0 or later
  • SP must be executed by a network/domain account able to establish new connections based on Integrated Security

Important Considerations

This is a powerful tool that gives those with access to it the ability to spawn as many threads as they choose on a server. It, like many other tools, could be used for evil and should be properly secured and should be treated with the same care used with sp_executesql or sp_MSforeachdb.

Some things to keep in mind as you implement:

  • If your server only has one or two CPU cores, low I/O throughput, or strained memory resources this procedure becomes less useful. It will be most helpful on servers which have many resources available for concurrent requests.
  • If you wish to cap the maximum threads based on the number of CPU cores available, add this line in the Initialize Variables section of code:
    MaxDOP = Math.Min(MaxDOP, System.Environment.ProcessorCount)
  • When testing new implementations of this procedure, you should observe how much system resources are being used or if there are any excessive waits during execution. Adjust @MaxDOP up or down accordingly.
  • If you use this procedure to perform simultaneous actions on the same database you may risk creating a deadlock.
  • The procedure does not return until all threads have exited. However, if you are forced to kill the process which ran it, its current threads will continue their normal course and you will not be able to retrieve any of their possible errors unless you've opted to write them to the Windows Application Event Log.
  • All the connected threads should be visible in the appropriate system tables.
  • You should review the Books Online document regarding CLR Integration Security before you begin.


Program Flow

The stored procedure will take the following steps:

  1. Establish a context connection to SQL Server
  2. Retrieve the @@SERVERNAME value
  3. Loop through each database name provided
    1. Provide server, database, T-SQL, and additional options to a new thread and run it
      1. Thread connects via Integrated Security to the server and database
      2. Thread executes its T-SQL command
      3. Thread catches errors, retries, and stores errors if they persist
    2. Sleep until running thread count is less than max specified
  4. Wait for all threads to stop
  5. Loop through and output errors received from threads

Implementation

Configuring SQL Server

First the server must have CLR enabled. More information on this can be found here. Enable it by executing this script in SQL Server:

sp_configure 'clr enabled', 1
GO
RECONFIGURE
GO

Next the database to which the stored procedure will be deployed must have its trustworthy flag enabled. More information on this can be found here. Enable it by executing this script in SQL Server:

ALTER DATABASE AdventureWorks SET TRUSTWORTHY ON
GO

Finally, the user you deploy as must be a member of the sysadmin fixed server role. Now SQL Server is ready for the stored procedure.

Creating the Project

This method requires two portions of code. First is the Partial Public Class StoredProcedures which includes all CLR stored procedures contained in the Visual Studio project. Second is a special class we will create that can be run as a separate thread. This class will accept all our parameters and perform the actual T-SQL execution.

  1. Open an empty solution in Visual Studio and go to File->New->Project or press Ctrl-Shift-N.
    File:NewProject.jpg
  2. Find the SQL Server Project template under Visual Basic-Database or Database Projects-Microsoft SQL Server-SQL CLR (VB), depending on your version of Visual Studio, and fill out the rest of the fields.
    This version allows you to select the .NET Framework version to base the project on. If you have this option, pick the newest version. If not, don't worry about it.
  3. Add a reference to the server and database to which you will be deploying the stored procedure. Click No when asked whether to enable SQL Server CLR debugging.
  4. Now you should be presented with a blank project. Open the project properties by double-clicking on My Project in the Solution Explorer.
  5. Under the Database section, change the Permission Level to Unsafe.
    Security Note from Microsoft on this:
    Specifying UNSAFE enables the code in the assembly complete freedom to perform operations in the SQL Server process space that can potentially compromise the robustness of SQL Server. UNSAFE assemblies can also potentially subvert the security system of either SQL Server or the common language runtime. UNSAFE permissions should be granted only to highly trusted assemblies. Only members of the sysadmin fixed server role can create and alter UNSAFE assemblies.
    File:PermissionsUnsafe.jpg
  6. Now go to the Project menu and choose Add Stored Procedure... then choose a name for it such as EXEC_PARALLEL and click the Add button.
  7. Replace the entire contents of this code file with:
Imports System
Imports System.Data
Imports System.Data.SqlClient
Imports System.Data.SqlTypes
Imports System.Threading
Imports Microsoft.SqlServer.Server

Partial Public Class StoredProcedures

     _
    Public Shared Function EXEC_PARALLEL(ByVal DBs As String, ByVal MaxDOP As Integer, _
                                         ByVal TSQL As String, ByVal msDelay As Integer, _
                                         ByVal Retries As Integer) As SqlInt32

        ' Initialize variables
        Dim conn As SqlConnection
        Dim cmd As SqlCommand
        Dim DBList() As String = DBs.Split(",")
        Dim errors As New Generic.List(Of String)
        Dim locker As New Object
        Dim server As String

        Dim threads As New Collections.Generic.List(Of Thread)
        Dim stopped As New Specialized.StringCollection
        Dim item As Thread
        Dim executer As ExecSql

        ' Get Server Instance Name
        conn = New SqlConnection("context connection = true;")
        conn.Open()

        cmd = conn.CreateCommand()
        cmd.CommandText = "SELECT @@SERVERNAME"
        server = cmd.ExecuteScalar().ToString()

        cmd.Dispose()
        conn.Close()
        conn.Dispose()

        ' Execute threads
        Dim keyloop As IEnumerator = DBList.GetEnumerator()
        While keyloop.MoveNext()
            executer = New ExecSql(server, keyloop.Current.ToString().Trim(), TSQL.Replace("?", keyloop.Current.ToString().Trim()), Retries, errors, locker)
            item = New Thread(AddressOf executer.run)
            item.Name = "ExecSql " & keyloop.Current().ToString().Trim()
            item.Start()
            threads.Add(item)

            SqlContext.Pipe.Send(DateTime.Now.ToShortTimeString() & " : Start : " & item.Name.Replace("ExecSql ", ""))
            Thread.Sleep(msDelay)

            While checkThreads(threads, stopped) >= MaxDOP
                Thread.Sleep(1000)
            End While
        End While

        ' Wait for all threads to stop
        While checkThreads(threads, stopped) > 0
            Thread.Sleep(1000)
        End While
        SqlContext.Pipe.Send("All threads have stopped with " & errors.Count.ToString() & " errors: ")

        If errors.Count > 0 Then
            keyloop = errors.GetEnumerator()
            While keyloop.MoveNext()
                SqlContext.Pipe.Send(keyloop.Current())
            End While

            Throw New Exception("ERRORS OCCURRED.")
        End If

        Return 0 - errors.Count
    End Function

    Public Shared Function checkThreads(ByRef threads As Collections.Generic.List(Of Thread), _
                                        ByRef stops As Specialized.StringCollection) As Integer
        Dim item As Thread
        Dim running As Integer = 0

        Dim keyloop As IEnumerator = threads.GetEnumerator()
        While keyloop.MoveNext()
            item = keyloop.Current()
            If item.IsAlive Then
                running += 1
            ElseIf Not stops.Contains(item.Name) Then
                stops.Add(item.Name)
                SqlContext.Pipe.Send(DateTime.Now.ToShortTimeString() & " : Stop  : " & item.Name.Replace("ExecSql ", ""))
            End If
        End While

        Return running
    End Function
End Class
  1. Now go to the Project menu and choose Add Class..., give it a name, and click the Add button.
  2. Replace the entire contents of this code file with:
Imports System.Data.SqlClient
Imports System.Threading

Public Class ExecSql
    Private ExecErrors As Generic.List(Of String)
    Private ExecLocker As Object
    Private ExecServer As String
    Private ExecDB As String
    Private ExecTSQL As String
    Private ExecRetries As Integer

    Public Sub New(ByVal Server As String, ByVal DB As String, ByVal TSQL As String, ByVal retries As Integer, _
                   ByRef errors As Generic.List(Of String), ByRef locker As Object)
        Me.ExecServer = Server
        Me.ExecDB = DB
        Me.ExecTSQL = TSQL
        Me.ExecRetries = retries
        Me.ExecErrors = errors
        Me.ExecLocker = locker
    End Sub

    Public Sub run()
        Dim tries As Integer = 1
        Dim conn As SqlConnection
Retry:
        conn = New SqlConnection("Data Source=" & ExecServer & ";Initial Catalog=" & ExecDB & ";Integrated Security=SSPI;")
        Try
            conn.Open()

            If conn.State = ConnectionState.Open Then
                Dim cmd As SqlCommand = conn.CreateCommand()
                cmd.CommandText = ExecTSQL
                cmd.CommandTimeout = 0
                cmd.ExecuteNonQuery()

                cmd.Dispose()
                conn.Close()
                conn.Dispose()
            Else
                Throw New Exception("Unable to connect to SQL Server.")
            End If
        Catch e As Exception
            If conn.State <> ConnectionState.Closed Then conn.Close()
            conn.Dispose()
            'System.Diagnostics.EventLog.WriteEntry("SQL_EXEC_PARALLEL", e.Message)
            If tries <= ExecRetries Then
                Thread.Sleep(5000)
                tries += 1
                GoTo Retry
            Else
                SyncLock ExecLocker
                    ExecErrors.Add(Me.ExecDB.PadRight(16, " ") + " : " + e.Message)
                End SyncLock
            End If
        End Try
    End Sub

End Class
  1. Next, at the top of the screen, change the Solution Configuration from Debug to Release.
  2. Finally, go to the Build menu and choose Deploy Solution

If all went well, you should now have a stored procedure in your database which can be used like so:

EXEC AdventureWorks.dbo.EXEC_PARALLEL @DBs = 'msdb,AdventureWorks,master', @MaxDOP = 2,
         @TSQL = 'EXEC ?.sys.sp_updatestats',
         @msDelay = 5000, @Retries = 2

The above statement will update stats on the listed databases, two at a time, waiting 5 seconds between each thread start, and retrying twice for each thread that fails. Any thread which fails its final retry will have its error recorded and printed out when all threads are complete. The ?. is optional since each database is specified as the Initial Catalog when each new thread connects.

Here is an example of output that includes some failures. In this example, the first four databases contain the table PERSON, and the last four do not.

EXEC Adventureworks.dbo.EXEC_PARALLEL
@DBs='foo_live,foo_test,foo_dev,foo_archive,foo_misc,staging,reports,legal',
@MaxDOP=4, @TSQL='SELECT TOP 1 * FROM dbo.PERSON',
@msDelay=1000, @Retries=1

Results in:

1:23 PM : Start : foo_live
1:23 PM : Stop  : foo_live
1:23 PM : Start : foo_test
1:23 PM : Stop  : foo_test
1:23 PM : Start : foo_dev
1:23 PM : Stop  : foo_dev
1:23 PM : Start : foo_archive
1:23 PM : Stop  : foo_archive
1:23 PM : Start : foo_misc
1:23 PM : Start : staging
1:23 PM : Start : reports
1:23 PM : Start : legal
1:23 PM : Stop  : foo_misc
1:23 PM : Stop  : staging
1:23 PM : Stop  : reports
1:23 PM : Stop  : legal
All threads have stopped with 4 errors:
foo_misc         : Invalid object name 'dbo.PERSON'.
staging          : Invalid object name 'dbo.PERSON'.
reports          : Invalid object name 'dbo.PERSON'.
legal            : Invalid object name 'dbo.PERSON'.
.Net SqlClient Data Provider: Msg 6522, Level 16, State 1, Procedure EXEC_PARALLEL, Line 0
A .NET Framework error occurred during execution of user-defined routine or aggregate "EXEC_PARALLEL":
System.Exception: ERRORS OCCURRED.
System.Exception:
   at DBA_SQL_UTIL.StoredProcedures.EXEC_PARALLEL(String DBs, Int32 MaxDOP, String TSQL, Int32 msDelay, Int32 Retries)

Related Reading

Author Credits

This wiki article was authored by Konrad Willmert to document, share and improve upon a method he created for executing T-SQL on multiple databases and in parallel.

Konrad Willmert, a Programmer/Analyst for Indiana Wesleyan University since 2006, also serves as a reserve DBA for the institution. He has six years of experience building web applications based on MySQL for web design and hosting company Visionary Web. Konrad acquired the bulk of his SQL Server knowledge and experience at IWU while collaborating with the DBA on building and optimizing a data warehouse and custom ETL process from scratch. Konrad can be reached at: konrad.willmert@indwes.edu.