Tracking & Resolving Sort Warnings in SQL Server
What are Sort Warnings?
Queries with sort operations try to use memory to do the sorting initially. If the server has only few memory buffers left, this sort operation will spill to tempdb. SQL Server will raise ‘Sort Warnings’ event when this happens and it can take multiple passes to tempdb to get the sorted output. Every pass to tempdb is additional overhead and takes extra resources. Note that a query with out any Sort Warnings will perform better compared to single pass, query with single pass will perform better compared to multiple passes. These Sort Warnings are captured by the default trace or by running profiler and EventSubClass column will tell if it took single or multiple passes to get the desired sorted output. Anytime, you see these Sort Warnings the server may NOT be performing optimally.
How do I know if my system is affected by these?
Note: Below query *may* return thousands of rows.
DECLARE @filename VARCHAR(50)
SELECT @filename = CAST(value AS VARCHAR(50))
FROM fn_trace_getinfo(DEFAULT)
WHERE property = 2
AND value IS NOT NULL
--Check if you have any Sort Warnings
SELECT gt.HostName,
gt.ApplicationName,
gt.NTUserName,
gt.NTDomainName,
gt.LoginName,
gt.SPID,
gt.EventClass,
te.Name AS EventName,
gt.EventSubClass,
gt.TEXTData,
gt.StartTime,
gt.EndTime,
gt.ObjectName,
gt.DatabaseName,
gt.FileName
FROM [fn_trace_gettable](@fileName, DEFAULT) gt
JOIN sys.trace_events te ON gt.EventClass = te.trace_event_id
WHERE EventClass = 69
ORDER BY StartTime;
--Get a scale of this issue
SELECT dateadd(dd,datediff(dd,0,gt.StartTime),0),
gt.EventSubClass,
count(1) as MeasureMe
FROM [fn_trace_gettable](@fileName, DEFAULT) gt
JOIN sys.trace_events te ON gt.EventClass = te.trace_event_id
WHERE EventClass = 69
group by dateadd(dd,datediff(dd,0,gt.StartTime),0), gt.EventSubClass
ORDER BY dateadd(dd,datediff(dd,0,gt.StartTime),0), gt.EventSubClass
What statements are causing these?
We can use SQL Profiler to find if the system is suffering from ‘Sort Warnings’ but we can’t really capture exactly what statements are causing these. For certain events SQL Profiler doesn’t tell much information to find the root cause of these issues. See image below, TextData column is disabled for this event.
How do we track them if there is no support?
There is no direct support to track these but Solomon Rutzky came up with an innovative idea to use SQL Profiler, Triggers, Temp tables and dbcc inputbuffer. This idea will help to track down the offending statements up to the procedure name. In most cases, this is good enough but if you have one SP calling another or 20 Stored Procedures typical of Data-warehouse environments or you have a length Stored Procedure then it makes harder to pinpoint the root cause. Here is an extension to his idea.
1) Create a Profiler trace at a minimum selecting these columns and save the results to a table.
EventClass, DatabaseID, SPID, TextData, EventSubClass and ApplicationName.
2) Pause the trace and run the below script to create a trigger.
3) Play the trace again.
Once the trace runs for sometime, now you can find out exactly which statements are causing these Sort Warnings.
USE [dba] GO SET ANSI_NULLS ON GO SET QUOTED_IDENTIFIER ON GO create TRIGGER[dbo].[TraceTableExtraInfo1] ON[dbo].[Trace_041409] AFTER INSERT AS BEGIN SET NOCOUNT ON; DECLARE@SQL VARCHAR(4000) SELECT@SQL = 'select SUBSTRING(dest.text, (der.statement_start_offset/2)+1, ((CASE der.statement_end_offset WHEN -1 THEN DATALENGTH(dest.text) ELSE der.statement_end_offset END - der.statement_start_offset)/2) + 1) AS statement_text from sys.dm_exec_requests as der cross apply sys.dm_exec_sql_text (der.sql_handle) as dest where session_id = '+CONVERT(VARCHAR(10),isnull(ins.SPID,0)) +'' FROM inserted ins CREATE TABLE#DBCCInfo (EventInfo NVARCHAR(4000)) INSERT INTO#DBCCInfo (EventInfo) EXEC(@SQL) UPDATE tt SET tt.TextData =info.EventInfo FROM dbo.[Trace_041409]e in the tracel tt INNER JOIN inserted ins ON ins.RowNumber =tt.RowNumber CROSS JOIN #DBCCInfo info WHERE tt.TextData IS NULL DROP TABLE#DBCCInfo END
Comment: Use caution to select as few columns as possible in the above trace and set restrictive filters. Otherwise this exercise can overwhelm the server while collecting the data.
Ok, Now we know what statements, How do we fix these?
1) Add more RAM to the server.
2) Avoid the sorting if possible.
3) Minimize the number of rows returned from the statement with more restrictive where clause.
4) Minimize the number of columns being sorted on.
5) Add proper indexes to help the sort.
Anything else:
*Rant*
Index maintenance scripts use sorting but are excluded in the ‘Sort Warnings’ event. SQL Server throws Sort Warnings for explicit sorting such as order by but I am surprised to see these for UPDATE STATISTICS. Here is the code snippet generated by update statistics.
SELECT StatMan([SC0], [SB0000]) FROM (SELECT TOP 100 PERCENT [SC0], step_direction([SC0]) over (order by NULL) AS [SB0000] FROM (SELECT [cEndDateTime] AS [SC0] FROM [dbo].[Tablename] WITH (READUNCOMMITTED,SAMPLE 5.949706e+001 PERCENT) ) AS _MS_UPDSTATS_TBL_HELPER ORDER BY [SC0], [SB0000] ) AS _MS_UPDSTATS_TBL OPTION (MAXDOP 1)
What’s worrying about the code snippet is, the sort is being done with TOP 100 percent inside a sub query which doesn’t make any sense nor does it add value. I hope MSFT will add documentation on this or fix the Sort warnings for Update Statistics.
Acknowledgements: Solomon Rutzky, Jonathan Kehayias