Partitioning Query Performance Benefits

From SQLServerPedia


Partitioning is primarily intended to make large tables easier for database administrators to manage. However, partitioned objects can also improve the performance of SELECT, UPDATE and DELETE queries. If a table is partitioned, SQL Server can often exclude irrelevant partitions from consideration and examine only those partitions that contain the requested data, as defined by the query predicates. This feature is called partition elimination. A query benefits from partition elimination only if it does not need data from every partition.
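As a minimal sketch of the idea, the following T-SQL builds a small partitioned table and queries it with a predicate on the partitioning column. All object names (pfOrderDate, psOrderDate, dbo.SalesDemo) and the boundary dates are assumptions made for illustration; they are not taken from the sample database used elsewhere in this article.

```sql
-- Hypothetical partition function: three boundary values give four partitions.
-- With RANGE RIGHT, each boundary value belongs to the partition on its right.
CREATE PARTITION FUNCTION pfOrderDate (datetime)
AS RANGE RIGHT FOR VALUES ('20020101', '20030101', '20040101');

-- Hypothetical partition scheme: maps every partition to the PRIMARY filegroup.
CREATE PARTITION SCHEME psOrderDate
AS PARTITION pfOrderDate ALL TO ([PRIMARY]);

-- Hypothetical table partitioned on the FullDate column.
CREATE TABLE dbo.SalesDemo
(
    SalesKey    int      NOT NULL,
    FullDate    datetime NOT NULL,
    SalesAmount money    NOT NULL
) ON psOrderDate (FullDate);

-- The predicate references the partitioning column, so only the partition
-- holding 2003 (partition 3) needs to be read.
SELECT SUM(SalesAmount) AS TotalSales
FROM dbo.SalesDemo
WHERE FullDate >= '20030101' AND FullDate < '20040101';
```

Putting every partition on one filegroup keeps the sketch short; a production design would typically spread partitions across filegroups.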

Note that partition elimination works only if the search argument references the partitioning column. One might think that the following query should also eliminate partitions:

SELECT CustomerKey, EnglishDayNameOfWeek, SUM(SalesAmount)
FROM dbo.FactInternetSales_Partitioned2 a
INNER JOIN DimTime b
    ON a.OrderDateKey = b.TimeKey
    AND b.FullDateAlternateKey <= '6/1/2003'
GROUP BY CustomerKey, EnglishDayNameOfWeek

But in this case the predicate examines the date column of the dimension table (DimTime), not a column of the partitioned fact table. Therefore, the execution plan shows that all five partitions are examined.

If you change the query slightly so that it filters on the fact table's date column, SQL Server can take advantage of partition elimination:

SELECT CustomerKey, EnglishDayNameOfWeek, SUM(SalesAmount)
FROM dbo.FactInternetSales_Partitioned2 a
INNER JOIN DimTime b
    ON a.OrderDateKey = b.TimeKey
    AND a.FullDate <= '6/1/2003'
GROUP BY CustomerKey, EnglishDayNameOfWeek
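To judge how much work elimination can save, it helps to know how the rows are spread across partitions. One possible way to check, sketched here with the sys.partitions catalog view, is to list each partition of the fact table with its row count. The table name is the one used in the queries above; the row counts reported by this view are approximate.

```sql
-- List the partitions of the fact table with their approximate row counts.
-- index_id 0 (heap) or 1 (clustered index) restricts the output to the
-- table's base data, so each partition appears only once.
SELECT partition_number, rows
FROM sys.partitions
WHERE object_id = OBJECT_ID('dbo.FactInternetSales_Partitioned2')
  AND index_id IN (0, 1)
ORDER BY partition_number;
```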

Similarly, if you join multiple partitioned tables and specify search arguments for each table in the WHERE clause, SQL Server can eliminate partitions from each affected table and can provide excellent query performance. The query execution plan would show multiple Constant Scan operators, each referencing only those partitions relevant to the given query.

Even a simple query that requests all rows from a partitioned table, filtered by values in the partitioning column, can execute considerably faster than the same query against a non-partitioned table, as long as the SQL Server query optimizer can use partition elimination.

UPDATE and DELETE queries can benefit from partition elimination much like SELECT queries. The following query sets the SalesAmount column to an arbitrary number for all sales that occurred before June 1st 2002:

UPDATE FactInternetSales_partitioned2
SET SalesAmount = 1000
WHERE FullDate < '6/1/2002'
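A DELETE with the same predicate would be a candidate for partition elimination in the same way. The sketch below is an illustration rather than part of the original example; it wraps the statement in a transaction that is rolled back, so the execution plan can be inspected without removing any rows.

```sql
-- Illustrative only: run the DELETE inside a transaction and roll it back,
-- so the plan can be examined while the data stays intact.
BEGIN TRANSACTION;

DELETE FROM FactInternetSales_partitioned2
WHERE FullDate < '6/1/2002';

ROLLBACK TRANSACTION;
```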

Demand Parallelism

Queries executed against partitioned tables on multi-processor servers can use an operator called demand parallelism. When the demand parallelism operator is used, you can see partition identifiers in the query execution plan. Demand parallelism is constrained by the availability of system resources and by the maximum degree of parallelism (MAXDOP) option, which you can set at the server level or at the query level. By default, the maximum degree of parallelism is set to 0, which allows SQL Server to use all available processors for each query. For example, if a query touches 20 partitions on a server with 4 processors and MAXDOP is left at its default value, then 4 partitions are scanned in parallel. When one of the threads finishes reading its partition, it moves on to the next unread partition (the 5th), the next thread to finish takes the 6th, and so forth.
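The server-level setting can be inspected and changed with sp_configure. The following is a minimal sketch; the value 4 is an arbitrary example, not a recommendation.

```sql
-- 'max degree of parallelism' is an advanced option, so advanced options
-- must be made visible first.
EXEC sp_configure 'show advanced options', 1;
RECONFIGURE;

-- Show the current setting (0 is the default).
EXEC sp_configure 'max degree of parallelism';

-- Example only: cap parallel plans at 4 processors server-wide.
EXEC sp_configure 'max degree of parallelism', 4;
RECONFIGURE;
```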

Note that depending on the workload of your servers, parallel data retrieval may or may not be the most efficient choice. If you have a heavily utilized transaction processing system and processor resources are scarce, it might be best to override the default behavior and turn off parallelism at the query level by specifying the MAXDOP 1 query hint. However, if your server can dedicate plenty of processor resources to a single query or a few queries, then you should attempt to maximize parallelism to minimize the query execution time.
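At the query level, the limit is expressed with the OPTION clause. The query below is a simplified variant of the earlier examples, written here only to show where the hint goes.

```sql
-- Force a serial plan for this one query, regardless of the server setting.
SELECT CustomerKey, SUM(SalesAmount) AS TotalSales
FROM dbo.FactInternetSales_Partitioned2
WHERE FullDate <= '6/1/2003'
GROUP BY CustomerKey
OPTION (MAXDOP 1);
```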

Table partitioning can also help with BULK INSERT performance, as long as the input files match the partitioning scheme of your table. Data from multiple files can be loaded in parallel, with each load affecting a single partition. To further improve the performance of your data load, ensure that the database is using the simple or bulk-logged recovery model.
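A load along these lines might look like the sketch below. The database name (SalesDemoDB), the file paths and the delimiters are all assumptions made for illustration; each file is assumed to contain rows for exactly one partition, and each BULK INSERT would be run from its own connection so that the loads proceed in parallel.

```sql
-- Hypothetical database name; switch to the bulk-logged recovery model
-- before the load.
ALTER DATABASE SalesDemoDB SET RECOVERY BULK_LOGGED;

-- Connection 1: hypothetical file holding only rows for one partition.
BULK INSERT dbo.FactInternetSales_Partitioned2
FROM 'C:\load\sales_2002.dat'
WITH (FIELDTERMINATOR = ',', ROWTERMINATOR = '\n');

-- Connection 2: hypothetical file holding only rows for another partition.
BULK INSERT dbo.FactInternetSales_Partitioned2
FROM 'C:\load\sales_2003.dat'
WITH (FIELDTERMINATOR = ',', ROWTERMINATOR = '\n');
```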

More SQLServerPedia Wiki Articles on SQL Server Partitioning

When To Use SQL Server Partitioning

How to Implement Partitioned Tables