More Advanced MDX Queries
From SQLServerPedia
|
See Also: Main_Page - Database Administration - Analysis Services - MDX with Analysis Services Queries you will see in this section involve syntax elements that you will encounter most often; these aren't necessarily complicated but go beyond the basic syntax that simply lists certain members or rows and others on columns.
CROSSJOINThe CROSSJOIN function allows you to combine possible combinations of two sets. This is a powerful function that allows analyzing your measures by multiple dimensions on the same view. For example, suppose you wanted to see the breakdown of sales by product category and by year, the following query will help: SELECT CROSSJOIN({[date].[cy 2001]:[date].[cy 2002] },
{[product].[category].members})
ON ROWS,
{[measures].[order count]} ON COLUMNS
FROM [adventure works] Results:
The CROSSJOIN function essentially multiplies one set by another. In fact, MSAS supports an alternative syntax for writing the same query by replacing the CROSSJOIN function with a multiplication sign; the following query would return the same results: SELECT
{[date].[cy 2001]:[date].[cy 2002] } * {[product].[category].members} ON ROWS,
{[measures].[order count]} ON COLUMNS
FROM [adventure works]
SELECT CROSSJOIN({[date].[cy 2001]:[date].[cy 2002] },
CROSSJOIN({[product].[category].members},
{[customer].[education].members}))
ON ROWS,
{[measures].[order count]} ON COLUMNS
FROM [adventure works]
FILTERThe FILTER function allows you to reduce the result set by specifying some criteria. This function is particularly useful if your result set is large and difficult to decipher. For example, suppose you have five thousand members in your product dimension. Instead of presenting all product sales you could return only those products that have sold at least 3000 items by using the following query: SELECT FILTER({[product].[product].members},
[Measures].[order count]>3000) ON ROWS,
{[measures].[order count]} ON COLUMNS
FROM [adventure works]
Results:
Note that unlike Transact-SQL which would limit the result set by the WHERE clause MDX limits the output using the FILTER function included with the SELECT statement. The FILTER function requires two arguments, a set that needs to be limited and the criteria for limiting the set. Note that you can use multiple criteria within the same FILTER function. For example, the following statement checks for a total order count greater than 3000 OR an internet order count of 1000: SELECT FILTER({[product].[product].members},
[measures].[order count]>3000 OR [measures].[internet order count]>1000) ON ROWS,
{[measures].[order count]} ON COLUMNS
FROM [adventure works] You can use a set returned by other functions as the first argument of the FILTER function. For example, you can apply FILTER to a result of the CROSSJOIN. You could extend the previous example to return only those product categories that sold over 1000 orders in either 2001 or 2002: SELECT FILTER(CROSSJOIN({[date].[cy 2001]:[date].[cy 2002] },
{[product].[category].members}), [measures].[order count]>1000)
ON ROWS,
{[measures].[order count]} ON COLUMNS
FROM [adventure works]
Results:
The next example uses the FILTER function to return 2004 order count only for those products whose sales have improved in 2004 as compared to 2002: SELECT FILTER(
{[product].[category].members},
([date].[cy 2004],[measures].[order count])>
([date].[cy 2002], [measures].[order count]))
ON ROWS,
{[measures].[order count]} ON COLUMNS
FROM [adventure works]
WHERE ([date].[cy 2004]) Results:
You could also use the FILTER function if you wanted to eliminate "All Products" from the result set; the query would look as follows:
ORDERThe ORDER function allows sorting of the query result set by the returned data values. This function takes three arguments: the set to be ordered, string or numeric expression to be used as the sorting criteria, and optional flag that specifies whether the results should be ordered in ascending or descending fashion. The following query returns 2003 order counts for product categories:
SELECT {[product].[category].members} ON ROWS,
{[measures].[order count]} ON COLUMNS
FROM [adventure works]
WHERE [date].[cy 2003] Results:
Now let's see how you can sort it in ascending order:
Results:
Note however, that "all products" still appears at the top of the list, even though the query was supposed to place it at the bottom. This is because by default MSAS keeps the hierarchy in tact - since all products is the "parent" of all other categories it still appears at the top. You can override the default behavior by specifying BASC ("B" stands for break hierarchy) flag instead of ASC. When you add BASC to the query the results will look as follows:
Similarly to break the hierarchy when sorting in descending order you must specify BDESC flag instead of DESC. The next example is somewhat more complicated; it retrieves 2002 sales for those categories whose sales have improved in 2002 as compared to 2004. In addition the query sorts the output based on 2003 sales:
Results:
At first the results of this query might look confusing. But if you examine the previous query results you will notice that accessories indeed had more orders than bikes in 2003. Since this result set is ordered based on 2003 sales the order count for accessories appears above bikes. You could also use the ORDER function to sort the items based on a string. For example the following query will return the output based on product name in descending order:
Results:
TOPCOUNT and BOTTOMCOUNTTOPCOUNT and BOTTOMCOUNT functions return top or bottom N members of a dimension optionally ordered based on a specified measure. For example, to return the top ten products from the product dimension and their respective orders you could write the following:
Results:
Note that MSAS does not attempt to order products alphabetically - it simply returns the top 10 products it finds in the product dimension. Next let's see how you could retrieve top 10 products based on respective order counts:
Results:
BOTTOMCOUNT would do the exact opposite - return the top 10 items with the least number of orders. However the BOTTOMCOUNT function isn't always associated with negative performers; for example you could use the BOTTOMCOUNT function to retrieve shipping companies that provide shortest (and therefore best) delivery times. TOPPERCENT and BOTTOMPERCENTTOPPERCENT and BOTTOMPERCENT functions are similar to their counterparts that return counts. However, these functions require the numeric expression (measure); they return the topmost (or bottommost) members whose cumulative total is above or equal to the specified percentage. For example, to return products whose cumulative number of orders is 15 percent of all orders (excluding "all products" from the total) you could run the following:
Results:
TOPSUM and BOTTOMSUMTOPSUM and BOTTOMSUM functions work similarly to TOPPERCENT. TOPSUM sorts the set and returns thw topmost members whose cumulative total is above or equal to a specified value. BOTTOMSUM returns the bottommost members whose cumulative total is above or equal to the specified value. For example, the following query will return products that have had the least number of orders and their cumulative total is 10:
Results:
Note the usage of the ISEMPTY function in this query; if we had not filtered out those products that haven't been ordered the list would include every single product that has had no orders. The ISEMPTY function returns TRUE if a member has no value (has a value of NULL); specifying NOT ISEMPTY does the opposite - it advises MSAS to return only values that are not empty. NON EMPTYThe NON EMPTY keyword supplies another way of filtering members that have NULL values (are empty). Empty cells occur because the intersection of two or more dimensions does not exist. The previous query could be rewritten like this:
Results would be the same as before. Similarly we could filter promotions that have not resulted in any orders from the following query: SELECT NON EMPTY {[promotion].[end date].members} ON ROWS,
Results:
If you attempt concatenating an empty cell with other cells MDX will treat the empty cell as an empty string. If you attempt adding / subtracting / multiplying a value of an empty cell with a numeric value MDX will treat the empty cell as a number zero. Empty cells can skew your calculations; for example, if you try calculating an average, MDX will divide the sum of all cells by the count of cells. When MDX sums the cells it adds zero for each empty cell but when it counts cells it increments the count of cells even if the empty cell is encountered. You need to decide whether you wish to exclude empty cells from the calculations depending on your business needs. You can apply the NON EMPTY keyword to any set, including the output of the CROSSJOIN function. For example, the following query excludes empty members from promotions that occurred in 2004 and shows the number of orders generated for each product category by each promotion:
Results:
Note that the NON EMPTY keyword examines whether the tuple is empty, as opposed to checking whether the cell was empty. So it is possible to see empty values in the output even when using NON EMPTY. The only reliable test of whether the cell is empty is using the ISEMPTY function. MDX provides another way of dealing with empty cells the COALESCEEMPTY function allows you to evaluate several values and return the first non-empty value. For example, the following query returns two columns the first one is the actual count of orders for a particular promotion; the second column replaces the empty counts with the number zero:
Results:
Alternatively we could examine multiple measures and supply the number zero if none of the examined measures have any values:
Note that the COALESCEEMPTY function examines values from left to right. NONEMPTYCROSSJOINThe NONEMPTYCROSSJOIN function is similar to CROSSJOIN but is considerably more efficient; this function also acts differently than CROSSJOIN preceded by the NON EMPTY keyword. NONEMPTYCROSSJOIN returns a cross product of two or more sets but it eliminates the empty tuples as well as tuples that have no associated fact table data. Since tuples that aren't based on fact tables are dropped, the NONEMPTYCROSSJOIN function does not work with calculated members. In addition to one or more sets this function also takes "set counter" as an optional argument. If the set counter is supplied then the function cross joins the specified number of sets; the rest of the sets provide context and are used to determine which members are considered non-empty in the resulting set. For example, the following query will return products that have had orders in calendar years 2001 and 2002:
Results:
Recall that the CROSSJOIN function only accepts 2 sets as parameters. Not so with NONEMPTYCROSSJOIN; you could alter this query slightly by adding another set to return the breakdown of order counts by product categories and promotions:
Results:
Notice that this output finds all intersections of date and product category dimensions and then cross joins the results with promotion end date. You could pass the crossjoin set counter to the NONEMPTYCROSSJOIN function to drop the first column and make the query more efficient:
Results:
As a rule NONEMPTYCROSSJOIN performs better than the CROSSJOIN function preceded with NON EMPTY keyword. NONEMPTYCROSSJOIN has certain drawbacks, however, and is deprecated in Analysis Services 2005. MSAS 2005 introduces the new function NONEMPTY() that should be used in place of both NONEMPTYCROSSJOIN and NON EMPTY CROSSJOIN. Click here for more information on the NONEMPTY() function. |