When
summarizing data using MDX, it’s a common requirement to find the top N by
group (i.e., the top N tuples of one set for each member of another set). For
example, you may need to find the top N brands by state or the top N products
by calendar year based on sales count or amount. In MDX, this is accomplished using a
combination of the Generate and TopCount functions. As an example, the following
returns the top 5 products for each region based on sales amount:
WITH SET [Top N Products] AS Generate( [Sales Territory].[Sales Territory Region].Children , TopCount( [Sales Territory].[Sales Territory Region].CurrentMember * [Product].[Product Categories].[Product].MEMBERS , 5 , ( [Measures].[Internet Sales Amount] ) ) ) SELECT { [Measures].[Internet Sales Amount] } ON COLUMNS, NON EMPTY { [Top N Products] } ON ROWS FROM [Adventure Works]
This commonly available query uses the Generate function to evaluate the set produced by the TopCount of the cross product of regions and products, and applies it to each of the region member's children. This produces the
following results:
But this only provides part of the picture. Typically, I would want to include all other products within each region as well so that I could provide context to the top N items. For example, in a report (which could also be accomplished using MDX... but let's keep the MDX straightforward for this post) I could calculate each product's percent of sales within each region or each region's percent of sales versus all sales. In order
to produce the all other products row for each region, we need to create a new calculated member (i.e., [All Other Products]) using the Aggregate and Except functions.
WITH SET [Top N Products] AS Generate( [Sales Territory].[Sales Territory Region].Children , TopCount( [Sales Territory].[Sales Territory Region].CurrentMember * [Product].[Product Categories].[Product].MEMBERS , 5 , ( [Measures].[Internet Sales Amount] ) ) ) MEMBER [Product].[Product Categories].[All Other Products] AS Aggregate( Except( NonEmpty( { [Sales Territory].[Sales Territory Region].CurrentMember * [Product].[Product Categories].[Product].MEMBERS } ) , [Top N Products] ) ) SELECT { [Measures].[Internet Sales Amount] } ON COLUMNS, NON EMPTY { [Top N Products] , ( [Sales Territory].[Sales Territory Region].[Sales Territory Region] , [All Other Products] ) } ON ROWS FROM [Adventure Works]
The Aggregate function takes a set (e.g., the cross product of region and product) and aggregates a measure (e.g., sales amount) within the current context. However, we additionally must use the Except function to exclude the top N products for each region from the aggregate.
While the NonEmpty function within the Except doesn't make a difference in this example, it can provide a significant performance gain depending on the cross product that is being evaluated.
This query results in the following (as a note, I did use an Order function so that All Other Products would sort within each group, but I left it out of the MDX to keep the example clear):
While the NonEmpty function within the Except doesn't make a difference in this example, it can provide a significant performance gain depending on the cross product that is being evaluated.
This query results in the following (as a note, I did use an Order function so that All Other Products would sort within each group, but I left it out of the MDX to keep the example clear):
Although the MDX is a little more complicated, the results are much more useful, providing a basis for creating context for your top N values.
Can you provide the full query?
ReplyDelete