Author: Eric Jacobsen, Denny Lee
IntroductionDesigning effective aggregations are a well known way to improve query performance (for the sake of AS processing time). As you may have noticed in Tip #2 of the of the
Analysis Services Query Performance Top 10 Best Practices whitepaper, the best way to define effective aggregations is to use the Aggregation Manager sample application. Here we discuss one of the strategies for adding effective aggregations using this tool.
For more information and to download the Aggregation Manager Sample, please go to the
Microsoft SQL Server Product Samples: Analysis Services on Codeplex.
Design StrategyWhile there are many approaches and differing ideas on creating aggregations, the strategy below has worked well with many of the customers we have worked with.
1) When creating your aggregations initially with the AS2k5 Aggregation Design Wizard, start with 5-10% optimization. This initial 5-10% is just to provide us a good starting point.
Note, it is almost never a good idea to use more than 40-60% because many aggregations will be built. Having many aggregations may result in a second query preferring to use a smaller aggregation than a previous query used, thus not benefiting from the previously-used aggregation already being in memory. The effect we see is larger disk IO, in spite of the intuitive expected benefit from having more specialized aggregations.
2) Next, we suggest adding aggregations based on a representative set of queries. This strategy is similar to using SQL Server's Index Tuning Wizard. This set may be maintained over time, as you gather more knowledge about how your database is used. It could include a set of badly performing queries so that they could be used for designing aggregations.
As an aside, we notice that often the gathering of representative queries is a best practice for projects, due to the effect of better understanding the business goals and interactions with the end users.
Turn on the AS query log and set the sampling to 1 so that way you can record all query references to all subcubes. For more information on configuring the AS query log, please refer to Edward Melomed's article:
Configuring the Analysis Services Query Log 3) Create a SQL Server Job agent or some other mechanism that will delete the rows (every day or so) within the query log with Duration < 100 ms. Because the sampling is set to 1, all queries will be recorded. But queries that take < 100 ms do not need additional optimization to improve performance. As well, this way you can ensure that you don't have too many rows in the QueryLog table and fill up the database too quickly.
4) Next, run the representative set of MDX queries. By running this well known scenario that you want to optimize for, new rows in the QueryLog table will be created. Over time this step will be repeated, so automation is highly suggested. One way to do this is to record all of the queries using the SQL Server Profiler and use its playback feature.
We also recommend putting the queries into one or more text files and using ascmd.exe to execute them. This also allows comments to be added, for example explaining where the query came from and the business reason for its importance. The ascmd.exe command file is available from
Codeplex Analysis Services Samples 5) Now that you have a QueryLog full of data, we will create a SQL query to select the appropriately "slow" rows from the QueryLog table. Note that these rows correspond to the SQLProfiler trace event "Query Subcube". These represent storage engine queries internal to the SSAS server. Each MDX query may result in 0 or many rows in the QueryLog table.
We will use the same SQL query in the Aggregation Manager tool to determine which new aggregations to be built on top of the existing Aggregation Design.
select MaxDuration/QueryCount AvgDuration, * from (
select count(*) QueryCount, max(duration) MaxDuration, msolap_database, msolap_objectpath, dataset
from OlapQueryLog
group by msolap_database, msolap_objectpath, dataset
) T
where QueryCount >= 1 -- Change if desired
and MaxDuration/QueryCount >= 1 -- Change if desired
and MaxDuration > 100 -- Change if desired, units are millisec
and msolap_objectpath = 'MyObjectPath' -- Change
order by msolap_objectpath, MaxDuration desc
Above is a sample SQL query to do exactly this. Please change the msolap_objectpath and specify the appropriate QueryCount, MaxDuration/QueryCount, and/or MaxDuration values. Ideally we will end up with on the order of 1-30 aggregations. Having thousands or perhaps even hundreds would be far too many for optimization purposes.
One idea is to start by specifying MaxDuration > 5000 (milliseconds), and see how many rows are returned. Try lowering the threshold to 4000, 3000, 2000, 1000, and pick a threshold value. It is not directly clear what the optimal value is, as there is a trade off between the number of unique aggregations, sizes, and queries. Again, too many may result in more overall disk IO and inability to share the same aggregation between different queries, yet too few may result in no aggregations being available for some queries.
6) In general, repeat every week for a month, and then for every month thereafter on your production systems to gradually add aggregations for MDX queries that are slow.
Closing Comments This strategy doesn't necessarily cover all issues as it does not consider combining one or more aggregations (e.g. replace aggs 5, 17, and 33 with a new one that covers them all), does not consider whether aggregations are still needed (e.g. a new larger agg may make an older smaller agg unnecessary), nor is there an estimation of size or performance benefit. But while it does not cover all possible avenues, there is empirical evidence from customer implementations that this approach will allow for the creation of effective aggregations. In the end, you can see that creating aggregations within Analysis Services 2005 is part science and part art. The above design strategy provides a framework for you to be able to make this process straightforward.