Author: Denny Lee Contributors / Reviewers: Richard Tkachuk, Akshai Mirchandani, Robert Zare, Thomas Kejser, Gang Xiao, Cristian Petculescu
From the early days of Microsoft® SQL Server® Analysis Services, one of the query performance optimization techniques was for developers to use the usage-based optimization feature. Developers would create their OLAP cube and cube partitions and then set the aggregations for the cube at a low percentage. It was common to see developers set the aggregations for their partitions at 0% - 5%. After the OLAP database was deployed to the production servers, there would be a beta test period for users to work with the cube and to store the components of the queries executed against the cube. Afterwards developers could build aggregations based on what users had queried, typically optimizing for the slowest queries. After this was done, these aggregations could significantly improve the performance of these initially slow queries. With the release of SQL Server 2008 Analysis Services, an improved usage-based optimization algorithm has been included that enhances Analysis Services’ ability to create better aggregations.
To write this technical note, we performed some extensive testing on various customer databases to provide the lessons learned below. To do this, the query and database criteria were that the queries resulted in a lot of storage engine-heavy queries (as opposed to formula engine-heavy queries). This could be identified by viewing the Analysis Services profiler trace and noting that there were a lot of Event Class 12\Event SubClass 22 events (Query SubCube Verbose\Non Cache data). Specifically, we wanted queries that were not able to use the cache and needed to hit the file system.
Upon the application of usage-based optimization, we verified that aggregations were being used by verifying the Event Class 60 (Get Data from Aggregation) events in the Analysis Services profiler trace in SQL Server Profiler. For more information about the Analysis Services events, see Query Events in SQL Server Books Online. As well, you can find a translation of the events within the [Analysis Services Root Folder]\OLAP\bin\Resources\1033\tracedefinition100.xml file. For more information about how to use the Profiler to identify query performance bottlenecks, see Identifying and Resolving MDX Query Performance Bottlenecks in SQL Server 2005 Analysis Services.
If usage-based optimization is used correctly, you can consistently reduce the length of long-running queries by using usage-based optimization to create smart aggregations, as shown in Figure 1.
Figure 1: Using usage-based optimization to improve query performance
We learned the following key lessons for using the usage-based optimization feature of SQL Server 2008 Analysis Services:
For more information about the results and tests performed, please read-ahead – if you’re not interested – that’s it! :-)
As noted above, we executed a large number of queries on various customer databases. Table 1 contains twelve queries that are representative of the larger customer query tests where we compared the query times before and after we applied usage-based optimization. (Times are in milliseconds.)
All of the tests (including the values in Table 1) had the cache and file system cleared before query execution to ensure that the query results were not obtained from the cache.
As can be seen in Figure 2, queries Q1 and Q2 show a two-order of magnitude improvement in query performance. But, in the case of query Q3, there is a two-order of magnitude performance degradation in query performance. This occurred because Q3 was already very fast(337 milliseconds) due to an existing set of aggregations. When the usage-based optimization aggregations were applied, they replaced the existing aggregations instead of adding to them, This replacement caused the performance degradation with Q3.
Figure 2: First set of usage-based optimization query tests
When we reapplied the aggregations such that both the usage-based optimization aggregations and the original aggregations were applied, all three queries had return times in the hundreds of milliseconds, that is, less than one second.
Lesson Learned: While this is not really specific to usage-based optimization aggretations, remember to be careful and ensure that any new aggregations that you create with usage-based optimization do not override existing aggregations that you may want to keep.
Figure 3 breaks down query Q5, which shows the same query execution times when no aggregations, handmade aggregations (Before UBO), and usage-based optimization aggregations (After UBO). As seen here, the application of usage-based optimization may sometimes result in the creation of aggregations that may slow down query performance.
Figure 3: Application of usage-based optimization may be slower
There are times when the creation of any aggregations may slow down query performance. There are any number of reasons that could cause this, ranging from cube design to the way the MDX statements were written. But as you can see in Figure 3, when aggregations were removed from the cube and the query was re-executed, query performance actually improved. As noted in the Table 2, even though the number of subcube requests to the storage engine were the same, the duration is significantly higher for usage-based optimization.
Table 2: Usage-based optimization has longer duration to query process the same number of events
Lesson Learned: Make sure to test all of your queries with your original aggregations (if you had any) and without any aggregations at all. There are times when the Analysis Services MOLAP engine can scan through the file system faster than aggregations can be gone through. This can be especially apparent in the case where the size of the aggregations is larger than the data itself.
As well, the “Before UBO” query in Figure 3 was against a cube where handmade aggregations were made following the guidelines within the Analysis Services 2005 Design Strategy technical note. If we discount the “No Agg” query, it becomes apparent that there are times where handmade aggregations can be faster. This is because when you use the Aggregation Manager to design your own aggregations, you can optimize for that specific query. Note that you can also use usage-based optimization to design aggregations for a single query. But the advantage of usage-based optimization is that it can create and combine aggregations for all of your select queries (for example,replace aggregations 5, 17, and 33 with a new one that covers them all).
Lesson Learned: To improve the performance of individual queries, sometimes you can get faster query performance by creating handmade aggregations because you can optimize for that specific query.
As can be seen in Figure 4, all of these queries here had relatively fast query times – that is, all of them were under 5s query time. In some cases, usage-based optimiation had improved performance (Q9, Q6) while in some other cases, there was a negligible performance degradation (Q7, Q8) that may or may not have anything to do with the use of aggregations.
Figure 4: Usage-based optimization query comparison for fast queries
Nevertheless, the key lesson learned here is that the application of usage-based optimization aggregations may have very little effect when you apply them to fast queries.
Lesson Learned: Make sure that when you apply usage-based optimization, you apply it to your long-running queries only. Usage-based optimiation is a complex algorithm that creates aggregations based on the queries that you select. If you end up selecting many queries that are already fast, the impact of these aggregations will only be slight and may even reduce the impact of your long-running queries.
After we take into account the above lessons learned, we can consistently reduce the length of long-running queries by using usage-based optimization to create smart aggregations, as shown in Figure 5.
Figure 5: Using usage-based optimization to improve query performance
A summary of the recommendations to optimize your use of usage-based optimization can be found in the Recommendation section earlier in this technical note. Ultimately, these tests and lessons learned provided us with some valuable lessons learned and affirmation of this feature.
Pingback from 2005 oder 2008 | hilpers
Author: Carl Rabeler Reviewers: Mark Souza, Prem Mehra, Lindsey Allen, Mike Weiner, James Podgorski,
Pingback from Reintroducing Usage-Based Optimization in SQL Server 2008 Analysis Services « Denny Lee's SQL BI and DW Musings
Pingback from Using Trace to Monitor and Audit Analysis Services | MarkTab Data Mining
Pingback from Resource Monitoring | MarkTab Data Mining
Here are this and some other articles on SSAS Performance:
<a href="ssas-wiki.com/.../Articles
ssas-wiki.com/.../Articles
Pingback from Interpretation of column “Dataset” in Query Log - Windows 8 Guide | Server 2008 | Sql Servers | Tutorial
Pingback from Exam 70-448 Study Guide | SQL Server related thoughts
Pingback from Top 10 Performance and Productivity Reasons to Use SQL Server 2008 for Your Business Intelligence Solutions | Developers Blog
Can you provide us with the queries Q1-Q12? I'm using the UBO and not seeing as much of a performance improvement as you got. This makes me wonder what the complexity difference is between our queries and if that even plays a factor.
Nice article , help me lot regarding aggregation design concepts.