Pages

Wednesday, March 30, 2011

Crappy Code Games 2011 – London

Great sponsors, wonderful people and awesome fun – this is how I would summarize the very first version of Crappy Code Games in London.Crappy Code Games 2011

For the uninitiated, Fusion-IO and SQLBits had coordinated to come up with a challenge for all the sql coders - to write the worst code. Yups, you heard it right, the worst code. Further details can be found on Simon Sabin’s (blog ¦ twitter) post.

To recap, I was one of the very first to arrive around 6.00pm at the venue – Revolution Bar in Tower Hill. The venue started filling in by 7.00pm and it was so much of fun walking around and talking with all the SQL peeps. Oh ok, while the lesser mortals like me were busy drinking up the free beer and food, the Gods sat around the system, racking their brains and trying out the challenges. It looks like I was sort of a lucky mascot today, 3 people who were standing near me got an Apple Ipod in the lucky draw and Neil Hambly (blog ¦ twitter), whom I was cheering, ended up getting the first prize in the event.

Neil Hambly at work

The ambience was pretty dark, so couldn’t get any nice photos but I couldn’t resist the urge to paste this half-baked blurry photo of the victors in the podium.

Neil Hambly and the winners with their prizes

A Sony Vaio laptop and 2 XBOXes, the sponsors rocked! Thanks a lot Fusion-IO, once again. If you guys are wondering why I am thanking Fusion-IO again and again, check out the free goodies which I got.Crappy Code goodies

All warmed up for the SQLBits event at Brighton now!

Tuesday, March 15, 2011

Where to store calculated measures – SSAS cube or SSRS reports?

As I mentioned in one of my previous posts, nowadays I am busy preparing the design document of my new project and sending it around for reviews. If there is one question I could pick from the list that keeps coming back to me frequently, that would be this – why are you recommending to store all the report level calculations in the cube rather than in the reports? Usually I get away by just answering it is better for query performance and the reports would come out faster. Of course, I didn’t see any value explaining the technical behind-the-scenes part on why the reports would come out faster to my reviewers (ok, I admit, I am just lazy). But this time, one of my over-zealous peers was adamant on knowing the reason and I decided to demonstrate it to him finally.

Store calculated measure in cube demo

Before I illustrate the actual demonstration, it is necessary to understand the importance of caching in SSAS. Designing a system which will optimally utilize the caching feature is very critical for query performance and user experience. To quote SSAS 2008 Performance Guide

“ During the execution of an MDX query, the query processor stores calculation results in the query processor cache. The primary benefits of the cache are to optimize the evaluation of calculations and to support reuse of calculation results across users. To understand how the query processor uses caching during query execution, consider the following example. You have a calculated member called Profit Margin. When an MDX query requests Profit Margin by Sales Territory, the query processor caches the non null Profit Margin values for each Sales Territory. To manage the reuse of the cached results across users, the query processor distinguishes different contexts in the cache:

Query Context—contains the result of any calculations created by using the WITH keyword within a query. The query context is created on demand and terminates when the query is over. Therefore, the cache of the query context is not shared across queries in a session.
Session Context —contains the result of any calculations created by using the CREATE statement within a given session. The cache of the session context is reused from request to request in the same session, but it is not shared across sessions.
Global Context —contains the result of any calculations that are shared among users. The cache of the global context can be shared across sessions if the sessions share the same security roles.

The contexts are tiered in terms of their level of reusage. At the top, the query context is can be reused only within the query. At the bottom, the global context has the greatest potential for reusage across multiple sessions and users. “

So, in short, calculations defined in Query Context can be reused only within the query while calculations defined in Global context can be reused across sessions. This means that measures which are defined within the global scope will come out faster after the first time it has been run (since it is already there in the cache) compared to the query scope measures.

Now let us see what happens when we execute an MDX query with the same calculation defined within a report and within a cube. For the demonstration, I would be using the Adventure Works cube and the following calculation – ([Measures].[Internet Sales Amount] / 2) +1

Option 1 – Make calculated member within SSRS Report

1) Make a simple report dataset with the following query

WITH MEMBER [measures].[non cache] as
[Measures].[Internet Sales Amount]/2 +1
SELECT NON EMPTY { [measures].[non cache] } ON COLUMNS,
NON EMPTY { ([Customer].[Customer Geography].[State-Province].ALLMEMBERS ) }  ON ROWS
FROM [Adventure Works]

non cache is the name of the new calculated measure.

2) Use the fields from the query to make a simple table in the layout.

Report Layout

3) Before previewing the report, clear the cache of the database cache by running the following XMLA query in Management Studio.

<ClearCache xmlns="http://schemas.microsoft.com/analysisservices/2003/engine">
  <Object>
    <DatabaseID>Adventure Works DW</DatabaseID>
  </Object>
</ClearCache>

4) Keep SQL Server profiler ready and preview the report now. You would be getting the below activity in the trace

1 query scope measure - after cache clearing

We can see some Progress Report Begin and End events which indicates that some I/O is happening to fetch the data. The EventSubClass column of Query Subcube event clearly says that the data being fetched is non-cache data, as expected.

5) Now refresh the report once again and check out the profiler activity.

2 query scope measure - using cache

We can see that there is a Get Data from Cache event. But on further observation, we can see that this data is from the measure group cache, which means that this is from the cache of the storage engine. Data would need to be passed back to the formula engine to execute any possible calculations, as is shown by the following AS querying architecture diagram.

MDX Query execution architecture

The formula Engine processes MDX Queries, figures out what data is needed to cover them, and forwards this information to the storage engine for retrieval. It then performs all calculations needed for the query. The Storage Engine handles all the reading and writing of data. It also retrieves and aggregates the data the formula engine requested at query run time.

Now let us see what happens when we define the calculation inside the cube and use it in our reports.

Option 2 – Make calculated member within SSAS Cube

1) Make the calculation in the calculated member script of the cube as shown below

Create measure in cube

cached is the name of the new calculated measure.

2) Edit the dataset query in the report to use the new measure

SELECT NON EMPTY { [measures].[cached] } ON COLUMNS,
NON EMPTY { ([Customer].[Customer Geography].[State-Province].ALLMEMBERS ) } ON ROWS
FROM [Adventure Works]

3) Make the appropriate changes in the table also to include cached instead of non cache measure.

Report Layout for cached measure

4) Before previewing the report, clear the cache of the database cache by running the following XMLA query in Management Studio.

<ClearCache xmlns="http://schemas.microsoft.com/analysisservices/2003/engine">
<Object>
<DatabaseID>Adventure Works DW</DatabaseID>
</Object>
</ClearCache>

5) Keep SQL Server profiler ready and preview the report now. You would be getting the below activity in the trace

3 global scope measure - after cache clearing

We can see that the activities and time are pretty much the same compared to the first approach when the cache was cleared.

6) Now refresh the report one more time and have a look at the profiler activity.

4 global scope measure - using cache

Voila! Now we can see that there is a Get Data from Cache event and the Text Data clearly says that it is of Global Scope. The description in the EventSubClass column says that the data is from the calculation cache which means it is from the cache of the formula engine.

Note : For the more inquisitive minds, the three main EventSubClass column values of the Get Data from Cache event are

  • 1 – Get data from measure group cache :- storage engine cache, so data from here would need to be passed back to the formula engine to have any calculations executed. This could mean that something in your query is not allowing the calculations to be cached in the formula engine. Some
  • 2 – Get data from flat cache :- formula engine cache, but this is used to store individual cell values and is only used when a query is executing in cell-by-cell mode. So if you see this it is an indication that your query is not executing in block mode
  • 3 – Get data from calculation cache :- formula engine cache and is probably the best type of cache because it is used in block mode and stores the final results of any calculation.

Conclusion

Calculated measures defined within the report use the query context for caching while those defined within the cube uses the global context. Hence, from a query performance angle, it is always better to define measures within the cube than in the reports as the global context would be much faster after the first time the report has been run and also because the cache can be re-used across sessions.

References

This is an advanced topic and at least some of you would have found it a bit confusing (now you know why I don’t explain such stuff! Winking smile). You might want to take your time and go through the following references to know more about Storage Engine & Formula Engine or MDX performance tuning in general.

1) Microsoft SQL Server 2008 Analysis Services Unleashed (Book)
2) SSAS 2008 Performance Guide (Whitepaper)
3) Identifying and Resolving MDX Bottlenecks (Whitepaper)
4) Blogs by Mosha Pasumansky particularly
      a) Inside OLAP Engine: SE cache sharing
      b) Inside OLAP Engine: Cache Prefetching
      c) Take advantage of FE caching to optimize MDX performance
5) Blogs by Chris Webb particularly
      a) Formula Caching and Query Scope
      b) Reporting Services-generated MDX, Subselects and Formula Caching
      c) Missing Members and the Formula Engine Cache
      d) Now() and the Formula Cache
      e) Arbitrary-shaped sets and the Storage Engine cache

Saturday, March 12, 2011

SQL Server Tsql Quiz 2011

Are you one of those who have always wanted to own an Ipad the moment it hit the stores and yet never did because it could burn a hole in your pocket? Are you one of those who like pitting their SQL brains against the rest of the universe and showing them who’s daddy? Or are you one of those who like to read and increase knowledge about SQL Server? If you have answered yes to any of the above questions, it’s time to check out the new boy in the town.

BR TSQL Quiz March 2011

BeyondRelational has coordinated with 31 Microsoft MVPs / experts and come up with this SQL Server feast for all you aficionados. You get to answer questions from some of the best SQL brains out there – every day of the month! And best of all, you even get a chance to win an Ipad sponsored by Redgate!! Have a sneak peek of the quiz masters below

BR Quiz masters

Don’t forget to check out the upcoming quizzes while you are there. And ofcourse, dare you forget to thank me when you get that nice little Ipad Smile.