
Tuesday, January 31, 2012

Set the Slice on your SSAS Cube Partitions now!

I am pretty sure that most of you have heard or read about this at least once in your SSAS career, but how many of you actually set slices for your partitions? Well, I am also guilty as charged and seldom set slices myself until a recent issue.

1 Setting slice in SSAS cube partitions

Situation : Duh, that’s strange!
It’s a sunny day in Charlotte and I was just playing around with SQL Server Profiler to learn some performance tuning stuff. I executed this simple MDX query on the sample Adventure Works DB for a purpose that is beyond the scope of this post -

select {[Measures].[Internet Sales Amount]} on 0,
{[Ship Date].[Calendar].[Month].&[2008]&[4]} on 1
from [Adventure Works]

Now while I was going through the Profiler trace, I noticed something strange – three partitions (2005, 2006 and 2008) were being read, while I expected just one partition (for 2008) to be read since my query had only April 2008 on the rows.

2 Multiple partitions being read in Profiler trace
Now, for people who are not familiar with the Adventure Works cube, there are four partitions for the Internet Sales measure group, one for each year from 2005 to 2008, as shown below -

4 Partitions in Internet Sales Measure Group

I decided to take a detour from my actual purpose to investigate this issue.

Debugging : Need to dig up that grave!
I was aware that SSAS generates automatic data slices for each partition, and that this information is stored in the info.*.xml file of the partition data folder (you can usually find them under C:\Program Files\Microsoft SQL Server\MSAS10_50.MSSQLSERVER\OLAP\Data, followed by the cube, measure group and partition name, if you haven’t changed the default settings; note that I am using SQL Server 2008 R2). I decided to have a look at these three partition data files.

4 Partition automatic data slices

As seen above, there was clearly a difference between the 2005 / 2006 partitions and the 2008 partition (these have the names Internet_Sales_2001.0.prt / Internet_Sales_2002.0.prt and Internet_Sales_2004.0.prt respectively. The names come from the partition IDs, which can be seen in the ID property of the partitions)

5 Partition ID property

It seemed like the auto-slices were not getting created for the 2005 and 2006 partitions, while they were for the 2008 partition. This was further confirmed by opening and inspecting the info.*.xml files in Notepad.

Knee-jerk Solution : Or is it?
I went into research mode on Google and soon came up with a property in the msmdsrv.ini file which limits the creation of auto-slices based on the number of rows in the partition – the IndexBuildThreshold property.

6 Modifying the msmdsrv.ini file

By default, the value is 4096 rows, and both the 2005 and 2006 partitions had fewer than 4,000 rows. This explained why the auto-slices were not getting created, so I modified the threshold to 1000. On reprocessing the measure group, I was able to see that the auto-slices were generated in the partition data files and only the correct partition was being read in the Profiler trace.
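For reference, this is roughly where the property sits in msmdsrv.ini (a sketch with the surrounding elements trimmed; the nesting mirrors the OLAP \ Process \ IndexBuildThreshold server property, and do back the file up before editing):

<ConfigurationSettings>
  <OLAP>
    <Process>
      <!-- partitions with fewer rows than this get no indexes or auto-slices -->
      <IndexBuildThreshold>4096</IndexBuildThreshold>
    </Process>
  </OLAP>
</ConfigurationSettings>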

Actual Solution : The real deal!
I decided to gloat over my findings to my friend and colleague, Javier Guillen (blog | twitter), who was sitting in the next room. When I described the problem statement to him, the very first question he asked was whether I had set the partition slices, and I was like – sheesh, I had missed that cardinal rule, as usual. I was under the impression that as long as a partition holds a single data member (as in this case, a single year), it should work. But I later found out that partitions for which auto-slices are not generated will be scanned anyway, unless the partition slices are explicitly defined. I went back, set the slice for each partition to its respective year, and reverted the IndexBuildThreshold property to the original value of 4096.

7 Setting the partition slices
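In case you have never set one: the Slice property of a partition takes an MDX member (or tuple) expression. A minimal sketch for the 2008 partition, assuming the standard Adventure Works Date dimension (repeat with the matching year for each partition):

-- value of the Slice property for the Internet Sales 2008 partition
[Date].[Calendar Year].&[2008]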

Once this was done, I processed my measure group and confirmed from the Profiler trace that only the correct partitions were being read.

Reading Materials
I took some time to read and learn more about partition slices; I thought you would be interested in it too if you made it this far.

1) Get most out of partition slices - Mosha Pasumansky
2) SSAS Partition Slicing - Eric Jacobsen
3) Partition Slice Impact on Query Performance - Jesse Orosz

Sunday, January 22, 2012

Using TraceEvent for MDX Debugging

Do you know how it feels to hear an old song and have a whole lot of memories come flooding back to you? Well, something sort of similar happened to me today. I was doing my weekly reading and going through a brilliant post by Microsoft MVP Greg Galloway (blog). That was when I noticed that he had released a new functionality to the Analysis Services Stored Procedures (ASSP) project on CodePlex. I decided to test this new functionality and was thinking of a scenario where I could use it, when I remembered an old post in the Analysis Services forum that I had intended to blog about but never did.

New Tool in the ASSP Toolkit

Now the new addition to the ASSP toolkit is the TraceEvent class. Quoting from the ASSP project page -

“The function in the TraceEvent class is FireTraceEventAndReturnValue. It can be used to detect when Analysis Services is evaluating an expression since the function merely fires a trace event that appears in Profiler as a User Defined event, then returns the integer value you pass in. This function can be used in a scenario such as determining when the cell security expression is evaluated.

Here is an example MDX query displaying the usage of this function:

with member [Measures].[test] as ASSP.FireTraceEventAndReturnValue(99)
select [Measures].[test] on 0
from [Adventure Works]

The signature of the function is:
FireTraceEventAndReturnValue(int)”
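One handy consequence before we dive in: since the function simply returns the integer you pass in, you can also feed it a measure – the value gets cast to an integer and shows up in the IntegerData column of the trace, which is exactly what I will do in the debugging queries below. A minimal sketch (the member name [Measures].[Probe] is just my own label):

with member [Measures].[Probe] as
ASSP.FireTraceEventAndReturnValue([Measures].[Internet Sales Amount])
select [Measures].[Probe] on 0
from [Adventure Works]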


Having a background in programming, I naturally expected a sort of “watch window” when I started learning MDX. I wanted to view the values and how they changed while the query was being debugged, but sadly there was none. The TraceEvent class helps alleviate that pain to some extent, as I will show in the next few minutes. For the purpose of demonstration, I will take the two queries from the AS forum post I mentioned before -


I) PROBLEM


Query 1



with
member [Product].[Product Categories].[Agg] as Aggregate(TopCount([Product].[Product Categories].[Product].members, 3, [Measures].[Internet Sales Amount]))
select {[Measures].[Internet Sales Amount]} on 0,
[Date].[Calendar].[Month].members on 1
from [Adventure Works]
where ([Product].[Product Categories].[Agg])

Query 2



select {[Measures].[Internet Sales Amount]} on 0,
[Date].[Calendar].[Month].members on 1
from [Adventure Works]
where (TopCount([Product].[Product Categories].[Product].members, 3, [Measures].[Internet Sales Amount]))

Results


1 MDX Query results


As an MDX beginner, you would expect the results to be the same for both queries, as you are just moving the formula of the calculated member in Query 1 into the WHERE clause of Query 2. The reason for the difference is that calculations overwrite the current coordinates, while axes don't overwrite the current context. So when the first query is executed, the current coordinates are overwritten and the calculation is evaluated in the context of the query axis, which means it checks the top 3 products for each month and displays the result. In the second case, however, there is no calculation defined, so the current coordinates are the ones in the WHERE clause – the top count of products across all months. (You will get a better idea by reading the original post in the forum.)


Enough of theory – let us see whether we can debug and understand the working of these MDX queries using the TraceEvent class. To see the results, we will modify the queries as shown below and watch them in Profiler.


II) DEBUGGING THE PROBLEM


Query 1 for July 2005



with
member [Product].[Product Categories].[Agg] as Aggregate(TopCount([Product].[Product Categories].[Product].members, 3, ASSP.FireTraceEventAndReturnValue([Measures].[Internet Sales Amount])))
select {[Measures].[Internet Sales Amount]} on 0,
[Date].[Calendar].[Month].&[2005]&[7] on 1
from [Adventure Works]
where ([Product].[Product Categories].[Agg])

Profiler Results for Query 1


2 Query 1 Profiler results


Even without knowing the theory, you can see that the measure gets evaluated at 4, which is after the axes get evaluated (1, 2 and 3 in the image above); hence the top 3 products are calculated for the current context of July 2005. I have not pasted the entire trace here, but if you run it you can see that the measure value for each product for the month of July 2005 is returned in the IntegerData column. I have marked the top 3 values in red, and you can see that they correspond to the $261,231 we saw in the results for the first query.


Query 2 for July 2005



select {[Measures].[Internet Sales Amount]} on 0,
[Date].[Calendar].[Month].&[2005]&[7] on 1
from [Adventure Works]
where (TopCount([Product].[Product Categories].[Product].members, 3, ASSP.FireTraceEventAndReturnValue([Measures].[Internet Sales Amount])))

Profiler Results for Query 2


3 Query 2 Profiler results


Here, you can see that the measure gets evaluated at 1, which is before the axes get evaluated (3 and 4 in the image above). I have also marked the top three values for the products, and on close examination they turn out to be the top 3 values for products across all months.


4 Top 3 products for all months


So once the top products are found, the axes are evaluated and then the results are serialized at 5, which gives us the value of $221,852 – exactly what we saw in the results for the second query.


III) SUMMARY


Even without knowing the theory, we were able to debug the MDX queries and find out why there was a difference, with the help of the TraceEvent function. This technique can be elaborated and used across various scenarios for query debugging. In short, TraceEvent is a great function to have when you don’t know why you are getting particular results from your queries. And yeah, don’t forget to thank Greg for this! :)

Sunday, January 15, 2012

Reducing SSRS RDL size by uploading Shapefiles to ReportServer

I don’t know about you guys, but I am really petrified of public speaking. So I always try to read about it, and it really gives me some comfort to know that I am not alone. In fact, if your fear of public speaking is between mild anxiety and complete terror, you are said to be well within the normal range. I would not be exaggerating if I said that some studies show there are people who rate their fear of public speaking as more severe than their fear of death. Last year was when I finally decided to embrace my fear, and I managed to speak at a local user group event in London and also at SQLBits (trust me, it wasn’t as hard as I thought it would be). And since now is the time everyone makes resolutions, I was trying to take some hints from 50 New Year’s Resolutions for Public Speakers and watching my presentation video from the last SQLBits. Watching it, I realized I had missed mentioning an important point regarding map reports, and hence decided to blog about it before I get too lazy.

REDUCING RDL SIZE BY UPLOADING SHAPEFILES TO REPORTSERVER

People who are familiar with map report development will already be aware of the size issues when dealing with shapefiles (if you are a complete newbie on map reports, I would strongly recommend going through my presentation video). Most shapefiles are in the MBs, and since this data is embedded in the report RDL, the size of the RDL also goes into MBs. The problem as well as the solution are demonstrated below:

1) Make a simple report from a shapefile by selecting the Basic Map option in the Map Wizard. I haven’t linked the shapefile to a dataset, for simplicity.

1 map report from Colombia shapefile

2) Now you can verify that the shapefile data has been embedded in the report by right-clicking on the report in Solution Explorer and selecting the View Code option.

2 embedded spatial data in report rdl

You can see that the territory names are there in the rdl code.

3) Now deploy the report to the report server and then download the RDL to check its size.

3 Shapefile rdl size

You can see that the RDL size is 3.11 MB. The shapefile that I had used for this report was 2.4 MB.

4) Now, to solve this issue, we will have to upload the shapefile (.shp and .dbf files) to the report server. For this, navigate to the folder in Report Manager and click on Upload. Then browse to the .shp and .dbf files and click OK.

4 Upload shp and dbf to reportserver

5) Once this is done, go back to the report in BIDS and click on the map twice to bring up the Map Layers panel on the right. Right-click on the polygon layer and select the Layer Data option.

5 Selecting layer data in map layers

6) Now change the option from Data Embedded in Report to Link to ESRI Shapefile, and give the location of the files in your report server (in this case, \Blog Reports\COL_adm1.shp). Note that the location has to start with a slash, followed by the folder names within Report Manager and then the file name of the .shp file.

6 Linking the map to the uploaded files in reportserver 

7) Once this is done, you will not be able to preview the results in BIDS. You can view the report code to confirm that the map data is no longer embedded in the RDL. Just deploy the report after that and preview it in Report Manager to ensure that it is working.
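If you are curious what the change looks like in the report code, the polygon layer now carries just a reference to the uploaded file instead of the embedded points – roughly along these lines (a sketch based on the RDL MapShapefile element; the exact structure in your generated RDL may differ):

<MapPolygonLayer Name="PolygonLayer1">
  <MapShapefile>
    <!-- path to the .shp uploaded to the report server -->
    <Source>\Blog Reports\COL_adm1.shp</Source>
  </MapShapefile>
</MapPolygonLayer>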

8) Now, download the RDL from the report server and you can see that the size has been considerably reduced – from over 3 MB to 12.2 KB.

7 Modified Shapefile reduced rdl size

It would actually be a good practice to store your shapefiles in the report server at the end of your development. That way, the shapefiles can be reused by multiple reports if needed, and you would not need to open BIDS if an updated version of the shapefile becomes available later. This technique can also be used for report images, though I see little benefit there as images are usually only a few KBs. As for performance, I quickly checked a couple of times and the rendering time was a little less with this technique. Maybe I will do a detailed performance test and post a blog on it later. Till then, adios amigos!

Sunday, January 8, 2012

A New Year and a New Job

For people who are connected with me, this would not come as much of a shock, but yes, with this post comes a big change for me. By the time this post gets published, I will have already moved on from MindTree Ltd and started work at Mariner, my new workplace. This also means that I have changed my base from London, UK to Charlotte, USA.

Mariner

I have lots of fond memories about MindTree and the decision to part was indeed very difficult. I will definitely miss my colleagues there and wish them all the very best for the future. At the same time, I am also very excited to join Mariner and look forward to working and learning from some of the most talented people I have known. Hopefully, I will also be able to contribute something.

The new year has got me a new job and I am looking forward to a lot of things. I would like to end this post with a New Year wish that I had read sometime back, for all of my readers-

I hope you will have a wonderful year, that you'll dream dangerously and outrageously, that you'll make something that didn't exist before you made it, that you will be loved and that you will be liked, and that you will have people to love and to like in return. And, most importantly (because I think there should be more kindness and more wisdom in the world right now), that you will, when you need to be, be wise, and that you will always be kind. I hope that in this year to come, you make mistakes. Because if you are making mistakes, then you are making new things, trying new things, learning, living, pushing yourself, changing yourself, changing your world. You're doing things you've never done before, and more importantly, you're Doing Something. Whatever it is you're scared of doing, Do it. And I hope, somewhere in the next year, you surprise yourself!