
Thursday, June 21, 2012

Going beyond Geospatial Analysis with SSRS Map Reports

When I came to the US six months ago, I had set myself a goal of doing more presentations than I did last year in the UK (which was two – one at the London BI User Group and another at SQLBits 9). Well, I am glad to say that I achieved this goal within the last two months, in which I presented at four SQL Saturdays (#118, #130, #129 & #121), one code camp (Carolina Code Camp) and at my local UG in Charlotte. I already have two other SQL Saturdays confirmed – #122 (Louisville, KY) on July 21 and #158 (New York City) on August 4 (so if you are around, please don’t forget to come over and say hi). One of my favorite topics to present on is Map Reports in SSRS, and a point that I just can’t stress enough is that the map reports feature can be used for much more than geospatial analysis. You don’t necessarily have to restrict yourself to geographical boundaries, and this post gives an example of just that. (By the way, if you are reading this post before June 26 and are interested in learning more about map reports, you might want to attend my online session – ‘Fast Track to Spatial Reporting using SSRS 2012’ – through the PASS BI Virtual Chapter.)


The inspiration for this post is a demo from the Tableau product gallery, where tooth decay is visualized using an image of a tooth set, and I thought of replicating it in SSRS. Follow the steps below to make a simple report:-

1) Get an image of the tooth set, and then trace the coordinates using an online image-mapping site like Image-Maps. Once you get the coordinates, you can convert them into polygons using the SQL function geometry::STPolyFromText. You can download the SQL query for the tooth set that I created from the image below.

There are many other ways to obtain the same results, though I usually use Image-Maps because it is free.
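Just to illustrate the idea, here is a minimal T-SQL sketch of how a set of traced coordinates becomes a polygon (the coordinates below are made-up placeholders, not an actual tooth outline):

-- Build a polygon from a traced outline expressed as Well-Known Text (WKT).
-- The first and last points must match so the ring is closed. The second
-- argument (0) is the SRID, which is fine for planar image coordinates.
SELECT geometry::STPolyFromText(
    'POLYGON ((10 20, 40 20, 45 60, 15 65, 10 20))', 0
) AS Geom;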

2) Create a table called dbo.Teeth and store the results of the above query in this table.
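If you prefer to script the table out yourself, a minimal sketch could look like the one below (the column names Tooth and Geom are just my assumptions here – use whatever the downloaded query returns):

-- Spatial table: one row per tooth, holding its traced polygon
CREATE TABLE dbo.Teeth (
    Tooth varchar(50) NOT NULL, -- tooth name, used later as the match field
    Geom  geometry    NOT NULL  -- polygon built with geometry::STPolyFromText
);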

3) Create a table called dbo.[Teeth Data] and store the results of the CSV sheet below in this table.
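Again, a rough sketch of the table, assuming the CSV carries just a tooth name and a decay measure (Name and Decay_Scale are the fields the wizard uses later; adjust the types to match your data):

-- Analytical table: the measure that will be visualized on the map
CREATE TABLE dbo.[Teeth Data] (
    Name        varchar(50) NOT NULL, -- matches dbo.Teeth.Tooth
    Decay_Scale int         NOT NULL  -- measure used for the color rule
);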

4) Now that you have both the spatial data and the analytical data, create a new report and make two datasets – DST_Spatial for the spatial dataset (which will have the columns from the table dbo.Teeth) and DST_Analytical for the analytical dataset (which will have the columns from the table dbo.[Teeth Data]).
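The dataset queries themselves can be as simple as the sketch below (assuming the column names from the earlier sketches):

-- DST_Spatial: the tooth polygons
SELECT Tooth, Geom FROM dbo.Teeth;

-- DST_Analytical: the measure to visualize
SELECT Name, Decay_Scale FROM dbo.[Teeth Data];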


5) Drag and drop the map report item from the toolbox. In the wizard, select SQL Server spatial query as the source of spatial data and choose DST_Spatial as the spatial dataset. Select Color Analytical Map as the map visualization and choose DST_Analytical as the analytical dataset. The match fields should be Name and Tooth, as shown below:-

[Image: the wizard’s match fields dialog, with Name matched to Tooth]

6) Select the Decay_Scale measure as the field to visualize and click Finish. Now you should be able to see the map report item in design mode.

[Image: the map report item in design mode]

7) Click Preview and you should be able to see the result.

[Image: the report in preview mode]

Here, I have just made a very simple report, and it can be extended with all the common SSRS features like drill-down, report parameters, color formatting, tooltips, labels, etc. Compare this sort of spatial report with a tabular one: a dentist might make much more sense of how tooth decay affects neighboring teeth from the former, which is what makes it so useful. Hopefully, this short post will make you think of the map report item in a new light.

Monday, June 18, 2012

Book Review: Visualizing Data with Microsoft Power View

“A truly good book teaches me better than to read it. I must soon lay it down, and commence living on its hint. What I began by reading, I must finish by acting” – I still remember the first time I read this quote in an article about Henry David Thoreau. Even though it was a long time ago, I can’t forget the ample amount of time I spent pondering over the depth of these seemingly simple words. It is arguable whether the meaning of these great words ever dawned upon the 14-year-old that I was at the time, but what I can say with conviction is that the new book I am reviewing made me remember and reflect on it once again.
Buy "Visualizing Data with Microsoft Power View"

SQL Server 2012 has been an exciting release for techno geeks like me, and Power View has been one of the star attractions. Being a reporting enthusiast, I spared no pains in getting my laptop upgraded so that it met the bare minimum requirements for testing out Power View. From the little time I spent with Power View, I found it to be really simple and easy to understand. Hence, when I heard that there was a new book on Power View coming to the stands, my first question was – how on earth could there be enough material to fill a whole book on Power View? I was pleasantly surprised when I received my copy of “Visualizing Data with Microsoft Power View”.

The authors (Brian Larson, Mark Davis, Dan English and Paul Purington) have done a splendid job of explaining all the features of Power View in an easily understandable format. Even though the target audience for this book is Power View beginners, including non-technical business users, it also gives professionals who are already acquainted with the tool (like me) an opportunity to review all the functionality and fill the gaps in their learning.

Let me summarize the contents in a nutshell – the initial chapter takes the reader through an introduction to Power View, and chapters 2, 3 and 4 concentrate on the different visualizations available in Power View. Chapter 5 builds on the previous chapters and discusses the interactive features of Power View (this is the chapter where my favourite feature of Power View is explained – the Play axis). Chapter 6 gives a fitting end to the Power View part by explaining how to save, secure, print and export Power View reports. Apart from the above 6 chapters, there are another 5 chapters on how to create a BI Semantic Model (BISM). This is important, as Power View requires this layer called BISM between the report and the data. As the author rightly says in the video, this layer might already be built for the users by the IT team. But there is nothing to fear even if this is not the case, as these chapters give a good introduction to BISM. They are not intended for readers trying to gain advanced knowledge of BISM – there are other books in the market for that. But if your main intention is to create Power View reports and you would like to learn the basics of BISM for that purpose, there is no better book than this at the moment.

The book also comes with an accompanying DVD containing more than 4 hours of video demonstrations for people who like to learn visually, and the appendix section gives instructions on how to set up a virtual learning environment as well as how to configure the sample data. With all that said, I can say this book is a must-have for anyone interested in using Power View reports or looking to expand their knowledge of Power View. Go ahead and buy the paperback edition from this link or get it on your Kindle from this link (it is at a very affordable price too). Meanwhile, I will be having fun going through the exercises. As I said in the beginning – what I began by reading, I must finish by acting.

Thursday, June 7, 2012

LastNonEmpty in Tabular mode: Part 1

My experience as a consultant has mostly been in the manufacturing sector, so it is no wonder that I had frequent encounters with the notorious LastNonEmpty aggregation type in SSAS. I wouldn’t be exaggerating if I said that I struggled quite a bit to understand and adapt this aggregation type to my needs when dealing with inventory, stock and status related data. I had dealt with varying scenarios but never thought of classifying these findings until a chance discussion with Javier Guillen (blog | twitter) prompted me to do so. Together, we went through the scenarios, tried classifying them and applied them to the tabular mode, and this post is the first part of that effort.


For the sake of simplicity, we adapted our discussion to the manufacturing scenario, and the key dimensions used in the explanation below are Time, Store and Product. However, this can easily be translated to other scenarios where LastNonEmpty might be required. The LastNonEmpty (LNE) scenarios can be broadly classified into two:- the typical Last Non Empty behavior of SSAS (where you want to find the last non empty value within the selected time period) and the Last Ever Non Empty behavior (credit for the term goes to Chris Webb in this post), where you want to find the last non empty value from all preceding time periods instead of just the selected time period. Each of these types can be applied either at an overall level or at a store level, as shown in the chart below.

[Image: chart classifying the four LNE scenarios]

Let me briefly try to explain the four scenarios with an example.

[The original figure, reconstructed as a table – the stock recorded for each store:]

Group   Store   Date recorded   Stock
C1      S1      31-May-2012     5
C1      S2      31-May-2012     5
C2      S3      12-Jan-2012     10
C2      S4      12-May-2012     25
C3      S5      12-Nov-2011     10
C3      S6      12-Dec-2011     15
C4      S7      12-Nov-2011     5
C4      S8      12-Dec-2011     10

Assume that S1 - S8 are stores and C1, C2, C3 and C4 are their parent groups, as shown above. The stock for each store, as well as the date on which it was recorded, is also shown. Now, the four LNE scenarios are given below:-

1) Last Non Empty(All Stores)

for C1 in 2012: 5+5 = 10 (last non empty date is 31-May-2012)
for C2 in 2012: 25 (last non empty date is 12-May-2012)
for C3 in 2012: None (no value in 2012)
for C4 in 2012: None (no value in 2012)
for Total in 2012: 5+5 = 10 (last non empty date is 31-May-2012)

This is equivalent to the LastNonEmpty aggregation type in SSAS multi-dimensional, where the last non empty date is calculated across all the stores within the selected time period and the values on that date are summed up.

2) Last Non Empty(Each Store)

for C1 in 2012: 5+5 = 10 (last non empty date is 31-May-2012 for both stores)
for C2 in 2012: 10+25 = 35 (last non empty date is 12-Jan-2012 for S3 and 12-May-2012 for S4)
for C3 in 2012: None (no value in 2012)
for C4 in 2012: None (no value in 2012)
for Total in 2012: 5+5+10+25 = 45 (sum of each store's value on its own last non empty date)

Here, as shown in the example, the last non empty value is calculated for each store within the selected time period and then summed up. The equivalent of this aggregation type in SSAS multi-dimensional has already been explained by Teo Lachev in this post.

3) Last Ever Non Empty(All Stores)

for C1 in 2012: 5+5 = 10 (last non empty date is 31-May-2012)
for C2 in 2012: 25 (last non empty date is 12-May-2012)
for C3 in 2012: 15 (no value in 2012, but there is a value on 12-Dec-2011 for S6)
for C4 in 2012: 10 (no value in 2012, but there is a value on 12-Dec-2011 for S8)
for Total in 2012: 5+5 = 10 (last non empty date is 31-May-2012)

Here, the last non empty date is calculated across all the stores from the very first date in the database to the last date of the selected time period, and only the values on that date are summed up. Note that the last non empty date need not belong to the selected time period here, as can be seen from the values for C3 and C4. The equivalent of this aggregation type in SSAS multi-dimensional has already been explained by Chris Webb in this post.

4) Last Ever Non Empty(Each Store)

for C1 in 2012: 5+5 = 10 (last non empty date is 31-May-2012)
for C2 in 2012: 10+25 = 35 (last non empty date is 12-Jan-2012 for S3 and 12-May-2012 for S4)
for C3 in 2012: 10+15 = 25 (no value in 2012, but there are values on 12-Nov-2011 for S5 and 12-Dec-2011 for S6)
for C4 in 2012: 5+10 = 15 (no value in 2012, but there are values on 12-Nov-2011 for S7 and 12-Dec-2011 for S8)
for Total in 2012: 10+35+25+15 = 85 (sum of each store's value on its own last non empty date)

Here, the last non empty date is calculated for each store from the very first date in the database to the last date of the selected time period, and each store's value on its own last non empty date is summed up. Note that the last non empty date need not belong to the selected time period here, as can be seen from the values for C3 and C4. It is not possible to have a really well performing method in SSAS multi-dimensional for this aggregation type without extrapolating the fact data to an insane degree; this leaves us with only the extremely slow option of a calculated measure.

DAX is very powerful for performing such calculations, and by the end of this series, you will see the performance gains that come with implementing these scenarios in SSAS tabular. Let me go ahead and give the solutions for the first two types of Last Non Empty in DAX. I am using the Contoso database for my examples, and it can be downloaded from this location.

1) Last Non Empty(All Stores)

An example of this scenario is when the product stock for all stores is recorded at a regular interval and on the same dates (daily, weekly, monthly, etc.). As long as all the stores have data for the same dates, this formula will correctly calculate the last non empty value. This is the simplest of the four calculations and can be done using the DAX formula below:-

LNE_1 :=
Calculate (
    Sum ( FactInventory[OnHandQuantity] ),
    Filter ( DimDate, DimDate[DateKey] = Max ( FactInventory[DateKey] ) )
)

or

LNE_2 :=
Calculate (
    Sum ( FactInventory[OnHandQuantity] ),
    LastDate ( FactInventory[DateKey] )
)

Performance-wise, both formulas came back pretty fast, and in almost the same time.

[Image: Perf_LNEAllStores – performance results for the two formulas]

It is interesting to note that we are using FactInventory[DateKey] to get the last non empty date on which data is present. If we substitute the fact table's date key with the dimension's date key in the above formula, we get the DAX equivalent of the LastChild aggregation type in SSAS multi-dimensional:-

LastChild :=
Calculate (
    Sum ( FactInventory[OnHandQuantity] ),
    LastDate ( DimDate[DateKey] )
)

2) Last Non Empty(Each Store)

Now, this scenario applies when the stock is recorded for all stores at a regular interval, but not necessarily on the same date for all the stores (it is assumed, however, that the stock for all the products in a store is taken on the same day). For example, the stock may be recorded monthly for every store, but for some stores in the first week of the month and for others in the third or last week. There are different ways in which you can write this calculation, and some of the ways I could find are written below:-

LNE_St_1 :=
SumX (
    Values ( DimStore[StoreKey] ),
    Calculate (
        Sum ( FactInventory[OnHandQuantity] ),
        LastNonBlank ( FactInventory[DateKey], 1 )
    )
)

or

LNE_St_2 :=
SumX (
    Values ( DimStore[StoreKey] ),
    Calculate (
        Sum ( FactInventory[OnHandQuantity] ),
        LastDate ( FactInventory[DateKey] )
    )
)

If you have read these posts (1 and 2), you might notice in both of the formulas above the recurring SumX(Values()) pattern explained there. You can also substitute the SumX(Values()) pattern with a SumX(Summarize()) pattern, as shown below:-

LNE_St_3 :=
SumX (
    Summarize (
        DimStore,
        DimStore[StoreKey],
        "LNE_P", Calculate ( Max ( FactInventory[DateKey] ) )
    ),
    Calculate (
        Sum ( FactInventory[OnHandQuantity] ),
        Filter (
            All ( DimDate ),
            DimDate[DateKey] = [LNE_P]
        )
    )
)

The performance of all the formulas listed above was really good (all of them came back in 1 second). But the moment you add one more dimension to the formula (for example, if you need to find the last non empty value for each store and product combination), there seem to be significant differences in performance, and I am currently doing some more testing on them (well, that might be a topic for another blog post!). Meanwhile, you can go ahead and read the next part of this series on Javier’s blog.

References & Interesting Reads

1) Chris Webb - Last Ever Non Empty – a new, fast MDX approach
2) Teo Lachev - Last Non Empty Affairs
3) Jason Thomas - Changing Granularity of Leaf Level Calculations in SSAS Tabular
4) Javier Guillen - SCOPING at different granularities in DAX (Part I)
5) Paul te Braak - The Last Non-Empty Dilemma TABULAR Style
6) Hilmar Buchta - Sparse Snapshots in DAX / BISM Tabular