Pages

Tuesday, February 26, 2013

100% Visualization in SSRS

Recently, Bill Jelen aka Mr Excel came up with a podcast that showcased a visualization that he termed as the “100% Visualization”. I have never seen this visualization before and I am not sure whether Stephen Few would approve of it. But I found the concept interesting and decided to exercise my grey cells by creating this in SSRS. Read on for the solution.

100% visualization in SSRS

1) Create a new report with a simple dataset having Category name and percentage value

SELECT     'Bikes' AS Category, 82 AS Pc
UNION ALL
SELECT     'Accessories' AS Category, 49 AS Pc
UNION ALL
SELECT     'Textile' AS Category, 24 AS Pc
UNION ALL
SELECT     'Furniture' AS Category, 99 AS Pc

Dataset creation

2) Make a simple tablix with 10 columns and 10 rows. Then fill the cell values from 1 to 100 as shown below

Making the 10 * 10 matrix

It is good to make all the cells of the same size and squares (for eg, size = 0.6,0.6)

3) Now make the border color as white, border style as solid, border width as 4pt and FontSize as 2pt.

formatting the 10 * 10 matrix

4) Add the custom code below

Public Shared Function ColorDWB(ByVal Value As Decimal, ByVal MaxPositive As Decimal, ByVal Neutral As Decimal, ByVal ColStr As String) As String

Dim ColVar1 As Integer
Dim ColVar2 As Integer
Dim ColVar3 As Integer

'Split the #RGB color to R, G, and B components
ColVar1=Convert.ToInt32(left(right(ColStr, 6),2),16)
ColVar2=Convert.ToInt32(left(right(ColStr, 4),2),16)
ColVar3=Convert.ToInt32(right(ColStr, 2),16)

'Find Largest Range
Dim decPosRange As Decimal = Math.Abs(MaxPositive - Neutral)

'Find appropriate color shade
Dim Shd As Decimal = 255
   
Dim iColor1 As Integer
Dim iColor2 As Integer
Dim iColor3 As Integer
Dim strColor As String

'Reduce a shade for each of the R,G,B components
iColor1 = ColVar1 + CInt(Math.Round((MaxPositive-Value) * ((Shd - ColVar1) / decPosRange)))
iColor2 = ColVar2 + CInt(Math.Round((MaxPositive-Value) * ((Shd - ColVar2) / decPosRange)))
iColor3 = ColVar3 + CInt(Math.Round((MaxPositive-Value) * ((Shd - ColVar3) / decPosRange)))

'Return the new color
strColor = "#" & iColor1.ToString("X2")  & iColor2.ToString("X2") & iColor3.ToString("X2")   

Return strColor
End Function


Private colorPalette As String() = { "#C85200", "#FF800E", "#5F9ED1", "#2CA02C", "#A59D93", "#B8341B", "#352F26", "#F1E7D6", "#E16C56", "#CFBA9B"}

Private count As Integer = 0
Private mapping As New System.Collections.Hashtable()

Public Function GetColor(ByVal groupingValue As String) As String
   If mapping.ContainsKey(groupingValue) Then
      Return mapping(groupingValue)
   End If
   Dim c As String = colorPalette(count Mod colorPalette.Length)
   count = count + 1
   mapping.Add(groupingValue, c)
   Return c
End Function

The GetColor() is used to select new colors for each category type and the ColorDWB() is used to get a lighter shade of the selected color (you might recognize the ColorDWB function from my post on Custom Code for Color Gradation in SSRS).

5) Add the expression below for the Color and BackgroundColor property for all the cells.

=iif(Me.Value <= Fields!Pc.Value, Code.GetColor(Fields!Category.Value), Code.ColorDWB(2, 10, 1, Code.GetColor(Fields!Category.Value))  )

Adding color expressions

6) Add a textbox on the top with the Category value, and a textbox below with the Pc field value and place all three objects inside a rectangle to keep them together. You can also format it by adding a background color of Green and a border of White, if needed.

Adding textboxes

7) Now add a matrix with Category field on column group. Then, add the rectangle with the three objects inside the value field.

Adding the parent matrix

Now you can delete the row and the column for the matrix so that it looks like shown below.

row and columns deleted

8) Preview it to see the result.

End result

Isn’t that pretty? And if there are more charts than what can be fitted in the horizontal way, you might want to check this post on how to repeat it horizontally as well as vertically. As always, you can download the SQL 2012 rdl file from here.

Tuesday, February 19, 2013

Bandlines in SSRS

Sometime in January, visualization expert Stephen Few expanded on Tufte’s idea of Sparklines and came up with this new visualization called Bandlines. Bandlines use horizontal bands of color in the background of the plot area to display information about the distribution of values and you can read a detailed description on bandlines in Stephen’s article. Companies like XLCubed have already started integrating this within their products and it was only a matter of time before yours truly tried to replicate the same within SSRS. Read on for the solution.

Bandlines in SSRS

One of the most important steps to reproduce this in SSRS is to find the percentiles (25th and 75th) in your dataset query. You can do it the hard way by coming up with the logic and then computing it within your dataset. I chose not to reinvent the wheel and used Richard Mintz’s (yes, the same guy who came up with the code behind Squarified Heatmaps in SSRS) SSAS stored procedure for the same. He has written an excellent post on how to compute percentiles in both SQL and MDX, so you can use his code depending on whether your source is a relational database or a multidimensional cube. For this demo, my source would be the AdventureWorks multidimensional cube. Follow the steps below to reproduce bandlines:-

1) Download the dll from here and add it to your SSAS instance as per the instructions at the end of Richard’s post. I have named the assembly as SPPercentile.

Adding assembly

2) Make a dataset query which will show the Internet Sales amount for a product category and month with the year as a parameter (optional, dataset query for year parameter is not shown here). Also make 2 calculated measures which will show the maximum and minimum values for internet sales by category across the months.

WITH MEMBER measures.[maxP] AS
Max (
StrToSet ( @DateCalendarYear ) * [Product].[Category].CurrentMember * [Date].[Month Name].[All] * [Date].[Calendar].[Month].AllMembers,
    [Measures].[Internet Sales Amount]
  )
MEMBER measures.[minP] AS
Min (
StrToSet ( @DateCalendarYear ) * [Product].[Category].CurrentMember * [Date].[Month Name].[All] * [Date].[Calendar].[Month].AllMembers,
    [Measures].[Internet Sales Amount]
  )
SELECT NON EMPTY
{ [Measures].[Internet Sales Amount], measures.[maxP], measures.[minP] } ON COLUMNS,
{
NonEmpty ( [Date].[Month Name].Children, [Measures].[Internet Sales Amount] ) * [Product].[Category].[Category].AllMembers
} ON ROWS
FROM [Adventure Works]
WHERE StrToSet (
  @DateCalendarYear
)

Max and min stay same for categories

Note that the maximum / minimum will be the same for a category across all months. This is required for doing the scaling of the bandlines, and will be explained in a later step.

3) Make a dataset query for getting the 75th and 25th percentiles.

WITH MEMBER measures.[25th percentile] AS
[SPPercentile].ValueAtPercentile ( NonEmpty (
    ( StrToSet ( @DateCalendarYear ) * [Date].[Calendar].[Month].AllMembers * [Product].[Category].Children,
    [Measures].[Internet Sales Amount] )
  ),
  [Measures].[Internet Sales Amount],
  .25,
TRUE,
"INC" )
MEMBER measures.[75th percentile] AS
[SPPercentile].ValueAtPercentile ( NonEmpty (
    ( StrToSet ( @DateCalendarYear ) * [Date].[Calendar].[Month].AllMembers * [Product].[Category].Children,
    [Measures].[Internet Sales Amount] )
  ),
  [Measures].[Internet Sales Amount],
  .75,
TRUE,
"INC" )
SELECT NON EMPTY
{ measures.[25th percentile], measures.[75th percentile] } ON COLUMNS
FROM [Adventure Works]

Getting 25 and 75 percentile

Note that we are using the [SPPercentile].ValueAtPercentile() stored procedure and you can find the syntax in Richard’s post. Now we should be having the following three datasets.

Datasets

4) Now that we have found out the percentiles, store their values in two hidden report parameters respectively – PC25 and PC75. Make sure that that the available values and default values are obtained from the corresponding field in DST_Percentile dataset as shown below.

Report parameter properties for PC25 and PC75

5) Now drag and drop a matrix into the layout. Use the Categories on the rows and drag and drop a line chart to the values as shown below. In the chart, use Internet Sales Amount as the Values, Month Name as the category group and Category as the Series Group.

Getting the chart inside matrix

6) Make the max and min of the vertical axis as the maxP and minP fields. This will help in scaling the sparkline (yes, by now we have got the basic structure of a sparkline).

Scaling the chart

Also, remove the vertical axis, horizontal axis, legend and all other titles.

7) Add a value to the chart with the expression below.

=iif(Fields!maxP.Value <= Parameters!PC75.Value, Nothing, Fields!maxP.Value)

Make the chart type as Area Chart and color as #f5f5f5

Adding the 75-100 percentile band

This will act as the band for the values between 75 and 100 percentile, so ensure it is at the very top as shown in the picture above. If the max value for the category is lesser than the PC75, we don’t want to show this band.

8) Add another value to the chart with the expression below

=iif(Fields!maxP.Value >=Parameters!PC75.Value, Parameters!PC75.Value, Fields!maxP.Value)

Make the chart type as Area Chart and color as #d7d7d7

Adding the 25-75 percentile band

This will act as the band between 25 and 75 percentile and should be appearing on top of the 75-100 band, so ensure that it appears just under the previous value as shown in the picture above.

9) Add the last band by adding the expression below

=iif(Fields!maxP.Value <= Parameters!PC25.Value, Fields!maxP.Value, iif(Fields!minP.Value>=Parameters!PC25.Value, Nothing, Parameters!PC25.Value))

Make the chart type as Area Chart and color as #adadad

Adding the 0-25 percentile band

Again ensure that this value appears just under the previous one, as shown in the picture above.

10) Now, we can preview it and see the results.

Preview the results

From the markers, you can also note that there are 24 values. Out of that, there are 6 values in the 75-100 band, 12 values in the 25-75 band and 6 in the 0-25 band. So the percentiles are indeed working! And very quickly, you can make out that Bikes have more of their values in the 75-100 percentile, an extra bit of information that you wouldn’t have got with sparklines. That is the power f Bandlines.

11) Now with a bit of formatting, this is the final end result and you can download the completed report rdl (SSRS 2012) from here.

The formatted bandline in ssrs

Tuesday, February 12, 2013

Custom Code for Color Gradation in SSRS

Throughout my career as a Business Intelligence Consultant, I have met and interacted with a lot of DBAs. One thing (among many others) I particularly admire about them is that they have their own list of SQL scripts that they carry with them from job to job (and my observation is that the list usually grows with experience). This way, they don’t have to remember all of them, but when the need arises, they have the scripts at their fingertips. The only thing I have close to that is my list of custom codes for doing some specific operations in SSRS. One of my most frequently used scripts originates from this series of posts on how to conditionally color in SSRS. Based on my needs, I had tweaked the code such that it will display the color gradation to white for any input color. Thought that this might come in handy for some of you guys.

Custom code for Color gradation in SSRS

Code

Public Shared Function ColorDWB(ByVal Value As Decimal, ByVal MaxPositive As Decimal, ByVal Neutral As Decimal, ByVal ColStr As String) As String
Dim ColVar1 As Integer
Dim ColVar2 As Integer
Dim ColVar3 As Integer

'Split the #RGB color to R, G, and B components
ColVar1=Convert.ToInt32(left(right(ColStr, 6),2),16)
ColVar2=Convert.ToInt32(left(right(ColStr, 4),2),16)
ColVar3=Convert.ToInt32(right(ColStr, 2),16)

'Find Largest Range
Dim decPosRange As Decimal = Math.Abs(MaxPositive - Neutral)

'Find appropriate color shade
Dim Shd As Decimal = 255
   
Dim iColor1 As Integer
Dim iColor2 As Integer
Dim iColor3 As Integer
Dim strColor As String

'Reduce a shade for each of the R,G,B components
iColor1 = ColVar1 + CInt(Math.Round((MaxPositive-Value) * ((Shd - ColVar1) / decPosRange)))
iColor2 = ColVar2 + CInt(Math.Round((MaxPositive-Value) * ((Shd - ColVar2) / decPosRange)))
iColor3 = ColVar3 + CInt(Math.Round((MaxPositive-Value) * ((Shd - ColVar3) / decPosRange)))

'Return the new color
strColor = "#" & iColor1.ToString("X2")  & iColor2.ToString("X2") & iColor3.ToString("X2")    

Return strColor
End Function

How To Use

1) Lets say we have a simple report which shows the sales by vehicle type and month

report

2)  You can either compile this code and use it in your report or just paste it in the code tab of the Report Properties. If we just need to get different shades of a color (say #2322EE), we can just use the expression below as the BackgroundColor property of the cell.

=Code.ColorDWB(sum(Fields!Sales.Value), Max(Fields!Sales.Value, "DataSet1"), Min(Fields!Sales.Value, "DataSet1"), "#2322EE")

Color gradation across entire table

3) If you need the color gradation to be based on each row, then make sure that the max and min values are passed for the row (in this case, the Vehicle group) rather than the whole dataset.

=Code.ColorDWB(sum(Fields!Sales.Value), Max(Fields!Sales.Value, "Vehicle"), Min(Fields!Sales.Value, "Vehicle"), "#2322EE")

Color gradation across each row in table

4) If you need different colors for your vehicle, you can make use of the GetColor() custom code and then pass that in your expression:-

=Code.ColorDWB(sum(Fields!Sales.Value), Max(Fields!Sales.Value, "Vehicle"), Min(Fields!Sales.Value, "Vehicle"), Code.GetColor(Fields!Vehicle.Value)) 

Color gradation across each row in table with different color

Just make sure that the colors in the palette are using the hex values and not just strings like “Red”.

Make sure colors are hex values and not strings in the palette

5) You can also use this code to go across two colors. For eg, what if we want to go from Blue to White to Orange such that the mean is white, more is blue and less is orange? We can write an expression as shown below

=iif(sum(Fields!Sales.Value) >= (Max(Fields!Sales.Value, "DataSet1") + Min(Fields!Sales.Value, "DataSet1"))/2,
Code.ColorDWB(sum(Fields!Sales.Value), Max(Fields!Sales.Value, "DataSet1"), (Max(Fields!Sales.Value, "DataSet1") + Min(Fields!Sales.Value, "DataSet1"))/2, "#6495ED"),
Code.ColorDWB(-sum(Fields!Sales.Value), -Min(Fields!Sales.Value, "DataSet1"), -(Max(Fields!Sales.Value, "DataSet1") + Min(Fields!Sales.Value, "DataSet1"))/2, "#FFC125"))

Note that we are writing a conditional expression to see if the current field is more than the mean (max+min / 2). If yes, then we use the familiar expression with the Blue color (so more the value, darker the blue). If no, we need to do add a negative for all the fields so that the coloring happens in the reverse order (lesser the number, darker the Orange).

Blue to Orange color gradation across entire table

You can keep on tweaking this code and use it for a wide variety of visualizations. I am no VB.NET developer, so it is possible that there is a much better way to do the same. If you do know of any better technique, do post in the comments so that others can benefit from it.

Monday, February 4, 2013

UNION Operation in DAX Queries

In one of my previous posts - Row Selection Using Slicers in PowerPivot - Part 1, I had demonstrated the use of what Marco Russo defined as Reverse Linked Tables and Linkback tables. A particularly eye-catching thing in my post was the use of Microsoft Query instead of DAX Query to obtain the required reverse-linked table and one of the reasons I gave was that it was difficult to do an UNION operation using DAX queries (yes, you heard it right. I said difficult and not impossible). Well, since difficult is such a subjective word, I decided to jot down the technique, maybe it might seem easy for you guys!

Union operation with DAX queries / PowerPivot

For the purpose of this demonstration, I am using two tables having identical structures. There are just two columns in the tables - country and state and we need to do an UNION operation on these two tables.

Source

The result should be 7 rows as shown below

Expected Result

Follow the steps below for the solution:-

1) Import the two tables to PowerPivot (you can also use this technique on a SSAS tabular model). I will be using DAX Studio for writing my queries and displaying the results (though you might as well as use this in SSMS or in the DMX query editor for SSRS depending on your purpose).

2) The first thing to understand here is that DAX, as a query language, can not add rows usually, and UNION requires that we get more rows (since it is the combined result of both the tables). However, there is an operator in DAX which generally generates more rows than its source tables – CROSSJOIN (except when any one of the participating tables has only one row). So let’s first crossjoin the two tables and see the results.

CrossJoin

Now you can see that we get 12 rows, however no single column gives the result that we need. Somehow, we need to get a logic to filter the 5 rows and also a logic to combine the right results in one calculated column.

3) To identify the individual rows, let us add a rank column to each of the two tables and then crossjoin them.

EVALUATE
CROSSJOIN (
ADDCOLUMNS (Table1,"Rnk1",RANKX ( Table1, Table1[State],, 1, DENSE )    ),
ADDCOLUMNS (Table2,"Rnk2",RANKX ( Table2, Table2[State],, 1, DENSE )    )
  )

Adding rank identifiers to the table rows

4) When I looked at this resultset initially, I felt that I could just filter all rows having Rank1 = 1 and Rank2 = 2 and then add a calculated column each for Country and State such that if Rank1 = 1, then the value comes from Table2 else it comes from Table1.

EVALUATE
ADDCOLUMNS (
FILTER (
CROSSJOIN (
ADDCOLUMNS (Table1,"Rnk1",RANKX ( Table1, Table1[State],, 1, DENSE )        ),
ADDCOLUMNS (Table2,"Rnk2",RANKX ( Table2, Table2[State],, 1, DENSE )        )
      ),
      [Rnk1] = 1 || [Rnk2] = 2    ),
"UnionCntry", IF ( [Rnk1] = 1, Table2[Country], Table1[Country] ),
"UnionState", IF ( [Rnk1] = 1, Table2[State], Table1[State] )
  )

Incorrect logic

Even though the UnionCntry and UnionState columns might appear right, they are not, as one row is missing (in this case, the row with IN country and KL state). This happens because there will always be one row which has Rnk1 = 1 and Rnk2 = 2, and hence there is an overlap. So we need to think of a different technique to filter the rows.

5) Since we are short of one row, we need to include one more extra row having the complementary values of ranks for the overlapping row (in this case, the row with Rnk1=2 and Rnk2=1). This can be done by using the query given below.

EVALUATE
FILTER(
CROSSJOIN (
ADDCOLUMNS (Table1,"Rnk1",RANKX ( Table1, Table1[State],, 1, DENSE )    ),
ADDCOLUMNS (Table2,"Rnk2",RANKX ( Table2, Table2[State],, 1, DENSE )    )
  ), ([Rnk1]=1 || [Rnk2]=2) || ([Rnk1]=2 && [Rnk2]=1))

Filtering the right number of rows

6) Now all we have to do is to get the logic for creating the calculated columns. We can do it with the help of a SWITCH statement by (a) assigning the extra row (Rnk1=2 and Rnk2=1) to Table1 and (b) making sure that the overlapping row (Rnk1=1 and Rnk2=2) is assigned to Table2. This can be done by the following query

EVALUATE
ADDCOLUMNS(
FILTER(
CROSSJOIN (
ADDCOLUMNS (Table1,"Rnk1",RANKX ( Table1, Table1[State],, 1, DENSE )    ),
ADDCOLUMNS (Table2,"Rnk2",RANKX ( Table2, Table2[State],, 1, DENSE )    )
  ), ([Rnk1]=1 || [Rnk2]=2) || ([Rnk1]=2 && [Rnk2]=1))
, "UnionCountry", SWITCH(TRUE,
                    ([Rnk2]=2 && [Rnk1]<>1) || ([Rnk2]=1 && [Rnk1]=1)
                    ,Table1[Country], Table2[Country])
, "UnionState", SWITCH(TRUE,
                    ([Rnk2]=2 && [Rnk1]<>1) || ([Rnk2]=1 && [Rnk1]=1)
                    ,Table1[State], Table2[State]))

Correct result of UNION ALL operation

Now you can see that the UnionCountry and UnionState column gives us the required results for a UNION ALL operation. This approach can be easily extended in case we have more than two tables to join. The only scenario where this approach will not work is when any of the participating tables has just one row (as the cross join will not give us more rows. Eg, cross-joining two tables with 3 and 1 rows will give only 3 rows whereas we need 4 rows for the UNION ALL). However, this can be resolved by adding a temporary table with two rows (so the cross join will be 3 * 1 * 2 = 6 rows) and then using this logic.

7) To do the UNION operation instead of UNION ALL, we should remove all duplicate rows. For that, we can use SUMMARIZE function.

EVALUATE
SUMMARIZE(
ADDCOLUMNS(
FILTER(
CROSSJOIN (
ADDCOLUMNS (Table1,"Rnk1",RANKX ( Table1, Table1[State],, 1, DENSE )    ),
ADDCOLUMNS (Table2,"Rnk2",RANKX ( Table2, Table2[State],, 1, DENSE )    )
  ), ([Rnk1]=1 || [Rnk2]=2) || ([Rnk1]=2 && [Rnk2]=1))
, "UnionCountry", SWITCH(TRUE,
                    ([Rnk2]=2 && [Rnk1]<>1) || ([Rnk2]=1 && [Rnk1]=1),
Table1[Country], Table2[Country])
, "UnionState", SWITCH(TRUE,
                    ([Rnk2]=2 && [Rnk1]<>1) || ([Rnk2]=1 && [Rnk1]=1),
Table1[State], Table2[State]))
, [UnionCountry], [UnionState])

Correct result of UNION operation

In this scenario, there is no difference in the result as there are no duplicate rows. Let me know what you think of this!