Pages

Thursday, June 9, 2011

When and How to Snowflake Dimension Sources : SSAS Design Part 3

In my last post, I had quickly hinted why I prefer using a star schema and also explained the second scenario in which I would snowflake the dimension sources for the DSV. 

Star to Snowflake Part 3

The three scenarios as well as the example product dimension are repeated below for quick reference

  • Multiple lower levels for the dimension (Scenario 1)
  • Multiple facts coming at multiple granularity (Scenario 2)
  • Multiple levels joining to the same parent (Scenario 3)

Facts at different granularity_

Scenario 3 – Multiple Levels joining to same Parent

Cyclical relationship in Product Dimension

Consider the above example where there are multiple levels (Local SKU and Nielsen SKU) joining to the same parent. Invariably, a “diamond” or cyclic relationship would be involved the moment you say that there are multiple levels joining to the same parent in SSAS dimensions (You could avoid this by splitting the dimensions and making referenced dimensions in SSAS but then you would lose the ability to make hierarchies between the levels in the two split dimensions). For Scenario 3 to be applied, an additional condition also needs to be satisfied – at least one of the intermediate attributes involved in the cyclic relationship needs to have fact coming in at that level and this attribute should not be the start point or end point of the cyclic relationship. Some examples of diamond relationship are given below

 

Examples of Cyclic Relationship 

In our example, both Local SKU and Nielsen SKU have fact coming in at their levels and they are neither the start point (Dummy Level) nor the end point (Retail SKU) of the cyclic relationship, and hence Retail SKU qualifies under Scenario 3. In this scenario, the following views would be made

a)      View for parent level – A view is made for the parent entity (Retail SKU in our example) which will contain all the attributes / entities till the next entity where one of the three scenarios’ happens.  For our example, one view is made.

// View for Retail SKU and attributes

SELECT DISTINCT

   // Scenario 2 at Brand Pack

Retail_SKU, Standard_SKU, BrandPack#FK

FROM

<table>

WHERE

   // <levels inside “diamond” at which fact granularity comes>

<level=Local_SKU_ID or level=Nielsen_SKU_ID>

 

Some things to note here

  • As mentioned in the note section of my previous post, there is a basic difference in the WHERE clause of the views in Scenario 3 and the rest. For the rest of the scenarios, the clause would be for the lowest level of the view. So if the same principle was applied to the above view, then the WHERE clause would have been for <level=Retail_SKU>. Consider a case where there are Retail SKUs which don’t have any Local SKUs or Nielsen SKUs. There would not be any facts associated with them, and would be a pain for the business users if they start appearing in the dropdowns for the reports. Having the Scenario 3 WHERE clause would prevent this from happening as only the Retails SKUs which have facts associated with them would be brought in the cube.
  • The reason why Brand Family doesn’t qualify for Scenario 3 is because apart from the lowest level (Brand Pack), there are no attributes in the “diamond” relationship which has fact coming at that level. Hence, all these attributes have come under the view for Brand Pack in Scenario 2.
  • In both Scenario 1 and Scenario 3, there is a cyclic relationship or diamond relationship present in the model. These views will only ensure that the master data is correct. Care should be taken to handle the cyclic relationships within SSAS by defining multiple hierarchies or by making the relationship linear. Else you might see the blue squiggly line in the dimension attribute relationship manager in SSAS warning you of the occurrence of cyclic relationships.

That said, this marks the end of my trilogy on When and How to Snowflake Dimension Sources. I have made an effort to make it as generic as possible, but there could be scenarios which I have missed. Feel free to throw bricks or debate on the above concepts, constructive criticism is always welcome!

1 comment:

  1. I think this will be a helpful hint.
    Infosystems are very complicated in such manner that it must be understanded.

    ReplyDelete