Microsoft Analysis Services: 🕒 Time Dimensions

Microsoft Analysis Services: 🕒 Time Dimensions

I did a little research the other day with regards to Time Dimensions and I thought I would share my findings here.  

Background:
Traditionally a data warehouse will have at least one “Time” dimension that will have one record for every time period that data exists.  For example, if we had data from 1995 all the way to today, and we decided that we wanted the granularity to be 1 hour, we would have in our Time dimension one record for every hour since 1995.

That would be 24 hours x 365 days x 12 years (roughly) = 105,000 records.

Now imagine that the requirements changed, and we now have to summarize data to the minute. 

That would be 60 minutes x 24 hours x 365 days x 12 years (roughly) = 6,307,000 records.

Add another 10 years of data and our time dimension would have over 10,000,000 records!

Performance Consideration:
In Analysis Services 2000, all dimension data is loaded into main memory when processing.  Large dimensions (more than 100k records) were avoided where possible.  Otherwise there could be performance problems when processing the cubes on servers with moderate amounts of memory.

Because of this performance restriction, and my own desire for efficiency, I would normally separate the hours/minutes data into their own dimensions, one for “Days / Dates”, one for “Hours / Minutes”.

Using our example above, the “Day/Date” dimension would have one record for every day since 1995, or 4380 records.  The “Hour/Minute” dimension would have one record for every minute of every hour for just one day, or 1440 records.  There is no reason to repeat this data i.e. for every day of the year, because each day always has 24 hours and each hour always has 60 minutes.

As you can see, 6000 records in two dimensions can be used to contain the exact same amount of data as 6,000,000 records in one dimension.

Findings: 
Analysis Services 2005 does not need to load the entire dimension into memory when processing, so the performance restriction is no longer an issue. Analysis Services 2005 also has many built in functions and examples that assume you have one time dimension containing ALL time periods.  Taking advantage of these built in functions and examples add great value to the reporting capabilities of our cubes while decreasing the time to deploy.

Summary: I believe it is worthwhile to have a traditional time dimension called “TimeDIM” which will have one record for every hour of every day for all the time periods for which you have data.  In addition I would also create “Day/Date” and “Hour/Minute” dimensions because they are quite handy for reports at the Date and/or Hour level and they take up relatively little disk space.

whoami
Stefan Pejcic
Join the discussion

I enjoy constructive responses and professional comments to my posts, and invite anyone to comment or link to my site.