Just Do it….. Where The Data Is!

For some this will be a no-brainer and for those I apologize, however for a good majority of you I think this is going to be a good post to remind you that any report you are building whether in Cognos, SAP, MicroStrategy, Microsoft, or what have you, strive to always do the majority of the work (meaning partial aggregation, calcu

lation, and any other business rules) in the database/data warehouse.

Best practice is to use a dimensional model and that is inherent to what I am saying here. In

a well designed dimensional model the calculations and business rules are done before hand. So when you look at a Fact table or a Dimension table you will see a nice clean, ready to use data set based on the criteria you will solicit in your report. You should not use a totally normalized operational data table that forces you to

do all the calculations in the report. The reporting tool is not built to calculate data like a database can and does instinctively.

If in your report you are calculating on every value you have or even any significant amount of data being reported for that matter then you need to take a step back and realize your data is not modeled in the best manner. You need to go back to the data warehouse/database developers and insist that this be modeled correctly or more appropriate to your reporting requirements.

It should not be acceptable to be doing a ton of calculations in a report. Not only does it make the report slower but it will be a real pain going forward when someone wants the report modified. Companies have spent and will continue to spend thousands on just trying to change a report that could have been prevented.

A key point to pay attention to is how fast your report runs. As long as your report isn’t so complex that it uses millions of rows your report should most of the time run in seconds (no more than 30 seconds). If you find that your reports are taking longer than that then you should start looking at having another layer of the data being aggregated first before it hits your report. So you would have your dimensional model as always but there would be a level before or after where the data gets aggregated to a higher level that will cause your data source to work less when querying the data for your report and hence speed up the time your report runs.

Remember you want happy clients whether you are a consultant or part of a reporting team, make them happy and keep your reports fast 🙂