Tuesday, January 11, 2011

Archiving Data

The last phase of the BI life-cycle is archival of BI data. We will spend relatively little time on this topic, mainly because data archival is a subject area that ranges outside the normal scope of business intelligence policy. Archival policy is generally dictated by legal and tax considerations.

Before data processing and mass electronic storage capabilities were widely available to business firms, archival policies were driven as much by financial and space constraints as by legally mandated document retention requirements. Now that electronic storage is cheap, legal and tax considerations dominate the decision to archive or destroy data. Of course, to the extent that local jurisdictions require source business documents to be archived in their original physical form, there may still be a significant storage cost associated with maintaining archives.

With cheap electronic storage there is a temptation to keep everything. In the case of business intelligence data, this is not necessarily a good idea. If you have latitude to set archival policy for your BI data, here are a few common-sense suggestions.

Focus on strategic data. There is a diminishing return associated with archiving historical data at progressively lower levels of detail. For example, there may be little or no interest in knowing about sales of a particular USB hub five years ago, but there may be useful insights to be gleaned from examining the five-year trend in sales of USB hubs as a product class.
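As a rough illustration of the roll-up idea, the sketch below collapses item-level sales into one total per year and product class before archiving. The row layout, the SKU names, the figures, and the function name `summarize_for_archive` are all hypothetical, invented for the example; a real BI platform would more likely do this in its own ETL or SQL layer.

```python
from collections import defaultdict

# Hypothetical detail rows: (year, product_class, sku, units_sold).
# The values are made up purely for illustration.
detail_rows = [
    (2006, "USB hub", "HUB-4P", 120),
    (2006, "USB hub", "HUB-7P", 80),
    (2007, "USB hub", "HUB-4P", 150),
    (2007, "USB hub", "HUB-7P", 95),
]


def summarize_for_archive(rows):
    """Roll SKU-level rows up to one total per (year, product_class)."""
    totals = defaultdict(int)
    for year, product_class, _sku, units in rows:
        # The SKU is deliberately dropped: only the class-level trend is kept.
        totals[(year, product_class)] += units
    return dict(totals)


archive_summary = summarize_for_archive(detail_rows)
```

Only the summary would go to the archive; the item-level rows could then be discarded once any legally mandated retention period has passed.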

Data are like breadcrumbs: both become increasingly stale with the passage of time. With the pace of business activity constantly accelerating, the older the data, the less useful they are. This is especially true in cases where a company has pushed into new markets and left old ones behind. Although having some historical information about “the old days” may be interesting, from a BI perspective it adds little value.

Consider the industry and product functionality. It might make sense for an automaker to keep older historical data on sales of sedans, because the product class “sedan” has been stable for decades. By the same token, it might make sense for a computer maker to keep older data about “laptops.” Even though the technology behind a laptop computer has changed dramatically since its introduction, the core functionality of the product arguably has not.
