
Data Retention Best Practices

  • Posted by Bryan Willman
  • On September 26, 2016
  • NetSuite, Data

The amount of data being produced today is projected to grow tenfold over the next six years, roughly doubling every two years.  That’s according to IDC’s annual Digital Universe study, which also predicts that the amount of electronic data will increasingly outpace available storage.

IT departments face the growing challenge of managing ever-increasing amounts of data produced by numerous disparate applications, all while ensuring that the data is secured and remains available for business and legal purposes.  For modern organizations, all this information can be expensive to keep, not only because of the cost of storage but also because of the potential liability of holding information for too long.  To limit exposure to security breaches, which are all too common these days, organizations should retain records only for as long as they are required and no longer.  Keeping old customers’ data in the system after ten years provides negligible value compared to the risk of exposing those customers to data theft.  For these reasons, organizations should have a formal written data retention policy and enforce it across all of their enterprise systems (ERP, CRM, data warehouse, email, etc.) to limit liability in the long run.

Consider what purpose your data retention policy will serve

When implementing a data retention policy, it’s important to first consider the purpose and needs of the organization as a whole.  Are the requirements driven by legal or compliance obligations?  If so, it’s important to think about preserving data in its original form and about the need to apply a legal hold, which prevents archival of records related to an entity, because litigation can span many years.  Or are they driven by the CIO’s need to reduce storage cost and clutter and to meet service level agreements (SLAs)?  The answers to these questions will determine what data needs to remain live and what data should be archived.

Determine how long you need to keep different classes of data

The second key consideration in a data retention policy is how long to keep different types of records in the live system and in the archive system before they are eventually purged.  To comply with Sarbanes-Oxley (SOX), most public companies need to retain financial records for seven years, while payroll data needs to be kept for only three years.  Often, industry regulations and the statute of limitations within which a party can bring legal action against the organization dictate how long documents must be retained.  The IT group is often not well versed in document retention regulations, so it is advisable to consult corporate legal counsel to ensure compliance.  Across a variety of industries, the length of time documents must be stored to remain in compliance is increasing.  Figure 1 shows retention periods ranging from three years for private companies all the way to 100 years for nuclear energy companies.


Figure 1. Various Document Retention Policies by Industry
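
To make retention periods like these concrete, the schedule can be written down as a simple lookup table.  The Python sketch below is an illustration only: the `RETENTION_YEARS` table and the `add_years` and `purge_date` helpers are hypothetical names, and the seven- and three-year values are just the examples mentioned above.  Actual periods should come from corporate legal counsel.

```python
from datetime import date

# Hypothetical retention schedule, in years, keyed by record class.
# The values mirror the examples in the text; confirm real periods with counsel.
RETENTION_YEARS = {
    "financial": 7,
    "payroll": 3,
}


def add_years(d: date, years: int) -> date:
    """Return the same calendar day `years` later (Feb 29 falls back to Feb 28)."""
    try:
        return d.replace(year=d.year + years)
    except ValueError:
        # Only reached for Feb 29 when the target year is not a leap year.
        return d.replace(year=d.year + years, day=28)


def purge_date(record_class: str, created: date) -> date:
    """Earliest date on which a record of this class may be purged."""
    return add_years(created, RETENTION_YEARS[record_class])


print(purge_date("financial", date(2016, 9, 26)))  # 2023-09-26
print(purge_date("payroll", date(2016, 2, 29)))    # 2019-02-28
```

Keeping the schedule in one table, rather than scattering the numbers through application code, makes it easier for legal counsel to review and for IT to update when regulations change.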

How you define the policy

As a best practice, organizations should carefully craft the rules that implement their document retention policy.  These are the rules by which records move from the live system to the archive and are eventually purged from the archive system.  It may seem reasonable to archive records created more than seven years ago, but what if these records are still being accessed?  In this case it would be prudent to use the Last Accessed Date instead.  But what if your system doesn’t store this information and only stores the created date and last modified date?  For example, what happens when a ten-year contract is archived because the document retention policy was to archive any record last modified over seven years ago?  This contract has likely not been modified since it was created, but it’s probably not one that should be archived or purged from the live system.  You may want to set an exception to preserve this document for longer than the standard policy.
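
The rule described above might be sketched as follows.  This is a minimal illustration under stated assumptions: the `Record` fields, the `retain_until` exception field, and the `ARCHIVE_AFTER_DAYS` threshold are hypothetical and not taken from any particular system.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional

ARCHIVE_AFTER_DAYS = 7 * 365  # assumed policy: roughly seven years of inactivity


@dataclass
class Record:
    created: date
    last_modified: date
    last_accessed: Optional[date] = None  # not every system stores this
    retain_until: Optional[date] = None   # exception, e.g. a contract's end date


def should_archive(record: Record, today: date) -> bool:
    # An explicit exception keeps the record live until it lapses.
    if record.retain_until is not None and today <= record.retain_until:
        return False
    # Prefer the last accessed date; fall back to last modified.
    last_activity = record.last_accessed or record.last_modified
    return (today - last_activity).days > ARCHIVE_AFTER_DAYS


today = date(2016, 9, 26)
# Ten-year contract signed in 2008, never modified, in force until 2018.
contract = Record(created=date(2008, 1, 1), last_modified=date(2008, 1, 1),
                  retain_until=date(2018, 1, 1))
# Old invoice with no activity since 2008 and no exception.
invoice = Record(created=date(2008, 1, 1), last_modified=date(2008, 1, 1))

print(should_archive(contract, today))  # False
print(should_archive(invoice, today))   # True
```

Both records have the same dates, yet only the invoice is archived: the exception on the contract overrides the standard seven-year rule until the contract ends.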

Where Data Retention Policies fit into the overall data management strategy

Information Lifecycle Management (ILM) is a wide-ranging set of strategies for managing data throughout its lifecycle.  The Information Lifecycle defines where and how to store data and for what period of time based on its classification.

The ILM lifecycle consists of the following phases:

  • Create – The creation or receipt of records is the origination point of the lifecycle continuum.  Records can include correspondence, reports, documents that are created or received, and other media.
  • Classify – An essential step in the ILM strategy is identifying which data is important, where it originates and how it flows through the organization, where it is located, and what must be retained.  Key considerations include legal or business requirements around storing and retrieving data, mapping out what happens to this data over time, and what degree of availability and protection is needed.
  • Use & Distribute – Once information has been created, consider how it is used and managed within the organization.  Records may be distributed internally or externally.  Failure to establish a rational system to manage and store information makes later retrieval nearly impossible.
  • Archive – Over time, records that are used less frequently become less valuable but continue to take up valuable space in operational databases.  Archiving, or moving records to a cost-effective, secure, and less available tier based on a document retention policy, is a best practice for reducing litigation exposure and freeing up space in applications, which in turn improves application and reporting performance.  In exceptional cases, some information may need to be retained longer than the assigned policy, either because a legal hold is required to keep it available for litigation or because certain records continue to hold their value over time.
  • Retain & Secure – In today’s climate of constant security breaches, where sensitive data is leaked to the public, organizations need to focus on securing data throughout the lifecycle.  The choice of where to store your data and which vendors to entrust it to is an important consideration in this phase, as it protects the integrity of the data from threats including hackers, employee theft, and natural disasters.
  • Purge – Records that have outlived both their useful value and the document retention period should be removed permanently.
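
The Archive and Purge phases, together with the legal hold exception, can be expressed as a small decision function.  The sketch below is an assumption-laden illustration: the `Step` names, the `next_step` function, and the default thresholds of seven and ten years are hypothetical and would be replaced by the values in your own policy.

```python
from enum import Enum


class Step(Enum):
    KEEP_LIVE = "keep live"
    ARCHIVE = "archive"
    PURGE = "purge"
    HOLD = "hold"


def next_step(years_inactive: float, legal_hold: bool = False,
              archive_after: float = 7, purge_after: float = 10) -> Step:
    """Decide what to do with a record (thresholds are assumed, in years)."""
    # A legal hold overrides the schedule: nothing is moved or purged.
    if legal_hold:
        return Step.HOLD
    if years_inactive >= purge_after:
        return Step.PURGE
    if years_inactive >= archive_after:
        return Step.ARCHIVE
    return Step.KEEP_LIVE


for years, hold in [(2, False), (8, False), (12, False), (12, True)]:
    print(years, hold, next_step(years, hold).value)
```

Checking the legal hold first means a record under litigation is never archived or purged by accident, regardless of its age.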

Data retention policies play a critical role in an organization’s overall ILM approach and should be carefully crafted and implemented, as they have long-term implications for the organization.  As Figure 2 shows, data retention policies play a pivotal role in the Classify, Archive, and eventually the Retain / Hold / Purge phases.

Figure 2. The Information Lifecycle Management Framework

Do you have additional strategies or considerations that you feel are important to add to the conversation?  If so, tell us what you think!  Are you considering implementing an ILM framework in your organization?  If so, perhaps our team can help.

