Backup Tapes and Archives Bursting at the Seams? The Seven Year Itch Has Technology to Answer the Scratch
by Allison Walton on December 12th, 2011
Just as Marilyn Monroe stopped traffic in her white dress in The Seven Year Itch, enterprises are being stopped dead in their tracks by the data explosion, a lack of information governance policies and overstuffed IT infrastructures. During the 2004-05 timeframe, a large number of enterprises began migrating to an archive, and this trend has kept a steady pace since. Archiving began with email, but has recently been extended to many other forms of information, including social media, unstructured data and cloud content. This adoption was related in part to the historic Zubulake ruling, which required preservation to attach upon “reasonable anticipation of litigation.” Another significant driver behind the need for an archive is the ability to comply with a range of statutes and regulations. The reality is that it is difficult to preserve efficiently and defensibly without an archive and other automatic classification technologies. Some companies still complete the information management and eDiscovery processes manually, but not without peril.
Currently, there is a sudden upsurge in corporations finally starting to shrink the archives that they implemented to manage email, legal preservation requirements and regulatory compliance. After roughly seven years, over which time there have been many advances in technology, a shift in thinking is taking place with regard to information governance and data retention. Change has been born out of necessity, as infrastructures are straining under the amount of data they retain and the pains associated with searching that data. This shift will enable companies to delete with confidence, clean up their backup tapes, shrink their archives, and manage and expire data effectively on a go-forward basis. Collectively, this type of good information governance hygiene allows organizations to minimize the litigation risk that comes with bloated information stores.
One reason many archives have become so bloated is that many enterprises purchased archiving software but did not properly enable expiry procedures according to a defensible document retention policy. The result was that everything was saved for the past seven or so years. Another reason for retaining all data in the archive was that enterprises were afraid to delete anything, fearing accusations of spoliation or the inability to retrieve data that should have been on legal hold. Together, these two factors have forced companies to search this massive amount of archived data each time a matter arises. The resulting workflow for data collection is time consuming and expensive, especially for companies that still employ third party vendors for data collection. For many organizations, the situation has become unsustainable from both a legal and IT perspective.
In recent years, backup has been given less attention as archives have become more common, storage has become more affordable, and most lawyers argue that tapes are “inaccessible,” making restoration less common. However, over-retention of backup is still an area of concern, especially when organizations do not have an archive. They may be required to produce backup tapes because much of the information relevant to a matter could be contained therein. This has led to saving large numbers of backup tapes with no real knowledge of what data is on the tapes and no one wanting to be accountable for pulling the trigger on deletion. When an organization is forced to restore backup tapes, the process can be expensive and an eDiscovery nightmare.
For example, in Moore v. Gilead Sciences (N.D. Cal. Nov. 16, 2011), the plaintiff sought production of “all archived emails” that he sent or received during his five-year tenure with the defendant pharmaceutical company. The company objected to the request as unduly burdensome, arguing that:
- The emails were exclusively stored on its disaster recovery backup tapes;
- It would cost $360,000 to index those tapes, exclusive of processing and review costs;
- Many of the requested emails would not be retrieved since the company conducted its backups on monthly (not daily) intervals; and
- Over 25,000 pages of the plaintiff’s emails had already been produced in the litigation.
Parties commonly argue that backup tapes are inaccessible and that indexing and restoring them would be unduly burdensome. However, where a discovery dispute has merit, courts routinely reject projected cost estimates (such as the company’s $360,000 figure) as unfounded or speculative and order production nevertheless. [See Pippins v. KPMG and Escamilla v. SMS Holdings Corp.] Had the judge gone the other way on restoration in Moore, the outcome could easily have been expensive and detrimental to the company.
What does this mean for organizations keeping seven years or more of legacy content? Firstly, take inventory of where backup tapes reside and determine whether they need to be saved or can be deleted. Most corporations have amassed many tapes that are only a legal liability at this point. Technology exists today that can index and search what is on the tapes, enabling educated decisions about whether to delete the data or transfer it to the archive for legal hold. Essentially, new technology can give sight to the blind. Those decisions must be made according to a plan and documented. Backup should only be for disaster recovery.
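The tape triage described above can be illustrated with a minimal sketch. It assumes an indexing tool has already reported, for each tape, the date of its newest item and the custodians whose data it holds. The `Tape` record, the `triage_tape` function, the decision labels and the disaster recovery window are all hypothetical names chosen for illustration, not the output of any particular product.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass
class Tape:
    label: str
    newest_item: date       # date of the most recent item found when the tape was indexed
    custodians: frozenset   # people whose data the index found on the tape


def triage_tape(tape, custodians_on_hold, dr_window_days, today):
    """Return a (decision, reason) pair so every decision is documented."""
    held = tape.custodians & custodians_on_hold
    if held:
        # Legal hold always wins: move the data to the archive before the tape is touched.
        return ("transfer_to_archive",
                "holds data for custodians on legal hold: " + ", ".join(sorted(held)))
    if today - tape.newest_item <= timedelta(days=dr_window_days):
        # Recent tapes still serve their only intended purpose, disaster recovery.
        return ("keep_for_disaster_recovery", "inside the disaster recovery window")
    return ("delete", "outside the disaster recovery window and no legal hold applies")


# Hypothetical inventory and hold list.
today = date(2011, 12, 12)
on_hold = {"custodian_a"}
inventory = [
    Tape("TAPE-001", date(2006, 3, 1), frozenset({"custodian_a", "custodian_b"})),
    Tape("TAPE-002", date(2011, 11, 30), frozenset({"custodian_b"})),
    Tape("TAPE-003", date(2005, 1, 15), frozenset({"custodian_c"})),
]

for tape in inventory:
    decision, reason = triage_tape(tape, on_hold, dr_window_days=90, today=today)
    print(tape.label, decision, reason)
```

Returning the reason alongside the decision reflects the point that deletion decisions need to follow a plan and leave a record.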
Secondly, purchase an archive if the company does not yet have one, and configure it to expire data according to the document retention policy. A consistently applied policy can help protect the company’s data decisions under Safe Harbor laws.
Is the company experiencing what many others are right now, which is an archive that is bursting at the seams? If the company does have an archive, check to see if expiry has been properly deployed according to the company’s policy. If not, initiate a project to free up the archive from information retention that is unnecessary and that should not be subject to discovery. Again, this must be documented. Archives are for discovery and they need to be lean, efficient, and executing the information management lifecycle.
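As a rough illustration of what properly deployed expiry might look like, the sketch below applies a retention schedule to archived items, exempts anything on legal hold, and logs every decision. The `ArchivedItem` record, the `apply_retention` function, the category names and the retention periods are hypothetical placeholders; real periods come from the company’s own document retention policy.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class ArchivedItem:
    item_id: str
    category: str        # hypothetical category label, e.g. "email"
    created: date
    on_legal_hold: bool


# Hypothetical retention periods in days; not taken from any statute or product.
RETENTION_DAYS = {"email": 3 * 365, "financial_record": 7 * 365}


def apply_retention(items, today):
    """Return (kept_items, log), where log records (item_id, action, reason)."""
    kept, log = [], []
    for item in items:
        limit = RETENTION_DAYS.get(item.category)
        age_days = (today - item.created).days
        if item.on_legal_hold:
            # Legal hold overrides the retention schedule.
            kept.append(item)
            log.append((item.item_id, "retained", "legal hold"))
        elif limit is None:
            # No policy for this category: keep it and flag it for review.
            kept.append(item)
            log.append((item.item_id, "retained", "no policy for category"))
        elif age_days > limit:
            log.append((item.item_id, "expired", f"older than {limit} days"))
        else:
            kept.append(item)
            log.append((item.item_id, "retained", "within retention period"))
    return kept, log


# Hypothetical archive contents.
archive = [
    ArchivedItem("msg-1", "email", date(2005, 1, 1), on_legal_hold=False),
    ArchivedItem("msg-2", "email", date(2005, 1, 1), on_legal_hold=True),
    ArchivedItem("msg-3", "email", date(2011, 1, 1), on_legal_hold=False),
]
kept, log = apply_retention(archive, today=date(2011, 12, 12))
for entry in log:
    print(entry)
```

The log stands in for the documentation the cleanup project requires: each item is either expired under a stated rule or retained for a stated reason.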
Avoid the request for backup tapes in litigation by having a sufficient archive and clearly stating that backup tapes are solely for disaster recovery. Delete tapes when possible by analyzing what is on them with appropriate technology and through a documented process that will eliminate the possibility of them being discoverable in litigation.
In sum, it is very helpful to examine the EDRM model and carve out what technologies and policies will apply to each aspect of the continuum. For the challenges addressed in this blog, both backup tapes and the archive fall under information management, all the way to the left of the model. Backup tapes need search and expiry so that only what is necessary for legal hold is retained and ingested into an archive; then, the company’s disaster recovery policies should be enforced on a go-forward basis. Similarly, the archive needs search and expiration according to document retention policies so it does not become overgrown. From left to right, the model logically walks through the lifecycle of data, and many of the responsibilities associated with the data can be automated. This project will require commitment, resources and time, but given that data is only growing, there aren’t any other options.