Will Predictive Coding Live Up to the eDiscovery Hype?

by Philip Favro on May 14th, 2012

The myriad of published material regarding predictive coding technology has almost universally promised reduced costs and lighter burdens for the eDiscovery world. Indeed, until the now famous order was issued in the Da Silva Moore v. Publicis Groupe case “approving” the use of predictive coding, many in the industry had parroted this “lower costs/lighter burdens” mantra like the retired athletes who chanted “tastes great/less filling” during the 1970s Miller Lite commercials. But a funny thing happened on the way to predictive coding satisfying the cost cutting mandate of Federal Rule of Civil Procedure 1: the same old eDiscovery story of high costs and lengthy delays is plaguing the initial rollout of this technology. The three publicized cases involving predictive coding are particularly instructive on this early but troubling development.

Predictive Coding Cases

In Moore v. Publicis Groupe, the plaintiffs’ attempt to recuse Judge Peck has diverted the spotlight from the costs and delays associated with use of predictive coding. Indeed, the parties have been wrangling for months over the parameters of using this technology for defendant MSL’s document review. During that time, each side has incurred substantial attorney fees and other costs to address fairly routine review issues. This tardiness figures to continue as the parties now project that MSL’s production will not be complete until September 7, 2012. Even that date seems too sanguine, particularly given Judge Peck’s recent observation about the slow pace of production: “You’re now woefully behind schedule already at the first wave.” Moreover, Judge Peck has suggested on multiple occasions that a special master be appointed to address disagreements over relevance designations. Special masters, production delays, additional briefings and related court hearings all lead to the inescapable conclusion that the parties will be saddled with a huge eDiscovery bill (despite presumptively lower review costs) due to the use of predictive coding technology.

The Kleen Products v. Packaging Corporation case is also plagued by cost and delay issues. As explained in our post on this case last month, the plaintiffs are demanding a “do-over” of the defendants’ document production, insisting that predictive coding technology be used instead of keyword search and other analytical tools. Setting aside plaintiffs’ arguments, the costs the parties have incurred in connection with this motion are quickly mounting. After the parties submitted briefings on the issues, the court held two hearings on the matter, including a full day of testimony from the parties’ experts. With another “Discovery Hearing” now on the docket for May 22nd, predictive coding has essentially turned an otherwise routine document production query into an expensive, time consuming sideshow with no end in sight.

Cost and delay issues may very well trouble the parties in the Global Aerospace v. Landow Aviation matter, too. In Global Aerospace, the court acceded to the defendants’ request to use predictive coding technology over the plaintiffs’ objections. Despite allowing the use of such technology, the court provided plaintiffs with the opportunity to challenge the “completeness or the contents of the production or the ongoing use of predictive coding technology.” Such a condition essentially invites plaintiffs to re-litigate their objections through motion practice. Moreover, like the proverbial “exception that swallows the rule,” the order allows for the possibility that the court could withdraw its approval of predictive coding technology. All of which could lead to seemingly endless discovery motions, production “re-dos” and inevitable cost and delay issues.

Better Times Ahead?

At present, the Da Silva Moore, Kleen Products and Global Aerospace cases do not suggest that predictive coding technology will “secure the just, speedy, and inexpensive determination of every action and proceeding.” Nevertheless, there is room for considerable optimism that predictive coding will ultimately succeed. Technological advances in the industry will provide a level of transparency into the black box of predictive coding technology that has not existed to date. Additional advances should also lead to easy-to-use workflow management consoles, which will in turn increase defensibility of the process and satisfy legitimate concerns regarding production results, such as those raised by the plaintiffs in Moore and Global Aerospace.

Technological advances that also increase the accuracy of first generation predictive coding tools should yield greater understanding and acceptance about the role predictive coding can play in eDiscovery. As lawyers learn to trust the reliability of transparent predictive coding, they will appreciate how this tool can be deployed in various scenarios (e.g., prioritization, quality assurance for linear review, full scale production) and in connection with existing eDiscovery technologies. In addition, such understanding will likely facilitate greater cooperation among counsel, a lynchpin for expediting the eDiscovery process. This is evident from the Moore, Kleen Products and Global Aerospace cases, where a lack of cooperation has caused increased costs and delays.

With the promise of transparency and simpler workflows, predictive coding technology should eventually live up to its billing of helping organizations discover their information in an efficient, cost effective and defensible manner. As for now, the “promise” of first generation predictive coding tools appears to be nothing more than that, leaving organizations looking like the cash-strapped “Monopoly man,” wondering where their litigation dollars have gone.

Morton’s Fork, Oil Filters and the Nexus with Information Governance

by Dean Gonsowski on May 10th, 2012

Those old enough to have watched TV in the early eighties will undoubtedly remember the FRAM oil filter slogan where the mechanic utters his iconic catchphrase: “You can pay me now, or pay me later.”  The gist of the vintage ad was that the customer could either pay a small sum now for the replacement of an oil filter, or a far greater sum later for the replacement of the car’s entire engine.

This choice between two unpleasant alternatives is sometimes called a Morton’s Fork (but typically only when both choices are equal in difficulty).  The saying (not to be confused with the equally colorful Hobson’s Choice) apparently originated with the collection of taxes by John Morton (the Archbishop of Canterbury) in the late 15th century.  Morton was apparently fond of saying that a man living modestly must be saving money and could therefore afford to pay taxes, whereas if he was living extravagantly then he was obviously rich and could still afford them.[i]

This “pay me now/pay me later” scenario perplexes many of today’s organizations as they try to effectively govern (i.e., understand, discover and retain) electronically stored information (ESI).  The challenge is similar to the oil filter conundrum, in that companies can often make rather modest up-front retention/deletion decisions that help prevent monumental, downstream eDiscovery charges.

This exponential gap has been illustrated recently by a number of surveys contrasting the cost of storage with the cost of conducting basic eDiscovery tasks (such as preservation, collection, processing, review and production).  In a recent AIIM webcast it was noted that “it costs about 20¢/day to buy 1GB of storage, but it costs around $3,500 to review that same gigabyte of storage.” And, it turns out that the $3,500 review estimate (which sounds prohibitively expensive, particularly at scale) may actually be on the conservative side.  The review phase accounts for roughly 70 percent of total eDiscovery costs; the other 30 percent covers upstream costs for preservation, collection and processing.
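
For a back-of-the-envelope view of that arithmetic, here is a minimal sketch. It assumes the review share is exactly 70 percent and applies it to the $3,500 per-gigabyte review estimate quoted above; the function name is made up for illustration.

```python
def total_cost_per_gb(review_cost_per_gb: float, review_share: float) -> float:
    """Scale a review-only cost up to an all-in eDiscovery cost.

    review_share is the fraction of total cost attributed to review
    (assumed here to be 0.70, per the rough split cited in the post).
    """
    if not 0 < review_share <= 1:
        raise ValueError("review_share must be in (0, 1]")
    return review_cost_per_gb / review_share


review = 3500.0                      # review estimate from the AIIM quote
total = total_cost_per_gb(review, 0.70)
upstream = total - review            # preservation, collection, processing

print(f"all-in cost per GB:   ${total:,.0f}")
print(f"upstream cost per GB: ${upstream:,.0f}")
```

On those assumptions the all-in figure comes to roughly $5,000 per gigabyte, of which about $1,500 is upstream work.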

Similarly, in a recent Enterprise Strategy Group (ESG) paper the authors noted that eDiscovery costs range anywhere from $5,000 to $30,000 per gigabyte, citing the Minnesota Journal of Law, Science & Technology.  This $30,000 figure is also roughly in line with other per-gigabyte eDiscovery costs, according to a recent survey by the RAND Corporation.  In an article entitled “Where the Money Goes — Understanding Litigant Expenditures for Producing Electronic Discovery” authors Nicholas M. Pace and Laura Zakaras conducted an extensive analysis and concluded that “… the total costs per gigabyte reviewed were generally around $18,000, with the first and third quartiles in the 35 cases with complete information at $12,000 and $30,000, respectively.”

Given this range of estimates, the $18,000 per gigabyte metric is probably a good midpoint figure that advocates of information governance can use to contrast with the exponentially lower baseline costs of buying and maintaining storage.  It is this stark (and startling) gap between pure information costs and the expenses of eDiscovery that shows how important it is to calculate latent “information risk.”  If you also add in the risks for sanctions due to spoliation, the true (albeit still murky) information risk portrait comes into focus.  It is this calculation that is missing when legal goes to bat to argue about the necessity of information governance solutions, particularly when faced with the host of typical objections (“storage is cheap” … “keep everything” … “there’s no ROI for proactive information governance programs”).

The good news is that as the eDiscovery market continues to evolve, practitioners (legal and IT alike) will come to a better and more holistic understanding of the latent information risk costs that the unchecked proliferation of data causes.  It will be this increased level of transparency that permits the budding information governance trend to become a dominant umbrella concept that unites Legal and IT.



[i] Insert your own current political joke here…

Look Before You Leap! Avoiding Pitfalls When Moving eDiscovery to the Cloud

by Philip Favro on May 7th, 2012

It’s no surprise that the eDiscovery frenzy gripping the American legal system over the past decade has become increasingly expensive.  Particularly costly to organizations is the process of preserving and collecting documents, a fact repeatedly emphasized by the Advisory Committee in its report regarding the 2006 amendments to the Federal Rules of Civil Procedure (FRCP).  These aspects of discovery are often lengthy and can be disruptive to business operations.  Just as troubling, they increase the duration and expense of litigation.

Because these costs and delays affect the courts as well as clients, it comes as no surprise that judges have now heightened their expectations for how organizations store, manage and discover their electronically stored information (ESI).  Gone are the days when enterprises could plead ignorance for not preserving or producing their data in an efficient, cost effective and defensible manner.  Organizations must now follow best practices – both during and before litigation – if they are to safely navigate the stormy seas of eDiscovery.

The importance of deploying such practices applies acutely to those organizations that are exploring “cloud”-based alternatives to traditional methods for preserving and producing electronic information.  Under the right circumstances, the cloud may represent a fantastic opportunity to streamline the eDiscovery process for an organization.  Yet it could also turn into a dangerous liaison if the cloud offering is not properly scrutinized for basic eDiscovery functionality.  Indeed, the City of Los Angeles’s recent decision to partially disengage from its cloud service provider exemplifies this admonition to “look before you leap” to the cloud.  Thus, before selecting a cloud provider for eDiscovery, organizations should be particularly careful to ensure that a provider has the ability both to efficiently retrieve data from the cloud and to issue litigation hold notices.

Effective Data Retrieval Requires Efficient Data Storage

The hype surrounding the cloud has generally focused on the opportunity for cheap and unlimited storage of information.  Storage, however, is only one of many factors to consider in selecting a cloud-based eDiscovery solution.  To be able to meet the heightened expectations of courts and regulatory bodies, organizations must have the actual – not theoretical – ability to retrieve their data in real time.  Otherwise, they may not be able to satisfy eDiscovery requests from courts or regulatory bodies, let alone the day-to-day demands of their operations.

A key step to retrieving company data in a timely manner is to first confirm whether the cloud offering can intelligently organize that information such that organizations can quickly respond to discovery requests and other legal demands.  This includes the capacity to implement and observe company retention protocols.  Just like traditional data archiving software, the cloud must enable automated retention rules and thus limit the retention of information to a designated time period.  This will enable data to be expired once it reaches the end of that period.

The pool of data can be further decreased through single instance storage.  This deduplication technology eliminates redundant data by preserving only a master copy of each document placed into the cloud.  This will reduce the amount of data that needs to be identified, preserved, collected and reviewed as part of any discovery process.  For while unlimited data storage may seem ideal now, reviewing unlimited amounts of data will quickly become a logistical and costly nightmare.
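
To make the idea concrete, here is a minimal sketch of single instance storage. It assumes that "duplicate" means byte-for-byte identical content and uses a SHA-256 digest as the key; the document names and the function are hypothetical, and a real cloud offering may define duplicates differently.

```python
import hashlib


def single_instance_store(documents: dict) -> tuple:
    """Keep one master copy per distinct content.

    Returns (masters, index): masters maps a content digest to the single
    stored copy, and index maps every original name to its digest so each
    custodian's reference can still be resolved.
    """
    masters = {}
    index = {}
    for name, content in documents.items():
        digest = hashlib.sha256(content).hexdigest()
        masters.setdefault(digest, content)  # store only the first copy
        index[name] = digest
    return masters, index


docs = {
    "alice/report.docx": b"Q1 results",
    "bob/report-copy.docx": b"Q1 results",   # same bytes as Alice's copy
    "carol/memo.txt": b"Lunch on Friday",
}
masters, index = single_instance_store(docs)
print(len(docs), "documents stored as", len(masters), "master copies")
```

Every name still resolves through the index, so nothing is lost for discovery purposes; only the number of copies that must be reviewed goes down.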

Any viable cloud offering should also have the ability to suspend automated document retention/deletion rules to ensure the adequate preservation of relevant information.  This goes beyond placing a hold on archival data in the cloud.  It requires that an organization have the ability to identify the data sources in the cloud that may contain relevant information and then modify aspects of its retention policies to ensure that cloud-stored data is retained for eDiscovery.  Taking this step will enable an organization to create a defensible document retention strategy and be protected from court sanctions under the Federal Rule of Civil Procedure 37(e) “safe harbor.”  The decision from Viramontes v. U.S. Bancorp (N.D. Ill. Jan. 27, 2011) is particularly instructive on this issue.

In Viramontes, the defendant bank defeated a sanctions motion because it timely modified aspects of its email retention policy.  The bank implemented a policy that kept emails for 90 days, after which the emails were deleted.  That policy was promptly suspended, however, once litigation was reasonably foreseeable.  Because the bank followed that procedure in good faith, it was protected from sanctions under Rule 37(e).

As the Viramontes case shows, an organization can be prepared for eDiscovery disputes by appropriately suspending aspects of its document retention policies.  By creating and then faithfully observing a procedure that requires retention policies to be suspended on the occurrence of litigation or another triggering event, an organization can develop a defensible retention procedure. Having such eDiscovery functionality in a cloud provider will likely facilitate an organization’s eDiscovery process and better insulate it from litigation disasters.
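
The interplay between an automated deletion rule and a hold can be sketched in a few lines. This is only an illustration: the 90-day period echoes the policy described in Viramontes, while the data model, the function name and the choice to key the hold on custodians are assumptions made for the example.

```python
from dataclasses import dataclass
from datetime import date, timedelta

RETENTION_DAYS = 90  # mirrors the 90-day email policy described in Viramontes


@dataclass
class Email:
    message_id: str
    custodian: str
    received: date


def emails_to_delete(emails, today, custodians_on_hold):
    """Return emails past retention, skipping any custodian under a hold."""
    cutoff = today - timedelta(days=RETENTION_DAYS)
    return [
        e for e in emails
        if e.received < cutoff and e.custodian not in custodians_on_hold
    ]


today = date(2012, 5, 7)
mailbox = [
    Email("m1", "alice", date(2012, 1, 10)),  # older than 90 days
    Email("m2", "bob", date(2012, 1, 10)),    # older than 90 days
    Email("m3", "alice", date(2012, 4, 20)),  # still within retention
]

no_hold = emails_to_delete(mailbox, today, set())
with_hold = emails_to_delete(mailbox, today, {"alice"})
print("no hold:  ", [e.message_id for e in no_hold])
print("with hold:", [e.message_id for e in with_hold])
```

Once a hold names Alice as a custodian, her expired message drops off the deletion list while routine expiry continues for everyone else.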

The Ability to Issue Litigation Hold Notices

To be effective for eDiscovery purposes, a cloud service provider must also enable an organization to deploy a litigation hold to prevent users from destroying data. Unless the cloud has litigation hold technology, the entire discovery process may very well collapse.  For electronic data to be produced in litigation, it must first be preserved.  And it cannot be preserved if the key players or data source custodians are unaware that such information must be retained.  Indeed, employees and data sources may discard and overwrite electronically stored information if they are oblivious to a preservation duty.

A cloud service provider should therefore enable automated legal hold acknowledgements.  Such technology will allow custodians to be promptly and properly notified of litigation and thereby retain information that might otherwise have been discarded.  Inadequate litigation hold technology leaves organizations vulnerable to data loss and court punishment.
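
As a rough illustration of what tracking acknowledgements involves, the sketch below records which custodians have confirmed a hold notice and lists those who still need a reminder. The class, the matter name and the custodian names are all hypothetical; no particular product works exactly this way.

```python
from dataclasses import dataclass, field


@dataclass
class LegalHold:
    matter: str
    custodians: set
    acknowledged: set = field(default_factory=set)

    def acknowledge(self, custodian: str) -> None:
        """Record that a custodian confirmed receipt of the hold notice."""
        if custodian not in self.custodians:
            raise KeyError(f"{custodian} is not a custodian on {self.matter}")
        self.acknowledged.add(custodian)

    def pending(self) -> list:
        """Custodians who have not yet acknowledged, in a stable order."""
        return sorted(self.custodians - self.acknowledged)


hold = LegalHold("Example v. Example Corp", {"alice", "bob", "carol"})
hold.acknowledge("alice")
print("still waiting on:", hold.pending())
```

The pending list is what would drive reminder notices and, later, help show that custodians were in fact told to preserve.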

Conclusion

Confirming that a cloud offering can quickly retrieve and efficiently store enterprise data while effectively deploying litigation hold notices will likely address the basic concerns regarding its eDiscovery functionality. Yet these features alone will not make that solution the model of eDiscovery cloud providers. Advanced search capabilities should also be included to reduce the amount of data that must be analyzed and reviewed downstream. In addition, the cloud ought to support load files in compatible formats for export to third party review software. The cloud should additionally provide an organization with a clear audit trail establishing that neither its documents nor their metadata were modified when transmitted to the cloud.  Without this assurance, an organization may not be able to comply with key regulations or establish the authenticity of its data in court. Finally, ensure that these provisions are memorialized in the service level agreement governing the relationship between the organization and the cloud provider.
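
One common way to back up the audit trail point is to fingerprint each document together with its metadata before upload and compare the fingerprint afterward. The sketch below assumes SHA-256 and a JSON serialization of the metadata; the function names and sample fields are hypothetical rather than any provider's actual interface.

```python
import hashlib
import json


def fingerprint(content: bytes, metadata: dict) -> str:
    """Hash a document and its metadata together into one digest."""
    canonical_meta = json.dumps(metadata, sort_keys=True).encode("utf-8")
    h = hashlib.sha256()
    h.update(len(content).to_bytes(8, "big"))  # keep content/metadata boundary unambiguous
    h.update(content)
    h.update(canonical_meta)
    return h.hexdigest()


def unchanged_in_transit(before: str, content: bytes, metadata: dict) -> bool:
    """True if the copy retrieved from the cloud matches the pre-upload digest."""
    return before == fingerprint(content, metadata)


original_meta = {"author": "alice", "modified": "2012-05-01T09:30:00"}
digest_before = fingerprint(b"contract draft", original_meta)

# Same bytes and metadata come back: the check passes.
print(unchanged_in_transit(digest_before, b"contract draft", original_meta))

# A changed modification date is enough to fail the check.
altered_meta = {"author": "alice", "modified": "2012-05-07T14:00:00"}
print(unchanged_in_transit(digest_before, b"contract draft", altered_meta))
```

Logging the pre-upload digest alongside the upload time gives the organization something specific to point to when authenticity is questioned.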

District Court Upholds Judge Peck’s Predictive Coding Order Over Plaintiff’s Objection

by Matthew Nelson on April 30th, 2012

In a decision that advances the predictive coding ball one step further, United States District Judge Andrew L. Carter, Jr. upheld Magistrate Judge Andrew Peck’s order in Da Silva Moore, et al. v. Publicis Groupe, et al. despite Plaintiff’s multiple objections. Although Judge Carter rejected all of Plaintiff’s arguments in favor of overturning Judge Peck’s [...]

First State Court Issues Order Approving the Use of Predictive Coding

by Matthew Nelson on April 26th, 2012 (1 Comment)

On Monday, Virginia Circuit Court Judge James H. Chamblin issued what appears to be the first state court Order approving the use of predictive coding technology for eDiscovery. Tuesday, Law Technology News reported that Judge Chamblin issued the two-page Order in Global Aerospace Inc., et al, v. Landow Aviation, L.P. dba Dulles Jet Center, et [...]

The 2012 EDGE Summit (21st Century Technology for Information Governance) Debuts In Nation’s Capital

by Allison Walton on April 23rd, 2012

The EDGE Summit this week is one of the most prestigious eDiscovery events of the year as well as arguably the largest for the government sector. This year’s topics and speakers are top notch. The opening keynote speaker will be the Director of Litigation for the National Archives and Records Administration (NARA), Mr. Jason Baron. The [...]

Breaking News: Court Clarifies Duty to Preserve Evidence, Denies eDiscovery Sanctions Motion Against Pfizer

by Philip Favro on April 18th, 2012

It is fortunately becoming clearer that organizations do not need to preserve information until litigation is “reasonably anticipated.” In Brigham Young University v. Pfizer (D. Utah Apr. 16, 2012), the court denied the plaintiff university’s fourth motion for discovery sanctions against Pfizer, likely ending its chance to obtain a “game-ending” eDiscovery sanction. The case, which [...]

Proportionality Demystified: How Organizations Can Get eDiscovery Right by Following Four Key Principles

by Philip Favro on April 17th, 2012

Talk to most any organization about legal issues and invariably the subject of eDiscovery will be raised. The skyrocketing costs and lengthy delays associated with data preservation and document review provide ample justification for organizations to be on the alert about eDiscovery. While these costs and delays tend to make the eDiscovery landscape appear bleak, [...]

Plaintiffs Ask Judge Nan R. Nolan to Go Out On a Limb in Kleen Products Predictive Coding Case

by Matthew Nelson on April 13th, 2012 (2 Comments)

While the gaze of the eDiscovery community has been firmly transfixed on the unfolding drama in the Da Silva Moore, et al. v. Publicis Groupe, et al. predictive coding case, an equally important case in the Northern District of Illinois has been quietly flying under the radar. I recently traveled to Chicago to attend the [...]

Take Two and Call me in the Morning: U.S. Hospitals Need an Information Governance Remedy

by Allison Walton on April 11th, 2012

Given the vast amount of sensitive information and legal exposure faced by hospitals today it’s a mystery why these organizations aren’t taking advantage of enabling technologies to minimize risk. Compliance with both HIPAA and the HITECH Act is often achieved by manual, ad hoc methods, which are hazardous at best. In the past, state and federal auditing [...]
