The eDiscovery Trinity: Spoliation Sanctions, Keywords and Predictive Coding
Monday, May 20th, 2013
The world of eDiscovery appears to be revolving around a trifecta of issues that are important to both clients and counsel. A discovery-focused conversation with litigants and lawyers in 2013 will almost invariably turn to some combination of this eDiscovery trinity: spoliation sanctions, keyword searches and predictive coding. This should not come as a surprise since all three of these issues can have a strong impact on the cost, duration and disposition of a lawsuit. Indeed, the near-universal desire among parties to minimize discovery costs and thereby further the resolution of cases on the merits has driven the Civil Rules Advisory Committee to explore ways to address the eDiscovery trinity in draft amendments to the Federal Rules.
While the proposed amendments may or may not succeed in reducing discovery expenses, the examples of how the eDiscovery trinity is playing out in litigation are instructive. These cases – bereft of the additional guidance being developed by the Advisory Committee – provide valuable insight on how courts, counsel and clients are handling the convergence of these issues. One such example is a recent decision from the DuPont v. Kolon Industries case.
Spoliation, Keywords and a $4.5 Million Sanction
In DuPont, the court awarded the plaintiff manufacturer $4.5 million in fees and costs that it incurred as part of its effort to address Kolon’s spoliation of ESI. In an attempt to stave off the award, Kolon argued that DuPont’s fees were not justified due to “inefficiencies” associated with DuPont’s review of Kolon’s document productions. In particular, Kolon complained about the extensive list of search terms that DuPont developed to comb through the ESI Kolon produced. According to Kolon, DuPont’s search methodology was “recklessly inefficient”:
DuPont’s forensic experts ran a list of almost 350 “keywords,” which yielded thousands of “false positives” that nevertheless had to be translated, analyzed, and briefed. Of the nearly 18,000 “hits,” only 1,955 (roughly 10 percent) were determined to be even “potentially relevant.” Thus, to state the obvious, 90 percent of the results were wholly irrelevant to the issue, but DuPont still seeks to tax Kolon for having the bulk of those documents translated and analyzed.
Kolon then asserted that the “reckless inefficiency” of the search methodology was “fairly attributable to the fact that DuPont ran insipid keywords like ‘other,’ ‘news,’ and ‘mail.’” Had DuPont been more precise with its keyword searches, argued Kolon, it “would have saved vast amounts of time and money.”
Before addressing the merits of Kolon’s arguments, the court observed how important search terms had become in discovery:
Of course, in the current world of litigation, where so many documents are stored and, hence, produced, electronically, the selection of search terms is an important decision because it, in turn, drives the subsequent document discovery, production and review.
After doing so, the court rejected Kolon’s arguments, finding instead that DuPont’s search methodology was reasonable under the circumstances. The court based its decision on the source of those search terms (derived from Kolon documents suggesting that ESI had been deleted), the “considerable volume” of Kolon’s productions and the nature of DuPont’s search (an investigation for deleted evidence).
The Impact of Predictive Coding on DuPont’s Search Efficiency
While DuPont considered the issues of spoliation and keywords in connection with the imposition of attorney fees and costs, it was silent on the impact that predictive coding might have had on the fee award. Indeed, neither the court’s order, nor the parties’ briefing considered whether the proper application of machine learning technology could have raised the success rate of DuPont’s searches for documents relevant to Kolon’s spoliation above the ten percent (10%) figure cited by Kolon.
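The "success rate" at issue is what search practitioners usually call precision: the share of hits that turn out to be relevant. As a rough illustration only, the short Python sketch below recomputes that figure from the numbers quoted in Kolon's brief. The function name `precision` is hypothetical, and treating "nearly 18,000" hits as exactly 18,000 is an assumption made for the arithmetic.

```python
def precision(relevant_hits: int, total_hits: int) -> float:
    """Return the fraction of search hits that were relevant."""
    if total_hits <= 0:
        raise ValueError("total_hits must be positive")
    return relevant_hits / total_hits


# Figures quoted from Kolon's brief. "Nearly 18,000" is rounded to
# 18,000 here, which is an assumption, so the result is approximate.
total_hits = 18_000
potentially_relevant = 1_955

rate = precision(potentially_relevant, total_hits)
non_relevant = total_hits - potentially_relevant

print(f"Precision: {rate:.1%}")              # about 10.9%
print(f"Non-relevant hits: {non_relevant}")  # 16045
```

On these assumed inputs the rate comes out slightly under 11 percent, consistent with the "roughly 10 percent" that Kolon cited. The calculation says nothing about recall, that is, how many relevant documents the keywords missed, which neither the order nor the briefing quantified.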
On the one hand, many eDiscovery cognoscenti would likely assert that a properly applied predictive coding solution could have produced the same corpus of relevant documents at a fraction of the cost and effort. Others, however, might argue that predictive coding perhaps would not yield the results that DuPont obtained through keyword searches given that DuPont was looking for evidence of deleted ESI. Still others would contend that the issue is moot since DuPont was fully within its right to determine how it should conduct the search of Kolon’s document productions.
Whether predictive coding could have made a difference in DuPont is entirely speculative. Regardless, the debate over keyword searches versus machine learning technology will likely continue unabated. As it stands, the DuPont case, together with the recent decision from Apple v. Samsung, confirms that keywords may be an acceptable method for conducting searches for relevant ESI. The issue, as the DuPont court observed, turns on “the selection of the search terms.”
Nevertheless, the promise of predictive coding cannot be ignored, particularly if the technology could ultimately reduce the cost and duration of discovery. Given that this debate is far from settled, these issues, along with spoliation sanctions, will likely continue to dominate the eDiscovery airwaves for the foreseeable future.








