In Re: Biomet Order Addresses Hot Button Predictive Coding Issue

by Matthew Nelson on December 20th, 2013 (2 Comments)

United States District Judge Robert L. Miller, Jr. of the Northern District of Indiana recently addressed what has arguably become the hottest predictive coding issue since Judge Andrew J. Peck’s February 2012 order in Da Silva Moore v. Publicis Groupe. The issue is whether parties who use predictive coding technology to assist with document productions should disclose to the other side the non-responsive documents used to train their system.

Judge Peck Opens the Predictive Coding Door

In Da Silva Moore, Judge Peck became the first judge to state that the use of predictive coding technology is “acceptable in appropriate cases.” Since the decision, some litigation attorneys have criticized the predictive coding protocol the parties established. Central to that criticism is the inclusion of a provision calling for the disclosure of non-privileged documents used to train the predictive coding system.

Fearing a judicial trend, many attorneys have argued that the Federal Rules of Civil Procedure (Rules) simply do not require the disclosure of non-responsive documents under any circumstances. Others argue that a little cooperation and transparency between adversaries isn’t a bad thing when one party saves money and time and the other receives a more thorough production. Not surprisingly, both sides have eagerly awaited judicial guidance.

Judge Miller Tackles the Hot Button Issue

In In Re: Biomet, Judge Miller provided that long-awaited guidance by holding that Rule 26 does not require a party to disclose seed set documents used to train a predictive coding system. The order came on the heels of an earlier April 2013 order denying plaintiffs’ motion to compel Biomet to re-do earlier document productions (unless plaintiffs paid). The plaintiffs argued that Biomet’s decision to use keyword search terms and de-duplication techniques to cull 19.5 million documents down to 2.5 million before using predictive coding technology “tainted” the production process. More specifically, plaintiffs contended that using keywords to filter out documents likely excluded responsive documents that should have been produced. Judge Miller found plaintiffs’ arguments unconvincing, largely because Biomet had already spent approximately $1.07 million on eDiscovery.

Four months later plaintiffs filed another motion requesting more transparency into Biomet’s predictive coding process. Plaintiffs moved to compel Biomet to disclose and identify the initial seed set documents used to train the predictive coding system to distinguish between a responsive and non-responsive document. Plaintiffs reasoned that knowing which documents Biomet coded as responsive and non-responsive was necessary to measure the accuracy of Biomet’s production. In the order denying plaintiffs’ request, Judge Miller stated:

“As I understand it, a predictive coding algorithm offers up a document, and the user tells the algorithm to find more like that document or that the user doesn’t want more documents like what was offered up. The Steering Committee wants the whole seed set Biomet used for the algorithm’s initial training. That request reaches well beyond the scope of any permissible discovery by seeking irrelevant or privileged documents used to tell the algorithm what not to find. That the Steering Committee has no right to discover irrelevant or privileged documents seems self-evident.”

Judge Miller continued by acknowledging plaintiffs’ argument that Biomet was not proceeding in the cooperative spirit endorsed by the Sedona Conference Cooperation Proclamation and the 7th Circuit Pilot Program. However, he stated that:

“[N]either the Sedona Conference nor the Seventh Circuit project expands a federal district court’s powers, so they can’t provide me with authority to compel discovery of information not made discoverable by the Federal Rules.”

In particular, Judge Miller pointed to the language contained in FRCP 26(b)(1) as a basis for his decision. He concluded that because the plaintiffs knew of the “existence and location” of each discoverable document Biomet used in the seed set, Biomet had complied with its production obligation. Surprisingly, Judge Miller’s analysis did not specifically address what some may argue is the key language in FRCP 26(b)(1), which states:

“For good cause, the court may order discovery of any matter relevant to the subject matter involved in the action. Relevant information need not be admissible at the trial if the discovery appears reasonably calculated to lead to the discovery of admissible evidence.”

Judge Miller went on to criticize Biomet’s “unexplained lack of cooperation” and urged Biomet to rethink its refusal to at least reveal the responsive documents used in the seed set. His comments indicated that plaintiffs’ position would have been stronger had they requested only the identification of the responsive, non-privileged seed set documents. However, he ultimately refused to compel the identity of any of the seed set documents because he lacked “any discretion in this dispute.”

Is the Issue Resolved?

Even though Judge Miller explained that he lacked “any discretion in this dispute,” some future litigants are likely to argue that Rule 26 provides judges with the discretion to order the disclosure of documents that are both non-responsive and non-privileged where appropriate. For example, proponents of disclosure are likely to argue that coding decisions applied to training documents could have a significant impact on the discovery of admissible evidence. If training documents are coded accurately, the likelihood of discovering admissible evidence increases if that evidence exists. On the other hand, adversaries are likely to respond sharply that sharing non-responsive documents has not been required in the past and should not be required in the future. In fact, following Da Silva Moore, some have argued that even keywords are work-product protected and should not be disclosed.

Conclusion

In Re: Biomet appears to be the first case addressing whether or not parties are obligated to share non-responsive documents used to train a predictive coding system — but likely won’t be the last. First, the decision is not binding. Second, Judge Miller did not thoroughly address key language contained within 26(b)(1) which invites further analysis. Lastly, the legal industry is struggling to define predictive coding best practices and to understand the range of different predictive coding technology solutions. Given the current confusion, demands for more predictive coding transparency are likely to continue as the market evolves. Don’t expect this hot button issue to cool off any time soon.

*Blog post co-authored by Matt Nelson and Adam Kuhn

Music piracy the least of your audio worries; Dodd–Frank forces a closer listen

by Chris Talbott on December 11th, 2013

We’re quickly approaching another milestone in the epic implementation of the Commodity Futures Trading Commission (CFTC) rules associated with the Dodd-Frank Wall Street Reform and Consumer Protection Act (DFA): the expiration of a very contentious exemptive order that provided relief to cross-border swap dealers (SDs) and major swap participants (MSPs) and foreign branches of US SDs and MSPs. If you follow the heated debate between Wall Street and the CFTC, it is quite fitting that the order happens to expire on the winter solstice, December 21st, 2013. Let’s hope the day on which the sun comes to a standstill in the sky before reversing direction doesn’t forebode a similar experience in the cross-border free markets.

The 848 pages of Dodd-Frank legislation have resulted in (at current count) 67 new rules, exemptive orders, guidance and five ‘other’ actions from the CFTC, the regulatory body tasked with enforcing Title VII of the DFA. Prior to the DFA, the CFTC averaged about four rules per year. eDiscovery nerds will appreciate the fact that the complexity and length of the rules issued by the CFTC require a website that offers Proximity and Boolean search options to navigate. Within these 67 rules are critical adjustments to the way that organizations subject to the CFTC’s scope need to capture, store, manage, search and produce information related to the many flavors of swaps, which are basically derivatives by which counterparties exchange cash flows of one financial instrument for another. That information includes all data concerning the swap and communications leading up to the execution of the swap, including any voicemail or phone conversations with relevant information.

While audio discovery is nothing new, especially with regard to criminal investigations, these new regulations, rules and guidance have elevated audio data into the critical content sources category for many enterprises. Let’s discuss what that means for the eDiscovery technology world.

1. Audio search is now must-have eDiscovery functionality

If your organization is categorized as a swap data repository, derivatives clearing organization, designated contract market, swap execution facility, swap dealer, major swap participant or non-MSP counterparty (where most organizations outside financial services will be categorized), you are now subject to new rules for swap record keeping.

First, covered organizations must retain the following:

“…all oral and written communications provided or received concerning quotes, solicitations, bids, offers, instructions, trading, and prices, that lead to the conclusion of a related cash or forward transaction, whether communicated by telephone, voicemail, facsimile, instant messaging, chat rooms, electronic mail, mobile device, or other digital or electronic media.” 77 Fed. Reg. 17 CFR Part 45 (December 8 2010)

Secondly, this data has specific retention and retrieval requirements. At Symantec, we’re keeping track by categorizing them into the 5 & 5, 5 & 3 and 1 & 5 rules:

  • All the data above, except audio files, must be retained for 5 years after termination of the underlying swap.
  • For SDs and MSPs, that data must be retrievable and producible within 3 days.
  • For non-MSP counterparties, it must be retrievable and producible within 5 days.
  • Audio files must be kept for 1 year after termination of the swap and must also be retrievable and producible within 5 days.
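
The retention and production windows in the list above can be expressed as a simple lookup. What follows is a minimal Python sketch, not a compliance tool. The party and data-type labels and the function names are hypothetical, and the production window is counted in calendar days, which is an assumption: the summary above does not say whether calendar or business days apply.

```python
from datetime import date, timedelta

# (retention in years after swap termination, production window in days),
# taken from the 5 & 5, 5 & 3 and 1 & 5 summary above.
# The keys are hypothetical labels chosen for this sketch.
RULES = {
    ("sd_msp", "records"): (5, 3),
    ("non_msp", "records"): (5, 5),
    ("sd_msp", "audio"): (1, 5),
    ("non_msp", "audio"): (1, 5),
}


def add_years(start: date, years: int) -> date:
    """Add whole years, mapping Feb 29 to Feb 28 in non-leap years."""
    try:
        return start.replace(year=start.year + years)
    except ValueError:
        return start.replace(year=start.year + years, day=28)


def retain_until(party: str, kind: str, terminated: date) -> date:
    """Earliest date on which the data may be disposed of."""
    years, _ = RULES[(party, kind)]
    return add_years(terminated, years)


def produce_by(party: str, kind: str, requested: date) -> date:
    """Latest production date, assuming calendar days."""
    _, days = RULES[(party, kind)]
    return requested + timedelta(days=days)
```

For instance, `retain_until("sd_msp", "records", date(2013, 12, 21))` gives a retention date five years after that termination date, while the same call with `"audio"` gives one year.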

2. A turnkey ‘Dodd – Frank’ solution is unlikely, so a repeatable eDiscovery process is critical

As the CFTC rules were being finalized over the past two years, Symantec invited our customers to discuss the impact of the DFA on their eDiscovery workflows. A primary concern was the belief that the rules required organizations to have a system in place to store and eventually reproduce a trade and associated communications in their entirety. The many lobbyists and organizations that submitted grievances and clarification requests to the CFTC shared this concern. In response, the CFTC adjusted its rules to state that an organization’s swap data need not be categorized and retained in what amounts to a single-swap file, provided that all related information could be retrieved and produced from wherever it resides within the required timeframe.

Although the CFTC isn’t forcing organizations into the implementation of a magical swap data captor, data growth, diversification and dispersion across the organization could still present major challenges to collecting, searching and producing requested swap information on an ad hoc basis. For example, sales and marketing data, research information on commodity markets, email and instant message communications and voice data, would very often be found in multiple systems.

In order to comply, organizations should evaluate whether they have the ability to collect audio files and other information in a timely manner from multiple data repositories. If data is not retained in a per-swap manner, organizations will need to be able to consolidate all relevant communications and data into a single system so that the review is complete and auditable for requesting regulatory bodies. But pulling from these various sources is likely to collect a large amount of non-swap data. The ability to confidently exclude that non-swap information will help organizations curtail the potential time and costs associated with identifying the proper swap data. Finally, this process should be repeatable for each search, retrieval and production to the CFTC or Swap Data Repositories.

Side note: I’m writing with an eDiscovery-only lens, but the retention and management angle of this particular challenge lends itself to a proactive information governance discussion, one that our friends at eDiscovery Journal have touched upon already.

3. eDiscovery search capabilities must satisfy the unique nature of swap data

The DFA record keeping requirements for swaps are unique in that they require the combination of both static, database-like structured data (trade value, time, etc.) and unstructured communications (email, Bloomberg messages, voicemail, etc.). These communications will often bridge multiple systems: for instance, multiple emails and Bloomberg IMs may precede a phone call confirming the trade. Teams reviewing data prior to production to the CFTC or Swap Data Repositories will be challenged to make sense of the entire communication thread, especially under a five-day deadline. This review process is not one to be taken lightly either. Teams need to be extra careful with the search and review of all audio content, as they risk mistakenly producing spoken information unrelated to the trade, which is not as easily identified as written information.

Organizations should consider how quickly they could get the necessary information into a searchable form. Five days to retrieve and produce is slim at best, so even audio processing advantages, like phonetic-based audio indexing as opposed to speech-to-text transcription, could be critical. They should also consider how they can organize swap communications into a coherent form. Functionality like discussion threading and topic clustering can help teams quickly understand and identify communication related to a specific swap.

The Symantec eDiscovery team considered the Dodd-Frank Act and CFTC rules as we developed our latest release of the Clearwell eDiscovery Platform from Symantec, which now enables advanced audio processing, search, and review capabilities to drastically accelerate audio discovery efforts. In addition to supporting over 400 file types for electronic discovery, these new capabilities leverage a powerful phonetic engine that can index up to 20,000 hours of recorded audio per day. Whether you are investigating voicemails, call-center recordings, or financial transactions, Symantec makes it easy to find what you are looking for.

Kleen Products Update: Is Technology Usage Becoming the New “Proportionality” Factor for Judges?

by Matthew Nelson on October 30th, 2013

Readers may recall last year’s expensive battle over the use of predictive coding technology in the 7th Circuit’s Kleen Products case. Although the battle was temporarily resolved in Defendants’ favor (they were not required to redo their production using predictive coding or other “Content Based Advanced Analytics” software), a new eDiscovery battle has surfaced this year between Plaintiffs and a non-party, The Levin Group (“TLG”).

In Kleen, Plaintiffs allege anticompetitive and collusive conduct by a number of companies in the containerboard industry. The Plaintiffs served TLG with a subpoena requesting “every document relating to the containerboard industry.” TLG, a non-party retained as a financial and strategic consultant by two of the Defendants, complied by reviewing 21,000 documents comprising 82,000 pages of material.

Extraordinary Billing Rates for Manual Review?

The wheels began to fall off the bus when Plaintiffs received a $55,000 bill from TLG for the review and production of documents in response to the subpoena. TLG billed $500/hour for 110 hours of document review performed by TLG’s founder (a lawyer) and a non-lawyer employee. Although FRCP 45(c)(3)(C) authorizes “reasonable compensation” of a subpoenaed nonparty and the Court previously ordered the Plaintiffs to “bear the costs of their discovery request,” TLG and the Plaintiffs disagreed over the definition of “reasonable compensation” once the production was complete. Plaintiffs argue that the bill is excessive in light of market rates of $35-$45/hour charged by contract attorneys for review and they also claim that they never agreed to a billing rate.
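
To put the disputed figures side by side, here is a short Python sketch of the arithmetic. It uses only the numbers reported above. Holding the 110 hours constant at the market rates is an assumption made purely for comparison, since contract attorneys might review at a different pace.

```python
HOURS = 110
TLG_RATE = 500           # dollars per hour, as billed by TLG
MARKET_RATES = (35, 45)  # contract-attorney range cited by Plaintiffs
DOCUMENTS = 21_000

# Total billed by TLG
tlg_bill = HOURS * TLG_RATE

# Same hours at the low and high market rates (assumption: hours unchanged)
market_bills = tuple(HOURS * rate for rate in MARKET_RATES)

# Implied unit figures for TLG's review
cost_per_doc = tlg_bill / DOCUMENTS
docs_per_hour = DOCUMENTS / HOURS

print(f"TLG bill: ${tlg_bill:,}")
print(f"Market-rate range: ${market_bills[0]:,} to ${market_bills[1]:,}")
print(f"Cost per document: ${cost_per_doc:.2f}")
print(f"Documents per hour: {docs_per_hour:.0f}")
```

On these numbers the bill works out to roughly $2.62 per document, against a range of $3,850 to $4,950 for the same hours at the rates Plaintiffs cite.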

Following a great deal of back and forth about the costs, the court decided to defer its decision until December 16, 2013 because discovery in the underlying antitrust action is still ongoing. Regardless of the outcome in Kleen, the current dispute feels a bit like déjà vu all over again. Both disputes highlight the importance of cooperation and role of technology in reducing eDiscovery costs. For example, better cooperation among the parties during earlier stages of discovery might have helped prevent or at least minimize some of the downstream post-production arguments that occurred last year and this year. Although the “cooperation” drum has been beaten loudly for several years by judges and think tanks like the Sedona Conference, cooperation is an issue that will never fully disappear in an adversarial system.

Judges May Increasingly Consider Technology as Part of Proportionality Analysis

A more novel and interesting eDiscovery issue in Kleen relates to the fact that judges are increasingly being asked to consider the use (or non-use) of technology when resolving discovery disputes. Last year in Kleen the issue was whether or not a producing party should be required to use advanced technology to assure a more thorough production. This year the Kleen court may be asked to consider the role of technology in the context of the disputed document review fees. For example, the court may consider whether or not TLG could have reduced the number of documents by leveraging de-duplication, domain filtering, document threading or other tools in the Litigator’s Toolbelt™ to reduce the number of documents requiring costly manual review.
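
As a rough illustration of how two of those culling techniques shrink a review set, the following Python sketch applies domain filtering and exact-duplicate removal to a toy collection. It is a simplified sketch under stated assumptions: the `cull` function, the `sender` and `body` fields, and the example domains are all hypothetical, and it does not attempt document threading or the near-duplicate detection that commercial review tools may offer.

```python
import hashlib


def cull(documents, excluded_domains):
    """Drop mail from excluded sender domains, then exact duplicates.

    `documents` is an iterable of dicts with hypothetical "sender" and
    "body" keys; real review platforms rely on much richer metadata.
    """
    seen = set()
    kept = []
    for doc in documents:
        domain = doc["sender"].rsplit("@", 1)[-1].lower()
        if domain in excluded_domains:
            continue  # domain filtering
        digest = hashlib.sha256(doc["body"].encode("utf-8")).hexdigest()
        if digest in seen:
            continue  # identical body already queued for review
        seen.add(digest)
        kept.append(doc)
    return kept


docs = [
    {"sender": "a@client.example", "body": "Pricing memo"},
    {"sender": "b@client.example", "body": "Pricing memo"},
    {"sender": "news@newsletter.example", "body": "Weekly digest"},
]
review_set = cull(docs, {"newsletter.example"})
print(len(docs), "collected ->", len(review_set), "for manual review")
```

Every document removed at this stage is one that no reviewer has to read at an hourly rate, which is why the use or non-use of such tools bears on whether review fees are reasonable.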

Recent trends indicate that the federal bench is increasingly under pressure to consider whether and how parties use technology when resolving eDiscovery disputes. For example, a 2011 Forbes article titled “Will New Electronic Discovery Rules Save Organizations Millions or Deny Justice?” framed early discussions about amending the Federal Rules of Civil Procedure (Rules) as follows:

“A key question that many feel has been overlooked is whether or not organizations claiming significant eDiscovery costs could have reduced those costs had they invested in better technology solutions. Most agree that technology alone cannot solve the problem or completely eliminate costs. However, many also believe that understanding the extent to which the inefficient or non-use of modern eDiscovery technology solutions impacts overall costs is critical to evaluating whether better practices might be needed instead of new Rules.”

Significant interest in the topic was further sparked in Da Silva Moore v. Publicis Groupe in 2012 when Judge Andrew Peck put parties on notice that technology is increasingly important in evaluating eDiscovery disputes. In Da Silva Moore, Judge Peck famously declared that “computer-assisted review is acceptable in appropriate cases.” Judge Peck’s decision was the first to squarely address the use of predictive coding technology, and a number of cases, articles, and blogs on the topic quickly ensued in what seemed to be the opening of Pandora’s Box with respect to the technology discussion.

More recently, The Duke Law Center for Judicial Studies proposed that the Advisory Committee on Civil Rules add language to the newly proposed amendments to the Federal Rules of Civil Procedure addressing the use of technology-assisted review (TAR). The group advocates adding the following sentence at the end of the first paragraph of the Committee Note to proposed Rule 26(b)(1) dealing with “proportionality” in eDiscovery:

“As part of the proportionality considerations, parties are encouraged, in appropriate cases, to consider the use of advanced analytical software applications and other technologies that can screen for relevant and privileged documents in ways that are at least as accurate as manual review, at far less cost.”

Conclusion

The significant role technology plays in managing eDiscovery risks and costs continues to draw more and more attention from lawyers and judges alike. Although early disputes in Kleen highlight the fact that litigators do not always agree on what technology should be used in eDiscovery, most in the legal community recognize that many technology tools in the Litigator’s Toolbelt™ are available to help reduce the costs of eDiscovery. Regardless of how the court in Kleen resolves the current issue, the use or non-use of technology tools is likely to become a central issue in the Rules debate and a prominent factor in most judges’ proportionality analysis in the future.

*Blog post co-authored by Matt Nelson and Adam Kuhn

Moving Data to the Cloud? Top 5 Tips for Corporate Legal Departments

by Matthew Nelson on September 30th, 2013

One of the hottest information technology (IT) trends is to move data once stored within the corporate firewall into a hosted cloud environment managed by third-party providers. In 2013 alone, the public cloud services market is forecast to grow an astonishing 18.5 percent to $131 billion worldwide, up from $111 billion in 2012. The trend is [...]

Judge Scheindlin Blasts Proposed FRCP Amendments in Unconventional Style

by Matthew Nelson on August 29th, 2013 (1 Comment)

A prominent federal judge wasted little time airing her dissatisfaction with the proposed amendments to the Federal Rules of Civil Procedure (Rules), doing so on the very day the period for public comment on the Rules opened. In lieu of following the formal process of submitting written comments to the proposed amendments, the Honorable Shira Scheindlin, Federal [...]

The Top 3 Forensic Data Collection Myths in eDiscovery

by Matthew Nelson on August 7th, 2013

Confusion about establishing a legally defensible approach for collecting data from computer hard drives during eDiscovery has existed for years. The confusion stems largely from the fact that traditional methodologies die hard and legal requirements are often misunderstood. The most traditional approach to data collection entails making forensic copies or mirror images of every custodian [...]

The Need for a More Active Judiciary in eDiscovery

by Philip Favro on July 24th, 2013

Various theories have been advanced over the years to determine why the digital age has caused the discovery process to spiral out of control. Many believe that the sheer volume of ESI has led to the increased costs and delays that now characterize eDiscovery. Others place the blame on the quixotic advocacy of certain lawyers [...]

The Proportionality Amendments to the Federal Rules Spotlight the Importance of Efficient, Cost-Effective eDiscovery

by Philip Favro on July 16th, 2013 (1 Comment)

One of the most compelling objectives for amending the Federal Rules of Civil Procedure is to make civil discovery more efficient and cost effective. The proposed amendment to Federal Rule 1 – featured in our introductory post on this series that provides a comprehensive overview of the proposed amendments – is only one of several [...]

A Comprehensive Look at the Newly Proposed eDiscovery Amendments to the Federal Rules of Civil Procedure

by Philip Favro on July 9th, 2013 (1 Comment)

You have probably heard the news. Changes are in the works for the Federal Rules of Civil Procedure that govern the discovery process. Approved for public comment last month by the Standing Committee on Rules of Practice and Procedure, the proposed amendments are generally designed to streamline discovery, encourage cooperative advocacy among litigants and eliminate [...]

Push or Pull? Deciding How Much Oversight is Required of In-house Counsel in eDiscovery

by Philip Favro on June 18th, 2013

When Kolon Industries recently found itself on the wrong side of a $919 million verdict, the legal department for the South Korea-based manufacturer probably started to take inventory of what it might have done differently to have avoided such a fate. While that list could have included any number of entries, somewhere near the top [...]
