Posts Tagged ‘duty to preserve’

Breaking News: Over $12 million in Attorney Fees Awarded in Patent Case Involving Predictive Coding

Thursday, February 14th, 2013

A federal judge for the Southern District of California rang in the month of February by ordering plaintiffs in a patent-related case to pay a whopping $12 million in attorney fees. The award included more than $2.8 million in “computer-assisted” review fees, and to add insult to injury, the judge tacked on an additional $64,316.50 in Rule 11 sanctions against plaintiffs’ local counsel. Plaintiffs filed a notice of appeal on February 13th, but regardless of the final outcome, the case is chock-full of important lessons about patent litigation, eDiscovery, and the use of predictive coding technology.

The Lawsuit

In Gabriel Technologies Corp. v. Qualcomm Inc., plaintiffs filed a lawsuit seeking over $1 billion in damages. Among their eleven causes of action were claims for patent infringement and misappropriation of trade secrets. The Court eventually dismissed or granted summary judgment in defendants’ favor on all of plaintiffs’ claims, making defendants the prevailing party and prompting defendants’ subsequent request for attorneys’ fees.

In response to defendants’ motion for attorney fees, U.S. District Judge Anthony J. Battaglia relied on plaintiffs’ repeated email references to “the utter lack of a case” and their inability to identify the alleged patent inventors to support his finding that their claims were brought in “subjective bad faith” and were “objectively baseless.” Given these findings, Judge Battaglia determined that an award of attorney fees was warranted.

The Attorney Fees Award

The judge then turned to the issue of whether defendants’ fee request for $13,465,331.01 was reasonable. He began by considering how defendants itemized their fees, which were broken down as follows:

  • $10,244,053 for its outside counsel Cooley LLP (“Cooley”);
  • $391,928.91 for document review performed by Black Letter Discovery, Inc. (“Black Letter”); and
  • $2,829,349.10 for a document review algorithm generated by outside vendor H5.

The court also considered defendants’ request that plaintiffs’ local counsel be held jointly and severally liable for the entire fee award based on the premise that local counsel is required to certify that all pleadings are legally tenable and “well-grounded in fact” under Federal Rule of Civil Procedure 11.

Following a brief analysis, Judge Battaglia found the overall request “reasonable,” but reduced the fee award by $1 million. In lieu of holding local counsel jointly liable, the court chose to sanction local counsel in the amount of $64,316.50 (identical to the amount of local counsel’s fees) for failing to “undertake a reasonable investigation into the merits of the case.”

Three Lessons Learned

The case is important on many fronts. First, the decision makes clear that filing baseless patent claims can lead to financial consequences more severe than many lawyers might expect. If upheld on appeal, counsel in the Ninth Circuit accustomed to fending off unsubstantiated patent or misappropriation claims will be armed with an important new tool to ward off would-be patent trolls.

Second, Judge Battaglia’s decision to order Rule 11 sanctions should serve as a wake-up call for local counsel. The ruling reinforces the fact that merely rubber-stamping filings and passively monitoring cases is a risky proposition. Gabriel Technologies illustrates the importance of properly monitoring lead counsel and the consequences of not complying with the mandate of Rule 11 whether serving as lead or local counsel.

The final lesson relates to curbing the costs of eDiscovery and the importance of understanding tools like predictive coding technology. The court left the barn door wide open for plaintiffs to attack defendants’ predictive coding and other fees as “unreasonable,” but plaintiffs didn’t bite. In evaluating H5’s costs, the court determined that Cooley’s review fees were reasonable because Cooley used H5’s “computer-assisted” review services to cull 12 million documents down to a more manageable number prior to manual review. Although one would expect this approach to be less expensive than paying attorneys to review all 12 million documents, $2,829,349.10 is still an extremely high price to pay for technology that is expected to help cut traditional document review costs by as much as 90 percent.

Plaintiffs were well-positioned to argue that predictive coding technology should be far less expensive because the technology allows a fraction of documents to be reviewed at a fraction of the cost compared to traditional manual review. These savings are possible because a computer is used to evaluate how human reviewers categorize a small subset of documents in order to construct and apply an algorithm that ranks the remaining documents by degree of responsiveness automatically. There are many tools on the market that vary drastically in quality and price, but a price tag approaching $3 million is extravagant and should certainly raise a few eyebrows in today’s predictive coding market. Whether or not plaintiffs missed an opportunity to challenge the reasonableness of defendants’ document review approach may never be known. Stay tuned to see if these and other arguments surface on appeal.

Q&A with Allison Walton of Symantec and Laura Zubulake, Author of Zubulake’s e-Discovery: The Untold Story of my Quest for Justice

Monday, February 4th, 2013

The following is my Q&A with Laura Zubulake, Author of Zubulake’s e-Discovery: The Untold Story of my Quest for Justice.

Q: Given your case began in 2003, and the state of information governance today, do you believe that adoption has been too slow? Do you think organizations in 2013, ten years later, have come far enough in managing their information?

A: From a technology standpoint, the advancements have been significant. The IT industry has come a long way with regard to the tools available to conduct eDiscovery. On the other hand, surveys indicate a significant percentage of organizations do not prioritize information management and have not established eDiscovery policies and procedures. This is disappointing. The fact that organizations apparently do not understand the value of proactively managing information only puts them at a competitive disadvantage and at increased risk.

Q: Gartner predicts that the market will be $2.9 billion by 2017. Given this prediction, don’t you think eDiscovery is basically going to be absorbed as a business process and not something so distinct as to require outside third-party help?

A: First, as a former financial executive, I find those predictions, if realized, reasonably attractive. Any business that can generate double-digit revenue growth until 2017, in this economy and interest rate environment, is worthy of note (assuming costs are controlled). Second, here I would like to distinguish between information governance and eDiscovery. I view eDiscovery as a subset of a broader information governance effort. My case, while renowned for eDiscovery, was at its essence about information. I insisted on searching for electronic documents because I understood the value and purpose of information. I could not make strategic decisions without what I refer to as “full” information. The Zubulake opinions were a result of my desire for information, not the other way around. I believe corporations will increasingly recognize the need to proactively manage information for business, cost, legal, and risk purposes. As such, I think information governance will become more of a business process, just like any management, operational, product, or finance process.

With regard to eDiscovery, I think there will continue to be a market for outside third-party assistance. eDiscovery requires specific skills and technologies. Companies lacking financial resources and expertise, and requiring assistance to address the volume of data will likely deem it economical to outsource eDiscovery efforts. As with any industry, eDiscovery will evolve.  The sector has grown quickly. There will be consolidation. Eventually, the fittest will survive.

Q: What do you think about the proposed changes to the FRCP regarding preservation? 

A: As a former plaintiff (non-attorney), eDiscovery was (to me) about preservation. Very simply, documents could not be collected, reviewed, and produced if they had not been preserved. Any effort to clarify preservation rules would benefit all parties—uncertainty creates challenges. Of course, there needs to be a balance between overwhelming corporations with legal requirements and costs and protecting a party’s right to evidence. Apparently, the current proposals do not specifically pertain to preservation. They concern the scope of discovery and proportionality, and thus indirectly address the issue of preservation. While this would be helpful, it is not ideal. Scope is, in part, a function of relevance, a frequently debated concept. What was relevant to me might not have been relevant to others. Regarding proportionality, my concern is perspective. Too often I find discussions about proportionality stem from the defendant’s perspective. Rarely do I hear the viewpoint of the plaintiff represented. Although not all plaintiffs are individuals, often the plaintiff is the relatively under-resourced party. Deciding whether the burden of proposed discovery outweighs its likely benefits is not a science. As I wrote in my book:

Imagine if the Court were to have agreed with [the Defendant’s] argument and determined the burden or expense of the proposed discovery in my case outweighed its likely benefit. Not only would the Zubulake opinions not have come to fruition, but also I would have been denied my opportunity to prove my claims.

Q: Lastly, what other trends are you seeing in the area of eDiscovery, and what predictions do you have for the market in 2013?

A: eDiscovery Morphs. Organizations will realize that eDiscovery should be part of a broader information governance effort. Information governance will become a division within a corporation with separate accountable management from which operations, legal, IT, and HR professionals can source and utilize information to achieve goals. Financial markets will increasingly reward companies (with higher multiples) who proactively manage information.

Reorganization. Organizations will recognize that while information is their most valuable asset, it is fearless—crossing functions, divisions, and borders, and not caring if it overwhelms an entity with volume, costs, and risks. Organizational structures will need to adapt and accommodate the ubiquitous nature of information. A systems thinking framework (understanding how processes influence one another within a whole) will increasingly replace a business silo structure. Information and communication, managed proactively and globally, will improve efficiency, enhance profitability, reduce costs, increase compliance, and mitigate risks.

Search. Algorithms will become an accepted search tool, although keyword, concept, cluster, and similar searches will still play a role. For years, law enforcement, government, and Wall Street have used algorithms—the concept is not new, nor is it without peril (significant market corrections have resulted from algorithms gone wrong). Parties confronted with volumes of data and limited resources will have no choice but to agree to computer assistance. However, negative perceptions and concerns about algorithms will only change when there is a case in which the parties initiate and voluntarily agree to their use.

Education. Within information governance efforts, organizations will increasingly establish training for employees. Employees need to be educated about the origination, maintenance, use, disposal, risks, rules, and regulations associated with ESI. A goal should be to lessen the growth of data and encourage smart and efficient communications. Education is a cost-control and risk-mitigating effort.

BYOD Reconsidered. The assumption that a bring-your-own-device (BYOD) policy is cost-effective will be questioned; such policies should be evaluated on a risk-adjusted basis. When companies weigh the cash outlay of providing employees with devices against the unquantifiable costs associated with lack of control, disorganization, and increased risk, it will become clear that BYOD has the potential to be very expensive.

Government Focus. I had the privilege of addressing the Dept. of Justice’s Civil E-Discovery training program. It was evident to me that eDiscovery is one of the department’s focuses. With recent headlines concerning emails uncovering evidence (e.g. Fast and Furious), government entities (state and federal) will increasingly adopt rules, procedures, and training to address ESI. This brings me back to your first question—have organizations come far enough in managing their information? Government efforts to focus on eDiscovery will incentivize more corporations to (finally) address eDiscovery and information governance challenges.

Stay tuned for more breaking news coverage with industry luminaries.

For Westerners Seeking Discovery From China, Fortune Cookie Reads: Discovery is Uncertain, and Will Likely Be Hard

Monday, January 7th, 2013

In a recent Inside Counsel article, we explored the eDiscovery climate in China and some of the most important differences between the Chinese and U.S. legal systems. There is an increased interest in China and the legal considerations surrounding doing business with Chinese organizations, which we also covered on this Inside Counsel webcast.

 Five highlights from this series include:

1.  Conflicting Corporate Cultures – In general, business in China is done in a way that relies heavily on relationships. This can easily cause a conflict of interest for organizations and put them at risk for violations under the FCPA and UK Bribery Act. The concept that “relationships are gold,” or Guanxi, is crucial to conducting successful business in China. However, a fine line exists for organizations, necessitating strong local counsel and guidance. Moreover, Chinese businesses don’t share the Western world’s definitions for concepts like information governance, legal hold, or privacy.

2.  FCPA and the UK Bribery Act – Both of these statutes are very troublesome for those doing business in China, yet necessary for regulating white-collar crime. To do business in China, one must walk a fine line, developing close relationships without going too far and participating in bribery or other illegal acts. Prosecution under both of these statutes is increasing as businesses globalize.

3.  Drastically Different Legal Systems – The Chinese legal system is very different from those of common law jurisdictions. China’s legal system is based on civil law, and there is no requirement for formal pre-litigation discovery. For this reason, litigants may find it very difficult to successfully procure discovery from Chinese parties. Chinese companies have historically been slow to cooperate with U.S. regulatory bodies, and many discovery requests in civil litigation can take up to a year to yield a response. A copy of our eDiscovery passport on China can be found here, along with passports for other important countries.

4.  State Secrets – In addition to the differences between common and civil law jurisdictions, China has strict laws protecting state secrets. Anything deemed a state secret would not be discoverable, and an attempt to remove state secrets from China could result in criminal prosecution. The definition of a state secret under People’s Republic of China law includes a wide range of information and is more ambiguous than Western definitions of national security (for example, the Chinese definitions are less defined than those in the U.S. Patriot Act). Politically sensitive data is susceptible to the government’s scrutiny and protection, regardless of whether it is possessed by PRC citizens or by officials working for foreign corporations; there is no distinction or exception for civil discovery.

5.  Globalization – Finally, it is no secret that the world has become one huge marketplace. The rapid proliferation of information creation, as well as the clashing of disparate legal systems, creates real discovery challenges. However, there are also abundant opportunities for lawyers who specialize in the Asia Pacific region today. Lawyers who are particularly adept in eDiscovery and Asia will flourish for years to come.

For more, read here…

Predictive Coding 101 & the Litigator’s Toolbelt

Wednesday, December 5th, 2012

Query your average litigation attorney about the difference between predictive coding technology and other more traditional litigation tools and you are likely to receive a wide range of responses. The fact that “predictive coding” goes by many names, including “computer-assisted review” (CAR) and “technology-assisted review” (TAR) illustrates a fundamental problem: what is predictive coding and how is it different from other tools in the litigator’s technology toolbelt™?

 Predictive coding is a type of machine-learning technology that enables a computer to “predict” how documents should be classified by relying on input (or “training”) from human reviewers. The technology is exciting for organizations attempting to manage skyrocketing eDiscovery costs because the ability to expedite the document review process and find key documents faster has the potential to save organizations thousands of hours of time. In a profession where the cost of reviewing a single gigabyte of data has been estimated to be around $18,000, narrowing days, weeks, or even months of tedious document review into more reasonable time frames means massive savings for thousands of organizations struggling to keep litigation expenditures in check.

 Unfortunately, widespread adoption of predictive coding technology has been relatively slow due to confusion about how predictive coding differs from other types of CAR or TAR tools that have been available for years. Predictive coding, unlike other tools that automatically extract patterns and identify relationships between documents with minimal human intervention, requires a deeper level of human interaction. That interaction involves significant reliance on humans to train and fine-tune the system through an iterative, hands-on process. Some common TAR tools used in eDiscovery that do not include this same level of interaction are described below:

  •  Keyword search: Involves inputting a word or words into a computer which then retrieves documents within the collection containing the same words. Also known as Boolean searching, keyword search tools typically include enhanced capabilities to identify word combinations and derivatives of root words among other things.
  •  Concept search: Involves the use of linguistic and statistical algorithms to determine whether a document is responsive to a particular search query. This technology typically analyzes variables such as the proximity and frequency of words as they appear in relationship to a keyword search. The technology can retrieve more documents than keyword searches because conceptually related documents are identified, whether or not those documents contain the original keyword search terms.
  •  Discussion threading: Utilizes algorithms to dynamically link together related documents (most commonly e-mail messages) into chronological threads that reveal entire discussions. This simplifies the process of identifying participants to a conversation and understanding the substance of the conversation.
  •  Clustering: Involves the use of algorithms to automatically organize a large collection of documents into different topical categories based on similarities between documents. Reviewing documents organized categorically can help increase the speed and efficiency of document review. 
  •  Find similar: Enables the automated retrieval of other documents related to a particular document of interest. Reviewing similar documents together accelerates the review process, provides full context for the document under review, and ensures greater coding consistency.
  •  Near-duplicate identification: Allows reviewers to easily identify, view, and code near-duplicate e-mails, attachments, and loose files. Some systems can highlight differences between near-duplicate documents to help simplify document review.

Unlike the TAR tools listed above, predictive coding technology relies on humans to review a small fraction of the overall document population, which ultimately results in a fraction of the review costs. The process entails feeding decisions about how to classify a small number of case documents called a training set into a computer system. The computer then relies on the human training decisions to generate a model that is used to predict how the remaining documents should be classified. The information generated by the model can be used to rank, analyze, and review the documents quickly and efficiently. Although documents can be coded with multiple designations that relate to various issues in the case during eDiscovery, many times predictive coding technology is simply used to segregate responsive and privileged documents from non-responsive documents in order to expedite and simplify the document review process.
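The train-then-rank workflow described above can be sketched in a few lines of Python. This is a deliberately simplified toy, not any vendor's actual algorithm: real predictive coding systems use far more sophisticated machine-learning models, but the shape of the process (learn from reviewer decisions on a small training set, then score and rank the remaining documents) is the same.

```python
import math
from collections import Counter

def train(training_set):
    """Learn per-word weights from reviewer coding decisions.
    training_set: list of (document_text, is_responsive) pairs."""
    responsive_words, other_words = Counter(), Counter()
    for text, is_responsive in training_set:
        target = responsive_words if is_responsive else other_words
        target.update(text.lower().split())
    # Smoothed log-odds weight per word (a toy stand-in for a real model)
    vocab = set(responsive_words) | set(other_words)
    return {w: math.log((responsive_words[w] + 1) / (other_words[w] + 1))
            for w in vocab}

def prediction_score(weights, text):
    """Squash a document's summed word weights into a 0-1 prediction score."""
    raw = sum(weights.get(w, 0.0) for w in text.lower().split())
    return 1.0 / (1.0 + math.exp(-raw))

# Hypothetical reviewer decisions on a (tiny) training set
training_set = [
    ("patent license royalty agreement", True),
    ("alleged infringement of patent claims", True),
    ("lunch menu for the holiday party", False),
    ("fantasy football league picks", False),
]
weights = train(training_set)

# Rank the unreviewed documents by degree of predicted responsiveness
unreviewed = ["draft patent license terms", "holiday party rsvp"]
ranked = sorted(unreviewed, key=lambda d: prediction_score(weights, d),
                reverse=True)
print(ranked)
```

In this sketch the patent-related document ranks above the social email, so a reviewer working down the ranked list reaches likely responsive material first; that ranking step is where the review-cost savings come from.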

 Training the predictive coding system is an iterative process that requires attorneys and their legal teams to evaluate the accuracy of the computer’s document prediction scores at each stage. A prediction score is simply a percentage value assigned to each document that is used to rank all the documents by degree of responsiveness. If the accuracy of the computer-generated predictions is insufficient, additional training documents can be selected and reviewed to help improve the system’s performance. Multiple training sets are commonly reviewed and coded until the desired performance levels are achieved. Once the desired performance levels are achieved, informed decisions can be made about which documents to produce.

 For example, if the legal team’s analysis of the computer’s predictions reveals that within a population of 1 million documents, only those with prediction scores in the 70 percent range and higher appear to be responsive, the team may elect to produce only those 300,000 documents to the requesting party. The financial consequences of this approach are significant because a majority of the documents can be excluded from expensive manual review by humans. The simple rule of thumb in eDiscovery is that the fewer documents requiring human review, the more money saved since document review is typically the most expensive facet of eDiscovery.
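The arithmetic behind that rule of thumb is straightforward. In the sketch below, the per-document review cost is an assumed, purely illustrative figure (actual per-document costs vary widely by matter); the document counts are the ones from the example above.

```python
# Figures from the 1-million-document example; the per-document
# review cost is an assumption chosen only for illustration.
total_docs = 1_000_000
produced_docs = 300_000      # documents scoring at or above the 70% cutoff
cost_per_doc = 1.00          # assumed manual review cost per document (USD)

full_manual_review = total_docs * cost_per_doc
predictive_review = produced_docs * cost_per_doc
savings = full_manual_review - predictive_review

print(f"${savings:,.0f} saved ({savings / full_manual_review:.0%} of review cost)")
```

Under these assumptions, excluding the 700,000 low-scoring documents from human review cuts 70 percent of the review cost, which is why the size of the population escaping manual review drives the economics.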

Hype and confusion surrounding the promise of predictive coding technology have led some to believe that the technology renders other TAR tools obsolete. To the contrary, predictive coding technology should be viewed as one of many different types of tools in the litigator’s technology toolbelt™ that often can and should be used together. Choosing which of these tools to use and how to use them depends on the case and requires balancing factors such as discovery deadlines, cost, and complexity. Many believe, however, that the choice about which tools should be used for a particular matter should be left to the producing party, as long as the tools are used properly and in a manner that is “just” for both parties, as mandated by Rule 1 of the Federal Rules of Civil Procedure.

The notion that parties should be able to choose which tools they use during discovery recently garnered support in the Seventh Circuit. In Kleen Products, LLC, et al. v. Packaging Corporation of America, et al., Judge Nolan was faced with plaintiffs’ claim that defendants should be required to supplement their use of keyword searching tools with more advanced tools in order to better comply with their duty to produce documents. Plaintiffs’ argument hinged largely on the assumption that using more advanced tools would result in a more thorough document production. In response to this argument, Judge Nolan referenced the Sedona Best Practices Recommendations & Principles for Addressing Electronic Document Production during a hearing between the parties to suggest that the carpenter (the end user) is best equipped to select the appropriate tool during discovery. Sedona Principle 6 states that:

“[r]esponding parties are best situated to evaluate the procedures, methodologies, and technologies appropriate for preserving and producing their own electronically stored information.”

Even though the parties in Kleen Products ultimately postponed further discussion about whether tools like predictive coding technology should be used when possible during discovery, the issue remains important because it is likely to resurface again and again as predictive coding momentum continues to grow. Some will argue that parties who fail to leverage modern technology tools like predictive coding are attempting to game the legal system to avoid thorough document productions.  In some instances, that argument could be valid, but it should not be a foregone conclusion.

Although there will likely come a day when predictive coding technology is the status quo for managing large-scale document review, that day has not yet arrived. Predictive coding is a type of machine learning technology that has been used in other disciplines for decades. However, predictive coding tools are still very new to the field of law. As a result, most predictive coding tools lack transparency because they provide little if any information about the underlying statistical methodologies they apply. The issue is important because the misapplication of statistics could have a dramatic effect on the thoroughness of document productions. Unfortunately, these nuanced issues are sometimes misunderstood or overlooked by predictive coding proponents, a problem that could ultimately result in unfairness to requesting parties and stall broader adoption of otherwise promising technology.

Further complicating matters is the fact that several solution providers have introduced new predictive coding tools in recent months to try to capture market share. In the long term, competition is good for consumers and the industry as a whole. In the short term, however, most of these tools are largely untested and vary in quality and ease of use, adding more confusion for would-be consumers. The unfortunate end result is that many lawyers are shying away from using predictive coding technology until the pros and cons of various technology solutions and their providers are better understood. Market confusion is often one of the biggest stumbling blocks to faster adoption of technology that could save organizations millions, and the current predictive coding landscape is a testament to this fact.

Eliminating much of the current confusion through education is the precise goal of Symantec’s Predictive Coding for Dummies book. The book addresses everything from predictive coding case law and defensible workflows to key factors that should be considered when evaluating different predictive coding tools. It strives to provide attorneys and legal staff accustomed to using traditional TAR tools like keyword searching with a baseline understanding of a new technological approach that many find confusing. We believe providing the industry with this basic level of understanding will help ensure that predictive coding technology and related best practices standards evolve in a manner that is fair to both parties, ultimately expediting rather than slowing broader adoption of this promising new technology. To learn more, download a free copy of Predictive Coding for Dummies and feel free to share your feedback and comments below.

Q&A With Predictive Coding Guru, Maura R. Grossman, Esq.

Tuesday, November 13th, 2012

Can you tell us a little about your practice and your interest in predictive coding?

After a prior career as a clinical psychologist, I joined Wachtell Lipton as a litigator in 1999, and in 2007, when I was promoted to counsel, my practice shifted exclusively to advising lawyers and clients on legal, technical, and strategic issues involving electronic discovery and information management, both domestically and abroad.

I became interested in technology-assisted review (“TAR”) in the 2007/2008 time frame, when I sought to address the fact that Wachtell Lipton had few associates to devote to document review, and contract attorney review was costly, time-consuming, and generally of poor quality.  At about the same time, I crossed paths with Jason R. Baron and got involved in the TREC Legal Track.

What are a few of the biggest predictive coding myths?

There are so many, it’s hard to limit myself to only a few!  Here are my nominations for the top ten, in no particular order:

Myth #1:  TAR is the same thing as clustering, concept search, “find similar,” or any number of other early case assessment tools.
Myth #2:  Seed or training sets must always be random.
Myth #3:  Seed or training sets must always be selected and reviewed by senior partners.
Myth #4:  Thousands of documents must be reviewed as a prerequisite to employing TAR, therefore, it is not suitable for smaller matters.
Myth #5:  TAR is more susceptible to reviewer error than the “traditional approach.”
Myth #6:  One should cull with keywords prior to employing TAR.
Myth #7:  TAR does not work for short documents, spreadsheets, foreign language documents, or OCR’d documents.
Myth #8:  TAR finds “easy” documents at the expense of “hot” documents.
Myth #9:  If one adds new custodians to the collection, one must always retrain the system.
Myth #10:  Small changes to the seed or training set can cause large changes in the outcome, for example, documents that were previously tagged as highly relevant can become non-relevant. 

The bottom line is that your readers should challenge commonly held (and promoted) assumptions that lack empirical support.

Are all predictive coding tools the same?  If not, then what should legal departments look for when selecting a predictive coding tool?

Not at all, and neither are all manual reviews.  It is important to ask service providers the right questions to understand what you are getting.  For example, some TAR tools employ supervised or active machine learning, which require the construction of a “training set” of documents to teach the classifier to distinguish between responsive and non-responsive documents.  Supervised learning methods are generally more static, while active learning methods involve more interaction with the tool and more iteration.  Knowledge engineering approaches (a.k.a. “rule-based” methods) involve the construction of linguistic and other models that replicate the way that humans think about complex problems.  Both approaches can be effective when properly employed and validated.  At this time, only active machine learning and rule-based approaches have been shown to be effective for technology-assisted review.  Service providers should be prepared to tell their clients what is “under the hood.”

What is the number one mistake practitioners should avoid when using these tools?

Not employing proper validation protocols, which are essential to a defensible process.  There is widespread misunderstanding of statistics and what they can and cannot tell us.  For example, many service providers report that their tools achieve 99% accuracy.  Accuracy is the fraction of documents that are correctly coded by a search or review effort.  While accuracy is commonly advanced as evidence of an effective search or review effort, it can be misleading because it is heavily influenced by prevalence, or the proportion of responsive documents in the collection.  Consider, for example, a document collection containing one million documents, of which ten thousand (or 1%) are relevant.  A search or review effort that identified 100% of the documents as non-relevant, and therefore found none of the relevant documents, would have 99% accuracy, belying the failure of that search or review effort to identify a single relevant document.
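Her example is easy to verify numerically. The short sketch below uses the same figures to show why a 99% accuracy claim can coexist with a review that finds nothing (recall, the fraction of relevant documents actually found, is the metric that exposes the failure):

```python
collection_size = 1_000_000
relevant_docs = 10_000       # 1% prevalence
relevant_found = 0           # this "review" labels every document non-relevant

# Every truly non-relevant document is coded correctly;
# every relevant document is missed.
correctly_coded = (collection_size - relevant_docs) + relevant_found
accuracy = correctly_coded / collection_size
recall = relevant_found / relevant_docs

print(f"accuracy = {accuracy:.0%}, recall = {recall:.0%}")
```

The output is 99% accuracy with 0% recall, which is precisely why validation protocols should report recall (and related measures) rather than accuracy alone when prevalence is low.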

What do you see as the key issues that will confront practitioners who wish to use predictive coding in the near-term?

There are several issues that will be played out in the courts and in practice over the next few years.  They include:  (1) How does one know if the proposed TAR tool will work (or did work) as advertised?; (2) Must seed or training sets be disclosed, and why?; (3) Must documents coded as non-relevant be disclosed, and why?; (4) Should TAR be held to a higher standard of validation than manual review?; and (5) What cost and effort is justified for the purposes of validation?  How does one ensure that the cost of validation does not obliterate the savings achieved by using TAR?

What have you been up to lately?

In an effort to bring order to chaos by introducing a common framework and set of definitions for use by the bar, bench, and vendor community, Gordon V. Cormack and I recently prepared a glossary on technology-assisted review that is available for free download at:  http://cormack.uwaterloo.ca/targlossary.  We hope that your readers will send us their comments on our definitions and additional terms for inclusion in the next version of the glossary.

Maura R. Grossman, counsel at Wachtell, Lipton, Rosen & Katz, is a well-known e-discovery lawyer and recognized expert in technology-assisted review.  Her work was cited in the landmark 2012 case, Da Silva Moore v. Publicis Groupe (S.D.N.Y. 2012).

5 questions with Ralph Losey about the New Electronic Discovery Best Practices (EDBP) Model for Attorneys

Tuesday, November 6th, 2012

The eDiscovery world is atwitter with two new developments – one is Judge Laster’s opinion in the EORHB case where he required both parties to use predictive coding. The other is the new EDBP model, created by Ralph Losey (and team) to “provide a model of best practices for use by law firms and corporate law departments.” Ralph was kind enough to answer a few questions for eDiscovery 2.0:

1. While perhaps not fair, I’ve already heard the EDBP referred to as the “new EDRM.” If busy folks could only read one paragraph on the distinction, could you set them straight?

“EDRM, the Electronic Discovery Reference Model, covers the whole gamut of an e-discovery project. The model provides a well-established, nine-step workflow that helps beginners understand e-discovery. EDBP, Electronic Discovery Best Practices, is focused solely on the activities of lawyers. The EDBP identifies a ten-step workflow of the rendition of legal services in e-discovery. Moreover, EDBP.com attempts to capture and record what lawyers specializing in the field now consider the best practices for each of these activities.”

“By the way, although I have a copyright on these diagrams, anyone may freely use the diagrams. We encourage that. We are also open to suggestions for best practices from any practicing lawyer. We anticipate that this will be a constantly evolving model and collection of best practices.”

2. Given the lawyer-centric focus, what void are you attempting to fill with the EDBP?

I was convinced by my friend Jason Baron of the need for standards in the world of e-discovery. It is too much of a wild west out there now, and we need guidance. But as a private lawyer I am also cognizant of the dangers of creating minimum standards for lawyers that could be used as a basis for malpractice suits. It is not an appropriate thing for any private group to do. It is a judicial matter that will arise out of case law and competition. So after a lot of thought we realized that minimum standards should only be articulated for the non-legal-practice part of e-discovery, in other words, standards should be created for vendors only and their non-legal activities. The focus for lawyers should be on establishing best practices, not minimum standards. I created this graphic using the analogy of a full tank of gas to visualize this point and explained it in my blog post Does Your CAR (“Computer Assisted Review”) Have a Full Tank of Gas?


“This continuum of competence applies not only to the legal service of Computer Assisted Review (CAR), aka Technology Assisted Review (TAR), but to all legal services. The goal of EDBP is to help lawyers avoid negligence by staying far away from minimum standards and focusing instead on the ideals, the best practices.”


3. The EDBP has ten steps. While assuredly unfair, what step contains the most controversy/novelty compared to business as usual in the current e-Discovery world?

“None really. That’s the beauty of it. The EDBP just documents what attorneys already do. The only thing controversial about it, if you want to call it that, is that it established another frame of reference for e-discovery in addition to the EDRM. It does not replace EDRM. It supplements it. Most lawyers specializing in the field will get EDBP right away.”


“I suppose you could say giving Cooperation its very own key place in a lawyer’s work flow might be somewhat controversial, but there is no denying that the rules, and best practices, require lawyers to talk to each other and at least try to cooperate. Failing that, all the judges and experts I have heard suggest that you should initiate early motion practice and not wait until the end. There seems to be widespread consensus in the e-discovery community on the key role of cooperative dialogues with opposing counsel and the court, so I do not think it is really controversial, but may still be news to the larger legal community. In fact, all of these best practices may not be well-known to the average Joe Litigator, which just shows the strong need for an educational resource like EDBP.”

4. Why not use “information governance” instead of “litigation readiness” on the far left hand side of the EDBP?

“There is far more to getting a client ready for litigation than helping them with their information governance. Plus, remember, this is not a workflow for vendors or management or records managers. It is not a model for an entire e-discovery team. This is a workflow only for what lawyers do.”

5. Given your recent, polarizing article urging law firms to get out of the eDiscovery business, how does the EDBP model either help or hinder that exhortation?

“This article was part of my attempt to clarify the line between legal e-discovery services and non-legal e-discovery services. EDBP is a part of that effort because it is only concerned with the law. It does not include non-legal services. As a practicing lawyer my core competency is legal advice, not processing ESI and software. Many lawyers agree with me on this, so I don’t think my article was polarizing so much as revealing, kind of like the young kid who pointed out that the emperor had no clothes.

“The professionals in law firm lit support departments will eventually calm down when they realize no jobs are lost in this kind of outsourcing, and it all stays in the country. The work just moves from law firms that also do some e-discovery to businesses, most of which only do e-discovery. I predict that when this kind of outsourcing catches on, it will be common for the vendor with the outsourcing contract to hire as many of the law firm’s lit-support professionals as possible.

“My Emperor’s-no-clothes exposé applies to the vendor side of the equation too. Vendors, like law firms, should stick to their core competence and stay away from providing legal advice. UPL, the unauthorized practice of law, is a serious matter. In most states it is a crime. Many vendors may well be competent to provide legal services, but they do not have a license to do so, not to mention their lack of malpractice insurance.

“I am trying to help the justice system by clarifying and illuminating the line between law and business. It has become way too blurred to the detriment of both. Much of this fault lies on the lawyer side, as many seem quite content to unethically delegate their legal duties to non-lawyers rather than learn this new area of law. I am all for the team approach. I have been advocating it for years at e-DiscoveryTeam.com. But each member of the team should know their strengths and limitations and act accordingly. We all have different positions to play on the team. We cannot all be quarterbacks.”

6. [Bonus Question] “EDBP” doesn’t just roll off the tongue. Given your prolific creativity (I seem to recall hamsters on a trapeze at one point in time), did you spend any cycles on a more mellifluous name for the new model?

“There are not many four-letter dot-com domain names out there for purchase, and none for free, and I did not want to settle for dot-net like EDRM did. I am proud, and a tad poorer, to have purchased what I think is a very good four-letter domain name, EDBP.com. After a few years EDBP will flow off your tongue too; after all, it has an internal rhyme – ED BP. Just add a slight pause to the name, ED … BP, and it flows pretty well, thank you.”

Thanks Ralph.  We look forward to seeing how this new model gains traction. Best of luck.

Judicial Activism Taken to New Heights in Latest EORHB (Hooters) Predictive Coding Case

Monday, October 29th, 2012

Ralph Losey, an attorney for Jackson Lewis, reported last week that a Delaware judge took matters into his own hands by proactively requiring both parties to show cause as to why they should not use predictive coding technology to manage electronic discovery. Predictive coding advocates around the globe will eagerly trumpet Judge Laster’s move as another judicial stamp of approval for predictive coding, much the same way proponents lauded Judge Peck’s order in Da Silva Moore, et al. v. Publicis Groupe, et al.  In Da Silva Moore, Judge Peck stated that computer-assisted review is “acceptable in appropriate cases.” In stark contrast to Da Silva Moore, the parties in EORHB, Inc., et al. v. HOA Holdings, LLC not only never agreed to use predictive coding technology, there is no indication they ever initiated the discussion with one another, let alone with Judge Laster. In addition to attempting to dictate the technology tool to be used, Judge Laster also directed the parties to use the same vendor. Apparently, Judge Laster not only has the looks of Agent 007, he shares James Bond’s bold demeanor as well.

Although many proponents of predictive coding technology will see Judge Laster’s approach as an important step forward toward broader acceptance of predictive coding technology, the directive may sound alarm bells for others. The approach contradicts the apparent judicial philosophy applied in Kleen Products, LLC, et al. v. Packaging Corporation of America, et al., a Northern District of Illinois case (within the 7th Circuit) also addressing the use of predictive coding technology. During one of many hearings between the parties in Kleen, Judge Nan Nolan stated that “the defendant under Sedona 6 has the right to pick the [eDiscovery] method.”  Judge Nolan’s statement is a nod to Principle 6 of the Sedona Best Practices Recommendations & Principles for Addressing Electronic Document Production, which states:

“[r]esponding parties are best situated to evaluate the procedures, methodologies, and technologies appropriate for preserving and producing their own electronically stored information.”

Many attorneys shudder at the notion that the judiciary should choose (or at least strongly urge) the specific technology tools parties must use during discovery. The concern is based largely on the belief that many judges lack familiarity with the wide range of eDiscovery technology tools that exist today.  For example, keyword search, concept search, and email threading represent only a few of the many technology tools in the litigator’s tool belt that can be used in conjunction with predictive coding tools to accelerate document review and analysis.  The current challenge is that predictive coding technology is relatively new to the legal industry, and it is much more complex than some of the older tools in the litigator’s tool belt.  Not surprisingly, this complexity, combined with an onslaught of new entrants to the predictive coding market, has generated a lot of confusion about how to use predictive coding tools properly.

Current market confusion is precisely what Judge Laster and the parties in EORHB must overcome in order to successfully advance the adoption of predictive coding tools within the legal community. Key to the success of this mission is the recognition that predictive coding pitfalls are not always easy to identify, let alone avoid. However, if these pitfalls are properly identified and navigated, then Judge Laster’s mission may be possible.

Identifying pitfalls is challenging because industry momentum has led many to erroneously assume that all predictive coding tools work the same way. The momentum has been driven by the potential for organizations to save millions in document review costs with predictive coding technology. As a result, vendors are racing to market at breakneck speed to offer their own brand of predictive coding technology. Those without their own solutions are rapidly forming partnerships with those who have offerings so they too can capitalize on the predictive coding financial bonanza that many believe is around the corner. This rush to market has left the legal and academic communities with little time to build consensus about the best way to properly vet a wide range of new technology offerings.  More specifically, the predictive coding craze has fostered an environment where there is often a lack of scrutiny related to individual predictive coding tools.

The harsh reality is that not all predictive coding tools are created equal.  For example, some providers erroneously call their solution “predictive coding technology” when the solution they offer is merely a type of clustering and/or concept searching technology that has been commonly used for over a decade. Even among predictive coding tools that are perceived as legitimate, pricing varies so widely that using some tools may not even be economically feasible considering the value of the case at hand. Some solution providers charge a premium to use their predictive coding tools and require additional expenditures in the form of consulting fees, while other tools are integrated within easy-to-use eDiscovery platforms at no additional cost.

If the court and parties decide that using predictive coding technology in EORHB makes economic sense, they must understand the importance of statistics and transparency to ensure a level playing field. The widespread belief that all predictive coding technologies surpass the accuracy of human review is a pervasive misperception that continues to drive confusion in the industry. The assumption is false not only because these tools must be used correctly to yield reliable results, but because the underlying statistical methodology applied by the tools must also be sound for the tools to work properly and exceed the accuracy of human review. (See Predictive Coding for Dummies for a more comprehensive explanation of predictive coding and statistics.)

The underlying statistical methodology utilized by most tools today is almost always unclear, which should automatically raise red flags for Judge Laster. In fact, this lack of transparency has led many to characterize most predictive coding tools as “black box” technologies – meaning that inadequate information about how the tools apply statistics makes it difficult to trust the results. There are differing schools of thought about the proper application of statistics in predictive coding that have largely been ignored to date.  Hopefully Judge Laster and the parties will use the present case as an opportunity to clarify some of this confusion so that the adoption of predictive coding technology within the legal community is accelerated in a way that involves sufficient scrutiny of the processes and tools used.

Judge Laster and the parties in EORHB are presented with a unique opportunity to address many important issues related to the use of predictive coding technology that are often misunderstood and overlooked. Hopefully the parties use predictive coding technology and engage in a dialogue that highlights the importance of selecting the right predictive coding tool, using that tool correctly, and the proper application of statistics.  If the court and the parties shed light on these three areas, Judge Laster’s predictive coding mission may be possible.

Many Practitioners “Dazed and Confused” over Electronic Discovery Definitions

Wednesday, October 24th, 2012

The song “Dazed and Confused,” by legendary rock band Led Zeppelin, has a great stanza:

 Been Dazed and Confused for so long it’s not true.

Wanted a woman, never bargained for you.

Lots of people talk and few of them know, soul of a woman was created below.

As I recently surveyed the definitions for “eDiscovery,” it occurred to me that lots of folks talk as if they know the definition, but few likely appreciate many of the subtle nuances. And, if you forced them to, many wouldn’t be able to write a concise eDiscovery definition.

The first, obvious place to look for an eDiscovery north star is the EDRM, which was originally responsible for creating the lingua franca for the entire industry.

EDRM (Electronic Discovery definition)

  • “Discovery documents produced in electronic formats rather than hardcopy. The production may be contained on hard drives, tapes, CDs, DVDs, external hard drives, etc. Once received, these documents are converted to .tif format. It is during the conversion process that metadata can be extracted.
  • A process that includes electronic documents and email into a collection of ‘discoverable’ documents for litigation. Usually involves both software and a process that searches and indexes files on hard drives or other electronic media. Extracts metadata automatically for use as an index. May include conversion of electronic documents to an image format as if the document had been printed out and then scanned.
  • The discovery of electronic documents and data including e-mail, Web pages, word processing files, computer databases, and virtually anything that is stored on a computer. Technically, documents and data are ‘electronic’ if they exist in a medium that can only be read through the use of computers. Such media include cache memory, magnetic disks (such as computer hard drives or floppy disks), optical disks (such as DVDs or CDs), and magnetic tapes. 
  • The process of finding, identifying, locating, retrieving, and reviewing potentially relevant data in designated computer systems.”

Gartner, the large IT analyst firm, proffers a different version.

Gartner (E-discovery definition)

“E-discovery is the identification, preservation, collection, preparation, review and production of electronically stored information associated with legal and government proceedings. The e-discovery market is not unified or simple — significant differences exist among vendors and service providers regarding technologies, specialized markets, overall functionality and service offerings. Content and records management, information access and search, and e-mail archiving and retention technologies provide key foundations to the e-discovery function. More and more enterprises are looking to insource at least part of the e-discovery function, especially records management, identification, preservation and collection of electronic files. E-discovery technology can be provided as a stand-alone application, embedded in other applications or services, or accessed as a hosted offering.”

The Sedona Conference, which is the leading think tank on all things eDiscovery, has the following definition:

Sedona (Electronic Discovery/Discovery definition)

“Electronic Discovery (“E-Discovery”): The process of identifying, preserving, collecting, preparing, reviewing, and producing electronically stored information (“ESI”) in the context of the legal process. See Discovery.”

“Discovery: Discovery is the process of identifying, locating, securing, and producing information and materials for the purpose of obtaining evidence for utilization in the legal process. The term is also used to describe the process of reviewing all materials that may be potentially relevant to the issues at hand and/or that may need to be disclosed to other parties, and of evaluating evidence to prove or disprove facts, theories, or allegations. There are several ways to conduct discovery, the most common of which are interrogatories, requests for production of documents, and depositions.”

Looking at these in concert, a few things come into focus, aside from the vexingly diverse naming conventions.  First, the EDRM definition focuses (as some might expect) on the tactics and practice of eDiscovery. This is a useful starting place, but it misses other elements, like the overall market dynamics, which are discussed (again not surprisingly) by Gartner. Gartner likewise addresses how eDiscovery is accomplished, referencing the need for software and the escalating trend of taking eDiscovery tools in house. Sedona (coming from a legal theory perspective) relies heavily on the legal definition of “discovery,” properly referencing its context in the legal process, a fact sometimes lost by practitioners who have expanded eDiscovery into other non-legal avenues.

These definitions are fine in the abstract, but even collectively they nevertheless fail to take into account several key points. First, as eDiscovery is quickly subsumed into the larger information governance umbrella, it’s important to stress the historically reactive nature of eDiscovery. This reactive posture can be nicely contrasted with the upstream concepts of information management and governance, which significantly impact the downstream, reactive elements.

Next, it’s important to recognize the costs/risks inherent in the eDiscovery process. Whether it’s due to spoliation sanctions or simply the costs of eDiscovery (which can easily reach $1.5 million per matter), the potential impact to the organization can’t be ignored. Without a true grasp of the organizational costs/risks, entities can’t properly begin to deploy either reactive or proactive solutions since they won’t have enough data for comprehensive ROI calculations. Finally, eDiscovery as a term has started to experience scope creep. What used to be firmly tethered to the legal discovery process has recently expanded into a number of similar (but non-legal) scenarios such as internal investigations, governmental inquiries, FOIA requests, FCPA matters, etc.

These additional aspects are critical for developing a comprehensive understanding of eDiscovery. And, while a comprehensive definition isn’t the final end game to this complex challenge, it’s certainly a better starting place than being “dazed and confused” about the nuances of eDiscovery.  Eliminating unnecessary confusion early in the game is ultimately essential to promoting and not hindering long term initiatives.

Federal Directive Hits Two Birds (RIM and eDiscovery) with One Stone

Thursday, October 18th, 2012

The eagerly awaited Directive from The Office of Management and Budget (OMB) and The National Archives and Records Administration (NARA) was released at the end of August. In an attempt to go behind the scenes, we’ve asked the Project Management Office (PMO) and the Chief Records Officer for the NARA to respond to a few key questions. 

We know that the Presidential Mandate was the impetus for the agency self-assessments that were submitted to NARA. Now that NARA and the OMB have distilled those reports, what are the biggest challenges on a go forward basis for the government regarding record keeping, information governance and eDiscovery?

“In each of those areas, the biggest challenge that can be identified is the rapid emergence and deployment of technology. Technology has changed the way Federal agencies carry out their missions and create the records required to document that activity. It has also changed the dynamics in records management. In the past, agencies would maintain central file rooms where records were stored and managed. Now, with distributed computing networks, records are likely to be in a multitude of electronic formats, on a variety of servers, and exist as multiple copies. Records management practices need to move forward to solve that challenge. If done right, good records management (especially of electronic records) can also be of great help in providing a solid foundation for applying best practices in other areas, including in eDiscovery, FOIA, as well as in all aspects of information governance.”    

What is the biggest action item from the Directive for agencies to take away?

“The Directive creates a framework for records management in the 21st century that emphasizes the primacy of electronic information and directs agencies to begin transforming their current processes to identify and capture electronic records. One milestone is that by 2016, agencies must be managing their email in an electronically accessible format (with tools that make this possible, not printing out emails to paper). Agencies should begin planning for the transition, where appropriate, from paper-based records management processes to those that preserve records in an electronic format.

The Directive also calls on agencies to designate a Senior Agency Official (SAO) for Records Management by November 15, 2012. The SAO is intended to raise the profile of records management in an agency to ensure that each agency commits the resources necessary to carry out the rest of the goals in the Directive. A meeting of SAOs is to be held at the National Archives with the Archivist of the United States convening the meeting by the end of this year. Details about that meeting will be distributed by NARA soon.”

Does the Directive holistically address information governance for the agencies, or is it likely that agencies will continue to deploy different technology even within their own departments?

“In general, as long as agencies are properly managing their records, it does not matter what technologies they are using. However, one of the drivers behind the issuance of the Memorandum and the Directive was identifying ways in which agencies can reduce costs while still meeting all of their records management requirements. The Directive specifies actions (see A3, A4, A5, and B2) in which NARA and agencies can work together to identify effective solutions that can be shared.”

Finally, although FOIA requests have increased and the backlog has decreased, how will litigation and FOIA intersect over, say, the next five years?  We know from the retracted decision in NDLON that metadata still remains an issue for the government…are we getting to a point where records created electronically will be able to be produced electronically as a matter of course for FOIA litigation/requests?

“In general, an important feature of the Directive is that the Federal government’s record information – most of which is in electronic format – stays in electronic format. Therefore, all of the inherent benefits will remain as well – i.e., metadata being retained, easier and speedier searches to locate records, and efficiencies in compilation, reproduction, transmission, and reduction in the cost of producing the requested information. This all would be expected to have an impact in improving the ability of federal agencies to respond to FOIA requests by producing records in electronic formats.”

Fun Fact- Is NARA really saving every tweet produced?

“Actually, the Library of Congress is the agency that is preserving Twitter. NARA is interested in only preserving those tweets that a) were made or received in the course of government business and b) appraised to have permanent value. We talked about this on our Records Express blog.”

“We think President Barack Obama said it best when he made the following comment on November 28, 2011:

“The current federal records management system is based on an outdated approach involving paper and filing cabinets. Today’s action will move the process into the digital age so the American public can have access to clear and accurate information about the decisions and actions of the Federal Government.”

Paul Wester, Chief Records Officer at the National Archives, has said that this Directive is very exciting for the Federal records management community: “In our lifetime, none of us has experienced the attention to the challenges that we encounter every day in managing our records management programs like we are seeing now. These are very exciting times to be a records manager in the Federal government. Full implementation of the Directive by the end of this decade will take a lot of hard work, but the government will be better off for doing this and we will be better able to serve the public.”

Special thanks to NARA for the ongoing dialogue that is key to transparent government and the effective practice of eDiscovery, Freedom Of Information Act requests, records management and thought leadership in the government sector. Stay tuned as we continue to cover these crucial issues for the government as they wrestle with important information governance challenges. 


How to Keep “Big Data” From Turning into “Bad Data” Resulting in eDiscovery and Information Governance Risks

Wednesday, October 10th, 2012

In a recent Inside Counsel article, I explored the tension between big data and the potentially competing notion of information governance by looking at the 5 Vs of Big Data…

“The Five Vs” of Big Data 

1.  Volume: Volume, not surprisingly, is the hallmark of the big data concept. Since data creation doubles every 18 months, we’ve rapidly moved from a gigabyte world to a universe where terabytes and exabytes rule the day.  In fact, according to a 2011 report from the McKinsey Global Institute, numerous U.S. companies now have more data stored than the U.S. Library of Congress, which has more than 285 terabytes of data (as of early this year). And to complicate matters, this trend is escalating exponentially with no reasonable expectation of abating. 

2. Velocity: According to the analyst firm Gartner, velocity can be thought of in terms of “streams of data, structured record creation, and availability for access and delivery.” In practical terms, this means organizations are having to constantly address a torrential flow of data into and out of their information management systems. Take Twitter, for example, where it’s possible to see more than 400 million tweets per day. As with the first V, data velocity isn’t slowing down anytime soon, either.

3. Variety: Perhaps more vexing than both the volume and velocity issues, the Variety element of big data increases complexity exponentially as organizations must account for data sources/types that are moving in different vectors. Just to name a few variants, most organizations routinely must wrestle with structured data (databases), unstructured data (loose files/documents), email, video, static images, audio files, transactional data, social media, cloud content and more.

4. Value: A more novel big data concept, value hasn’t typically been part of the standard definition. Here, the critical inquiry is whether the retained information is valuable either individually or in combination with other data elements that are capable of rendering patterns and insights. Given the rampant existence of spam, non-business data (like fantasy football emails) and duplicative content, it’s easy to see that just because data may have the other 3 Vs, it isn’t inherently valuable from a big data perspective.

5. Veracity: Particularly in an information governance era, it’s vital that the big data elements have the requisite level of veracity (or integrity). In other words, specific controls must be put in place to ensure that the integrity of the data is not impugned. Otherwise, any subsequent usage (particularly for a legal or regulatory proceeding, like e-discovery) may be unnecessarily compromised.”

“Many organizations sadly aren’t cognizant of the lurking tensions associated with the rapid acceleration of big data initiatives and other competing corporate concerns around important constructs like information governance. Latent information risk is a byproduct of keeping too much data and the resulting exposure due to e-discovery costs/sanctions, potential security breaches and regulatory investigations. As evidence of this potential information liability, it costs only $.20 a day to manage 1GB of storage. Yet, according to a recent Rand survey, it costs $18,000 to review that same gigabyte of storage for e-discovery purposes.”
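The asymmetry in those two figures is worth spelling out. A quick back-of-the-envelope calculation (using only the numbers quoted above) shows why over-retention is a latent liability rather than a cheap convenience:

```python
storage_cost_per_gb_per_day = 0.20   # management cost quoted above
review_cost_per_gb = 18_000.00       # Rand figure for eDiscovery review

# How many days of storage management equal one gigabyte of review cost?
days_equivalent = review_cost_per_gb / storage_cost_per_gb_per_day
years_equivalent = days_equivalent / 365

print(f"{days_equivalent:,.0f} days")   # 90,000 days
print(f"{years_equivalent:.0f} years")  # roughly 246 years
```

In other words, a single gigabyte would have to sit in storage for centuries before its management cost approached what it costs to review it once in litigation; the review exposure, not the storage bill, is where the risk lives.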

For more on this topic, click here.