Posts Tagged ‘review’

Review-less E-Discovery Review

Monday, July 21st, 2008

Most science fiction visions of the distant future seem to contain a rather singular fear: that the human race will be taken over by computers. Think of the “Terminator” series, preferably without the naked Arnold Schwarzenegger visual. Regardless of whether this vision fills you with trepidation or excitement, there is a very real possibility that we’re on the cusp of computers taking over a significant e-discovery task for attorneys.

For the past several decades, attorneys have had to manually review information for relevancy and privilege as part of the e-discovery process. Quoting from Information Inflation: Can the Legal System Adapt? by George Paul and Jason Baron, this task has always been viewed as sacrosanct “because of ‘death penalty’ waiver doctrine that evolved long ago when information was still manageable.”

Like so many industries, the legal profession has attempted to grapple with the transformation that the digital revolution has brought to the forefront. The latest revisions to the Federal Rules of Civil Procedure (FRCP) are the most obvious case in point. And yet, electronically stored information (ESI) is proving difficult to fit into traditional, even remodeled, paradigms. Even ignoring (for the moment) the proliferation of novel data types (e.g., blog content, voice over IP or VOIP, webmail, text messaging, web services, etc.), the amount of data that attorneys are being required to review has reached a tipping point of review feasibility.

Back in the day, information was measured in banker’s boxes, and even in the most document-intensive discovery matters this measuring stick supported the belief that armies of attorneys could conceivably conquer the massive document review problem. But now we often see clients that process routine matters containing terabytes of information. Most of us in the e-discovery space have become numb to the abstract nomenclature of megabytes, gigabytes, terabytes [i] and petabytes [ii], and in the process we may have failed to realize that we have moved well beyond the scale of information that can be reasonably attacked with even the largest armada of contract attorneys (assuming that the client could conceivably bear the astronomical costs).
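To put that scale in rough numbers, here is a back-of-the-envelope sketch built on the estimate cited in footnote i (75 million pages and $18,750,000 per terabyte). The function names are illustrative, and the reviewer throughput (`pages_per_hour`) and working hours per year are hypothetical inputs that do not come from the cited source.

```python
# Figures taken from footnote i: one terabyte is estimated at
# 75 million pages and could cost $18,750,000 to review.
PAGES_PER_TB = 75_000_000
REVIEW_COST_PER_TB = 18_750_000  # US dollars


def cost_per_page():
    """Implied review cost per page, in dollars."""
    return REVIEW_COST_PER_TB / PAGES_PER_TB


def review_cost(terabytes):
    """Estimated cost to manually review the given number of terabytes."""
    return terabytes * REVIEW_COST_PER_TB


def reviewer_years(terabytes, pages_per_hour, hours_per_year=2000):
    """Reviewer-years needed for an eyes-on review.

    pages_per_hour and hours_per_year are HYPOTHETICAL assumptions,
    not figures from the post or its sources.
    """
    total_pages = terabytes * PAGES_PER_TB
    return total_pages / pages_per_hour / hours_per_year


if __name__ == "__main__":
    print(f"Cost per page: ${cost_per_page():.2f}")
    print(f"Cost for 2 TB: ${review_cost(2):,.0f}")
    # 50 pages/hour is a made-up throughput used only for illustration.
    print(f"Reviewer-years for 1 TB: {reviewer_years(1, pages_per_hour=50):,.0f}")
```

Even with a generous throughput assumption, a single terabyte implies hundreds of reviewer-years, which is the feasibility problem described above.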

“At the petabyte scale, information is not a matter of simple three- and four-dimensional taxonomy and order but of dimensionally agnostic statistics. It calls for an entirely different approach, one that requires us to lose the tether of data as something that can be visualized in its totality. It forces us to view data mathematically first and establish a context for it later.” [iii]

I’m certainly not the first to point out that this tipping point is coming, but now we are really starting to see early adopters respond to this sea change. In their linked article above, George Paul and Jason Baron state “It is no exaggeration to say that litigation, as we have known it, is threatened by information’s new hyper-flow. The amount of electronically stored information relevant to a case is already a stress point in litigation.  […]  Litigators can no longer depend on manual review alone….”

Up until now, attorneys and the clients that are footing the bill have had to make a Hobson’s choice: either “force parties to continue hugely expensive privilege reviews, or to forego the attorney-client privilege or work-product privilege altogether.” But now it appears that another way is evolving.

The following lays out a scenario where a non-manual review methodology may make sense. ***Please note: this approach is not without risk. At this moment in time, neither clawback provisions, the potential adoption of Evidence Rule 502, nor any other known prophylactic measure can completely insulate a producing party from the unforeseen consequences of an inadvertent disclosure. But, as they say, desperate times call for desperate measures….

Step one: Evaluate the Environment

The following factors represent some of the elements that should be taken into consideration prior to skipping the normal, human-based review steps that are seen in most e-discovery matters.

  1. Large data set.  This may sound a bit obvious, but a non-manual approach is best suited for large, unwieldy data sets.  The corpus doesn’t need to be in the terabytes, but the data set should be evaluated in terms of discovery processing costs and attorney review estimates.
  2. Short Production Timelines.  Once the above calculations are conducted, the next step is to determine if a human-based review could even conceivably be conducted in the given time frame.  In many instances, an eyes-on review process just won’t be feasible since there won’t be enough bodies to throw at the problem.
  3. Next Gen “PAR” Tools.  In order to pull this “review-less” review process off, both safely and quickly, the responding party needs to have access to fast, robust processing, analysis and review (“PAR”) tools.  Certainly, it’s possible to have this scenario work with an e-discovery service provider, if they have the capability.
  4. Relatively Small Amount in Controversy.  For the time being, this approach should not be considered for any “bet the company” litigation, nor anything with significant downside risk (governmental inquiries, punitive damages, class actions, 2nd requests, etc.).  Yet, for many standard commercial lawsuits, corporate investigations, HR claims, etc. this review-less approach may be worth considering.
  5. Ability to Use a Clawback Provision.  Entering into a clawback provision with the opposition is mandatory in this methodology since the chances of an inadvertent production are statistically ever-present.  Yet, until Evidence Rule 502 is resolved, there will always be a risk that the clawback won’t be enforceable against 3rd parties.
  6. Non-governmental Production.  Most information in governmental productions becomes part of the public record, meaning that a clawback isn’t going to be feasible.  Here, it would be disastrous if trade secret information, personally identifiable data and the like were pushed out into the public domain.

Step two: Perform a Risk/Benefit Analysis

Next, take all the above factors into consideration and determine if the risks (of inadvertent production, the clawback being ineffective, etc.) are worth the benefits (reduced costs, lower attorney review fees, ability to meet deadlines, etc.).

Sure, this is hard work, but the alternative (manual review) is more illusory than realistic.

[In my next post, I’ll address the tactical steps to conduct a review-less review process.  Stay tuned……]

[i] One terabyte is generally estimated to contain 75 million pages and could conceivably cost $18,750,000 to review.  Anne Kershaw, Automated Document Review Proves Its Reliability, 5 DIGITAL DISCOVERY & E-EVIDENCE 11 (2005).

[ii] According to Wired, we’re now in the “Petabyte Age” where that amount of data is processed by Google’s servers every 72 minutes.

[iii] Wired article, above.

E-Discovery Review Platforms: The Merits Of “Review Faster” vs. “Review Less”

Wednesday, January 23rd, 2008

Perhaps the single greatest component of e-discovery costs is review, meaning the painstaking process whereby teams of attorneys evaluate information to determine its relevance to the case at hand. Why has review become so expensive? A recent Sedona Working Group Paper explains:

In 1990, a typical gigabyte of storage cost about $20,000; today it costs less than $1 dollar. As a result, more individuals and companies are generating, receiving and storing more data, which means more information must be gathered, considered, reviewed and produced in litigation. But, with billable rates for junior associates at many law firms now starting at over $200 per hour, the cost to review just one gigabyte of data can easily exceed $30,000.

That’s quite a difference: $1 to store a gigabyte of data vs. $30,000 to review it; and it has driven corporate legal departments and law firms to embrace e-discovery review platforms. These review platforms, which can be either packaged software or a hosted service, typically emphasize one of two main benefits:

  • “Review Faster”: Traditional review platforms increase attorney productivity by increasing the number of documents they can review each hour. For example, the name “Attenex” derives from the claim that it will help attorneys review documents “at 10x” the speed that they could do otherwise. These products help to a point, but – no matter how good the software – there is a limited number of documents that the human brain can digest in a day, so, even with them, review remains very expensive;
  • “Review Less”: More recent e-discovery solutions have focused on having attorneys review fewer documents by culling down data prior to review. This can massively reduce review costs, since 80%+ of documents can be eliminated without being read, but it does raise one serious question: how can you be sure that responsive documents do not inadvertently get culled?

The technical term for this issue is “elusion”, meaning: out of all the material judged as not responsive, how much is in fact responsive (i.e., how many false negatives does your culling methodology produce)? It is virtually impossible to answer that question definitively without a human reviewing the entire dataset to assess relevance, which, of course, defeats the point of culling in the first place. So the accepted practice is to use statistical sampling, whereby you test a random sample that gives you a certain confidence level about the total population. For example, to get a margin of error of about ±5% at a 95% confidence level (roughly 2-sigma), you need to randomly select and review about 400 documents. How easy is this to do? Actually, it’s pretty straightforward. Any good e-discovery solution should let you create a separate folder containing a random subset of non-responsive documents for human review as a quick check on the effectiveness of culling. You can determine the size of your sample according to the confidence level and margin of error you want to have.
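As a rough illustration of the sampling arithmetic, the sketch below uses the standard normal-approximation formula for estimating a proportion. That choice of formula is an assumption on my part (the post does not name one), the function names are hypothetical, and the sample counts in the usage lines are made up for illustration.

```python
import math


def sample_size(margin_of_error, z=1.96, p=0.5):
    """Sample size for estimating a proportion in a large population.

    Uses n = z^2 * p * (1 - p) / e^2. z = 1.96 corresponds to a 95%
    confidence level; p = 0.5 is the most conservative assumption
    about the true proportion.
    """
    return math.ceil(z ** 2 * p * (1 - p) / margin_of_error ** 2)


def elusion_estimate(responsive_found, sample_count, z=1.96):
    """Estimate the elusion rate from a reviewed sample of culled documents.

    Returns (rate, margin), where margin is the normal-approximation
    half-width of the confidence interval around the rate.
    """
    rate = responsive_found / sample_count
    margin = z * math.sqrt(rate * (1 - rate) / sample_count)
    return rate, margin


if __name__ == "__main__":
    # Documents to sample for a +/-5% margin at 95% confidence.
    print(sample_size(0.05))

    # Hypothetical result: reviewers find 4 responsive documents
    # in a random sample of 400 culled documents.
    rate, margin = elusion_estimate(4, 400)
    print(f"Estimated elusion: {rate:.1%} +/- {margin:.1%}")
```

Note that the required sample size depends on the margin of error and confidence level you choose, not on the size of the culled set, which is why sampling stays cheap even for very large populations. The normal approximation is coarse when very few responsive documents turn up, so treat the interval as indicative only.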

This is an area that Sedona and others have considered in great depth, and there are many excellent papers on the subject by people far more knowledgeable than me. To pick just a few, Herbert Roitblat has written extensively about sampling in e-discovery and elusion; and Daticon’s paper may be a few years old, but is well worth reading to understand the origins of the “review less” movement.

Practically speaking, as someone who has seen both approaches in action, I think that “review faster” is helpful, but if you want to massively reduce your e-discovery costs, then the big win is “review less” – even with sampling to mitigate concerns about elusion.

Can E-Discovery Really Be That Expensive?

Monday, May 21st, 2007

I tend to have a “Mark Twain perspective” on statistics and apply a healthy grain of salt to any numbers quoted by analysts and industry experts. But when end-users speak, I sit up and listen. That’s why I was very interested to read here that Microsoft “spends an average of US$ 20 million for e-discovery per litigation, according to one company exec.” (My thanks to George for alerting me to the article.)

If true, it is an astounding number - but one that is quite consistent with what we have seen firsthand working with other large enterprises ourselves. Once you factor in processing costs (an average of $1,800 per GB), review costs ($200/hour), and the huge volume of information being generated and stored, you can get up to $20 million on a single case surprisingly fast.
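A quick sketch of how those figures add up, using the $1,800 per GB processing cost and $200/hour review rate quoted above. The number of review hours per gigabyte is not given in this post; the 150 hours used below is an assumption, derived from the Sedona figure quoted in the January 2008 post above ($30,000 to review one gigabyte at $200 per hour). Function names are illustrative.

```python
PROCESSING_COST_PER_GB = 1_800  # dollars, from the post
REVIEW_RATE_PER_HOUR = 200      # dollars, from the post


def case_cost(gigabytes, review_hours_per_gb):
    """Total processing plus review cost for a matter of the given size.

    review_hours_per_gb is an ASSUMED input; the post does not state it.
    """
    processing = gigabytes * PROCESSING_COST_PER_GB
    review = gigabytes * review_hours_per_gb * REVIEW_RATE_PER_HOUR
    return processing + review


def gigabytes_for_budget(budget, review_hours_per_gb):
    """How many gigabytes it takes to reach a given total cost."""
    return budget / case_cost(1, review_hours_per_gb)


if __name__ == "__main__":
    hours_per_gb = 150  # assumption: $30,000 per GB / $200 per hour
    print(f"Cost per GB: ${case_cost(1, hours_per_gb):,}")
    gb = gigabytes_for_budget(20_000_000, hours_per_gb)
    print(f"GB needed to reach $20 million: {gb:,.0f}")
```

Under these assumptions, a matter of a little over 600 GB is enough to reach $20 million, which is well within the terabyte-scale collections described elsewhere on this page.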