Posts Tagged ‘Judge Grimm’

As the Electronic Discovery World Zurns

Wednesday, July 29th, 2009

Judge Grimm’s Victor Stanley case was lauded by many as one of the most significant electronic discovery cases of 2008, mainly for its bold proclamation that e-discovery search is a much more complex and technical discipline than has been typically understood by litigators.

“[F]or lawyers and judges to dare opine that a certain search term or terms would be more likely to produce information than the terms that were used is truly to go where angels fear to tread.”

Despite legions of articles and blogs on the topic, at least certain portions of the bench haven’t taken heed.  In the case In re: Zurn Pex Plumbing Products Liability Litigation, 2009 U.S. Dist. LEXIS 47636 (June 5, 2009) (hereinafter “Zurn”), U.S. District Judge Ann Montgomery receives points for understanding some basic e-discovery tenets around recall and precision, but then mysteriously goes where “angels fear to tread” by suggesting her own search terms.

Examining the case facts in more detail: Zurn is a class action products liability case where discovery was bifurcated (as is often the case – see Spieker v. Quest Cherokee) to first cover the class “certification” component.  Initially, the Magistrate partially closed the door on broader ESI discovery, stating that “while ESI may prove to be relevant to the first stage of discovery, we cannot meaningfully make that prediction now, and require the parties to engage in what could be vastly more expensive, and yet utterly futile, discovery.”  However, the Magistrate didn’t shut the door entirely, suggesting that “should the parties uncover voids in the information disclosed in hard copy form, they are . . . at liberty to press for further discovery including electronically stored information.”

Despite complying with Sedona’s Cooperation Proclamation (“The parties have worked amicably throughout the discovery process”), opposing counsel still came to loggerheads when plaintiff found “voids” in the initial paper productions via third party discovery.  The plaintiff brought a motion to compel ESI discovery and the defendant objected, stating two primary arguments: (1) the Magistrate had earlier ruled out ESI discovery and (2) if they had to perform ESI discovery it would be unduly burdensome/expensive.

Judge Montgomery summarily rejected the first argument, but was concerned about the burden surrounding the proposed ESI discovery.  Here, the calculations get a bit confusing, but plaintiff’s request would have resulted in 361 gigabytes of ESI from employee email sources, as well as shared “J” and “K” drives.  The defendant multiplied the gigabyte number by 75,000 pages per gigabyte (roughly 27 million pages) and claimed that it would have required “approximately seventeen weeks and cost $ 1,150,000, exclusive of vendor collection and processing costs, to review and process the data.”  Assuming a rather modest $1,000 per gigabyte for processing and hosting costs, defendants could’ve added roughly another $361,000 for the project.
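As a rough check on those numbers, the arithmetic can be sketched in a few lines of Python. The 361 gigabytes, the 75,000 pages per gigabyte and the $1,000 per gigabyte rate come from the discussion above; the variable names are purely illustrative, and none of this is taken from the parties' affidavits.

```python
# Figures from the discussion above; the names are illustrative only.
GIGABYTES = 361
PAGES_PER_GB = 75_000            # defendant's conversion factor
PROCESSING_COST_PER_GB = 1_000   # assumed "modest" processing/hosting rate, in dollars

total_pages = GIGABYTES * PAGES_PER_GB
processing_cost = GIGABYTES * PROCESSING_COST_PER_GB

print(f"Estimated pages: {total_pages:,}")
print(f"Estimated processing and hosting cost: ${processing_cost:,}")
```

The page count works out to just over 27 million, which lines up with the “27 million pages” figure the court mentions in the passage quoted below.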

Ultimately, the court was not persuaded by the supporting affidavits, nor the attorney’s representations about the resulting burden:

“It is unclear whether Zurn’s cost and time numbers are based on a review of 27 million pages of documents, the 3.6 million pages of documents limited to the J Drive and custodians’ emails, or a smaller sample of document pages likely to be flagged as a result of a search for certain relevant terms proposed by Plaintiffs. The affidavit of Ms. Freestone, an attorney and not an expert on document search and retrieval, is not compelling evidence that the search will be as burdensome as Zurn avers.”

The 361 gigabytes apparently resulted from “hits” corresponding to plaintiff’s 26 search terms.  The court correctly identified that those terms had precision issues (“many of Plaintiffs’ proposed search terms will likely produce a large number of ‘hits’ that have limited relevance in the case”).

Unfortunately, in an effort to increase the search precision, the Judge did not take heed of Judge Grimm’s warning and surprisingly took matters into her own hands: “the Court will limit the search to the following fourteen terms based on the likelihood that they will  produce relevant documents without including a vast number of documents that are likely irrelevant to the litigation.”  Here is the Judge’s list of keywords:

(1) AADFW,
(2) Corrosion,
(3) Corrosive,
(4) Corrosive Water,
(5) Crack,
(6) De-zinc,
(7) Dezincification,
(8) DZR,
(9) Fail,
(10) IMR,
(11) Leak,
(12) MES,
(13) SCC,
(14) Stress corrosion cracking

Without looking at the underlying data, it’s clear from the outset that Judge Montgomery didn’t craft a good search strategy (as Judge Grimm might have predicted).  For example, terms 2, 3, 4 and 14 could’ve been captured by a single stemmed search using the term “corros*.” Without such a stemmed search approach, the terms would probably have been run singly in the proposed protocol, meaning that overlapping terms would’ve hit many of the same documents again and again, thereby resulting in wasted attorney review time and processing costs.
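To make the overlap concrete, here is a minimal Python sketch. It assumes that a wildcard like “corros*” behaves as a case-insensitive word-prefix match (actual review platforms may define wildcards differently), and the three documents are invented purely for illustration.

```python
import re

# The Court's fourteen terms, as listed above.
COURT_TERMS = [
    "AADFW", "Corrosion", "Corrosive", "Corrosive Water", "Crack",
    "De-zinc", "Dezincification", "DZR", "Fail", "IMR", "Leak",
    "MES", "SCC", "Stress corrosion cracking",
]

# Assumption: "corros*" is modeled as a case-insensitive word-prefix match.
STEM = re.compile(r"\bcorros\w*", re.IGNORECASE)

# Which of the Court's terms would the single stemmed search subsume?
covered = [term for term in COURT_TERMS if STEM.search(term)]

# Hypothetical documents, invented for illustration only.
docs = {
    1: "Lab report notes corrosion and stress corrosion cracking in fittings.",
    2: "Customer complaint about corrosive water damaging the valve.",
    3: "Quarterly sales forecast for the plumbing division.",
}

def hits(term):
    """Return IDs of documents containing the term (case-insensitive substring)."""
    return {doc_id for doc_id, text in docs.items() if term.lower() in text.lower()}

# Running each covered term singly counts the same documents more than once.
separate_hits = sum(len(hits(term)) for term in covered)
# The stemmed search returns each matching document exactly once.
stemmed_hits = len({doc_id for doc_id, text in docs.items() if STEM.search(text)})

print(covered)
print(separate_hits, stemmed_hits)
```

In this toy set, the four terms run singly produce four hits across only two distinct documents, which is the kind of duplication a reviewer would otherwise have to wade through or de-duplicate after the fact.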

Judge Montgomery did recognize the potential error of her ways and gave the parties an out:

“The parties may decide on a different set of fourteen terms if they choose to do so. Additionally, if the search, as ordered by the Court, proves to be overly burdensome or costly, Zurn may renew its objection by presenting the Court with specific information including evidence from computer experts on applying the search terms, the number of documents identified, and the cost and time burdens of vetting documents.”

This “specific evidence” language seems to track notions from Sedona’s search best practices protocol, which prescribes sampling and iterative search term refinement.  What is surprising is that knowing this she would nevertheless blindly proffer the 14 term search strategy.  Instead, she should’ve quoted Victor Stanley and required the parties to come up with a data driven approach that met requisite precision and recall metrics.
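For readers less familiar with the two metrics, the following Python sketch shows how precision and recall could be computed once a sample of documents has been reviewed. The document IDs and relevance calls are hypothetical, and the function name is my own; this is not a procedure taken from Sedona or from either opinion.

```python
def precision_recall(retrieved, relevant):
    """Compute precision and recall for a set of retrieved document IDs."""
    true_positives = len(retrieved & relevant)
    precision = true_positives / len(retrieved) if retrieved else 0.0
    recall = true_positives / len(relevant) if relevant else 0.0
    return precision, recall

# Hypothetical reviewed sample: IDs the search returned versus
# IDs that reviewers judged to be relevant.
retrieved = {1, 2, 3, 4, 5, 6, 7, 8}
relevant = {2, 4, 6, 8, 9, 10}

precision, recall = precision_recall(retrieved, relevant)
print(f"precision={precision:.2f}, recall={recall:.2f}")
```

Precision is the share of retrieved documents that are actually relevant, and recall is the share of relevant documents that the search managed to retrieve. Iterative refinement of search terms is, in effect, an attempt to push both numbers up against a reviewed sample like this one.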

A Gross Inability to Craft Electronic Discovery Searches

Thursday, April 9th, 2009

The bashing of our judicial system seems to have reached a fever pitch.  Groups like the American College of Trial Lawyers (“ACTL”) have proclaimed in a recent report that while the “civil justice system is not broken, it is in serious need of repair.”  The blame game seems to have judges and attorneys alike pointing fingers.  The Fellows of the ACTL (perhaps not surprisingly) seem to pin some of the blame on the judiciary:

“Judges should have a more active role at the beginning of a case in designing the scope of discovery and the direction and timing of the case all the way to trial. Where abuses occur, judges are perceived not to enforce the rules effectively.”

Groups like the Sedona Conference chalk up many of the ills to the failure to cooperate, so much so that they’ve orchestrated a cooperation proclamation – which has picked up enough support by the bench to have garnered several cites in the case law (see e.g., Mancia).

The bench for its part seems to put some of the onus on litigators and their reticence to get with the times.  William A. Gross Constr. Assocs., Inc. v. Am. Mfrs. Mut. Ins. Co., 2009 WL 724954 (S.D.N.Y. Mar. 19, 2009) is the latest example of such a proclamation.  In this construction defect case, Judge Peck (a Sedona devotee) issues what he hopes will be a “wake-up” call to the bar about the need for “careful thought, quality control, testing, and cooperation with opposing counsel in designing search terms or ‘keywords’ to be used to produce emails or other electronically stored information (‘ESI’).”  In Gross, the court had to mediate an e-discovery dispute where the requesting parties propounded a blatantly over-inclusive search request.  Unfortunately, the responding entity was a non-party, and it simply buried its head in the sand.  To reach a resolution, the Court was left in the “uncomfortable position” of having to craft a “keyword search methodology for the parties, without adequate information from the parties (and Hill).”

Judge Peck’s exasperation with these antics was palpable.  Summing up the problem by citing Judge Grimm and Victor Stanley he stated: “This case is just the latest example of lawyers designing keyword searches in the dark, by the seat of the pants, without adequate (indeed, here, apparently without any) discussion with those who wrote the emails.”  He further noted: “[w]hile this message has appeared in several cases from outside this Circuit, it appears that the message has not reached many members of our Bar.”

After noting both Sedona and Judge Facciola (of O’Keefe and Equity Analytics fame) Peck’s opinion reached a crescendo:

“Electronic discovery requires cooperation between opposing counsel and transparency in all aspects of preservation and production of ESI. Moreover, where counsel are using keyword searches for retrieval of ESI, they at a minimum must carefully craft the appropriate keywords, with input from the ESI’s custodians as to the words and abbreviations they use, and the proposed methodology must be quality control tested to assure accuracy in retrieval and elimination of ‘false positives.’ It is time that the Bar-even those lawyers who did not come of age in the computer era-understand this.”

While it’s easy to see who Peck blames in this brouhaha, it takes (at least) two to tango, meaning that litigants on both sides of the “v” must move beyond the typical “seat of the pants” electronic discovery wrangling.  And judges need to be savvy enough to spot the issues and help (or force) the parties into such an enlightened, cooperative state.  Nothing short of that will get the job done.

The Electronic Discovery Sheriff Is Back In Town

Thursday, January 29th, 2009

As Tiger Woods is to golf, the Honorable Shira A. Scheindlin is to electronic discovery.  She has unquestionably been the most dominant/visible/outspoken jurist in the electronic discovery realm over the past decade, penning, amongst others, the Zubulake opinion, which is commonly referred to as the gold standard in electronic discovery.

But, like Woods, who recently took a sabbatical to mend his surgically repaired knee, Judge Scheindlin has recently been eclipsed by several other notable electronic discovery jurists, namely Judge Grimm (of Victor Stanley and Mancia fame) and Judge Facciola (aka “the Italian Stallion”), both of whom made numerous “best of the year” electronic discovery case law lists.

With Securities and Exchange Commission v. Collins & Aikman Corp., 2009 WL 94311 (S.D.N.Y., Jan. 13, 2009) Judge Scheindlin serves notice that the sheriff is back in town.  She not only tackles a number of thorny electronic discovery topics, but ambitiously takes on the US government in the process.  It’s a fairly lengthy opinion, well worth the read, so I’ll just excerpt a few of the notable takeaways.

As a bit of background…  the Collins case centered around a securities fraud complaint brought by the SEC against the Collins & Aikman Corp. and its former CEO David A. Stockman.  The crux of the dispute surrounded questions concerning the government’s discovery obligations in civil discovery (versus in a purely SEC investigation per se).

There were four distinct but interrelated disputes, namely:

“(1) Whether identifying responsive documents that have been organized by the producing party invades the protection accorded to attorney work-product and how a government agency-acting in its investigative capacity-must respond to a request for the production of documents. (2) Whether a government agency may unilaterally restrict the scope of its search based on an assertion of an “undue burden” on limited public resources. (3) How much information the Government must disclose in order to allow an adversary-and the court-to assess an objection based on the deliberative process privilege. (4) Whether a government agency may unilaterally exclude its own e-mail from document production on the ground that most-but not all-will be privileged.”

Addressing the work product claims, the court found against the government, again reinforcing several recent opinions about electronic discovery search:

“The SEC contends that Stockman can search through the ten million pages and find substantially the same documents identified by the SEC without impinging on the thought processes of the SEC attorneys. Indeed-at significant expense and delay-Stockman could search the document databases using appropriate search terms, but the inaccuracy of such searches is by now relatively well known.  A page-by-page manual review of ten million pages of records is strikingly expensive in both monetary and human terms and constitutes “undue hardship” by any definition.” [Citing George L. Paul and Jason R. Baron's article: Information Inflation: Can the Legal System Adapt?]

After losing the first battle, the SEC argued that even if the compilations were not protected as work product, it could produce the "complete, unfiltered, and unorganized investigatory file" since this was how the documents were "maintained in the usual course of its business."  This second attempt was similarly unpersuasive as Judge Scheindlin held that the "usual course of business" exemption did not apply:

“[C]onducting an investigation-which is by its very nature not routine or repetitive-cannot fall within the scope of the “usual course of business.” While the SEC routinely collects and maintains regulatory submissions such as 10-K reports, in its investigative capacity the agency conducts tailored probes of a company or an industry, requiring the gathering of records from diverse sources. Many if not most of the 1.7 million documents in the SEC production here were likely collected in the agency’s investigatory role. Thus it is no surprise that the complete collection is maintained as it was collected-in large disorderly databases. The documents can only be provided in a useful manner if the agency organizes or labels them to correspond to each demand.”

Next, Judge Scheindlin addressed the SEC’s decision to “unilaterally” limit its search to “centralized compilations” which ultimately “turned up nothing.”  She found that the SEC’s “blanket refusal to negotiate a workable search protocol” was “patently unreasonable” citing both Mancia and the Sedona Conference’s Cooperation Proclamation:

“Rule 26(f) requires the parties to hold a conference and prepare a discovery plan. … Had this been accomplished, the Court might not now be required to intervene in this particular dispute. I also draw the parties’ attention to the recently issued Sedona Conference Cooperation Proclamation, which urges parties to work in a cooperative rather than an adversarial manner to resolve discovery issues in order to stem the ‘rising monetary costs’ of discovery disputes.”

As the coup de grâce, Judge Scheindlin addressed and rejected out of hand the SEC’s most untenable claim that it would not produce e-mail “generated or received by the Commission itself” because “nearly all responsive e-mails will be privileged, protected, or non-substantive.”

“Because e-mails are inherently searchable, the SEC’s blanket refusal to produce any in-coming or outgoing e-mails is unacceptable. Without even an attempt to negotiate search terms that would weed out privileged, protected, or irrelevant e-mails, the SEC cannot reasonably assert that a routine aspect of modern discovery-search and review of a party’s e-mail-is beyond its capability. Essentially, the SEC’s position is that the cost of such a search is simply too high, but it has made no effort to document the cost or the likelihood that it would produce relevant, nonprivileged material. The concept of sampling to test both the cost and the yield is now part of the mainstream approach to electronic discovery.”
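As a rough illustration of what “sampling to test both the cost and the yield” might look like in practice, here is a minimal Python sketch. Every number in it (the sample size, the count of relevant e-mails, the size of the mailbox population) is hypothetical, and the normal-approximation margin of error is just one common way to express the uncertainty; nothing here is drawn from the Collins opinion itself.

```python
import math

def estimate_yield(relevant_in_sample, sample_size, z=1.96):
    """Estimate the relevant fraction with a normal-approximation margin of error."""
    p = relevant_in_sample / sample_size
    margin = z * math.sqrt(p * (1 - p) / sample_size)
    return p, margin

# Hypothetical figures: reviewers read a random sample of 400 e-mails
# and judge 60 of them relevant and non-privileged.
SAMPLE_SIZE = 400
RELEVANT_IN_SAMPLE = 60
POPULATION = 100_000  # hypothetical total number of e-mails

p, margin = estimate_yield(RELEVANT_IN_SAMPLE, SAMPLE_SIZE)
projected_relevant = round(p * POPULATION)

print(f"Estimated yield: {p:.1%} (+/- {margin:.1%} at roughly 95% confidence)")
print(f"Projected relevant e-mails: {projected_relevant:,}")
```

A party armed with an estimate like this can document the likely yield (and, by timing the sample review, the likely cost) instead of simply asserting that a search would be too expensive.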

At the end of the day, the Collins opinion seems to make the statement that Judge Scheindlin is back with a vengeance, and she’s serving notice that the government isn’t above the law:

“Like any ordinary litigant, the Government must abide by the Federal Rules of Civil Procedure.”

Besides knocking the government down a peg, Judge Scheindlin throws her judicial weight behind a number of important but nascent trends, including the Sedona Cooperation Proclamation, the related need to meet & confer, the use of sampling and the challenges of electronic discovery search. While none of these notions are groundbreaking, her substantial backing means increasing clarity for lawyers and litigation support practitioners everywhere.  And, that’s certainly welcome.

Top 5 Cases That Shaped Electronic Discovery in 2008

Friday, December 12th, 2008

Picking five out of the sea of electronic discovery cases isn’t as easy as it sounds.  Sure, a few, like our “Case of the Year” will be no-brainers, but others aren’t as clear cut.  And, they’re certainly open to debate.  But, in my humble opinion here’s THE list, counting down David Letterman style:

5) Mancia v. Mayflower Textile Servs. Co., 2008 WL 4595175 (D. Md. Oct. 15, 2008)

If there ever was an opinion written by a judge to make a larger societal point, Mancia was certainly it.  Judge Paul Grimm, who’ll appear on this list in another slot as well, has clearly taken the mantle from Judge Scheindlin as the leading electronic discovery jurist.  He’d heretofore authored a number of significant opinions in this area, including Hobson and Thompson. Now, in Mancia he used a garden variety discovery dispute, which was typically rife with boilerplate objections and other obstreperous tactics, to highlight the Sedona Conference’s Cooperation Proclamation.

The lasting takeaway from the opinion is the notion that “[c]ourts repeatedly have noted the need for attorneys to work cooperatively to conduct electronic discovery, and sanctioned lawyers and parties for failing to do so.” To support this notion he cites the Sedona Conference Proclamation and the little used FRCP 26(g).  This opinion is noteworthy because it gives precedent to bolster the Sedona initiative and should provide a ready citation for all those counsel who aren’t getting the level of cooperation they need from the opposition.  It remains to be seen if other judges will follow suit, but this could be the beachhead for a more cooperative electronic discovery process in 2009 and beyond.

4) Flagg v. City of Detroit, 252 F.R.D. 346 (E.D. Mich. 2008)

Flagg highlights the growing need to reconcile the electronic discovery landscape, which typically focuses somewhat myopically on email, with the larger informational trends which are now characterized by the use of blogs, social networking sites, instant messaging, and text messaging.  Flagg was one of the first to determine that text messages (e.g., messages exchanged among certain officials and employees of the City of Detroit via city-issued text messaging devices) were discoverable under the standards of FRCP 26(b)(1).  The holding further demonstrated the challenges of conducting electronic discovery across information systems that mix personal information with business communications.  This type of information commingling will continue to escalate, causing significant long term electronic discovery challenges due to thorny privacy, privilege and policy implications.

3) Rhoads Indus., Inc. v. Bldg. Materials Corp. of Am., 2008 WL 4916026 (E.D. Pa. Nov. 14, 2008)

Rhoads is one of the first cases post Federal Rule of Evidence (FRE) 502, which recently created a national standard (versus the previous split in jurisdictions) and now states a “middle ground” for the determining of inadvertent disclosure during electronic discovery.  The key provision is (b)(2) which provides protection only if “the holder of the privilege or protection took reasonable steps to prevent disclosure.”  So, Rhoads took that “reasonableness” question head on in a scenario where the plaintiff Rhoads admittedly (yet inadvertently) produced over eight hundred privileged, electronic documents.  The decision is significant because it used the five-factor test stated in Fidelity, but put an undue weighting on the final test which was: “whether the overriding interests of justice would be served by relieving the party of its errors.”   This approach potentially threatens the development of sound case law that will be necessary to help the deployment of FRE 502 into practice because it casts too much uncertainty with its weighting of “fairness” (a problematically vague notion) in the analysis.  It will be interesting to see if/how this approach is subsequently adopted as we enter the New Year.

2) Qualcomm Inc. v. Broadcom Corp., 2008 WL 66932 (S.D. Cal. Jan. 7, 2008)

This for many was the case of the year given its far reaching implications for the legal community.  Some have argued that this isn’t an e-discovery abuse case per se, but more of an example of discovery abuses that just so happened to be centered around ESI.  In either case, the fraud, resulting cover-up, sanctions, ethical issues and privilege discussions made for insightful and thought provoking reading throughout 2008.  The lasting takeaway from Qualcomm appears to be the implications of not just committing discovery abuses, but of failing to have a well thought out e-discovery plan that is actively executed/monitored by outside counsel.  The resulting tension between outside counsel, inside counsel and the internal IT department may continue to escalate if more cases like this make the headlines in 2009.

1)  E-Discovery Case of the Year: Victor Stanley, Inc. v. Creative Pipe, Inc., 2008 WL 2221841 (D. Md. May 29, 2008)

Judge Grimm’s hallmark opinion has had the legal community buzzing over the past several months and the reason appears pretty straight forward.  In Victor Stanley Grimm builds on the holdings in Seroquel, O’Keefe and Equity Analytics, to boldly cast doubt on a practice so routine that it’s literally shocked the legal community into reevaluation:

“[D]etermining whether a particular search methodology, such as keywords, will or will not be effective certainly requires knowledge beyond the ken of a lay person (and a lay lawyer) . . . .”

The notion that electronic discovery search is beyond the ability of most attorneys has caused tremors within the litigation support community, which has a long history of blindly receiving keywords from counsel, running them and handing back the results – often blissfully unaware of the extent to which those keyword searches actually located relevant information.  Victor Stanley’s analysis of the “reasonableness” of search protocols also has an impact on FRE 502, which cements its place alongside other e-discovery “must reads” such as Zubulake and Morgan Stanley.

The cases above are my Top 5.  What additional cases do you think were important?  Please let me know by commenting on the cases you think shaped electronic discovery in 2008 and why.

All Electronic Discovery Rhoads Lead to FRE 502 “Reasonableness”

Tuesday, December 9th, 2008

With the recent implementation of Federal Rules of Evidence (FRE) 502 litigants have been waiting to see what kind of impact this rule will have in practice – particularly with the anticipated reduction of attorney review costs during electronic discovery.  In Rhoads Indus., Inc. v. Bldg. Materials Corp. of Am., 2008 WL 4916026 (E.D. Pa. Nov. 14, 2008) we see an early indication that things aren’t quite as clear as people had hoped.

In this breach of contract and negligent misrepresentation action plaintiff Rhoads admittedly (yet inadvertently) produced over eight hundred privileged, electronic documents during eDiscovery.  After returning the documents, Defendants filed a motion claiming that Rhoads waived privilege because:

  • its production was careless,
  • its response in seeking the return of the documents was delayed, and
  • it failed to produce complete and accurate privilege logs.

The court began its analysis by focusing on FRE 502, which recently created a national standard (versus the previous split in jurisdictions) and now states a “middle ground” for determining inadvertent disclosure during eDiscovery.  The key provision is (b)(2), which provides protection if “the holder of the privilege or protection took reasonable steps to prevent disclosure.”

As the court began its legal analysis it quickly noted the similarity to Victor Stanley, Inc. v. Creative Pipe, Inc., which had “analogous facts” despite being decided pre-FRE 502.  Both Rhoads and Victor Stanley applied the five-factor test stated in Fidelity, whose factors are:

  1. the reasonableness of the precautions taken to prevent inadvertent disclosure in view of the extent of the document production,
  2. the number of inadvertent disclosures,
  3. the extent of the disclosure,
  4. any delay in measures taken to rectify the disclosure, and
  5. whether the overriding interests of justice would be served by relieving the party of its errors.

The Rhoads court indicated its belief that “the most appropriate approach is to first determine whether the producing party has at least minimally complied with the three factors stated in Rule 502, i.e., that the waiver was inadvertent, the party took reasonable steps to prevent disclosure, and attempted to rectify the error.”  Acknowledging that the reasonableness of Rhoads’ review was the crux of the dispute, the court then concluded, “that once the producing party has shown at least minimal compliance with the three factors in Rule 502, but ‘reasonableness’ is in dispute, the court should proceed to the traditional five factor test.”

Factor 1 (the reasonableness of the precautions)

Despite the unfortunate results, Rhoads actually started out on the right foot.  First, they recognized that with extensive electronic discovery on the horizon they needed an IT consultant to research software for the in-house processing and searching effort.  The consultant tested and then purchased a tool to perform the necessary electronic data searches, although it wasn’t clear how they selected that product or whether they reviewed any other similar solutions.

“The fact that Rhoads retained a consultant who recommended and used a fairly sophisticated screening device shows that Rhoads substantially complied with the following Explanatory Note to Rule 502: ‘A party that uses advanced analytical software applications and linguistic tools in screening for privilege and work product may be found to have taken “reasonable steps” to prevent inadvertent disclosure. The implementation of an efficient system of records management before litigation may also be relevant.’”

After picking out the software tool, the IT consultant identified a large volume of potentially responsive documents.  The consultant then worked with Rhoads’ attorneys to identify keyword searches intended to filter out the privileged material, and removed the documents those searches hit from the group.  The search was run a second time to verify its accuracy.  Given the large volume of documents remaining even after removing materials hit by the privilege search, Rhoads’ counsel modified the original search terms and reduced the volume of potentially responsive documents to 78,000.  Rhoads’ counsel then manually reviewed a separate group of emails from specific accounts to identify and remove privileged documents, which were then added to separate privilege logs.

On the other side of the ledger, there were a number of things the court found lacking in Rhoads’ methodology, citing Victor Stanley, including a failure in crafting a viable search strategy: “Plaintiff produced documents that its limited search should have caught. Therefore Plaintiff not only failed to craft the right searches, but the searches it ran failed. Plaintiff has no explanation for this.  … Here there was no testing [read: no sampling] of the reliability or comprehensiveness of the keyword search. Plaintiff’s only testing of its search was to run the same search again.”

Factor 2 (The Number of Inadvertent Disclosures)

While 800 inadvertently produced documents was only 1-2% of the data set it still was a large number standing alone, especially compared to Victor Stanley, which had 165 at issue.  So, the court found that this issue favored the Defendants.

Factor 3 (The Extent of the Disclosure)

Read on.

Factor 4 (Any Delay in Measures Taken to Rectify the Disclosure)

The court skipped factor 3 and went instead to factor four, finding that this too favored Defendants.  Significantly the court found fault with the resources Plaintiff brought to bear on the issue and also noted that “Defendants had to bring Plaintiff’s error to its attention instead of Plaintiff catching its own mistake” (as in Victor Stanley).

Factor 5 (Fairness)

Now here’s where things get interesting.  Despite finding for the Defendants on the previous 4 (really 3) factors – meaning that they weren’t on balance “reasonable” – the court puts an unbalanced weighting on this final fairness factor:

“Although Rhoads took steps to prevent disclosure and to rectify the error, its efforts were, to some extent, not reasonable…. The most significant factor, …, is that Rhoads failed to prepare for the segregation and review of privileged documents sufficiently far in advance of the inevitable production of a large volume of documents.”

And yet, “I find that the fifth factor, the interest of justice, strongly favors Rhoads. Loss of the attorney-client privilege in a high-stakes, hard-fought litigation is a severe sanction and can lead to serious prejudice. … [D]enying these documents to Defendants is not prejudicial to Defendants because, in the first place, they have no right or expectation to any of Rhoads’ privileged communications.”

The judge went on to further shore up his overreliance on the “fairness” prong by taking a crack at Judge Grimm’s analysis in Victor Stanley: “I believe that Judge Grimm’s analysis reflects, to a more significant degree than I believe appropriate, application of hindsight, which should not carry much weight, if any, because no matter what methods an attorney employed, an after-the-fact critique can always conclude that a better job could have been done.”

Interesting….  It seems that Rhoads stands for a fairness weighted approach that effectively eviscerates the entire reasonableness analysis mandated by FRE 502 as applied in Victor Stanley and Fidelity.  It seems to me that waiver of privilege is always going to be a “severe sanction” leading to “serious prejudice.”  That’s why inadvertent disclosure is called the third rail of e-discovery.  But, if you want the newly articulated reasonableness standard to mean anything, the “fairness” prong can’t trump the rest of the analysis.

I’m sure this will play out in the near future, but it’s my guess that “reasonable” minds will prevail…

The Sedona Cooperation Proclamation and the Case for Collaboration

Monday, November 17th, 2008

Without getting in Dutch with the key Sedona Conference principle that “what happens at Sedona, stays at Sedona” I thought I’d nevertheless write a post that focuses on the core topic at this year’s annual meeting, namely the case for cooperation in e-discovery.

According to the “Cooperation Proclamation” e-discovery is facing an unprecedented crisis:

“The costs associated with adversarial conduct in pre-trial discovery have become a serious burden to the American judicial system. This burden rises significantly in discovery of electronically stored information (“ESI”). In addition to rising monetary costs, courts have seen escalating motion practice, overreaching, obstruction, and extensive, but unproductive discovery disputes – in some cases precluding adjudication on the merits altogether – when parties treat the discovery process in an adversarial manner. Neither law nor logic compels these outcomes. With this Proclamation, The Sedona Conference launches a national drive to promote open and forthright information sharing, dialogue (internal and external), training, and the development of practical tools to facilitate cooperative, collaborative, transparent discovery.”

These sentiments about the “broken” nature of the discovery process echo in many ways the draft findings from the Interim Report & 2008 Litigation Survey from the Fellows of the American College of Trial Lawyers which stated:

“The joint study grew out of a concern that discovery is increasingly expensive and that the expense and burden of discovery are having substantial adverse effects on the civil justice system. There is a serious concern that the costs and burdens of discovery are driving litigation away from the court system and forcing settlements based on the costs, as opposed to the merits, of cases.”

In both instances, the core notion is that “we’ve met the enemy and the enemy is us” because it’s the participants in the process who have collectively perverted the discovery process to the point it’s at today.

Sedona’s focus on this front has received at least some traction from the bench, as echoed in Mancia v. Mayflower Textile Servs. Co., 2008 WL 4595175 (D. Md. Oct. 15, 2008).  Mancia, written by leading e-discovery jurist Judge Grimm, was a fairly pedestrian employment litigation case where the parties had come to loggerheads over the e-discovery process.  Judge Grimm held that “[c]ourts repeatedly have noted the need for attorneys to work cooperatively to conduct discovery, and sanctioned lawyers and parties for failing to do so” citing both the Sedona Cooperation Proclamation and the Survey.

Judge Grimm also observed that these recent lamentations about the costs of civil litigation aren’t terribly dissimilar to those voiced eighteen years ago when the Civil Justice Reform Act of 1990, 28 U.S.C. §§ 471 et seq., was passed:

“Perhaps the greatest driving force in litigation today is discovery. Discovery abuse is a principal cause of high litigation transaction costs. Indeed, in far too many cases, economics-and not the merits-govern discovery decisions. Litigants of moderate means are often deterred through discovery from vindicating claims or defenses, and the litigation process all too often becomes a war of attrition for all parties.”

Given the fundamentally adversarial nature of litigation, the Sedona initiative is either dramatically ambitious or simply tilting at windmills.  While generally a skeptic by nature, I think that the bench’s early participation and downstream behavior modification is the linchpin to reforming the litigating masses.  Given the long term “sales” cycle involved here, I doubt if we’ll know whether this effort will gain real traction for at least several years.

Federal Rule of Evidence 502: Help or Hype?

Thursday, November 13th, 2008

There’s a lot of excitement (and corresponding uncertainty) about the recent passing of Federal Rule of Evidence 502 (FRE 502), which was signed into law on Sept 19th.  The main reason that the legal community is excited about FRE 502 is because of the potential for cost savings by reducing the amount of money associated with the e-discovery review process, which is routinely viewed as the most expensive area in the entire e-discovery process.

In combination with the codification of a national standard to determine when a privilege has been waived, FRE 502 is primarily designed to make the use of claw-back agreements a truly viable prospect when doing e-discovery privilege review.  Ideally, it should provide some relief from rapidly escalating e-discovery costs.  Or, at least that was the impetus behind the rule’s creation – according to the Comments:

“The proposed new rule facilitates discovery and reduces privilege-review costs by limiting the circumstances under which the privilege or protection is forfeited, which may happen if the privileged or protected information or material is produced in discovery. The burden and cost of steps to preserve the privileged status of attorney-client information and trial preparation materials can be enormous. Under present practices, lawyers and firms must thoroughly review everything in a client’s possession before responding to discovery requests. Otherwise they risk waiving the privileged status not only of the individual item disclosed but of all other items dealing with the same subject matter. This burden is particularly onerous when the discovery consists of massive amounts of electronically stored information.”

In short, FRE 502 is designed to establish uniform, nationwide standards for waiver of attorney-client privilege and work product protection, with the main goal being to protect producing parties against the inadvertent disclosure of privileged materials or work product in either federal or state proceedings.  The salient section is subsection (b) which states that when a disclosure of privileged information is made in a federal proceeding or to a federal agency, the disclosure does not constitute a waiver if:

  1. the disclosure is inadvertent;
  2. the holder of the privilege or protection took reasonable steps to prevent disclosure; and
  3. the holder promptly took reasonable steps to rectify the error, including (if applicable) following Federal Rule of Civil Procedure 26(b)(5)(B).

The end game here is presumably to increasingly leverage automated review methodologies to save costs.  But facilitating this type of review without taking on unhealthy levels of risk means that claw-back provisions must be as airtight as possible to guard against inadvertent productions of electronically stored information (ESI).  And yet, exactly how FRE 502 will work in practice is up for debate since there isn’t any case law interpreting it yet.

One area that’s top of mind is how this new Rule will impact the recent decisions on e-discovery search, including the Victor Stanley case authored by Chief Magistrate Judge Grimm.  Since FRE 502 contains a core “reasonableness” prong in section (b) it’s likely that Grimm’s proclamation about e-discovery search will still be controlling.  Grimm fundamentally had to evaluate whether the producing party’s search protocols and procedures were in fact reasonable.

“Defendants, who bear the burden of proving that their conduct was reasonable for purposes of assessing whether they waived attorney-client privilege by producing the 165 documents to the Plaintiff, have failed to provide the court with information regarding: the keywords used; the rationale for their selection; the qualifications of M. Pappas and his attorneys to design an effective and reliable search and information retrieval method; whether the search was a simple keyword search, or a more sophisticated one, such as one employing Boolean proximity operators; or whether they analyzed the results of the search to assess its reliability, appropriateness for the task, and the quality of its implementation.” (footnotes omitted).

In Victor Stanley, the producing party wasn’t able to demonstrate reasonableness because they didn’t carefully craft their search strategy or conduct any sampling to make sure that the e-discovery search worked as designed.  This type of analysis would still seem to come into play under FRE 502 and so, as Grimm states, the use of either a best practices or collaborative approach to e-discovery would seem to be as important as ever.

Given that backdrop, it remains essential that parties “show their work” when it comes to e-discovery search.  Whether FRE 502 will really make parties feel safe enough to use automated review processes (thereby reducing costs) remains to be seen.  But this first step, which unifies standards and expectations, is at least a very positive one.

Demystifying Concept Search in Electronic Discovery

Tuesday, October 28th, 2008

Concept or content search continues to be a hot topic within the e-discovery community.  There’s a continuous stream of articles that discuss it.  Some that point out the positive.  Others that point out the limitations.  The courts have also gotten involved in the discussion.  Judge Grimm refers to concept search in e-discovery in Victor Stanley, Inc. v. Creative Pipe, Inc., 2008 WL 2221841 (D. Md. May 29, 2008).  Judge Facciola discusses concept search in Disability Rights Council of Greater Washington v. Washington Metropolitan Transit Authority, 242 F.R.D. 139 and other opinions.  Despite (or maybe because of) all the commentary on this topic, I find that while a lot of people think that concept search in e-discovery is good, many are not fully sure of exactly what concept search is, and how it is practically useful in e-discovery.   It’s pretty clear that after several years of commentary and hype, concept search has become something of a buzzword associated with many myths and misconceptions.  In an effort to better understand what concept search is and how it can help in e-discovery, I want to dispel two of the most common myths I have heard.

The “Concept Search is Concept Search” Myth

The first myth around concept search actually revolves around what it is.  In my experience, people tend to lump two different technologies together when talking about concept search: concept search and concept categorization.  It’s very common, for example, to see commentators say concept search even when what they are really talking about is concept categorization.  To make matters more confusing, people also use a plethora of other names including content search, content clustering or concept clustering when what they really mean is concept categorization.

So, what are the differences between concept search and concept categorization?  First, let’s start with concept search.  Concept search technologies find documents containing “concepts”.  I think that the Sedona Conference’s “Best Practices Commentary on the Use of Search & Information Retrieval Methods in E-Discovery“, provides a good definition of “concept” when used in a search context: “the combination of [a] query term and the additional terms identified by the thesaurus.”  In other words, concept search technologies find documents containing a specified term plus additional terms with similar meanings derived from a thesaurus.
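
To make that definition concrete, here is a minimal sketch of thesaurus-based concept search. It is only an illustration under stated assumptions: the thesaurus entries, the sample documents and the function names (`expand_concept`, `concept_search`) are all hypothetical, and real products use far larger thesauri and indexes.

```python
import re

# Hypothetical thesaurus: each query term maps to terms with similar meanings.
THESAURUS = {
    "car": {"automobile", "vehicle"},
    "fire": {"terminate", "dismiss"},
}

# Hypothetical sample documents, keyed by document id.
DOCUMENTS = {
    "doc1": "The company car was sold last year.",
    "doc2": "Each employee may lease one automobile.",
    "doc3": "Quarterly revenue exceeded the forecast.",
}


def tokenize(text):
    """Lowercase a document and return its set of words."""
    return set(re.findall(r"[a-z0-9']+", text.lower()))


def expand_concept(term, thesaurus):
    """A 'concept' here is the query term plus the terms the thesaurus lists for it."""
    term = term.lower()
    return {term} | thesaurus.get(term, set())


def concept_search(term, documents, thesaurus):
    """Return ids of documents containing the query term or any related term."""
    concept = expand_concept(term, thesaurus)
    return sorted(
        doc_id for doc_id, text in documents.items() if tokenize(text) & concept
    )


print(concept_search("car", DOCUMENTS, THESAURUS))
```

A plain keyword search for “car” would only find doc1; the thesaurus is what pulls in doc2.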

Concept categorization, on the other hand, is actually not a search technology at all.  Concept categorization technologies do not “find” documents.  Rather, they categorize or group documents based on their similarity.   There are many different ways to group documents based on similarity.  Techniques include statistical (which assesses similarity based on word frequency), Bayesian classification (which weights words differently depending on factors in addition to statistical frequency, such as where the terms appear in a document), and semantic indexing (which takes into account the fact that many words used in a similar context may have a similar meaning).  It would take more time to describe these technologies in detail but the Sedona commentary has a good summary of these different technologies if you are interested in learning more.
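
As a rough illustration of the statistical flavor only, the sketch below groups documents by comparing their word frequencies. The cosine similarity measure, the greedy grouping rule, the 0.3 threshold and the sample documents are all my own assumptions for the example, not a description of how any particular product works.

```python
import math
import re
from collections import Counter


def term_frequencies(text):
    """Count how often each word appears in a document."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine_similarity(a, b):
    """Similarity of two word-frequency vectors, from 0 (no shared words) to 1."""
    dot = sum(a[word] * b[word] for word in a.keys() & b.keys())
    norm_a = math.sqrt(sum(count * count for count in a.values()))
    norm_b = math.sqrt(sum(count * count for count in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def categorize(documents, threshold=0.3):
    """Greedy grouping: a document joins the first group whose first member is
    similar enough, otherwise it starts a new group. No query is involved."""
    vectors = {doc_id: term_frequencies(text) for doc_id, text in documents.items()}
    groups = []
    for doc_id in documents:
        for group in groups:
            if cosine_similarity(vectors[doc_id], vectors[group[0]]) >= threshold:
                group.append(doc_id)
                break
        else:
            groups.append([doc_id])
    return groups


# Hypothetical sample documents.
DOCS = {
    "a": "pipe fitting leak pipe warranty",
    "b": "warranty claim for leaking pipe fitting",
    "c": "holiday party schedule and menu",
    "d": "menu for the holiday party",
}

print(categorize(DOCS))
```

Notice that nothing is being “found” here: the user never supplies a search term, and every document ends up in some group.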

As should now be apparent, these technologies are very different and using the same words to describe them is confusing.  It’s why it’s not surprising that a lot of the users of e-discovery services and software don’t have a strong understanding of what these technologies are or what benefits they can actually provide in practice.  Dispelling the myth that they can be lumped together is a critical first step in any conversation about concept search and how it can help in e-discovery.  This leads us to a second myth, that Concept Search is better than Keyword Search.  I’ll discuss this in my next blog post.

Why Transparent Search In E-Discovery Is The Answer To Victor Stanley

Tuesday, August 26th, 2008

In my last post, I discussed how the “black box” design of enterprise search engines makes it challenging to defensibly use keyword search in e-discovery and follow Judge Grimm’s guidance in Victor Stanley, Inc. v. Creative Pipe, Inc., 2008 WL 2221841 (D. Md. May 29, 2008).  In Victor Stanley, Judge Grimm notes that because keyword search technology is prone to producing over- and under-inclusive results, attorneys using keyword search should adopt one of two approaches: either collaborate with the opposing party to agree on keyword search methodology, or utilize best practices that demonstrate they have taken reasonable measures to reduce over- and under-inclusiveness.  However, the black box search technologies that are used in e-discovery today make following this guidance difficult.  They can’t reduce under-inclusiveness without increasing over-inclusiveness.  And they make it expensive to utilize collaborative or best practices methodologies including testing, sampling, refining and documenting searches.  All of which begs an obvious question: what can be done to improve search for e-discovery?

In my opinion, the answer is simple: e-discovery search needs to become more transparent.  Instead of being forced to feed one search query at a time into a “black box” search engine and then getting results  with no idea how those results were generated, lawyers and litigation support professionals need technology that provides them with greater visibility into the search process. They need to understand how the results were obtained, so they can reduce both the over- and under-inclusiveness of keyword search, and easily follow Judge Grimm’s advice to improve the defensibility of their search methodology.

A transparent search solution should have four key elements:

  1. Transparent query expansion. Query expansion is the process by which search engines take the query that the user submitted and expand or convert it into a new and improved form.  Wildcard, stemming, concept and fuzzy searches all follow this query expansion process.  For example, the search “divers*” would be expanded to search for all the words that start with “divers” in the data set, such as “diverse,” “diversity,” “diversion,” “diversification,” etc.  In transparent search, query expansion would be exposed to users, allowing them to include or exclude expanded keywords. To continue with the previous example, a user that is searching for documents related to diversity would then have the ability to exclude false positive expanded terms, such as “divers”, “diversion,” and “diversification” from the search.  Making query expansion transparent can significantly reduce the over-inclusiveness of keyword search.  It also makes it practical to use technologies, such as concept and fuzzy search, that have not been used to date because of their complexity and tendency to produce massively over-inclusive results.
  2. Multiple query support. When a search contains multiple keyword queries, such as “hiring” and “interview,” transparent search should provide visibility into the results for each individual query as well as the combination of all the queries. For example, with the search “hiring OR interview,” users should have separate visibility into the results for “hiring” and “interview” as well as “hiring OR interview.”  They should know that out of the 100 documents that match “hiring OR interview”, only 5 match interview and 95 match hiring.  This kind of visibility is critical if you want to either collaborate or follow search testing, sampling, and refinement best practices when there are a large number of queries.
  3. Rapid sampling. Transparent search should support the ability to rapidly sample the results from all of the individual queries, such as “hiring” and “interview”, contained within a search. It should also be easy to take a random sample of non-matching documents in order to assess whether one or more searches have identified as many of the relevant documents as possible.  As Judge Grimm states in Victor Stanley when assessing keyword searches used to find privileged documents, “The only prudent way to test the reliability of the keyword search is to perform some appropriate sampling of the documents determined to be privileged and those determined not to be in order to arrive at a comfort level that the categories are neither over-inclusive nor under-inclusive.”
  4. Automated documentation. Transparent search technology needs to document all aspects of the search process including (but not limited to) any keyword that has been excluded during transparent query expansion, the combined results of a search containing multiple individual queries, and the results for each of the individual queries within that search.  Automatically documenting the search methodology used and the results obtained is critical so that users can “show their work” if their search methodology is ever called into question.
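
To show how some of these elements might fit together, here is a minimal sketch covering exposed query expansion, sampling of non-matching documents, and an automatically generated search record. Everything in it is an assumption made for illustration: the function names, the shape of the log, and the six sample documents are hypothetical, and the wildcard matching simply leans on Python’s standard `fnmatch` module.

```python
import fnmatch
import random
import re


def tokenize(text):
    """Lowercase a document and return its set of alphabetic words."""
    return set(re.findall(r"[a-z]+", text.lower()))


def expand_wildcard(pattern, vocabulary):
    """List every term in the data set that the wildcard pattern matches."""
    return sorted(t for t in vocabulary if fnmatch.fnmatchcase(t, pattern.lower()))


def transparent_search(pattern, documents, excluded=()):
    """Run a wildcard search, exposing the expansion and recording what was done."""
    tokens = {doc_id: tokenize(text) for doc_id, text in documents.items()}
    vocabulary = set().union(*tokens.values())
    expanded = expand_wildcard(pattern, vocabulary)
    kept = [t for t in expanded if t not in excluded]
    hits = sorted(d for d, toks in tokens.items() if toks.intersection(kept))
    # The log is the "show your work" record of this search.
    log = {
        "query": pattern,
        "expanded_terms": expanded,
        "excluded_terms": [t for t in expanded if t in excluded],
        "searched_terms": kept,
        "hit_count": len(hits),
    }
    return hits, log


def sample_non_matching(documents, hits, size, seed=0):
    """Draw a reproducible random sample of documents the search did not return."""
    misses = sorted(set(documents) - set(hits))
    return random.Random(seed).sample(misses, min(size, len(misses)))


# Hypothetical sample documents.
DOCUMENTS = {
    "d1": "Our diversity program launched in May.",
    "d2": "We hired a diverse team.",
    "d3": "The divers surfaced near the pier.",
    "d4": "Traffic diversion planned for Main Street.",
    "d5": "Portfolio diversification reduces risk.",
    "d6": "Lunch menu for Friday.",
}

hits, log = transparent_search(
    "divers*", DOCUMENTS, excluded=("divers", "diversion", "diversification")
)
print(log)
print(sample_non_matching(DOCUMENTS, hits, size=2))
```

Without the exclusions, “divers*” matches five of the six documents here; with them, only the two that are actually about diversity remain. A reviewer can then check the sampled misses by hand to see whether anything relevant slipped through.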

Benefits of Transparent Search

By addressing the main technology challenges of keyword search, transparent search provides significant benefits to attorneys and litigation support professionals using search for e-discovery. First, parties that adopt transparent search can improve the defensibility of their e-discovery search practices. By enabling iterative testing, sampling and refinement, transparent search allows users to adopt the approaches recommended by Judge Grimm when it was previously impractical to do so.  At the end of the day, this means less risk.

Second, the use of transparent search can substantially reduce downstream production and review costs by removing false positives. For example, it is not uncommon for certain wildcard searches to generate results where 20-40% of the included documents are false positives that can be removed by transparent query expansion.  This can result in thousands of dollars of savings on a single search query.
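
For a back-of-the-envelope sense of the arithmetic, the sketch below applies the 20-40% range to purely hypothetical inputs. The 10,000-document hit count and the $1.50 per-document review cost are assumptions invented for the example, not figures from any actual matter.

```python
def review_savings(hit_count, false_positive_rate, cost_per_document):
    """Return (documents removed, review dollars saved) for one search query."""
    removed = round(hit_count * false_positive_rate)
    return removed, removed * cost_per_document


# Hypothetical inputs: 10,000 hits, $1.50 to review each document.
for rate in (0.20, 0.40):
    removed, saved = review_savings(10_000, rate, 1.50)
    print(f"{rate:.0%} false positives: {removed} documents removed, ${saved:,.2f} saved")
```

Under those assumed inputs, a single query would spare the review team between 2,000 and 4,000 documents.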

Finally, transparent search can dramatically reduce the time and cost required to complete the search and culling stage of e-discovery. Currently, it can take hundreds of hours to run a significant number of searches one at a time, document the results of each search, and sample and refine each individual query. With transparent search, running multiple queries and documenting each of the individual results takes minutes. Sampling each of the individual queries takes seconds.

When it comes to e-discovery search, it’s important to recognize that there are no “silver bullets.”  Search will remain an imperfect science with the possibility of over- and under-inclusive results.  But equally, there is no doubt that search remains the best solution for reducing the vast quantities of electronic information that are a part of every e-discovery process down to a reasonable level for human review. While attorneys and litigation support professionals can’t completely remove the imperfections of keyword search, they can, with transparent search, take action to minimize the impact of these imperfections and defensibly meet the requirements of new case law.  In doing so, they will be able to turn their attention to where it should be: the substance of the case.

Judge Grimm, Victor Stanley, And The Problem Of “Black-Box” E-Discovery Search

Friday, August 22nd, 2008

Judge Paul Grimm’s recent opinion in Victor Stanley, Inc. v. Creative Pipe, Inc., 2008 WL 2221841 (D. Md. May 29, 2008) provides valuable guidance on one of the most important issues in e-discovery: how to conduct keyword searches in a defensible manner given that keyword searches are prone to produce over- and under-inclusive results.  The ruling suggests one of two approaches: either producing parties should adopt a “collaborative” approach to conducting keyword searches, whereby each party agrees on a search methodology; or, they should use a “best practices” approach, such as the one suggested by Sedona, where the producing party tests, samples, and iteratively refines searches so that they can demonstrate they have taken reasonable measures to reduce over- and under-inclusive results.

While the guidance is clear, following the guidance in practice is very difficult.  The primary reason for this is that the search technology being used in e-discovery today is not up to the task.  Specifically, today’s search technology suffers from three problems:

  1. The over- and under-inclusive tradeoff. Many technologies have been developed to address the tendency of keyword searches to miss relevant documents and produce under-inclusive results.  Wildcard and stemming technology has been developed in order to address the issue of finding common word variations in specified keywords.  Concept search has been designed to find documents containing words with similar meanings to the keywords in a search.  And fuzzy search technologies have been put in place to find misspellings of words. However, all of these suffer from the same problem: they produce too many non-relevant or “false positive” documents thus driving up the cost of review. For example, if someone runs the wildcard search “divers*”, then he or she not only gets the desired documents containing “diverse” and “diversity”, but also gets a large number of false positive documents containing “diversion”, “diversification”, and so on.  In the case of concept and fuzzy search, the problem is so great that these technologies to date have rarely been used in e-discovery.
  2. Too expensive to test, sample and refine searches. Today’s search technologies are largely designed to run one search at a time, not the dozens of searches that are typical in e-discovery. As a result, anyone trying to follow the best practices of testing, sampling, and refining each search will find themselves missing deadlines and running over budget because it takes so long. This also makes collaboration with the opposing party close to impossible, since there’s little time to iterate on – and agree upon – a set of keyword searches.
  3. Manual documentation. It’s not enough for producing parties to use best practices; they have to document them so that they can “show their work” to the court. Currently, documenting the search refinement process is mostly manual, with the result that it is either done inadequately or not at all.

The reason why the search technology used for e-discovery has these problems is surprisingly simple: it’s because the technology was not designed for e-discovery in the first place. Rather, it was built for enterprise search, and was only later repurposed towards e-discovery.

The “Black Box” Of Enterprise Search

The core issue is that enterprise search technology has been designed to be a “black box”. Users enter a single search query into one end, and get results at the other, with no visibility into what happens in between. Going back to our previous example, when a user searches for “divers*” intending to find documents related to “diversity” or “diverse”, enterprise search engines give the user no visibility into the crucial step of query expansion and how it expands the search query into relevant and non-relevant terms like “diversion” and “diversification”. As a result, the user has no ability to minimize the false positives.

In the same vein, when a user enters multiple queries into a “black box” enterprise search engine, all of the queries run as a single search, and the user has no visibility into which results are associated with which query. For example, a user that searches for “hiring OR interview” will get the results for the combination of the queries “hiring” and “interview”. He or she won’t know that only 5 documents contained “hiring” while 100 documents contained “interview.”  This limitation makes analyzing, sampling and refining searches costly and time consuming.
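
The missing per-query visibility is easy to picture with a small sketch. This is only an illustration under my own assumptions: the function name `per_query_hits` and the four sample documents are hypothetical, and the matching is a simple whole-word lookup rather than anything a real search engine does.

```python
import re


def tokenize(text):
    """Lowercase a document and return its set of alphabetic words."""
    return set(re.findall(r"[a-z]+", text.lower()))


def per_query_hits(keywords, documents):
    """Return the hits for each keyword separately, plus the OR-combined hits."""
    tokens = {doc_id: tokenize(text) for doc_id, text in documents.items()}
    by_keyword = {
        keyword: sorted(d for d, toks in tokens.items() if keyword.lower() in toks)
        for keyword in keywords
    }
    combined = sorted(set().union(*by_keyword.values()))
    return by_keyword, combined


# Hypothetical sample documents.
DOCUMENTS = {
    "m1": "Hiring freeze begins next quarter.",
    "m2": "Please schedule the interview for Tuesday.",
    "m3": "The hiring manager will lead the interview.",
    "m4": "Parking garage closed on Friday.",
}

by_keyword, combined = per_query_hits(["hiring", "interview"], DOCUMENTS)
for keyword, hits in by_keyword.items():
    print(keyword, len(hits), hits)
print("hiring OR interview", len(combined), combined)
```

A black box engine would hand back only the combined list; the per-keyword breakdown is what lets a user see which query is doing the work and which one needs refining.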

That’s not to say that enterprise search products like Autonomy or Endeca are flawed. Far from it.  Their “black box” design works exceedingly well for the simple and quick queries that people want to run across the enterprise for general business purposes. If a sales manager is looking for a single proposal for her meeting the following day, then she doesn’t care how the search was performed or if it’s over-inclusive.  She’s only interested in the first page of relevant results, and for that use case enterprise search engines do a great job.

But e-discovery is a whole different world.  In e-discovery, users typically must review every single document in the search results, not just the most relevant ones.  As a result, over-inclusive searches can dramatically increase the costs of downstream production and review.  And under-inclusive searches raise the issue of defensibility.  Finally, e-discovery users have to run a lot of search queries and understand which documents are associated with each of those queries.

So, going back to the original problem, if current search technologies cannot help lawyers and litigation support professionals follow Judge Grimm’s guidance and address the “well-known limitations” of keyword search, what can? That will be the subject of my next post.