Posts Tagged ‘Sedona Conference’

Top Ten Trends in Electronic Discovery

Wednesday, November 11th, 2009

Since I’ve finished off the last of the Halloween candy and tossed out the moldy, squirrel-ravaged pumpkins, it occurred to me that now might be a good time to think about what 2010 will hold for the electronic discovery industry.  My 2009 list seems to have been fairly prescient and many of those notions still hold true since the legal industry (as we know) doesn’t move at the most blistering pace.

Again, doing my best Nostradamus impersonation, here are my top ten trends for 2010:

  1. Early case assessment (ECA) moves from a “nice to have” to a “must have” requirement for any matter involving electronically stored information (ESI).  In 2009, we saw ECA move into the mainstream as a methodology to quickly understand case facts, assess risk and lower both review and data processing costs.  But, in 2010, with the advancement of the tools and the increased socialization within the bar and the litigation support community, ECA will graduate into a core methodology for savvy litigators regardless of matter type or size.
  2. Appetites for broad information lifecycle management initiatives diminish as organizations realize these programs are far too complex to solve specific pain points, and they often take too much time (measured in years) to execute.  The economic reality is that these holistic, cross data, cross enterprise pipe dreams really can’t demonstrate the ROI that’s needed in today’s challenging economy.
  3. Staffing roles continue to evolve with a newfound focus on project management. The in-house e-discovery coordinator will emerge as more of a project manager and analyst than a pure legal or IT role. This shift will become increasingly necessary as e-discovery evolves from an ad-hoc fire drill to a standard business process that is repeatable, measurable, and defensible.
  4. Data analytics and statistical methodologies gain traction to augment the type of subjective decision making approaches that have historically formed the backbone of the e-discovery search and review processes.  These objective methodologies have long been held up as best practices by the likes of the Sedona Working Group. In 2010, they will start to move from theory to practice as e-discovery tools increasingly move in-house and departments enhance defensibility by adding elements such as sampling into the workflow.
  5. Platform e-discovery solutions finally become a reality as customers graduate from painfully stitching point solutions together, thus requiring fewer physical document hand-offs (i.e., exports and imports) between applications, cutting costs and lowering the risk of data loss.
  6. Associate-based review gradually goes extinct, as both clients and law firms tire of expensive, linear review processes.  More review work is either insourced or managed with specialized contract attorneys, who are both cheaper and better trained for this type of work.
  7. Similarly, FRE 502 and “clawback” agreements will be increasingly used to reduce the need for any manual, eyes-on review, although many litigators will resist this trend because of the fears of “un-ringing the bell” when privileged information is disclosed in any context.
  8. While perhaps anathema, alternatives to the much lauded EDRM model will gain traction, as practitioners strive to find an even better, and perhaps more practical, project management framework, in many cases acknowledging the role that the EDRM has taken in forming *the* lingua franca of the e-discovery industry.
  9. The push for cooperation in the e-discovery process will make incremental progress despite reticence by old school litigators.  Increasingly, this type of cooperation, as strongly advocated by the Sedona Working Group, will ironically be forced by judges and local rules.
  10. “Cloud” computing starts to really impact how e-discovery data preservation/collection is done, both in terms of social media and traditional ESI.  More and more companies block social media applications and file types in the workplace because of fears surrounding the inability to preserve and collect.

7th Circuit Launches an Electronic Discovery Pilot Program

Thursday, October 15th, 2009

Recently, I attended the Sedona Conference’s annual meeting in Atlanta where, amongst other interesting topics, there was a discussion of local rules developments and, in particular, the Seventh Circuit’s new Electronic Discovery Pilot Program (“Pilot Program”).  The Pilot Program was launched October 1, 2009 and seems to be a model for collaboration, since it was developed by eliciting input from a number of disparate groups:

“(a) continuing comments by business leaders and practicing attorneys, regarding the need for reform of the civil justice pretrial discovery process in the United States, (b) the release of the March 11, 2009 Final Report on the Joint Project of the American College of Trial Lawyers Task Force on Discovery (“Task Force”) and the Institute for the advancement of the American Legal System at the University of Denver (“IAALS”), and (c) The Sedona Conference® Cooperation Proclamation.”

The impetus for the Pilot Program was the “broken” nature of the electronic discovery process, along with the belief that better collaboration and cooperation would certainly help remediate the situation.

“The goal of the Principles is to incentivize early and informal information exchange on commonly encountered issues relating to evidence preservation and discovery, paper and electronic, as required by Rule 26(f)(2). Too often these exchanges begin with unhelpful demands for the preservation of all data, which often are followed by exhaustive lists of types of storage devices. Such generic demands lead to generic objections that similarly fail to identify specific issues concerning evidence preservation and discovery that could productively be discussed and resolved early in the case by agreement or order of the court. As a result, the parties often fail to focus on identifying specific sources of evidence that are likely to be sought in discovery but that may be problematic or unduly burdensome or costly to preserve or produce.”

What I really like about the Pilot Program is that it strives to be both prescriptive and practical, which should hopefully avoid the type of ambiguity often exploited by obstreperous counsel.  For example, there is an entire section on early case assessment (ECA) principles, which require discussion of:

  • Production issues
  • Identification of electronically stored information (ESI)
  • The scope of preservation
  • The meet & confer process

There’s also the relatively novel requirement that counsel designate an e-discovery “liaison” to work with the parties to coordinate and flesh out germane e-discovery issues.  Regardless of whether the e-discovery liaison is an attorney, a third party consultant, or an employee of the party, the e-discovery liaison(s) must:

“(a) be prepared to participate in e-discovery dispute resolution;

(b) be knowledgeable about the party’s e-discovery efforts;

(c) be, or have reasonable access to those who are, familiar with the party’s electronic systems and capabilities in order to explain those systems and answer relevant questions; and

(d) be, or have reasonable access to those who are, knowledgeable about the technical aspects of e-discovery, including electronic document storage, organization, and format issues, and relevant information retrieval technology, including search methodology.”

Needless to say, this requirement alone should make marked improvements in the e-discovery dialogue, which unfortunately seems like it’s occurring (literally) among participants who both speak different languages and don’t realize it.

Finally, what makes the Pilot Program unique is that its Principles will be subjected to testing during the phases of the Pilot Program, which is scheduled to end on May 1, 2010 (for the first phase).

This project certainly seems like it’s on the right track and, pending feedback from the bench and bar, it could serve as a model for local jurisdictions everywhere.

Electronic Discovery Services: The Price is Right?

Wednesday, June 17th, 2009

Maybe this will show my age, but I’ve been around the electronic discovery business since the days when pricing was both simple and very expensive. Terabytes were at the mythical high-end of the spectrum and gigabytes of “e-docs” (not “ESI”) cost $3,000 – $4,000 to process. Understandably (and fortunately for most), pricing models have evolved, thanks in part to more educated consumers and initiatives such as Sedona’s RFP + Vendor Panel.

Leaving the WABAC machine and moving into present times, we’re starting to see some variance from traditional pricing models that primarily focus on data “into” the processing machine. More and more companies (such as Kroll Ontrack) are moving to models that price on data “out” of the process. Since that’s a bit nebulous, an example might illustrate:

Traditionally, in a somewhat simplified fashion, an electronic discovery project would be priced by the amount of data in the initial corpus (say 100 gigabytes) and processing would be priced at $500 a gigabyte (for round numbers purposes). Leaving out the sometimes significant caveat that the 100 gigabytes would likely increase due to expansion of compressed files, this would mean that the bulk of the project expenses would be $50,000 ($500 x 100), plus relatively nominal costs for monthly hosting and user access rights.

At the end of the day, after elimination of system files, deduplication and application of search terms (reducing the initial corpus by say 70% collectively) there would be 30 gigabytes remaining for hosting and possible production, both of which are most often priced separately.

Given rampant commoditization there’s an arms race underway among certain service providers where they’re now changing the above model to give away initial processing as a loss leader – pricing only on the data that comes out the end of the processing/search step. In this approach the above workflow would largely stay the same, but the vendor would charge a higher rate for what ultimately is hosted on the back-end. If this back-end fee was $2,000 per resulting gigabyte and the same 30 gigabytes was seen out the back end, then the customer would pay $60,000 for the project. But, if the deduplication, searching, culling, etc. was more effective (at say 80%) then the resulting 20 gigabytes would only cost $40,000.
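
To make the comparison concrete, here is a minimal sketch of the two models using the round numbers above. The function names are mine, and I'm deliberately ignoring hosting fees, user access fees, and data expansion, so treat it as a back-of-the-envelope illustration rather than any vendor's actual rate card.

```python
def input_based_cost(corpus_gb, rate_per_gb_in):
    """Traditional model: pay for every gigabyte that goes into processing."""
    return corpus_gb * rate_per_gb_in


def output_based_cost(corpus_gb, cull_pct, rate_per_gb_out):
    """Back-end model: processing is free; pay only for what survives culling.

    cull_pct is a whole-number percentage (e.g. 70 for 70%) to keep the
    arithmetic exact for the round numbers used here.
    """
    remaining_gb = corpus_gb * (100 - cull_pct) / 100
    return remaining_gb * rate_per_gb_out


corpus_gb = 100  # size of the initial corpus in the example
print(f"Priced on data in: ${input_based_cost(corpus_gb, 500):,.0f}")
for cull_pct in (70, 80):
    cost = output_based_cost(corpus_gb, cull_pct, 2000)
    print(f"Priced on data out ({cull_pct}% culled): ${cost:,.0f}")
```

With these assumed rates, the traditional model comes to $50,000, while the back-end model comes to $60,000 at 70% culling and $40,000 at 80%.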

The question then, as Clint Eastwood would put it, is: “Do you feel lucky?” This pricing model forces attorneys and litigation support managers to guesstimate what culling, search, and de-duplication rates they’ll likely get on the data corpus. Guess right and they save the end client money, guess wrong and they’re way over budget.
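
One way to take some of the guesswork out is to compute the culling rate at which the two models cost the same. The sketch below uses the hypothetical $500 (data in) and $2,000 (data out) rates from the example and, as a simplifying assumption, ignores hosting and other fees. The function name is mine.

```python
def break_even_cull_pct(rate_in, rate_out):
    """Culling percentage at which paying on data "out" equals paying on data "in".

    Input cost  = corpus_gb * rate_in
    Output cost = corpus_gb * (1 - cull) * rate_out
    Setting the two equal, corpus_gb cancels: cull = 1 - rate_in / rate_out.
    """
    return 100 * (1 - rate_in / rate_out)


threshold = break_even_cull_pct(rate_in=500, rate_out=2000)
print(f"Break-even culling rate: {threshold:.0f}%")
```

At those rates the break-even point is 75%: cull more than that and the data "out" model is cheaper, cull less and the traditional model wins. Note that the size of the corpus drops out of the calculation entirely.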

The dynamics of this purchasing decision are a bit atypical because the buyer (usually counsel) doesn’t pay the bills, so the decision can often be more vexing than most. When a direct consumer gambles on pricing, things will ideally balance out over time, with money being saved in some instances and overspent in others. But, when the buyer doesn’t pay the bills the motivation is less clear.

Thoughts run to Maslow’s hierarchy of needs to determine which pricing model is ultimately more compelling: (a) price certainty/adherence to budget, or (b) cost variability and the opportunity to save money. While it’s never good to understate the upside of saving money (Esteem), I think ultimately there’s a more fundamental need (Safety) to stay within budget and avoid the painful (sometimes client imperiling) call to discuss how a given e-discovery project has gone way over budget.

This calculation is made even more vexing because it not only pits the purchasing party against unknown data culling/searching rates, but it also puts the vendor in an ethical bind where they make less money if they’re supremely effective at data reduction, whereas if they’re either intentionally or accidentally beneficiaries of relatively little data reduction then they stand to capture a ton of upside.

It’s like you went to Vegas to gamble your kid’s college fund and on top of the already questionable house odds you knew that the dealer stood to profit by your losses. So, as for myself, no, I don’t feel lucky.

Adams v. Dell Questions Custodian-Based Retention and Litigation Hold Practices in Electronic Discovery

Thursday, May 28th, 2009

I was at the Sedona Conference Working Group’s Mid Year meeting last week where 80 or so electronic discovery practitioners and judges met to discuss hot topics in bucolic Denver, Colorado.  Without getting into the particulars of any discussion, several themes continue to stay on the front burner, including the progress of the cooperation proclamation and the relatively newer issue of proportionality (as highlighted recently by The American College of Trial Lawyers Task Force on Discovery).

Aside from those overarching themes, I was struck by how polarizing the discussion was around one recent case in particular.  While many notable commentators have already made this one of the most talked-about cases of the year, Phillip M. Adams & Assoc., LLC v. Dell, Inc., 2009 WL 910801 (D. Utah Mar. 30, 2009) continues to stimulate discussion.   Adams v. Dell is a patent infringement case where the plaintiff alleged that one of the defendants (ASUS) destroyed critical pieces of evidence and should be sanctioned accordingly.

The underlying facts and timelines are fairly complex, but in summary the dispute centered around the alleged infringement of several patents developed to resolve defects in floppy disks during the late 80’s.  What makes this decision so vexing is that it starts out as a preservation case, but quickly confuses that concept with data retention and information management practices/policies.

So, starting with the preservation angle…  Both sides fortunately agreed about the definition for the duty to preserve evidence, which in the 10th Circuit begins when a party “knows or should know [it] is relevant to imminent or ongoing litigation.”  The triggering of the preservation duty was, not surprisingly, much more complicated, and ASUS (the responding party) claimed that its duty to preserve wasn’t triggered until early 2005, when it received a letter warning it of potential litigation because of the alleged patent infringement.  But, the Magistrate held that “counsel’s letter is not the inviolable benchmark” and the duty to preserve was triggered much earlier (in the 1999-2000 time frame) because similar litigation was rampant in the industry, highlighted by a late 1999 suit where Toshiba paid billions of dollars in a class action settlement related to similar floppy disk issues.

Leaving the murky preservation issue by the wayside for a bit, the Magistrate then moved into ASUS’ claims that FRCP 37(e) provided a safe harbor for its alleged destruction.

“ASUS claims it can find a safe harbor against sanctions because of the recently adopted rule that sanctions may not be generally imposed for ‘failing to provide electronically stored information lost’ if a party can show the loss was ‘a result of the routine, good-faith operation of an electronic information system.’”

Nice try, but strike two for ASUS…

“ASUS provided an extensive declaration from an experienced consultant in e-discovery. While he stated the reasons for and history of ASUS’ ‘distributed information architecture,’ he did not state any opinion as to the reasonableness or good-faith in the system’s operation. And while he says ‘ASUSTeK’s data architecture relies predominantly on storage on individual user’s workstations,’ his 31-page declaration does not show he is familiar with the precise practices pointed out in the declarations of employees. Those employees’ declarations describe the practice of ASUS’ email system to overwrite old data regardless of its significance; ASUS’ reliance on employees for all email and data archiving; and the process of replacement of computers, which also relies on employees to transfer data from their old to their new computers. Neither the expert nor ASUS speak of archiving ‘policies;’ they speak of archiving ‘practices.’”

The court’s distinction between “policies” and “practices” seems like a convenient (perhaps “Deus ex machina”) way to discount ASUS’ data retention activities and prevent the use of the FRCP 37(e) safe harbor, since in most instances “bona fide, consistent and reasonable” document retention “policies” have been found to be presumptively valid by everyone ranging from Sedona (Guideline 3) to Carlucci v. Piper Aircraft Corp. and Arthur Andersen LLP v. United States, 125 S.Ct. 2129 (2005).  It’s not clear how the court draws the important “practices” distinction or why said practices are exponentially different from presumptively valid “policies.”

It’s precisely this line of thinking that confuses the alleged failure of the duty to preserve (discussed at the outset of the opinion) with the duty to retain information.  The court seems to think it’s an “unreasonable” practice to have custodians responsible for compliance with data retention and this deficiency made the safe harbor unavailable.

“ASUS has explained that it has no centralized storage of electronic documents, email or otherwise, and relies on individual employees to archive email (which will be deleted if left on the server) and electronic documents (which reside only on individual workstations).”

Not only is this custodian-based retention practice, in and of itself, reasonable; it’s probably the most common form of data retention practice seen at corporations today.  While a number of vendors have promised intelligent retention systems that work without any significant human intervention, for the most part those solutions are still in their infancy.  Additionally, there are significant technical challenges in having an application manage *all* ESI (Electronically Stored Information) that exists for a given custodian (including desktop files, instant messaging, text messaging, social media, etc.).  As such, most companies must inherently rely upon their custodians to both retain and preserve data pursuant to company policies.  The court not only seems to miss this point, but also attempts to impose an obligation that corporations must prevent the “loss of data” above and beyond specific preservation obligations.

“ASUS’ practices invite the abuse of rights of others, because the practices tend toward loss of data. The practices place operations-level employees in the position of deciding what information is relevant to the enterprise and its data retention needs. ASUS alone bears responsibility for the absence of evidence it would be expected to possess. While Adams has not shown ASUS mounted a destructive effort aimed at evidence affecting Adams or at evidence of ASUS’ wrongful use of intellectual property, it is clear that ASUS’ lack of a retention policy and irresponsible data retention practices are responsible for the loss of significant data.”

Although the exact rationale was unclear, the court held that ASUS violated its duty to preserve and that the loss of evidence could not be excused as a “routine, good faith operation of electronic information systems.” While the court ruled that sanctions were appropriate, it reserved final sanctions pending the close of discovery.   Depending on what those ultimate sanctions look like, it seems pretty likely that this decision will be subject to appellate review.  Until then, it’s probably too soon to treat this questionable holding as gospel.  Wary corporations, however, should continue to bolster the “reasonableness” of their information management/retention/destruction policies and practices so that in hindsight a court won’t be able to take away the FRCP 37(e) safe harbor by casting those “practices” as being unreasonable.

Top 5 Cases That Shaped Electronic Discovery in 2008

Friday, December 12th, 2008

Picking five out of the sea of electronic discovery cases isn’t as easy as it sounds.  Sure, a few, like our “Case of the Year” will be no-brainers, but others aren’t as clear cut.  And, they’re certainly open to debate.  But, in my humble opinion here’s THE list, counting down David Letterman style:

5) Mancia v. Mayflower Textile Servs. Co., 2008 WL 4595175 (D. Md. Oct. 15, 2008)

If there ever was an opinion written by a judge to make a larger societal point, Mancia was certainly it.  Judge Paul Grimm, who’ll appear on this list in another slot as well, has clearly taken the mantle from Judge Scheindlin as the leading electronic discovery jurist.  He’d heretofore authored a number of significant opinions in this area, including Hopson and Thompson. Now, in Mancia he used a garden-variety discovery dispute, which was typically rife with boilerplate objections and other obstreperous tactics, to highlight the Sedona Conference’s Cooperation Proclamation.

The lasting takeaway from the opinion is the notion that “[c]ourts repeatedly have noted the need for attorneys to work cooperatively to conduct electronic discovery, and sanctioned lawyers and parties for failing to do so.” To support this notion he cites the Sedona Conference Proclamation and the little used FRCP 26(g).  This opinion is noteworthy because it gives precedent to bolster the Sedona initiative and should provide a ready citation for all those counsel who aren’t getting the level of cooperation they need from the opposition.  It remains to be seen if other judges will follow suit, but this could be the beachhead for a more cooperative electronic discovery process in 2009 and beyond.

4) Flagg v. City of Detroit, 252 F.R.D. 346 (E.D. Mich. 2008)

Flagg highlights the growing need to reconcile the electronic discovery landscape, which typically focuses somewhat myopically on email, with the larger informational trends which are now characterized by the use of blogs, social networking sites, instant messaging, and text messaging.  Flagg was one of the first cases to determine that text messages (e.g., messages exchanged among certain officials and employees of the City of Detroit via city-issued text messaging devices) were discoverable under the standards of FRCP 26(b)(1).  The holding further demonstrated the challenges of conducting electronic discovery across information systems that mix personal information with business communications.  This type of information commingling will continue to escalate, causing significant long term electronic discovery challenges due to thorny privacy, privilege and policy implications.

3) Rhoads Indus., Inc. v. Bldg. Materials Corp. of Am., 2008 WL 4916026 (E.D. Pa. Nov. 14, 2008)

Rhoads is one of the first cases post Federal Rule of Evidence (FRE) 502, which recently created a national standard (versus the previous split in jurisdictions) and now states a “middle ground” for evaluating inadvertent disclosure during electronic discovery.  The key provision is (b)(2) which provides protection only if “the holder of the privilege or protection took reasonable steps to prevent disclosure.”  So, Rhoads took that “reasonableness” question head on in a scenario where the plaintiff Rhoads admittedly (yet inadvertently) produced over eight hundred privileged, electronic documents.  The decision is significant because it used the five-factor test stated in Fidelity, but put undue weight on the final factor, which was: “whether the overriding interests of justice would be served by relieving the party of its errors.”   This approach potentially threatens the development of sound case law that will be necessary to help the deployment of FRE 502 into practice because it casts too much uncertainty with its weighting of “fairness” (a problematically vague notion) in the analysis.  It will be interesting to see if/how this approach is subsequently adopted as we enter the New Year.

2) Qualcomm Inc. v. Broadcom Corp., 2008 WL 66932 (S.D. Cal. Jan. 7, 2008)

This for many was the case of the year given its far-reaching implications for the legal community.  Some have argued that this isn’t an e-discovery abuse case per se, but more of an example of discovery abuses that just so happened to be centered around ESI.  In either case, the fraud, resulting cover-up, sanctions, ethical issues and privilege discussions made for insightful and thought-provoking reading throughout 2008.  The lasting takeaway from Qualcomm appears to be the implications of not just committing discovery abuses, but the failure to have a well thought out e-discovery plan that is actively executed/monitored by outside counsel.  The resulting tension between outside counsel, inside counsel and the internal IT department may continue to escalate if more cases like this make the headlines in 2009.

1)  E-Discovery Case of the Year: Victor Stanley, Inc. v. Creative Pipe, Inc., 2008 WL 2221841 (D. Md. May 29, 2008)

Judge Grimm’s hallmark opinion has had the legal community buzzing over the past several months and the reason appears pretty straightforward.  In Victor Stanley, Grimm builds on the holdings in Seroquel, O’Keefe and Equity Analytics to boldly cast doubt on a practice so routine that it’s literally shocked the legal community into reevaluation:

“[D]etermining whether a particular search methodology, such as keywords, will or will not be effective certainly requires knowledge beyond the ken of a lay person (and a lay lawyer) . . . .”

The notion that electronic discovery search is beyond the ability of most attorneys has caused tremors within the litigation support community, which had a long history of blindly receiving keywords from counsel, running them and turning the results back over, often blissfully unaware of the extent to which those keyword searches actually located relevant information.  Victor Stanley’s analysis of the “reasonableness” of search protocols also has an impact on FRE 502 and therefore cements its place alongside other e-discovery “must reads” such as Zubulake and Morgan Stanley.

The cases above are my Top 5.  What additional cases do you think were important?  Please let me know by commenting on the cases you think shaped electronic discovery in 2008 and why.

The Sedona Cooperation Proclamation and the Case for Collaboration

Monday, November 17th, 2008

Without getting in Dutch with the key Sedona Conference principle that “what happens at Sedona, stays at Sedona” I thought I’d nevertheless write a post that focuses on the core topic at this year’s annual meeting, namely the case for cooperation in e-discovery.

According to the “Cooperation Proclamation” e-discovery is facing an unprecedented crisis:

“The costs associated with adversarial conduct in pre-trial discovery have become a serious burden to the American judicial system. This burden rises significantly in discovery of electronically stored information (“ESI”). In addition to rising monetary costs, courts have seen escalating motion practice, overreaching, obstruction, and extensive, but unproductive discovery disputes – in some cases precluding adjudication on the merits altogether – when parties treat the discovery process in an adversarial manner. Neither law nor logic compels these outcomes. With this Proclamation, The Sedona Conference launches a national drive to promote open and forthright information sharing, dialogue (internal and external), training, and the development of practical tools to facilitate cooperative, collaborative, transparent discovery.”

These sentiments about the “broken” nature of the discovery process echo in many ways the draft findings from the Interim Report & 2008 Litigation Survey from the Fellows of the American College of Trial Lawyers which stated:

“The joint study grew out of a concern that discovery is increasingly expensive and that the expense and burden of discovery are having substantial adverse effects on the civil justice system. There is a serious concern that the costs and burdens of discovery are driving litigation away from the court system and forcing settlements based on the costs, as opposed to the merits, of cases.”

In both instances, the core notion is that “we’ve met the enemy and the enemy is us” because it’s the participants in the process who have collectively perverted the discovery process to the point it’s at today.

Sedona’s focus on this front has received at least some traction from the bench, as echoed in Mancia v. Mayflower Textile Servs. Co., 2008 WL 4595175 (D. Md. Oct. 15, 2008).  Mancia, written by leading e-discovery jurist Judge Grimm, was a fairly pedestrian employment litigation case where the parties had come to loggerheads over the e-discovery process.  Judge Grimm held that “[c]ourts repeatedly have noted the need for attorneys to work cooperatively to conduct discovery, and sanctioned lawyers and parties for failing to do so” citing both the Sedona Cooperation Proclamation and the Survey.

Judge Grimm also observed that these recent lamentations about the costs of civil litigation aren’t terribly dissimilar to those voiced eighteen years ago when the Civil Justice Reform Act of 1990, 28 U.S.C. §§ 471 et seq., was passed:

“Perhaps the greatest driving force in litigation today is discovery. Discovery abuse is a principal cause of high litigation transaction costs. Indeed, in far too many cases, economics-and not the merits-govern discovery decisions. Litigants of moderate means are often deterred through discovery from vindicating claims or defenses, and the litigation process all too often becomes a war of attrition for all parties.”

Given the fundamentally adversarial nature of litigation, the Sedona initiative is either dramatically ambitious or simply tilting at windmills.  While generally a skeptic by nature, I think that the bench’s early participation and downstream behavior modification is the linchpin to reforming the litigating masses.  Given the long term “sales” cycle involved here, I doubt if we’ll know whether this effort will gain real traction for at least several years.

Demystifying Concept Search in Electronic Discovery

Tuesday, October 28th, 2008

Concept or content search continues to be a hot topic within the e-discovery community.  There’s a continuous stream of articles that discuss it.  Some point out the positives; others point out the limitations.  The courts have also gotten involved in the discussion.  Judge Grimm refers to concept search in e-discovery in Victor Stanley, Inc. v. Creative Pipe, Inc., 2008 WL 2221841 (D. Md. May 29, 2008).  Judge Facciola discusses concept search in Disability Rights Council of Greater Washington v. Washington Metropolitan Transit Authority, 242 F.R.D. 139 and other opinions.  Despite (or maybe because of) all the commentary on this topic, I find that while a lot of people think that concept search in e-discovery is good, many are not fully sure of exactly what concept search is, and how it is practically useful in e-discovery.   It’s pretty clear that after several years of commentary and hype, concept search has become something of a buzzword associated with many myths and misconceptions.  In an effort to better understand what concept search is and how it can help in e-discovery, I want to dispel two of the most common myths I have heard.

The “Concept Search is Concept Search” Myth

The first myth around concept search actually revolves around what it is.  In my experience, people tend to lump two different technologies together when talking about concept search: concept search and concept categorization.  It’s very common, for example, to see commentators say concept search even when what they are really talking about is concept categorization.  To make matters more confusing, people also use a plethora of other names including content search, content clustering or concept clustering when what they really mean is concept categorization.

So, what are the differences between concept search and concept categorization?  First, let’s start with concept search.  Concept search technologies find documents containing “concepts”.  I think that the Sedona Conference’s “Best Practices Commentary on the Use of Search & Information Retrieval Methods in E-Discovery” provides a good definition of “concept” when used in a search context: “the combination of [a] query term and the additional terms identified by the thesaurus.”  In other words, concept search technologies find documents containing a specified term plus additional terms with similar meanings derived from a thesaurus.
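To make the definition concrete, here is a minimal sketch of thesaurus-based concept search. It is only an illustration under stated assumptions: the tiny `THESAURUS`, the sample `DOCS`, and the function names `expand_query` and `concept_search` are all hypothetical, and real e-discovery tools may build and apply their thesauri quite differently.

```python
import re

# Hypothetical toy thesaurus (an assumption for illustration only).
THESAURUS = {
    "car": {"automobile", "vehicle"},
    "contract": {"agreement"},
}


def expand_query(term, thesaurus=THESAURUS):
    """Return the 'concept': the query term plus its thesaurus terms."""
    term = term.lower()
    return {term} | thesaurus.get(term, set())


def concept_search(term, documents, thesaurus=THESAURUS):
    """Return the ids of documents containing any term in the concept."""
    concept = expand_query(term, thesaurus)
    hits = []
    for doc_id, text in documents.items():
        # Split the document into lowercase words before matching.
        words = set(re.findall(r"[a-z0-9']+", text.lower()))
        if words & concept:
            hits.append(doc_id)
    return hits


# Hypothetical sample documents.
DOCS = {
    "doc1": "The car was parked outside the office.",
    "doc2": "Please return the company vehicle by Friday.",
    "doc3": "Lunch menu for the quarterly meeting.",
}

print(concept_search("car", DOCS))
```

A plain keyword search for “car” would find only the first document; the expanded concept also picks up the second, which mentions “vehicle” but never “car”.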

Concept categorization, on the other hand, is actually not a search technology at all.  Concept categorization technologies do not “find” documents.  Rather, they categorize or group documents based on their similarity.  There are many different ways to group documents based on similarity.  Techniques include statistical clustering (which assesses similarity based on word frequency), Bayesian classification (which weights words differently depending on factors in addition to statistical frequency, such as where the terms appear in a document), and semantic indexing (which takes into account the fact that many words used in a similar context may have a similar meaning).  Describing these technologies in detail is beyond the scope of this post, but the Sedona commentary has a good summary of them if you are interested in learning more.
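As a rough illustration of the simplest of these approaches, grouping by word frequency, here is a minimal sketch. It is not how any particular product works: the greedy grouping rule, the `threshold` value, the sample `DOCS`, and the function names are all assumptions chosen to keep the example short.

```python
import math
import re
from collections import Counter


def word_counts(text):
    """Count how often each lowercase word appears in the text."""
    return Counter(re.findall(r"[a-z0-9']+", text.lower()))


def cosine_similarity(a, b):
    """Cosine similarity between two word-frequency Counters (0.0 to 1.0)."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0


def categorize(documents, threshold=0.5):
    """Greedy grouping: a document joins the first group whose first
    member is similar enough; otherwise it starts a new group."""
    vectors = {doc_id: word_counts(text) for doc_id, text in documents.items()}
    groups = []
    for doc_id in documents:
        for group in groups:
            if cosine_similarity(vectors[doc_id], vectors[group[0]]) >= threshold:
                group.append(doc_id)
                break
        else:
            groups.append([doc_id])
    return groups


# Hypothetical sample documents.
DOCS = {
    "a": "merger agreement draft merger terms",
    "b": "merger agreement final terms",
    "c": "lunch order pizza lunch",
}

print(categorize(DOCS))
```

Notice that no query is involved: the two merger documents end up together and the lunch order lands in a group of its own, purely because of how much vocabulary they share.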

As should now be apparent, these technologies are very different, and using the same words to describe them is confusing.  It’s not surprising, then, that a lot of the users of e-discovery services and software don’t have a strong understanding of what these technologies are or what benefits they can actually provide in practice.  Dispelling the myth that they can be lumped together is a critical first step in any conversation about concept search and how it can help in e-discovery.  This leads us to a second myth, that Concept Search is better than Keyword Search.  I’ll discuss this in my next blog post.

How Good Are Your E-Discovery Tools?

Monday, April 7th, 2008

Jeff Spicoli, after crashing a car in Fast Times at Ridgemont High, quipped:

“It’s okay. My dad is a TV repairman. He has the ultimate set of tools. I can fix it.”

Clearly, Spicoli’s tools (no matter how “ultimate”) weren’t going to get the car repaired. Never mind the fact that he was probably under the influence and shouldn’t have been operating anything more than a Barcalounger. His quote did get me thinking about a post I read recently that probably would have advised Spicoli against talking about how good his tools were. The post in question trumpeted the value proposition of early case assessments in E-Discovery (a viewpoint I wholly endorse). And yet, in that post the author posited an interesting viewpoint that I think needs a bit of deconstruction:

“In legal, the less information your opponent has, the better off you are. … Using commodity based early case assessment tools may introduce legal risk your company may not want to manage. For example, if the opposing counsel has foreknowledge of the products you use, such as Autonomy/Aungate, Attenex or Clearwell Systems, they know your capability to identify concepts, custodians, etc. Using software to create legal leverage without sharing to the world how you do it, can improve your competitive advantage in the early phases of litigation.”

As a former practicing litigator, I’ll be the first to admit that I’ve seen my share of scorched earth discovery tactics. And, I’m not so much of a Pollyanna as to think that a certain amount of this zero sum mentality doesn’t still exist. And yet, there’s an emerging trend (some might say a nascent best practice) to increase the amount of transparency and collaboration in the E-Discovery world.

I was at the Sedona Conference’s recent “Program on Getting Ahead of the eDiscovery Curve” where one of the hot topics was how the fledgling amendments to the FRCP were playing out in practice. One key discussion area centered around how the new Rules required a much more collaborative meet and confer process:

“Rule 26(f) is about cooperation and working together. By coming together early, defining what is important and what is not, and working with your adversary, not against them, means less risk, less cost and more certainty.” [Emphasis Added]. A Practitioner’s Guide to Rule 26(f) Meet & Confer: A Year After the Amendments. John Rosenthal, Howrey LLP and Moze Cowper, Amgen Inc.

Similarly, recent case law has also championed this collaborative approach:

“Identifying relevant records and working out technical methods for their production is a cooperative undertaking, not part of the adversarial give and take. … It is not appropriate to seek an advantage in the litigation by failing to cooperate in the identification of basic evidence.” In re Seroquel Prods. Liab. Litig., 2007 WL 2412946 (M.D. Fla. Aug. 21, 2007)

As part of this proposed transparency and collaboration, the authors (above) point out that a number of topics should proactively be discussed during the meet and confer session(s) including preservation, date ranges, custodians, systems, categories or types of ESI, and the use of search terms. In my experience this level of discussion and transparency really does pay dividends. Anything that resembles a “hide the ball” approach will ultimately take up needless attorney cycles, and will in turn drive up the cost of resolving the matter.

Now, I will concede that a party shouldn’t take the transparency notion too far. For example, it’s probably not necessary to immediately discuss the brand(s) of tools that are working behind the scenes to deliver the promised results. And yet, disclosing the type of functionality that will be brought to bear on the E-Discovery process can help:

  • Facilitate discussions about ESI “inaccessibility” – see FRCP 26(b)(2)(B)
  • Dispel the frequent myth that one party has the type of uber tool that can instantly, cheaply and automatically grab every piece of relevant data from the most remote corners of an enterprise
  • Set the stage for limitations in the E-Discovery process so that all parties (including the Court) can have their expectations firmly grounded in reality
  • Eliminate “black box” technology concerns by showing the opposition how your tools work to process files, handle metadata, etc.

So, back to the Spicoli reference: having a killer set of tools may help your enterprise (or client) achieve fast, accurate and predictable results. But, does the opponent’s knowledge of the type of tools and features you’re going to use increase your risk profile?

While there aren’t any absolutes, I’d certainly say “no.” And, even if this type of gamesmanship did yield a temporary advantage, it’s probably outweighed by a collaborative E-Discovery approach that is quickly becoming a best practice.

If we could only get the E-Discovery tools to fix Spicoli’s car…

New Writers And A New Look As E-Discovery 2.0 Enters Its Second Year

Wednesday, April 2nd, 2008

Regular readers of E-Discovery 2.0 will notice a new look to the blog today (thanks Sean!). But that’s not the biggest change. As we enter the blog’s second year, I have decided to take your feedback to heart and invite 3 exceptional people to join me as regular bloggers.

In the 12½ months since I wrote my first post, it has been exciting to see the blog’s readership grow rapidly (see charts for trends in page views and email subscribers). I would like to profoundly thank everyone who has either read this blog, linked to it, submitted comments, or even just come up to me at various parties and events to say that you have been reading it. Without your input, I would have nothing to write. It is tremendous fun to interact with a community of people who share my interests, and I’m grateful to you all for engaging.

Blog Stats

As part of my ongoing dialogue with readers, I consistently got 2 requests: can you post more often, and cover a broader set of e-discovery issues? True, over the past year, I have covered the big deals that mattered, the small ones that didn’t and the ones in between; I wrote about analyst rankings of different e-discovery vendors, prompting a lively discussion involving the analysts themselves; I highlighted shifts in the landscape, such as enterprises bringing e-discovery in-house, partnerships with archive vendors, and the changing role of service providers. I even had a little fun, every now and then.

But there’s no denying that e-discovery presents far too rich a set of legal, business, and technology issues for me to cover alone, especially given the demands of my day job. That’s why I decided that the best way to develop the blog as a resource for the e-discovery community is to have more people writing, especially if those people are both more intelligent than me and have their own perspectives on e-discovery.

So it is with great pleasure that I welcome three new bloggers to E-Discovery 2.0:

  • Dean Gonsowski is a lawyer who has spent the past 10 years advising corporations and law firms on how to improve their e-discovery processes. He teaches a series of continuing legal education courses on e-discovery, and is a member of The Sedona Conference Working Group on Electronic Document Retention and Production (WG1).
  • Kurt Leafstrand is a rocket scientist from MIT (I’m not kidding!) who is now very active in EDRM, and was a key contributor to the recently published XML standards. He spends his days designing e-discovery solutions and has posted before on several topics.
  • Will Uppington is also active in EDRM and Sedona. His particular passion is developing new search, analysis and web 2.0 technologies and applying these to reducing the costs and risks associated with e-discovery. He has also posted before.

I am thrilled to be joined by such a bright bunch, and hope you enjoy the new “e-discovery team” approach!