Posts Tagged ‘ediscovery’

Demystifying Concept Search in Electronic Discovery

Tuesday, October 28th, 2008

Concept or content search continues to be a hot topic within the e-discovery community.  There’s a continuous stream of articles discussing it, some pointing out the positives and others the limitations.  The courts have also gotten involved in the discussion.  Judge Grimm refers to concept search in e-discovery in Victor Stanley, Inc. v. Creative Pipe, Inc., 2008 WL 2221841 (D. Md. May 29, 2008).  Judge Facciola discusses concept search in Disability Rights Council of Greater Washington v. Washington Metropolitan Transit Authority, 242 F.R.D. 139 (D.D.C. 2007) and other opinions.  Despite (or maybe because of) all the commentary on this topic, I find that while a lot of people think that concept search in e-discovery is good, many are not fully sure of exactly what concept search is, and how it is practically useful in e-discovery.  It’s pretty clear that after several years of commentary and hype, concept search has become something of a buzzword associated with many myths and misconceptions.  In an effort to better understand what concept search is and how it can help in e-discovery, I want to dispel two of the most common myths I have heard.

The “Concept Search is Concept Search” Myth

The first myth around concept search actually revolves around what it is.  In my experience, people tend to lump two different technologies together when talking about concept search: concept search and concept categorization.  It’s very common, for example, to see commentators say concept search even when what they are really talking about is concept categorization.  To make matters more confusing, people also use a plethora of other names including content search, content clustering or concept clustering when what they really mean is concept categorization.

So, what are the differences between concept search and concept categorization?  First, let’s start with concept search.  Concept search technologies find documents containing “concepts”.  I think that the Sedona Conference’s “Best Practices Commentary on the Use of Search & Information Retrieval Methods in E-Discovery” provides a good definition of “concept” when used in a search context: “the combination of [a] query term and the additional terms identified by the thesaurus.”  In other words, concept search technologies find documents containing a specified term plus additional terms with similar meanings derived from a thesaurus.
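To make that definition concrete, here is a minimal Python sketch of thesaurus-based query expansion. The thesaurus entries, document texts, and function names are all invented for illustration, and real concept search products are considerably more sophisticated than this.

```python
import re

# Hypothetical toy thesaurus; a real system would rely on a far larger one.
THESAURUS = {
    "car": {"automobile", "vehicle"},
    "contract": {"agreement"},
}


def expand_query(term):
    """Return the 'concept': the query term plus its thesaurus terms."""
    term = term.lower()
    return {term} | THESAURUS.get(term, set())


def concept_search(term, documents):
    """Return the ids of documents containing any term in the concept."""
    concept = expand_query(term)
    hits = []
    for doc_id, text in documents.items():
        words = set(re.findall(r"[a-z]+", text.lower()))
        if words & concept:
            hits.append(doc_id)
    return hits


# Invented sample documents.
DOCUMENTS = {
    "doc1": "The car was leased in March.",
    "doc2": "Please return the automobile by Friday.",
    "doc3": "Lunch is at noon.",
}

print(concept_search("car", DOCUMENTS))
```

A plain keyword search for “car” would find only doc1 here; the expanded concept also picks up doc2, which never uses the word.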

Concept categorization, on the other hand, is actually not a search technology at all.  Concept categorization technologies do not “find” documents.  Rather, they categorize or group documents based on their similarity.  There are many different ways to group documents based on similarity.  Techniques include statistical clustering (which assesses similarity based on word frequency), Bayesian classification (which weights words differently depending on factors in addition to statistical frequency, such as where the terms appear in a document), and semantic indexing (which takes into account the fact that many words used in a similar context may have a similar meaning).  Describing these technologies in detail would take more space than I have here, but the Sedona commentary has a good summary of them if you are interested in learning more.
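As a rough illustration of the statistical flavor of categorization, the Python sketch below groups documents by how similar their word frequencies are. The cosine-similarity measure, the greedy grouping rule, the 0.5 threshold, and the sample texts are my own assumptions for the example, not a description of how any particular product works.

```python
import math
import re
from collections import Counter


def word_counts(text):
    """Count how often each word appears in the text."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine_similarity(a, b):
    """Similarity of two word-count vectors, from 0.0 to 1.0."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def group_documents(documents, threshold=0.5):
    """Greedy grouping: a document joins the first group whose first
    member is similar enough, otherwise it starts a new group."""
    vectors = {doc_id: word_counts(text) for doc_id, text in documents.items()}
    groups = []
    for doc_id in documents:
        for group in groups:
            if cosine_similarity(vectors[doc_id], vectors[group[0]]) >= threshold:
                group.append(doc_id)
                break
        else:
            groups.append([doc_id])
    return groups


# Invented sample documents.
DOCS = {
    "a": "merger agreement draft merger terms",
    "b": "draft merger agreement terms revised",
    "c": "lunch menu pizza salad",
    "d": "pizza lunch order salad",
}

print(group_documents(DOCS))
```

Notice that nobody supplied a query: the documents are simply sorted into piles, which is exactly why categorization is not search.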

As should now be apparent, these technologies are very different and using the same words to describe them is confusing.  That is why it’s not surprising that a lot of the users of e-discovery services and software don’t have a strong understanding of what these technologies are or what benefits they can actually provide in practice.  Dispelling the myth that they can be lumped together is a critical first step in any conversation about concept search and how it can help in e-discovery.  This leads us to a second myth, that Concept Search is better than Keyword Search.  I’ll discuss this in my next blog post.

Electronic Data Discovery at ACC

Thursday, October 23rd, 2008

I was in Seattle this week for the annual Association of Corporate Counsel conference.  And, from all outward appearances, it seems like the dour economic climate hasn’t dampened the spirits of the legal and litigation support communities.  There were lavish parties, including an extravaganza thrown by Womble Carlyle at the Space Needle, along with no shortage of the usual tchotchkes, giveaways and over-the-top promotions - even though the general consensus from exhibitors was that actual attendance was down from last year.

Maybe the legal community is in denial.  Or perhaps, the sentiment instead is that tough economic times will result in more litigation and governmental regulation.  While this is certainly the optimistic viewpoint, the recent Fulbright & Jaworski Litigation Trends Survey at least provides some foundation for this rosy notion.

In Fulbright & Jaworski’s fifth annual survey, corporate counsel stated that they anticipate a litigation spike next year in both lawsuits and regulatory proceedings.  Among U.S. respondents to the most recent survey, 34 percent expect an increase in lawsuits involving their company and 25 percent anticipate more regulatory proceedings.

Speaking on behalf of the glass half full contingent, Stephen C. Dillard, who chairs Fulbright’s global litigation practice, believes that the survey results illustrate the shift from a long period of prosperity to the start of “a period of serious economic challenge that is likely to fuel litigation over who is to blame and who should pay for the consequences.”

Whether this prediction comes to pass remains to be seen, but at least the participants at the ACC conference seem to be drinking the same Kool-Aid.  Whether that sugary drink is actually good for you will be the question.

Let me know what you think.  Do you think the financial crisis will force litigation to increase, decrease, or stay the same and why?

The “Artful” E-Discovery Dodger

Monday, October 13th, 2008

E-Discovery search has become a hot topic of late (in blogs and in the news), and I think it’s pretty clear that the unwashed (attorney) masses still don’t really grok the importance of using a defensible search protocol.  Neither do they seem to understand the enhanced scrutiny that’s being applied by the judiciary.

Kipperman v. Onex Corp., 2008 WL 4372005 (N.D. Ga. Sept. 19, 2008) is another in what will assuredly be a long string of cases that demonstrate how easy it is for litigators to get wrapped around the axle of e-discovery search.  In Kipperman, the defendant (Onex) presented several motions to the court, including attempts to obtain relief from the need to produce email identified after searching several backup tapes.

During a previous hearing the court ordered Onex to search all the mailboxes on two tapes, as well as on an additional tape selected by Plaintiff. The court determined that despite Onex’s objections and representations, the backup tapes were “producing meaningful discoverable information.”  The court was nevertheless sympathetic to Onex’s burden and therefore weighed in with some guidance:

“The court did suggest, … , that Plaintiff be more artful with its search terms and that Plaintiff utilize a list of the people, provided by Defendants, to review whether all mailboxes needed to be searched.”

The court also gave Onex the chance to narrow the search terms.  Unfortunately, they didn’t seize the opportunity to provide a narrower list or a refinement of their search terms.  “As such, they agreed to search and restore all the mailboxes with the search terms provided by Plaintiff.”

Not surprisingly, Onex then sought relief from having to review and produce all of the results from the search because the “broad search terms resulted in thousands and thousands of irrelevant hits.”  For example, the search terms included the word “republic”, which was used to elicit emails regarding Republic Builders Products, one of the companies involved in this matter.

“Defendants claim that the search captured thousands of irrelevant pages due to one occurrence of the word ‘republic’ often related to Onex business interests having nothing to do with Magnatrax in the ‘Republic of France,’ ‘Republic of Ireland,’ and ‘Czech Republic’.”
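To see how a slightly more “artful” term could have cut down this kind of noise, consider the Python sketch below. It is purely illustrative: the narrower pattern and the sample email lines are my own inventions, not anything proposed by the parties or the court, and a real protocol would need to be negotiated and sampled.

```python
import re

# The broad term as described in the opinion: any occurrence of "republic".
BROAD = re.compile(r"\brepublic\b", re.IGNORECASE)

# Hypothetical narrower term: skip "Republic of ..." and "Czech Republic".
NARROW = re.compile(r"(?<!czech )\brepublic\b(?! of\b)", re.IGNORECASE)

# Invented sample email lines.
EMAILS = [
    "Quote from Republic Builders Products attached.",
    "Tax filing in the Republic of Ireland.",
    "Meeting with the Czech Republic office.",
    "Republic shipment delayed.",
]


def hits(pattern, emails):
    """Return the positions of the emails that match the pattern."""
    return [i for i, text in enumerate(emails) if pattern.search(text)]


print("broad: ", hits(BROAD, EMAILS))
print("narrow:", hits(NARROW, EMAILS))
```

On this toy set the broad term hits all four emails, while the narrower one keeps only the two that could plausibly concern the company.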

Again the court reaffirmed its sympathy with Onex’s burden and yet denied the requested relief, in large part because Onex had been warned to be more “artful”:

“[T]he court is not unsympathetic to the massive amount of discovery involved in this matter, the considerable burden of working with it, and the overproduction that often comes with e-mail production. Therefore, the court gave Defendants numerous tools by which to reduce the burden of e-mail discovery, including an opportunity to limit Plaintiff’s search terms and an opportunity to provide a list by which the number of peoples and the number of boxes being searched could be reduced. Defendants did not take advantage of these opportunities. Defendants must now lie in the bed that they have made. Thus, Defendants’ objections on the basis of relevancy and volume are DENIED.” (emphasis added).

Needless to say, Kipperman is probably not all that atypical.  Attorneys everywhere have historically used blunt e-discovery search instruments and haven’t often run afoul of the judiciary.  Now, post Victor Stanley et al., the playing field has changed dramatically.  It’s important to leverage best practices (from Sedona and others), craft a defensible search strategy, sample the results and “show your work.”  Missteps along the way, especially ones that the court has tried to help the parties avoid, won’t be met with much tolerance.

E-Discovery In The Press

Thursday, October 2nd, 2008

Last month, for the first time, friends of mine who do NOT work in the legal industry started talking to me about e-discovery. In the past, they had always taken on the glazed look of a bored 8th-grader whenever I spoke about what I do. But suddenly, they were strangely interested and full of questions.

The reason was two articles about e-discovery in the mainstream media which appeared within a week of each other. The first was in the Wall Street Journal, which wrote about how tech firms are at war with lawyers. According to the Journal, the fact that companies are saving money by using e-discovery software is bad news for lawyers, since they are “facing the loss of lucrative client fees.” In response, the lawyers are fighting back: “The attorneys counter that there are pitfalls to replacing them. Early this year, a federal judge required chip maker Qualcomm to pay rival Broadcom more than $8 million after it failed to uncover and share emails relevant to a case.”

I am sure there are lawyers who see technology as a threat, but the firms I deal with are actively embracing e-discovery technology, not fighting it. They see it as another way they can add value to their clients, and would prefer to have their staff focused on practicing law, not mindlessly reading irrelevant documents. So I ended up spending a lot of time explaining to my non-legal friends that there are two sides to the coin. As for my friends who do happen to be lawyers, they focused on the Qualcomm case, pointing out (as we have written before) that the problem was not technology, but rather poor processes and bad judgment on the part of the attorneys concerned.

The second article appeared in the Economist and took a different tack. It argued that the stratospheric cost of e-discovery is gumming up the court system and preventing justice from being served. According to one former justice from Colorado quoted in the article, even mundane landlord-tenant disputes “are now digital wars of attrition”; there are “cases that are settled only because one party cannot afford the costs of e-discovery”; and, many “plaintiffs cannot afford to sue at all, for fear of the e-discovery costs.”

I love the Economist’s tongue-in-cheek style and thought the article made many valid points. My one disappointment was that its spin was unequivocally negative, as though e-discovery is a self-inflicted wound on the American judicial system. Nowhere was there mention of the fact that electronic evidence often helps litigants get at the truth. Rather than incomplete recollections or “he said-she said” claims and counter-claims, there’s no disputing an email that captures a person’s words and actions in black-and-white. Nor was there any mention of how technology is solving the problems that it inadvertently created: today, there are many products that rapidly sift through electronic information, dramatically lowering the cost of e-discovery.

It is great for everyone in the e-discovery community for our domain to get more ink in mainstream, quality publications. I expect that the trend will continue as the industry grows, and especially once the investigations start into our current financial meltdown.

“Aggressive Culling”: The E-Discovery Buzz Cut

Tuesday, September 30th, 2008

Ralph Losey, never one to mince words, recently analyzed a litigation survey from the elite Fellows of the American College of Trial Lawyers. The survey highlights the fact that one of the main problems facing the U.S. legal system today is (surprise!) e-discovery. Also (not) a surprise is that the study “places the blame squarely on poor rules, bad law, and judges”, while overlooking the role that lawyers play in the problem.

In his analysis, Ralph makes a number of insightful observations that should help lawyers move from being e-discovery troublemakers to being part of the solution. However, one of his key critiques is targeted not at lawyers but rather at the vendor community: “[E-discovery] is too expensive because lawyers and judges do not know what they are doing, and do not know how to properly cull and review email, and because clients are disorganized pack-rats. Many of the e-discovery vendors are also misinformed, but often they do know better; they just have no pecuniary interest in aggressive culling. Some may even seek to line their own pockets in inflated discoveries.”

As Ralph bluntly points out, pecuniary interests (translation: money) play a big role here, but so does risk reduction. Imagine you’re given the opportunity to process a 2 terabyte case all the way through to review. With the “funnel” of e-discovery costs placing the highest dollar per gigabyte value on the end of the process (i.e. review), what’s your incentive to cull aggressively at the beginning? Not much from a revenue perspective, certainly, but also not much from a risk perspective, particularly when you have sanctions and lawsuits on your mind and are thinking about the potential liability that you incur by excluding potentially relevant documents by using too broad a brush (or pair of garden clippers) in your pruning.

How do we move forward? As document volumes continue to grow, it’s clear that aggressive culling (with a few caveats which we’ll get to in a minute) is a critical tool for managing costs and improving case outcomes (let’s go out on a limb and define “improving” as producing fairer and more equitable rulings). However, in order to adopt more aggressive culling as a standard part of the electronic discovery process, the community has to come to terms with three things:

  • The Myth of Perfection: There may be perfect abs, but there is no perfect e-discovery. Organizations like the E-Discovery Institute are doing fantastic work to measure and improve the accuracy of electronic discovery efforts, but in the end it’s tough to make the argument that having 100 contract attorneys manually reviewing 10 million documents will necessarily produce a better overall e-discovery outcome than 10 specialized attorneys reviewing 200,000 documents that were aggressively (but thoughtfully) culled from the initial 10 million document set. There simply is no black and white set of rules that will lead to a perfect process.
  • The Benefit of Cost Control: Given that, it is in the best interest of everyone involved (yes, even vendors) to choose the most cost-effective process that provides a high likelihood of producing the information relevant to the case.  This means “saving your bullets” by not spending all of your e-discovery dollars up front in a case pursuing the perfection myth, but instead approaching discovery in an incremental fashion which can adapt to changing facts and circumstances as the matter unfolds. How, you may ask, do vendors benefit? They can become more strategic e-discovery advisors by working with counsel over the full lifecycle of a case, providing higher-value (and, by the way, more interesting and intellectually challenging) consulting services to help incrementally adjust and adapt the course of e-discovery. As Ralph puts it: “…Trial lawyers should accept that specialists in the field of e-discovery are a necessary evil. If an e-discovery specialist knows the field, they can save you money and take you out of the e-discovery morass faster and more reliably than a dozen new rules. The world today is too complex for one man or woman to do it all.”
  • The Value of Defensibility: Many of you likely winced at the term “high likelihood” in the previous point. “Sacrilege!” you cried. “I demand certainty!” First, go back and re-read the first point about the Myth of Perfection. Then, consider that a better way forward may be an approach to e-discovery that involves more aggressive culling early in the process to focus on the most important documents first, more iterations to adapt to changing facts and circumstances, and, all along the way, a complete audit trail that provides defensibility in the event that any aspect of the process is ever questioned. Such defensibility would include specific documentation about the culling decisions that were made, down to the keyword and “sub-keyword” (i.e. wildcard expansion) level, so all the cards are on the table for everyone to see.  The value of defensibility when performing aggressive culling is enormous, in that it adds an additional measure of safety and trust to the process, minimizing the amount of doubt and second-guessing that so often plagues e-discovery negotiations.
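
For a sense of what documentation at the keyword and “sub-keyword” level might look like, here is a small Python sketch that culls a document set and records which actual words each wildcard keyword expanded to, along with hit counts. The keywords, documents, and report layout are hypothetical, and a real audit trail would also capture who ran the search, when, and against which data.

```python
import fnmatch
import re
from collections import Counter


def cull(documents, keywords):
    """Keep documents matching any keyword, and build an audit record.

    The audit maps each keyword to the words its wildcard expanded to,
    with the number of documents in which each expansion appeared.
    """
    audit = {kw: Counter() for kw in keywords}
    kept = []
    for doc_id, text in documents.items():
        words = set(re.findall(r"[a-z]+", text.lower()))
        matched = False
        for kw in keywords:
            for word in fnmatch.filter(words, kw.lower()):
                audit[kw][word] += 1
                matched = True
        if matched:
            kept.append(doc_id)
    return kept, audit


# Invented sample documents and keywords.
DOCUMENTS = {
    "d1": "Invoice for steel shipment",
    "d2": "Invoices and shipping schedule",
    "d3": "Holiday party",
}
KEYWORDS = ["invoic*", "ship*"]

kept, audit = cull(DOCUMENTS, KEYWORDS)
print("kept:", kept)
for keyword in KEYWORDS:
    for word, count in sorted(audit[keyword].items()):
        print(f"{keyword} -> {word}: {count} document(s)")
```

Even a report this simple puts the cards on the table: opposing counsel can see that “ship*” pulled in “shipment” and “shipping”, and that d3 was excluded because nothing matched.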

By coming to terms with the fundamental imperfections of the e-discovery process and embracing the promise of lower costs and the agility and responsiveness that can be gained with a more iterative approach, everyone stands to gain from the safe and controlled adoption of aggressive culling – yes, even the vendors (at least the smart ones) and their ever-present pecuniary interests.