Posts Tagged ‘legal discovery’

Demystifying Concept Search in Electronic Discovery

Tuesday, October 28th, 2008

Concept or content search continues to be a hot topic within the e-discovery community.  There’s a continuous stream of articles that discuss it: some point out the positives, others the limitations.  The courts have also gotten involved in the discussion.  Judge Grimm refers to concept search in e-discovery in Victor Stanley, Inc. v. Creative Pipe, Inc., 2008 WL 2221841 (D. Md. May 29, 2008).  Judge Facciola discusses concept search in Disability Rights Council of Greater Washington v. Washington Metropolitan Transit Authority, 242 F.R.D. 139 and other opinions.  Despite (or maybe because of) all the commentary on this topic, I find that while a lot of people think that concept search in e-discovery is good, many are not fully sure of exactly what concept search is, or how it is practically useful in e-discovery.  It’s pretty clear that after several years of commentary and hype, concept search has become something of a buzzword associated with many myths and misconceptions.  In an effort to better understand what concept search is and how it can help in e-discovery, I want to dispel two of the most common myths I have heard.

The “Concept Search is Concept Search” Myth

The first myth around concept search actually revolves around what it is.  In my experience, people tend to lump two different technologies together when talking about concept search: concept search and concept categorization.  It’s very common, for example, to see commentators say concept search even when what they are really talking about is concept categorization.  To make matters more confusing, people also use a plethora of other names including content search, content clustering or concept clustering when what they really mean is concept categorization.

So, what are the differences between concept search and concept categorization?  First, let’s start with concept search.  Concept search technologies find documents containing “concepts”.  I think that the Sedona Conference’s “Best Practices Commentary on the Use of Search & Information Retrieval Methods in E-Discovery” provides a good definition of “concept” when used in a search context: “the combination of [a] query term and the additional terms identified by the thesaurus.”  In other words, concept search technologies find documents containing a specified term plus additional terms with similar meanings derived from a thesaurus.
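To make that definition concrete, here is a minimal sketch of thesaurus-based query expansion. The thesaurus entries, the sample documents, and the function names are all hypothetical, invented purely for illustration; commercial concept search tools are far more sophisticated than this.

```python
import re

# Hypothetical toy thesaurus; a real system would use a far larger one.
THESAURUS = {
    "car": {"automobile", "vehicle"},
    "contract": {"agreement"},
}

def expand_query(term, thesaurus=THESAURUS):
    """Return the 'concept': the query term plus its thesaurus terms."""
    term = term.lower()
    return {term} | thesaurus.get(term, set())

def tokenize(text):
    """Split text into a set of lowercase words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def concept_search(term, documents, thesaurus=THESAURUS):
    """Return ids of documents containing any term in the expanded concept."""
    concept = expand_query(term, thesaurus)
    return [doc_id for doc_id, text in documents.items()
            if tokenize(text) & concept]

# Made-up sample documents.
documents = {
    "doc1": "The car was delivered on Monday.",
    "doc2": "Please inspect the vehicle before signing.",
    "doc3": "Lunch is at noon.",
}

print(concept_search("car", documents))
```

A plain keyword search for "car" would find only doc1; the expanded concept also picks up doc2 because "vehicle" is listed in the thesaurus.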

Concept categorization, on the other hand, is actually not a search technology at all.  Concept categorization technologies do not “find” documents.  Rather, they categorize or group documents based on their similarity.  There are many different ways to group documents based on similarity.  Techniques include statistical methods (which assess similarity based on word frequency), Bayesian classification (which weights words differently depending on factors in addition to statistical frequency, such as where the terms appear in a document), and semantic indexing (which takes into account the fact that many words used in a similar context may have a similar meaning).  Describing these technologies in detail would take more space than I have here, but the Sedona commentary has a good summary of them if you are interested in learning more.
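As a rough illustration of the simplest of these approaches, the sketch below groups documents by word-frequency similarity. It assumes cosine similarity over raw word counts and a greedy grouping rule with an arbitrary threshold of 0.5; these choices, the sample documents, and the function names are assumptions made for the example, not a description of how any particular product works.

```python
import math
import re
from collections import Counter

def word_counts(text):
    """Count how often each lowercase word appears in the text."""
    return Counter(re.findall(r"[a-z]+", text.lower()))

def cosine_similarity(a, b):
    """Cosine similarity between two word-count vectors (0.0 to 1.0)."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

def group_documents(documents, threshold=0.5):
    """Greedy grouping: each document joins the first group whose
    first member is at least `threshold` similar, else starts a new group."""
    vectors = {doc_id: word_counts(text) for doc_id, text in documents.items()}
    groups = []
    for doc_id in documents:
        for group in groups:
            if cosine_similarity(vectors[doc_id], vectors[group[0]]) >= threshold:
                group.append(doc_id)
                break
        else:
            groups.append([doc_id])
    return groups

# Made-up sample documents.
documents = {
    "a": "steel building invoice for steel panels",
    "b": "invoice for steel building panels and doors",
    "c": "holiday party schedule and menu",
}

print(group_documents(documents))
```

Note that nothing here is "searched for": no query is supplied, and the documents are simply sorted into groups according to how much vocabulary they share.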

As should now be apparent, these technologies are very different and using the same words to describe them is confusing.  It’s not surprising, then, that a lot of the users of e-discovery services and software don’t have a strong understanding of what these technologies are or what benefits they can actually provide in practice.  Dispelling the myth that they can be lumped together is a critical first step in any conversation about concept search and how it can help in e-discovery.  This leads us to a second myth: that Concept Search is better than Keyword Search.  I’ll discuss this in my next blog post.

Electronic Data Discovery at ACC

Thursday, October 23rd, 2008

I was in Seattle this week for the annual Association of Corporate Counsel conference.  From all outward appearances, it seems that the dour economic climate hasn’t dampened the spirits of the legal and litigation support communities.  There were lavish parties, including an extravaganza thrown by Womble Carlyle at the Space Needle, along with no shortage of the usual tchotchkes, giveaways and over-the-top promotions - even though the general consensus from exhibitors was that actual attendance was down from last year.

Maybe the legal community is in denial.  Or perhaps, the sentiment instead is that tough economic times will result in more litigation and governmental regulation.  While this is certainly the optimistic viewpoint, the recent Fulbright & Jaworski Litigation Trends Survey at least provides some foundation for this rosy notion.

In Fulbright & Jaworski’s fifth annual survey, corporate counsel stated that they anticipate a litigation spike next year in both lawsuits and regulatory proceedings.  Among U.S. respondents to the most recent survey, 34 percent expect an increase in lawsuits involving their company and 25 percent anticipate more regulatory proceedings.

Speaking on behalf of the glass-half-full contingent, Stephen C. Dillard, who chairs Fulbright’s global litigation practice, believes that the survey results illustrate the shift from a long period of prosperity to the start of “a period of serious economic challenge that is likely to fuel litigation over who is to blame and who should pay for the consequences.”

Whether this prediction comes to pass remains to be seen, but at least the participants at the ACC conference seem to be drinking the same Kool-Aid.  Whether that sugary drink is actually good for you will be the question.

Let me know what you think.  Do you think the financial crisis will force litigation to increase, decrease, or stay the same and why?

The “Artful” E-Discovery Dodger

Monday, October 13th, 2008

E-Discovery search has become a hot topic of late (in blogs and in the news), and I think it’s pretty clear that the unwashed (attorney) masses still don’t really grok the importance of using a defensible search protocol.  Neither do they seem to understand the enhanced scrutiny that’s being applied by the judiciary.

Kipperman v. Onex Corp., 2008 WL 4372005 (N.D. Ga. Sept. 19, 2008) is another in what will assuredly be a long string of cases that demonstrate how easy it is for litigators to get wrapped around the axle of e-discovery search.  In Kipperman, the defendant (Onex) presented several motions to the court, including attempts to obtain relief from the need to produce email identified after searching several backup tapes.

During a previous hearing the court ordered Onex to search all the mailboxes on two tapes, as well as on an additional tape selected by Plaintiff. The court determined that despite Onex’s objections and representations, the backup tapes were “producing meaningful discoverable information.”  The court was nevertheless sympathetic to Onex’s burden and therefore weighed in with some guidance:

“The court did suggest, … , that Plaintiff be more artful with its search terms and that Plaintiff utilize a list of the people, provided by Defendants, to review whether all mailboxes needed to be searched.”

The court also gave Onex the chance to narrow the search terms.  Unfortunately, they didn’t seize the opportunity to provide a narrower list or a refinement of their search terms.  “As such, they agreed to search and restore all the mailboxes with the search terms provided by Plaintiff.”

Not surprisingly, Onex then sought relief from having to review and produce all of the results from the search because the “broad search terms resulted in thousands and thousands of irrelevant hits.”  For example, the search terms included the word “republic”, which was used to elicit emails regarding Republic Builders Products, one of the companies involved in this matter.

“Defendants claim that the search captured thousands of irrelevant pages due to one occurrence of the word ‘republic’ often related to Onex business interests having nothing to do with Magnatrax in the ‘Republic of France,’ ‘Republic of Ireland,’ and ‘Czech Republic’.”

Again the court reaffirmed its sympathy with Onex’s burden and yet denied the requested relief, in large part because Onex had already been urged to be more “artful”:

“[T]he court is not unsympathetic to the massive amount of discovery involved in this matter, the considerable burden of working with it, and the overproduction that often comes with e-mail production. Therefore, the court gave Defendants numerous tools by which to reduce the burden of e-mail discovery, including an opportunity to limit Plaintiff’s search terms and an opportunity to provide a list by which the number of peoples and the number of boxes being searched could be reduced. Defendants did not take advantage of these opportunities. Defendants must now lie in the bed that they have made. Thus, Defendants’ objections on the basis of relevancy and volume are DENIED.” (emphasis added).

Needless to say, Kipperman is probably not all that atypical.  Attorneys everywhere have historically used blunt e-discovery search instruments and haven’t often run afoul of the judiciary.  Now, post Victor Stanley, et al., the playing field has changed dramatically.  It’s important to leverage best practices (from Sedona and others), craft a defensible search strategy, sample the results and “show your work.”  Missteps along the way, especially ones that the court has tried to help the parties avoid, won’t be met with much tolerance.
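The “republic” problem in Kipperman is easy to reproduce in miniature. The sketch below compares a bare keyword with a narrower phrase on four made-up emails; the email text, the patterns, and the `hits` helper are hypothetical and are not drawn from the case record.

```python
import re

# Made-up emails for illustration only.
emails = {
    "e1": "Pricing update from Republic Builders Products for Q3.",
    "e2": "Travel plans for the Czech Republic office visit.",
    "e3": "Tax filing in the Republic of Ireland is due.",
    "e4": "Republic Builders Products shipment delayed.",
}

# Broad term: any occurrence of the word "republic".
broad = re.compile(r"\brepublic\b", re.IGNORECASE)

# Narrower term: "republic" immediately followed by "builders".
narrow = re.compile(r"\brepublic\s+builders\b", re.IGNORECASE)

def hits(pattern, docs):
    """Return ids of documents in which the pattern appears."""
    return [doc_id for doc_id, text in docs.items() if pattern.search(text)]

print("broad: ", hits(broad, emails))
print("narrow:", hits(narrow, emails))
```

The broad term hits all four emails, while the phrase hits only the two that mention the company. A narrower term can also miss relevant documents (an email that says only “Republic”, for instance), which is one reason sampling the results matters.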

E-Discovery In The Press

Thursday, October 2nd, 2008

Last month, for the first time, friends of mine who do NOT work in the legal industry started talking to me about e-discovery. In the past, they had always taken on the glazed look of a bored 8th-grader whenever I spoke about what I do. But suddenly, they were strangely interested and full of questions.

The reason was two articles about e-discovery in the mainstream media which appeared within a week of each other. The first was in the Wall Street Journal, which wrote about how tech firms are at war with lawyers. According to the Journal, the fact that companies are saving money by using e-discovery software is bad news for lawyers, since they are “facing the loss of lucrative client fees.” In response, the lawyers are fighting back: “The attorneys counter that there are pitfalls to replacing them. Early this year, a federal judge required chip maker Qualcomm to pay rival Broadcom more than $8 million after it failed to uncover and share emails relevant to a case.”

I am sure there are lawyers who see technology as a threat, but the firms I deal with are actively embracing e-discovery technology, not fighting it. They see it as another way they can add value to their clients, and would prefer to have their staff focused on practicing law, not mindlessly reading irrelevant documents. So I ended up spending a lot of time explaining to my non-legal friends that there are two sides to the coin. As for my friends who do happen to be lawyers, they focused on the Qualcomm case, pointing out (as we have written before) that the problem was not technology, but rather poor processes and bad judgment on the part of the attorneys concerned.

The second article appeared in the Economist and took a different tack. It argued that the stratospheric cost of e-discovery is gumming up the court system and preventing justice from being served. According to one former justice from Colorado quoted in the article, even mundane landlord-tenant disputes “are now digital wars of attrition”; there are “cases that are settled only because one party cannot afford the costs of e-discovery”; and, many “plaintiffs cannot afford to sue at all, for fear of the e-discovery costs.”

I love the Economist’s tongue-in-cheek style and thought the article made many valid points. My one disappointment was that its spin was unequivocally negative, as though e-discovery is a self-inflicted wound on the American judicial system. Nowhere was there mention of the fact that electronic evidence often helps litigants get at the truth. Rather than incomplete recollections or “he said-she said” claims and counter-claims, there’s no disputing an email that captures a person’s words and actions in black-and-white. Nor was there any mention of how technology is solving the problems that it inadvertently created: today, there are many products that rapidly sift through electronic information, dramatically lowering the cost of e-discovery.

It is great for everyone in the e-discovery community for our domain to get more ink in mainstream, quality publications. I expect that the trend will continue as the industry grows, and especially once the investigations start into our current financial meltdown.

Opening Moves in E-Discovery

Friday, September 19th, 2008

I was recently asked: “what are the first things you do when your client calls you about a case requiring e-discovery?”  So, for the benefit of all, I’ll post my answer.

My first caveat to the advice was context.  While a lot of attorneys have attended CLEs or have read about e-discovery, it’s not the same in the real world.  As the old Spanish Proverb goes:

It’s not the same to talk of bulls as to be in the bullring.

Keeping in mind that reality may differ significantly from academics, here are some things to consider when the next e-discovery case comes up.   Please also keep in mind that these steps (like the EDRM workflow) aren’t linear and may in fact occur cyclically or in parallel:

1. Preserve, preserve, preserve

Nothing is more important than meeting the initial preservation obligation, which begins when litigation is “reasonably likely” – as opposed to just when the complaint is filed.  This first step in the long journey can easily be a trap for the unwary/unprepared.

The challenge once you’re past the trigger issue is to then identify the boundaries of the duty to preserve, i.e., what evidence must be preserved?  This inquiry often begins with identifying key players, date ranges and data types.

Another significant challenge in this step is to monitor and update the legal hold process.  And, given that litigation more often than not spans years, it’s easy to initially succeed at the preservation effort, but then later fail on execution.  The best way to minimize risk in this step is to move quickly from preservation to collection.  See Is Preservation in E-Discovery Overrated?

2. Work backwards

Once preservation (and ideally collection) is adequately covered, the next step is to start thinking about the end of the process and what success (or lack of failure) looks like.  The exposure and profile of the matter are important to consider when you embark upon an e-discovery project since it’s critical to scale discovery efforts appropriately.

One thing, in particular, that is very important to consider early in the process is the type of production format that will be preferred by reviewing counsel and the opposition.  TIFF-based image productions (which are historically well accepted) are often pitted against native file ESI reviews.  Either format may or may not be acceptable given the situation and the applicability of FRCP Rule 34.

3. Understand the technical landscape

Most attorneys, but for a rare few, aren’t capable of really comprehending technical nuances of the complex and interrelated IT systems found at most Fortune 2,500 enterprises.  Fortunately, they are quite adept at working with experts (either consulting or testifying) to help them get to the bottom of issues that are difficult to comprehend and explain.  The key is to find the right technical people who understand IT systems and who can explain them to judges, juries, and attorneys alike, especially for some of the most common ESI repositories: email servers, archival systems, shared network drives, instant messaging servers, archival repositories (e.g., tape libraries, real time back-up systems, etc.), records management systems, knowledge management systems, proprietary, but highly leveraged, internal applications, offsite repositories (e.g., hosted IT or email systems) and significant partner or subsidiary data stores.  In many instances it will make sense to leverage or create a map of the data universe so that nothing is missed and inaccessibility arguments can be cogently detailed.

4. Get your lingo straight

Assumptions, whether in e-discovery or not, are often dangerous.  In a complex undertaking where multiple parties are handling ESI, it’s critical to make sure that everyone is on the same page, especially since every company handles IT, records management, ILM and information security differently.  The outset of an engagement is the right time to align these disparate constituents, so standardize on a set of commonly used terms.  Examples of potentially ambiguous terms include “imaging”, “archive”, and “records.”

5. Don’t assume your client will really be helpful

I’ve been involved with hundreds of e-discovery engagements and I’ve found that almost universally the end client professes a profound willingness to help out.  And yet, actual “help” is relatively rare.  To qualify this, it may be prudent to ask several additional questions:

  • Does the Client have the time to actually help?  Everyone at the client’s site has a day job that they’re tasked with above and beyond transient e-discovery needs.  So, while bandwidth generally is important, what’s more critical is the ability to comply with aggressive judicial deadlines.
  • Are the people helping the ones you’d want to see on the stand?  It’s often not realistic to have internal folks (especially IT and Records Managers) stay isolated during the various pre-trial events (meet & confer conferences and potentially 30(b)(6) depositions), so it’s important to evaluate how a given witness will fare when providing testimony.
  • How likely is it that your client would throw you under the bus if things went wrong?  In my opinion, there is now more reason for outside counsel to manage the risks of an e-discovery project going awry.  See, Sullivan and Cromwell’s suit against EED.  Some will wisely bring in 3rd party consultants/experts to have a neutral, unbiased constituent in the process.

6. Build a budget and team (internal/external)

Everyone is probably now aware of how expensive e-discovery can be if managed improperly.  This makes it all the more imperative to work quickly to get a rough sense of the scope (which will lead to a budget) and the client’s willingness to absorb associated charges.  The most important step is to right-size the e-discovery effort with the risks inherent in the corresponding litigation/investigation.  Otherwise, there’s a high likelihood that the e-discovery process will be over-engineered (too expensive) or under-scoped (cutting dangerous corners).

7. Figure out your risk profile

Similar to right-sizing the budget, it also makes sense to adopt a “horses for courses” approach to e-discovery since there is no singular way to handle a given matter.  For example, in one case you may take forensic images, restore backup tapes, capture instant messaging data, harness metadata, or decide to do an automated review with a “clawback” provision.  In any case, the only mistake is to assume that an approach from another, dissimilar matter is warranted in the instant case.

8. Assume the opposition is better informed than you are

While this actually may not be the case, it’s a safer bet than assuming a level of naiveté that may not exist.  What is certain is that the plaintiffs’ bar is increasingly well informed and can be very aggressive.  They’ve seen the playbook that calls for baiting the opposition into a discovery misstep that can result in significant, case-altering sanctions.  According to a recent survey, 63% of the polled attorneys said that e-discovery is being abused by counsel, so it’s important to be wary initially.

It’s also important to consider the potential reciprocity of a given matter and adjust your position accordingly.  In many instances it’s easy to consider your role only as a producing party, but with cross/counter claims it may be possible to simultaneously be propounding discovery and in the opposition’s shoes.

9. Prepare for an early case assessment

A recent industry survey found that effective early case assessment (ECA) approaches reduced overall litigation in half of the cases evaluated, and resulted in favorable outcomes for 76 percent of the cases.   The key to this methodology is to use the available next generation case analysis solutions earlier in the process, not just to review data for relevancy and privilege, but to:

  • Identify the key players. This is critical in order to have a defensible legal hold process
  • Evaluate the posture of the case to determine how it looks on the merits
  • Diagnose potential outliers in the e-discovery process to facilitate meet and confer discussions and help create “inaccessibility” arguments
  • Conduct a search term analysis for keyword negotiations during meet and confer discussions.  Objectively demonstrating the results of proposed search queries can go a long way in speeding up keyword negotiations
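The search term analysis in the last bullet can be sketched as a simple hit report. The example below is a hypothetical illustration: the documents, the proposed terms, and the `term_hit_report` function are made up, and a real analysis would run against the actual collection with whatever search syntax the review platform supports.

```python
import re

def term_hit_report(terms, documents):
    """For each proposed term, count the documents it hits and the
    documents that no other proposed term would have found."""
    matches = {}
    for term in terms:
        # Whole-word, case-insensitive match of the literal term.
        pattern = re.compile(r"\b" + re.escape(term) + r"\b", re.IGNORECASE)
        matches[term] = {doc_id for doc_id, text in documents.items()
                         if pattern.search(text)}
    report = {}
    for term, docs in matches.items():
        others = set().union(*(d for t, d in matches.items() if t != term))
        report[term] = {"documents": len(docs), "unique": len(docs - others)}
    return report

# Made-up sample documents.
documents = {
    "d1": "Merger agreement draft attached.",
    "d2": "The agreement covers the Republic Builders account.",
    "d3": "Visit to the Czech Republic next week.",
    "d4": "Lunch order.",
}

report = term_hit_report(["agreement", "republic", "merger"], documents)
for term, counts in report.items():
    print(term, counts)
```

A term with a high “unique” count deserves a closer look before a meet and confer: it is either finding documents nothing else would catch or, as with “republic” here, dragging in hits that a narrower term would avoid.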

10. Don’t take search for granted

For many attorneys, e-discovery search is just like Lexis or Google.  Unfortunately, that isn’t the case.  Instead, it’s become highly complex and is now receiving significant judicial scrutiny.  In Victor Stanley v. Creative Pipe Judge Grimm suggested that attorneys need to rethink how they’ve traditionally managed the search process:  “[F]or lawyers and judges to dare opine that a certain search term or terms would be more likely to produce information than the terms that were used is truly to go where angels fear to tread.”  It’s now important to devise (and share at early meet & confer conferences) a defensible search strategy that can withstand judicial scrutiny.