Posts Tagged ‘documents’

Demystifying Concept Search in Electronic Discovery

Tuesday, October 28th, 2008

Concept or content search continues to be a hot topic within the e-discovery community. There’s a continuous stream of articles that discuss it: some point out the positives, others the limitations. The courts have also gotten involved in the discussion. Judge Grimm refers to concept search in e-discovery in Victor Stanley, Inc. v. Creative Pipe, Inc., 2008 WL 2221841 (D. Md. May 29, 2008). Judge Facciola discusses concept search in Disability Rights Council of Greater Washington v. Washington Metropolitan Transit Authority, 242 F.R.D. 139 and other opinions. Despite (or maybe because of) all the commentary on this topic, I find that while a lot of people think that concept search in e-discovery is good, many are not fully sure of exactly what concept search is or how it is practically useful in e-discovery. It’s pretty clear that after several years of commentary and hype, concept search has become something of a buzzword associated with many myths and misconceptions. In an effort to better understand what concept search is and how it can help in e-discovery, I want to dispel two of the most common myths I have heard.

The “Concept Search is Concept Search” Myth

The first myth around concept search actually revolves around what it is.  In my experience, people tend to lump two different technologies together when talking about concept search: concept search and concept categorization.  It’s very common, for example, to see commentators say concept search even when what they are really talking about is concept categorization.  To make matters more confusing, people also use a plethora of other names including content search, content clustering or concept clustering when what they really mean is concept categorization.

So, what are the differences between concept search and concept categorization? First, let’s start with concept search. Concept search technologies find documents containing “concepts”. I think that the Sedona Conference’s “Best Practices Commentary on the Use of Search & Information Retrieval Methods in E-Discovery” provides a good definition of “concept” when used in a search context: “the combination of [a] query term and the additional terms identified by the thesaurus.” In other words, concept search technologies find documents containing a specified term or additional terms with similar meanings derived from a thesaurus.
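To make that definition a little more concrete, here is a minimal sketch in Python of thesaurus-based query expansion. Everything in it is an assumption for illustration: the toy thesaurus, the sample documents, the function names, and the choice to return a document if it contains any term in the expanded concept. Real concept search products use far richer thesauri and indexes.

```python
import re

# Hypothetical toy thesaurus: query term -> related terms (illustrative only).
THESAURUS = {
    "car": {"automobile", "vehicle"},
    "lawsuit": {"litigation", "suit"},
}

def expand_query(term):
    """Build the 'concept': the query term plus its thesaurus terms."""
    term = term.lower()
    return {term} | THESAURUS.get(term, set())

def tokenize(text):
    """Lower-case the text and split it into a set of words."""
    return set(re.findall(r"[a-z0-9']+", text.lower()))

def concept_search(term, documents):
    """Return ids of documents containing any term in the concept."""
    concept = expand_query(term)
    return [doc_id for doc_id, text in documents.items()
            if concept & tokenize(text)]

# Made-up sample documents.
documents = {
    "email-1": "The automobile was delivered late.",
    "email-2": "Lunch on Friday?",
    "email-3": "Please review the car lease.",
}

print(concept_search("car", documents))  # ['email-1', 'email-3']
```

A plain keyword search for “car” would find only the third message; the expanded concept also picks up the first, which never uses the word “car” at all.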

Concept categorization, on the other hand, is actually not a search technology at all. Concept categorization technologies do not “find” documents. Rather, they categorize or group documents based on their similarity. There are many different ways to group documents based on similarity. Techniques include statistical analysis (which assesses similarity based on word frequency), Bayesian classification (which weights words differently depending on factors in addition to statistical frequency, such as where the terms appear in a document), and semantic indexing (which takes into account the fact that many words used in a similar context may have a similar meaning). Describing these technologies in detail would take more space than I have here, but the Sedona commentary has a good summary of them if you are interested in learning more.
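As a rough illustration of the simplest of these, the statistical approach, the Python sketch below groups documents whose word frequencies are similar. It is not how any particular product works: cosine similarity, the greedy grouping rule, the 0.5 threshold, and all function names are assumptions chosen to keep the example short, and it does not attempt Bayesian classification or semantic indexing.

```python
import math
import re
from collections import Counter

def word_counts(text):
    """Count how often each word appears in the text."""
    return Counter(re.findall(r"[a-z0-9']+", text.lower()))

def cosine_similarity(a, b):
    """Similarity of two word-count vectors, from 0.0 to 1.0."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

def group_documents(documents, threshold=0.5):
    """Greedy grouping: add each document to the first group whose first
    member is similar enough; otherwise start a new group."""
    groups = []
    for doc_id, text in documents.items():
        counts = word_counts(text)
        for group in groups:
            if cosine_similarity(counts, group["seed"]) >= threshold:
                group["ids"].append(doc_id)
                break
        else:
            groups.append({"seed": counts, "ids": [doc_id]})
    return [group["ids"] for group in groups]

# Made-up sample documents.
documents = {
    "a": "patent license royalty terms",
    "b": "royalty terms for the patent license",
    "c": "lunch menu for friday",
}

print(group_documents(documents))  # [['a', 'b'], ['c']]
```

Notice that nobody supplied a query: the documents are simply sorted into piles, which is the key difference from concept search.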

As should now be apparent, these technologies are very different, and using the same words to describe them is confusing. It’s not surprising, then, that a lot of the users of e-discovery services and software don’t have a strong understanding of what these technologies are or what benefits they can actually provide in practice. Dispelling the myth that they can be lumped together is a critical first step in any conversation about concept search and how it can help in e-discovery. This leads us to a second myth: that concept search is better than keyword search. I’ll discuss this in my next blog post.

E-Discovery In The Press

Thursday, October 2nd, 2008

Last month, for the first time, friends of mine who do NOT work in the legal industry started talking to me about e-discovery. In the past, they had always taken on the glazed look of a bored 8th-grader whenever I spoke about what I do. But suddenly, they were strangely interested and full of questions.

The reason was two articles about e-discovery in the mainstream media which appeared within a week of each other. The first was in the Wall Street Journal, which wrote about how tech firms are at war with lawyers. According to the Journal, the fact that companies are saving money by using e-discovery software is bad news for lawyers, since they are “facing the loss of lucrative client fees.” In response, the lawyers are fighting back: “The attorneys counter that there are pitfalls to replacing them. Early this year, a federal judge required chip maker Qualcomm to pay rival Broadcom more than $8 million after it failed to uncover and share emails relevant to a case.”

I am sure there are lawyers who see technology as a threat, but the firms I deal with are actively embracing e-discovery technology, not fighting it. They see it as another way they can add value to their clients, and would prefer to have their staff focused on practicing law, not mindlessly reading irrelevant documents. So I ended up spending a lot of time explaining to my non-legal friends that there are two sides to the coin. As for my friends who do happen to be lawyers, they focused on the Qualcomm case, pointing out (as we have written before) that the problem was not technology, but rather poor processes and bad judgment on the part of the attorneys concerned.

The second article appeared in the Economist and took a different tack. It argued that the stratospheric cost of e-discovery is gumming up the court system and preventing justice from being served. According to one former justice from Colorado quoted in the article, even mundane landlord-tenant disputes “are now digital wars of attrition”; there are “cases that are settled only because one party cannot afford the costs of e-discovery”; and, many “plaintiffs cannot afford to sue at all, for fear of the e-discovery costs.”

I love the Economist’s tongue-in-cheek style and thought the article made many valid points. My one disappointment was that its spin was unequivocally negative, as though e-discovery is a self-inflicted wound on the American judicial system. Nowhere was there mention of the fact that electronic evidence often helps litigants get at the truth. Rather than incomplete recollections or “he said-she said” claims and counter-claims, there’s no disputing an email that captures a person’s words and actions in black-and-white. Nor was there any mention of how technology is solving the problems that it inadvertently created: today, there are many products that rapidly sift through electronic information, dramatically lowering the cost of e-discovery.

It is great for everyone in the e-discovery community for our domain to get more ink in mainstream, quality publications. I expect that the trend will continue as the industry grows, and especially once the investigations start into our current financial meltdown.

If You Think E-Discovery Does Not Matter, Think Again

Thursday, September 27th, 2007

In my experience, e-discovery does not make the radar screen of most corporate General Counsels (GCs). Typically, it is one of many issues left to others (e.g., Chief of Litigation, Director of Litigation Support) within the GC’s group. That may change after the recent verdict in the case of Broadcom vs. Qualcomm.

See below for the story, as told by Corporate Counsel in their October issue, with additional commentary from me [added in brackets]:

Collateral Damage

After a string of punishing legal defeats, Qualcomm Incorporated has switched general counsel. On August 13 the company announced that Carol Lam would replace Louis Lupin as its legal chief [Sounds like he got fired]. The move came a week after a federal judge issued a scorching order accusing Qualcomm and its outside lawyers of “gross litigation misconduct.” [Sounds like a pretty good reason why he got fired]

Emily Kilpatrick, Qualcomm’s director of corporate communications, says Lupin is leaving for personal reasons [Isn’t that what they always say?]. “He has been an outstanding leader and contributor to Qualcomm’s success over the past 12 years,” according to Kilpatrick. “However, he has decided to step down as general counsel and take a personal leave.” [a decision most likely made at the request of his boss]

Lam, who was hired in February to supervise Qualcomm’s worldwide litigation, will take over as interim GC, according to a company statement. Lam is one of the U.S. Attorneys fired by the U.S. Department of Justice this past winter. [oh, the irony…]

Based in San Diego, Qualcomm licenses semiconductor technology and system software to cell phone makers. For several years it’s been engaged in a pitched battle with rival Broadcom Corporation over who has infringed whose patents.

Qualcomm’s biggest problems have come in a case in San Diego federal district court. In January a jury ruled that the company had violated Broadcom’s patents. But even before the verdict, Qualcomm suffered a major setback as the trial drew to a close. One of the company’s witnesses revealed the existence of email that Broadcom said should have been produced during discovery. [Yet again, email is the smoking gun]

In April general counsel Lupin and one of Qualcomm’s outside attorneys sent letters of apology to the court, saying they failed to do a detailed enough keyword search of the company’s email. [No big deal, right? After all, we are saying sorry]

But that wasn’t enough for Judge Rudi Brewster, who has been hearing the San Diego case. On August 6 he issued a blistering 54-page ruling. He accused Qualcomm not only of failing to turn over more than 200,000 pages of relevant email and electronic documents during discovery, [i.e., this is a case of a deeply flawed e-discovery process, not of a simple missing email] but of engaging in a years-long campaign to deliberately mislead a technological standards body. Brewster ordered Qualcomm to pay Broadcom’s litigation costs, and voided two of its patents. (David Rosmann, vice president of intellectual property litigation at Broadcom, estimates that its fees could be around $10 million). [The legal costs alone are several times what it would have cost Qualcomm to purchase an e-discovery solution and avoid this whole situation in the first place]

In a statement, Qualcomm said it “respectfully disagrees” with Brewster’s ruling and intends to appeal. “Qualcomm acknowledges the seriousness of the court’s findings and reiterates its previous apology to the court for the errors made during discovery and for the inaccurate testimony of certain of its witnesses,” the statement read. [We said sorry, isn’t that enough for you guys?]

The company’s problems aren’t over, however. Federal magistrate judge Barbara Major is now considering whether to levy sanctions against Qualcomm’s attorneys. [Don’t think you can hide behind your deep-pocketed employer. If you screw up e-discovery, it will be your neck on the line] Major has given “any and all…attorneys who signed discovery responses, signed pleadings and pretrial motions, and/or appeared at trial on behalf of Qualcomm” until September 21 to file a statement explaining why they shouldn’t be penalized. [For the lawyers in question, it’s guilty unless their arguments convince the judge they are innocent]

From Web 2.0 To E-Discovery 2.0

Thursday, April 19th, 2007

If there’s one idea that has captivated Silicon Valley in the past 3 years, it is Web 2.0. People may debate its meaning and definition, but the gist of it is clear: a handful of powerful forces have coalesced to make the internet of today fundamentally different to what it was 5 years ago. Opinions vary on which of these forces is most important: the growth of broadband to the home; open source, Ajax and other technologies which lower the cost and increase the functionality of web applications; the power of community in a world where more people are on the web. Whichever you choose, there is no doubt that collectively these forces have had a huge impact, powering the growth of now-household names such as Google, MySpace, and YouTube.

I believe that an analogous set of changes is transforming the way companies do e-discovery. Ten years ago, e-discovery was an after-thought – a necessary, but incidental, part of corporate legal expenses. Today, it is a huge line-item in the legal budget, a headache for corporate IT, and the foundation upon which many cases are built.

E-discovery 1.0 was an ad hoc activity; e-discovery 2.0 is a core business process. E-discovery 1.0 was barely noticed; e-discovery 2.0 is driving the news cycle, affecting everyone from Intel to the US Attorney General. In the legal world, e-discovery 2.0 has had every bit as big an impact on enterprises as Web 2.0 has had on the dating lives of teenagers.

What happened? A series of fundamental changes have made e-discovery far more important, expensive, and complex than it was in the 1990s. Chief among these changes are:

1. Email, Not Voicemail: In the past 10 years, companies have switched from voicemail to email as the primary way they communicate. This has created a written record where none previously existed. Just as oral histories eventually die out, every voicemail eventually gets deleted; but emails and the written word live forever. What’s more, the convenience and time-efficiency of email make it addictive, with the result that every meaningful conversation is captured, time-stamped, and attached to a person’s name. Given that many legal cases turn on intent, and proving who knew what when, this makes email a virtual treasure trove for anyone building a case.

2. Electronic Files, Not Paper: Electronic files are fundamentally different to paper documents: they reproduce like rabbits and are far cheaper to store. For example, one laptop is the equivalent of 2,000 boxes of paper; one server corresponds to 8,000-40,000 boxes of paper. The number of servers and laptops holding vast quantities of email is only increasing as the cost of hard disk storage falls, down from $2.04 per GB in 2004 to $0.77 per GB in 2006. Net net: going electronic has vastly increased the amount of data that must be analyzed as part of the discovery process.

3. Sooner, Not Later: Recent changes to the FRCP guidelines have moved e-discovery up in the process, forcing companies to have an e-discovery plan within 99 days of a suit being filed. Since disputes rarely settle that quickly, that means enterprises must now incur the expense of e-discovery on every case, not just the small number that actually make it to court. The result is a massive increase in e-discovery expenses and workload.

Anecdotal evidence of e-discovery 2.0 is everywhere. A few years back, no one would have guessed that every major analyst firm would have people dedicated to tracking e-discovery. Nor would you have expected to find a litigation support manager at every major enterprise.

So what exactly is e-discovery 2.0? Well, I will talk about that in a future post.

Go Ahead, Sue Me!

Wednesday, April 4th, 2007

It is a truism to say that it is easier to dispense advice than to follow it, and with good reason. How many venture capital firms practice the financial discipline they preach to their portfolio companies? How many management consulting companies employ the innovative management theories they advocate to their clients? And how many technology companies actually leverage leading-edge technology to solve their own business problems?

The answer, at least based on my experience, is “not very many”. For example, if you look at Silicon Valley’s leading technology companies, the vast majority do not have an e-discovery solution in place. Yes, there are some exceptions but for the most part, when it comes to e-discovery, the likes of eBay, Google, Yahoo, and (until recently) Intel have preferred to muddle through with manual, error-prone, expensive processes.

The justifications are typically the same. Some technology companies argue that they don’t need a legal discovery solution because theirs is not a litigious industry; others say they delete everything off their Exchange servers within three weeks and so don’t have any email to discover; all agree that things like email and document retention policies are needlessly bureaucratic.

The danger of this “we-don’t-need-car-insurance-because-we-will-never-have-an-accident” approach has been brutally exposed in the past few weeks by the painful experience of Intel. In case you missed the press coverage: AMD sued Intel for anti-trust violations. Like any company on the receiving end of a subpoena, Intel was obliged to provide opposing counsel with all email and documents relevant to the case.

If Intel had an e-discovery solution, that would have been a straightforward process. Intel’s IT group would simply identify a group of messages by date range, person, and perhaps keyword within their larger email archive. The legal group would then use an analysis product to cull down the messages to only those relevant to the case. The whole thing would take a few days. But that’s not what happened. Since Intel did not have an e-discovery solution, the company had no simple way to preserve and analyze the relevant data. Intel’s legal department was obliged to inform over a thousand employees that they could no longer delete data at will. Somewhere along the line, the message did not get through and employees kept on deleting. As a result, Intel was forced to go back to the judge with the proverbial “the dog ate my homework” defense, while AMD cried foul.

How much this will cost Intel is yet to be determined. But my guess is that they will end up spending more on lawyers to fix the mess than they would have spent on an e-discovery solution that would have avoided the problem to begin with.

While I have given up on venture capitalists and management consultants, I remain optimistic that the technology industry will practice what it preaches and leverage technology to solve its own business problems in e-discovery. As Intel discovered, it is not enough to have smart lawyers on staff. You also need to equip them with an e-discovery solution that allows them to preserve and analyze information relevant to the case.

To do otherwise is an open invitation to your competitors to sue you. Just ask Larry Ellison – or better yet, SAP.