
What is E-Discovery 2.0?

Saturday, May 26th, 2007

In a previous post, I wrote about the forces transforming e-discovery, a phenomenon that has received increasing attention from the press, most recently in this week’s Economist magazine. While everyone agrees that something big has changed, and (generally speaking) on the reasons why, people struggle to put their finger on exactly what e-discovery has become.

That’s why I think the concept of “E-Discovery 2.0” is so helpful. Analogous to Web 2.0, E-Discovery 2.0 is a set of new processes, technologies, and services that enable companies to manage huge volumes of data, lower costs, and meet tight deadlines.

New Processes

When e-discovery meant handing over a few boxes of paper, companies did not need much of a process. But in today’s world, where it involves terabytes of data, teams of reviewers, and precious little time, it is a very different story. To cope with the growing volume and complexity of e-discovery issues, companies have had no choice but to adopt new processes. These include:

  • Collect and Preserve: Most companies have now established procedures so that, when the need arises, they can collect all data relevant to a case and ensure that it cannot be changed or deleted.
  • Analyze Up Front: When presented with more work than can be done, a company’s only option is to work smarter, not harder. That means analyzing the collected data up front, to cull it down to only those emails and documents directly relevant to the case at hand.
  • Collaborate Efficiently: E-Discovery has become a team sport. And whenever you have a team, you need a playbook, or a process, to ensure work is not repeated and that everyone is marching towards the same goal.

New Technologies

If technology created this problem, by making electronic communication so pervasive and voluminous, then it can also solve it. In recent years, several new technologies have arisen that enable companies to store and sift through their data to fulfill e-discovery obligations. The most significant of these trends include:

  • From tape to disk: As the cost of disk storage has continued to decline, more and more companies are abandoning tapes and instead keeping their data online. Email archiving software optimizes for storage efficiency, allowing companies to keep hundreds of terabytes of data readily available for e-discovery.
  • From search to analysis: Basic keyword search has evolved into sophisticated analysis technology that mines email metadata for relevance, links messages together into discussion threads, and groups them by topic. These analysis applications allow users to sift through millions of messages in minutes, to rapidly identify, tag, and export relevant data.
  • From closed systems to open standards: Until recently, technology providers made no effort to integrate their applications, leaving customers to fend for themselves. But that has started to change. Symantec Enterprise Vault and HP RISS now have open APIs, creating pressure on others to follow suit. George Socha’s Electronic Discovery Reference Model (EDRM), a standards body, has received widespread support, accelerating progress towards creation of an open e-discovery platform.

To anyone working in litigation support, legal, or information security, all this is quite unremarkable. Of course they use technology to address e-discovery. Obviously, there has to be a process. From the company’s perspective, e-discovery has become no different to HR or finance – it is a core competency, part of doing business.

And that, perhaps, is the most remarkable thing about E-Discovery 2.0 – in only a few short years, it has become so widespread and deeply entrenched within the enterprise, that people barely notice it.

The White House And The Problem of A Billion Emails

Sunday, May 13th, 2007

The other day, Michael Clark of EDDix sent me a fascinating academic paper (thanks, Michael!) about “information inflation” and its impact on the legal system. I had never really thought of it this way, but there have really only been 3 significant events in the evolution of information:

  1. Writing (c. 5,000 years ago): Pre-historic man started to etch his markings on clay tablets, stone, wax, papyrus, bark, cloth, wood, paper, cave walls and anything else that came to hand.
  2. Printing (c. 1450): Gutenberg’s movable type printing press enabled mass production of information, contributing to (among other things) the Renaissance and the Scientific Revolution.
  3. Digitization (c. late 20th Century): The personal computer, wide area networks, the internet, and email have all led to a massive explosion of information in the past 50 years. As the article points out, “close to 100 billion emails are sent daily…In a small business, whereas formerly there was usually 1 four-drawer file cabinet full of paper records, now there is the equivalent of 2,000 four-drawer file cabinets full of such records, all contained in a cubic foot or so in the form of electronically stored information.”

How can the legal profession cope, given that a lawyer’s job is often to synthesize this mind-boggling amount of data? Fortunately, the authors have a solution:

“A family of computer technology employing new types of search methods and techniques beyond use of mere keywords should now be considered for use in litigation….Litigators can no longer depend on manual review alone. It is too time-consuming and expensive – with cost often exceeding the amounts in dispute.”

To illustrate its point, the paper tells the story of the White House and the problem of a billion emails. During the Clinton administration, the White House agreed to a form of electronic record keeping called ARMS (Automated Records Management System). At the end of each administration, these records are handed over to the National Archives and Records Administration (NARA). The table below shows the number of stored emails NARA has, or expects to receive at the end of each administration.

Now assume that, like previous administrations, the Next President’s administration is subject to a lawsuit that requires e-discovery. The paper calculates:

“Without employing any automated computer process to generate potentially responsive documents, the review effort for this litigation would take 100 people, working 10 hours a day, 7 days a week, 52 weeks a year, over 54 years to complete. And the cost of such a review, at an assumed billing rate of $100/hour, would be $2 billion. Even, however, if present day search methods are used to initially reduce the email universe to 1% of its size (i.e., 10 million documents out of 1 billion), the case would still cost $20 million for a first pass review conducted by 100 people over 28 weeks, without accounting for any additional privilege review.”
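The arithmetic in that passage can be checked with a short sketch. The review rate of 50 emails per reviewer-hour is an assumption: the quoted passage does not state it, but it is the rate the quoted totals imply. All names below are illustrative.

```python
# Sanity check of the review-effort figures quoted above.
# ASSUMPTION: DOCS_PER_HOUR = 50 is not stated in the quoted passage;
# it is the review rate that makes the quoted totals come out.

REVIEWERS = 100
HOURS_PER_DAY = 10
DAYS_PER_WEEK = 7
WEEKS_PER_YEAR = 52
RATE_USD_PER_HOUR = 100
DOCS_PER_HOUR = 50  # assumed review speed per reviewer


def review_effort(num_docs):
    """Return (person_hours, cost_usd, calendar_weeks) for a manual review."""
    person_hours = num_docs / DOCS_PER_HOUR
    cost_usd = person_hours * RATE_USD_PER_HOUR
    team_hours_per_week = REVIEWERS * HOURS_PER_DAY * DAYS_PER_WEEK
    calendar_weeks = person_hours / team_hours_per_week
    return person_hours, cost_usd, calendar_weeks


full = review_effort(1_000_000_000)  # every email
culled = review_effort(10_000_000)   # after search cuts the set to 1%

print(f"Full review: ${full[1]:,.0f}, {full[2] / WEEKS_PER_YEAR:.1f} years")
print(f"1% review: ${culled[1]:,.0f}, {culled[2]:.1f} weeks")
```

Under that assumed rate, the full review comes to $2 billion and just under 55 years, and the 1% review to $20 million and roughly 28.6 weeks, which agrees with the figures the paper reports.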

This is a great example of why companies and government agencies are adopting e-discovery 2.0 technologies that go far beyond keyword search. In the face of information inflation, what choice do they have?

From Web 2.0 To E-Discovery 2.0

Thursday, April 19th, 2007

If there’s one idea that has captivated Silicon Valley in the past 3 years, it is Web 2.0. People may debate its meaning and definition, but the gist of it is clear: a handful of powerful forces have coalesced to make the internet of today fundamentally different to what it was 5 years ago. Opinions vary on which of these forces is most important: the growth of broadband to the home; open source, Ajax, and other technologies that lower the cost and increase the functionality of web applications; the power of community in a world where more people are on the web. Whichever you choose, there is no doubt that collectively these forces have had a huge impact, powering the growth of now-household names such as Google, MySpace, and YouTube.

I believe that an analogous set of changes is transforming the way companies do e-discovery. Ten years ago, e-discovery was an after-thought – a necessary, but incidental, part of corporate legal expenses. Today, it is a huge line-item in the legal budget, a headache for corporate IT, and the foundation upon which many cases are built.

E-discovery 1.0 was an ad hoc activity; e-discovery 2.0 is a core business process. E-discovery 1.0 was barely noticed; e-discovery 2.0 is driving the news cycle, affecting everyone from Intel to the US Attorney General. In the legal world, e-discovery 2.0 has had every bit as big an impact on enterprises as Web 2.0 has had on the dating lives of teenagers.

What happened? A series of fundamental changes have made e-discovery far more important, expensive, and complex than it was in the 1990s. Chief among these changes are:

1. Email, Not Voicemail: In the past 10 years, companies have switched from voicemail to email as the primary way they communicate. This has created a written record where none previously existed. Just as oral histories eventually die out, every voicemail eventually gets deleted; but emails and the written word live forever. What’s more, the convenience and time-efficiency of email makes it addictive, with the result that every meaningful conversation is captured, time-stamped, and attached to a person’s name. Given that many legal cases turn on intent, and proving who knew what when, this makes email a virtual treasure trove for anyone building a case.

2. Electronic Files, Not Paper: Electronic files are fundamentally different to paper documents: they reproduce like rabbits and are far cheaper to store. For example, one laptop is the equivalent of 2,000 boxes of paper; one server corresponds to 8,000-40,000 boxes of paper. The number of servers and laptops holding vast quantities of email is only increasing as the cost of hard disk storage falls, down from $2.04 per GB in 2004 to $0.77 per GB in 2006. Net net: going electronic has vastly increased the amount of data that must be analyzed as part of the discovery process.

3. Sooner, Not Later: Recent amendments to the Federal Rules of Civil Procedure (FRCP) have moved e-discovery up in the process, forcing companies to have an e-discovery plan within 99 days of a suit being filed. Since disputes rarely settle that quickly, enterprises must now incur the expense of e-discovery on every case, not just the small number that actually make it to court. The result is a massive increase in e-discovery expenses and workload.

Anecdotal evidence of e-discovery 2.0 is everywhere. A few years back, no one would have guessed that every major analyst firm would have people dedicated to tracking e-discovery. Nor would you have expected to find a litigation support manager at every major enterprise.

So what exactly is e-discovery 2.0? Well, I will talk about that in a future post.