Posts Tagged ‘analysis’

What a Difference a Year (or Two) Makes in Electronic Discovery

Thursday, August 5th, 2010

August just wouldn’t be August without lazy days at the beach spent playing in the sand, frolicking in the surf, and immersing yourself in the LTN executive summary of the latest Socha-Gelbmann Electronic Discovery report (in this case, the hot-off-the-presses 2010 edition).

Even with the lure of the big waves beckoning you out into the water, if you follow electronic discovery you likely have a hard time pulling yourself away from the report, and this year is no exception. In fact, this year’s report is especially insightful, as George and Tom seem to have done a particularly impressive job of getting the pulse of not just what’s going on in the law firm and service provider parts of the market, but the enterprise as well.

This is a big change from just a couple of years ago. Go back and review the executive summary from 2008, and you’ll notice a very different feel to the findings. In 2008, much of the talk was around the dynamics of the service provider market, with relatively little discussion of trends related to the e-discovery process and technological innovation in the space. In 2008, it felt like e-discovery was something you had other people do for you: the word “consumer” appeared 12 times in the executive summary. In 2010, two short years later? Just five times. Why? The language may be telling. “Cost” appeared seven times in the 2008 report. In the 2010 report? 16… more than twice as often.
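For anyone who wants to repeat this kind of word counting on their own copy of the summaries, here is a minimal sketch. It assumes whole-word, case-insensitive matching, which may not be how the counts above were tallied, and the sample text is a made-up stand-in rather than a quote from either report.

```python
import re


def count_term(text: str, term: str) -> int:
    """Count whole-word, case-insensitive occurrences of `term` in `text`."""
    pattern = r"\b" + re.escape(term) + r"\b"
    return len(re.findall(pattern, text, flags=re.IGNORECASE))


# Hypothetical sample text, not taken from either report.
sample = "Cost matters. Lower cost per case means consumers see costs fall."

print(count_term(sample, "cost"))      # plurals such as "costs" are not counted
print(count_term(sample, "consumer"))  # "consumers" is not counted either
```

Whether plurals should count is a judgment call; dropping the trailing `\b` from the pattern would include them.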

What seems to have happened is that the recession has been something of a refining fire for the electronic discovery market. In order to reduce costs and manage risks, enterprises are behaving much less like consumers and more like real customers with skin (and money) in the game. Not surprisingly, they’ve gotten extremely aggressive about bringing innovative cost-containing measures to bear on the process. Socha and Gelbmann highlight three:

  • More targeted preservation and collection of ESI
  • More focused review and analysis of the data
  • More effective use of technology to speed up the efforts, improve quality, and reduce costs

This is great news for innovative software companies in the e-discovery space – and their customers. What one would expect to occur in a maturing market is that it would move from a period of rapid innovation to a lower-innovation, consolidation phase. However, that’s not the case here. While there is consolidation occurring, what’s remarkable about e-discovery right now isn’t really all the acquisition press releases in your Twitter feed (mainly from vendors saddled with prior-generation point solutions who are trying to acquire their way toward a complete offering). Rather, it’s how leading enterprises are increasingly seeking, and finding, cutting-edge solutions to solve cost, efficiency, and risk management problems associated with e-discovery that simply weren’t available prior to the meltdown.

As in-house legal and IT e-discovery spending starts to gain steam, look for enterprises purchasing in-house solutions to demand many of the innovations that have been developed over the last couple of years (most of which are highlighted by the Socha-Gelbmann survey):

  • Targeted collection: Products better able to strategically target the collection of ESI, rather than attempting to boil the ocean, are more suited to the mindset and approach of cost-conscious enterprises
  • Iterative discovery: Products that are able to provide “to the left” functionality while still providing enterprise-class, intuitive processing, analysis, review, and production functionality
  • Support for small and big cases: In discussing “small is the new big”, Socha and Gelbmann highlight how “the aggregate of small cases dwarfs the combined large cases.” Successful products must simultaneously handle high numbers of smaller cases while still scaling to the largest matters
  • Integrated analytics: Products must bring to bear powerful analytics across all stages of the e-discovery process, focused not just on document review, but also looking at aggregates of data from many different angles and allowing you to see the big picture across the entire case for effective information and cost management

Is the EDD space maturing? Yes, as Socha and Gelbmann rightfully point out. But it’s doing so in surprising, innovative ways that, when it’s all over, may well prove to be a silver lining to the cloud of challenges the industry has faced over the last two years.

Automated Review in Electronic Discovery Re-Visited

Monday, June 28th, 2010

Almost two years ago I wrote one of my first blog posts entitled “Review-less E-Discovery Review.”  Despite the tongue twister of a title, the post posited that “there is a very real possibility that we’re on the cusp of computers taking over a significant e-discovery task for attorneys.” I’d like to take a look and see how much (if at all) my prognostications have materialized.

A cynic might think that this is the moment where E-Discovery 2.0 jumps the shark.  But no, this isn’t one of those sitcom episodes where they flashback to previous shows as an easy way to recycle content.  Instead, it seems useful to see how the legal market has evolved from a litigation workflow perspective, particularly with some vendors touting the benefits of review-less technologies like predictive coding.

In the original blog, I noted that there was a “scenario where a non-manual review methodology may make sense” (while importantly noting that “this approach is not without risk”).  Since my last post there has been the successful adoption of Evidence Rule 502, which makes this methodology (at least conceptually) safer.

But again (imagine dreamy flashback mode), here were the guidelines I previously proffered:

  1. Large data set.  This may sound a bit obvious, but a non-manual approach is best suited for large, unwieldy data sets.  The corpus doesn’t need to be in the terabytes, but the data set should be evaluated in terms of discovery processing costs and attorney review estimates.
  2. Short Production Timelines.  Once the above calculations are conducted, the next step is to determine if a human based review could even conceivably be conducted in the given time frame.  In many instances, an eyes-on review process just won’t be feasible since there won’t be enough bodies to throw at the problem.
  3. Next Gen “PAR” Tools.  In order to pull this “review-less” review process off, both safely and quickly, the responding party needs to have access to fast, robust processing, analysis and review (“PAR”) tools.  Certainly, it’s possible to have this scenario work with an e-discovery service provider, if they have the capability.
  4. Relatively Small Amount in Controversy.  For the time being, this approach should not be considered for any “bet the company” litigation, nor anything with significant downside risk (governmental inquiries, punitive damages, class actions, 2nd requests, etc.).  Yet, for many standard commercial lawsuits, corporate investigations, HR claims, etc. this review-less approach may be worth considering.
  5. Ability to Use a Clawback Provision.  Entering into a clawback provision with the opposition is mandatory in this methodology since the chances of an inadvertent production are statistically ever-present.  Yet, until Evidence Rule 502 is resolved, there will always be a risk that the clawback won’t be enforceable against 3rd parties.
  6. Non-governmental Production.  Most information in governmental productions becomes part of the public record, meaning that a clawback isn’t going to be feasible.  Here, trade secret information, personally identifiable data and the like would be disastrous if pushed out into the public domain.

The goal of this post is to see if this dog is any more ready to hunt than it was two years ago.  The short answer (right now) appears to be: No.

We all know that litigators are both risk averse and generally slow to adopt new technology approaches.  This is particularly true when there’s a perception that they won’t have insight into the technological black box behind automated coding/tagging decisions.  Litigators are understandably sensitive about the ability to prove up the reasonability of their search and review processes.  This “reasonableness” requirement lines up both with the Victor Stanley requirements and FRE 502(b), which eliminates the chance of a waiver only “if the holder of the privilege or work product protection took reasonable precautions to prevent disclosure.”

Given this ongoing hesitancy, the question remains: shouldn’t we be seeing more movement in automated review than the glacial progress that’s been achieved to date, particularly with the known shortcomings of the eyes-on review process?  Most are familiar with the 1985 STAIRS study by Blair and Maron where the percentage of relevant documents lawyers thought they had found using Boolean keyword searches was 75% – when the percentage they actually found was 20%.

But, despite its known deficiencies, eyes-on review falls into the “go with the devil you know” mindset that often makes sense when dealing with judges and juries who aren’t likely to grok newer-fangled approaches.

In addition to these high-level, almost dogmatic challenges, there is one other tactical element I’d add to my previous list (of 6 factors).

7. All documents processed up-front (no rolling collection). I’ve heard some in-the-trenches e-discovery experts claim that they’ve never had a case that didn’t involve at least some level of incremental data collections.  Whether this is an overstatement is immaterial.  The fact is that a large number of e-discovery projects involve ESI that is collected (and then processed) in dribs and drabs.  This is often a good thing, largely attributable to the incremental (start slowly) nature of a well-thought-out e-discovery project where a smaller number of initial custodians are processed, then ECA is conducted and only then is the additional ESI added to the corpus.  This common methodology causes some significant heartburn for a review-less methodology since the ever-changing nature of the corpus makes it difficult, if not impossible, for a sample to be truly extensible to what will eventually be the entire data set.  For this reason, the review-less approach should be limited to cases where the entire corpus is collected and processed at once.

In sum, the seven foregoing factors appear to still be largely valid and create an environment where an automated, review-less methodology will only make sense in a relatively rare set of circumstances.  This may change in the future, but given the risk-averse DNA of most litigators I can’t imagine this tipping point happening any time soon.

Go With the (Work)flow in Electronic Discovery

Thursday, June 10th, 2010

Recently, I attended a conference in Washington DC with a large number of government agencies, including (I must confess) many Clearwell customers like the Department of Health and Human Services, the Department of Homeland Security, and the Veterans Administration. It will probably come as no surprise that, during our conversations, it became abundantly clear that they had substantial electronic discovery technology needs. Many were still reviewing PST files manually in Outlook; others were TIFFing millions of pages of documents prior to directly loading into a traditional review application for eyes-on review. That’s right, nary a trace of early case assessment, transparent search, or culling to be found.

Sadly, no news there. What was fascinating for us was the reaction to the latest release of the Clearwell E-Discovery Platform, Version 5.5. Version 5.5 contains significant new functionality, including dramatically increased performance and scalability along with a number of substantial processing, analysis, review, and production enhancements. But, in addition to these features, we have rolled out a set of e-discovery best practices templates designed to make it vastly easier for organizations to implement a formal e-discovery methodology that builds on the integrated nature of our platform. And it was the prospect of such a methodology, even more than the technology, that people were buzzing about at the summit.

Why? With all of the activity going on in the e-discovery space around product and technology innovation, there was some strong feedback that process and methodology may have gotten lost in the shuffle. And, if you think about it, it’s process and methodology that are likely to be most carefully assessed when the courts are considering the reasonableness (or lack thereof) of e-discovery for a case.

The importance of putting process and methodology front and center (along with a commitment to making the necessary organizational changes to make it happen) is not exactly a new concept. Ralph Losey has been talking about it for years over on his groundbreaking and irreverent e-discovery team blog, and it’s a frequent topic of keynote speakers on the e-discovery lecture circuit. However, like eating your vegetables or exercising, putting in place the right e-discovery process in an organization is something that people realize the benefit of, but still ignore.

This cannot continue, as the stakes are escalating. Take the recent case of Mt. Hawley Ins. Co. v. Felman Prod., Inc. Dean will dive into this case in much greater detail in an upcoming post, but it is very relevant to the methodology versus technology discussion in that it highlights how a methodology problem can cause a fateful technology problem to be overlooked. In this case, a lack of sufficient quality control processes caused the plaintiff to inadvertently produce a number of privileged emails. The court found the inadvertent production was not “solely attributable” to a problem with a Concordance index, and that the plaintiff “failed to perform critical quality control sampling” to determine whether the production was appropriate. Privilege was waived.
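To make “quality control sampling” a little more concrete, here is a minimal sketch of one way to pull a random second-look sample from a production set before it goes out the door. It is not the process at issue in Mt. Hawley or a feature of any particular product; the document IDs, the sample size, and the `is_privileged` check are all hypothetical stand-ins for a real production and a real reviewer’s judgment.

```python
import random


def draw_qc_sample(doc_ids, sample_size, seed=None):
    """Return a simple random sample of document IDs for a second-look check."""
    rng = random.Random(seed)           # seeded so the draw can be reproduced later
    ids = list(doc_ids)
    k = min(sample_size, len(ids))      # never ask for more documents than exist
    return rng.sample(ids, k)


def qc_hit_rate(sample_ids, is_privileged):
    """Fraction of sampled documents that the check flags as privileged."""
    if not sample_ids:
        return 0.0
    hits = sum(1 for doc_id in sample_ids if is_privileged(doc_id))
    return hits / len(sample_ids)


# Hypothetical production set and a stand-in for the reviewer's privilege call.
production = [f"DOC-{i:05d}" for i in range(1, 1001)]
flagged = {"DOC-00010", "DOC-00500"}

sample = draw_qc_sample(production, 100, seed=42)
rate = qc_hit_rate(sample, lambda doc_id: doc_id in flagged)
print(len(sample), rate)
```

Recording the seed and the sampled IDs alongside the result is what makes the check something you can later show a court, which is the point of the methodology discussion here.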

What’s the solution? We believe that we’re on to something with Clearwell 5.5, in that we can, uniquely among e-discovery products, marry together methodology and technology in a single platform that allows for the entire e-discovery process to be documented and defended, end-to-end. We have particularly focused on the most critical part of the process which seems to come up over and over again in sanction and privilege waiver decisions, which is the way that an organization moves from an initial pool of documents to a set of defensibly-culled, potentially responsive documents, on through to tagging and production. Our unique workflow capabilities allow the entire process to be documented and instantly recalled with the click of a mouse, letting you see each decision that was made during the course of the case in a step-by-step fashion, and then to structure additional quality control audits on top of those decisions to ensure that every “i” is dotted and every “t” crossed.

It’s a good thing for everyone involved in litigation that e-discovery technology is maturing rapidly to the point where it can start to help solve these sorts of process problems rather than being the cause of them (as was unfortunately the case in Mt. Hawley). This is a major focus for us at Clearwell and you’ll see a lot more exciting news from us on this front over the next few months, so stay tuned!

EMC Acquires Kazeon For $75 million To Round-Out SourceOne Archiving & E-Discovery Solution

Tuesday, September 1st, 2009

“Large storage vendor buys small electronic discovery software company to round-out broader corporate initiative.” That was the story in December 2007, when Seagate bought e-discovery company Metalincs for its i365 solution; and, it’s the same story today as EMC announced its acquisition of Kazeon for its SourceOne archiving solution. The terms of the EMC-Kazeon deal were not disclosed, but sources with knowledge of the transaction tell me that the acquisition price is approximately $75 million. That’s slightly less than what Seagate paid for Metalincs ($82 million), and less than what FTI Consulting paid for Attenex ($88 million). But it’s well within the usual range of $50-100 million that most acquirers pay for technology that has not yet matured into a business.

The deal will come as a relief to Kazeon’s long-suffering shareholders. The company was founded in 2003 and, over the past 6 years, it raised over $60 million in equity financing, double the amount it usually takes successful software companies to reach profitability. But despite all that investment, revenue has been hard to come by. According to former Kazeon employees, the company’s revenue totaled only $7 million over the past 12 months. Perhaps as a result, there’s been a lot of management turnover, and last year the board retained a recruiter to find a new CEO. In light of all that, selling the company for $75 million, or 10 times trailing revenue, is a great outcome for Kazeon’s shareholders. It also provides some level of job security for Kazeon’s employees, many of whom have been offered retention bonuses to stick around.

On the other side of the coin, the deal also makes sense for EMC, which needed to flesh out SourceOne, its recent re-branding of the Email Extender archive. In launching SourceOne in April 2009, EMC described it as an integrated portfolio of products: SourceOne Email Management for email archiving; Discovery Manager for legal holds of email; Celerra and Centera for storage; and Discovery Collector for identifying and collecting data from desktops and file shares. EMC owned all of those products except one: Discovery Collector, which instead was to come from EMC Select Partner, StoredIQ. It is widely known that EMC tried repeatedly to acquire StoredIQ but was rebuffed. So instead, it purchased Kazeon (i.e., the Kazeon Information Server) so that it now owns all aspects of SourceOne and does not have to rely on partners.

Will this eDiscovery deal be successful? We will have to wait and see, but Seagate’s experience is not encouraging. A year after it acquired Metalincs, Seagate laid off most of the staff and hired UBS to help it sell what was left of the electronic discovery company. There have not been any takers.

Foreign Corrupt Practices Act (FCPA) Drives Increased Electronic Discovery Overseas

Tuesday, May 5th, 2009

Ask a European about e-discovery, or e-disclosure as it is called in the UK, and you will often be met with a look of distaste. Much like SUVs or obesity, electronic discovery is viewed as an unpleasant, uniquely American phenomenon. But, in reality, there are fat people in Paris, Range Rovers all over London, and a lot of electronic discovery happening all across Continental Europe – whether people like to admit it or not.

One reason for that is the Foreign Corrupt Practices Act (FCPA). This US law, which has inspired similar legislation in other countries, prohibits companies from engaging in corruption, such as bribing government officials to win large contracts. That sounds simple enough, but it’s not always easy to do. For example, an American friend of mine runs a travel website in China. To advertise, he hired people to hand out flyers at all the major train stations. But after a few weeks, his employees began to get hassled by station officials who said they needed an official “permit”. So he did what anyone would do and paid the “permit fees” even though no paperwork for this “permit” was ever produced. When his US auditors looked at that, they immediately cried foul. He was then compelled to end the practice and bring in a law firm to conduct a full FCPA investigation. The result: lots of legal bills, no more advertising in train stations, and a more powerful Chinese-run competitor who has no such qualms about paying “permit fees”.

In speaking to Daniel Dorsky, Tyco’s Compliance Counsel and an expert in FCPA issues, I discovered that my friend’s experience is no longer the exception. From what Daniel described, enforcement of the FCPA has been stepped up dramatically in the past couple of years. Apparently, 2007 was the watershed. Prior to that, no one really worried about the FCPA too much. But two years ago, the Department of Justice (DoJ), under Mark Mendelsohn, began to take a different approach. First, the fines became much stiffer: Baker Hughes, for example, got hit with a $44 million penalty, by far the largest ever at the time. Second, the DoJ started to prosecute executives personally, bringing 15 criminal cases against individuals. Nothing focuses the mind like the threat of jail time, and FCPA compliance suddenly took on greater urgency.

The number of FCPA enforcement actions continued to increase in 2008, most notably with the infamous Siemens case. By the time the dust settled, the CEO of Siemens had been fired and the company was reeling from a $1.4 billion fine. Nor do things look like they are slowing down in 2009. In the first few months of this year, ABB took an $800 million accounting reserve for FCPA issues, Halliburton got fined $177 million, KBR $502 million, and the KBR CEO, Albert Stanley, got 7 years in jail to go along with his $11 million personal fine. These companies are also now vulnerable to civil suits. While there’s no private right of action under the FCPA, that does not stop securities fraud class actions or shareholder lawsuits, which charge that defendants either understated the risks or overstated the controls in their disclosures.

There are a number of reasons why FCPA enforcement actions will likely increase further in the coming months and years. The FBI recently created an FCPA taskforce of 8-12 agents, bringing all the standard law enforcement tools to FCPA compliance (e.g., wire-taps, subpoenas, informants, warrants, etc.). Many other countries are starting to enforce similar laws, with much encouragement from the US which does not want to see American businesses disadvantaged by doing the right thing. And international law enforcement agencies are cooperating more than ever before. For example, last summer in Paris, international agencies held their first FCPA conference to share information.

All of this is driving a boom in e-discovery as General Counsels and Compliance Officers regularly conduct investigations of their overseas subsidiaries to ensure FCPA compliance. These investigations often center on “red flag” countries like China, Brazil, or Russia, where compliance is most difficult. They almost always involve outside counsel, and require the processing, analysis and review of large volumes of electronic information. This applies to European companies as much as it does to American ones. Non-US nationals can be prosecuted if either communications or money goes via the US, and many European countries are following the DoJ’s lead (e.g., $600 million of Siemens’ $1.4 billion fine came from German authorities).

So no matter how Europeans feel about e-discovery, or e-disclosure, they will be doing more of it in the coming years, much like their American counterparts. It’s fair to say that, in this domain, as perhaps in others, Europeans and Americans have much more in common than they might think.

Cutting Through The Confusion: A Buyer’s Guide To Electronic Discovery Software

Sunday, April 19th, 2009

Over the past 4 years, I have had hundreds of conversations with corporate counsel and “legal IT”, meaning technical folks charged with supporting the legal team. More and more of them are looking to lower their costs by bringing e-discovery in-house. But as they work through that process, there’s one question that consistently comes up, even today – namely, “When [insert name of software company] says they “do” e-discovery, what exactly does that mean?”

There has been progress towards answering this question, thanks mainly to the analyst community. George Socha and Tom Gelbmann’s EDRM framework has been immensely helpful in breaking down electronic discovery into its component steps. Other analysts, like Debra Logan at Gartner, were quick to embrace the framework, prompting every software provider to follow suit. As a result, there is today a common language that everyone uses to describe the e-discovery process.

The Electronic Discovery Reference Model (EDRM) breaks down the e-discovery process into a series of steps. Companies looking to buy e-discovery software to lower costs typically map different software products to each of these steps, to make sure that they cover the entire process.

But having a universally-agreed framework is only half the answer. To eliminate customer confusion, there also needs to be agreement on how different software products fit into the framework. This is especially important since there is no single, end-to-end solution for e-discovery which covers all aspects of EDRM. So customers are forced to think about how different software solutions fit together. And that is where things begin to fall apart.

Many software vendors feel it is advantageous to claim that they do everything, even though they do not. Customers are rightly suspicious of those claims, and so press vendors to provide more detailed information – hence the question, “when you say you do e-discovery, what exactly does that mean?”

In light of that, how can litigation support teams, corporate counsel, or legal IT people figure out which e-discovery solution best meets their needs? From observing this decision-making process hundreds of times, I have found 3 simple steps are incredibly helpful.

Step 1: Read the analyst reports

Two reports in particular make for required reading. One is Gartner’s MarketScope Report, which is available for free at certain sites; the other is the 451 Group’s recent e-discovery report, which is summarized in a publicly available presentation. The helpful thing about the 451 Group’s report is that it tells you which software companies do which parts of the EDRM process. You do have to buy the report to get the full picture (it’s well worth it!), but the publicly available presentation will give you a flavor for their analysis, and I have drawn from that presentation in the figure below:

Analyst firms like the 451 Group map software vendors to the EDRM framework according to what they actually do, which is often different from what software vendors claim they do.

The 451 Group’s analysis highlights several important points. First, it shows that there is no single end-to-end solution. Even the products of giants like EMC (SourceOne), HP (IAP), and IBM (CommonStore) only solve one piece of the puzzle, information management. Second, it shows that customers have choices at each stage of the EDRM process. For example, to solve the problem of identification, collection, and preservation of electronic information, customers can choose from solutions as diverse as Guidance EnCase (forensic collection), Index Engines (back-up tapes) and Mimosa NearPoint (email archive). Third, it provides an independent assessment of what vendors do, as opposed to what they may claim. For example, Kazeon claims analysis and review capabilities, whereas the report shows its product does identification, collection, and preservation; Recommind claims its Axcelerate eDiscovery and MindServer products do processing, whereas the report finds that they do not.

Step 2: Evaluate the products prior to purchase

Just as anyone would test-drive a car prior to purchase, it’s critical to test-drive e-discovery software. Any vendor should be willing to provide their software free of charge for an evaluation on-premise. The most effective evaluations are when the customer uses the product themselves, either on a live case or test data. This is far preferable to just sending the data to the vendor who then loads it into their system, as in that scenario there are too many opportunities for the vendor to hide their product’s shortcomings.

Step 3: Check references carefully

The trick with references is to insist on relevant references. It’s not good enough for the vendor to dredge up some random person who says nice things; or even a credible knowledgeable person who is using the product in a completely different way. For example, if a company is happy with Autonomy’s IDOL for enterprise search, that does not tell you much about what Autonomy might be like for e-discovery. What really counts are references from other customers who are using the product for the same application that you are.

All this can sound like a lot of work, but I have seen people go through the process in as little as a month, and be much happier for it. A little work up front can save a lot of time (and heart-ache!) later on.

Time to Work Together on Electronic Discovery

Friday, February 27th, 2009

Cheesy Successories posters aside (for an alternative take, go here), the need to work together is much more than just a cliché in today’s environment.

In its recent brief on the five major trends that will shape business technology in 2009, leading management consultancy McKinsey and Company noted one trend in particular which highlights the urgent need for an organization’s IT and legal groups to forge better, faster, and more efficient ways of collaborating on electronic discovery issues:

Regulators demand more from IT

Government scrutiny of business will intensify in many developed countries. Already, in the United States, the Office of the Comptroller of the Currency weighs in on the resiliency of banking systems, the Food and Drug Administration (FDA) requires that many pharmaceutical systems be “validated,” and Sarbanes-Oxley drives decisions about accounting systems in every industry. In the future, policy makers and regulators will probably demand that IT systems capture more and better data in order to gain greater insight into and control over how banks manage risk, pharma companies manage drugs, and industrial companies affect the environment. Government officials also will monitor many legal and business rules more closely to ensure compliance with mandates. Successful CIOs should enhance their relationships with internal legal and corporate-affairs teams and be prepared to engage productively with regulators. They will need to seek solutions that meet government mandates at manageable cost and with minimal disruption.

- McKinsey Quarterly, February 2009

The current economic environment is creating a “Double Whammy” within almost every enterprise that has ongoing or pending electronic discovery issues (and are there many organizations left out there that don’t?):

  • As the McKinsey article notes, regulators will increasingly be demanding more from IT as government scrutiny of business intensifies. Just look at the just-launched recovery.gov site to see the level of transparency and accountability that the government is aiming for with regard to the stimulus package. The bailout will not directly affect every business, but there is a new sheriff in town who will likely set the tone across the entire business landscape.
  • At the same time, there is relentless pressure on controlling costs. When times are tough, dollars that can be saved on the expense side are much more valuable than top-line revenue, since 100% of every dollar of cost savings goes directly to the bottom line.
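The arithmetic behind that second point is worth a quick sketch. Assuming, purely for illustration, that a company keeps 25 cents of profit on each dollar of revenue, the snippet below works out how much new revenue it would take to match a given amount of cost savings. The margin and the dollar figure are hypothetical, not numbers from the McKinsey brief.

```python
def revenue_equivalent(cost_savings: float, profit_margin: float) -> float:
    """Revenue needed to add the same profit as `cost_savings`.

    `profit_margin` is profit per dollar of revenue, e.g. 0.25 for 25%.
    """
    if not 0 < profit_margin <= 1:
        raise ValueError("profit_margin must be in the range (0, 1]")
    return cost_savings / profit_margin


# Hypothetical figures: $100,000 saved at a 25% profit margin.
print(revenue_equivalent(100_000, 0.25))
```

The thinner the margin, the more revenue a dollar of savings is worth, which is why cost reduction looks so attractive when sales are hard to come by.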

The net-net: Enterprises will be forced to do more, with less.

How? With regard to electronic discovery, there is a lot of low-hanging fruit to be picked in the area of IT and legal cooperation:

  • In-house legal teams should meet with IT (if they aren’t already) to help them better understand the nature of electronic discovery, particularly as it applies to the more “upstream” parts of the process (specifically, identification, preservation, and collection) which IT tends to be more responsible for. Through a better understanding of the nature of electronic discovery, IT can improve its ability to find the right documents, avoiding over-collection and reducing downstream processing costs. In addition, new electronic discovery technologies are making it increasingly easy for legal to own more of the process, reducing the electronic discovery burden on IT.
  • Conversely, IT should coordinate with in-house legal teams to provide advice and mentoring as legal seeks to bring e-discovery platforms in-house to assist with early case assessment, search, culling, and analysis. To many legal teams, bringing e-discovery in-house may seem like a daunting proposition, but enterprise software has been around for a long time, and learning from IT’s experiences can make the process far less intimidating.

Yes, regulators are going to be far more demanding in the future than they have been in the past. But some simple collaboration and coordination between IT and legal will go a long way toward lightening the regulatory burden, especially as it pertains to electronic discovery.

E-Discovery 911: Reducing Enterprise Electronic Discovery Costs in a Recession

Friday, February 20th, 2009

In today’s economy, controlling electronic discovery costs has taken on a new urgency.  Because the financials of many companies have deteriorated so quickly, there is great interest in finding methods to reduce any costs in the short-term.  As  a result, anyone in a company’s IT or legal department that comes up with a plan to substantially reduce their company’s electronic discovery costs in the short-term is likely to become a hero in their company.  So, what’s the best way to reduce electronic discovery costs quickly?

A natural first step is to decide where to focus.  Which electronic discovery activities are the most costly today?  Which have the greatest room for cost reductions?  The EDRM model serves as a good guide for answering such questions by breaking electronic discovery activities into Information Management, Identification, Collection, Preservation, Processing, Analysis, Review, Production and Presentation.  One thing I have noticed when interacting with enterprises is that the IT and legal departments tend to focus on different stages within electronic discovery based on their perspective.  IT managers naturally concentrate on the information management, identification, collection and preservation activities because these are the activities in which they are most involved.  Similarly, legal managers naturally look to preservation, processing, production and review.

Given these different perspectives, it’s important to take an objective approach to calculating electronic discovery costs. Doing so is not that easy. Costs can vary significantly depending on the company, the nature of the case, the nature of the data, which vendors and technologies are used, and a variety of other factors. Costs also come in many different forms: direct hard dollar costs, such as spending on legal and electronic discovery fees delivered by third parties; indirect hard dollar costs, such as time spent by company employees; and soft dollar costs, such as increased risk that could lead to adverse judgments and sanctions. Finally, electronic discovery costs are often buried across both legal operating budgets and IT budgets, making it hard to separate these costs from the costs of other activities.

Undertaking an internal analysis to understand your company’s electronic discovery costs is a valuable activity if you want to better control these costs.  However, while costs do vary between companies, most companies will find that the same activities contribute the most direct hard dollar costs and that these are the costs that are easiest to control in the short-term.  To demonstrate this, let’s walk through a generic cost analysis of a typical case.  Fortunately, we don’t have to start from scratch in doing this.  Leonard Deutchman, an author of several excellent electronic discovery articles, has already done most of the work in a May 2007 article, “Get Ready for the Rules Changes, Part VIII“.  In this article, Mr. Deutchman walks the reader through a hypothetical litigation between an Investor and a Venture Capital firm.  He describes the typical electronic discovery activities and calculates the direct hard dollar costs for these activities including:

  • Collection: Mr. Deutchman calculates that it costs $10k to collect 400GB from 8 hard drives and the data of 8 custodians on file and email servers using an outside vendor (doing it in-house can be less expensive).  Note that this excludes any collection from back-up tapes, which can be more costly.
  • Culling & Processing: it costs $4k to reduce the 400GB to 90GB by removing non-relevant file types prior to processing.  Processing 90GB costs $90k at $1000/GB.  De-duplication and the application of search terms reduce the data to 25GB.
  • Production: it costs $4k to produce to the other side the 4GB of data that is deemed responsive and not privileged.

Mr. Deutchman doesn’t identify direct hard dollar costs for Information Management, Identification or Preservation.  These activities are typically not associated with direct hard dollar costs on a per matter basis.  Rather, they involve indirect hard dollar costs such as employee time and software licenses.  Mr. Deutchman also does not provide an estimate for the costs of review.  However, since review does contribute significant direct hard dollar costs for every matter, this gap needs to be filled in order to get a complete sense of the direct hard dollar costs.  The two big buckets of cost in review are: attorney review costs and review software costs.  In Mr. Deutchman’s hypothetical litigation one might imagine the following scenario for these costs:

  • 25GB translates into 195,000 documents, using the low end of the industry survey ranges available from EDRM for documents per GB of email (9,000/GB) and documents per GB of files (7,000/GB). This example assumes that 40% of the 25GB is email.
  • The attorneys reviewing the data charge $75/hour and make 100 document decisions per hour.  This translates to approximately $146,000.
  • The hosted review service costs $50/GB/month and, in this case, let’s assume we host it for 6 paid months.  This costs $7,500.
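The arithmetic behind these three bullets can be sketched in a few lines of Python. This is only an illustration of the scenario's stated assumptions (the 40% email split, the low-end documents-per-GB figures, the $75/hour rate, 100 decisions per hour, and $50/GB/month hosting); the constant and function names are made up for the example.

```python
# Assumptions copied from the hypothetical scenario above.
TOTAL_GB = 25
EMAIL_PERCENT = 40            # 40% of the 25GB is assumed to be email
EMAIL_DOCS_PER_GB = 9_000     # low-end estimate for email
FILE_DOCS_PER_GB = 7_000      # low-end estimate for files
RATE_PER_HOUR = 75            # attorney review rate in dollars per hour
DECISIONS_PER_HOUR = 100      # document decisions per attorney hour
HOSTING_PER_GB_MONTH = 50     # hosted review price in dollars per GB per month
HOSTED_MONTHS = 6


def review_costs():
    """Return (document count, attorney cost, hosting cost) for the scenario."""
    email_gb = TOTAL_GB * EMAIL_PERCENT / 100
    file_gb = TOTAL_GB - email_gb
    documents = email_gb * EMAIL_DOCS_PER_GB + file_gb * FILE_DOCS_PER_GB
    attorney_cost = documents / DECISIONS_PER_HOUR * RATE_PER_HOUR
    hosting_cost = TOTAL_GB * HOSTING_PER_GB_MONTH * HOSTED_MONTHS
    return documents, attorney_cost, hosting_cost


documents, attorney_cost, hosting_cost = review_costs()
print(f"Documents to review: {documents:,.0f}")
print(f"Attorney review:     ${attorney_cost:,.0f}")
print(f"Hosted review:       ${hosting_cost:,.0f}")
print(f"Total review:        ${attorney_cost + hosting_cost:,.0f}")
```

Changing any one constant (say, a faster decision rate or a higher hourly rate) shows quickly how sensitive the review total is to that assumption.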

If we tabulate these costs and calculate the direct hard dollar cost shares for each stage, the clear take-away is that Processing and Review costs comprise the vast majority of direct hard dollar costs.  Collection and Production direct hard dollar costs are significantly smaller in comparison.

EDRM Stage | Hard Dollar Costs ($k) | Share
Collection | 10 | 4%
Processing | 94 | 36%
Review | 153 | 58%
Production | 4 | 2%
Total | 261 | 100%
Total for Processing & Review | 247 | 94%
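For readers who want to rerun the tabulation with their own figures, here is a minimal Python sketch. The dollar amounts (in $k) are the ones used in this post, with Processing taken as the $4k culling plus $90k processing; the variable and function names are invented for the illustration. The script prints unrounded percentages, so they can differ by a point from the whole-number shares shown in the table.

```python
# Direct hard dollar costs per EDRM stage, in thousands of dollars,
# as used in this post's generic example.
costs_k = {
    "Collection": 10,
    "Processing": 4 + 90,   # culling plus processing
    "Review": 153,          # attorney review plus hosted review, in $k
    "Production": 4,
}

total_k = sum(costs_k.values())


def share(amount_k):
    """Percentage of the total direct hard dollar cost."""
    return 100 * amount_k / total_k


for stage, amount_k in costs_k.items():
    print(f"{stage:<12}{amount_k:>6}{share(amount_k):>8.1f}%")

combined_k = costs_k["Processing"] + costs_k["Review"]
print(f"{'Total':<12}{total_k:>6}{share(total_k):>8.1f}%")
print(f"Processing + Review: {combined_k} ({share(combined_k):.1f}%)")
```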

Now, it’s possible to come up with many arguments for why Mr. Deutchman’s or my estimates could be high, including different assumptions for attorney hourly review costs, higher document decision rates, cheaper vendor pricing, etc. Similarly, it’s possible to come up with many arguments for why the estimates could be low, including the need to perform multiple review passes, slower document decision rates, more expensive vendor charges, etc. In addition, each company will have its own unique circumstances that will change this picture. However, this generic analysis strongly suggests that more customized analyses would come to the same conclusion: if you want to reduce electronic discovery costs quickly, then you need to focus on processing and review costs. One can also imagine that even if you were to use some form of activity-based costing to allocate indirect hard dollar costs on a per matter basis, it would likely not change the importance of Processing and Review costs.

What does this mean for IT and legal managers in corporations? These kinds of analyses make it pretty clear that, even though they are more involved in the Information Management, Identification, and Collection phases of electronic discovery, IT managers need to focus more on helping the legal team optimize Processing and Review activities. You are not going to get the biggest bang for your buck in the short-term by trying to reduce costs in Information Management, Identification, Preservation, and Collection. Similarly, legal managers need to work more closely with IT in order to focus on how to reduce processing and review costs.

So, the obvious question coming out of such an analysis is what’s the best way to reduce Processing and Review costs?  We’ll discuss this issue in a future post.

In the meantime, tell me what you think by participating in our first e-discovery 2.0 poll.  See the sidebar here: Which Phase of Electronic Discovery Do You Think is the Most Costly?

Concept Search Versus Keyword Search in Electronic Discovery

Wednesday, November 12th, 2008

In my last post, I started a discussion on the myths surrounding concept search.  The first myth I dispelled was the “concept search is concept search” myth.  The myth is that there is an agreed upon definition of concept search.  In actuality, when people in e-discovery use the term concept search, they don’t always mean the same thing.  Frequently they are not actually talking about concept search technology at all and are actually talking about concept or content categorization technology, which is very different.  The second myth that needs dispelling is that concept search is better than keyword search.

The thinking behind this myth goes something like this:

Keyword search has a lot of problems.  It is prone to being over-inclusive, i.e., finding some non-relevant documents, and under-inclusive, i.e., not finding some relevant documents.  Concept search technologies are new and interesting and using these technologies you can find documents that keyword search can’t find.  Therefore, concept search must be better than keyword search.

Let’s examine this thinking.  The first two statements are accurate.  Keyword search is not perfect and can produce over- and under-inclusive results.  And concept search and content categorization technologies can both help identify documents that keyword search technologies might not find.  However, the conclusion that concept search is better than keyword search is not valid and doesn’t follow from these two statements.  Why?

In order to answer this question, we first need to go back to the difference between concept search and content categorization. Because these are different technologies, we really need to separately compare concept search versus keyword search and content categorization versus keyword search.  Let’s start with content categorization and keyword search.

The issue with this comparison is that keyword search and content categorization do different things.  Keyword search can be used in many ways in e-discovery.  The two most common are: (1) analysis or case assessment: finding the hot documents and understanding the matter by determining who knew what, when, how and why, etc., and (2) culling: removing non-responsive documents and/or identifying potentially privileged documents in order to reduce a large, starting set of documents to a smaller set before review.

Content categorization, on the other hand, has historically been used within the review phase of e-discovery.  Categorization can help reviewers to better understand the documents they are reviewing and thus potentially increase the speed of review.  Practitioners with whom I have worked also find that categorization can be useful during analysis by helping to understand a matter and identify potentially important keywords.

However, content categorization has not been used as part of culling. First, culling needs to be transparent. You need to be able to get agreement with, or at least explain to, the opposing side and the court exactly how you have culled the data set. If you cull based on categories of documents that have been generated by a proprietary, black-box algorithm, it’s going to be difficult to gain agreement on or explain your culling methodology. This is why the typical method of culling is still to use keyword search and either to agree on the set of search terms with the opposing side or to use e-discovery search best practices to perform keyword searches on your own.

Second, content categorization has its own issues when it comes to being over- and under-inclusive. There is no guarantee that the group of documents categorized as being related to, for example, a company’s hiring policies includes all of the documents in your matter related to hiring policies, or that it does not include some documents that are not really related to hiring policies. Content categorization, like keyword search and virtually every information retrieval technology, is not perfect.

So what about concept search technology?  Surely, concept search technology is better than old, boring keyword search.  Well, actually it’s not that clear-cut.  The problem with concept search technology is that while it might find more relevant documents than plain keyword search, it will also likely find more false positives.  Imagine searching for documents containing “terminate” in an employment matter and your concept search technology automatically searching for “fire”, “dismiss”, etc. as well.  You’ll find more documents related to the termination of employees, but you’ll also find a lot more non-relevant documents concerning house fires, the fire department, etc.
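The trade-off is easy to see on a toy example. The Python sketch below is not how any real concept search product works; it simply stands in for one with a hard-coded, hypothetical table of related terms (`RELATED_TERMS`) and a six-document made-up corpus, then counts relevant hits and false positives for the plain keyword and for the expanded query.

```python
import re

# Hypothetical expansion table standing in for a concept search engine.
RELATED_TERMS = {"terminate": ["fire", "dismiss"]}

# Made-up corpus: (text, is it relevant to the employment matter?)
DOCUMENTS = [
    ("We need to terminate his contract by Friday.", True),
    ("HR decided to fire the regional manager.", True),
    ("The board voted to dismiss the CFO.", True),
    ("The fire department inspected the warehouse.", False),
    ("Insurance claim for the house fire was filed.", False),
    ("Quarterly sales figures attached.", False),
]


def search(terms):
    """Return the documents containing any of the terms as a whole word."""
    alternatives = "|".join(re.escape(term) for term in terms)
    pattern = re.compile(rf"\b({alternatives})\b", re.IGNORECASE)
    return [(text, relevant) for text, relevant in DOCUMENTS if pattern.search(text)]


def summarize(hits):
    """Return (relevant hits, false positives)."""
    relevant = sum(1 for _, is_relevant in hits if is_relevant)
    return relevant, len(hits) - relevant


keyword_hits = search(["terminate"])
concept_hits = search(["terminate"] + RELATED_TERMS["terminate"])

print("keyword only (relevant, false positives):", summarize(keyword_hits))
print("with expansion (relevant, false positives):", summarize(concept_hits))
```

In this toy corpus the expanded query picks up the two extra relevant documents, but it also pulls in both of the "fire" documents that have nothing to do with employment.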

So concept search can help address the under-inclusive problem with keyword search (though it won’t solve it) and can be helpful during analysis. But it can often increase the over-inclusive problem. In addition, today’s concept search technologies share the transparency problem with content categorization. These technologies have largely been designed as “black boxes”, which, as I have discussed in the past, makes sense for enterprise search but not for e-discovery search, and, as a result, could also be potentially difficult to explain and defend. For these reasons, concept search technology isn’t used very much in e-discovery today. In order for its use to become widespread, it will need to become more transparent. But that’s a topic for another day.

The bottom line here is that despite all the hype, concept search and content categorization technologies do not solve all the challenges of e-discovery search.  Both of these technologies can be very useful and the technology behind them is always improving.  However, as most of the experienced practitioners I work with already know, these technologies are generally better thought of as supplements to keyword search, not replacements.  The important question is not whether to use one technology over the other but which technology is best suited to your objectives and how best to use all the available technologies to achieve the desired goal.

“Aggressive Culling”: The E-Discovery Buzz Cut

Tuesday, September 30th, 2008

Ralph Losey, never one to mince words, recently analyzed a litigation survey from the elite Fellows of the American College of Trial Lawyers. The survey highlights the fact that one of the main problems facing the U.S. legal system today is (surprise!) e-discovery. Also (not) a surprise is that the study “places the blame squarely on poor rules, bad law, and judges”, while overlooking the role that lawyers play in the problem.

In his analysis, Ralph makes a number of insightful observations that should help lawyers move from being e-discovery troublemakers to being part of the solution. However, one of his key critiques is targeted not at lawyers but rather at the vendor community: “[E-discovery] is too expensive because lawyers and judges do not know what they are doing, and do not know how to properly cull and review email, and because clients are disorganized pack-rats. Many of the e-discovery vendors are also misinformed, but often they do know better; they just have no pecuniary interest in aggressive culling. Some may even seek to line their own pockets in inflated discoveries.”

As Ralph bluntly points out, pecuniary interests (translation: money) play a big role here, but so does risk reduction. Imagine you’re given the opportunity to process a 2 terabyte case all the way through to review. With the “funnel” of e-discovery costs placing the highest dollar per gigabyte value on the end of the process (i.e. review), what’s your incentive to cull aggressively at the beginning? Not much from a revenue perspective, certainly, but also not much from a risk perspective: particularly when you have sanctions and lawsuits on your mind and are thinking about the potential liability that you incur by excluding potentially relevant documents by using too broad a brush (or pair of garden clippers) in your pruning.

How do we move forward? As document volumes continue to grow, it’s clear that aggressive culling (with a few caveats which we’ll get to in a minute) is a critical tool for managing costs and improving case outcomes (let’s go out on a limb and define “improving” as producing fairer and more equitable rulings). However, in order to adopt more aggressive culling as a standard part of the electronic discovery process, the community has to come to terms with three things:

  • The Myth of Perfection: There may be perfect abs, but there is no perfect e-discovery. Organizations like the E-Discovery Institute are doing fantastic work to measure and improve the accuracy of electronic discovery efforts, but in the end it’s tough to make the argument that having 100 contract attorneys manually reviewing 10 million documents will necessarily produce a better overall e-discovery outcome than 10 specialized attorneys reviewing 200,000 documents that were aggressively (but thoughtfully) culled from the initial 10 million document set. There simply is no black and white set of rules that will lead to a perfect process.
  • The Benefit of Cost Control: Given that, it is in the best interest of everyone involved (yes, even vendors) to choose the most cost-effective process that provides a high likelihood of producing the information relevant to the case. This means “saving your bullets” by not spending all of your e-discovery dollars up front in a case pursuing the perfection myth, but instead approaching discovery in an incremental fashion which can adapt to changing facts and circumstances as the matter unfolds. How, you may ask, do vendors benefit? They can become more strategic e-discovery advisors by working with counsel over the full lifecycle of a case, providing higher-value (and, by the way, more interesting and intellectually challenging) consulting services to help incrementally adjust and adapt the course of e-discovery. As Ralph puts it: “…Trial lawyers should accept that specialists in the field of e-discovery are a necessary evil. If an e-discovery specialist knows the field, they can save you money and take you out of the e-discovery morass faster and more reliably than a dozen new rules. The world today is too complex for one man or woman to do it all.”
  • The Value of Defensibility: Many of you likely winced at the term “high likelihood” in the previous point. “Sacrilege!” you cried. “I demand certainty!” First, go back and re-read the first point about the Myth of Perfection. Then, consider that a better way forward may be an approach to e-discovery that involves more aggressive culling early in the process to focus on the most important documents first, more iterations to adapt to changing facts and circumstances, and, all along the way, a complete audit trail that provides defensibility in the event that any aspect of the process is ever questioned. Such defensibility would include specific documentation about the culling decisions that were made, down to the keyword and “sub-keyword” (i.e. wildcard expansion) level, so all the cards are on the table for everyone to see.  The value of defensibility when performing aggressive culling is enormous, in that it adds an additional measure of safety and trust to the process, minimizing the amount of doubt and second-guessing that so often plagues e-discovery negotiations.
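To make the idea of documenting culling decisions down to the “sub-keyword” level a little more concrete, here is a minimal Python sketch. It is an illustration only, not a description of any particular e-discovery product: the `cull` function, the document IDs, and the sample texts are all hypothetical, and the wildcard matching simply uses the standard library's `fnmatch` patterns.

```python
import fnmatch
import re


def cull(documents, search_terms):
    """Keep documents matching any wildcard term and record why.

    Returns (kept_ids, audit), where audit maps each search term to the
    actual words it expanded to, and each word to the documents it hit.
    """
    audit = {term: {} for term in search_terms}
    kept = set()
    for doc_id, text in documents.items():
        # Lowercase the text and split it into distinct words.
        words = sorted(set(re.findall(r"[a-z0-9']+", text.lower())))
        for term in search_terms:
            pattern = term.lower()
            for word in words:
                if fnmatch.fnmatchcase(word, pattern):
                    audit[term].setdefault(word, []).append(doc_id)
                    kept.add(doc_id)
    return sorted(kept), audit


# Hypothetical three-document data set.
documents = {
    "DOC-001": "Plans to terminate the vendor agreement",
    "DOC-002": "Termination letter attached",
    "DOC-003": "Lunch menu for Friday",
}

kept_ids, audit = cull(documents, ["terminat*"])
print("Kept:", kept_ids)
for term, expansions in audit.items():
    for word, doc_ids in expansions.items():
        print(f"{term} -> {word}: {doc_ids}")
```

The point of the `audit` structure is that anyone questioning the cull can see not just the agreed search term but every word it actually matched and which documents each match brought in.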

By coming to terms with the fundamental imperfections of the e-discovery process and embracing the promise of lower costs and the agility and responsiveness that can be gained with a more iterative approach, everyone stands to gain from the safe and controlled adoption of aggressive culling – yes, even the vendors (at least the smart ones) and their ever-present pecuniary interests.