24h-payday

Archive for the ‘predictions’ Category

Why Half Measures Aren’t Enough in Predictive Coding

Thursday, July 26th, 2012

In part 2 of our predictive coding blog series, we highlighted some of the challenges in measuring and communicating the accuracy of computer predictions. But what exactly do we mean when we refer to accuracy? In this post, I will cover the various metrics used to assess the accuracy of predictive coding.

The most intuitive method for measuring the accuracy of predictions is to simply calculate the percentage of documents the software predicted correctly.  If 80 out of 100 documents are correctly predicted, the accuracy should be 80%. This approach is one of the standard methods used in many other disciplines. For example, a test score in school is often calculated by taking the number of questions answered correctly, dividing that by the total number of questions on the test, then multiplying the resulting number by 100 to get a percentage value. Wouldn’t it make sense to apply the same method for measuring the accuracy of predictive coding? Surprisingly, the answer is actually, “no.”

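This approach is problematic because in eDiscovery the goal is not to determine the number of all documents tagged correctly, but rather the number of responsive documents tagged correctly. Let’s assume there are 50,000 documents in a case and each document has been reviewed by both a human and the computer, resulting in the human-computer comparison chart shown below.

  1. Human: responsive; computer: responsive – 2,000 documents
  2. Human: responsive; computer: non-responsive – 6,000 documents
  3. Human: non-responsive; computer: non-responsive – 40,000 documents
  4. Human: non-responsive; computer: responsive – 2,000 documents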

Based on this chart, we can see that out of 50,000 total documents, the software predicted 42,000 documents (sum of row #1 and #3) correctly and therefore its accuracy is 84% (42,000/50,000).

However, analyzing the chart closely reveals a very different picture. The results of human review show that there are 8,000 total responsive documents (sum of row #1 and #2) but the software found only 2,000 of those (row #1). This means the computer only found 25% of the truly responsive documents. This is called Recall.

Also, of the 4,000 documents that the computer predicted as responsive (the sum of row #1 and #4), only 2,000 are actually responsive (row #1), meaning the computer is right only 50% of the time when it predicts a document to be responsive. This is called Precision.

So, why are Recall and Precision so low – only 25% and 50%, respectively – when computer predictions are correct for 84% of the documents? That’s because the software did very well predicting non-responsive documents. Based on the human review, there are 42,000 non-responsive documents (sum of row #3 and #4), of which the software correctly found 40,000, meaning the computer correctly identified about 95% (40,000/42,000) of the truly non-responsive documents. While the software found only 25% of the responsive documents, it found about 95% of the non-responsive ones, and because non-responsive documents far outnumber responsive ones (42,000 versus 8,000), the overall predictions across all documents are right 84% of the time.
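The arithmetic in this example is easy to reproduce. The short Python sketch below is only an illustration: the four counts are inferred from the figures quoted in the text (2,000 responsive documents found out of 8,000, 4,000 documents predicted responsive, 40,000 non-responsive documents correctly tagged, 50,000 documents in total), and the variable names are my own rather than anything taken from a particular product.

```python
# Counts inferred from the figures quoted in the post (variable names are illustrative).
both_responsive = 2_000       # row 1: human says responsive, computer says responsive
missed_responsive = 6_000     # row 2: human says responsive, computer says non-responsive
both_non_responsive = 40_000  # row 3: human and computer both say non-responsive
false_responsive = 2_000      # row 4: human says non-responsive, computer says responsive

total = both_responsive + missed_responsive + both_non_responsive + false_responsive

# Share of all documents where the computer agrees with the human reviewer.
accuracy = (both_responsive + both_non_responsive) / total

# Share of the truly responsive documents that the computer found.
recall = both_responsive / (both_responsive + missed_responsive)

# Share of the computer's "responsive" predictions that are actually responsive.
precision = both_responsive / (both_responsive + false_responsive)

# Share of the truly non-responsive documents that the computer tagged correctly.
non_responsive_found = both_non_responsive / (both_non_responsive + false_responsive)

print(f"Total documents:      {total:,}")
print(f"Accuracy:             {accuracy:.1%}")
print(f"Recall:               {recall:.1%}")
print(f"Precision:            {precision:.1%}")
print(f"Non-responsive found: {non_responsive_found:.1%}")
```

Because 42,000 of the 50,000 documents are non-responsive, the strong result on that group dominates the overall accuracy figure and masks the weak Recall.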

In eDiscovery, parties are required to take reasonable steps to find documents. The example above illustrates that the “percentage of correct predictions across all documents” metric may give an inaccurate picture of the number of responsive documents found or missed by the software. This is especially true when most of the documents in a case are non-responsive, which is the most common scenario in eDiscovery. Therefore, Recall and Precision, which accurately track the number of responsive documents found and missed, are better metrics for measuring accuracy of predictions, since they measure what the eDiscovery process is seeking to achieve.

However, measuring and tracking both metrics independently could be cumbersome in many situations, especially if the end goal is to achieve higher accuracy on both measures overall.  A single metric called F-measure, which tracks both Precision and Recall and is designed to strike a balance (or harmonic mean) between the two, can be used instead. A higher F-measure typically indicates higher precision and recall, and a lower F-measure typically indicates lower precision and recall.
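As a rough illustration of how the harmonic mean behaves, here is a minimal Python sketch. The function name `f_measure` and the optional `beta` weighting are my own additions; the post itself only describes the balanced case, which corresponds to `beta = 1` (often called F1). The inputs are the 50% Precision and 25% Recall from the example above.

```python
def f_measure(precision: float, recall: float, beta: float = 1.0) -> float:
    """Weighted harmonic mean of precision and recall.

    beta = 1 gives the balanced F-measure (F1); beta > 1 weights recall
    more heavily. The beta parameter is an assumption added for
    illustration and is not discussed in the post.
    """
    if precision == 0 and recall == 0:
        return 0.0  # avoid dividing by zero when nothing was found
    b2 = beta ** 2
    return (1 + b2) * precision * recall / (b2 * precision + recall)


# Precision and Recall from the 50,000-document example.
f1 = f_measure(precision=0.50, recall=0.25)
print(f"F-measure: {f1:.1%}")
```

With these inputs the balanced F-measure works out to about 33%, which sits closer to the weaker of the two numbers. That is the point of using a harmonic mean instead of a simple average: a high score requires both Precision and Recall to be high.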

These three metrics – Precision, Recall and F-measure – are the most widely accepted standards for measuring the accuracy of computer predictions. As a result, users of predictive coding are looking to solutions that provide a way to measure prediction accuracy on all three. The most advanced solutions have built-in measurement workflows and tracking mechanisms.

There is no standard for Recall, Precision or F-measure percentage. It is up to the parties involved in eDiscovery to determine a “reasonable” percentage based on the time, cost and risk trade-offs. A higher percentage means higher accuracy – but it also means higher eDiscovery costs as the software will likely require more training. For high-risk matters, 80%, 90% or even higher Recall may be required, but for lower-risk matters, 70% or even 60% may be acceptable. It should be noted that academic studies analyzing the effectiveness of linear review show widely varying review quality. One study that compared the accuracy of manual review with technology assisted review found that manual review achieved, on average, 59.3% recall compared with an average recall of 76.7% for technology assisted review such as predictive coding.

LTNY Wrap-Up – What Did We Learn About eDiscovery?

Friday, February 10th, 2012

Now that the dust has settled, the folks who attended LegalTech New York 2012 can try to get to the mountain of emails that accumulated during the event that was LegalTech. Fortunately, there was no ice storm this year, and for the most part, people seemed to heed my “what not to do at LTNY” list. I even found the Starbucks across the street more crowded than the one in the hotel. There was some alcohol-induced hooliganism at a vendor’s party, but most of the other social mixers seemed uniformly tame.

Part of Dan Patrick’s syndicated radio show features a “What Did We Learn Today?” segment, and that inquiry seems fitting for this year’s LegalTech.

  • First of all, the prognostications about buzzwords were spot on, with no shortage of cycles spent on predictive coding (aka Technology Assisted Review). The general session on Monday, hosted by Symantec, had close to a thousand attendees on the edge of their seats to hear Judge Peck, Maura Grossman and Ralph Losey wax eloquent about the ongoing man versus machine debate. Judge Peck uttered a number of quotable sound bites, including the quote of the day: “Keyword searching is absolutely terrible, in terms of statistical responsiveness.” Stay tuned for a longer post with more comments from the general session.
  • Ralph Losey went one step further when commenting on keyword search, stating: “It doesn’t work,… I hope it’s been discredited.” A few have commented that this lambasting may have gone too far, and I’d tend to agree.  It’s not that keyword search is horrific per se. It’s just that its efficacy is limited and the hubris of the average user, who thinks eDiscovery search is like Google search, is where the real trouble lies. It’s important to keep in mind that all these eDiscovery applications are just like tools in the practitioners’ toolbox and they need to be deployed for the right task. Otherwise, the old saw (pun intended) that “when you’re a hammer everything looks like a nail” will inevitably come true.
  • This year’s show also finally put a nail in the coffin of the human review process as the eDiscovery gold standard. That doesn’t mean that attorneys everywhere will abandon the linear review process any time soon, but hopefully it’s becoming increasingly clear that the “evil we know” isn’t very accurate (on top of being very expensive). If that deadly combination doesn’t get folks experimenting with technology assisted review, I don’t know what will.
  • Information governance was also a hot topic, only paling in comparison to Predictive Coding. A survey Symantec conducted at the show indicated that this topic is gaining momentum, but still has a ways to go in terms of action. While 73% of respondents believe an integrated information governance strategy is critical to reducing information risk, only 19% have implemented a system to help them with the problem. This gap presumably indicates a ton of upside for vendors who have a good, attainable information governance solution set.
  • The Hilton still leaves much to be desired as a host location. As they say, familiarity breeds contempt, and for those who’ve notched more than a handful of LegalTech shows, the venue can feel a bit like the movie Groundhog Day, but without Bill Murray. Speculation continues to run rampant about a move to the Javits Center, but the show would likely need to expand pretty significantly before ALM would make the move. And, if there ever was a change, people would assuredly think back with nostalgia on the good old days at the Hilton.
  • Despite the bright lights and elevator advertisement trauma, the mood seemed pretty ebullient, with tons of partnerships, product announcements and consolidation. This positive vibe was a nice change after the last two years when there was still a dark cloud looming over the industry and economy in general.
  • Finally, this year’s show also seemed to embrace social media in a way that it hadn’t in years past. Yes, all the social media vehicles were around before, but this year many of the vendors’ campaigns seemed to be much more integrated. It was funny to see even the most technically resistant lawyers log in to Twitter (for the first time) to post comments about the show as a way to win premium vendor swag. Next year, I’m sure we’ll see an even more pervasive social media influence, which is a bit ironic given the eDiscovery challenges associated with collecting and reviewing social media content.

The Top Ten “What NOT to Do” List for LegalTech New York 2012

Thursday, January 26th, 2012

As we approach LegalTech New York next week, oft referred to as the Super Bowl of legal technology events, there are any number of helpful blogs and articles telling new attendees what to expect, where to go, what to say, what to do. Undoubtedly, there’s some utility to this approach, but since we’ll be in New York, I think it’s appropriate to take a more skeptical approach and proffer a list of what *NOT* to do at LTNY.

  1. DON’T get caught up in Buzzword Bingo. There are already dozens of sources attempting to prognosticate what the most popular buzzwords will be at this year’s show. Leading candidates include “predictive coding,” “technology assisted review,” “information governance,” “big data” and even the pedestrian sounding “sampling.” And, while these terms will undoubtedly be on booths and broadcast repeatedly from the Hilton elevator, it doesn’t mean an attendee should merely parrot these without a deeper dive. Here, the key is to go behind the green curtain to see what vendors, panelists and tweet-ers actually mean by these buzzwords, since it’s often surprising to see how the devil really is in the details.
  2. DON’T get a coffee at the Hilton Starbucks. Yes, we all love our morning coffee, but there’s no need to wait in the Justin Bieber-esque queue at the in-hotel Starbucks. There are approximately 49 locations in a ½ mile radius, including one right across the street. There’s also the vendor giving out free coffee on the second floor, so save yourself 30 minutes of needless waiting in line.
  3. DON’T ride the Hilton elevator. For those staying or taking meetings at the Hilton, the elevator lines can be excessively long.  Once you finally get on, you’ll wish they’d been even longer as you then find yourself subjected to the brainwashing of vendor announcements while you make multiple stops on your way to your desired floor. Either take the stairs or, if that’s not possible, try to minimize the trips to keep your sanity. Or, plan B – bring your iPod.
  4. DON’T talk to booth models. It’s tempting to gravitate to the most attractive person at a given vendor’s booth, but they’re often hired professionals designed to get you in for the all-important “badge scan.” Instead, focus on  the person who looks like they’ve been in the same company-branded oxford for 48 hours, because they probably have. While perhaps less aesthetically pleasing, they’ll certainly know more about the product and that’s why you’re there after all, isn’t it?
  5. DON’T pass out your resume on the show floor. While certainly a great networking opportunity, LTNY isn’t the place to blatantly tout your professional wares, at least if you want to keep your nascent job search on the down low. And, if you want to have more private meetings, you’ll need to do better than “hiding out” at the Warwick across the street. For more clandestine purposes, think about the Bronx.
  6. DON’T take tchotchkes without hearing the spiel. There are certain tchotchke hounds out there who roam around LTNY collecting “gifts” for the kids back at home. While I won’t frown on this behavior per se, it’s only courteous to actually listen to the pitch (as a quid pro quo) before you ask for the swag. Anything less is uncivilized.
  7. DON’T get over-served at the B-Discovery Party. After a long day on the show floor you’re probably ready to let loose with some of the eDiscovery practitioners you haven’t seen in a year.  But, in this era of flip cams and instant tweeting, letting your hair down too much can be career limiting. If you haven’t done Jägermeister shots since college, LTNY probably isn’t a good time to resume that dubious practice.
  8. DON’T forget to take your badge off (please!). Yes, it’s cool to let everyone know you’re attending the premier legal technology event of the year, but once you leave the show floor random New Yorkers will heckle you for sporting your badge after hours – particularly the baristas at Starbucks. Plus, if you’ve broken any of the other admonitions above, at least you’ll be more anonymous.
  9. DON’T forget to bring a heavy coat, mittens and scarf. Last year there was the infamous ice storm that stranded folks for days (me included). Even if the weather isn’t that severe this year, anyone from warmer climates will need to bundle up, particularly because it’s easy to unintentionally get caught outside for extended amounts of time – waiting for a cab in the Hilton queue, eating at Symantec’s free food cart, walking to a meeting at a “nearby” hotel that’s “just a block or so away.” Keep in mind those cross town blocks are longer than they appear on a map.
  10. DON’T forget to learn something. Without hyperbole, LTNY has the world’s greatest collection of legal/technology minds in one place for 3 days. Most folks, even the vaunted panelists, judges and industry luminaries, are actually quite accessible. So, at a minimum, attend sessions, ask questions and interact with your peers. Try to ignore the bright lights and signs on the floor and make sure to take some useful information back to your firm, company or governmental agency. You’ll undoubtedly have fun (and maybe a Jägermeister shot, too) along the way.

Lessons Learned for 2012: Spotlighting the Top eDiscovery Cases from 2011

Tuesday, January 3rd, 2012

The New Year has now dawned and with it, the certainty that 2012 will bring new developments to the world of eDiscovery.  Last month, we spotlighted some eDiscovery trends for 2012 that we feel certain will occur in the near term.  To understand how these trends will play out, it is instructive to review some of the top eDiscovery cases from 2011.  These decisions provide a roadmap of best practices that the courts promulgated last year.  They also spotlight the expectations that courts will likely have for organizations in 2012 and beyond.

Issuing a Timely and Comprehensive Litigation Hold

Case: E.I. du Pont de Nemours v. Kolon Industries (E.D. Va. July 21, 2011)

Summary: The court issued a stiff rebuke against defendant Kolon Industries for failing to issue a timely and proper litigation hold.  That rebuke came in the form of an instruction to the jury that Kolon executives and employees destroyed key evidence after the company’s preservation duty was triggered.  The jury responded by returning a stunning $919 million verdict for DuPont.

The spoliation at issue occurred when several Kolon executives and employees deleted thousands of emails and other records relevant to DuPont’s trade secret claims. The court laid the blame for this destruction on the company’s attorneys and executives, reasoning they could have prevented the spoliation through an effective litigation hold process. At issue were three hold notices circulated to the key players and data sources. The notices were all deficient in some manner. They were either too limited in their distribution, ineffective since they were prepared in English for Korean-speaking employees, or too late to prevent or otherwise ameliorate the spoliation.

The Lessons for 2012: The DuPont case underscores the importance of issuing a timely and comprehensive litigation hold notice.  As DuPont teaches, organizations should identify what key players and data sources may have relevant information.  A comprehensive notice should then be prepared to communicate the precise hold instructions in an intelligible fashion.  Finally, the hold should be circulated immediately to prevent data loss.

Organizations should also consider deploying the latest technologies to help effectuate this process.  This includes an eDiscovery platform that enables automated legal hold acknowledgements.  Such technology will allow custodians to be promptly and properly apprised of litigation and thereby retain information that might otherwise have been discarded.

Another Must-Read Case: Haraburda v. Arcelor Mittal U.S.A., Inc. (D. Ind. June 28, 2011)

Suspending Document Retention Policies

Case: Viramontes v. U.S. Bancorp (N.D. Ill. Jan. 27, 2011)

Summary: The defendant bank defeated a sanctions motion because it modified aspects of its email retention policy once it was aware litigation was reasonably foreseeable. The bank had implemented a retention policy that kept emails for 90 days, after which the emails were overwritten and destroyed. The bank had also promulgated a course of action whereby the retention policy would be promptly suspended on the occurrence of litigation or other triggering event. This way, the bank could establish the reasonableness of its policy in litigation. Because the bank followed that procedure in good faith, it was protected from court sanctions under the Federal Rule of Civil Procedure 37(e) “safe harbor.”

The Lesson for 2012: As Viramontes shows, an organization can be prepared for eDiscovery disputes by timely suspending aspects of its document retention policies.  By modifying retention policies when so required, an organization can develop a defensible retention procedure and be protected from court sanctions under Rule 37(e).

Coupling those procedures with archiving software will only enhance an organization’s eDiscovery preparations.  Effective archiving software will have a litigation hold mechanism, which enables an organization to suspend automated retention rules.  This will better ensure that data subject to a preservation duty is actually retained.

Another Must-Read Case: Micron Technology, Inc. v. Rambus Inc., 645 F.3d 1311 (Fed. Cir. 2011)

Managing the Document Collection Process

Case: Northington v. H & M International (N.D. Ill. Jan. 12, 2011)

Summary: The court issued an adverse inference jury instruction against a company that destroyed relevant emails and other data. The spoliation occurred in large part because legal and IT were not involved in the collection process. For example, counsel was not actively engaged in the critical steps of preservation, identification or collection of electronically stored information (ESI). Nor was IT brought into the picture until 15 months after the preservation duty was triggered. By that time, rank and file employees – some of whom were accused by the plaintiff of harassment – had stepped into this vacuum and conducted the collection process without meaningful oversight. Predictably, key documents were never found and the court had little choice but to promise to inform the jury that the company destroyed evidence.

The Lesson for 2012: An organization does not have to suffer the same fate as the company in the Northington case.  It can take charge of its data during litigation through cooperative governance between legal and IT.  After issuing a timely and effective litigation hold, legal should typically involve IT in the collection process.  Legal should rely on IT to help identify all data sources – servers, systems and custodians – that likely contain relevant information.  IT will also be instrumental in preserving and collecting that data for subsequent review and analysis by legal.  By working together in a top-down fashion, organizations can better ensure that their eDiscovery process is defensible and not fatally flawed.

Another Must-Read Case: Green v. Blitz U.S.A., Inc. (E.D. Tex. Mar. 1, 2011)

Using Proportionality to Dictate the Scope of Permissible Discovery

Case: DCG Systems v. Checkpoint Technologies (N.D. Cal. Nov. 2, 2011)

Summary: The court adopted the new Model Order on E-Discovery in Patent Cases recently promulgated by the U.S. Court of Appeals for the Federal Circuit. The model order incorporates principles of proportionality to reduce the production of email in patent litigation. In adopting the order, the court explained that email productions should be scaled back since email is infrequently introduced as evidence at trial. As a result, email production requests will be restricted to five search terms and may only span a defined set of five custodians. Furthermore, email discovery in DCG Systems will wait until after the parties complete discovery on the “core documentation” concerning the patent, the accused product and prior art.

The Lesson for 2012: Courts seem to be slowly moving toward a system that incorporates proportionality as the touchstone for eDiscovery.  This is occurring beyond the field of patent litigation, as evidenced by other recent cases.  Even the State of Utah has gotten in on the act, revising its version of Rule 26 to require that all discovery meet the standards of proportionality.  While there are undoubtedly deviations from this trend (e.g., Pippins v. KPMG (S.D.N.Y. Oct. 7, 2011)), the clear lesson is that discovery should comply with the cost cutting mandate of Federal Rule 1.

Another Must-Read Case: Omni Laboratories Inc. v. Eden Energy Ltd [2011] EWHC 2169 (TCC) (29 July 2011)

Leveraging eDiscovery Technologies for Search and Review

Case: Oracle America v. Google (N.D. Cal. Oct. 20, 2011)

Summary: The court ordered Google to produce an email that it previously withheld on attorney client privilege grounds. While the email’s focus on business negotiations vitiated Google’s claim of privilege, that claim was also undermined by Google’s production of eight earlier drafts of the email. The drafts were produced because they did not contain addressees or the heading “attorney client privilege,” which the sender later inserted into the final email draft. Because those details were absent from the earlier drafts, Google’s “electronic scanning mechanisms did not catch those drafts before production.”

The Lesson for 2012: Organizations need to leverage next generation, robust technology to support the document production process in discovery.  Tools such as email analytical software, which can isolate drafts and offer to remove them from production, are needed to address complex production issues.  Other technological capabilities, such as Near Duplicate Identification, can also help identify draft materials and marry them up with finals that have been marked as privileged.  Last but not least, technology assisted review has the potential of enabling one lawyer to efficiently complete the work that previously took thousands of hours.  Finding the budget and doing the research to obtain the right tools for the enterprise should be a priority for organizations in 2012.

Another Must-Read Case: J-M Manufacturing v. McDermott, Will & Emery (CA Super. Jun. 2, 2011)

Conclusion

There were any number of other significant cases from 2011 that could have made this list.  We invite you to share your favorites in the comments section or contact us directly with your feedback.


Top Ten eDiscovery Predictions for 2012

Thursday, December 8th, 2011

As 2011 comes quickly to a close we’ve attempted, as in years past, to do our best Carnac impersonation and divine the future of eDiscovery.  Some of these predictions may happen more quickly than others, but it’s our sense that all will come to pass in the near future – it’s just a matter of timing.

  1. Technology Assisted Review (TAR) Gains Speed. The area of Technology Assisted Review is very exciting since there are a host of emerging technologies that can help make the review process more efficient, ranging from email threading and concept search to clustering, predictive coding and the like. There are two fundamental challenges, however. First, the technology doesn’t work in a vacuum, meaning that the workflows need to be properly designed and the users need to make accurate decisions because those judgment calls often are then magnified by the application. Next, the defensibility of the given approach needs to be well vetted. While it’s likely not necessary (or practical) to expect a judge to mandate the use of a specific technological approach, it is important for the applied technologies to be reasonable, transparent and auditable since the worst possible outcome would be to have a technology challenged and then find the producing party unable to adequately explain their methodology.
  2. The Custodian-Based Collection Model Comes Under Stress. Ever since the days of Zubulake, litigants have focused on “key players” as a proxy for finding relevant information during the eDiscovery process.  Early on, this model worked particularly well in an email-centric environment.  But, as discovery from cloud sources, collaborative worksites (like SharePoint) and other unstructured data repositories continues to become increasingly mainstream, the custodian-oriented collection model will become rapidly outmoded because it will fail to take into account topically-oriented searches.  This trend will be further amplified by the bench’s increasing distrust of manual, custodian-based data collection practices and the presence of better automated search methods, which are particularly valuable for certain types of litigation (e.g., patent disputes, product liability cases).
  3. The FRCP Amendment Debate Will Rage On – Unfortunately Without Much Near Term Progress. While it is clear that the eDiscovery preservation duty has become a more complex and risk laden process, it’s not clear that this “pain” is causally related to the FRCP.  In the notes from the Dallas mini-conference, a pending Sedona survey was quoted referencing the fact that preservation challenges were increasing dramatically.  Yet, there isn’t a consensus viewpoint regarding which changes, if any, would help improve the murky problem.  In the near term this means that organizations with significant preservation pains will need to better utilize the rules that are on the books and deploy enabling technologies where possible.
  4. Data Hoarding Increasingly Goes Out of Fashion. The war cry of many IT professionals that “storage is cheap” is starting to fall on deaf ears.  Organizations are realizing that the cost of storing information is just the tip of the iceberg when it comes to the litigation risk of having terabytes (and conceivably petabytes) of unstructured, uncategorized and unmanaged electronically stored information (ESI).  This tsunami of information will increasingly become an information liability for organizations that have never deleted a byte of information.  In 2012, more corporations will see the need to clean out their digital houses and will realize that such cleansing (where permitted) is a best practice moving forward.  This applies with equal force to the US government, which has recently mandated such an effort at President Obama’s behest.
  5. Information Governance Becomes a Viable Reality. For several years there’s been an effort to combine the reactive (far right) side of the EDRM with the logically connected proactive (far left) side of the EDRM. But now, a number of surveys have linked good information governance hygiene with better response times to eDiscovery requests and governmental inquiries, as well as a corresponding lower chance of being sanctioned and the ability to turn over less responsive information. In 2012, enterprises will realize that the litigation use case is just one way to leverage archival and eDiscovery tools, further accelerating adoption.
  6. Backup Tapes Will Be Increasingly Seen as a Liability. Using backup tapes for disaster recovery/business continuity purposes remains a viable business strategy, although backing up to tape will become less prevalent as cloud backup increases. However, if tapes are kept around longer than necessary (months rather than days) then they become a ticking time bomb when a litigation or inquiry event crops up.
  7. International eDiscovery/eDisclosure Processes Will Continue to Mature. It’s easy to think of the US as dominating the eDiscovery landscape. While this is gospel for us here in the States, international markets are developing quickly and in many ways are ahead of the US, particularly with regulatory compliance-driven use cases, like the UK Bribery Act 2010.  This fact, coupled with the menagerie of international privacy laws, means we’ll be less Balkanized in our eDiscovery efforts moving forward since we do really need to be thinking and practicing globally.
  8. Email Becomes “So 2009” As Social Media Gains Traction. While email has been the eDiscovery darling for the past decade, it’s getting a little long in the tooth. In the next year, new types of ESI (social media, structured data, loose files, cloud content, mobile device messages, etc.) will cause headaches for a number of enterprises that have been overly email-centric. Already in 2011, organizations are finding that other sources of ESI like documents/files and structured data are rivaling email in importance for eDiscovery requests, and this trend shows no signs of abating, particularly for regulated industries. This heterogeneous mix of ESI will certainly result in challenges for many companies, with some unlucky ones getting sanctioned because they ignored these emerging data types.
  9. Cost Shifting Will Become More Prevalent – Impacting the “American Rule.” For ages, the American Rule held that producing parties had to pay for their production costs, with a few narrow exceptions.  Next year we’ll see even more courts award winning parties their eDiscovery costs under 28 U.S.C. §1920(4) and Rule 54(d)(1) FRCP. Courts are now beginning to consider the services of an eDiscovery vendor as “the 21st Century equivalent of making copies.”
  10. Risk Assessment Becomes a Critical Component of eDiscovery. Managing risk is a foundational underpinning for litigators generally, but its role in eDiscovery has been a bit obscure.  Now, with the tremendous statistical insights that are made possible by enabling software technologies, it will become increasingly important for counsel to manage risk by deciding what types of error/precision rates are possible.  This risk analysis is particularly critical for conducting any variety of technology assisted review process since precision, recall and f-measure statistics all require a delicate balance of risk and reward.

Accurately divining the future is difficult (some might say impossible), but in the electronic discovery arena many of these predictions can happen if enough practitioners decide they want them to happen.  So, the future is fortunately within reach.

Fulbright’s 2011 Litigation Trends Report Predicts a Constant Litigation Pace and a Swell of Regulatory Investigations

Monday, November 7th, 2011

Fulbright & Jaworski has conducted its Litigation Trends Survey for nearly a decade, and the results are always interesting since they tend to capture the mindset of inside counsel and litigators as they anticipate the upcoming year.  In its 8th Annual Litigation Trends Survey, Fulbright noted that 92% of U.S. respondents predict that litigation will either increase or stay the same in the upcoming year.  This trend bodes well for players in the litigation services and eDiscovery sectors, and confirms the countercyclical nature of the industry.  Breaking down the perceived increases across industry verticals, the Survey noted that the biggest anticipated jumps were in the technology, financial services, healthcare and insurance sectors.  Meanwhile, energy (the leading sector from the prior year) was one of the few that predicted a decrease.

Going behind the scenes, there were a number of factors that caused respondents to predict litigation increases.  First and foremost, respondents indicated that “stricter regulation was the number one reason” for the increases, particularly in the insurance, financial services, health care and retail sectors.  These concerns around regulatory compliance have been increasingly keeping GCs and corporate boards awake at night as the governance climate continues to heat up.  This regulation driver showed a demonstrable increase, with 46% of all respondents having retained outside counsel to assist with regulatory proceedings, up from 37% in the prior year.  The Survey noted that U.S. companies facing a regulatory investigation were most likely to be under pressure from the DOJ (27%), State Attorney General (24%), OSHA (18%), the EPA (16%) and U.S. Attorney (13%).  Also on the regulatory front, U.S. respondents have increasingly begun to recognize the potential jurisdictional reach of the U.K. Bribery Act, with 25% of U.S. companies stating that they have already conducted a review of existing procedures in preparation for implementation.

In addition to managing risk, most in-house counsel are keenly concerned with controlling litigation costs.  The good news here is that associated costs are predicted to be generally flat.  Yet, eDiscovery remained the largest category targeted for increased spending, with 18% of respondents making this their top priority.  Interestingly, though, large enterprises seem to have been doing a good job of getting eDiscovery expenses under control (likely by taking expensive elements of the EDRM in-house), with these expenses declining among the largest companies, from 42% last year to 24% this year.

The Survey noted that the use of cloud computing has gained speed, with 34% of all public companies using the cloud.  And yet, only 40% of those companies using cloud computing have had “to preserve and/or collect data from the cloud in connection with actual or threatened litigation, disputes or investigations.”  This number appears curiously light, and it should definitely rise during the upcoming year as the plaintiffs’ bar gets more savvy about this relatively new source of responsive electronically stored information (ESI).

On the narrower eDiscovery front, the Survey homed in on newer issues like cooperation.  Here, the Survey noted that this Sedona-sponsored concept still hasn’t completely taken hold, with nearly 40% of all respondents claiming that “their company has not made the effort to be more transparent or cooperative” due to a litigation strategy of “defending on all fronts.”  This area appears particularly muddled, with one third saying their previous attempts haven’t been reciprocated and another quarter feeling that their company was already transparent.

All in all,  the 2011 Fulbright Litigation Trends Survey notes trends that appear to be largely in line with the primary drivers of (1) managing risk and (2) lowering litigation costs.  On the risk side, compliance with an increasingly complex regulatory environment is offsetting any potential lull in the litigation environment.  And, on the cost side, eDiscovery continues to be a hot button issue, particularly with the relatively new challenges associated with ESI distributed on social media, cloud computing and mobile sources.

A Judicial Perspective: Q&A With Former United States Magistrate Judge Ronald J. Hedges Regarding Possible Discovery Related Rule Changes

Friday, September 9th, 2011

If you have been following my previous posts regarding possible amendments to the Federal Rules of Civil Procedure (Rules), then you know I promised a special interview with former United States Magistrate Judge Ron Hedges.  The timing of the discussion is perfect considering that a “mini-conference” is being hosted by a Federal Rules Discovery Subcommittee today (September 9th) in Dallas, TX.  The debate will focus on whether or not the Rules should be amended to address evidence preservation and sanctions.  I am attending the mini-conference and will summarize my observations as part of my next post.  In the meantime, please enjoy reading the dialogue below for a glimpse into Judge Hedges’ perspective regarding possible Rule amendments.

Nelson: You were recently quoted in a Law Technology News (LTN) article written by Evan Koblentz as saying, “I don’t see a need to amend the rules” because these rules haven’t been around long enough to see what happens.  Isn’t almost five years long enough?

Judge Hedges: No.  For the simple reason that both attorneys and judges continue to need education on the 2006 amendments and, more particularly, they need to understand the technologies that create and store electronic information.  The amendments establish a framework within which attorneys and judges make daily decisions on discovery.  I have not seen any objective evidence that the framework is somehow failing and needs further amendment.

Nelson: You also said the “big problem” is that people don’t talk enough.  What did you mean?  Hasn’t the Sedona Cooperation Proclamation made a difference?

Judge Hedges: The centerpiece of the 2006 amendments (at least in my view) is Rule 26(f).  I think it is fair to say that the legal community’s response to 26(f) has been, to say the least, varied. Civil actions with large volumes of ESI that may be discoverable under Rule 26(b)(1) cry out for extensive 26(f) meet-and-confer discussions that may take a number of meetings and require the presence of party representatives from, for example, IT.  There is an element of trust required between adversary counsel (with the concurrence of the parties they represent) that may be difficult to establish – but some cooperation is necessary to make 26(f) work.  Overlay that reality with our adversary system and the duty of attorneys to zealously advocate on behalf of their clients and you can understand why cooperation isn’t always a top priority for some attorneys.

However, “transparency” in discussing ESI is essential, along with advocacy and the need to maintain appropriate confidentiality. That’s where the Sedona Conference Proclamation can make a big difference. Has the Proclamation done that? It’s too early to reach a conclusion on that question, but the Proclamation is often cited and, as education progresses in eDiscovery, I am confident that the Proclamation will be recognized as a means to realize the just, speedy, and inexpensive resolution of litigation, as articulated under Rule 1.

Nelson: You also mentioned that the Federal Rules Advisory Committee might be running afoul of the Rules Enabling Act.  Can you explain?

Judge Hedges: There is a distinction between “procedural” and “substantive” rules.  The Rules Enabling Act governs the adoption of the former.  Rule 502 of the Federal Rules of Evidence is an example of a substantive rule that was proposed by the Judicial Conference.  However, since Rule 502 is a rule dealing with substantive privilege and waiver issues, it had to be enacted into law through an Act of Congress.  I am concerned that proposals to further amend the Federal Rules of Civil Procedure may cross the line from procedural to substantive.  I am not prepared to suggest at this time, however, that anything I have seen has crossed the line.  Stay tuned.

Nelson: If you had to select one of the three options currently being considered (see page 264), which option would you select and why?

Judge Hedges: To start, I would not choose option 1, which presumes that the Rules can reach pre-litigation conduct consistent with the Rules Enabling Act.  My concern here is also that, in the area of electronic information, a too-specific rule risks “overnight” obsolescence, just as the Electronic Communications Privacy Act, enacted in 1986, is considered by a number of commentators to be, at best, obsolescent.  Note also that I did not use the word “stored” when I mentioned electronic information, as courts have already required that so-called ephemeral information be preserved.  Nor would I choose option 2.  Absent seeing more than the brief description of the category on page 264, it seems to me that option 2 is likely to do nothing more than be a restatement of the existing law on when the duty to preserve is “triggered.”

So, by default, I am forced to choose option 3.  I presume a rule would say something like, “sanctions may not be imposed on a party for loss of ESI (or “EI”) if that party acted reasonably in making preservation decisions.”  There are a number of problems here. First, in a jurisdiction which allows the imposition of at least some sanction for negligence, all the rule would likely do is be interpreted to foreclose “serious” sanctions. Isn’t that correct? Or is the rule intended to supersede existing variances in the law of sanctions?  At that point, does the rule become “substantive”?   Second, how will “reasonableness” be defined?  Reasonableness supposes the existence of a duty – in this case, a duty to preserve.  For example, is there a duty to preserve ephemeral data that a party knows is relevant?  We come back full circle to where we began.

Remember, Rule 37(f) (now 37(e)) was intended to provide some level of protection against the imposition of sanctions, just as the categories are intended to.  Right?  And five years later 37(e) remains defined variously to be a “safe harbor” or a “lighthouse” by some lawyers such as Jonathan Redgrave or an “uncharted minefield” by others like me.

Nelson: What about heightened pleading standards after the Iqbal and Twombly decisions?  Do these decisions have any relevance to electronic discovery and the topic at hand?

Judge Hedges: Let me begin by saying that I am no fan of Twombly or Iqbal. The decisions, however well intended, have led to undue cost and delay all too often.  Not only is motion-to-dismiss practice costly for parties, but it imposes great burdens on the United States Courts and, as often as not, leads to at least one other round of motion practice as plaintiffs are given leave to re-plead.  All the while, parties have preservation obligations to fulfill and, in the hope of saving expense, discovery is often stayed until a motion is “finally” decided.  I would like to see objective evidence of the delay and cost of this motion practice (and I expect that the Administrative Office of the United States Courts has statistical evidence already).  I would also like to see objective evidence from defendants distinguishing between the cost of motion practice and later discovery costs.

Putting all that aside, and if I had to accept one option, I would choose to allow some discovery that is integrated with the motion practice.  First, even without the filing of a responsive pleading, there should be a 26(f) meet-and-confer to discuss, if nothing else, the nature and scope of preservation and the possibility of securing a Rule 502(d) order. Second, while I have serious concerns about “pre-answer discovery” for a number of reasons, I would have the parties make 26(a)(1) disclosures while a motion to dismiss is pending or leave to re-plead has been granted in order to address the likely “asymmetry of information” between a plaintiff and a moving defendant.  Once the disclosures are made, I would allow the plaintiff to secure some information identified in the disclosures to allow re-pleading and perhaps obviate the need for continued motion practice.

All of this would, of course, require active judicial management.  And one would hope that Congress, which seems so interested in conserving resources, would recognize the vital role of the United States Courts in securing justice for everyone and give adequate funding to the Courts.

Top Five Predictions in Electronic Discovery

Monday, November 15th, 2010

What’s next in the electronic discovery world?  Well, it’s nearly impossible to say with too much precision, but my recent e-discovery trends article attempts to peer into the crystal ball to divine some hints about the future.

The following five predictions are what I expect to create the biggest waves in e-discovery in 2011.  Most are nascent trends that we’ve seen a bit of in 2010, but that should continue to accelerate next year.  Enterprises that can prepare for and understand these areas will be well equipped to continue taking a proactive approach to the ever-changing challenges of e-discovery.

  1. Changes in Forensic Best Practices: In 2011, manual forensic imaging will continue to take a backseat to more automated, forensically sound data collection techniques.  Forensic (bit for bit) images have long been the gold standard for the legally sound collection of ESI in response to legal proceedings.  And, while forensic imaging will continue to be important in a number of discrete situations (fraud, misappropriation of trade secrets cases, etc.), it will largely be seen as overkill in basic electronic discovery cases.  Since imaging is both time consuming and highly manual, automated collection tools will increasingly be used by savvy organizations to speed up and streamline the collection process.
  2. Consolidation in the Electronic Discovery Industry: Consolidation in the electronic discovery sector will impact market forces and the balance of power.  The past year saw traditional, pure-play electronic discovery companies looking (sometimes successfully and sometimes not) for diversification and deep pockets.  In the upcoming year, the relative dearth of pure-play EDD companies may reverse the downward price pressure that’s been seen over the past several years.
  3. Proportionality Becomes Reality: Burgeoning data volumes, as seen in multi-terabyte (versus gigabyte) cases, mean that the legal community will continue to search for ways to prevent electronic discovery costs from exceeding legal exposure and attorneys’ fees.  Groups like The Sedona Conference will continue to push for better clarification within the community surrounding “proportionality” in order to keep the electronic discovery “tail” from wagging the litigation “dog.”  If successful at all, there may be a slight respite for litigious enterprises that may be able to better scale e-discovery efforts with the risk profile of the matter at hand.
  4. Collision of Cloud, Social Media and E-Discovery: The seemingly unstoppable migration of corporate data to the cloud, combined with the proliferation of social media applications, will continue to stress electronic discovery practitioners as they attempt to preserve, collect, search, and process electronically stored information (ESI) from sources that aren’t traditionally managed behind the firewall.  Proactive enterprises will increasingly evaluate the legal and compliance risks of storing data in the cloud so that they’re not painted into a corner when they need to preserve, collect, and produce offsite ESI.
  5. Global E-Discovery Matures: International jurisdictions will increasingly look to the United States (and the Federal Rules of Civil Procedure) as their nascent electronic discovery paradigms are increasingly stressed by the proliferation of both ESI and discovery disputes.  The recent Goodale case out of the UK (and impending procedural changes to the e-Disclosure Practice Direction) demonstrates how the global community is rapidly maturing along the electronic discovery continuum.

While the tools and best practices designed to combat top e-discovery hurdles continue to mature, the challenges are multiplying at an equally fast rate.  In the past, the crux of most discovery matters usually centered around email and sometimes instant messaging.  In 2011, new problems will continue to crop up on the horizon, such as collecting SharePoint data from the cloud, trying to extract structured data from a range of proprietary systems and capturing ephemeral ESI from an ever-changing array of social media applications.

Please let me know if you disagree with any of the predictions or have any others you’d like to share.