
Archive for the ‘search’ Category

Breaking News: Court Orders Google to Produce eDiscovery Search Terms in Apple v. Samsung

Friday, May 10th, 2013

Apple obtained a narrow discovery victory yesterday in its long-running legal battle against fellow technology titan Samsung. In Apple Inc. v. Samsung Electronics Co. Ltd., the court ordered non-party Google to turn over the search terms and custodians that it used to produce documents in response to an Apple subpoena.

According to the court’s order, Apple argued for the production of Google’s search terms and custodians in order “to know how Google created the universe from which it produced documents.” The court noted that Apple sought such information “to evaluate the adequacy of Google’s search, and if it finds that search wanting, it then will pursue other courses of action to obtain responsive discovery.”

Google countered that argument by defending the extent of its production and the burdens that Apple’s request would place on Google as a non-party to Apple’s dispute with Samsung. Google complained that Apple’s demands were essentially a gateway to additional discovery from Google, which would arguably be excessive given Google’s non-party status.

Sensitive to the concerns of both parties, the court struck a middle ground in its order. On the one hand, the court ordered Google to produce the search terms and custodians since that “will aid in uncovering the sufficiency of Google’s production and serves greater purposes of transparency in discovery.” But on the other hand, the court preserved Google’s right to object to any further discovery efforts by Apple: “The court notes that its order does not speak to the sufficiency of Google’s production nor to any arguments Google may make regarding undue burden in producing any further discovery.”

This latest opinion from the Apple v. Samsung series of lawsuits is noteworthy for two reasons. First, the decision is instructive regarding the eDiscovery burdens that non-parties must shoulder in litigation. While the disclosure of a non-party’s underlying search methodology (in this instance, search terms and custodians) may not be unduly burdensome, further efforts to obtain non-party documents could exceed the boundaries of reasonableness that courts have designed to protect non-parties from the vicissitudes of discovery. For as the court in this case observed, a non-party “should not be required to ‘subsidize’ litigation to which it is not a party.”

Second, the decision illustrates that the use of search terms remains a viable method for searching and producing responsive ESI. Despite the increasing popularity of predictive coding technology, it is noteworthy that neither the court nor Apple took issue with Google’s use of search terms in connection with its production process. Indeed, the intelligent use of keyword searches is still an acceptable eDiscovery approach for most courts, particularly where the parties agree on the terms. That other forms of technology-assisted review, such as predictive coding, could arguably be more efficient and cost-effective in identifying responsive documents does not impugn the use of keyword searches in eDiscovery. Only time will tell whether the use of keyword searches as the primary means for responding to document requests will give way to more flexible approaches that include the use of multiple technology tools.

From A to PC – Running a Defensible Predictive Coding Workflow

Tuesday, September 11th, 2012

So far in our ongoing predictive coding blog series, we’ve touched on the “whys” and “whats” of predictive coding, and now I’d like to address the “hows” of using this new technology. Given that predictive coding is groundbreaking technology in the world of eDiscovery, it’s no surprise that a different workflow is required in order to run the review process.

The traditional linear review process utilizes a “brute force” approach of manually reading each document and processing it for responsiveness and privilege. In order to reduce the high cost of this process, many organizations now farm out documents to contract attorneys for review. Often, however, contract attorneys possess less expertise and knowledge of the issues, which means that multiple review passes along with additional checks and balances are often needed in order to ensure review accuracy. This process commonly results in a significant number of documents being reviewed multiple times, which in turn increases the cost of review. When you step away from an “eyes-on review” of every document and use predictive coding to leverage the expertise of more experienced attorneys, you will naturally aim to review as few documents as possible in order to achieve the best possible results.

How do you review the minimum number of documents with predictive coding? For starters, you should prepare your case for predictive coding by performing an early case assessment (ECA) to cull down to your review population before review begins. While some may suggest that predictive coding can be run without any ECA up front, you will actually save a significant amount of review time if you put in the effort to cull out the profoundly irrelevant documents in your case. Doing so will prevent a “junk in, junk out” situation, where leaving too much junk in the case forces you to review a number of junk documents throughout the predictive coding workflow.

Next, it is important to segregate documents that are unsuitable for predictive coding. Most predictive coding solutions operate on the extracted text content of documents. That means any documents that do not contain extracted text, such as photographs and engineering schematics, should be manually reviewed so they are not overlooked by the predictive coding engine. The same concept applies to any other document with limitations on reviewability, such as encrypted and password-protected files. All of these documents should be reviewed separately so as not to miss any relevant documents.
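To make the segregation step concrete, here is a minimal Python sketch. It is only an illustration: the `Document` record, its fields and the `split_for_predictive_coding` helper are hypothetical names, and a real review platform will expose this information in its own way.

```python
from dataclasses import dataclass


@dataclass
class Document:
    # Hypothetical record; real platforms use their own schema.
    doc_id: str
    extracted_text: str = ""
    encrypted: bool = False


def split_for_predictive_coding(docs, min_chars=1):
    """Separate documents the engine can learn from and those needing manual review.

    A document is routed to manual review when it has no usable extracted
    text (for example, a photograph) or when it is flagged as encrypted.
    """
    suitable, manual = [], []
    for doc in docs:
        has_text = len(doc.extracted_text.strip()) >= min_chars
        if has_text and not doc.encrypted:
            suitable.append(doc)
        else:
            manual.append(doc)
    return suitable, manual
```

The `min_chars` threshold is an assumption; you might raise it if near-empty files (such as a scanned image with a one-word caption) should also go to manual review.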

After culling down to your review population, the next step in preparing to use predictive coding is to create a Control Set by drawing a randomly selected statistical sample from the document population. Once the Control Set is manually reviewed, it will serve two main purposes. First, it will allow you to estimate the population yield, that is, the percentage of responsive documents contained within the larger population. (The size of the Control Set may need to be adjusted to ensure the yield is properly taken into account.) Second, it will serve as your baseline for a true “apples-to-apples” comparison of your prediction accuracy across iterations as you move through the predictive coding workflow. The Control Set needs to be reviewed only once, up front, and can then be used to measure accuracy throughout the workflow.
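The sampling and yield estimate described above might be sketched as follows in Python. This is a rough illustration, not any vendor's method: the function names are hypothetical, and the interval uses the simple normal approximation, which becomes unreliable when yield is very low or the sample is small.

```python
import math
import random


def draw_control_set(doc_ids, sample_size, seed=0):
    """Draw a simple random sample (without replacement) from the whole population."""
    rng = random.Random(seed)  # fixed seed so the draw can be reproduced later
    return rng.sample(list(doc_ids), sample_size)


def estimate_yield(responsive_flags, z=1.96):
    """Estimate population yield from the reviewed Control Set.

    responsive_flags holds one True/False review decision per Control Set
    document. Returns (point estimate, lower bound, upper bound) using a
    normal-approximation interval; z=1.96 corresponds to roughly 95% confidence.
    """
    n = len(responsive_flags)
    p = sum(responsive_flags) / n
    margin = z * math.sqrt(p * (1 - p) / n)
    return p, max(0.0, p - margin), min(1.0, p + margin)
```

For example, if 50 of 500 Control Set documents are tagged responsive, the estimated yield is 10%, with a margin of a little under three percentage points either way. A wide interval is one signal that the Control Set may need to be larger.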

It is essential that the documents in the Control Set are selected randomly from the entire population. While some believe that other sampling approaches give better peace of mind, they may actually result in unnecessary review. For example, other workflows recommend sampling from the documents that are not predicted to be relevant to see if anything was left behind. If you instead create a proper Control Set from the entire population, you can get precision and recall metrics that are representative of the entire population, which in turn includes the documents that are not predicted to be relevant.

Once the Control Set is created, you can begin training the software to evaluate documents by the review criteria in the case. Selecting the optimal set of documents to train the system (commonly referred to as the training set or seed set) is one of the most important steps in the entire predictive coding workflow because it sets the initial accuracy for the system, and thus it should be chosen carefully. Some suggest creating the initial training set by taking a random sample from the population (much like how the Control Set is selected) instead of proactively selecting responsive documents. However, the important thing to understand is that the items used for training should adequately represent the responsive items. This matters because most eDiscovery cases have low yield, meaning the prevalence of responsive documents within the overall document population is low. As a result, the system will not be able to learn effectively how to identify responsive items if too few responsive documents are included in the training set.

An effective method for selecting the initial training set is to use a targeted search to locate a small set of documents (typically between 100 and 1,000) that is expected to be about 50% responsive. For example, you may choose to focus on only the key custodians in the case and use a combination of tighter search criteria, such as keywords and date ranges. You do not have to perform exhaustive searches, but a high quality initial training set will likely minimize the amount of additional training needed to achieve high prediction accuracy.
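A targeted search of this kind could look something like the Python sketch below. Everything here is illustrative: the `Doc` fields, the `targeted_seed_candidates` helper and the default limit of 1,000 are assumptions made for the example, and a real tool would typically run the search through its own query engine.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Doc:
    # Hypothetical record with the fields this example filters on.
    doc_id: str
    custodian: str
    sent: date
    text: str


def targeted_seed_candidates(docs, custodians, keywords, start, end, limit=1000):
    """Return up to `limit` documents matching key custodians, a date range and keywords."""
    lowered = [k.lower() for k in keywords]
    hits = []
    for doc in docs:
        if doc.custodian not in custodians:
            continue
        if not (start <= doc.sent <= end):
            continue
        text = doc.text.lower()
        if any(k in text for k in lowered):
            hits.append(doc)
            if len(hits) >= limit:
                break
    return hits
```

The candidates still have to be reviewed by attorneys before they are used for training; the search only narrows the pool so that roughly half of what they review is likely to be responsive.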

After the initial training set is selected, it must then be reviewed. It is extremely important that the review decisions made on any training items are as accurate as possible since the system will be learning from these items, which typically means that the more experienced case attorneys should be used for this review. Once review is finished on all of the training documents, the system can learn from the tagging decisions in order to predict the responsiveness or non-responsiveness of the remaining documents.

While you can now predict on all of the other documents in the population, it is most important to predict on the Control Set at this time. Not only may this decision be more time effective than applying predictions to all the documents in the case, but you will need predictions on all of the documents in the Control Set in order to assess the accuracy of the predictions. With predictions and tagging decisions on each of the Control Set documents, you will be able to get accurate precision and recall metrics that you can extrapolate to the entire review population.
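As a quick illustration of how those metrics fall out of the Control Set, here is a small Python sketch. The function name and the boolean inputs are assumptions for the example; review tools generally compute these figures for you.

```python
def precision_recall(human_tags, predictions):
    """Compare attorney tags with system predictions on the Control Set.

    Both arguments are equal-length sequences of booleans, where True means
    "responsive". Precision is the share of predicted-responsive documents
    that really are responsive; recall is the share of truly responsive
    documents that the system found.
    """
    if len(human_tags) != len(predictions):
        raise ValueError("each Control Set document needs a tag and a prediction")
    pairs = list(zip(human_tags, predictions))
    true_pos = sum(1 for tag, pred in pairs if tag and pred)
    false_pos = sum(1 for tag, pred in pairs if not tag and pred)
    false_neg = sum(1 for tag, pred in pairs if tag and not pred)
    precision = true_pos / (true_pos + false_pos) if (true_pos + false_pos) else 0.0
    recall = true_pos / (true_pos + false_neg) if (true_pos + false_neg) else 0.0
    return precision, recall
```

Because the Control Set was drawn randomly from the whole population, these two numbers serve as estimates for the full review population, subject to the usual sampling margin of error.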

At this point, the accuracy of the predictions is unlikely to be optimal, and thus the iterative process begins. In order to increase the accuracy, you must select additional documents to use for training the system. Much like the initial training set, this additional training set must also be selected carefully. The best documents to use for an additional training set are those that the system would be unable to accurately predict. Rather than choosing these documents manually, you can often rely on the software, which is able to determine this set mathematically more effectively than human reviewers. Once these documents are selected, you simply continue the iterative process of training, predicting and testing until your precision and recall are at an acceptable point. Following this workflow will result in a set of documents identified to be responsive by the system along with trustworthy and defensible accuracy metrics.
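One common way software picks the documents it "would be unable to accurately predict" is uncertainty sampling: choosing the documents whose prediction scores sit closest to the decision boundary. The Python sketch below assumes the tool exposes a responsiveness score between 0 and 1 for each document; that assumption, the function names and the 0.5 boundary are illustrative, and individual products may use different selection methods.

```python
def select_for_training(scores, batch_size):
    """Pick the documents the system is least sure about.

    scores maps a document ID to a predicted responsiveness score in [0, 1].
    Documents closest to 0.5 are treated as the most uncertain.
    """
    ranked = sorted(scores, key=lambda doc_id: abs(scores[doc_id] - 0.5))
    return ranked[:batch_size]


def targets_met(precision, recall, min_precision, min_recall):
    """Stopping check for the train, predict and test loop."""
    return precision >= min_precision and recall >= min_recall
```

In each iteration you would have attorneys review the selected batch, retrain, re-predict on the Control Set, and stop once `targets_met` returns true for thresholds the parties consider acceptable.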

You cannot simply produce all of these documents at this point, however. The documents must still go through a privilege screen in order to remove any documents that should not be produced, and also go through any other review measures that you usually take on your responsive documents. This does, however, open up the possibility of applying additional rounds of predictive coding on top of this set of responsive documents. For example, after running the privilege screen, you can train on the privileged tag and attempt to identify additional privileged documents in your responsive set that were missed.

The important thing to keep in mind is that predictive coding is meant to strengthen your current review workflows. While we have outlined one possible workflow that utilizes predictive coding, the flexibility of the technology lends itself to a multitude of other uses, including prioritizing a linear review. Whatever application you choose, predictive coding is sure to be an effective tool in your future reviews.

Gartner’s 2012 Magic Quadrant for E-Discovery Software Looks to Information Governance as the Future

Monday, June 18th, 2012

Gartner recently released its 2012 Magic Quadrant for E-Discovery Software, which is its annual report analyzing the state of the electronic discovery industry. Many vendors in the Magic Quadrant (MQ) may initially focus on their position and the juxtaposition of their competitive neighbors along the Visionary – Execution axis. While a very useful exercise, there are also a number of additional nuggets in the MQ, particularly regarding Gartner’s overview of the market, anticipated rates of consolidation and future market direction.

Context

For those of us who’ve been around the eDiscovery industry since its infancy, it’s gratifying to see the electronic discovery industry mature.  As Gartner concludes, the promise of this industry isn’t off in the future, it’s now:

“E-discovery is now a well-established fact in the legal and judicial worlds. … The growth of the e-discovery market is thus inevitable, as is the acceptance of technological assistance, even in professions with long-standing paper traditions.”

The past wasn’t always so rosy, particularly when the market was dominated by hundreds of service providers that seemed to hold on by maintaining a few key relationships, combined with relatively high margins.

“The market was once characterized by many small providers and some large ones, mostly employed indirectly by law firms, rather than directly by corporations. …  Purchasing decisions frequently reflected long-standing trusted relationships, which meant that even a small book of business was profitable to providers and the effects of customary market forces were muted. Providers were able to subsist on one or two large law firms or corporate clients.”

Consolidation

The Magic Quadrant correctly notes that these “salad days” just weren’t feasible long term. Gartner sees the pace of consolidation heating up even further, with some players striking it rich and some going home empty handed.

“We expect that 2012 and 2013 will see many of these providers cease to exist as independent entities for one reason or another — by means of merger or acquisition, or business failure. This is a market in which differentiation is difficult and technology competence, business model rejuvenation or size are now required for survival. … The e-discovery software market is in a phase of high growth, increasing maturity and inevitable consolidation.”

Navigating these treacherous waters isn’t easy for eDiscovery providers, nor is it simple for customers to make purchasing decisions if they’re rightly concerned that the solution they buy today won’t be around tomorrow. Yet, despite the prognostication of an inevitable shakeout (Gartner forecasts that the market will shrink 25% in the raw number of firms claiming eDiscovery products/services), Gartner is still very bullish about the sector.

“Gartner estimates that the enterprise e-discovery software market came to $1 billion in total software vendor revenue in 2010. The five-year CAGR to 2015 is approximately 16%.”

This certainly means there’s a window of opportunity for certain players – particularly those who help larger players fill out their EDRM suite of offerings, since the best-of-breed era is quickly going by the wayside. Gartner notes that end-to-end functionality is now table stakes in the eDiscovery space.

“We have seen a large upsurge in user requests for full-spectrum EDRM functionality. Whether that functionality will be used initially, or at all, remains an open question. Corporate buyers do seem minded to future-proof their investments in this way, by anticipating what they may wish to do with the software and the vendor in the future.”

Information Governance

Not surprisingly, it’s this “full-spectrum” functionality that most closely aligns with marrying the reactive, right side of the EDRM with the proactive, left side.  In concert, this yin and yang is referred to as information governance, and it’s this notion that’s increasingly driving buying behaviors.

“It is clear from our inquiry service that the desire to bring e-discovery under control by bringing data under control with retention management is a strategy that both legal and IT departments pursue in order to control cost and reduce risks. Sometimes the archiving solution precedes the e-discovery solution, and sometimes it follows it, but Gartner clients that feel the most comfortable with their e-discovery processes and most in control of their data are those that have put archiving systems in place …”

As Gartner looks out five years, the analyst firm anticipates more progress on the information governance front, because the “entire e-discovery industry is founded on a pile of largely redundant, outdated and trivial data.”  At some point this digital landfill is going to burst and organizations are finally realizing that if they don’t act now, it may be too late.

“During the past 10 to 15 years, corporations and individuals have allowed this data to accumulate for the simple reason that it was easy — if not necessarily inexpensive — to do so. … E-discovery has proved to be a huge motivation for companies to rethink their information management policies. The problem of determining what is relevant from a mass of information will not be solved quickly, but with a clear business driver (e-discovery) and an undeniable return on investment (deleting data that is no longer required for legal or business purposes can save millions of dollars in storage costs) there is hope for the future.”

 

The Gartner Magic Quadrant for E-Discovery Software is insightful for a number of reasons, not the least of which is how it portrays the developing maturity of the electronic discovery space. In just a few short years, the niche has sprouted wings, raced to $1B and is seeing massive consolidation. As we enter the next phase of maturation, we’ll likely see the sector morph into a larger information governance play, given customers’ “full-spectrum” functionality requirements and the presence of larger, mainstream software companies. Next on the horizon is the subsuming of eDiscovery into the bigger information governance umbrella, as well as into other larger adjacent plays like “enterprise information archiving, enterprise content management, enterprise search and content analytics.” The rapid maturation of the eDiscovery industry will inevitably result in growing pains for vendors and practitioners alike, but in the end we’ll all benefit.

 

About the Magic Quadrant
Gartner does not endorse any vendor, product or service depicted in its research publications, and does not advise technology users to select only those vendors with the highest ratings. Gartner research publications consist of the opinions of Gartner’s research organization and should not be construed as statements of fact. Gartner disclaims all warranties, expressed or implied, with respect to this research, including any warranties of merchantability or fitness for a particular purpose.

Gartner’s “2012 Magic Quadrant for E-Discovery Software” Provides a Useful Roadmap for Legal Technologists

Tuesday, May 29th, 2012

Gartner has just released its 2012 Magic Quadrant for E-Discovery Software, which is an annual report that analyzes the state of the electronic discovery industry and provides a detailed vendor-by-vendor evaluation. For many, particularly those in IT circles, Gartner is an unwavering north star used to divine software market leaders, in topics ranging from business intelligence platforms to wireless LAN infrastructures. When IT professionals are on the cusp of procuring complex software, they look to analysts like Gartner for quantifiable and objective recommendations – as a way to inform and buttress their own internal decision-making processes.

But for some in the legal technology field (particularly attorneys), looking to Gartner for software analysis can seem a bit foreign. Legal practitioners are often more comfortable with the “good ole days” when the only navigation aid in the eDiscovery world was provided by the dynamic duo of George Socha and Tom Gelbmann, who (beyond creating the EDRM) were pioneers of the first eDiscovery rankings survey. Albeit somewhat short-lived, their Annual Electronic Discovery[i] Survey ranked the hundreds of eDiscovery providers and bucketed the top tier players in both software and litigation support categories. The scope of their mission was grand, and they were perhaps ultimately undone by the breadth of their task (stopping the Survey in 2010), particularly as the eDiscovery landscape continued to mature, fragment and evolve.

Gartner, which has perfected the analysis of emerging software markets, appears to have taken on this challenge with an admittedly narrower (and likely more achievable) focus. Gartner published its first Magic Quadrant (MQ) for the eDiscovery industry last year, and in the 2012 Magic Quadrant for E-Discovery Software report it has evaluated the top 21 electronic discovery software vendors. As with all Gartner MQs, the methodology is rigorous; in order to be included, vendors must meet quantitative requirements in market penetration and customer base and are then evaluated upon criteria for completeness of vision and ability to execute.

By eliminating the legion of service providers and law firms, Gartner has made their mission both more achievable and perhaps (to some) less relevant. When talking to certain law firms and litigation support providers, some seem to treat the Gartner initiative (and subsequent Magic Quadrant) like a map from a land they never plan to visit. But, even if they’re not directly procuring eDiscovery software, the Gartner MQ should still be seen by legal technologists as an invaluable tool to navigate the perils of the often confusing and shifting eDiscovery landscape – particularly with the rash of recent M&A activity.

Beyond the quadrant positions[ii], comprehensive analysis and secular market trends, one of the key underpinnings of the Magic Quadrant is that the ultimate position of a given provider is in many ways an aggregate measurement of overall customer satisfaction. Similar in ways to the net promoter concept (which is a tool to gauge the loyalty of a firm’s customer relationships simply by asking how likely that customer is to recommend a product/service to a colleague), the Gartner MQ can be looked at as the sum total of all customer experiences.[iii] As such, this usage/satisfaction feedback is relevant even for parties that aren’t purchasing or deploying electronic discovery software per se. Outside counsel, partners, litigation support vendors and other interested parties may all end up interacting with a deployed eDiscovery solution (particularly when such solutions have expanded their reach as end-to-end information governance platforms) and they should want their chosen solution to be used happily and seamlessly in a given enterprise. There’s no shortage of stories about unhappy outside counsel (for example) who complain about being hamstrung by a slow, first-generation eDiscovery solution that ultimately makes their job harder (and riskier).

Next, the Gartner MQ is also a good shorthand way to understand more nuanced topics like time to value and total cost of ownership. While of course related to overall satisfaction, the Magic Quadrant does indirectly address the query about whether the software does what it says it will (delivering on the promise) in the time frame that is claimed (delivering the promise in a reasonable time frame) since these elements are typically subsumed in the satisfaction metric. This kind of detail is disclosed in the numerous interviews that Gartner conducts to go behind the scenes, querying usage and overall satisfaction.

While no navigation aid ensures that a traveler won’t get lost, the Gartner Magic Quadrant for E-Discovery Software is a useful map of the electronic discovery software world. And, particularly looking at year-over-year trends, the MQ provides a useful way for legal practitioners (beyond the typical IT users) to get a sense of the electronic discovery market landscape as it evolves and matures. After all, staying on top of the eDiscovery industry has a range of benefits beyond just software procurement.





[i] Note, in the good ole days folks still used two words to describe eDiscovery.

[ii] Gartner has a proprietary matrix that it uses to place the entities into four quadrants: Leaders, Challengers, Visionaries and Niche Players.

[iii] Under the Ability to Execute axis Gartner weighs a number of factors including “Customer Experience: Relationships, products and services or programs that enable clients to succeed with the products evaluated. Specifically, this criterion includes implementation experience, and the ways customers receive technical support or account support. It can also include ancillary tools, the existence and quality of customer support programs, availability of user groups, service-level agreements and so on.”

District Court Upholds Judge Peck’s Predictive Coding Order Over Plaintiff’s Objection

Monday, April 30th, 2012

In a decision that advances the predictive coding ball one step further, United States District Judge Andrew L. Carter, Jr. upheld Magistrate Judge Andrew Peck’s order in Da Silva Moore, et al. v. Publicis Groupe, et al. despite Plaintiffs’ multiple objections. Although Judge Carter rejected all of Plaintiffs’ arguments in favor of overturning Judge Peck’s predictive coding order, he did not rule on Plaintiffs’ motion to recuse Judge Peck from the current proceedings – a matter that is expected to be addressed separately at a later time. Whether or not a successful recusal motion will alter this or any other ruling in the case remains to be seen.

Finding that it was within Judge Peck’s discretion to conclude that the use of predictive coding technology was appropriate “under the circumstances of this particular case,” Judge Carter summarized Plaintiffs’ key arguments listed below and rejected each of them in his five-page Opinion and Order issued on April 26, 2012.

  • the predictive coding method contemplated in the ESI protocol lacks generally accepted reliability standards,
  • Judge Peck improperly relied on outside documentary evidence,
  • Defendant MSLGroup’s (“MSL’s”) expert is biased because the use of predictive coding will reap financial benefits for his company,
  • Judge Peck failed to hold an evidentiary hearing and adopted MSL’s version of the ESI protocol on an insufficient record and without proper Rule 702 consideration.

Since Judge Peck’s earlier order is “non-dispositive,” Judge Carter identified and applied the “clearly erroneous or contrary to law” standard of review in rejecting Plaintiffs’ request to overturn the order. Central to Judge Carter’s reasoning is his assertion that any confusion regarding the ESI protocol is immaterial because the protocol “contains standards for measuring the reliability of the process and the protocol builds in levels of participation by Plaintiffs.” In other words, Judge Carter essentially dismisses Plaintiffs’ concerns as premature on the grounds that the current protocol provides a system of checks and balances that protects both parties. To be clear, that doesn’t necessarily mean Plaintiffs won’t get a second bite at the apple if problems with MSL’s productions surface.

For now, however, Judge Carter seems to be saying that although Plaintiffs must live with the current order, they are by no means relinquishing their rights to a fair and just discovery process. In fact, the existing protocol allows Plaintiffs to actively participate in and monitor the entire process closely. For example, Judge Carter writes that, “if the predictive coding software is flawed or if Plaintiffs are not receiving the types of documents that should be produced, the parties are allowed to reconsider their methods and raise their concerns with the Magistrate Judge.”

Judge Carter also specifically addresses Plaintiffs’ concerns related to statistical sampling techniques, which could ultimately prove to be their meatiest argument. A key area of disagreement between the parties is whether MSL is reviewing enough documents to ensure relevant documents are not completely overlooked even if this complex process is executed flawlessly. Addressing this point, Judge Carter states that, “If the method provided in the protocol does not work or if the sample size is indeed too small to properly apply the technology, the Court will not preclude Plaintiffs from receiving relevant information, but to call the method unreliable at this stage is speculative.”

Although most practitioners are focused on seeing whether and how many of these novel predictive coding issues play out, it is important not to overlook two key nuggets of information lining Judge Carter’s Opinion and Order. First, Judge Carter’s statement that “[t]here simply is no review tool that guarantees perfection” serves as an acknowledgement that “reasonableness” is the standard by which discovery should be measured, not “perfection.” Second, Judge Carter’s acknowledgement that manual review with keyword searches may be appropriate in certain situations should serve as a wake-up call for those who think predictive coding technology will replace all predecessor technologies. To the contrary, predictive coding is a promising new tool to add to the litigator’s tool belt, but it is not necessarily a replacement for all other technology tools.

Plaintiffs in Da Silva Moore may not have received the ruling they were hoping for, but Judge Carter’s Opinion and Order makes it clear that the courthouse door has not been closed. Given the controversy surrounding this case, one can assume that Plaintiffs are likely to voice many of their concerns at a later date as discovery proceeds. In other words, don’t expect all of these issues to fade away without a fight.

First State Court Issues Order Approving the Use of Predictive Coding

Thursday, April 26th, 2012

On Monday, Virginia Circuit Court Judge James H. Chamblin issued what appears to be the first state court Order approving the use of predictive coding technology for eDiscovery. On Tuesday, Law Technology News reported that Judge Chamblin issued the two-page Order in Global Aerospace Inc., et al, v. Landow Aviation, L.P. dba Dulles Jet Center, et al, over Plaintiffs’ objection that traditional manual review would yield more accurate results. The case stems from the collapse of three hangars at the Dulles Jet Center (“DJC”) that occurred during a major snowstorm on February 6, 2010. The Order was issued at Defendants’ request after opposing counsel objected to their proposed use of predictive coding technology to “retrieve potentially relevant documents from a massive collection of electronically stored information.”

In Defendants’ Memorandum in Support of their motion, they argue that a first pass manual review of approximately two million documents would cost two million dollars and only locate about sixty percent of all potentially responsive documents. They go on to state that keyword searching might be more cost-effective “but likely would retrieve only twenty percent of the potentially relevant documents.” On the other hand, they claim predictive coding “is capable of locating upwards of seventy-five percent of the potentially relevant documents and can be effectively implemented at a fraction of the cost and in a fraction of the time of linear review and keyword searching.”

In their Opposition Brief, Plaintiffs argue that Defendants should produce “all responsive documents located upon a reasonable inquiry,” and “not just the 75%, or less, that the ‘predictive coding’ computer program might select.” They also characterize Defendants’ request to use predictive coding technology instead of manual review as a “radical departure from the standard practice of human review” and point out that Defendants cite no case in which a court compelled a party to accept a document production selected by a “’predictive coding’ computer program.”

Considering predictive coding technology is new to eDiscovery and first generation tools can be difficult to use, it is not surprising that both parties appear to frame some of their arguments curiously. For example, Plaintiffs either mischaracterize or misunderstand Defendants’ proposed workflow given their statement that Defendants want a “computer program to make the selections for them” instead of having “human beings look at and select documents.” Importantly, predictive coding tools require human input for a computer program to “predict” document relevance. Additionally, the proposed approach includes an additional human review step prior to production that involves evaluating the computer’s predictions.
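The role of human input can be sketched in a few lines of Python. This is a toy illustration and not the tool or protocol proposed in Global Aerospace: the word-weighting scheme (smoothed log-odds, in the style of naive Bayes), the function names, and the sample documents are all assumptions made for the example. It shows only the general shape of the workflow: reviewers code a seed set, the program learns from those decisions, and the ranked output goes back to humans for review.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase and split on whitespace (deliberately simplistic)."""
    return text.lower().split()


def train(seed_set):
    """Learn per-word weights from documents coded by human reviewers.

    seed_set is a list of (text, is_relevant) pairs. Each weight is a
    smoothed log-odds ratio: positive words suggest relevance.
    """
    counts = {True: Counter(), False: Counter()}
    for text, is_relevant in seed_set:
        counts[is_relevant].update(tokenize(text))
    vocab = set(counts[True]) | set(counts[False])
    totals = {label: sum(c.values()) for label, c in counts.items()}
    weights = {}
    for word in vocab:
        p_rel = (counts[True][word] + 1) / (totals[True] + len(vocab))
        p_not = (counts[False][word] + 1) / (totals[False] + len(vocab))
        weights[word] = math.log(p_rel / p_not)
    return weights


def score(weights, text):
    """Sum the learned weights of the words in a document (unknown words add 0)."""
    return sum(weights.get(word, 0.0) for word in tokenize(text))


def rank_for_review(weights, documents):
    """Order unreviewed documents so the likeliest relevant ones come first."""
    return sorted(documents, key=lambda doc: score(weights, doc), reverse=True)


# Hypothetical seed set coded by human reviewers.
seed = [
    ("hangar roof collapse snow load", True),
    ("roof inspection report hangar", True),
    ("lunch menu friday", False),
    ("holiday party invitation", False),
]
unreviewed = ["friday lunch order", "snow load calculation for hangar roof"]

weights = train(seed)
ranked = rank_for_review(weights, unreviewed)
```

In this sketch the program never decides relevance on its own: the weights come entirely from the reviewers' coding of the seed set, and the ranked list is an input to the human review step rather than a final production.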

On the other hand, some of Defendants’ arguments also seem to stray a bit off course. For example, Defendants seem to unduly minimize the value of using other tools in the litigator’s tool belt like keyword search or topic grouping to cull data prior to using potentially more expensive predictive coding technology. To broadly state that keyword searching “likely would retrieve only twenty percent of the potentially relevant documents” seems to ignore two facts. First, keyword search for eDiscovery is not dead. To the contrary, keyword searches can be an effective tool for broadly culling data prior to manual review and for conducting early case assessments. Second, the success of keyword searches and other litigation tools depends as much on the end user as the technology. In other words, the carpenter is just as important as the hammer.

The Order issued by Judge Chamblin, the current Chief Judge for the 20th Judicial Circuit of Virginia, states that “Defendants shall be allowed to proceed with the use of predictive coding for purposes of the processing and production of electronically stored information.” In a handwritten notation, the Order further provides that the processing and production is to be completed within 120 days, with “processing” to be completed within 60 days and “production to follow as soon as practicable and in no more than 60 days.” The Order does not mention whether the parties are required to agree upon a protocol, an issue that has plagued the court and the parties in the ongoing Da Silva Moore, et al. v. Publicis Groupe, et al. case for months.

Global Aerospace is the third known predictive coding case on record, but appears to present yet another set of unique legal and factual issues. In Da Silva Moore, Judge Andrew Peck of the Southern District of New York rang in the New Year by issuing the first known court order endorsing the use of predictive coding technology.  In that case, the parties agreed to the use of predictive coding technology, but continue to fight like cats and dogs to establish a mutually agreeable protocol.

Similarly, in the Northern District of Illinois, which sits within the Seventh Circuit, Judge Nan Nolan is tackling the issue of predictive coding technology in Kleen Products, LLC, et al. v. Packaging Corporation of America, et al. In Kleen, Plaintiffs basically ask that Judge Nolan order Defendants to redo their production even though Defendants have spent thousands of hours reviewing documents, have already produced over a million documents, and have completed over 99 percent of their review. The parties have already presented witness testimony in support of their respective positions over the course of two full days, and more testimony may be required before Judge Nolan issues a ruling.

What is interesting about Global Aerospace is that Defendants proactively sought court approval to use predictive coding technology over Plaintiffs’ objections. This scenario is different from Da Silva Moore because the parties in Global Aerospace have not agreed to the use of predictive coding technology. Similarly, it appears that Defendants have not already substantially completed document review and production, as the defendants had in Kleen Products. Instead, the Global Aerospace Defendants appear to have sought protection from the court before moving full steam ahead with predictive coding technology, and they have received the court’s blessing over Plaintiffs’ objection.

A key issue that the Order does not address is whether the parties will be required to decide on a mutually agreeable protocol before proceeding with the use of predictive coding technology. As stated earlier, the inability to define such a protocol has plagued the court and the parties for months in Da Silva Moore, et al. v. Publicis Groupe, et al. Similarly, in Kleen, the court was faced with issues related to the protocol for using technology tools. Both cases highlight the fact that regardless of which eDiscovery technology tools are selected from the litigator’s tool belt, the tools must be used properly in order for discovery to be fair.

Judge Chamblin left the barn door wide open for Plaintiffs to lodge future objections, perhaps setting the stage for yet another heated predictive coding battle. Importantly, the Judge issued the Order “without prejudice to a receiving party” and notes that parties can object to the “completeness or the contents of the production or the ongoing use of predictive coding technology.” Given the ongoing challenges in Da Silva Moore and Kleen, don’t be surprised if the parties in Global Aerospace Inc. face some of the same process-based challenges as their predecessors. Hopefully some of the early challenges related to the use of first generation predictive coding tools can be overcome as case law continues to develop and as next generation predictive coding tools become easier to use. Stay tuned as the facts, testimony, and arguments related to the Da Silva Moore, Kleen Products, and Global Aerospace Inc. cases continue to evolve.

Breaking News: Court Clarifies Duty to Preserve Evidence, Denies eDiscovery Sanctions Motion Against Pfizer

Wednesday, April 18th, 2012

It is fortunately becoming clearer that organizations do not need to preserve information until litigation is “reasonably anticipated.” In Brigham Young University v. Pfizer (D. Utah Apr. 16, 2012), the court denied the plaintiff university’s fourth motion for discovery sanctions against Pfizer, likely ending its chance to obtain a “game-ending” eDiscovery sanction. The case, which involves disputed claims over the discovery and development of prominent anti-inflammatory drugs, is set for trial on May 29, 2012.

In Brigham Young, the university pressed its case for sanctions against Pfizer based on a vastly expanded concept of a litigant’s preservation duty. Relying principally on the controversial Phillip M. Adams & Associates v. Dell case, the university argued that Pfizer’s “duty to preserve runs to the legal system generally.” The university reasoned that just as the defendant in the Adams case was “sensitized” by earlier industry lawsuits to the real possibility of plaintiff’s lawsuit, Pfizer was likewise put on notice of the university’s claims due to related industry litigation.

The court rejected such a sweeping characterization of the duty to preserve, opining that it was “simply too broad.” Echoing the concerns articulated by the Advisory Committee when it framed the 2006 amendments to the Federal Rules of Civil Procedure (FRCP), the court took pains to emphasize the unreasonable burdens that parties such as Pfizer would face if such a duty were imposed:

“It is difficult for the Court to imagine how a party could ever dispose of information under such a broad duty because of the potential for some distantly related litigation that may arise years into the future.”

The court also rejected the university’s argument because such a position failed to appreciate the basic workings of corporate records retention policies. As the court reasoned, “[e]vidence may simply be discarded as a result of good faith business procedures.” When those procedures operate to inadvertently destroy evidence before the duty to preserve is triggered, the court held that sanctions should not issue: “The Federal Rules protect from sanctions those who lack control over the requested materials or who have discarded them as a result of good faith business procedures.”

The Brigham Young case is significant for a number of reasons. First, it reiterates that organizations need not keep electronically stored information (ESI) for legal or regulatory purposes until litigation is reasonably anticipated and the duty to preserve is triggered. As American courts have almost uniformly held since the 1997 case of Concord Boat Corp. v. Brunswick Corp., organizations are not required to keep every piece of paper, every email, every electronic document and every backup tape.

Second, Brigham Young emphasizes that organizations can and should use document retention protocols to rid themselves of data stockpiles. Absent a preservation duty or other exceptional circumstances, paring back ESI pursuant to “good faith business procedures” (such as a neutral retention policy) will be protected under the law.

Finally, Brigham Young narrows the holding of the Adams case to its particular facts. The Adams case has been particularly troublesome to organizations as it arguably expanded their preservation duty in certain circumstances. However, Brigham Young clarified that this expansion was unwarranted in the instant case, particularly given that Pfizer documents were destroyed pursuant to “good faith business procedures.”

In summary, Brigham Young teaches that organizations will be protected from eDiscovery sanctions to the extent they destroy ESI in good faith pursuant to a reasonable records retention policy. This will likely bring a sigh of relief to enterprises struggling with the information explosion since it encourages confident deletion of data when the coast is clear of a discrete litigation event.

Take Two and Call me in the Morning: U.S. Hospitals Need an Information Governance Remedy

Wednesday, April 11th, 2012

Given the vast amount of sensitive information and legal exposure faced by hospitals today, it’s a mystery why these organizations aren’t taking advantage of enabling technologies to minimize risk. Compliance with both HIPAA and the HITECH Act is often achieved by manual, ad hoc methods, which are hazardous at best. In the past, state and federal auditing environments have not been very aggressive in ensuring compliance, but that is changing. While many hospitals have invested in high tech records management systems (EMR/EHR), those systems do not encompass the entire information and data environment within a hospital. Sensitive information often finds its way into and onto systems outside the reach of EMR/EHR systems, bringing with it increased exposure to security breach and legal liability.

This information overload often metastasizes into email (both hospital and personal), attachments, portable storage devices, file, web and development servers, desktops and laptops, home or affiliated practice’s computers and mobile devices such as iPads and smart phones. These avenues for the dissemination and receipt of information expand the information governance challenge and data security risks. Surprisingly, the feedback from the healthcare sector suggests that hospitals rarely get sued in federal court.

One place hospitals do not want to be is the “Wall of Shame,” otherwise known as the HHS website that, as of June 9, 2011, has detailed 281 Health Insurance Portability and Accountability Act (HIPAA) security violations, each affecting more than 500 individuals. Overall, physical theft and loss accounted for about 63% of the reported breaches. Unauthorized access / disclosure accounted for another 16%, while hacking was only 6%. While Software Advice reasons that these statistics seem to indicate that physical theft has been the reason for the majority of breaches, it should also be considered that, due to the lack of data loss prevention technology, many hospitals are unaware of breaches that have occurred and therefore cannot report on them.

There are a myriad of reasons hospitals aren’t landing on the front page of the newspaper with the same frequency as other businesses and government agencies when it comes to security breach, and document retention and eDiscovery blunders. But, the underlying contagion is not contained and it certainly is not benign. Feedback from the field reveals some alarming symptoms of the unhealthy state of healthcare information governance, including:

  • uncontrolled .pst files
  • exploding storage growth
  • missing or incomplete data retention rules
  • doctors/nurses storing and sending sensitive data via their personal email, iPads and smartphones
  • encryption rules that rely on individuals to determine what to encrypt
  • data backup policies that differ from data retention and information governance rules
  • little to no compliance training
  • and many times non-existent data loss prevention efforts.

This results in the need for more storage, while creating larger legal liability, an indefensible eDiscovery posture, and the risk of breach.

This problem remains latent in most hospitals because they are not yet feeling the pain of massive and multiple lawsuits, large invoices from outside law firms or the operational challenges and costs incurred from searching through mountains of dispersed data. The symptoms are observable, the pathology is present, the problem is real and the pain is about to acutely present itself as more states begin to deeply embrace eDiscovery requirements and government regulators increase audit frequency and fine amounts. Another less talked about reason hospitals have not had the same pressure to search and produce their data pursuant to litigation is that cases are being settled before they even get to the discovery stage. The lack of well-developed information governance practices leads to cases being settled too soon, for too much money, when they otherwise may not have needed to settle at all.

The Patient’s Symptoms Were Treated, but the Patient’s Data Still Needs Medicine

What is still unclear is why hospitals, given their compliance requirements and tightening IT budgets, aren’t archiving, classifying, and protecting their data with the same type of innovation they are demonstrating in their cutting edge patient care technology. In this realm, two opposite ends of the IT innovation spectrum seem to co-exist in the hospital’s data environment. This dichotomy leaves much of a hospital’s data unprotected, unorganized and uncontrolled. Hospitals are experiencing increasing data security breaches and often are not aware that a breach or data loss has occurred. As more patient data is created and copied in electronic format, used in and exposed by an increasing number of systems and delivered on emerging mobile platforms, the legal and audit risks are compounding on top of a faulty or missing information governance foundation.

Many hospitals have no retention schedules or data classification rules applied to existing information, which often results in a checkbox compliance mentality and a keep-everything-forever practice. Additionally, many hospitals have no ability to apply a comprehensive legal hold across different data sources and lack technology to stop or alert them when there has been a breach.

Information Governance and Data Health in Hospitals

With the mandated push for paper to be converted to digital records, many hospitals are now evaluating the interplay of their various information management and distribution systems. They must consider the newly scanned legacy data (or soon to be scanned), and if they have been operating without an archive, they must now look to implement a searchable repository where they can collectively apply document retention and records management while decreasing the amount of storage needed to retain the data.  We are beginning to see internal counsel leading the way to make this initiative happen across business units. Different departments are coming together to pool resources in tight economic and high regulation times that require collaboration.  We are at the beginning of a widespread movement in the healthcare industry for archiving, data classification and data loss prevention as hospitals link their increasing compliance and data loss requirements with the need to optimize and minimize storage costs. Finally, it comes as no surprise that the amount of data hospitals are generating is crippling their infrastructures, breaking budgets and serving as the primary motivator for change absent lawsuits and audits.

These factors are bringing together various stakeholders into the information governance conversation, helping to paint a very clear picture that putting in place a comprehensive information governance solution is in the entire hospital’s best interest. The symptoms are clear, the problem is treatable, the prescription for information governance is well proven. Hospitals can begin this process by calling an information governance meeting with key stakeholders and pursuing an agenda set around examining their data map and assessing areas of security vulnerability, as well as auditing the present state of compliance with regulations for the healthcare industry.

Editor’s note: This post was co-authored with Eric Heck, Healthcare Account Manager at Symantec.  Eric has over 25 years of experience in applying technology to emerging business challenges, and currently works with healthcare providers and hospitals to manage the evolving threat landscape of compliance, security, data loss and information governance within operational, regulatory and budgetary constraints.

The eDiscovery “Passport”: The First Step to Succeeding in International Legal Disputes

Monday, April 2nd, 2012

The increase in globalization continues to erase borders throughout the world economy. Organizations now routinely conduct business in countries that were previously unknown to their industry vertical.  The trend of global integration is certain to increase, with reports such as the Ernst & Young 2011 Global Economic Survey confirming that 74% of companies believe that globalization, particularly in emerging markets, is essential to their continued vitality.

Not surprisingly, this trend of global integration has also led to a corresponding increase in cross-border litigation. For example, parties to U.S. litigation are increasingly seeking discovery of electronically stored information (ESI) from other litigants and third parties located in Continental Europe and the United Kingdom. Since traditional methods under the Federal Rules of Civil Procedure (FRCP) may be unacceptable for discovering ESI in those forums, the question then becomes how such information can be obtained.

At this point, many clients and their counsel are unaware how to safely navigate these international waters. The short answer for how to address these issues for much of Europe would be to resort to the Hague Convention of March 18, 1970 on the Taking of Evidence Abroad in Civil or Commercial Matters (Hague Convention). Simply referring to the Hague Convention, however, would ignore the complexities of electronic discovery in Europe. Worse, it would sidestep the glaring knowledge gap that exists in the United States regarding the cultural differences distinguishing European litigation from American proceedings.

The ability to bridge this gap with an awareness of the discovery processes in Europe is essential. Understanding that process is similar to holding a valid passport for international travel. Just as a passport is required for travelers to successfully cross into foreign lands, an “eDiscovery Passport™” is likewise necessary for organizations to effectively conduct cross-border discovery.

The Playing Field for eDiscovery in Continental Europe

Litigation in Continental Europe is culturally distinct from American court proceedings. “Discovery,” as it is known in the United States, does not exist in Europe. Interrogatories, categorical document requests and requests for admissions are simply unavailable as European discovery devices. Instead, European countries generally allow only a limited exchange of documents, with parties typically disclosing only that information that supports their claims.

The U.S. Court of Appeals for the Seventh Circuit recently commented on this key distinction between European and American discovery when it observed that “the German legal system . . . does not authorize discovery in the sense of Rule 26 of the Federal Rules of Civil Procedure.” The court went on to explain that “[a] party to a German lawsuit cannot demand categories of documents from his opponent. All he can demand are documents that he is able to identify specifically—individually, not by category.” Heraeus Kulzer GmbH v. Biomet, Inc., 633 F.3d 591, 596 (7th Cir. 2011).

Another key distinction to discovery in Continental Europe is the lack of rules or case law requiring the preservation of ESI or paper documents. This stands in sharp contrast to American jurisprudence, which typically requires organizations to preserve information as soon as they reasonably anticipate litigation. See, e.g., Micron Technology, Inc. v. Rambus Inc., 645 F.3d 1311, 1320 (Fed.Cir. 2011). In Europe, while an implied preservation duty could arise if a court ordered the disclosure of certain materials, the penalties for European non-compliance are typically not as severe as those issued by American courts.

Only the nations of the United Kingdom, from which American notions of litigation are derived, have discovery obligations that are more similar to those in the United States. For example, in the combined legal system of England and Wales, a party must disclose to the other side information adverse to its claims. Moreover, England and Wales also suggest that parties should take affirmative steps to prepare for disclosure. According to the High Court in Earles v Barclays Bank Plc [2009] EWHC 2500 (Mercantile) (08 October 2009), this includes having “an efficient and effective information management system in place to provide identification, preservation, collection, processing, review analysis and production of its ESI in the disclosure process in litigation and regulation.” For organizations looking to better address these issues, a strategic and intelligent information governance plan offers perhaps the best chance to do so.

Hostility to International Discovery Requests

Despite some similarities between the U.S. and the U.K., Europe as a whole retains a certain amount of cultural hostility to pre-trial discovery. Given this fact, it should come as no surprise that international eDiscovery requests made pursuant to the Hague Convention are frequently denied. Requests are often rejected because they are overly broad.  In addition, some countries such as Italy simply refuse to honor requests for pre-trial discovery from common law countries like the United States. Moreover, other countries like Austria are not signatories to the Hague Convention and will not accept requests made pursuant to that treaty. To obtain ESI from those countries, litigants must take their chances with the cumbersome and time-consuming process of submitting letters rogatory through the U.S. State Department. Finally, requests for information that seek email or other “personal information” (i.e., information that could be used to identify a person) must additionally satisfy a patchwork of strict European data protection rules.

Obtaining an eDiscovery Passport

This backdrop of complexity underscores the need for both lawyers and laymen to understand the basic principles governing eDisclosure in Europe. Such a task should not be seen as daunting. There are resources that provide straightforward answers to these issues at no cost to the end-user. For example, Symantec has just released a series of eDiscovery Passports™ that touch on the basic issues underlying disclosure and data privacy in the United Kingdom, France, Germany, Holland, Belgium, Austria, Switzerland, Italy and Spain. Organizations such as The Sedona Conference have also made available materials that provide significant detail on these issues, including its recently released International Principles on Discovery, Disclosure and Data Protection.

These resources can provide valuable information to clients and counsel alike and better prepare litigants for the challenges of pursuing legal rights across international boundaries. By so doing, organizations can moderate the effects of legal risk and more confidently pursue their globalization objectives.

eDiscovery Down Under: New Zealand and Australia Are Not as Different as They Sound, Mate!

Thursday, March 29th, 2012

Shortly after arriving in Wellington, New Zealand, I picked up the Dominion Post newspaper and read its lead article: a story involving U.S. jurisdiction being exercised over billionaire NZ resident Mr. Kim Dotcom. The article reinforced the challenges we face with blurred legal and data governance issues presented by the globalization of the economy and the expansive reach of the internet. Originally from Germany, and having changed his surname to reflect the origin of his fortune, Mr. Dotcom has become all too familiar in NZ of late. He has just purchased two opulent homes in NZ, and has become an internationally controversial figure for internet piracy. Mr. Dotcom’s legal troubles arise out of his internet business that enables illegal downloads of pirated material between users, which allegedly is powering the largest copyright infringement in global history. It is approximated that his website constitutes 4% of the internet traffic in the world, which means there could be tons of discovery in this case (or, cases).

The most recent legal problems Mr. Dotcom faces are with U.S. authorities who want to extradite him to face copyright charges worth $500 million stemming from his Megaupload file-sharing website. From a criminal and record-keeping standpoint, Mr. Dotcom’s issues highlight the need for and use of appropriate technologies. In order to establish a case against him, it’s likely that search technologies were deployed by U.S. intelligence agencies to piece together Mr. Dotcom’s activities, banking information, emails and the data transfers on his site. In a case like this, where intelligence agencies would need to collect, search and cull email from so many different geographies and data sources down to just the relevant information, technologies that link email conversation threads and give transparent, search-based insight into a data collection would provide immense value. Additionally, the Immigration bureau in New Zealand has been required to release hundreds of documents about Mr. Dotcom’s residency application that were requested under the Official Information Act (OIA). The records that Immigration had to produce were likely pulled from their archive or records management system in NZ, and then redacted for private information before production to the public.

The same tools that we use here in the U.S. for investigatory and compliance purposes, as well as for litigation, are needed in Australia and New Zealand to build a criminal case or to comply with the OIA. Information governance technology in APAC is trending first toward government agencies, which are purchasing archiving and eDiscovery technologies more rapidly than private companies. Why is this? One reason could be that because the governments in APAC have a larger responsibility for healthcare, education and the protection of privacy, they are more invested in meeting compliance requirements and staying off the front page of the news for shortcomings. APAC private enterprises that are small or mid-sized and are not yet doing international business do not have the same archiving and eDiscovery needs that large government agencies do, nor do they face litigation in the same way their American counterparts do. Large global companies should assume, no matter where they are based, that they may be subject to litigation wherever they are doing business.

An interesting NZ use case on the enterprise level is that of Transpower (the quasi-governmental energy agency), where compliance with both “private and public” requirements is mandatory. Transpower is an organisation that is government-owned, yet operates for a profit. Sally Myles, an experienced records manager who recently came to Transpower to head up information governance initiatives, says,

“We have to comply with the Public Records Act of 2005, public requests for information are frequent as we are under constant scrutiny about where we will develop our plants. We also must comply with the Privacy Act of 1993. My challenge is to get the attention of our leadership to demonstrate why we need to make these changes and show them a plan for implementation as well as cost savings.”

Myles’ comments indicate NZ is facing many of the same information challenges we are here in the US with storage, records management and searching for meaningful information within the organisation.

Australia, New Zealand and U.S. Commonalities

In Australia and NZ, litigation is not seen as a compelling business driver the same way it is in the U.S. This is because many of the information governance needs of organisations are driven by regulatory, statutory and compliance requirements and the environment is not as litigious as it is in the U.S. The Official Information Act in NZ and the Freedom of Information Act in Australia are analogous to the Freedom of Information Act (FOIA) here in the U.S. The requirements to produce public records alone justify the use of technology to manage large volumes of data and produce appropriately redacted information to the public. This is true regardless of litigation. Additionally, there are now cases, like DuPont or Mr. Dotcom’s, that legitimize the risk of litigation with the U.S. The fact that implementing an information governance product suite will also enable a company to be prepared for litigation is a beneficial by-product for many entities, as they need technology for record keeping and privacy reasons anyway. In essence, the same capabilities are achieved at the end of the day, regardless of the impetus for implementing a solution.

The Royal Commission – The Ultimate eDiscovery Vehicle

One way to think about Australian Royal Commissions (RCs) is to see them as a version of the U.S. government investigation. A key difference, however, is that a U.S. government investigation is typically into private companies. Conversely, a Royal Commission is typically an investigation into a government body after a major tragedy and it is initiated by the Head of State. An RC is an ad-hoc, formal, public inquiry into a defined issue with considerable discovery powers. These powers can be greater than those of a judge and are restricted to the scope and terms of reference of the Commission. RCs are called to look into matters of great importance and usually have very large budgets. The RC is charged with researching the issue, consulting experts both within and outside of government and developing findings to recommend changes to the law or other courses of action. RCs have immense investigatory powers, including summoning witnesses under oath, offering indemnities, seizing documents and other evidence (sometimes including those normally protected, such as classified information), holding hearings in camera if necessary and, in a few cases, compelling government officials to aid in the execution of the Commission.

These expansive powers give the RC the opportunity to employ state of the art technology and to skip the slow bureaucratic decision making processes found within the government when it comes to implementing technological change. For this reason, eDiscovery will initially continue to increase in the government sector at a more rapid pace than in the private sector in the Asia Pacific region. This is because litigation is less prevalent in the Asia Pacific, and because the RC is a unique investigatory vehicle with the most far-reaching authority for discovering information. Moreover, the timeframes for RCs are tight and their scopes are broad, making them hair-on-fire situations that move quickly.

While the APAC information management environment does not have the exact same drivers the U.S. market does, it definitely has the same archiving, eDiscovery and technology needs for different reasons. Another key point is that the APAC archiving and eDiscovery market will likely be driven by the government as records, search and production requirements are the main compliance needs in Australia and NZ. APAC organisations would be well served by beginning to modularly implement key elements of an information governance plan, as globalization is driving us all to a more common and automated approach to data management.