
Posts Tagged ‘e-discovery costs’

The Gartner 2013 Magic Quadrant for eDiscovery Software is Out!

Wednesday, June 12th, 2013

This week marks the release of the 3rd annual Gartner Magic Quadrant for e-Discovery Software report.  In the early days of eDiscovery, most companies outsourced almost every sizeable project to vendors and law firms so eDiscovery software was barely a blip on the radar screen for technology analysts. Fast forward a few years to an era of explosive information growth and rising eDiscovery costs and the landscape has changed significantly. Today, much of the outsourced eDiscovery “services” business has been replaced by eDiscovery software solutions that organizations bring in house to reduce risk and cost. As a result, the enterprise eDiscovery software market is forecast to grow from $1.4 billion in total software revenue worldwide in 2012 to $2.9 billion by 2017. (See Forecast:  Enterprise E-Discovery Software, Worldwide, 2012 – 2017, Tom Eid, December, 2012).
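For readers who want to see what that forecast implies on an annual basis, here is a minimal sketch of the arithmetic. It assumes smooth compound growth over the five years from 2012 to 2017, which is a simplification on our part and not something the Gartner forecast states; the function name `cagr` is purely illustrative.

```python
def cagr(start_value, end_value, years):
    """Compound annual growth rate as a fraction (0.10 means 10% per year)."""
    return (end_value / start_value) ** (1.0 / years) - 1.0


# Forecast figures quoted above, in billions of US dollars.
# Assumption: growth compounds evenly across the 2012-2017 window.
forecast_cagr = cagr(1.4, 2.9, 2017 - 2012)
print(f"Implied annual growth: {forecast_cagr:.1%}")
```

Under that assumption the implied rate lands in the mid-teens per year, so the market would roughly double over the forecast period.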

Not surprisingly, today’s rapidly growing eDiscovery software market has become significant enough to catch the attention of mainstream analysts like Gartner. This is good news for company lawyers, who are used to delegating enterprise software decisions to IT departments and outside law firms but are now involved in eDiscovery and other information management software purchasing decisions for their organizations. While these lawyers understand the company’s legal requirements, they do not necessarily understand how to choose the best technology to address those requirements. Conversely, IT representatives understand enterprise software, but they do not necessarily understand the law. Gartner bridges this information gap by providing in-depth, independent analysis of the top eDiscovery software solutions in the form of the Gartner Magic Quadrant for e-Discovery Software.

Gartner’s methodology for preparing the annual Magic Quadrant report is rigorous. Providers must meet quantitative requirements such as revenue and significant market penetration to be included in the report. If these threshold requirements are met, Gartner probes deeper by meeting with company representatives, interviewing customers, and soliciting responses to written questions. Providers that make the cut are placed in one of four Magic Quadrant categories: “leaders, challengers, niche players, or visionaries.” Where each provider ends up on the quadrant is guided by an independent evaluation of each provider’s “ability to execute” and “completeness of vision.” Landing in the “leaders” quadrant is considered a top recognition.

The nine Leaders in this year’s Magic Quadrant have four primary characteristics (See figure 1 above).

The first is whether the provider has functionality that spans both sides of the electronic discovery reference model (EDRM) (left side – identification, preservation, litigation hold, collection, early case assessment (ECA) and processing and right-side – processing, review, analysis and production). “While Gartner recognizes that not all enterprises — or even the majority — will want to perform legal-review work in-house, more and more are dictating what review tools will be used by their outside counsel or legal-service providers. As practitioners become more sophisticated, they are demanding that data change hands as little as possible, to reduce cost and risk. This is a continuation of a trend we saw developing last year, and it has grown again in importance, as evidenced both by inquiries from Gartner clients and reports from vendors about the priorities of current and prospective customers.”

We see this as consistent with the theme that providers with archiving solutions designed to automate data retention and destruction policies generally fared better than those without archiving technology. The rationale is that part of a good end-to-end eDiscovery strategy includes proactively deleting data organizations do not have a legal or business need to keep. This approach decreases the amount of downstream electronically stored information (ESI) organizations must review on a case-by-case basis so the cost savings can be significant.

Not surprisingly, whether a provider offers technology-assisted review or predictive coding capabilities was another factor in evaluating each provider’s end-to-end functionality. The industry has witnessed a surge in predictive coding case law since 2012 and judicial interest has helped drive this momentum. However, a key driver for implementing predictive coding technology is the ability to reduce the amount of ESI attorneys need to review on a case-by-case basis. Given that attorney review is the most expensive phase of the eDiscovery process, many organizations are complementing their proactive information reduction (archiving) strategy with a case-by-case information reduction plan that also includes predictive coding.

The second characteristic Gartner considered was that Leaders’ business models clearly demonstrate that their focus is software development and sales, as opposed to the provision of services. Gartner acknowledged that the eDiscovery services market is strong, but explains that the purpose of the Magic Quadrant is to evaluate software, not services. The justification is that “[c]orporate buyers and even law firms are trending towards taking as much e-Discovery process in house as they can, for risk management and cost control reasons. In addition, the vendor landscape for services in this area is consolidating. A strong software offering, which can be exploited for growth and especially profitability, is what Gartner looked for and evaluated.”

Third, Gartner believes the solution provider market is shrinking and that corporations are becoming more involved in buying decisions instead of deferring technology decisions to their outside law firms. Therefore, those in the Leaders category were expected to illustrate a good mix of corporate and law firm buying centers. The rationale behind this category is that law firms often help influence corporate buying decisions so both are important players in the buying cycle. However, Gartner also highlighted that vendors who get the majority of their revenues from the “legal solution provider channel” or directly from “law firms” may soon face problems.

The final characteristic Gartner considered for the Leaders quadrant is related to financial performance and growth. In measuring this component, Gartner explained that a number of factors were considered. Primary among them is whether the Leaders are keeping pace with or even exceeding overall market growth. (See “Forecast:  Enterprise E-Discovery Software, Worldwide, 2012 – 2017,” Tom Eid, December, 2012).

Companies landing in Gartner’s Magic Quadrant for eDiscovery Software have reason to celebrate their position in an increasingly competitive market. To review Gartner’s full report yourself, click here. In the meantime, please feel free to share your own comments below as the industry anxiously awaits next year’s Magic Quadrant Report.

Gartner does not endorse any vendor, product or service depicted in its research publications, and does not advise technology users to select only those vendors with the highest ratings. Gartner research publications consist of the opinions of Gartner’s research organization and should not be construed as statements of fact. Gartner disclaims all warranties, expressed or implied, with respect to this research, including any warranties of merchantability or fitness for a particular purpose.

South Africa’s Motivation for Information Governance: Privacy, Fraud and the Cloud

Tuesday, March 19th, 2013

On a recent trip to South Africa, where Symantec sponsored an event with PricewaterhouseCoopers (PwC) entitled The Protection of Personal Information (POPI) Drives Information Governance, customers and partners shared important insights. One major concern the attendees had was how they will comply with the newly proposed privacy legislation set to pass any day now.

POPI is the first comprehensive body of law addressing privacy in the country. Personal data is defined as a natural person’s name, date of birth, national identification number, passport number, health or credit information and other personally identifiable information. The bill has eight principles, each of which addresses aspects of how data must be collected, stored, processed, secured, expired and how access may be granted. This bill will apply to both public and private organizations and is driving the need for archiving, classification, eDiscovery, and data loss prevention technology.

Interestingly, the main motivator for purchasing eDiscovery technology will be the need for organizations in Africa to be able to conduct internal investigations to detect fraud. South Africa’s recent POPI legislation was crafted in order to address the age of digital information and the risks associated with it, but also to instill a level of confidence from the global economy in South Africa as a safe place to do business. A recent survey by Compuscan found that South Africa and Nigeria have the highest number of reported fraud cases in Africa. In addition, fraud-related crimes cost African businesses and governments at least $10.9 billion in 2011-12. Of the 875 reported cases, 40% of fraud perpetrators were in upper management.

Archiving the email of top management is a recommended best practice to address this fraud because it ensures that there will be a record of electronic communications should an investigation or lawsuit be necessary. Similarly, leveraging in-house eDiscovery and data loss prevention (DLP) technology enables investigators within the organizations to collect and analyze these emails in conjunction with other pertinent information to detect and even prevent fraud. To date, the majority of organizations in South Africa lack this kind of capability because they have not invested in technology.

Because corruption and fraud have been impediments to doing business in South Africa in the past, businesses and the government are taking steps to address these issues. Having the ability to conduct internal investigations will be a huge advantage for organizations looking to gain control over their information and those who commit fraud. PwC Partner Kris Budnik noted at the conference, “Many times when clients call me for an emergency forensic investigation, about 50% of the time in South Africa I cannot help them.  The reason for this is that the clients are not keeping the appropriate information governance systems in place and not keeping log files. Many times when we go to collect evidence, none is there because it has truly been overwritten in the data environment due to poor information governance practices.”

Litigation does not appear to be the biggest factor for purchasing eDiscovery technologies and implementing workflows, as one might expect. The reason for this is unclear, but may be related to a less aggressive litigation profile as compared to that of the U.S. Much of the discovery in South Africa that involves electronically stored information is printed, reviewed and produced in paper format. The concern over retaining relevant metadata and reviewing and producing data in the format in which it was originally created does not seem to be top of mind for litigators.

Litigators in South Africa are not taking advantage of the rich information in metadata to supplement their cases or to challenge opposing counsel’s claims or productions. Also of concern is the inability to deduplicate and sort data once metadata is removed. This is most likely because there have not been enough cases where lack of metadata has been challenged. With time, and as cross-border litigation increases, there will be more demand for eDiscovery technology in the traditional legal context.

The increase in privacy concerns and internal fraud investigations presents a compelling reason for businesses in South Africa to invest in archiving, eDiscovery, and DLP technologies. Many organizations are moving data to the cloud to meet POPI-related objectives faster and because outsourcing their infrastructure is very attractive to organizations that don’t want to own the responsibilities of managing their information on premises. The main business drivers for cloud archiving in South Africa are email continuity, cost and compliance.

It is interesting to observe how different countries and economies respond to technology and what drives use cases. The legal frameworks in each jurisdiction around the world vary, but the great equalizer will be technology. This is because whether it is privacy, litigation or fraud driving the information governance plan, the technology is the same.

Check out this article for more information on privacy legislation in South Africa.

Available soon: please visit our eDiscovery passport page for more information on the legal system, eDiscovery, privacy and data protection in South Africa and other countries.

 

Breaking News: Over $12 million in Attorney Fees Awarded in Patent Case Involving Predictive Coding

Thursday, February 14th, 2013

A federal judge for the Southern District of California rang in the month of February by ordering plaintiffs in a patent-related case to pay a whopping $12 million in attorney fees. The award included more than $2.8 million in “computer assisted” review fees and, to add insult to injury, the judge tacked on an additional $64,316.50 in Rule 11 sanctions against plaintiffs’ local counsel. Plaintiffs filed a notice of appeal on February 13th, but regardless of the final outcome, the case is chock-full of important lessons about patent litigation, eDiscovery and the use of predictive coding technology.

The Lawsuit

In Gabriel Technologies Corp. v. Qualcomm Inc., plaintiffs filed a lawsuit seeking over $1 billion in damages. Among their eleven causes of action were claims for patent infringement and misappropriation of trade secrets. The court eventually dismissed or granted summary judgment in defendants’ favor as to all of plaintiffs’ claims, making defendants the prevailing party and prompting defendants’ subsequent request for attorneys’ fees.

In response to defendants’ motion for attorney fees, U.S. District Judge Anthony J. Battaglia relied on plaintiffs’ repeated email references to “the utter lack of a case” and their inability to identify the alleged patent inventors to support his finding that their claims were brought in “subjective bad faith” and were “objectively baseless.” Given these findings, Judge Battaglia determined that an award of attorney fees was warranted.

The Attorney Fees Award

The judge then turned to the issue of whether defendants’ fee request for $13,465,331.01 was reasonable. He began by considering how defendants itemized their fees, which were broken down as follows:

  • $10,244,053 for its outside counsel Cooley LLP (“Cooley”);
  • $391,928.91 for document review performed by Black Letter Discovery, Inc. (“Black Letter”); and
  • $2,829,349.10 for a document review algorithm generated by outside vendor H5.

The court also considered defendants’ request that plaintiffs’ local counsel be held jointly and severally liable for the entire fee award based on the premise that local counsel is required to certify that all pleadings are legally tenable and “well-grounded in fact” under Federal Rule of Civil Procedure 11.

Following a brief analysis, Judge Battaglia found the overall request “reasonable,” but reduced the fee award by $1 million. In lieu of holding local counsel jointly liable, the court chose to sanction local counsel in the amount of $64,316.50 (identical to the amount of local counsel’s fees) for failing to “undertake a reasonable investigation into the merits of the case.”
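As a quick sanity check on the figures above, the following sketch adds up the three itemized amounts and applies the court’s reduction. It assumes the reduction was exactly $1,000,000 taken off the total request, which is our reading of “reduced the fee award by $1 million” and not a figure quoted from the order; the dictionary labels are our own shorthand.

```python
from decimal import Decimal

# Itemized amounts as reported in defendants' fee request.
fee_request = {
    "Cooley (outside counsel)": Decimal("10244053.00"),
    "Black Letter (document review)": Decimal("391928.91"),
    "H5 (computer-assisted review)": Decimal("2829349.10"),
}

requested_total = sum(fee_request.values(), Decimal("0"))

# Assumption: the $1 million reduction is exact and applies to the total.
reduction = Decimal("1000000.00")
fee_award = requested_total - reduction

# Sanction against local counsel, tracked separately from the fee award.
local_counsel_sanction = Decimal("64316.50")

h5_share = fee_request["H5 (computer-assisted review)"] / requested_total

print(f"Requested: ${requested_total:,.2f}")
print(f"Awarded after reduction: ${fee_award:,.2f}")
print(f"H5 share of request: {h5_share:.1%}")
```

The itemized amounts do add up to the $13,465,331.01 request, and on these assumptions the computer-assisted review line accounts for roughly a fifth of it.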

Three Lessons Learned

The case is important on many fronts. First, the decision makes clear that filing baseless patent claims can lead to financial consequences more severe than many lawyers might expect. If the decision is reviewed and upheld on appeal, counsel in the Ninth Circuit accustomed to fending off unsubstantiated patent or misappropriation claims will be armed with an important new tool to ward off would-be patent trolls.

Second, Judge Battaglia’s decision to order Rule 11 sanctions should serve as a wake-up call for local counsel. The ruling reinforces the fact that merely rubber-stamping filings and passively monitoring cases is a risky proposition. Gabriel Technologies illustrates the importance of properly monitoring lead counsel and the consequences of not complying with the mandate of Rule 11 whether serving as lead or local counsel.

The final lesson relates to curbing the costs of eDiscovery and the importance of understanding tools like predictive coding technology. The court left the barn door wide open for plaintiffs to attack defendants’ predictive coding and other fees as “unreasonable,” but plaintiffs didn’t bite. In evaluating H5’s costs, the court determined that Cooley’s review fees were reasonable because Cooley used H5’s “computer-assisted” review services to apparently cull down 12 million documents to a more reasonable number of documents prior to manual review. Although one would expect this approach to be less expensive than paying attorneys to review all 12 million documents, $2,829,349.10 is still an extremely high price to pay for technology that is expected to help cut traditional document review costs by as much as 90 percent.

Plaintiffs were well-positioned to argue that predictive coding technology should be far less expensive because the technology allows a fraction of documents to be reviewed at a fraction of the cost compared to traditional manual review. These savings are possible because a computer is used to evaluate how human reviewers categorize a small subset of documents in order to construct and apply an algorithm that ranks the remaining documents by degree of responsiveness automatically. There are many tools on the market that vary drastically in quality and price, but a price tag approaching $3 million is extravagant and should certainly raise a few eyebrows in today’s predictive coding market. Whether or not plaintiffs missed an opportunity to challenge the reasonableness of defendants’ document review approach may never be known. Stay tuned to see if these and other arguments surface on appeal.
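To make the idea concrete, here is a toy sketch of the general approach described above: learn word weights from a small human-coded seed set, then rank the unreviewed documents by score. It is a bare-bones naive Bayes style log-odds scorer written for illustration only. It is not H5’s method or any vendor’s product, and the sample documents and function names are invented.

```python
import math
from collections import Counter


def tokenize(text):
    return text.lower().split()


def train(labeled_docs):
    """labeled_docs: (text, is_responsive) pairs coded by human reviewers."""
    counts = {True: Counter(), False: Counter()}
    for text, is_responsive in labeled_docs:
        counts[is_responsive].update(tokenize(text))
    vocab = set(counts[True]) | set(counts[False])
    totals = {label: sum(c.values()) for label, c in counts.items()}
    weights = {}
    for word in vocab:
        # Add-one smoothing keeps unseen words from producing log(0).
        p_resp = (counts[True][word] + 1) / (totals[True] + len(vocab))
        p_not = (counts[False][word] + 1) / (totals[False] + len(vocab))
        weights[word] = math.log(p_resp / p_not)
    return weights


def score(weights, text):
    # Words never seen in the seed set contribute nothing either way.
    return sum(weights.get(word, 0.0) for word in tokenize(text))


def rank(weights, docs):
    """Most likely responsive first."""
    return sorted(docs, key=lambda doc: score(weights, doc), reverse=True)


# Hypothetical seed set coded by reviewers (invented examples).
seed_set = [
    ("patent license royalty dispute", True),
    ("royalty payment schedule for patent", True),
    ("lunch menu for friday", False),
    ("office party friday parking", False),
]
unreviewed = [
    "parking pass for the office",
    "draft patent royalty terms",
    "friday lunch order",
]

weights = train(seed_set)
ranked = rank(weights, unreviewed)
for doc in ranked:
    print(f"{score(weights, doc):+.2f}  {doc}")
```

Real tools add sampling, quality-control rounds and far richer models, but the cost argument is the same: attorneys code the seed set and the top of the ranking, not all 12 million documents.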

Q&A with Allison Walton of Symantec and Laura Zubulake, Author of Zubulake’s e-Discovery: The Untold Story of my Quest for Justice

Monday, February 4th, 2013

The following is my Q&A with Laura Zubulake, Author of Zubulake’s e-Discovery: The Untold Story of my Quest for Justice.

Q: Given your case began in 2003, and the state of information governance today, do you believe that adoption has been too slow? Do you think organizations in 2013, ten years later, have come far enough in managing their information?

A: From a technology standpoint, the advancements have been significant. The IT industry has come a long way with regard to the tools available to conduct eDiscovery. Alternatively, surveys indicate a significant percentage of organizations do not prioritize information management and have not established eDiscovery policies and procedures. This is disappointing. The fact that organizations apparently do not understand the value of proactively managing information only puts them at a competitive disadvantage and at increased risk.

 Q: Gartner predicts that the market will be $2.9 billion by 2017. Given this prediction, don’t you think eDiscovery is basically going to be absorbed as a business process and not something so distinct as to require outside 3rd party help? 

A: First, as a former financial executive those predictions, if realized, are reasonably attractive. Any business that can generate double-digit revenue growth until 2017, in this economy and interest rate environment, is worthy of note (assuming costs are controlled). Second, here I would like to distinguish between information governance and eDiscovery. I view eDiscovery as a subset of a broader information governance effort. My case while renowned for eDiscovery, at its essence, was about information. I insisted on searching for electronic documents because I understood the value and purpose of information. I could not make strategic decisions without, what I refer to as, “full” information. The Zubulake opinions were a result of my desire for information, not the other way around. I believe corporations will increasingly recognize the need to proactively manage information for business, cost, legal, and risk purposes. As such, I think information governance will become more of a business process, just like any management, operational, product, and finance process.

With regard to eDiscovery, I think there will continue to be a market for outside third-party assistance. eDiscovery requires specific skills and technologies. Companies lacking financial resources and expertise, and requiring assistance to address the volume of data will likely deem it economical to outsource eDiscovery efforts. As with any industry, eDiscovery will evolve.  The sector has grown quickly. There will be consolidation. Eventually, the fittest will survive.

Q: What do you think about the proposed changes to the FRCP regarding preservation? 

A: As a former plaintiff (non-attorney), eDiscovery was (to me) about preservation. Very simply, documents could not be collected, reviewed, and produced if they had not been preserved. Any effort to clarify preservation rules would benefit all parties; uncertainty created challenges. Of course, there needs to be a balance between overwhelming corporations with legal requirements and costs and protecting a party’s rights to evidence. Apparently, the current proposals do not specifically pertain to preservation. They concern the scope of discovery and proportionality and thus indirectly address the issue of preservation. While this would be helpful, it is not ideal. Scope is, in part, a function of relevance, a frequently debated concept. What was relevant to me might not have been relevant to others. Regarding proportionality, my concern is perspective. Too often I find that discussions about proportionality stem from the defendant’s perspective. Rarely do I hear the viewpoint of the plaintiff represented. Although not all plaintiffs are individuals, often the plaintiff is the relatively under-resourced party. Deciding whether the burden of proposed discovery outweighs its likely benefits is not a science. As I wrote in my book:

Imagine if the Court were to have agreed with [the Defendant’s] argument and determined the burden of expense of the proposed discovery in my case outweighed its likely benefit. Not only would the Zubulake opinions not have come to fruition, but also I would have been denied my opportunity to prove my claims. 

Q: Lastly, what other trends are you seeing in the area of eDiscovery and what predictions do you have for the market in 2013?

A: eDiscovery Morphs. Organizations will realize that eDiscovery should be part of a broader information governance effort. Information governance will become a division within a corporation with separate accountable management from which operations, legal, IT, and HR professionals can source and utilize information to achieve goals. Financial markets will increasingly reward companies (with higher multiples) who proactively manage information.

Reorganization. Organizations will recognize that while information is their most valuable asset, it is fearless, crossing functions, divisions, and borders and not caring if it overwhelms an entity with volume, costs, and risks. Organizational structures will need to adapt and accommodate the ubiquitous nature of information. A systems thinking framework (understanding how processes influence one another within a whole) will increasingly replace a business silo structure. Information and communication, managed proactively and globally, will improve efficiency, enhance profitability, reduce costs, increase compliance, and mitigate risks.

Search. Algorithms will become an accepted search tool, although keyword, concept, cluster, and similar searches will still play a role. For years, law enforcement, government, and Wall Street have used algorithms; the concept is not new and not without peril (significant market corrections were the result of algorithms gone wrong). Parties confronted with volumes of data and limited resources will have no choice but to agree to computer assistance. However, negative perceptions and concerns about algorithms will only change when there is a case where the parties initiate and voluntarily agree to their use.

Education. Within information governance efforts, organizations will increasingly establish training for employees. Employees need to be educated about the origination, maintenance, use, disposal, risks, rules, and regulations associated with ESI. A goal should be to lessen the growth of data and encourage smart and efficient communications. Education is a cost-control and risk-mitigating effort.

BYOD Reconsidered. The assumption that a bring-your-own-device (BYOD) policy is cost-effective will be questioned, and such policies should be evaluated on a risk-adjusted basis. When companies weigh the costs (cash outlay) of providing employees with devices against the unquantifiable costs associated with lack of control, disorganization, and increased risk, it will become clear that BYOD has the potential to be very expensive.

Government Focus. I had the privilege of addressing the Dept. of Justice’s Civil E-Discovery training program. It was evident to me that eDiscovery is one of the department’s focuses. With recent headlines concerning emails uncovering evidence (e.g. Fast and Furious), government entities (state and federal) will increasingly adopt rules, procedures, and training to address ESI. This brings me back to your first question—have organizations come far enough in managing their information? Government efforts to focus on eDiscovery will incentivize more corporations to (finally) address eDiscovery and information governance challenges.

Stay tuned for more breaking news coverage with industry luminaries.

For Westerners Seeking Discovery From China, Fortune Cookie Reads: Discovery is Uncertain, and Will Likely Be Hard

Monday, January 7th, 2013

In a recent Inside Counsel article, we explored the eDiscovery climate in China and some of the most important differences between the Chinese and U.S. legal systems. There is an increased interest in China and the legal considerations surrounding doing business with Chinese organizations, which we also covered on this Inside Counsel webcast.

Five highlights from this series include:

1.  Conflicting Corporate Cultures: In general, business in China is done in a way that relies heavily on relationships. This can easily cause a conflict of interest for organizations and put them at risk for violations under the FCPA and UK Bribery Act. The concept that “relationships are gold,” or Guanxi, is crucial to conducting successful business in China. However, a fine line exists for organizations, creating a need for strong local counsel and guidance. Moreover, Chinese businesses don’t share the same definitions the Western world does for concepts like information governance, legal hold or privacy.

2.  FCPA and the UK Bribery Act: Both of these regulations are very troublesome for those doing business in China, yet necessary for regulating white-collar crime. In order to do business in China one must walk a fine line, developing close relationships without going too far and participating in bribery or other illegal acts. There are increased levels of prosecution under both of these statutes as businesses globalize.

3.  Drastically Different Legal Systems: The Chinese legal system is very different from those of common law jurisdictions. China’s legal system is based on civil law and there is no requirement for formal pre-litigation discovery. For this reason, litigants may find it very difficult to successfully procure discovery from Chinese parties. Chinese companies have been historically slow to cooperate with U.S. regulatory bodies and many discovery requests in civil litigation can take up to a year for a response. A copy of our eDiscovery passport on China can be found here, along with passports for other important countries.

4.  State Secrets: In addition to the differences between common and civil law jurisdictions, China has strict laws protecting state secrets. Anything deemed a state secret would not be discoverable, and an attempt to remove state secrets from China could result in criminal prosecution. The definition of a state secret under People’s Republic of China law includes a wide range of information and is more ambiguous than Western definitions about national security (for example, the Chinese definitions are less precisely defined than those in the USA PATRIOT Act). Politically sensitive data is susceptible to the government’s scrutiny and protection, regardless of whether it is possessed by PRC citizens or officials working for foreign corporations; there is no distinction or exception for civil discovery.

5.  Globalization: Finally, it is no secret that the world has become one huge marketplace. The rapid proliferation of information creation as well as the clashing of disparate legal systems creates real discovery challenges. However, there are also abundant opportunities for lawyers who become specialized in the Asia Pacific region today. Lawyers who are particularly adept in eDiscovery and Asia will flourish for years to come.

For more, read here…

Spotlighting the Top Electronic Discovery Cases from 2012

Friday, December 14th, 2012

With the New Year quickly approaching, it is worth reflecting on some of the key eDiscovery developments that have occurred during 2012. While legislative, regulatory and rulemaking bodies have undoubtedly impacted eDiscovery, the judiciary has once again played the most dramatic role.  There are several lessons from the top 2012 court cases that, if followed, will likely help organizations reduce the costs and risks associated with eDiscovery. These cases also spotlight the expectations that courts will likely have for organizations in 2013 and beyond.

Implementing a Defensible Deletion Strategy

Case: Brigham Young University v. Pfizer, 282 F.R.D. 566 (D. Utah 2012)

In Brigham Young, the plaintiff university had pressed for sanctions as a result of Pfizer’s destruction of key documents pursuant to its information retention policies. The court rejected that argument because such a position failed to appreciate the basic workings of a valid corporate retention schedule. As the court reasoned, “[e]vidence may simply be discarded as a result of good faith business procedures.” When those procedures operate to inadvertently destroy evidence before the duty to preserve is triggered, the court held that sanctions should not issue: “The Federal Rules protect from sanctions those who lack control over the requested materials or who have discarded them as a result of good faith business procedures.”

Summary: The Brigham Young case is significant since it emphasizes that organizations should implement a defensible deletion strategy to rid themselves of data stockpiles. Absent a preservation duty or other exceptional circumstances, organizations that pare back ESI pursuant to “good faith business procedures” (such as a neutral retention policy) will be protected from sanctions.

**Another Must-Read Case: Danny Lynn Elec. v. Veolia Es Solid Waste (M.D. Ala. Mar. 9, 2012)

Issuing a Timely and Comprehensive Litigation Hold

Case: Apple, Inc. v. Samsung Electronics Co., Ltd, — F. Supp. 2d. — (N.D. Cal. 2012)

Summary: The court first issued an adverse inference instruction against Samsung to address spoliation charges brought by Apple. In particular, the court faulted Samsung for failing to circulate a comprehensive litigation hold instruction when it first anticipated litigation. This eventually culminated in the loss of emails from several key Samsung custodians, inviting the court’s adverse inference sanction.

Ironically, however, Apple was subsequently sanctioned for failing to issue a proper hold notice. Just like Samsung, Apple failed to distribute a hold until several months after litigation was reasonably foreseeable. The tardy hold instruction, coupled with evidence suggesting that Apple employees were “encouraged to keep the size of their email accounts below certain limits,” ultimately led the court to conclude that Apple destroyed documents after its preservation duty ripened.

The Lesson for 2013: The Apple case underscores the importance of issuing a timely and comprehensive litigation hold notice. For organizations, this likely means identifying the key players and data sources that may have relevant information and then distributing an intelligible hold instruction. It may also require suspending aspects of information retention policies to preserve relevant ESI. By following these best practices, organizations can better avoid the sanctions bogeyman that haunts so many litigants in eDiscovery.

**Another Must-Read Case: Chin v. Port Authority of New York, 685 F.3d 135 (2nd Cir. 2012)

Judicial Approval of Predictive Coding

Case: Da Silva Moore v. Publicis Groupe, — F.R.D. — (S.D.N.Y. Feb. 24, 2012)

Summary: The court entered an order that turned out to be the first of its kind: approving the use of predictive coding technology in the discovery phase of litigation. That order was entered pursuant to the parties’ stipulation, which provided that defendant MSL Group could use predictive coding in connection with its obligation to produce relevant documents. Pursuant to that order, the parties methodically (yet at times acrimoniously) worked over several months to fine tune the originally developed protocol to better ensure the production of relevant documents by defendant MSL.

The Lesson for 2013: The court declared in its order that predictive coding “is an acceptable way to search for relevant ESI in appropriate cases.” Nevertheless, the court also made clear that this technology is not the exclusive method now for conducting document review. Instead, predictive coding should be viewed as one of many different types of tools that often can and should be used together.

**Another Must-Read Case: In Re: Actos (Pioglitazone) Prods. Liab. Litig. (W.D. La. July 10, 2012)

Proportionality and Cooperation are Inextricably Intertwined

Case: Pippins v. KPMG LLP, 279 F.R.D. 245 (S.D.N.Y. 2012)

Summary: The court ordered the defendant accounting firm (KPMG) to preserve thousands of employee hard drives. The firm had argued that the high cost of preserving the drives was disproportionate to the value of the ESI stored on the drives. Instead of preserving all of the drives, the firm hoped to maintain a reduced sample, asserting that the ESI on the sample drives would satisfy the evidentiary demands of the plaintiffs’ class action claims.

The court rejected the proportionality argument primarily because the firm refused to permit plaintiffs or the court to analyze the ESI found on the drives. Without any transparency into the contents of the drives, the court could not weigh the benefits of the discovery against the alleged burdens of preservation. The court was thus left to speculate about the nature of the ESI on the drives, reasoning that it went to the heart of plaintiffs’ class action claims. As the district court observed, the firm may very well have obtained the relief it requested had it engaged in “good faith negotiations” with the plaintiffs over the preservation of the drives.

The Lesson for 2013: The Pippins decision reinforces a common refrain that parties seeking the protection of proportionality principles must engage in reasonable, cooperative discovery conduct. Staking out uncooperative positions in the name of zealous advocacy stands in sharp contrast to proportionality standards and the cost cutting mandate of Rule 1. Moreover, such a tactic may very well foreclose proportionality considerations, just as it did in Pippins.

**Another Must-Read Case: Kleen Products LLC v. Packaging Corp. of America (N.D. Ill. Sept. 28, 2012)

Conclusion

There were any number of other significant cases from 2012 that could have made this list.  We invite you to share your favorites in the comments section or contact us directly with your feedback.

December Symantec SharePoint Governance Twitter Chat

Thursday, December 13th, 2012

Join hashtag #IGChat and learn about SharePoint governance and creating effective governance plans

Over the years, SharePoint has become a favorite among organizations as a place to share and manage content. As SharePoint adoption increases, storage, performance and ongoing maintenance become major challenges, and SharePoint governance becomes essential. Archiving and eDiscovery solutions play a key part in any effective and lasting governance strategy for SharePoint.

A 2012 survey conducted by Osterman Research showed that 39 percent of all SharePoint implementations still don’t have a governance plan, in part because implementing governance plans can be difficult.

During this Twitter Chat we will discuss the reasons why organizations need SharePoint governance and the role of archiving and eDiscovery in governance plans. Please join Symantec’s archiving/eDiscovery and SharePoint experts, Dave Scott (@DScottyt) and Rob Mossi (@RMossi24) next Tuesday, December 18 at 10 am PT to chat.

Dave Scott: Dave Scott is a Group Product Manager at Symantec specializing in social media and SharePoint archiving and eDiscovery. He has contributed articles to a number of leading industry publications and is a frequent contributor to Connect.symantec.com. 

Rob Mossi: Rob Mossi is a Sr. Product Marketing Manager with Symantec’s Enterprise Vault product team. With a focus on SharePoint, Rob actively participates in SharePoint archiving and information governance thought leadership activities, including research, conferences and social media. 

 Twitter Chat: SharePoint Governance #IGChat

 Date: Tuesday, December 18, 2012

 Time: 10 am PT

 Length: 1 hour

 Where: Twitter – follow the hashtag #IGChat

 Moderator: Symantec’s Dave Scott (@DScottyt)

Dueling Predictive Coding for Dummies Books Part Deux

Friday, December 7th, 2012

Long time readers of the eDiscovery 2.0 blog know we like to take advantage of every opportunity we have to discuss Charlie Sheen and eDiscovery.  While Charlie Sheen’s antics may have died down, the evolution and discussion of e-Discovery technology continues unabated. Thanks to Sharon Nelson and a recent blog post on her ride the lightning site, we’ve decided that there is no way we can pass up the opportunity to stretch the Charlie Sheen/eDiscovery analogy once again.

In the 1993 movie Hot Shots Part Deux, Charlie Sheen plays the main character in a Rambo parody that has similarities to the original Rambo movies starring Sylvester Stallone.  Not surprisingly, the parody is focused on comedic value and is a far cry from the original Rambo movies that helped make Stallone a Hollywood icon.  In recent months, those in the litigation community have watched an analogous situation play out with two competing books about predictive coding technology.

In September, the legal publication ALM (American Lawyer Media) reported that two competing Predictive Coding for Dummies books were published by Symantec and Recommind, respectively.

The ALM article, titled “Predictive Coding Vendors Duel for ‘Dummies,’” did not provide an in-depth analysis of either book, but a recent post to ride the lightning by Terry Dexter provided an analysis and review of both books that many have eagerly anticipated.  The conclusion?  The Predictive Coding for Dummies sequel is a far cry from the original.

Here is the actual text of Mr. Dexter’s analysis for your reading pleasure:

Predictive Coding For Dummies®, Symantec(TM) Special Edition by Matthew D. Nelson, Esq. Copyright © 2012 from John Wiley & Sons, Inc. 111 River St. Hoboken, NJ 07030-5774 ISBN 978-1-118-48198-1 (pbk); ISBN 978-1-118-48237-7 (ebk)

Predictive Coding For Dummies®, Recommind Special Edition author(s) not listed, Copyright © 2013 from John Wiley & Sons, Inc. 111 River St.Hoboken, NJ 07030-5774 ISBN 978-1-118-52167-0 (pbk); ISBN 978-1-118-52230-1 (ebk)

Not being known as someone who won’t accept a challenge, I read both books cover to cover (several times).  In full disclosure, I am not an attorney (or played one on TV); I am simply a techno-geek with a Bachelor of Arts in English and strong interest in the tools, techniques and methods involved with electronic discovery (eDis). This review is based upon my reading and understanding of Predictive Coding, which, in turn, is based upon a combination of 30 years in Information Science & Technology and extensive research into the wild wooly world of electronic discovery. Any and all comments are mine and not that of Sharon Nelson (the individual) or Sensei Enterprises, Inc.

Up first: Predictive Coding For Dummies®, Symantec(TM) Special Edition by Matthew D. Nelson, Esq.

My initial impression of this book was good. The format follows the standard “Dummies” format and structure while legal and technical concepts are presented in a clear, easily understood manner. Nelson’s writing flows from one paragraph to another and doesn’t introduce new terms without first explaining them. The reader is immediately informed as to the what and why of electronic discovery.  From the third paragraph onward, the reader is gradually immersed into a sometimes murky world.

This excerpt from the Introduction sets the tone:

“Predictive coding technology is a new approach to attorney document review that can be used to help legal teams significantly reduce the time and cost of eDiscovery. Despite the promise of predictive coding technology, the technology is relatively new to the legal field, and significant confusion about the proper use of these tools is pervasive. This book helps eliminate that confusion by providing a wealth of information about predictive coding technology, related terminology, and the proper use of these tools.”

Specific comments:

Beyond the excellent writing, this book contains many positives and negatives; some of which I present here.

Positives:

  1. The cost in terms of timeliness, accuracy and productivity is compared to manual review. 
  2. Nelson introduces the Electronic Discovery Reference Model (EDRM) within the first three (3) pages. The subsequent discussions regarding potential costs is emphasized by illustrating the enormity of the potential volume of Electronically Stored Information (ESI). This early introduction is also valuable when process defensibility is introduced.
  3. The concepts of sanctions, privileged information, human v machine reading/review and risk are easily distinguished. Again, the “whys” for such concepts, easily understandable to a First Year Law Student, are easily understood for the layperson.
  4. The inclusion of website addresses to provide additional information is most welcome. Indeed, references to a predictive coding cost estimate page and to a Ralph Losey article helped me gain a deeper understanding of the planning and execution of a PC effort.
  5. A separate step in Nelson’s work flow considers Privileged Information. While no one on either side of a litigation struggle want to divulge such data, it can and does happen. Predictive Coding is not presented as a palliative cure-all for such ‘ooopsies’; however the book does go far in helping the reader comprehend the necessity of conducting separate actions to reduce if not eliminate the probability of such an event occurring.
  6. The three prominent eDis cases (DaSilva-Moore, Kleen Products and Global  Aerospace) are discussed relative to First Generation PC tools and Judicial Guidance.

Negatives:

  1. Clearly, this book is written and produced to influence litigators and law firms to orient themselves towards Symantec and Clearwell. Hints are subtly placed throughout the book. While not explicitly mentioning any names, the implication is clear and gets more obvious starting at Chapter 6. A more neutral, objective content makes more sense to someone who is already familiar with the eDis process.
  2. There is no discussion on the difficulties of using Optical Character Recognition (OCR) or different character set based ESI. All data is presumed to be 100% compatible and ANSI compliant.

Recommendation:

This is an excellent book to give to clients, new litigation support personnel, paralegals, etc. involved in the beginnings of any litigation where the use of Electronic Discovery tools is likely.

Next up: Predictive Coding For Dummies®, Recommind Special Edition author(s) not listed

My initial impression was guarded. The format follows the standard “Dummies” format and structure but the content reads like someone mashed several marketing ‘White Papers’ together. This impression is further supported when comparing copyright dates with the Symantec book. Indeed, a stark comparison between these tomes is like comparing apples to oranges.

Positives:

  1. It’s short.

Negatives:

  1. Only nine (9) pages (25%) have any direct relationship with the subject matter. Twenty-eight (28) pages (~77%) are more closely related to marketing collateral. The very topic of Predictive Coding is introduced to the reader at page 11!
  2. The reader is constantly bombarded with the cost differential between manual and automated document review. Figure 1-2 in this book compares savings in 3 types of cases (IP, Second Request & Tort). Linear (Manual) Review is compared to Predictive Coding and, of course, PC wins every time. However there is no mention as to the style of the PC effort (and related costs) – were documents reviewed in house or by a services provider?
  3. There is zero mention of risk, sanctions or privileged information. In fact, a reader may develop the idea that any Predictive Coding tool takes care of any such occurrences.
  4. There is no discussion on the difficulties of using Optical Character Recognition (OCR) or different character set based ESI. All ESI is presumed to be 100% compatible and ANSI compliant.
  5. What are ‘Frankenstacks’? This book is supposed to help IT Managers who already understand the hurdles of application incompatibility.
  6. The book is very difficult to read. The workflow discussion does not follow the accompanying diagram (Figure 2-1) and even introduces the concept of ‘Predictive Analysis’ without any further discussion.
  7. The book makes blatant reference to Recommind’s product. Indeed the content of the entire document builds to the conclusion that only Recommind has the capability to successfully conduct electronic discovery.

Recommendation:

This is a very poorly written book using a style that insults the reader’s intelligence. A cursory Bing or Google search would be a better investment in time and money.”

Interestingly, only one day after Mr. Dexter’s review, another review by Jeffrey Reed was posted to ride the lightning criticizing both books. For those of us in a profession that thrives on advocacy, it probably comes as no surprise that two people could have different views of the same book. Unfortunately, inconsistent reviews might leave some to wonder which book they should read.  The good news is that both books are free so we invite you to read them both and draw your own conclusions.  As always, we also invite your feedback.

To download a copy of Symantec’s Predictive Coding for Dummies book click here.

Predictive Coding 101 & the Litigator’s Toolbelt

Wednesday, December 5th, 2012

Query your average litigation attorney about the difference between predictive coding technology and other more traditional litigation tools and you are likely to receive a wide range of responses. The fact that “predictive coding” goes by many names, including “computer-assisted review” (CAR) and “technology-assisted review” (TAR) illustrates a fundamental problem: what is predictive coding and how is it different from other tools in the litigator’s technology toolbelt™?

 Predictive coding is a type of machine-learning technology that enables a computer to “predict” how documents should be classified by relying on input (or “training”) from human reviewers. The technology is exciting for organizations attempting to manage skyrocketing eDiscovery costs because the ability to expedite the document review process and find key documents faster has the potential to save organizations thousands of hours of time. In a profession where the cost of reviewing a single gigabyte of data has been estimated to be around $18,000, narrowing days, weeks, or even months of tedious document review into more reasonable time frames means massive savings for thousands of organizations struggling to keep litigation expenditures in check.

 Unfortunately, widespread adoption of predictive coding technology has been relatively slow due to confusion about how predictive coding differs from other types of CAR or TAR tools that have been available for years. Predictive coding, unlike other tools that automatically extract patterns and identify relationships between documents with minimal human intervention, requires a deeper level of human interaction. That interaction involves significant reliance on humans to train and fine-tune the system through an iterative, hands-on process. Some common TAR tools used in eDiscovery that do not include this same level of interaction are described below:

  •  Keyword search: Involves inputting a word or words into a computer which then retrieves documents within the collection containing the same words. Also known as Boolean searching, keyword search tools typically include enhanced capabilities to identify word combinations and derivatives of root words among other things.
  •  Concept search: Involves the use of linguistic and statistical algorithms to determine whether a document is responsive to a particular search query. This technology typically analyzes variables such as the proximity and frequency of words as they appear in relationship to a keyword search. The technology can retrieve more documents than keyword searches because conceptually related documents are identified, whether or not those documents contain the original keyword search terms.
  •  Discussion threading: Utilizes algorithms to dynamically link together related documents (most commonly e-mail messages) into chronological threads that reveal entire discussions. This simplifies the process of identifying participants to a conversation and understanding the substance of the conversation.
  •  Clustering: Involves the use of algorithms to automatically organize a large collection of documents into different topical categories based on similarities between documents. Reviewing documents organized categorically can help increase the speed and efficiency of document review. 
  •  Find similar: Enables the automated retrieval of other documents related to a particular document of interest. Reviewing similar documents together accelerates the review process, provides full context for the document under review, and ensures greater coding consistency.
  •  Near-duplicate identification: Allows reviewers to easily identify, view, and code near-duplicate e-mails, attachments, and loose files. Some systems can highlight differences between near-duplicate documents to help simplify document review.
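For readers who want to see how simple the mechanics behind some of these tools can be, here is a minimal Python sketch of two of them: a keyword search that requires every term to appear, and a near-duplicate check based on word overlap (Jaccard similarity). The document IDs, the sample text, and the function names are hypothetical, and commercial tools use far more sophisticated methods than this illustration.

```python
def keyword_search(documents, terms):
    """Return IDs of documents that contain every search term (a Boolean AND)."""
    wanted = {term.lower() for term in terms}
    return [doc_id for doc_id, text in documents.items()
            if wanted <= set(text.lower().split())]


def jaccard_similarity(text_a, text_b):
    """Share of distinct words the two texts have in common (0.0 to 1.0)."""
    words_a = set(text_a.lower().split())
    words_b = set(text_b.lower().split())
    if not words_a and not words_b:
        return 1.0  # two empty documents are treated as identical
    return len(words_a & words_b) / len(words_a | words_b)


# Hypothetical three-document collection.
documents = {
    "EMAIL-1": "please review the draft supply contract today",
    "EMAIL-2": "please review the final supply contract today",
    "EMAIL-3": "team lunch is moved to friday",
}

hits = keyword_search(documents, ["supply", "contract"])
similarity = jaccard_similarity(documents["EMAIL-1"], documents["EMAIL-2"])

print(hits)        # documents containing both search terms
print(similarity)  # high overlap suggests a near-duplicate pair
```

Neither function learns anything from a reviewer's coding decisions, which is the distinction the next section draws.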

Unlike the TAR tools listed above, predictive coding technology relies on humans to review a small fraction of the overall document population, which ultimately results in a fraction of the review costs. The process entails feeding decisions about how to classify a small number of case documents called a training set into a computer system. The computer then relies on the human training decisions to generate a model that is used to predict how the remaining documents should be classified. The information generated by the model can be used to rank, analyze, and review the documents quickly and efficiently. Although documents can be coded with multiple designations that relate to various issues in the case during eDiscovery, many times predictive coding technology is simply used to segregate responsive and privileged documents from non-responsive documents in order to expedite and simplify the document review process.
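The train-then-predict process described above can be sketched in a few lines of Python. This is an illustration only: it uses a bare-bones naive Bayes word-count classifier as a stand-in, which is our assumption and not a description of how any particular predictive coding product works. The training documents, their coding decisions, and the function names are all hypothetical.

```python
import math
from collections import Counter


def train(training_set):
    """Build a model from (text, is_responsive) pairs coded by human reviewers."""
    word_counts = {True: Counter(), False: Counter()}
    doc_counts = Counter()
    for text, is_responsive in training_set:
        doc_counts[is_responsive] += 1
        word_counts[is_responsive].update(text.lower().split())
    return word_counts, doc_counts


def prediction_score(text, model):
    """Estimate the probability (0 to 1) that an unreviewed document is responsive."""
    word_counts, doc_counts = model
    vocab = set(word_counts[True]) | set(word_counts[False])
    total_resp = sum(word_counts[True].values())
    total_non = sum(word_counts[False].values())
    # Assumes the training set holds at least one document of each class.
    log_odds = math.log(doc_counts[True] / doc_counts[False])
    for word in text.lower().split():
        if word not in vocab:
            continue  # words never seen in training carry no evidence
        # Add-one smoothing keeps unseen class/word pairs from zeroing the score.
        p_resp = (word_counts[True][word] + 1) / (total_resp + len(vocab))
        p_non = (word_counts[False][word] + 1) / (total_non + len(vocab))
        log_odds += math.log(p_resp / p_non)
    return 1 / (1 + math.exp(-log_odds))


# Hypothetical training set: four documents coded by a reviewer.
training_set = [
    ("pricing agreement with competitor", True),
    ("confidential pricing discussion", True),
    ("lunch menu for friday", False),
    ("office holiday party schedule", False),
]
model = train(training_set)

likely_responsive = prediction_score("competitor pricing call", model)
likely_non_responsive = prediction_score("friday party menu", model)
print(likely_responsive, likely_non_responsive)
```

With only four training documents the scores mean very little, which is why real workflows evaluate the predictions and add more training documents when accuracy falls short.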

 Training the predictive coding system is an iterative process that requires attorneys and their legal teams to evaluate the accuracy of the computer’s document prediction scores at each stage. A prediction score is simply a percentage value assigned to each document that is used to rank all the documents by degree of responsiveness. If the accuracy of the computer-generated predictions is insufficient, additional training documents can be selected and reviewed to help improve the system’s performance. Multiple training sets are commonly reviewed and coded until the desired performance levels are achieved. Once the desired performance levels are achieved, informed decisions can be made about which documents to produce.

For example, if the legal team’s analysis of the computer’s predictions reveals that within a population of 1 million documents, only those with prediction scores of 70 percent and higher appear to be responsive, and 300,000 documents fall in that range, the team may elect to produce only those 300,000 documents to the requesting party. The financial consequences of this approach are significant because a majority of the documents can be excluded from expensive manual review by humans. The simple rule of thumb in eDiscovery is that the fewer documents requiring human review, the more money saved since document review is typically the most expensive facet of eDiscovery.
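The cutoff decision in this example can be expressed as a short Python sketch. The five document IDs and scores are hypothetical, as is the function name; only the 1 million and 300,000 figures come from the example above.

```python
def select_for_production(scores, cutoff):
    """Return IDs of documents scoring at or above the cutoff, highest score first."""
    selected = [doc_id for doc_id, score in scores.items() if score >= cutoff]
    return sorted(selected, key=lambda doc_id: scores[doc_id], reverse=True)


# Hypothetical prediction scores for a five-document sample.
scores = {
    "DOC-001": 0.92,
    "DOC-002": 0.35,
    "DOC-003": 0.71,
    "DOC-004": 0.08,
    "DOC-005": 0.70,
}
to_produce = select_for_production(scores, cutoff=0.70)
print(to_produce)

# The arithmetic from the example above: 300,000 of 1 million documents
# meet the cutoff, so the remaining share is excluded from manual review.
excluded_share = 1 - 300_000 / 1_000_000
print(f"{excluded_share:.0%} of the population excluded")
```

Where to set the cutoff is a judgment call for the legal team, not something the sketch decides.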

Hype and confusion surrounding the promise of predictive coding technology have led some to believe that the technology renders other TAR tools obsolete. To the contrary, predictive coding technology should be viewed as one of many different types of tools in the litigator’s technology toolbelt™ that often can and should be used together. Choosing which of these tools to use and how to use them depends on the case and requires balancing factors such as discovery deadlines, cost, and complexity. Many believe, however, that the choice about which tools should be used for a particular matter should be left to the producing party as long as the tools are used properly and in a manner that is “just” for both parties as mandated by Rule 1 of the Federal Rules of Civil Procedure.

The notion that parties should be able to choose which tools they use during discovery recently garnered support in the Seventh Circuit. In Kleen Products, LLC, et al. v. Packaging Corporation of America, et al., Judge Nolan was faced with exploring plaintiffs’ claim that the defendants should be required to supplement their use of keyword searching tools with more advanced tools in order to better comply with their duty to produce documents. Plaintiffs’ argument hinged largely on the assumption that using more advanced tools would result in a more thorough document production. In response to this argument, Judge Nolan referenced the Sedona Best Practices Recommendations & Principles for Addressing Electronic Document Production during a hearing between the parties to suggest that the carpenter (end user) is best equipped to select the appropriate tool during discovery. Sedona Principle 6 states that:

“[r]esponding parties are best situated to evaluate the procedures, methodologies, and technologies appropriate for preserving and producing their own electronically stored information.”

Even though the parties in Kleen Products ultimately postponed further discussion about whether tools like predictive coding technology should be used when possible during discovery, the issue remains important because it is likely to resurface again and again as predictive coding momentum continues to grow. Some will argue that parties who fail to leverage modern technology tools like predictive coding are attempting to game the legal system to avoid thorough document productions.  In some instances, that argument could be valid, but it should not be a foregone conclusion.

Although there will likely come a day when predictive coding technology is the status quo for managing large scale document review, that day has not yet arrived. Predictive coding technology is a type of machine learning technology that has been used in other disciplines for decades. However, predictive coding tools are still very new to the field of law. As a result, most predictive coding tools lack transparency because they provide little if any information about the underlying statistical methodologies they apply. The issue is important because the misapplication of statistics could have a dramatic effect on the thoroughness of document productions. Unfortunately, these nuanced issues are sometimes misunderstood or overlooked by predictive coding proponents, a problem that could ultimately result in unfairness to requesting parties and stall broader adoption of otherwise promising technology.

Further complicating matters is the fact that several solution providers have introduced new predictive coding tools in recent months to try to capture market share. In the long term, competition is good for consumers and the industry as a whole. In the short term, however, most of these tools are largely untested and vary in quality and ease of use, thereby adding more confusion for would-be consumers. The unfortunate end result is that many lawyers are shying away from using predictive coding technology until the pros and cons of various technology solutions and their providers are better understood.  Market confusion is often one of the biggest stumbling blocks to faster adoption of technology that could save organizations millions, and the current predictive coding landscape is a testament to this fact.

Eliminating much of the current confusion through education is the precise goal of Symantec’s Predictive Coding for Dummies book. The book addresses everything from predictive coding case law and defensible workflows to key factors that should be considered when evaluating different predictive coding tools. The book strives to provide attorneys and legal staff accustomed to using traditional TAR tools like keyword searching with a baseline understanding of a new technological approach that many find confusing. We believe providing the industry with this basic level of understanding will help ensure that predictive coding technology and related best practices standards will evolve in a manner that is fair to both parties, ultimately expediting rather than slowing broader adoption of this promising new technology. To learn more, download a free copy of Predictive Coding for Dummies and feel free to share your feedback and comments below.

Q&A With Predictive Coding Guru, Maura R. Grossman, Esq.

Tuesday, November 13th, 2012

Can you tell us a little about your practice and your interest in predictive coding?

After a prior career as a clinical psychologist, I joined Wachtell Lipton as a litigator in 1999, and in 2007, when I was promoted to counsel, my practice shifted exclusively to advising lawyers and clients on legal, technical, and strategic issues involving electronic discovery and information management, both domestically and abroad.

I became interested in technology-assisted review (“TAR”) in the 2007/2008 time frame, when I sought to address the fact that Wachtell Lipton had few associates to devote to document review, and contract attorney review was costly, time-consuming, and generally of poor quality.  At about the same time, I crossed paths with Jason R. Baron and got involved in the TREC Legal Track.

What are a few of the biggest predictive coding myths?

There are so many, it’s hard to limit myself to only a few!  Here are my nominations for the top ten, in no particular order:

Myth #1:  TAR is the same thing as clustering, concept search, “find similar,” or any number of other early case assessment tools.
Myth #2:  Seed or training sets must always be random.
Myth #3:  Seed or training sets must always be selected and reviewed by senior partners.
Myth #4:  Thousands of documents must be reviewed as a prerequisite to employing TAR, therefore, it is not suitable for smaller matters.
Myth #5:  TAR is more susceptible to reviewer error than the “traditional approach.”
Myth #6:  One should cull with keywords prior to employing TAR.
Myth #7:  TAR does not work for short documents, spreadsheets, foreign language documents, or OCR’d documents.
Myth #8:  TAR finds “easy” documents at the expense of “hot” documents.
Myth #9:  If one adds new custodians to the collection, one must always retrain the system.
Myth #10:  Small changes to the seed or training set can cause large changes in the outcome, for example, documents that were previously tagged as highly relevant can become non-relevant. 

The bottom line is that your readers should challenge commonly held (and promoted) assumptions that lack empirical support.

Are all predictive coding tools the same?  If not, then what should legal departments look for when selecting a predictive coding tool?

Not at all, and neither are all manual reviews.  It is important to ask service providers the right questions to understand what you are getting.  For example, some TAR tools employ supervised or active machine learning, which require the construction of a “training set” of documents to teach the classifier to distinguish between responsive and non-responsive documents.  Supervised learning methods are generally more static, while active learning methods involve more interaction with the tool and more iteration.  Knowledge engineering approaches (a.k.a. “rule-based” methods) involve the construction of linguistic and other models that replicate the way that humans think about complex problems.  Both approaches can be effective when properly employed and validated.  At this time, only active machine learning and rule-based approaches have been shown to be effective for technology-assisted review.  Service providers should be prepared to tell their clients what is “under the hood.”

What is the number one mistake practitioners should avoid when using these tools?

Not employing proper validation protocols, which are essential to a defensible process.  There is widespread misunderstanding of statistics and what they can and cannot tell us.  For example, many service providers report that their tools achieve 99% accuracy.  Accuracy is the fraction of documents that are correctly coded by a search or review effort.  While accuracy is commonly advanced as evidence of an effective search or review effort, it can be misleading because it is heavily influenced by prevalence, or the number of responsive documents in the collection.  Consider, for example, a document collection containing one million documents, of which ten thousand (or 1%) are relevant.  A search or review effort that identified 100% of the documents as non-relevant, and therefore, found none of the relevant documents, would have 99% accuracy, belying the failure of that search or review effort to identify a single relevant document.
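The arithmetic behind this example is easy to check with a short Python sketch. The accuracy calculation follows the definition given in the answer above. The recall calculation (the fraction of relevant documents that the effort actually found) is our addition for contrast and is not named in the answer. The function names are hypothetical.

```python
def accuracy(true_pos, false_pos, true_neg, false_neg):
    """Fraction of all documents that were coded correctly."""
    total = true_pos + false_pos + true_neg + false_neg
    return (true_pos + true_neg) / total


def recall(true_pos, false_neg):
    """Fraction of the relevant documents that the review actually found."""
    relevant = true_pos + false_neg
    return true_pos / relevant if relevant else 0.0


# The collection from the example: 1 million documents, 10,000 (1%) relevant.
total_docs = 1_000_000
relevant_docs = 10_000

# A review that codes every document as non-relevant:
true_pos = 0                          # relevant documents found
false_pos = 0                         # non-relevant documents coded relevant
false_neg = relevant_docs             # relevant documents missed
true_neg = total_docs - relevant_docs # non-relevant documents coded correctly

review_accuracy = accuracy(true_pos, false_pos, true_neg, false_neg)
review_recall = recall(true_pos, false_neg)
print(f"accuracy = {review_accuracy:.0%}, recall = {review_recall:.0%}")
```

The review scores 99% on accuracy while finding none of the relevant documents, which is why an accuracy figure alone says little about the quality of a search or review effort.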

What do you see as the key issues that will confront practitioners who wish to use predictive coding in the near-term?

There are several issues that will be played out in the courts and in practice over the next few years.  They include:  (1) How does one know if the proposed TAR tool will work (or did work) as advertised?; (2) Must seed or training sets be disclosed, and why?; (3) Must documents coded as non-relevant be disclosed, and why?; (4) Should TAR be held to a higher standard of validation than manual review?; and (5) What cost and effort is justified for the purposes of validation?  How does one ensure that the cost of validation does not obliterate the savings achieved by using TAR?

What have you been up to lately?

In an effort to bring order to chaos by introducing a common framework and set of definitions for use by the bar, bench, and vendor community, Gordon V. Cormack and I recently prepared a glossary on technology-assisted review that is available for free download at:  http://cormack.uwaterloo.ca/targlossary.  We hope that your readers will send us their comments on our definitions and additional terms for inclusion in the next version of the glossary.

Maura R. Grossman, counsel at Wachtell, Lipton, Rosen & Katz, is a well-known e-discovery lawyer and recognized expert in technology-assisted review.  Her work was cited in the landmark 2012 case, Da Silva Moore v. Publicis Groupe (S.D.N.Y. 2012).