
Posts Tagged ‘ECA’

The Gartner 2013 Magic Quadrant for eDiscovery Software is Out!

Wednesday, June 12th, 2013

This week marks the release of the 3rd annual Gartner Magic Quadrant for e-Discovery Software report.  In the early days of eDiscovery, most companies outsourced almost every sizeable project to vendors and law firms so eDiscovery software was barely a blip on the radar screen for technology analysts. Fast forward a few years to an era of explosive information growth and rising eDiscovery costs and the landscape has changed significantly. Today, much of the outsourced eDiscovery “services” business has been replaced by eDiscovery software solutions that organizations bring in house to reduce risk and cost. As a result, the enterprise eDiscovery software market is forecast to grow from $1.4 billion in total software revenue worldwide in 2012 to $2.9 billion by 2017. (See Forecast:  Enterprise E-Discovery Software, Worldwide, 2012 – 2017, Tom Eid, December, 2012).

Not surprisingly, today’s rapidly growing eDiscovery software market has become significant enough to catch the attention of mainstream analysts like Gartner. This is good news for company lawyers who are used to delegating enterprise software decisions to IT departments and outside law firms, because today those same lawyers are involved in eDiscovery and other information management software purchasing decisions for their organizations. While these lawyers understand the company’s legal requirements, they do not necessarily understand how to choose the best technology to address those requirements. Conversely, IT representatives understand enterprise software, but they do not necessarily understand the law. Gartner bridges this information gap by providing in-depth, independent analysis of the top eDiscovery software solutions in the form of the Gartner Magic Quadrant for e-Discovery Software.

Gartner’s methodology for preparing the annual Magic Quadrant report is rigorous. Providers must meet quantitative requirements such as revenue and significant market penetration to be included in the report. If these threshold requirements are met then Gartner probes deeper by meeting with company representatives, interviewing customers, and soliciting feedback to written questions. Providers that make the cut are evaluated across four Magic Quadrant categories as either “leaders, challengers, niche players, or visionaries.” Where each provider ends up on the quadrant is guided by an independent evaluation of each provider’s “ability to execute” and “completeness of vision.” Landing in the “leaders” quadrant is considered a top recognition.

The nine Leaders in this year’s Magic Quadrant have four primary characteristics (See figure 1 above).

The first is whether the provider has functionality that spans both sides of the electronic discovery reference model (EDRM) (left side: identification, preservation, litigation hold, collection, early case assessment (ECA) and processing; right side: processing, review, analysis and production). “While Gartner recognizes that not all enterprises — or even the majority — will want to perform legal-review work in-house, more and more are dictating what review tools will be used by their outside counsel or legal-service providers. As practitioners become more sophisticated, they are demanding that data change hands as little as possible, to reduce cost and risk. This is a continuation of a trend we saw developing last year, and it has grown again in importance, as evidenced both by inquiries from Gartner clients and reports from vendors about the priorities of current and prospective customers.”

We see this as consistent with the theme that providers with archiving solutions designed to automate data retention and destruction policies generally fared better than those without archiving technology. The rationale is that part of a good end-to-end eDiscovery strategy includes proactively deleting data organizations do not have a legal or business need to keep. This approach decreases the amount of downstream electronically stored information (ESI) organizations must review on a case-by-case basis so the cost savings can be significant.

Not surprisingly, whether a provider offers technology assisted review or predictive coding capabilities was another factor in evaluating each provider’s end-to-end functionality. The industry has witnessed a surge in predictive coding case law since 2012, and judicial interest has helped drive this momentum. However, a key driver for implementing predictive coding technology is the ability to reduce the amount of ESI attorneys need to review on a case-by-case basis. Because attorney review is the most expensive phase of the eDiscovery process, many organizations are complementing their proactive information reduction (archiving) strategy with a case-by-case information reduction plan that also includes predictive coding.

The second characteristic Gartner considered was that Leaders’ business models clearly demonstrate that their focus is software development and sales, as opposed to the provision of services. Gartner acknowledged that the eDiscovery services market is strong, but explains that the purpose of the Magic Quadrant is to evaluate software, not services. The justification is that “[c]orporate buyers and even law firms are trending towards taking as much e-Discovery process in house as they can, for risk management and cost control reasons. In addition, the vendor landscape for services in this area is consolidating. A strong software offering, which can be exploited for growth and especially profitability, is what Gartner looked for and evaluated.”

Third, Gartner believes the solution provider market is shrinking and that corporations are becoming more involved in buying decisions instead of deferring technology decisions to their outside law firms. Therefore, those in the Leaders category were expected to illustrate a good mix of corporate and law firm buying centers. The rationale behind this category is that law firms often help influence corporate buying decisions so both are important players in the buying cycle. However, Gartner also highlighted that vendors who get the majority of their revenues from the “legal solution provider channel” or directly from “law firms” may soon face problems.

The final characteristic Gartner considered for the Leaders quadrant is related to financial performance and growth. In measuring this component, Gartner explained that a number of factors were considered. Primary among them is whether the Leaders are keeping pace with or even exceeding overall market growth. (See “Forecast:  Enterprise E-Discovery Software, Worldwide, 2012 – 2017,” Tom Eid, December, 2012).

Companies landing in Gartner’s Magic Quadrant for eDiscovery Software have reason to celebrate their position in an increasingly competitive market. To review Gartner’s full report yourself, click here. In the meantime, please feel free to share your own comments below as the industry anxiously awaits next year’s Magic Quadrant Report.

Gartner does not endorse any vendor, product or service depicted in its research publications, and does not advise technology users to select only those vendors with the highest ratings. Gartner research publications consist of the opinions of Gartner’s research organization and should not be construed as statements of fact. Gartner disclaims all warranties, expressed or implied, with respect to this research, including any warranties of merchantability or fitness for a particular purpose.

Where There’s Smoke There’s Fire: Powering eDiscovery with Data Loss Prevention

Monday, November 12th, 2012

New technologies are being repurposed for Early Case Assessment (ECA) in this ever-changing global economy chock-full of intellectual property theft and cybertheft. These increasingly hot issues are compelling lawyers to become savvier about the technologies they use to identify IP theft and related issues in eDiscovery. One of the more useful, but often overlooked, tools in this regard is Data Loss Prevention (DLP) technology. Traditionally a data breach and security tool, DLP has emerged as yet another tool in the Litigator’s Tool Belt™ that can be applied in eDiscovery.

DLP technology utilizes Vector Machine Learning (VML) to detect intellectual property – such as product designs, source code and trademarked language – that is deemed proprietary and confidential. This technology eliminates the need to develop laborious keyword-based policies or to fingerprint documents. While a corporation can certainly customize these policies, off-the-shelf materials make the technology easy to deploy.
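
To make the concept concrete, here is a minimal Python sketch of how a machine-learning text classifier can flag content that resembles known proprietary material without keyword lists or document fingerprints. It is an illustration only, built on scikit-learn with invented example documents and an assumed flagging threshold; it is not the actual VML implementation inside any DLP product.

```python
# Illustrative only: a generic text classifier standing in for VML-style
# detection. Example documents, labels, and the flagging threshold are
# hypothetical; this is not any DLP product's actual implementation.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC

# Training examples: documents already known to be proprietary vs. ordinary.
proprietary_docs = [
    "aramid fiber spin rate and temperature settings for the pilot line",
    "proprietary source code module controlling polymer extrusion",
]
ordinary_docs = [
    "agenda for the quarterly all hands meeting next tuesday",
    "please submit your travel expense reports by friday",
]
texts = proprietary_docs + ordinary_docs
labels = [1] * len(proprietary_docs) + [0] * len(ordinary_docs)

# Learn what proprietary content "looks like" -- no keyword lists or
# per-document fingerprints required.
model = make_pipeline(TfidfVectorizer(stop_words="english"), LinearSVC())
model.fit(texts, labels)

# Score an outbound document; a positive margin means it resembles the
# proprietary training examples and should be flagged for review.
outbound = "attached are the fiber spin rate settings you requested"
score = model.decision_function([outbound])[0]
if score > 0:
    print(f"Flag for review (score {score:.2f})")
```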

An exemplary use case that spotlights how DLP could have been deployed in the eDiscovery context is the case of E.I. Du Pont de Nemours v. Kolon Industries. In DuPont, a jury issued a $919 million verdict after finding that the defendant manufacturer stole critical elements of the formula for Kevlar, a closely guarded and highly profitable DuPont trade secret. Despite the measures taken to protect the trade secret, a former DuPont consultant successfully copied key information relating to Kevlar onto a CD that was later disseminated to the manufacturer’s executives. All of this came to light in the recently unsealed criminal indictments the U.S. Department of Justice obtained against the manufacturer and several of its executives.

Perhaps all of this could have been avoided had a DLP tool been deployed. A properly implemented DLP solution in the DuPont case might have detected the misappropriation and prompted an internal investigation. At the very least, DLP could have mitigated the harmful effects of the trade secret theft by flagging the copying and movement of proprietary information and any other suspicious behavior involving sensitive IP.

As the DuPont case teaches, DLP can be utilized to detect IP theft and data breaches. In addition, it can act as an early case assessment (ECA) tool for lawyers in both civil and criminal actions. With data breaches, where there is smoke (breach) there is generally fire (litigation). A DLP incident report can serve as the basis for an investigation, essentially reverse engineering the ECA process from hard evidence of the underlying data breach. Thus, instead of beginning an investigation with a hunch or a tangential lead, DLP gives lawyers hard facts that ultimately serve as a roadmap for implementing an effective legal hold over the relevant custodians’ communications. Instead of discovering data breaches during the discovery process, DLP allows lawyers to start with this information, making the entire matter more efficient and targeted.

From an information governance point of view, DLP also relates to the proactive, left side of the Electronic Discovery Reference Model. The same technology can be repurposed as Data Classification Services (DCS) for automated document retention. The DCS and DLP policy consoles use similar identifiers and work in harmony, enabling the technology to make intelligent decisions that accomplish appropriate document retention as well as breach prevention and notification.

Given this backdrop, it behooves both firm lawyers and corporate counsel to consider getting up to speed on the capabilities of DLP tools. The benefits DLP offers in eDiscovery are too important to be ignored.

From A to PC – Running a Defensible Predictive Coding Workflow

Tuesday, September 11th, 2012

So far in our ongoing predictive coding blog series, we’ve touched on the “whys” and “whats” of predictive coding, and now I’d like to address the “hows” of using this new technology. Given that predictive coding is groundbreaking technology in the world of eDiscovery, it’s no surprise that a different workflow is required in order to run the review process.

The traditional linear review process utilizes a “brute force” approach of manually reading each document and processing it for responsiveness and privilege. In order to reduce the high cost of this process, many organizations now farm out documents to contract attorneys for review. Often, however, contract attorneys possess less expertise and knowledge of the issues, which means that multiple review passes along with additional checks and balances are often needed in order to ensure review accuracy. This process commonly results in a significant number of documents being reviewed multiple times, which in turn increases the cost of review. When you step away from an “eyes-on review” of every document and use predictive coding to leverage the expertise of more experienced attorneys, you will naturally aim to review as few documents as possible in order to achieve the best possible results.

How do you review the minimum number of documents with predictive coding? For starters, organizations should prepare their case by performing an early case assessment (ECA) to cull down to the review population prior to review. While some may suggest that predictive coding can be run without any ECA up front, you will actually save a significant amount of review time if you put in the effort to cull out the profoundly irrelevant documents in your case. Doing so prevents a “junk in, junk out” situation in which leaving too much junk in the case forces you to review junk documents throughout the predictive coding workflow.

Next, it is important to segregate documents that are unsuitable for predictive coding. Most predictive coding solutions operate on the extracted text content within documents. That means any documents that do not contain extracted text, such as photographs and engineering schematics, should be manually reviewed so they are not overlooked by the predictive coding engine. The same concept applies to any other documents with review limitations, such as encrypted and password-protected files. All of these documents should be reviewed separately so that no relevant documents are missed.
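
As a rough illustration of this triage step, the sketch below routes documents with no usable extracted text, or with encryption or password protection, to a manual review queue. The document fields and structure are hypothetical, not any particular review platform’s data model.

```python
# A rough illustration of the triage step; document fields are hypothetical,
# not any review platform's actual data model.
def split_for_predictive_coding(documents):
    """Route documents without usable extracted text to manual review."""
    predictable, manual_review = [], []
    for doc in documents:
        text = (doc.get("extracted_text") or "").strip()
        locked = doc.get("is_encrypted") or doc.get("is_password_protected")
        if locked or not text:
            manual_review.append(doc)   # photos, schematics, encrypted files
        else:
            predictable.append(doc)
    return predictable, manual_review

docs = [
    {"id": 1, "extracted_text": "Re: Q3 pricing discussion", "is_encrypted": False},
    {"id": 2, "extracted_text": "", "is_encrypted": False},      # engineering drawing
    {"id": 3, "extracted_text": "term sheet", "is_encrypted": True},
]
coding_population, manual_queue = split_for_predictive_coding(docs)  # ids [1] and [2, 3]
```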

After culling down to your review population, the next step in preparing to use predictive coding is to create a Control Set by drawing a randomly selected statistical sample from the document population. Once the Control Set is manually reviewed, it will serve two main purposes. First, it will allow you to estimate the population yield, otherwise referred to as the percentage of responsive documents contained within the larger population. (The size of the Control Set may need to be adjusted to ensure the yield is properly taken into account.) Second, it will serve as your baseline for a true “apples-to-apples” comparison of your prediction accuracy across iterations as you move through the predictive coding workflow. The Control Set only needs to be reviewed once, up front, to be used for measuring accuracy throughout the workflow.
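
For readers who want to see the arithmetic, here is a minimal Python sketch of sizing and drawing a random Control Set and then estimating yield from its manual review. The 95% confidence level, ±2% margin of error, population size, and review counts are illustrative assumptions, not prescribed values.

```python
# Illustrative sizing of a Control Set; the 95% confidence level, +/-2%
# margin of error, population size, and review counts are assumptions.
import math
import random

def control_set_size(margin_of_error=0.02, z=1.96, expected_yield=0.5):
    """Sample size for estimating a proportion (normal approximation)."""
    p = expected_yield  # 0.5 is the most conservative assumption
    return math.ceil((z ** 2) * p * (1 - p) / margin_of_error ** 2)

def draw_control_set(all_doc_ids, size, seed=42):
    """Simple random sample drawn from the entire review population."""
    return random.Random(seed).sample(all_doc_ids, size)

population = list(range(1, 500001))        # hypothetical 500,000 documents
n = control_set_size()                     # 2,401 documents at 95% / +/-2%
control_set = draw_control_set(population, n)

# After the Control Set is manually reviewed, estimate population yield:
responsive_in_control = 240                # hypothetical review result
estimated_yield = responsive_in_control / n   # roughly 10% responsive
```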

It is essential that the documents in the Control Set are selected randomly from the entire population. While some believe that other sampling approaches give greater peace of mind, they may actually result in unnecessary review. For example, other workflows recommend sampling from the documents that are not predicted to be relevant to see if anything was left behind. If you instead create a proper Control Set from the entire population, you can get precision and recall metrics that are representative of the entire population, which in turn includes the documents that are not predicted to be relevant.

Once the Control Set is created, you can begin training the software to evaluate documents against the review criteria in the case. Selecting the optimal set of documents to train the system (commonly referred to as the training set or seed set) is one of the most important steps in the entire predictive coding workflow because it sets the initial accuracy of the system, so it should be chosen carefully. Some suggest creating the initial training set by taking a random sample from the population (much like how the Control Set is selected) instead of proactively selecting responsive documents. The important thing to understand, however, is that the items used for training must adequately represent the responsive items. Selecting responsive documents for the training set matters because most eDiscovery cases have low yield – the prevalence of responsive documents within the overall document population is low – and the system cannot effectively learn how to identify responsive items if too few of them are included in the training set.

An effective method for selecting the initial training set is to use a targeted search to locate a small set of documents (typically between 100 and 1,000) that is expected to be roughly 50% responsive. For example, you may choose to focus only on the key custodians in the case and use a combination of tighter keyword, date range, and similar search criteria. You do not have to perform exhaustive searches, but a high quality initial training set will likely minimize the amount of additional training needed to achieve high prediction accuracy.
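
One way to picture this targeted search is the hypothetical filter below, which combines key custodians, a date range, and a few tight terms to pull a candidate seed set in that 100-1,000 document range. The custodian names, keywords, dates, and document fields are invented for illustration.

```python
# Hypothetical custodians, terms, date range, and document fields -- purely
# for illustration of combining tighter criteria to build a seed set.
from datetime import date

KEY_CUSTODIANS = {"j.smith", "a.jones"}
KEYWORDS = ("spin process", "licensing dispute", "formula transfer")
DATE_RANGE = (date(2010, 1, 1), date(2010, 6, 30))

def candidate_seed_set(documents, limit=500):
    """Pull a targeted candidate training set in the 100-1,000 doc range."""
    hits = []
    for doc in documents:
        in_custody = doc["custodian"] in KEY_CUSTODIANS
        in_window = DATE_RANGE[0] <= doc["sent_date"] <= DATE_RANGE[1]
        text = doc["extracted_text"].lower()
        has_term = any(term in text for term in KEYWORDS)
        if in_custody and in_window and has_term:
            hits.append(doc)
            if len(hits) >= limit:
                break
    return hits
```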

After the initial training set is selected, it must then be reviewed. It is extremely important that the review decisions made on any training items are as accurate as possible, since the system will be learning from these items; this typically means that the more experienced case attorneys should perform this review. Once review is finished on all of the training documents, the system can learn from the tagging decisions in order to predict the responsiveness or non-responsiveness of the remaining documents.

While you can now predict on all of the other documents in the population, it is most important to predict on the Control Set at this time. Not only is this more time-efficient than applying predictions to every document in the case, but you will need predictions on all of the Control Set documents in order to assess the accuracy of the predictions. With predictions and tagging decisions on each of the Control Set documents, you will be able to get accurate precision and recall metrics that you can extrapolate to the entire review population.
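
The precision and recall calculation itself is straightforward once you have a manual tag and a prediction for every Control Set document; the short sketch below shows one way to compute it, using hypothetical document IDs and decisions.

```python
# control_tags come from manual review of the Control Set; predictions come
# from the predictive coding engine. IDs and decisions here are hypothetical.
def precision_recall(control_tags, predictions):
    tp = sum(1 for d, tag in control_tags.items() if tag and predictions[d])
    fp = sum(1 for d, tag in control_tags.items() if not tag and predictions[d])
    fn = sum(1 for d, tag in control_tags.items() if tag and not predictions[d])
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall

control_tags = {101: True, 102: False, 103: True, 104: False, 105: True}
predictions  = {101: True, 102: True,  103: True, 104: False, 105: False}
p, r = precision_recall(control_tags, predictions)   # p = 0.67, r = 0.67
```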

At this point, the accuracy of the predictions is likely not yet optimal, and thus the iterative process begins. In order to increase the accuracy, you must select additional documents to use for training the system. Much like the initial training set, this additional training set must also be selected carefully. The best documents to use for an additional training set are those that the system is least able to predict accurately. Rather than choosing these documents manually, the software is often able to determine this set mathematically, and more effectively than human reviewers can. Once these documents are selected, you simply continue the iterative process of training, predicting and testing until your precision and recall reach an acceptable point. Following this workflow will result in a set of documents identified as responsive by the system, along with trustworthy and defensible accuracy metrics.
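
Many tools perform this selection internally, but conceptually it often resembles uncertainty sampling: picking the documents whose predicted responsiveness scores sit closest to the decision boundary. The sketch below illustrates that idea with hypothetical scores; it is not a description of any specific vendor’s algorithm.

```python
# Uncertainty-based selection of the next training batch; scores are
# hypothetical predicted probabilities of responsiveness.
def select_uncertain(doc_scores, batch_size=200):
    """Return the doc IDs the model is least certain about (closest to 0.5)."""
    ranked = sorted(doc_scores.items(), key=lambda kv: abs(kv[1] - 0.5))
    return [doc_id for doc_id, _ in ranked[:batch_size]]

scores = {201: 0.97, 202: 0.51, 203: 0.08, 204: 0.46, 205: 0.73}
next_training_batch = select_uncertain(scores, batch_size=2)   # [202, 204]
```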

You cannot simply produce all of these documents at this point, however. The documents must still go through a privileged screen in order to remove any documents that should not be produced, and also go through any other review measures that you usually take on your responsive documents. This does, however, open up the possibility of applying additional rounds of predictive coding on top of this set of responsive documents. For example, after running the privileged screen, you can train on the privileged tag and attempt to identify additional privileged documents in your responsive set that were missed.

The important thing to keep in mind is that predictive coding is meant to strengthen your current review workflows. While we have outlined one possible workflow that utilizes predictive coding, the flexibility of the technology lends itself to be utilized for a multitude of other uses, including prioritizing a linear review. Whatever application you choose, predictive coding is sure to be an effective tool in your future reviews.

The 2012 EDGE Summit (21st Century Technology for Information Governance) Debuts In Nation’s Capital

Monday, April 23rd, 2012

The EDGE Summit this week is one of the most prestigious eDiscovery events of the year, as well as arguably the largest for the government sector. This year’s topics and speakers are top notch. The opening keynote speaker will be the Director of Litigation for the National Archives and Records Administration (NARA), Mr. Jason Baron. The EDGE Summit will be Mr. Baron’s first appearance since the deadline for the 480 agencies to submit their reports to his agency, which will use them to construct the Directive required by the Presidential Mandate. Attendees will be eager to hear what steps NARA is taking to issue the Directive to the government later this year, and the potential impact it will have on how the government approaches its eDiscovery obligations. The Directive will be a significant step in attempting to bring order to the government’s Big Data challenges and to unify agencies around a common approach to an information governance plan.

Also speaking at EDGE is the renowned Judge Facciola, who will be discussing the anticipated updates the American Bar Association (ABA) is expected to make to the Model Rules of Professional Conduct. He plans to speak on the challenges that lawyers are facing in the digital age, and what that means with regard to competency as a practicing lawyer. He will also focus on government lawyers and how they can better meet their legal obligations through education, training, or knowing when and how to find the right expert. Whether the government is the investigating party in a law enforcement matter, the producing party under the Freedom of Information Act (FOIA), or the defendant in civil litigation, Judge Facciola will also discuss what he sees in his courtroom every day and where the true knowledge gaps are in the technological understanding of many lawyers today.

While the EDGE Summit offers CLE credit, it also has a unique practical aspect. There will be a FOIA-specific lab, a lab on investigations, one on civil litigation and early case assessment (ECA), and one on streamlining the eDiscovery workflow process. Those who attend the labs will get hands-on experience with technology that few educational events offer. It is rare to get in the driver’s seat of the car on the showroom floor and actually drive, which is what EDGE is providing for end users and interested attendees. When talking about the complex problems government agencies face today with Big Data, records management, information governance, eDiscovery, compliance, security, etc., it is necessary to give users a way to truly visualize how these technologies work.

Another key draw at the Summit will be the panel discussions, which will feature experienced government lawyers who have been on the front lines of litigation and have unique perspectives. The legal hold panel will cover some exciting aspects of the evolution of manual versus automated processes for legal hold. Mr. David Shonka, the Deputy General Counsel of the Federal Trade Commission, is on the panel, and he will discuss the defensibility of the process the FTC used and the experience his department had with two 30(b)(6) witnesses in Federal Trade Commission v. Lights of America, Inc. (C.D. Cal. Mar. 2011). The session will also cover how issuing a legal hold is imperative once the duty to preserve has been triggered. There is a whole new generation of lawyers managing the litigation hold process in an automated way, and it will be great to discuss both the manual and automated approaches and talk about best practices for government agencies. There will also be a session on predictive coding and discussion of the recent cases that have involved the use of technology assisted review. While we are not at the point of mainstream adoption for predictive coding, it is quite exciting to think about the government going from a paper world straight into solutions that would help them manage their unique challenges as well as save them time and money.

Finally, the EDGE Summit will conclude with closing remarks from The Hon. Michael Chertoff, former Secretary of the U.S. Department of Homeland Security from 2005 to 2009. Mr. Chertoff presently provides high-level strategic counsel to corporate and government leaders on a broad range of security issues, from risk identification and prevention to preparedness, response and recovery. All of these issues now involve data and how to search, collect, analyze, protect and store it. Security is one of the most important aspects of information governance. The government has unique challenges, including its size and many geographic locations, records management requirements, massive data volume and case load, investigations, and heightened security and defense intelligence risks. This year, in particular, will be a defining year; not only because of the Presidential Mandate, but because of the information explosion and the strains on the global economy. This is why the sector needs to come together to share best practices and hear success stories. Otherwise, it won’t be able to keep up with the data explosion that’s threatening private and public sectors alike.

eDiscovery Down Under: New Zealand and Australia Are Not as Different as They Sound, Mate!

Thursday, March 29th, 2012

Shortly after arriving in Wellington, New Zealand, I picked up the Dominion Post newspaper and read its lead article: a story involving U.S. jurisdiction being exercised over billionaire NZ resident Mr. Kim Dotcom. The article reinforced the challenges we face with the blurred legal and data governance issues presented by the globalization of the economy and the expansive reach of the internet. Originally from Germany, and having changed his surname to reflect the origin of his fortune, Mr. Dotcom has become all too familiar in NZ of late. He has just purchased two opulent homes in NZ, and has become an internationally controversial figure for internet piracy. Mr. Dotcom’s legal troubles arise out of his internet business, which enables illegal downloads of pirated material between users and allegedly powers the largest copyright infringement in global history. It is estimated that his website accounts for 4% of the world’s internet traffic, which means there could be tons of discovery in this case (or cases).

The most recent legal problems Mr. Dotcom faces are with U.S. authorities, who want to extradite him to face charges of $500 million worth of copyright infringement carried out through his Megaupload file-sharing website. From a criminal and record-keeping standpoint, Mr. Dotcom’s issues highlight the need for and use of appropriate technologies. In order to establish a case against him, it’s likely that search technologies were deployed by U.S. intelligence agencies to piece together Mr. Dotcom’s activities, banking information, emails and the data transfers on his site. In a case like this, where intelligence agencies would need to collect, search and cull email from many different geographies and data sources down to just the relevant information, technologies that link email conversation threads and give transparent insight into a data collection set would provide immense value. Additionally, the Immigration bureau in New Zealand has been required to release hundreds of documents about Mr. Dotcom’s residency application that were requested under the Official Information Act (OIA). The records that Immigration had to produce were likely pulled from their archive or records management system in NZ, and then redacted for private information before production to the public.

The same tools we use in the U.S. for investigatory, compliance and litigation purposes are needed in Australia and New Zealand to build a criminal case or to comply with the OIA. Information governance technology adoption in APAC is trending first toward government agencies, which are purchasing archiving and eDiscovery technologies more rapidly than private companies. Why is this? One reason could be that governments in APAC have a larger responsibility for healthcare, education and the protection of privacy; they are therefore more invested in compliance requirements and in staying off the front page of the news for shortcomings. APAC private enterprises that are small or mid-sized and not yet doing international business do not have the same archiving and eDiscovery needs large government agencies do, nor do they face litigation the same way their American counterparts do. Large global companies, no matter where they are based, should assume they may face litigation wherever they do business.

An interesting NZ use case at the enterprise level is that of Transpower (the quasi-governmental energy agency), where compliance with both “private and public” requirements is mandatory. Transpower is an organisation that is government-owned, yet operates for a profit. Sally Myles, an experienced records manager who recently came to Transpower to head up information governance initiatives, says,

“We have to comply with the Public Records Act of 2005; public requests for information are frequent, as we are under constant scrutiny about where we will develop our plants. We also must comply with the Privacy Act of 1993. My challenge is to get the attention of our leadership to demonstrate why we need to make these changes and show them a plan for implementation as well as cost savings.”

Myles’ comments indicate NZ is facing many of the same information challenges we are here in the US with storage, records management and searching for meaningful information within the organisation.

Australia, New Zealand and U.S. Commonalities

In Australia and NZ, litigation is not seen as a compelling business driver the way it is in the U.S. This is because many of the information governance needs of organisations are driven by regulatory, statutory and compliance requirements, and the environment is not as litigious as it is in the U.S. The Official Information Act in NZ and the Freedom of Information Act in Australia are analogous to the Freedom of Information Act (FOIA) here in the U.S. The requirements to produce public records alone justify the use of technology to manage large volumes of data and produce appropriately redacted information to the public. This is true regardless of litigation. Additionally, there are now cases like DuPont and Mr. Dotcom’s that make the risk of litigation involving the U.S. very real. The fact that implementing an information governance product suite will also leave a company prepared for litigation is a beneficial by-product for many entities, since they need technology for record keeping and privacy reasons anyway. In essence, the same capabilities are achieved at the end of the day, regardless of the impetus for implementing a solution.

The Royal Commission – The Ultimate eDiscovery Vehicle

One way to think about an Australian Royal Commission (RC) is as a version of a U.S. government investigation. A key difference, however, is that a U.S. government investigation is typically into private companies, whereas a Royal Commission is typically an investigation into a government body after a major tragedy, initiated by the Head of State. An RC is an ad-hoc, formal, public inquiry into a defined issue with considerable discovery powers. These powers can be greater than those of a judge, but they are restricted to the scope and terms of reference of the Commission. RCs are called to look into matters of great importance and usually have very large budgets. The RC is charged with researching the issue, consulting experts both within and outside of government, and developing findings to recommend changes to the law or other courses of action. RCs have immense investigatory powers, including summoning witnesses under oath, offering indemnities, seizing documents and other evidence (sometimes including those normally protected, such as classified information), holding hearings in camera if necessary and, in a few cases, compelling government officials to aid in the execution of the Commission.

These expansive powers give the RC the opportunity to employ state-of-the-art technology and to skip the slow bureaucratic decision-making processes found within the government when it comes to implementing technological change. For this reason, eDiscovery will initially continue to grow in the government sector at a more rapid pace than in the private sector in the Asia Pacific region. This is because litigation is less prevalent in the Asia Pacific, and because the RC is a unique investigatory vehicle with the most far-reaching authority for discovering information. Moreover, the timeframes for RCs are tight and their scopes are broad, making them hair-on-fire situations that move quickly.

While the APAC information management environment does not have the exact same drivers the U.S. market does, it definitely has the same archiving, eDiscovery and technology needs for different reasons. Another key point is that the APAC archiving and eDiscovery market will likely be driven by the government as records, search and production requirements are the main compliance needs in Australia and NZ. APAC organisations would be well served by beginning to modularly implement key elements of an information governance plan, as globalization is driving us all to a more common and automated approach to data management. 

Big Data Decisions Ahead: Government-Sponsored Town Hall Meeting for eDiscovery Industry Coincides With Federal Agency Deadline

Wednesday, February 29th, 2012

Update For Report Submission By Agencies

We are fast approaching the March 27, 2012 deadline for federal agencies to submit their reports to the Office of Management and Budget and the National Archives and Records Administration (NARA) to comply with the Presidential Mandate on records management. We are only at the inception, as we look to a very exciting public town hall meeting in Washington, D.C. – also scheduled for March 27, 2012. This meeting is primarily focused on gathering input from the public sector community, the vendor/IT community, and members of the public at large. Ultimately, NARA will issue a directive that will outline a centralized approach for the federal government for managing records and eDiscovery.

Agencies have been tight lipped about how far along they are in the process of evaluating their workflows and tools for managing their information (both electronic and paper). There is, however, some empirical data from an InformationWeek Survey conducted last year that takes the temperature on where the top IT professionals within the government have their sights set, and the Presidential Mandate should bring some of these concerns to the forefront of the reports. For example, the #1 business driver for migrating to the cloud – cited by 62% of respondents – was cost, while 77% of respondents said their biggest concern was security. Nonetheless, 46% were still highly likely to migrate to a private cloud.

Additionally, as part of the Federal Data Center Consolidation Initiative, agencies are looking to eliminate 800 data centers. While the cost savings are clear, from an information governance viewpoint it’s hard not to ask what the government plans to do with all of those records. Clearly, this shift, should it happen, will force the government into a more service-based management approach, as opposed to the traditional asset-based management approach. Some agencies have already migrated to the cloud. This is squarely in line with the Opex-over-Capex approach emerging for efficiency and cost savings.

Political Climate Unknown

Another major concern that will affect any decisions or policy implementation within the government is, not surprisingly, politics. Luckily, regardless of political party affiliation, it seems to be broadly agreed that the combination of IT spend in Washington, D.C. and the government’s slow move to properly manage electronic records is a problem. Two of the many examples of the problem are manifested in the inability to issue effective litigation holds or respond to Freedom of Information Act (FOIA) requests in a timely and complete manner. Even still, the political agenda of the Republican party may affect the prioritization of the Democratic President’s mandate and efforts could be derailed with a potential change in administration.

Given the election year and the heavy analysis required to produce the report, there is a sentiment in Washington that all of this work may be for naught if the appropriate resources cannot be secured then allocated to effectuate the recommendations. The reality is that data is growing at an unprecedented rate, and the need for the intelligent management of information is no longer deniable. The long term effects of putting this overhaul on the back burner could be disastrous. The government needs a modular plan and a solid budget to address the problem now, as they are already behind.

VanRoekel’s Information Governance

One issue on which Democrats and Republicans will likely not agree in accomplishing the mandate is the almighty budget, and the technology the government must purchase in order to accomplish the necessary technological changes is going to cost a pretty penny. Steven VanRoekel, the Federal CIO, stated upon the release of the FY 2013 $78.8 billion IT budget:

“We are also making cyber security a cross-agency, cross-government priority goal this year. We have done a good job in ramping up on cyber capabilities agency-by-agency, and as we come together around this goal, we will hold the whole of government accountable for cyber capabilities and examine threats in a holistic way.”

His quote indicates a top-down priority of evaluating IT holistically, which dovetails nicely with the presidential mandate, since security and records management are only two parts of the entire information governance picture. Each agency still has its own work cut out for it across the EDRM. One of the most pressing issues in the upcoming reports will be what each agency decides to bring in-house or to continue outsourcing. This decision will in part depend on whether the inefficiencies identified lead agencies to conclude that they can perform those functions for less money and more efficiently than their contractors. In evaluating their present capabilities, each agency will need to look at what workflows and technologies they currently have deployed across divisions, what they presently outsource, and what the marketplace offers them today to address their challenges.

The reason this question is central is that it raises an all-important question about information governance itself. Information governance inherently implies that an organization or government controls most or all aspects of the EDRM model in order to derive the benefits of security, storage, records management and eDiscovery capabilities. Presently, the government outsources much of its litigation services to third-party companies that have essentially become de facto government agencies. This is partly due to scalability issues, and partly because the resources and technologies deployed in-house within these agencies are inadequate to properly execute a robust information governance plan.

Conclusion

The ideal scenario for each government agency to comply with the mandate would be to deploy automated classification for records management, archiving with expiration appropriately implemented for more than just email, and finally, some level of eDiscovery capability in order to conduct early case assessment and easily produce data for FOIA. The level of early case assessment needed by each agency will vary, but the general idea is that the scope of an investigation or matter could be determined in-house before contacting a third party to conduct data collection. All things considered, the question remains whether the Obama administration will foot this bill or whether we will have to wait for a bigger price tag later down the road. Either way, the government will have to come up to speed and make these changes eventually, and the town hall meeting should be an accurate thermometer of where the government stands.

Judge Peck Issues Order Addressing “Joint Predictive Coding Protocol” in Da Silva Moore eDiscovery Case

Thursday, February 23rd, 2012

Litigation attorneys were abuzz last week when a few breaking news stories erroneously reported that The Honorable Andrew J. Peck, United States Magistrate Judge for the Southern District of New York, ordered the parties in a gender discrimination case to use predictive coding technology during discovery. Despite early reports, the parties in the case (Da Silva Moore v. Publicis Group, et al.) actually agreed to use predictive coding technology during discovery – apparently of their own accord. The case is still significant because predictive coding technology in eDiscovery is relatively new to the legal field, and many have been reluctant to embrace a new technological approach to document review due to, among other things, a lack of judicial guidance.

Unfortunately, despite this atmosphere of cooperation, the discussion stalled when the parties realized they were miles apart in terms of defining a mutually agreeable predictive coding protocol.  A February status conference transcript reveals significant confusion and complexity related to issues such as random sampling, quality control testing, and the overall process integrity.  In response, Judge Peck ordered the parties to submit a Joint Protocol for eDiscovery to address eDiscovery generally and the use of predictive coding technology specifically.

The parties submitted their proposed protocol on February 22, 2012 and Judge Peck quickly reduced that submission to a stipulation and order.  The stipulation and order certainly provides more clarity and insight into the process than the status conference transcript.  However, reading the stipulation and order leaves little doubt that the devil is in the details – and there are a lot of details.  Equally clear is the fact that the parties are still in disagreement and the plaintiffs do not support the “joint” protocol laid out in the stipulation and order.  Plaintiffs actually go so far as to incorporate a paragraph into the stipulation and order stating that they “object to this ESI Protocol in its entirety” and they “reserve the right to object to its use in the case.”

These problems underscore some of the points made in a Forbes article published earlier this week titled, “Federal Judges Consider Important Issues That Could Shape the Future of Predictive Coding Technology.” The Forbes article relies in part on a recent predictive coding survey to make the point that, while predictive coding technology has tremendous potential, the solutions need to become more transparent and the workflows must be simplified before they go mainstream.

Information Governance Gets Presidential Attention: Banking Bailout Cost $4.76 Trillion, Technology Revamp Approaches $240 Billion

Tuesday, January 10th, 2012

On November 28, 2011, The White House issued a Presidential Memorandum outlining what is expected of the 480 federal agencies of the government’s three branches over the next 240 days. Up until now, Washington, D.C. has been the Wild West with regard to information governance, as each agency has often unilaterally adopted its own arbitrary policies and systems. Moreover, some agencies have recently purchased differing technologies. Unfortunately, given the President’s ultimate goal of uniformity, this centralization will be difficult to accomplish with such a range of disparate technological approaches.

Particular pain points for the government traditionally include the retention, search, collection, review and production of vast amounts of data and records. Specific examples include FOIA requests gone awry, legal holds issued inconsistently across different agencies leading to spoliation, and the ever-present problem of decentralization.

Why is the government different?

Old Practices. First, in some instances the government is technologically behind its corporate counterparts and is failing to meet the judiciary’s expectation that organizations effectively store, manage and discover their information. This failing is self-evident in the directive coming from the President mandating that these agencies develop a plan to attack this problem. Though different from other corporate entities, the government is nevertheless held to the same standards of eDiscovery under the Federal Rules of Civil Procedure (FRCP). In practice, the government has been given more leniency until recently, and while equal expectations have not always been the case, the gap between the private and public sectors is no longer possible to ignore.

FOIA.  The government’s arduous obligation to produce information under the Freedom of Information Act (FOIA) has no corresponding analog for private organizations, which respond to more traditional civil discovery requests.  Because the government is so large, with many disparate IT systems, it is cumbersome to work efficiently through the information governance process across agencies, and many times still difficult inside one individual agency with multiple divisions.  Executing this production process is even more difficult, if not impossible, to do manually without properly deployed technology.  Additionally, many of the investigatory agencies that issue requests to the private sector need more efficient ways to manage and review the data they are requesting.  To compound problems, within the U.S. government two opposing interests are at play, both screaming for a resolution, and that resolution needs to be centralized.  On the one hand, the government needs to retain more than a corporation may need to in order to satisfy a FOIA request.

Titan Pulled at Both Ends. On the other hand, without classification of the records that are to be kept, technology to organize this vast amount of data, and some amount of expiry, every agency will essentially become its own massive repository.  The “retain everything” mentality, coupled with inefficient search and retrieval of data and records, is where the government stands today.  Corporations are experiencing this on a smaller scale, and many are collectively further along than the government in this process, without the FOIA complications.

What are agencies doing to address these mandates?

In their plans, agencies must describe how they will improve or maintain their records management programs, particularly with regard to email, social media and other electronic communications.  They must also move away from such a paper-centric existence.  eDiscovery consultants and software companies are helping agencies through this process, essentially writing their plans to match the President’s directive.  The cloud conversation has been revisited, and agencies also have to explain how they will use cloud-based services and storage solutions, as well as identify gaps in existing laws or regulations that presently prevent improved management.  Small innovations are taking place.  In fact, just recently the DOJ added a new search feature on their website to make it easier for the public to find documents that have been posted by agencies on their websites.

The Office of Management and Budget (OMB), National Archives and Records Administration (NARA), and Justice Department will use those reports to come up with a government-wide records management framework that is more efficient, maintains accountability by documenting agency actions and promotes “appropriate” public access to records.  Hopefully, the framework they come up with will be centralized and workable on a realistic timeframe with resources sufficiently allocated to the initiative.

How much will this cost?

The President’s mandate is a great initiative and very necessary, but one cannot help but think about the costs in terms of money, time and resources when considering these crucial changes.  The most recent version of a financial services and general government appropriations bill in the Senate extends $378.8 million to NARA for this initiative.  President Obama appointed Steven VanRoekel as the United States CIO in August 2011 to succeed Vivek Kundra.  After VanRoekel’s speech at the Churchill Club in October of 2011, an audience member asked him what the most surprising aspect of his new job was.  VanRoekel said that it was managing the huge and sometimes unwieldy resources of his $80 billion budget.  It is going to take even more than this to do the job right, however.

Using conservative estimates, assume the initial investment for an agency to implement archiving and eDiscovery capabilities would be $100 million.  That approximates $480 billion for all 480 agencies.  Assume a uniform information governance platform gets adopted by all agencies at a 50% discount due to the large contracts, also factoring in smaller sums for agencies with lesser needs.  The total now comes to $240 billion.  For context, that figure is 5% of what the Federal Government spent ($4.76 trillion) on the biggest bailout in history in 2008. That leaves a need for $160 billion more to get the job done. VanRoekel also commented at the same meeting that he wants to break down massive multi-year information technology projects into smaller, more modular projects in the hopes of saving the government from getting mired in multi-million dollar failures.   His solution to this, he says, is modular and incremental deployment.

While Rome was not built in a day, this initiative is long overdue, yet feasible, as technology exists to address these challenges rather quickly.  After these 240 days are complete and a plan is drawn, the real question is: how are we going to pay now for technology the government needed yesterday?  In a perfect world, the government would select a platform for archiving and eDiscovery, break the project into incremental milestones, and roll out a uniform combination of solutions that are best of breed in their expertise.

Lessons Learned for 2012: Spotlighting the Top eDiscovery Cases from 2011

Tuesday, January 3rd, 2012

The New Year has now dawned and with it, the certainty that 2012 will bring new developments to the world of eDiscovery.  Last month, we spotlighted some eDiscovery trends for 2012 that we feel certain will occur in the near term.  To understand how these trends will play out, it is instructive to review some of the top eDiscovery cases from 2011.  These decisions provide a roadmap of best practices that the courts promulgated last year.  They also spotlight the expectations that courts will likely have for organizations in 2012 and beyond.

Issuing a Timely and Comprehensive Litigation Hold

Case: E.I. du Pont de Nemours v. Kolon Industries (E.D. Va. July 21, 2011)

Summary: The court issued a stiff rebuke against defendant Kolon Industries for failing to issue a timely and proper litigation hold.  That rebuke came in the form of an instruction to the jury that Kolon executives and employees destroyed key evidence after the company’s preservation duty was triggered.  The jury responded by returning a stunning $919 million verdict for DuPont.

The spoliation at issue occurred when several Kolon executives and employees deleted thousands of emails and other records relevant to DuPont’s trade secret claims.  The court laid the blame for this destruction on the company’s attorneys and executives, reasoning they could have prevented the spoliation through an effective litigation hold process.  At issue were three hold notices circulated to the key players and data sources.  The notices were all deficient in some manner: they were either too limited in their distribution, ineffective because they were prepared in English for Korean-speaking employees, or too late to prevent or otherwise ameliorate the spoliation.

The Lessons for 2012: The DuPont case underscores the importance of issuing a timely and comprehensive litigation hold notice.  As DuPont teaches, organizations should identify what key players and data sources may have relevant information.  A comprehensive notice should then be prepared to communicate the precise hold instructions in an intelligible fashion.  Finally, the hold should be circulated immediately to prevent data loss.

Organizations should also consider deploying the latest technologies to help effectuate this process.  This includes an eDiscovery platform that enables automated legal hold acknowledgements.  Such technology will allow custodians to be promptly and properly apprised of litigation and thereby retain information that might otherwise have been discarded.

Another Must-Read Case: Haraburda v. Arcelor Mittal U.S.A., Inc. (D. Ind. June 28, 2011)

Suspending Document Retention Policies

Case: Viramontes v. U.S. Bancorp (N.D. Ill. Jan. 27, 2011)

Summary: The defendant bank defeated a sanctions motion because it modified aspects of its email retention policy once it was aware litigation was reasonably foreseeable.  The bank implemented a retention policy that kept emails for 90 days, after which the emails were overwritten and destroyed.  The bank also promulgated a course of action whereby the retention policy would be promptly suspended on the occurrence of litigation or another triggering event.  This way, the bank could establish the reasonableness of its policy in litigation.  Because the bank followed that procedure in good faith, it was protected from court sanctions under the Federal Rule of Civil Procedure 37(e) “safe harbor.”

The Lesson for 2012: As Viramontes shows, an organization can be prepared for eDiscovery disputes by timely suspending aspects of its document retention policies.  By modifying retention policies when so required, an organization can develop a defensible retention procedure and be protected from court sanctions under Rule 37(e).

Coupling those procedures with archiving software will only enhance an organization’s eDiscovery preparations.  Effective archiving software will have a litigation hold mechanism, which enables an organization to suspend automated retention rules.  This will better ensure that data subject to a preservation duty is actually retained.
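
Conceptually, the hold mechanism simply overrides automated expiry for custodians or matters under hold. The minimal Python sketch below illustrates that logic with a hypothetical 90-day email retention rule; the data structures are invented and this is not any particular archiving product’s API.

```python
# Hypothetical 90-day email retention rule that defers to legal holds;
# the data structures are invented, not any archiving product's API.
from datetime import datetime, timedelta

RETENTION_DAYS = 90

def expire_messages(messages, held_custodians, now=None):
    """Delete expired email unless its custodian is on legal hold."""
    now = now or datetime.utcnow()
    cutoff = now - timedelta(days=RETENTION_DAYS)
    kept, deleted = [], []
    for msg in messages:
        on_hold = msg["custodian"] in held_custodians
        if msg["received"] < cutoff and not on_hold:
            deleted.append(msg)      # eligible for destruction under policy
        else:
            kept.append(msg)         # retained: still recent or under hold
    return kept, deleted
```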

Another Must-Read Case: Micron Technology, Inc. v. Rambus Inc., 645 F.3d 1311 (Fed. Cir. 2011)

Managing the Document Collection Process

Case: Northington v. H & M International (N.D.Ill. Jan. 12, 2011)

Summary: The court issued an adverse inference jury instruction against a company that destroyed relevant emails and other data.  The spoliation occurred in large part because legal and IT were not involved in the collection process.  For example, counsel was not actively engaged in the critical steps of preservation, identification or collection of electronically stored information (ESI).  Nor was IT brought into the picture until 15 months after the preservation duty was triggered. By that time, rank and file employees – some of whom were accused by the plaintiff of harassment – stepped into this vacuum and conducted the collection process without meaningful oversight.  Predictably, key documents were never found and the court had little choice but to promise to inform the jury that the company destroyed evidence.

The Lesson for 2012: An organization does not have to suffer the same fate as the company in the Northington case.  It can take charge of its data during litigation through cooperative governance between legal and IT.  After issuing a timely and effective litigation hold, legal should typically involve IT in the collection process.  Legal should rely on IT to help identify all data sources – servers, systems and custodians – that likely contain relevant information.  IT will also be instrumental in preserving and collecting that data for subsequent review and analysis by legal.  By working together in a top-down fashion, organizations can better ensure that their eDiscovery process is defensible and not fatally flawed.

Another Must-Read Case: Green v. Blitz U.S.A., Inc. (E.D. Tex. Mar. 1, 2011)

Using Proportionality to Dictate the Scope of Permissible Discovery

Case: DCG Systems v. Checkpoint Technologies (N.D. Cal. Nov. 2, 2011)

Summary: The court adopted the new Model Order on E-Discovery in Patent Cases recently promulgated by the U.S. Court of Appeals for the Federal Circuit.  The model order incorporates principles of proportionality to reduce the production of email in patent litigation.  In adopting the order, the court explained that email productions should be scaled back since email is infrequently introduced as evidence at trial.  As a result, email production requests will be restricted to five search terms and may only span a defined set of five custodians.  Furthermore, email discovery in DCG Systems will wait until after the parties complete discovery on the “core documentation” concerning the patent, the accused product and prior art.

The Lesson for 2012: Courts seem to be moving slowly toward a system that treats proportionality as the touchstone for eDiscovery.  This is occurring beyond the field of patent litigation, as evidenced by other recent cases.  Even the State of Utah has gotten in on the act, revising its version of Rule 26 to require that all discovery meet the standards of proportionality.  While there are undoubtedly deviations from this trend (e.g., Pippins v. KPMG (S.D.N.Y. Oct. 7, 2011)), the clear lesson is that discovery should comply with the cost-cutting mandate of Federal Rule of Civil Procedure 1.

Another Must-Read Case: Omni Laboratories Inc. v. Eden Energy Ltd [2011] EWHC 2169 (TCC) (29 July 2011)

Leveraging eDiscovery Technologies for Search and Review

Case: Oracle America v. Google (N.D. Cal. Oct. 20, 2011)

Summary: The court ordered Google to produce an email that it previously withheld on attorney-client privilege grounds.  While the email’s focus on business negotiations vitiated Google’s claim of privilege, that claim was also undermined by Google’s production of eight earlier drafts of the email.  The drafts were produced because they did not contain addressees or the heading “attorney client privilege,” which the sender later inserted into the final email draft.  Because those details were absent from the earlier drafts, Google’s “electronic scanning mechanisms did not catch those drafts before production.”

The Lesson for 2012: Organizations need to leverage next generation, robust technology to support the document production process in discovery.  Tools such as email analytics software, which can isolate drafts and flag them for removal from production, are needed to address complex production issues.  Other technological capabilities, such as Near Duplicate Identification, can also help identify draft materials and marry them up with finals that have been marked as privileged.  Last but not least, technology assisted review has the potential to enable one lawyer to efficiently complete work that previously took thousands of hours.  Finding the budget and doing the research to obtain the right tools for the enterprise should be a priority for organizations in 2012.
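One common near duplicate technique compares overlapping word “shingles” between documents and scores the overlap with Jaccard similarity; pairs that score above a threshold can be routed to counsel rather than produced.  The sketch below is a simplified, hypothetical illustration of that general technique, not how any particular review tool implements it:

```python
def shingles(text, k=5):
    # Break text into overlapping k-word shingles (a common near-duplicate technique).
    words = text.lower().split()
    return {" ".join(words[i:i + k]) for i in range(max(1, len(words) - k + 1))}

def jaccard(a, b):
    # Similarity between two shingle sets: |intersection| / |union|.
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)

def near_duplicates(documents, threshold=0.8):
    """Yield pairs of documents whose shingle overlap exceeds the threshold.

    In a review workflow, a draft scoring high against a document already
    marked privileged would be flagged for counsel instead of produced.
    """
    sets = {doc_id: shingles(text) for doc_id, text in documents.items()}
    ids = list(sets)
    for i in range(len(ids)):
        for j in range(i + 1, len(ids)):
            score = jaccard(sets[ids[i]], sets[ids[j]])
            if score >= threshold:
                yield ids[i], ids[j], score
```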

Another Must-Read Case: J-M Manufacturing v. McDermott, Will & Emery (CA Super. Jun. 2, 2011)

Conclusion

There were any number of other significant cases from 2011 that could have made this list.  We invite you to share your favorites in the comments section or contact us directly with your feedback.


Top Ten eDiscovery Predictions for 2012

Thursday, December 8th, 2011

As 2011 comes quickly to a close we’ve attempted, as in years past, to do our best Carnac impersonation and divine the future of eDiscovery.  Some of these predictions may happen more quickly than others, but it’s our sense that all will come to pass in the near future – it’s just a matter of timing.

  1. Technology Assisted Review (TAR) Gains Speed.  The area of Technology Assisted Review is very exciting since a host of emerging technologies can help make the review process more efficient, ranging from email threading and concept search to clustering and predictive coding.  There are two fundamental challenges, however.  First, the technology doesn’t work in a vacuum: workflows need to be properly designed and users need to make accurate decisions, because those judgment calls are often magnified by the application.  Second, the defensibility of the given approach needs to be well vetted.  While it’s likely not necessary (or practical) to expect a judge to mandate the use of a specific technological approach, it is important for the applied technologies to be reasonable, transparent and auditable, since the worst possible outcome would be to have a technology challenged and then find the producing party unable to adequately explain its methodology.
  2. The Custodian-Based Collection Model Comes Under Stress. Ever since the days of Zubulake, litigants have focused on “key players” as a proxy for finding relevant information during the eDiscovery process.  Early on, this model worked particularly well in an email-centric environment.  But, as discovery from cloud sources, collaborative worksites (like SharePoint) and other unstructured data repositories continues to become increasingly mainstream, the custodian-oriented collection model will become rapidly outmoded because it will fail to take into account topically-oriented searches.  This trend will be further amplified by the bench’s increasing distrust of manual, custodian-based data collection practices and the presence of better automated search methods, which are particularly valuable for certain types of litigation (e.g., patent disputes, product liability cases).
  3. The FRCP Amendment Debate Will Rage On – Unfortunately Without Much Near Term Progress. While it is clear that the eDiscovery preservation duty has become a more complex and risk-laden process, it’s not clear that this “pain” is causally related to the FRCP.  The notes from the Dallas mini-conference cited a pending Sedona survey indicating that preservation challenges are increasing dramatically.  Yet there isn’t a consensus viewpoint on which changes, if any, would help improve the murky problem.  In the near term this means that organizations with significant preservation pains will need to better utilize the rules already on the books and deploy enabling technologies where possible.
  4. Data Hoarding Increasingly Goes Out of Fashion. The war cry of many IT professionals that “storage is cheap” is starting to fall on deaf ears.  Organizations are realizing that the cost of storing information is just the tip of the iceberg when it comes to the litigation risk of having terabytes (and conceivably petabytes) of unstructured, uncategorized and unmanaged electronically stored information (ESI).  This tsunami of information will increasingly become an information liability for organizations that have never deleted a byte of information.  In 2012, more corporations will see the need to clean out their digital houses and will realize that such cleansing (where permitted) is a best practice moving forward.  This applies with equal force to the US government, which has recently mandated such an effort at President Obama’s behest.
  5. Information Governance Becomes a Viable Reality.  For several years there’s been an effort to combine the reactive (far right) side of the EDRM with the logically connected proactive (far left) side of the EDRM.  But now, a number of surveys have linked good information governance hygiene with better response times to eDiscovery requests and governmental inquiries, as well as a correspondingly lower chance of being sanctioned and the ability to turn over less responsive information.  In 2012, enterprises will realize that the litigation use case is just one way to leverage archival and eDiscovery tools, further accelerating adoption.
  6. Backup Tapes Will Be Increasingly Seen as a Liability.  Using backup tapes for disaster recovery/business continuity purposes remains a viable business strategy, although backing up to tape will become less prevalent as cloud backup increases.  However, if tapes are kept around longer than necessary (days versus months) then they become a ticking time bomb when a litigation or inquiry event crops up.
  7. International eDiscovery/eDisclosure Processes Will Continue to Mature. It’s easy to think of the US as dominating the eDiscovery landscape. While this is gospel for us here in the States, international markets are developing quickly and in many ways are ahead of the US, particularly with regulatory compliance-driven use cases like the UK Bribery Act 2010.  This fact, coupled with the menagerie of international privacy laws, means we’ll be less Balkanized in our eDiscovery efforts moving forward, since we really do need to be thinking and practicing globally.
  8. Email Becomes “So 2009” As Social Media Gains Traction. While email has been the eDiscovery darling for the past decade, it’s getting a little long in the tooth.  In the next year, new types of ESI (social media, structured data, loose files, cloud content, mobile device messages, etc.) will cause headaches for a number of enterprises that have been overly email-centric.  Already in 2011, organizations are finding that other sources of ESI like documents/files and structured data are rivaling email in importance for eDiscovery requests, and this trend shows no signs of abating, particularly for regulated industries. This heterogeneous mix of ESI will certainly result in challenges for many companies, with some unlucky ones getting sanctioned because they ignored these emerging data types.
  9. Cost Shifting Will Become More Prevalent – Impacting the “American Rule.” For ages, the American Rule held that producing parties had to pay for their production costs, with a few narrow exceptions.  Next year we’ll see even more courts award winning parties their eDiscovery costs under 28 U.S.C. §1920(4) and Rule 54(d)(1) FRCP. Courts are now beginning to consider the services of an eDiscovery vendor as “the 21st Century equivalent of making copies.”
  10. Risk Assessment Becomes a Critical Component of eDiscovery. Managing risk is a foundational underpinning for litigators generally, but its role in eDiscovery has been somewhat obscure.  Now, with the statistical insights made possible by enabling software technologies, it will become increasingly important for counsel to manage risk by deciding what error and precision rates are acceptable.  This risk analysis is particularly critical for any variety of technology assisted review, since precision, recall and F-measure statistics all require a delicate balance of risk and reward (a short illustration of these metrics follows this list).
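As a concrete illustration of the metrics referenced in the final prediction, the sketch below computes precision, recall and F1 from a hypothetical validation sample of review decisions; the counts are invented purely for illustration:

```python
def review_metrics(true_positives, false_positives, false_negatives):
    """Precision, recall and F1 for a sampled set of coding decisions.

    precision = TP / (TP + FP)  -- how much of what was tagged responsive really is
    recall    = TP / (TP + FN)  -- how much of the truly responsive material was found
    F1 balances the two, which is where the risk/reward trade-off lives.
    """
    precision = true_positives / (true_positives + false_positives)
    recall = true_positives / (true_positives + false_negatives)
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1

# Hypothetical validation sample: 450 documents correctly tagged responsive,
# 50 tagged responsive in error, and 150 responsive documents missed.
p, r, f = review_metrics(450, 50, 150)
print(f"precision={p:.2f} recall={r:.2f} f1={f:.2f}")  # precision=0.90 recall=0.75 f1=0.82
```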

Accurately divining the future is difficult (some might say impossible), but in the electronic discovery arena many of these predictions can happen if enough practitioners decide they want them to happen.  So, the future is fortunately within reach.