
Posts Tagged ‘early case assessment’

The Gartner 2013 Magic Quadrant for eDiscovery Software is Out!

Wednesday, June 12th, 2013

This week marks the release of the third annual Gartner Magic Quadrant for e-Discovery Software report. In the early days of eDiscovery, most companies outsourced almost every sizeable project to vendors and law firms, so eDiscovery software was barely a blip on the radar screen for technology analysts. Fast forward a few years to an era of explosive information growth and rising eDiscovery costs, and the landscape has changed significantly. Today, much of the outsourced eDiscovery “services” business has been replaced by eDiscovery software solutions that organizations bring in house to reduce risk and cost. As a result, the enterprise eDiscovery software market is forecast to grow from $1.4 billion in total software revenue worldwide in 2012 to $2.9 billion by 2017. (See “Forecast: Enterprise E-Discovery Software, Worldwide, 2012 – 2017,” Tom Eid, December 2012.)

Not surprisingly, today’s rapidly growing eDiscovery software market has become significant enough to catch the attention of mainstream analysts like Gartner. This is good news for company lawyers who are used to delegating enterprise software decisions to IT departments and outside law firms, because today those same lawyers are involved in eDiscovery and other information management software purchasing decisions for their organizations. While these lawyers understand the company’s legal requirements, they do not necessarily understand how to choose the best technology to address those requirements. Conversely, IT representatives understand enterprise software, but they do not necessarily understand the law. Gartner bridges this information gap by providing in-depth, independent analysis of the top eDiscovery software solutions in the form of the Gartner Magic Quadrant for e-Discovery Software.

Gartner’s methodology for preparing the annual Magic Quadrant report is rigorous. Providers must meet quantitative requirements such as revenue and significant market penetration to be included in the report. If these threshold requirements are met, Gartner probes deeper by meeting with company representatives, interviewing customers, and soliciting feedback to written questions. Providers that make the cut are placed in one of four Magic Quadrant categories: “leaders,” “challengers,” “niche players,” or “visionaries.” Where each provider ends up on the quadrant is guided by an independent evaluation of each provider’s “ability to execute” and “completeness of vision.” Landing in the “leaders” quadrant is considered a top recognition.

The nine Leaders in this year’s Magic Quadrant have four primary characteristics (See figure 1 above).

The first is whether the provider has functionality that spans both sides of the Electronic Discovery Reference Model (EDRM): the left side (identification, preservation, litigation hold, collection, early case assessment (ECA) and processing) and the right side (processing, review, analysis and production). “While Gartner recognizes that not all enterprises — or even the majority — will want to perform legal-review work in-house, more and more are dictating what review tools will be used by their outside counsel or legal-service providers. As practitioners become more sophisticated, they are demanding that data change hands as little as possible, to reduce cost and risk. This is a continuation of a trend we saw developing last year, and it has grown again in importance, as evidenced both by inquiries from Gartner clients and reports from vendors about the priorities of current and prospective customers.”

We see this as consistent with the theme that providers with archiving solutions designed to automate data retention and destruction policies generally fared better than those without archiving technology. The rationale is that part of a good end-to-end eDiscovery strategy includes proactively deleting data organizations do not have a legal or business need to keep. This approach decreases the amount of downstream electronically stored information (ESI) organizations must review on a case-by-case basis so the cost savings can be significant.

Not surprisingly, whether or not a provider offers technology assisted review or predictive coding capabilities was another factor in evaluating each provider’s end-to-end functionality. The industry has witnessed a surge in predictive coding case law since 2012 and judicial interest has helped drive this momentum. However, a key driver for implementing predictive coding technology is the ability to reduce the amount of ESI attorneys need to review on a case-by-case basis. Given the fact that attorney review is the most expensive phase of the eDiscovery process, many organizations are complementing their proactive information reduction (archiving) strategy with a case-by-case information reduction plan that also includes predictive coding.

The second characteristic Gartner considered was that Leaders’ business models clearly demonstrate that their focus is software development and sales, as opposed to the provision of services. Gartner acknowledged that the eDiscovery services market is strong, but explains that the purpose of the Magic Quadrant is to evaluate software, not services. The justification is that “[c]orporate buyers and even law firms are trending towards taking as much e-Discovery process in house as they can, for risk management and cost control reasons. In addition, the vendor landscape for services in this area is consolidating. A strong software offering, which can be exploited for growth and especially profitability, is what Gartner looked for and evaluated.”

Third, Gartner believes the solution provider market is shrinking and that corporations are becoming more involved in buying decisions instead of deferring technology decisions to their outside law firms. Therefore, those in the Leaders category were expected to illustrate a good mix of corporate and law firm buying centers. The rationale behind this category is that law firms often help influence corporate buying decisions so both are important players in the buying cycle. However, Gartner also highlighted that vendors who get the majority of their revenues from the “legal solution provider channel” or directly from “law firms” may soon face problems.

The final characteristic Gartner considered for the Leaders quadrant is related to financial performance and growth. In measuring this component, Gartner explained that a number of factors were considered. Primary among them is whether the Leaders are keeping pace with or even exceeding overall market growth. (See “Forecast: Enterprise E-Discovery Software, Worldwide, 2012 – 2017,” Tom Eid, December 2012.)

Companies landing in Gartner’s Magic Quadrant for eDiscovery Software have reason to celebrate their position in an increasingly competitive market. To review Gartner’s full report yourself, click here. In the meantime, please feel free to share your own comments below as the industry anxiously awaits next year’s Magic Quadrant Report.

Gartner does not endorse any vendor, product or service depicted in its research publications, and does not advise technology users to select only those vendors with the highest ratings. Gartner research publications consist of the opinions of Gartner’s research organization and should not be construed as statements of fact. Gartner disclaims all warranties, expressed or implied, with respect to this research, including any warranties of merchantability or fitness for a particular purpose.

Legal Tech 2013 Sessions: Symantec explores eDiscovery beyond the EDRM

Wednesday, December 19th, 2012

Having previously predicted the ‘happenings-to-be’ as well as recommended the ‘what not to do’ at LegalTech New York, the veteran LTNY team here at Symantec has decided to build anticipation for the 2013 event via a video series starring the LTNY un-baptized associate.  Get introduced to our eDiscovery-challenged protagonist in the first of our videos (above).

As for this year’s show, we’re pleased to expand our presence and are very excited to introduce eDiscovery without limits, along with a LegalTech that promises sessions, social events and opportunities for attendees in the same vein. Regarding the first aspect – the sessions – the team of Symantec eDiscovery counsel will moderate panel sessions on topics ranging across and beyond the EDRM. Joined by distinguished industry representatives, they’ll push the discussion deeper in 5 sessions, with a potential 6 hours of CLE credit offered to attendees.

Matt Nelson, resident author of Predictive Coding for Dummies, will moderate “How good is your predictive coding poker face?”, where panelists tackle the recently controversial subjects of disclosing the use of predictive coding technology, statistical sampling and the production of training sets to the opposition.

Allison Walton will moderate, “eDiscovery in 3D: The New Generation of Early Case Assessment Techniques” where panelists will enlighten the crowd on taking ECA upstream into the information creation and retention stages and implementing an executable information governance workflow.  Allison will also moderate “You’re Doing it Wrong!!! How To Avoid Discovery Sanctions Due to a Flawed Legal Hold Process” where panelists recommend best practices towards a defensible legal hold process in light of potential changes in the FRCP and increased judicial scrutiny of preservation efforts.

Phil Favro will moderate “Protecting Your ESI Blindside: Why a “Defensible Deletion” Offense is the Best eDiscovery Defense” where panelists debate the viability of defensible deletion in the enterprise, the related court decisions to consider and quantifying the ROI to support a deletion strategy.

Chris Talbott will moderate a session on “Bringing eDiscovery back to Basics with the Clearwell eDiscovery Platform”, where engineer Anna Simpson will demonstrate Clearwell technology in the context of our panelists’ everyday use on cases ranging from FCPA inquiries to IP litigation.

Please browse our microsite for complete super session descriptions and a look at Symantec’s LTNY 2013 presence. We hope you stay tuned to eDiscovery 2.0 throughout January to hear what Symantec has planned for the plenary session, our special event, contest giveaways and product announcements.

Spotlighting the Top Electronic Discovery Cases from 2012

Friday, December 14th, 2012

With the New Year quickly approaching, it is worth reflecting on some of the key eDiscovery developments that have occurred during 2012. While legislative, regulatory and rulemaking bodies have undoubtedly impacted eDiscovery, the judiciary has once again played the most dramatic role.  There are several lessons from the top 2012 court cases that, if followed, will likely help organizations reduce the costs and risks associated with eDiscovery. These cases also spotlight the expectations that courts will likely have for organizations in 2013 and beyond.

Implementing a Defensible Deletion Strategy

Case: Brigham Young University v. Pfizer, 282 F.R.D. 566 (D. Utah 2012)

In Brigham Young, the plaintiff university had pressed for sanctions as a result of Pfizer’s destruction of key documents pursuant to its information retention policies. The court rejected that argument because such a position failed to appreciate the basic workings of a valid corporate retention schedule. As the court reasoned, “[e]vidence may simply be discarded as a result of good faith business procedures.” When those procedures operate to inadvertently destroy evidence before the duty to preserve is triggered, the court held that sanctions should not issue: “The Federal Rules protect from sanctions those who lack control over the requested materials or who have discarded them as a result of good faith business procedures.”

Summary: The Brigham Young case is significant since it emphasizes that organizations should implement a defensible deletion strategy to rid themselves of data stockpiles. Absent a preservation duty or other exceptional circumstances, organizations that pare back ESI pursuant to “good faith business procedures” (such as a neutral retention policy) will be protected from sanctions.

**Another Must-Read Case: Danny Lynn Elec. v. Veolia Es Solid Waste (M.D. Ala. Mar. 9, 2012)

Issuing a Timely and Comprehensive Litigation Hold

Case: Apple, Inc. v. Samsung Electronics Co., Ltd, — F. Supp. 2d. — (N.D. Cal. 2012)

Summary: The court first issued an adverse inference instruction against Samsung to address spoliation charges brought by Apple. In particular, the court faulted Samsung for failing to circulate a comprehensive litigation hold instruction when it first anticipated litigation. This eventually culminated in the loss of emails from several key Samsung custodians, inviting the court’s adverse inference sanction.

Ironically, however, Apple was subsequently sanctioned for failing to issue a proper hold notice. Just like Samsung, Apple failed to distribute a hold until several months after litigation was reasonably foreseeable. The tardy hold instruction, coupled with evidence suggesting that Apple employees were “encouraged to keep the size of their email accounts below certain limits,” ultimately led the court to conclude that Apple destroyed documents after its preservation duty ripened.

The Lesson for 2013: The Apple case underscores the importance of issuing a timely and comprehensive litigation hold notice. For organizations, this likely means identifying the key players and data sources that may have relevant information and then distributing an intelligible hold instruction. It may also require suspending aspects of information retention policies to preserve relevant ESI. By following these best practices, organizations can better avoid the sanctions bogeyman that haunts so many litigants in eDiscovery.

**Another Must-Read Case: Chin v. Port Authority of New York, 685 F.3d 135 (2nd Cir. 2012)

Judicial Approval of Predictive Coding

Case: Da Silva Moore v. Publicis Groupe, — F.R.D. — (S.D.N.Y. Feb. 24, 2012)

Summary: The court entered an order that turned out to be the first of its kind: approving the use of predictive coding technology in the discovery phase of litigation. That order was entered pursuant to the parties’ stipulation, which provided that defendant MSL Group could use predictive coding in connection with its obligation to produce relevant documents. Pursuant to that order, the parties methodically (yet at times acrimoniously) worked over several months to fine tune the originally developed protocol to better ensure the production of relevant documents by defendant MSL.

The Lesson for 2013: The court declared in its order that predictive coding “is an acceptable way to search for relevant ESI in appropriate cases.” Nevertheless, the court also made clear that this technology is not now the exclusive method for conducting document review. Instead, predictive coding should be viewed as one of many different types of tools that often can and should be used together.

**Another Must-Read Case: In Re: Actos (Pioglitazone) Prods. Liab. Litig. (W.D. La. July 10, 2012)

Proportionality and Cooperation are Inextricably Intertwined

Case: Pippins v. KPMG LLP, 279 F.R.D. 245 (S.D.N.Y. 2012)

Summary: The court ordered the defendant accounting firm (KPMG) to preserve thousands of employee hard drives. The firm had argued that the high cost of preserving the drives was disproportionate to the value of the ESI stored on the drives. Instead of preserving all of the drives, the firm hoped to maintain a reduced sample, asserting that the ESI on the sample drives would satisfy the evidentiary demands of the plaintiffs’ class action claims.

The court rejected the proportionality argument primarily because the firm refused to permit plaintiffs or the court to analyze the ESI found on the drives. Without any transparency into the contents of the drives, the court could not weigh the benefits of the discovery against the alleged burdens of preservation. The court was thus left to speculate about the nature of the ESI on the drives, reasoning that it went to the heart of plaintiffs’ class action claims. As the district court observed, the firm may very well have obtained the relief it requested had it engaged in “good faith negotiations” with the plaintiffs over the preservation of the drives.

The Lesson for 2013: The Pippins decision reinforces a common refrain that parties seeking the protection of proportionality principles must engage in reasonable, cooperative discovery conduct. Staking out uncooperative positions in the name of zealous advocacy stands in sharp contrast to proportionality standards and the cost-cutting mandate of Rule 1. Moreover, such a tactic may very well foreclose proportionality considerations, just as it did in Pippins.

**Another Must-Read Case: Kleen Products LLC v. Packaging Corp. of America (N.D. Ill. Sept. 28, 2012)

Conclusion

There were any number of other significant cases from 2012 that could have made this list.  We invite you to share your favorites in the comments section or contact us directly with your feedback.

Where There’s Smoke There’s Fire: Powering eDiscovery with Data Loss Prevention

Monday, November 12th, 2012

New technologies are being repurposed for early case assessment (ECA) in an ever-changing global economy chock-full of intellectual property theft and cybertheft. These increasingly hot issues are compelling lawyers to become savvier about the technologies they can use to identify IP theft and related issues in eDiscovery. One of the more useful, but often overlooked, tools in this regard is Data Loss Prevention (DLP) technology. Traditionally a data breach and security tool, DLP has emerged as yet another tool in the Litigator’s Tool Belt™ that can be applied in eDiscovery.

DLP technology utilizes Vector Machine Learning (VML) to detect intellectual property, such as product designs, source code and trademarked language, that is deemed proprietary and confidential. This technology eliminates the need for developing laborious keyword-based policies or fingerprinting documents. While a corporation can certainly customize these policies, there are off-the-shelf materials that make the technology easy to deploy.
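
To make this concrete, below is a minimal, illustrative sketch of a machine-learning text classifier in the same spirit. It uses the open-source scikit-learn library rather than Symantec’s actual VML implementation, and the folder names and sample text are hypothetical placeholders for real policy training data.

# Illustrative only: a simple ML-based detector for proprietary content,
# assuming scikit-learn and two folders of example .txt files
# ("proprietary/" and "public/") standing in for managed policy training data.
from pathlib import Path

from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC

def load_corpus(folder, label):
    docs, labels = [], []
    for path in Path(folder).glob("*.txt"):
        docs.append(path.read_text(errors="ignore"))
        labels.append(label)
    return docs, labels

prop_docs, prop_labels = load_corpus("proprietary", 1)   # e.g. design notes, source code
pub_docs, pub_labels = load_corpus("public", 0)          # e.g. press releases, marketing copy

# Train a linear classifier on TF-IDF features -- no keyword lists or fingerprints needed.
model = make_pipeline(TfidfVectorizer(stop_words="english"), LinearSVC())
model.fit(prop_docs + pub_docs, prop_labels + pub_labels)

# Flag an outbound document if it resembles the proprietary training examples.
outbound = ["Attached is the updated fiber spin process specification ..."]
print(model.predict(outbound))   # [1] would indicate a potential policy incident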

An exemplary use case that spotlights how DLP could have been deployed in the eDiscovery context is E.I. du Pont de Nemours v. Kolon Industries. In DuPont, a jury issued a $919 million verdict after finding that the defendant manufacturer stole critical elements of the formula for Kevlar, a closely guarded and highly profitable DuPont trade secret. Despite the measures that were taken to protect the trade secret, a former DuPont consultant successfully copied key information relating to Kevlar onto a CD that was later disseminated to the manufacturer’s executives. All of this came to light in the recently unsealed criminal indictments the U.S. Department of Justice obtained against the manufacturer and several of its executives.

Perhaps all of this could have been avoided had a DLP tool been deployed. A properly implemented DLP solution in the DuPont case might have detected the misappropriation that occurred and perhaps prompted an internal investigation. At the very least, DLP could possibly have mitigated the harmful effects of the trade secret theft. DLP technology could potentially have detected the departure/copying of proprietary information and any other suspicious behavior regarding sensitive IP.

As the DuPont case teaches, DLP can be utilized to detect IP theft and data breaches. In addition, it can act as an early case assessment (ECA) tool for lawyers in both civil and criminal actions. With data breaches, where there is smoke (a breach) there is generally fire (litigation). A DLP incident report can be used as the basis for an investigation, essentially reverse-engineering the ECA process from the hard evidence underlying the data breach. Thus, instead of beginning an investigation with a hunch or a tangential lead, DLP gives lawyers hard facts and ultimately serves as a roadmap for implementing effective legal holds on custodians’ communications. Instead of discovering data breaches during the discovery process, DLP allows lawyers to start with this information, making the entire matter more efficient and targeted.

From an information governance point of view, DLP also has a relationship with the proactive left side of the Electronic Discovery Reference Model. DLP technology can be repurposed as Data Classification Services (DCS) for automated document retention. The policy and technology combination of DCS and DLP work in concert to accomplish appropriate document retention as well as breach prevention and notification. It follows that similar identifiers appear in both policy consoles, and that these indicators enable the technology to make intelligent decisions.

Given this backdrop, it behooves both firm lawyers and corporate counsel to consider getting up to speed on the capabilities of DLP tools. The benefits DLP offers in eDiscovery are too important to be ignored.

Federal Directive Hits Two Birds (RIM and eDiscovery) with One Stone

Thursday, October 18th, 2012

The eagerly awaited Directive from The Office of Management and Budget (OMB) and The National Archives and Records Administration (NARA) was released at the end of August. In an attempt to go behind the scenes, we’ve asked the Project Management Office (PMO) and the Chief Records Officer for the NARA to respond to a few key questions. 

We know that the Presidential Mandate was the impetus for the agency self-assessments that were submitted to NARA. Now that NARA and the OMB have distilled those reports, what are the biggest challenges going forward for the government regarding record keeping, information governance and eDiscovery?

“In each of those areas, the biggest challenge that can be identified is the rapid emergence and deployment of technology. Technology has changed the way Federal agencies carry out their missions and create the records required to document that activity. It has also changed the dynamics in records management. In the past, agencies would maintain central file rooms where records were stored and managed. Now, with distributed computing networks, records are likely to be in a multitude of electronic formats, on a variety of servers, and exist as multiple copies. Records management practices need to move forward to solve that challenge. If done right, good records management (especially of electronic records) can also be of great help in providing a solid foundation for applying best practices in other areas, including in eDiscovery, FOIA, as well as in all aspects of information governance.”    

What is the biggest action item from the Directive for agencies to take away?

“The Directive creates a framework for records management in the 21st century that emphasizes the primacy of electronic information and directs agencies to begin transforming their current processes to identify and capture electronic records. One milestone is that by 2016, agencies must be managing their email in an electronically accessible format (with tools that make this possible, not printing out emails to paper). Agencies should begin planning for the transition, where appropriate, from paper-based records management processes to those that preserve records in an electronic format.

The Directive also calls on agencies to designate a Senior Agency Official (SAO) for Records Management by November 15, 2012. The SAO is intended to raise the profile of records management in an agency to ensure that each agency commits the resources necessary to carry out the rest of the goals in the Directive. A meeting of SAOs is to be held at the National Archives with the Archivist of the United States convening the meeting by the end of this year. Details about that meeting will be distributed by NARA soon.”

Does the Directive holistically address information governance for the agencies, or is it likely that agencies will continue to deploy different technology even within their own departments?

“In general, as long as agencies are properly managing their records, it does not matter what technologies they are using. However, one of the drivers behind the issuance of the Memorandum and the Directive was identifying ways in which agencies can reduce costs while still meeting all of their records management requirements. The Directive specifies actions (see A3, A4, A5, and B2) in which NARA and agencies can work together to identify effective solutions that can be shared.”

Finally, although FOIA requests have increased and the backlog has decreased, how will litigation and FOIA intersect in the next, say, 5 years? We know from the retracted decision in NDLON that metadata still remains an issue for the government…are we getting to a point where records created electronically will be able to be produced electronically as a matter of course for FOIA litigation and requests?

“In general, an important feature of the Directive is that the Federal government’s record information – most of which is in electronic format – stays in electronic format. Therefore, all of the inherent benefits will remain as well – i.e., metadata being retained, easier and speedier searches to locate records, and efficiencies in compilation, reproduction, transmission, and reduction in the cost of producing the requested information. This all would be expected to have an impact in improving the ability of federal agencies to respond to FOIA requests by producing records in electronic formats.”

Fun Fact – Is NARA really saving every tweet produced?

“Actually, the Library of Congress is the agency that is preserving Twitter. NARA is interested in only preserving those tweets that a) were made or received in the course of government business and b) appraised to have permanent value. We talked about this on our Records Express blog.”

“We think President Barack Obama said it best when he made the following comment on November 28, 2011: ‘The current federal records management system is based on an outdated approach involving paper and filing cabinets. Today’s action will move the process into the digital age so the American public can have access to clear and accurate information about the decisions and actions of the Federal Government.’”

Paul Wester, Chief Records Officer at the National Archives, has stated that this Directive is very exciting for the Federal records management community: “In our lifetime none of us has experienced the attention to the challenges that we encounter every day in managing our records management programs like we are now. These are very exciting times to be a records manager in the Federal government. Full implementation of the Directive by the end of this decade will take a lot of hard work, but the government will be better off for doing this and we will be better able to serve the public.”

Special thanks to NARA for the ongoing dialogue that is key to transparent government and the effective practice of eDiscovery, Freedom Of Information Act requests, records management and thought leadership in the government sector. Stay tuned as we continue to cover these crucial issues for the government as they wrestle with important information governance challenges. 

 

Responsible Data Citizens Embrace Old World Archiving With New Data Sources

Monday, October 8th, 2012

The times are changing rapidly as the data explosion mushrooms, but the more things change, the more they stay the same. In the archiving and eDiscovery world, organizations are increasingly pushing content from multiple data sources into information archives. Email was the first data source to take the plunge into the archive, but other data sources are following quickly as we increase the amount of data we create (volume) along with the types of data sources (variety). While email is still a paramount data source for litigation, internal/external investigations and compliance, other data sources – namely social media and SharePoint – are quickly catching up.

This transformation is happening for multiple reasons. The main reason for this expansive push of different data varieties into the archive is that centralizing an organization’s data is paramount to healthy information governance. For organizations that have deployed archiving and eDiscovery technologies, the ability to archive multiple data sources is the Shangri-La they have been looking for to increase efficiency, as well as to create a more holistic and defensible workflow.

Organizations can now deploy document retention policies across multiple content types within one archive and can identify, preserve and collect from the same, singular repository. No longer do separate retention policies need to apply to data that originated in different repositories. The increased ability to archive more data sources into a centralized archive provides for unparalleled storage, deduplication, document retention, defensible deletion and discovery benefits in an increasingly complex data environment.

Prior to this capability, SharePoint was another data source in the wild that needed disparate treatment. This meant that in-place legal hold, as well as insight into the corpus of data, was not as clear as it was for email. This lack of transparency into the organization’s data environment for early case assessment led to unnecessary outsourcing, over-collection and disparate, time-consuming workflows. All of these drawbacks cost organizations money, resources and time that could be better utilized elsewhere.

Bringing data sources like SharePoint into an information archive increases the ability for an organization to comply with necessary document retention schedules, legal hold requirements, and the ability to reap the benefits of a comprehensive information governance program. If SharePoint is where an organization’s employees are storing documents that are valuable to the business, order needs to be brought to the repository.

Additionally, many projects are abandoned and left to die on the vine in SharePoint. These projects need to be expired and their capacity recycled for a higher business purpose. Archives can now capture document libraries, wikis, discussion boards, custom lists, “My Sites” and SharePoint social content for increased storage optimization, retention/expiration of content and eDiscovery. As a result, organizations can better manage complex projects such as migrations, versioning, site consolidations and expiration with SharePoint archiving.

Data can be analogized to a currency, where the archive is the bank. In treating data as a currency, organizations must ask themselves: why are companies valued the way they are on Wall Street? Companies that provide services, or services in combination with products, are often valued on customer lists, consumer data that can be repurposed (Facebook), and various other databases. A recent Forbes article discusses people, value and brand as predominant indicators of value.

While these valuation metrics are sound, they stop short of measuring the quality of the actual data within an organization and examining whether it is organized and protected. The valuation also does not consider the risks and benefits of how the data is stored and protected, and whether or not it is searchable. The value of the data inside a company is what supports all three of the aforementioned valuations without exception. Without managing the data in an organization, not only are eDiscovery and storage costs a legal and financial risk, but the aforementioned three are also compromised.

If employee data is not managed/monitored appropriately, if the brand is compromised due to lack of social media monitoring/response, or if litigation ensues without the proper information governance plan, then value is lost because value has not been assessed and managed. Ultimately, an organization is only as good as its data, and this means there’s a new asset on Wall Street – data.

It’s not a new concept to archive email, nor is it novel that data is an asset. It has just been a less understood asset because, even though massive amounts of data are created each day in organizations, storage has become cheap. SharePoint is becoming more archivable because more critical data is being stored there, including business records, contracts and social media content. Organizations cannot fear what they cannot see until they are forced by an event to go back and collect, analyze and review that data. Costs associated with this reactive eDiscovery process can range from $3,000-$30,000 a gigabyte, compared to roughly 20 cents per gigabyte for storage. The downstream eDiscovery costs are obviously substantial, especially as organizations begin to deal in terabytes and zettabytes.
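
To put those per-gigabyte figures in perspective, here is a rough back-of-the-envelope comparison. The 100 GB matter size is a hypothetical assumption; the unit costs are simply the ranges cited above.

# Rough, illustrative comparison using the per-gigabyte figures cited in this post;
# the 100 GB matter size is a hypothetical assumption.
matter_size_gb = 100

storage_cost = matter_size_gb * 0.20           # ~$0.20/GB to keep the data on disk
review_cost_low = matter_size_gb * 3_000       # low end of reactive eDiscovery
review_cost_high = matter_size_gb * 30_000     # high end of reactive eDiscovery

print(f"Storage: ${storage_cost:,.2f}")                                       # Storage: $20.00
print(f"Reactive eDiscovery: ${review_cost_low:,} to ${review_cost_high:,}")  # $300,000 to $3,000,000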

Hence, plus ça change, plus c’est la même chose, and we will see this trend continue as organizations push more valuable data into the archive and expire data that has no value. Multiple data sources have been collection sources for some time, but the ease of pulling everything into an archive is allowing for economies of scale and increased defensibility regarding data management. This will decrease the risks associated with litigation and compliance, as well as boost the value of companies.

From A to PC – Running a Defensible Predictive Coding Workflow

Tuesday, September 11th, 2012

So far in our ongoing predictive coding blog series, we’ve touched on the “whys” and “whats” of predictive coding, and now I’d like to address the “hows” of using this new technology. Given that predictive coding is groundbreaking technology in the world of eDiscovery, it’s no surprise that a different workflow is required in order to run the review process.

The traditional linear review process utilizes a “brute force” approach of manually reading each document and processing it for responsiveness and privilege. In order to reduce the high cost of this process, many organizations now farm out documents to contract attorneys for review. Often, however, contract attorneys possess less expertise and knowledge of the issues, which means that multiple review passes along with additional checks and balances are often needed in order to ensure review accuracy. This process commonly results in a significant number of documents being reviewed multiple times, which in turn increases the cost of review. When you step away from an “eyes-on review” of every document and use predictive coding to leverage the expertise of more experienced attorneys, you will naturally aim to review as few documents as possible in order to achieve the best possible results.

How do you review the minimum number of documents with predictive coding? For starters, organizations should prepare their case by performing an early case assessment (ECA) to cull down to the review population prior to review. While some may suggest that predictive coding can be run without any ECA up front, you will actually save a significant amount of review time if you put in the effort to cull out the profoundly irrelevant documents in your case. Doing so prevents a “junk in, junk out” situation, where leaving too much junk in the case forces you to review junk documents throughout the predictive coding workflow.
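
The following is a minimal, illustrative ECA culling sketch. It assumes each document is represented as a dictionary with text, custodian and date fields; the junk terms, custodians and date window are hypothetical placeholders for real case criteria.

# Illustrative ECA cull: drop obvious junk and keep only key custodians within
# the relevant date window. All criteria below are hypothetical placeholders.
from datetime import date

JUNK_TERMS = {"unsubscribe", "weekly newsletter", "out of office"}
KEY_CUSTODIANS = {"jsmith", "mlee"}
DATE_RANGE = (date(2011, 1, 1), date(2012, 6, 30))

def keep_for_review(doc):
    """Return True if the document survives the ECA cull."""
    text = doc["text"].lower()
    if any(term in text for term in JUNK_TERMS):
        return False
    if doc["custodian"] not in KEY_CUSTODIANS:
        return False
    return DATE_RANGE[0] <= doc["date"] <= DATE_RANGE[1]

corpus = [
    {"text": "Q3 pricing strategy draft", "custodian": "jsmith", "date": date(2011, 9, 14)},
    {"text": "Weekly newsletter - click to unsubscribe", "custodian": "mlee", "date": date(2011, 9, 15)},
]
review_population = [d for d in corpus if keep_for_review(d)]
print(len(review_population))   # 1 -- only the pricing document survives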

Next, segregating documents that are unsuitable for predictive coding is important. Most predictive coding solutions rely on the extracted text content within documents to operate. That means any documents that do not contain extracted text, such as photographs and engineering schematics, should be manually reviewed so they are not overlooked by the predictive coding engine. The same concept applies to any other document with review limitations, such as encrypted and password-protected files. All of these documents should be reviewed separately so as not to miss any relevant documents.

After culling down to your review population, the next step in preparing to use predictive coding is to create a Control Set by drawing a randomly selected statistical sample from the document population. Once the Control Set is manually reviewed, it will serve two main purposes. First, it will allow you to estimate the population yield, otherwise referred to as the percentage of responsive documents contained within the larger population. (The size of the Control Set may need to be adjusted to ensure the yield is properly taken into account.) Second, it will serve as your baseline for a true “apples-to-apples” comparison of your prediction accuracy across iterations as you move through the predictive coding workflow. The Control Set only needs to be reviewed once, up front, and can then be used to measure accuracy throughout the workflow.
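
As a rough illustration, drawing a control set and estimating yield might look like the sketch below. The population size, sample size, number of responsive documents found and 95% confidence level are all hypothetical choices, not prescriptions.

# Illustrative control set draw and yield estimate; all figures are hypothetical.
import math
import random

def draw_control_set(doc_ids, sample_size, seed=42):
    """Randomly sample document IDs from the entire population."""
    rng = random.Random(seed)
    return rng.sample(doc_ids, sample_size)

def estimate_yield(responsive_count, sample_size, z=1.96):
    """Point estimate and 95% margin of error for the population yield."""
    p = responsive_count / sample_size
    margin = z * math.sqrt(p * (1 - p) / sample_size)
    return p, margin

population = [f"DOC-{i:06d}" for i in range(250_000)]      # hypothetical population
control_set = draw_control_set(population, sample_size=1_500)

# Suppose manual review of the control set found 120 responsive documents.
p, moe = estimate_yield(responsive_count=120, sample_size=len(control_set))
print(f"Estimated yield: {p:.1%} +/- {moe:.1%}")           # Estimated yield: 8.0% +/- 1.4%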

It is essential that the documents in the Control Set are selected randomly from the entire population. While some believe that other sampling approaches give better peace of mind, they may actually result in unnecessary review. For example, other workflows recommend sampling from the documents that are not predicted to be relevant to see if anything was left behind. If you instead create a proper Control Set from the entire population, you get precision and recall metrics that are representative of the entire population, which in turn includes the documents that are not predicted to be relevant.

Once the Control Set is created, you can begin training the software to evaluate documents against the review criteria in the case. Selecting the optimal set of documents to train the system (commonly referred to as the training set or seed set) is one of the most important steps in the entire predictive coding workflow, as it sets the initial accuracy for the system, and thus it should be chosen carefully. Some suggest creating the initial training set by taking a random sample from the population (much like how the Control Set is selected) instead of proactively selecting responsive documents. However, the items used for training should adequately represent the responsive material. Selecting responsive documents for inclusion in the training set is important because most eDiscovery cases have low yield – meaning the prevalence of responsive documents within the overall document population is low. The system will not be able to effectively learn how to identify responsive items if enough responsive documents are not included in the training set.

An effective method for selecting the initial training set is to use a targeted search to locate a small set of documents (typically between 100 and 1,000) that is expected to be about 50% responsive. For example, you may choose to focus on only the key custodians in the case and use a combination of tighter keyword, date range and similar search criteria. You do not have to perform exhaustive searches, but a high-quality initial training set will likely minimize the amount of additional training needed to achieve high prediction accuracy.

After the initial training set is selected, it must then be reviewed. It is extremely important that the review decisions made on any training items are as accurate as possible, since the system will be learning from these items, which typically means that the more experienced case attorneys should be used for this review. Once review is finished on all of the training documents, the system can learn from the tagging decisions in order to predict the responsiveness or non-responsiveness of the remaining documents.

While you can now generate predictions for all of the other documents in the population, it is most important to predict on the Control Set at this stage. Not only is this more time-effective than applying predictions to every document in the case, but you will need predictions on all of the Control Set documents in order to assess the accuracy of the predictions. With predictions and tagging decisions on each of the Control Set documents, you will be able to compute precision and recall metrics that you can extrapolate to the entire review population.
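
A sketch of the precision and recall calculation on the control set follows. It assumes two dictionaries keyed by document ID, one with the reviewers’ tags (ground truth) and one with the system’s predictions, each True for responsive; the sample values are hypothetical.

# Illustrative precision/recall on the control set; sample values are hypothetical.
def precision_recall(truth, predicted):
    tp = sum(1 for doc, tag in truth.items() if tag and predicted.get(doc, False))
    fp = sum(1 for doc, tag in truth.items() if not tag and predicted.get(doc, False))
    fn = sum(1 for doc, tag in truth.items() if tag and not predicted.get(doc, False))
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall

truth = {"DOC-1": True, "DOC-2": False, "DOC-3": True, "DOC-4": False}
preds = {"DOC-1": True, "DOC-2": True, "DOC-3": False, "DOC-4": False}

p, r = precision_recall(truth, preds)
print(f"Precision: {p:.0%}  Recall: {r:.0%}")   # Precision: 50%  Recall: 50%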

At this point, the accuracy of the predictions is unlikely to be optimal, and thus the iterative process begins. In order to increase accuracy, you must select additional documents for training the system. Much like the initial training set, this additional training set must also be selected carefully. The best documents to use are those that the system is least able to predict accurately. Rather than choosing these documents manually, the software is often able to identify this set mathematically, and more effectively than human reviewers can. Once these documents are selected, you simply continue the iterative process of training, predicting and testing until your precision and recall are at an acceptable point. Following this workflow will result in a set of documents identified as responsive by the system, along with trustworthy and defensible accuracy metrics.
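
One common way software makes that selection is uncertainty-based sampling, sketched below under the assumption that the system exposes a responsiveness score between 0 and 1 for each document; scores closest to 0.5 are the ones the model is least sure about. The scores shown are hypothetical.

# Illustrative uncertainty sampling for the next training round; scores are hypothetical.
def select_next_training_set(scores, batch_size):
    """Pick the documents whose scores sit closest to the 0.5 decision boundary."""
    ranked = sorted(scores.items(), key=lambda kv: abs(kv[1] - 0.5))
    return [doc_id for doc_id, _ in ranked[:batch_size]]

scores = {"DOC-10": 0.97, "DOC-11": 0.52, "DOC-12": 0.08, "DOC-13": 0.47, "DOC-14": 0.61}
print(select_next_training_set(scores, batch_size=2))   # ['DOC-11', 'DOC-13']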

You cannot simply produce all of these documents at this point, however. The documents must still go through a privileged screen in order to remove any documents that should not be produced, and also go through any other review measures that you usually take on your responsive documents. This does, however, open up the possibility of applying additional rounds of predictive coding on top of this set of responsive documents. For example, after running the privileged screen, you can train on the privileged tag and attempt to identify additional privileged documents in your responsive set that were missed.

The important thing to keep in mind is that predictive coding is meant to strengthen your current review workflows. While we have outlined one possible workflow that utilizes predictive coding, the flexibility of the technology lends itself to be utilized for a multitude of other uses, including prioritizing a linear review. Whatever application you choose, predictive coding is sure to be an effective tool in your future reviews.

Breaking News: Court Issues 20-Year Product Injunction in Trade Secret Theft/eDiscovery Sanctions Case

Friday, August 31st, 2012

The court in E.I. du Pont de Nemours v. Kolon Industries entered a stunning 20-year worldwide product injunction yesterday in a trade secret theft case involving Kevlar®. Almost a year after a jury returned a $919 million verdict for DuPont, the court found that defendant Kolon Industries’ actions merited a permanent injunction. Not only is Kolon barred from competing with DuPont’s Kevlar® product for the next 20 years, it must also give a court-appointed expert access to its computer files to confirm that it has swept out and returned all of the stolen trade secrets.

Remarkably enough, Kolon’s troubles are still not over. While Kolon has moved to stay the injunction pending its appeal, it must still wait for the court’s attorney fee order that will likely result in a substantial award for DuPont. It is worth noting that a significant portion of that award will encompass the fees and costs that DuPont incurred to address Kolon’s eDiscovery spoliation, which culminated in the game-changing adverse inference instruction the court read to the jury.

Given that the DuPont trial proceedings are essentially over, it is worth analyzing whether the results for the parties might have been different absent the eDiscovery sanctions. Had Kolon been able to prevent the key evidence from being destroyed, perhaps it could have mitigated the disastrous results with a smaller jury verdict or perhaps even a settlement. While perhaps nothing more than speculation, it is clear that an information governance strategy would have helped Kolon with its preservation duties and efforts to obtain an early assessment of the likely outcome. Such results would certainly have been an improvement over the game-ending jury instruction stemming from Kolon’s eDiscovery and information retention deficiencies.

The 2012 EDGE Summit (21st Century Technology for Information Governance) Debuts in the Nation’s Capital

Monday, April 23rd, 2012

The EDGE Summit this week is one of the most prestigious eDiscovery events of the year, as well as arguably the largest for the government sector. This year’s topics and speakers are top notch. The opening keynote speaker will be the Director of Litigation for the National Archives and Records Administration (NARA), Mr. Jason Baron. The EDGE Summit will be Mr. Baron’s first appearance since the deadline for the 480 agencies to submit the reports his agency will use to construct the Directive required by the Presidential Mandate. Attendees will be eager to hear what steps NARA is taking toward issuing a Directive to the government later this year, and the potential impact it will have on how the government approaches its eDiscovery obligations. The Directive will be a significant step in attempting to bring order to the government’s Big Data challenges and to unify agencies around a similar approach to an information governance plan.

Also speaking at EDGE is the renowned Judge Facciola, who will be discussing the anticipated updates the American Bar Association (ABA) is expected to make to the Model Rules of Professional Conduct. He plans to speak on the challenges that lawyers are facing in the digital age, and what that means with regard to competency as a practicing lawyer. He will also focus on government lawyers and how they can better meet their legal obligations through education, training, or knowing when and how to find the right expert. Whether the government is the investigating party in a law enforcement matter, the producing party under the Freedom of Information Act (FOIA), or the defendant in civil litigation, Judge Facciola will discuss what he sees in his courtroom every day and where the true knowledge gaps are in the technological understanding of many lawyers today.

While the EDGE Summit offers CLE credit, it also has a unique practical aspect. There will be a FOIA-specific lab, a lab on investigations, one on civil litigation and early case assessment (ECA), and one on streamlining the eDiscovery workflow process. Those who plan on attending the labs will get the hands-on experience with technology that few educational events offer. It is rare to get in the driver’s seat of the car on the showroom floor and actually drive, which is what EDGE is providing for end users and interested attendees. When talking about the complex problems government agencies face today with Big Data, records management, information governance, eDiscovery, compliance, security, etc., it is necessary to give users a way to truly visualize how these technologies work.

Another key draw at the Summit will be the panel discussions, which will feature experienced government lawyers who have been on the front lines of litigation and have unique perspectives. The legal hold panel will cover some exciting aspects of the evolution of manual versus automated processes for legal hold. Mr. David Shonka, the Deputy General Counsel of the Federal Trade Commission, is on the panel, and he will discuss the defensibility of the process the FTC used and the experience his department had with two 30(b)(6) witnesses in Federal Trade Commission v. Lights of America, Inc. (C.D. Cal. Mar. 2011). The session will also cover how issuing a legal hold is imperative once the duty to preserve has been triggered. There is a whole new generation of lawyers managing the litigation hold process in an automated way, and it will be great to discuss both the manual and automated approaches and talk about best practices for government agencies. There will also be a session on predictive coding and a discussion of the recent cases that have involved the use of technology-assisted review. While we are not at the point of mainstream adoption for predictive coding, it is quite exciting to think about the government going from a paper world straight into solutions that would help it manage its unique challenges as well as save time and money.

Finally, the EDGE Summit will conclude with closing remarks from The Hon. Michael Chertoff, former Secretary of the U.S. Department of Homeland Security from 2005 to 2009. Mr. Chertoff presently provides high-level strategic counsel to corporate and government leaders on a broad range of security issues, from risk identification and prevention to preparedness, response and recovery. All of these issues now involve data and how to search, collect, analyze, protect and store it. Security is one of the most important aspects of information governance. The government has unique challenges, including its size and many geographical locations, records management requirements, massive data volume and case load, investigations, and heightened security and defense intelligence risks. This year, in particular, will be a defining year; not only because of the Presidential Mandate, but because of the information explosion and the stretch of the global economy. This is why the sector needs to come together to share best practices and hear success stories. Otherwise, it won’t be able to keep up with the data explosion that’s threatening private and public sectors alike.

eDiscovery Down Under: New Zealand and Australia Are Not as Different as They Sound, Mate!

Thursday, March 29th, 2012

Shortly after arriving in Wellington, New Zealand, I picked up the Dominion Post newspaper and read its lead article: a story involving U.S. jurisdiction being exercised over billionaire NZ resident Mr. Kim Dotcom. The article reinforced the challenges we face with the blurred legal and data governance issues presented by the globalization of the economy and the expansive reach of the internet. Originally from Germany, and having changed his surname to reflect the origin of his fortune, Mr. Dotcom has become all too familiar in NZ of late. He has just purchased two opulent homes in NZ and has become an internationally controversial figure for internet piracy. Mr. Dotcom’s legal troubles arise out of his internet business, which enables illegal downloads of pirated material between users and allegedly powers the largest copyright infringement in global history. It is estimated that his website accounts for 4% of the world’s internet traffic, which means there could be tons of discovery in this case (or cases).

The most recent legal problems Mr. Dotcom faces are with U.S. authorities, who want to extradite him to face charges that his Megaupload file-sharing website caused $500 million in copyright losses. From a criminal and record-keeping standpoint, Mr. Dotcom’s issues highlight the need for and use of appropriate technologies. In order to establish a case against him, it’s likely that search technologies were deployed by U.S. intelligence agencies to piece together Mr. Dotcom’s activities, banking information, emails and the data transfers on his site. In a case like this, where intelligence agencies would need to collect, search and cull email from so many different geographies and data sources down to just the relevant information, technologies that link email conversation threads and give transparent insight into a data collection set would provide immense value. Additionally, the Immigration bureau in New Zealand has been required to release hundreds of documents about Mr. Dotcom’s residency application that were requested under the Official Information Act (OIA). The records that Immigration had to produce were likely pulled from its archive or records management system in NZ, and then redacted for private information before production to the public.

The same tools that we use here in the U.S. for investigatory and compliance purposes, as well as for litigation, are needed in Australia and New Zealand to build a criminal case or to comply with the OIA. Information governance technology adoption in APAC is trending first toward government agencies, which are purchasing archiving and eDiscovery technologies more rapidly than private companies. Why is this? One reason could be that, because governments in APAC have a larger responsibility for healthcare, education and the protection of privacy, they are more invested in compliance requirements and in staying off the front page of the news for shortcomings. APAC private enterprises that are small or mid-sized and are not yet doing international business do not have the same archiving and eDiscovery needs large government agencies do, nor do they face litigation in the same way their American counterparts do. Large global companies, no matter where they are based, should assume that they may be subject to litigation wherever they do business.

An interesting NZ use case on the enterprise level is that of Transpower (the quasi-governmental energy agency), where compliance with both “private and public” requirements is mandatory. Transpower is an organisation that is government-owned, yet operates for a profit. Sally Myles, an experienced records manager who recently came to Transpower to head up information governance initiatives, says,

“We have to comply with the Public Records Act of 2005, and public requests for information are frequent as we are under constant scrutiny about where we will develop our plants. We also must comply with the Privacy Act of 1993. My challenge is to get the attention of our leadership to demonstrate why we need to make these changes and show them a plan for implementation as well as cost savings.”

Myles’ comments indicate NZ is facing many of the same information challenges we are here in the US with storage, records management and searching for meaningful information within the organisation.

Australia, New Zealand and U.S. Commonalities

In Australia and NZ, litigation is not seen as a compelling business driver the same way it is in the U.S. This is because many of the information governance needs of organisations are driven by regulatory, statutory and compliance requirements, and the environment is not as litigious as it is in the U.S. The Official Information Act in NZ and the Freedom of Information Act in Australia are analogous to the Freedom of Information Act (FOIA) here in the U.S. The requirements to produce public records alone justify the use of technology to manage large volumes of data and produce appropriately redacted information to the public. This is true regardless of litigation. Additionally, there are now cases like DuPont’s or Mr. Dotcom’s that legitimize the risk of litigation involving the U.S. The fact that implementing an information governance product suite will also enable a company to be prepared for litigation is a beneficial by-product for many entities, as they need technology for record-keeping and privacy reasons anyway. In essence, the same capabilities are achieved at the end of the day, regardless of the impetus for implementing a solution.

The Royal Commission – The Ultimate eDiscovery Vehicle

One way to think about an Australian Royal Commission (RC) is as a version of a U.S. government investigation. A key difference, however, is that a U.S. government investigation is typically into private companies, whereas a Royal Commission is typically an investigation into a government body after a major tragedy, initiated by the Head of State. An RC is an ad-hoc, formal, public inquiry into a defined issue with considerable discovery powers. These powers can be greater than those of a judge, but are restricted to the scope and terms of reference of the Commission. RCs are called to look into matters of great importance and usually have very large budgets. The RC is charged with researching the issue, consulting experts both within and outside of government, and developing findings to recommend changes to the law or other courses of action. RCs have immense investigatory powers, including summoning witnesses under oath, offering indemnities, seizing documents and other evidence (sometimes including material normally protected, such as classified information), holding hearings in camera if necessary and—in a few cases—compelling government officials to aid in the execution of the Commission.

These expansive powers give the RC the opportunity to employ state-of-the-art technology and to skip the slow bureaucratic decision-making processes found within the government when it comes to implementing technological change. For this reason, eDiscovery will initially continue to grow in the government sector at a more rapid pace than in the private sector in the Asia-Pacific region. This is because litigation is less prevalent in the Asia-Pacific, and because the RC is a unique investigatory vehicle with the most far-reaching authority for discovering information. Moreover, the timeframes for RCs are tight and their scopes are broad, making them hair-on-fire situations that move quickly.

While the APAC information management environment does not have the exact same drivers the U.S. market does, it definitely has the same archiving, eDiscovery and technology needs for different reasons. Another key point is that the APAC archiving and eDiscovery market will likely be driven by the government as records, search and production requirements are the main compliance needs in Australia and NZ. APAC organisations would be well served by beginning to modularly implement key elements of an information governance plan, as globalization is driving us all to a more common and automated approach to data management.