Archive for the ‘pre-processing’ Category

Losing Weight, Developing an Information Governance Plan, and Other New Year’s Resolutions

Tuesday, January 17th, 2012

It’s already a few weeks into the new year and it’s easy to spot the big lines at the gym, folks working on fad diets and many swearing off any number of vices.  Sadly perhaps, most popular resolutions don’t even really change year after year.  In the corporate world, though, it’s not good enough to simply recycle resolutions every year since there’s a lot more at stake, often with employee’s bonuses and jobs hanging in the balance.

It’s not too late to make information governance part of the corporate 2012 resolution list.  The reason is pretty simple – most companies need to get out of the reactive firefighting of eDiscovery given the risks of sloppy work, inadvertent productions and looming sanctions.  Yet, so many are caught up in the fog of eDiscovery war that they’ve failed to see the nexus between the upstream, proactive good data management hygiene and the downstream eDiscovery chaos.

In many cases the root cause is the disconnect between differing functional groups (Legal, IT, Information Security, Records Management, etc.).  This is where the emerging umbrella concept of Information Governance comes to play, serving as a way to tackle these information risks along a unified front. Gartner defines information governanceas the:

“specification of decision rights, and an accountability framework to encourage desirable behavior in the valuation, creation, storage, use, archiving and deletion of information, … [including] the processes, roles, standards, and metrics that ensure the effective and efficient use of information to enable an organization to achieve its goals.”

Perhaps more simply put, what were once a number of distinct disciplines—records management, data privacy, information security and eDiscovery—are rapidly coming together in ways that are important to those concerned with mitigating and managing information risk. This new information governance landscape is comprised of a number of formerly discrete categories:

  • Regulatory Risks – Whether an organization is in a heavily regulated vertical or not, there are a host of regulations that an organization must navigate to successfully stay in compliance.  In the United States these include a range of disparate regimes, including the Sarbanes-Oxley Act, HIPPA, the Securities and Exchange Act, the Foreign Corrupt Practices Act (FCPA) and other specialized regulations – any number of which require information to be kept in a prescribed fashion, for specified periods of time.  Failure to turn over information when requested by regulators can have dramatic financial consequences, as well as negative impacts to an organization’s reputation.
  • Discovery Risks – Under the discovery realm there are any number of potential risks as a company moves along the EDRM spectrum (i.e., Identification, Preservation, Collection, Processing, Analysis, Review and Production), but the most lethal risk is typically associated with spoliation sanctions that arise from the failure to adequately preserve electronically stored information (ESI).  There have been literally hundreds of cases where both plaintiffs and defendants have been caught in the judicial crosshairs, resulting in penalties ranging from outright case dismissal to monetary sanctions in the millions of dollars, simply for failing to preserve data properly.  It is in this discovery arena that the failure to dispose of corporate information, where possible, rears its ugly head since the eDiscovery burden is commensurate with the amount of data that needs to be preserved, processed and reviewed.  Some statistics show that it can cost as much as $5 per document just to have an attorney privilege review performed.  And, with every gigabyte containing upwards of 75,000 pages, it is easy to see massive discovery liability when an organization has terabytes and even petabytes of extraneous data lying around.
  • Privacy Risks – Even though the US has a relatively lax information privacy climate there are any number of laws that require companies to notify customers if their personally identifiable information (PII) such as credit card, social security, or credit numbers have been compromised.  For example, California’s data breach notification law (SB1386) mandates that all subject companies must provide notification if there is a security breach to the electronic database containing PII of any California resident.  It is easy to see how unmanaged PII can increase corporate risk, especially as data moves beyond US borders to the international stage where privacy regimes are much more staunch.
  • Information Security Risks Data breaches have become so commonplace that the loss/theft of intellectual property has become an issue for every company, small and large, both domestically and internationally.  The cost to businesses of unintentionally exposing corporate information climbed 7 percent last year to over $7 million per incident.  Recently senators asked the SEC to “issue guidance regarding disclosure of information security risk, including material network breaches” since “securities law obligates the disclosure of any material network breach, including breaches involving sensitive corporate information that could be used by an adversary to gain competitive advantage in the marketplace, affect corporate earnings, and potentially reduce market share.”  The senators cited a 2009 survey that concluded that 38% of Fortune 500 companies made a “significant oversight” by not mentioning data security exposures in their public filings.

Information governance as an umbrella concept helps organizations to create better alignment between functional groups as they attempt to solve these complex and interrelated data risk challenges.  This coordination is even more critical given the way that corporate data is proliferating and migrating beyond the firewall.  With even more data located in the cloud and on mobile devices a key mandate is managing data in all types of form factors. A great first step is to determine ownership of a consolidated information governance approach where the owner can:

  • Get C-Level buy-in
  • Have the organizational savvy to obtain budget
  • Be able to define “reasonable” information governance efforts, which requires both legal and IT input
  • Have strong leadership and consensus building skills, because all stakeholders need to be on the same page
  • Understand the nuances of their business, since an overly rigid process will cause employees to work around the policies and procedures

Next, tap into and then leverage IT or information security budgets for archiving, compliance and storage.  In most progressive organizations there are likely ongoing projects that can be successfully massaged into a larger information governance play.  A great place to focus on initially is information archiving, since this one of the simplest steps an organization can take to improve their information governance hygiene.  With an archive organizations can systematically index, classify and retain information and thus establish a proactive approach to data management.  It’s this ability to apply retention and (most importantly) expiration policies that allows organizations to start reducing the upstream data deluge that will inevitably impact downstream eDiscovery processes.

Once an archive is in place, the next logical step is to couple a scalable, reactive eDiscovery process with the upstream data sources, which will axiomatically include email, but increasingly should encompass cloud content, social media, unstructured data, etc.  It is important to make sure  that a given  archive has been tested to ensure compatibility with the chosen eDiscovery application to guarantee that it can collect content at scale in the same manner used to collect from other data sources.  Overlaying both of these foundational pieces should be the ability to place content on legal hold, whether that content exists in the archive or not.

As we enter 2012, there is no doubt that information governance should be an element in building an enterprise’s information architecture.  And, different from fleeting weight loss resolutions, savvy organizations should vow to get ahead of the burgeoning categories of information risk by fully embracing their commitment to integrated information governance.  And yet, this resolution doesn’t need to encompass every possible element of information governance.  Instead, it’s best to put foundational pieces into place and then build the rest of the infrastructure in methodical and modular fashion.

Information Governance Gets Presidential Attention: Banking Bailout Cost $4.76 Trillion, Technology Revamp Approaches $240 Billion

Tuesday, January 10th, 2012

On November 28, 2011, The White House issued a Presidential Memorandum that outlines what is expected of the 480 federal agencies of the government’s three branches in the next 240 days.  Up until now, Washington, D.C. has been the Wild West with regard to information governance as each agency has often unilaterally adopted its own arbitrary policies and systems.  Moreover, some agencies have recently purchased differing technologies.  Unfortunately,  with the President’s ultimate goal of uniformity, this centralization will be difficult to accomplish with a range of disparate technological approaches.

Particular pain points for the government traditionally include retention, search, collection, review and production of vast amounts of data and records.  Specifically, these pain points include examples of: FOIA requests gone awry, the issuance of legal holds across different agencies leading to spoliation, and the ever present problem of decentralization.

Why is the government different?

Old Practices. First, in some instances the government is technologically behind (its corporate counterparts) and is failing to meet the judiciary’s expectation that organizations effectively store, manage and discover their information.  This failing is self-evident via  the directive coming from the President mandating that these agencies start to get a plan to attack this problem.  Though different than other corporate entities, the government is nevertheless held to the same standards of eDiscovery under the Federal Rules of Civil Procedure (FRCP).  In practice, the government has been given more leniency until recently, and while equal expectations have not always been the case, the gap between the private and public sectors in no longer possible to ignore.

FOIA.  The government’s arduous obligation to produce information under the Freedom of Information Act (FOIA) has no corresponding analog for private organizations, who are responding to more traditional civil discovery requests.  Because the government is so large with many disparate IT systems, it is cumbersome to work efficiently through the information governance process across agencies and many times still difficult inside one individual agency with multiple divisions.  Executing this production process is even more difficult if not impossible to do manually without properly deployed technology.  Additionally, many of the investigatory agencies that issue requests to the private sector need more efficient ways to manage and review data they are requesting.  To compound problems, within the US government there are two opposing interests are at play; both screaming for a resolution, and that solution needs to be centralized.  On the one hand, the government needs to retain more than a corporation may need to in order to satisfy a FOIA request.

Titan Pulled at Both Ends. On the other hand, without classification of the records that are to be kept, technology to organize this vast amount of data and some amount of expiry, every agency will essentially become their own massive repository.  The “retain everything mentality” coupled with the inefficient search and retrieval of data and records is where they stand today.  Corporations are experiencing this on a smaller scale today and many are collectively further along than the government in this process, without the FOIA complications.

What are agencies doing to address these mandates?

In their plans, agencies must describe how they will improve or maintain their records management programs, particularly with regard to email, social media and other electronic communications.  They must also move away from such a paper-centric existence.  eDiscovery consultants and software companies are helping agencies through this process, essentially writing their plans to match the President’s directive.  The cloud conversation has been revisited, and agencies also have to explain how they will use cloud-based services and storage solutions, as well as identify gaps in existing laws or regulations that presently prevent improved management.  Small innovations are taking place.  In fact, just recently the DOJ added a new search feature on their website to make it easier for the public to find documents that have been posted by agencies on their websites.

The Office of Management and Budget (OMB), National Archives and Records Administration (NARA), and Justice Department will use those reports to come up with a government-wide records management framework that is more efficient, maintains accountability by documenting agency actions and promotes “appropriate” public access to records.  Hopefully, the framework they come up with will be centralized and workable on a realistic timeframe with resources sufficiently allocated to the initiative.

How much will this cost?

The President’s mandate is a great initiative and very necessary, but one cannot help but think about the costs in terms of money, time and resources when considering these crucial changes.  The most recent version of a financial services and general government appropriations bill in the Senate extends $378.8 million to NARA for this initiative.  President Obama appointed Steven VanRoekel as the United States CIO in August 2011 to succeed Vivek Kundra.  After VanRoekel’s speech at the Churchill Club in October of 2011, an audience member asked him what the most surprising aspect of his new job was.  VanRoekel said that it was managing the huge and sometimes unwieldy resources of his $80 billion budget.  It is going to take even more than this to do the job right, however.

Using conservative estimates, assume for an agency to implement archiving and eDiscovery capabilities as an initial investment would be $100 million.  That approximates $480 billion for all 480 agencies.  Assume a uniform information governance platform gets adopted by all agencies at a 50% discount due to the large contracts and also factoring in smaller sums for agencies with lesser needs.  The total now comes to $240 billion.  For context, that figure is 5% of what was spent by Federal Government ($4.76 trillion) on the biggest bailout in history in 2008. That leaves a need for $160 billion more to get the job done. VanRoekel also commented at the same meeting that he wants to break down massive multi-year information technology projects into smaller, more modular projects in the hopes of saving the government from getting mired in multi-million dollar failures.   His solution to this, he says, is modular and incremental deployment.

While Rome was not built in a day, this initiative is long overdue, yet feasible, as technology exists to address these challenges rather quickly.  After these 240 days are complete and a plan is drawn the real question is, how are we going to pay now for technology the government needed yesterday?  In a perfect world, the government would select a platform for archiving and eDiscovery, break the project into incremental milestones and roll out a uniform combination of solutions that are best of breed in their expertise.

Lessons Learned for 2012: Spotlighting the Top eDiscovery Cases from 2011

Tuesday, January 3rd, 2012

The New Year has now dawned and with it, the certainty that 2012 will bring new developments to the world of eDiscovery.  Last month, we spotlighted some eDiscovery trends for 2012 that we feel certain will occur in the near term.  To understand how these trends will play out, it is instructive to review some of the top eDiscovery cases from 2011.  These decisions provide a roadmap of best practices that the courts promulgated last year.  They also spotlight the expectations that courts will likely have for organizations in 2012 and beyond.

Issuing a Timely and Comprehensive Litigation Hold

Case: E.I. du Pont de Nemours v. Kolon Industries (E.D. Va. July 21, 2011)

Summary: The court issued a stiff rebuke against defendant Kolon Industries for failing to issue a timely and proper litigation hold.  That rebuke came in the form of an instruction to the jury that Kolon executives and employees destroyed key evidence after the company’s preservation duty was triggered.  The jury responded by returning a stunning $919 million verdict for DuPont.

The spoliation at issue occurred when several Kolon executives and employees deleted thousands emails and other records relevant to DuPont’s trade secret claims.  The court laid the blame for this destruction on the company’s attorneys and executives, reasoning they could have prevented the spoliation through an effective litigation hold process.  At issue were three hold notices circulated to the key players and data sources.  The notices were all deficient in some manner.  They were either too limited in their distribution, ineffective since they were prepared in English for Korean-speaking employees, or too late to prevent or otherwise ameliorate the spoliation.

The Lessons for 2012: The DuPont case underscores the importance of issuing a timely and comprehensive litigation hold notice.  As DuPont teaches, organizations should identify what key players and data sources may have relevant information.  A comprehensive notice should then be prepared to communicate the precise hold instructions in an intelligible fashion.  Finally, the hold should be circulated immediately to prevent data loss.

Organizations should also consider deploying the latest technologies to help effectuate this process.  This includes an eDiscovery platform that enables automated legal hold acknowledgements.  Such technology will allow custodians to be promptly and properly apprised of litigation and thereby retain information that might otherwise have been discarded.

Another Must-Read Case: Haraburda v. Arcelor Mittal U.S.A., Inc. (D. Ind. June 28, 2011)

Suspending Document Retention Policies

Case: Viramontes v. U.S. Bancorp (N.D. Ill. Jan. 27, 2011)

Summary: The defendant bank defeated a sanctions motion because it modified aspects of its email retention policy once it was aware litigation was reasonably foreseeable.  The bank implemented a retention policy that kept emails for 90 days, after which the emails were overwritten and destroyed.  The bank also promulgated a course of action whereby the retention policy would be promptly suspended on the occurrence of litigation or other triggering event.  This way, the bank could establish the reasonableness of its policy in litigation.  Because the bank followed that procedure in good faith, it was protected from court sanctions under the Federal Rules of Civil Procedure 37(e) “safe harbor.”

The Lesson for 2012: As Viramontes shows, an organization can be prepared for eDiscovery disputes by timely suspending aspects of its document retention policies.  By modifying retention policies when so required, an organization can develop a defensible retention procedure and be protected from court sanctions under Rule 37(e).

Coupling those procedures with archiving software will only enhance an organization’s eDiscovery preparations.  Effective archiving software will have a litigation hold mechanism, which enables an organization to suspend automated retention rules.  This will better ensure that data subject to a preservation duty is actually retained.

Another Must-Read Case: Micron Technology, Inc. v. Rambus Inc., 645 F.3d 1311 (Fed. Cir. 2011)

Managing the Document Collection Process

Case: Northington v. H & M International (N.D.Ill. Jan. 12, 2011)

Summary: The court issued an adverse inference jury instruction against a company that destroyed relevant emails and other data.  The spoliation occurred in large part because legal and IT were not involved in the collection process.  For example, counsel was not actively engaged in the critical steps of preservation, identification or collection of electronically stored information (ESI).  Nor was IT brought into the picture until 15 months after the preservation duty was triggered. By that time, rank and file employees – some of whom were accused by the plaintiff of harassment – stepped into this vacuum and conducted the collection process without meaningful oversight.  Predictably, key documents were never found and the court had little choice but to promise to inform the jury that the company destroyed evidence.

The Lesson for 2012: An organization does not have to suffer the same fate as the company in the Northington case.  It can take charge of its data during litigation through cooperative governance between legal and IT.  After issuing a timely and effective litigation hold, legal should typically involve IT in the collection process.  Legal should rely on IT to help identify all data sources – servers, systems and custodians – that likely contain relevant information.  IT will also be instrumental in preserving and collecting that data for subsequent review and analysis by legal.  By working together in a top-down fashion, organizations can better ensure that their eDiscovery process is defensible and not fatally flawed.

Another Must-Read Case: Green v. Blitz U.S.A., Inc. (E.D. Tex. Mar. 1, 2011)

Using Proportionality to Dictate the Scope of Permissible Discovery

Case: DCG Systems v. Checkpoint Technologies (N.D. Ca. Nov. 2, 2011)

The court adopted the new Model Order on E-Discovery in Patent Cases recently promulgated by the U.S. Court of Appeals for the Federal Circuit.  The model order incorporates principles of proportionality to reduce the production of email in patent litigation.  In adopting the order, the court explained that email productions should be scaled back since email is infrequently introduced as evidence at trial.  As a result, email production requests will be restricted to five search terms and may only span a defined set of five custodians.  Furthermore, email discovery in DCG Systems will wait until after the parties complete discovery on the “core documentation” concerning the patent, the accused product and prior art.

The Lesson for 2012: Courts seem to be slowly moving toward a system that incorporates proportionality as the touchstone for eDiscovery.  This is occurring beyond the field of patent litigation, as evidenced by other recent cases.  Even the State of Utah has gotten in on the act, revising its version of Rule 26 to require that all discovery meet the standards of proportionality.  While there are undoubtedly deviations from this trend (e.g., Pippins v. KPMG (S.D.N.Y. Oct. 7, 2011)), the clear lesson is that discovery should comply with the cost cutting mandate of Federal Rule 1.

Another Must-Read Case: Omni Laboratories Inc. v. Eden Energy Ltd [2011] EWHC 2169 (TCC) (29 July 2011)

Leveraging eDiscovery Technologies for Search and Review

Case: Oracle America v. Google (N.D. Ca. Oct. 20, 2011)

The court ordered Google to produce an email that it previously withheld on attorney client privilege grounds.  While the email’s focus on business negotiations vitiated Google’s claim of privilege, that claim was also undermined by Google’s production of eight earlier drafts of the email.  The drafts were produced because they did not contain addressees or the heading “attorney client privilege,” which the sender later inserted into the final email draft.  Because those details were absent from the earlier drafts, Google’s “electronic scanning mechanisms did not catch those drafts before production.”

The Lesson for 2012: Organizations need to leverage next generation, robust technology to support the document production process in discovery.  Tools such as email analytical software, which can isolate drafts and offer to remove them from production, are needed to address complex production issues.  Other technological capabilities, such as Near Duplicate Identification, can also help identify draft materials and marry them up with finals that have been marked as privileged.  Last but not least, technology assisted review has the potential of enabling one lawyer to efficiently complete the work that previously took thousands of hours.  Finding the budget and doing the research to obtain the right tools for the enterprise should be a priority for organizations in 2012.

Another Must-Read Case: J-M Manufacturing v. McDermott, Will & Emery (CA Super. Jun. 2, 2011)

Conclusion

There were any number of other significant cases from 2011 that could have made this list.  We invite you to share your favorites in the comments section or contact us directly with your feedback.

For more on the cases discussed above, watch this video:

Top Ten eDiscovery Predictions for 2012

Thursday, December 8th, 2011

As 2011 comes quickly to a close we’ve attempted, as in years past, to do our best Carnac impersonation and divine the future of eDiscovery.  Some of these predictions may happen more quickly than others, but it’s our sense that all will come to pass in the near future – it’s just a matter of timing.

  1. Technology Assisted Review (TAR) Gains Speed.  The area of Technology Assisted Review is very exciting since there are a host of emerging technologies that can help make the review process more efficient, ranging from email threading, concept search, clustering, predictive coding and the like.  There are two fundamental challenges however.  First, the technology doesn’t work in a vacuum, meaning that the workflows need to be properly designed and the users need to make accurate decisions because those judgment calls often are then magnified by the application.  Next, the defensibility of the given approach needs to be well vetted.  While it’s likely not necessary (or practical) to expect a judge to mandate the use of a specific technological approach, it is important for the applied technologies to be reasonable, transparent and auditable since the worst possible outcome would be to have a technology challenged and then find the producing party unable to adequately explain their methodology.
  2. The Custodian-Based Collection Model Comes Under Stress. Ever since the days of Zubulake, litigants have focused on “key players” as a proxy for finding relevant information during the eDiscovery process.  Early on, this model worked particularly well in an email-centric environment.  But, as discovery from cloud sources, collaborative worksites (like SharePoint) and other unstructured data repositories continues to become increasingly mainstream, the custodian-oriented collection model will become rapidly outmoded because it will fail to take into account topically-oriented searches.  This trend will be further amplified by the bench’s increasing distrust of manual, custodian-based data collection practices and the presence of better automated search methods, which are particularly valuable for certain types of litigation (e.g., patent disputes, product liability cases).
  3. The FRCP Amendment Debate Will Rage On – Unfortunately Without Much Near Term Progress. While it is clear that the eDiscovery preservation duty has become a more complex and risk laden process, it’s not clear that this “pain” is causally related to the FRCP.  In the notes from the Dallas mini-conference, a pending Sedona survey was quoted referencing the fact that preservation challenges were increasing dramatically.  Yet, there isn’t a consensus viewpoint regarding which changes, if any, would help improve the murky problem.  In the near term this means that organizations with significant preservation pains will need to better utilize the rules that are on the books and deploy enabling technologies where possible.
  4. Data Hoarding Increasingly Goes Out of Fashion. The war cry of many IT professionals that “storage is cheap” is starting to fall on deaf ears.  Organizations are realizing that the cost of storing information is just the tip of the iceberg when it comes to the litigation risk of having terabytes (and conceivably petabytes) of unstructured, uncategorized and unmanaged electronically stored information (ESI).  This tsunami of information will increasingly become an information liability for organizations that have never deleted a byte of information.  In 2012, more corporations will see the need to clean out their digital houses and will realize that such cleansing (where permitted) is a best practice moving forward.  This applies with equal force to the US government, which has recently mandated such an effort at President Obama’s behest.
  5. Information Governance Becomes a Viable Reality.  For several years there’s been an effort to combine the reactive (far right) side of the EDRM with the logically connected proactive (far left) side of the EDRM.  But now, a number of surveys have linked good information governance hygiene with better response times to eDiscovery requests and governmental inquires, as well as a corresponding lower chance of being sanctioned and the ability to turn over less responsive information.  In 2012, enterprises will realize that the litigation use case is just one way to leverage archival and eDiscovery tools, further accelerating adoption.
  6. Backup Tapes Will Be Increasingly Seen as a Liability.  Using backup tapes for disaster recovery/business continuity purposes remains a viable business strategy, although backing up to tape will become less prevalent as cloud backup increases.  However, if tapes are kept around longer than necessary (days versus months) then they become a ticking time bomb when a litigation or inquiry event crops up.
  7. International eDiscovery/eDisclosure Processes Will Continue to Mature. It’s easy to think of the US as dominating the eDiscovery landscape. While this is gospel for us here in the States, international markets are developing quickly and in many ways are ahead of the US, particularly with regulatory compliance-driven use cases, like the UK Bribery Act 2010.  This fact, coupled with the menagerie of international privacy laws, means we’ll be less Balkanized in our eDiscovery efforts moving forward since we do really need to be thinking and practicing globally.
  8. Email Becomes “So 2009” As Social Media Gains Traction. While email has been the eDiscovery darling for the past decade, it’s getting a little long in the tooth.  In the next year, new types of ESI (social media, structured data, loose files, cloud context, mobile device messages, etc.) will cause headaches for a number of enterprises that have been overly email-centric.  Already in 2011, organizations are finding that other sources of ESI like documents/files and structured data are rivaling email in importance for eDiscovery requests, and this trend shows no signs of abating, particularly for regulated industries. This heterogeneous mix of ESI will certainly result in challenges for many companies, with some unlucky ones getting sanctioned because they ignored these emerging data types.
  9. Cost Shifting Will Become More Prevalent – Impacting the “American Rule.” For ages, the American Rule held that producing parties had to pay for their production costs, with a few narrow exceptions.  Next year we’ll see even more courts award winning parties their eDiscovery costs under 28 U.S.C. §1920(4) and Rule 54(d)(1) FRCP. Courts are now beginning to consider the services of an eDiscovery vendor as “the 21st Century equivalent of making copies.”
  10. Risk Assessment Becomes a Critical Component of eDiscovery. Managing risk is a foundational underpinning for litigators generally, but its role in eDiscovery has been a bit obscure.  Now, with the tremendous statistical insights that are made possible by enabling software technologies, it will become increasingly important for counsel to manage risk by deciding what types of error/precision rates are possible.  This risk analysis is particularly critical for conducting any variety of technology assisted review process since precision, recall and f-measure statistics all require a delicate balance of risk and reward.

Accurately divining the future is difficult (some might say impossible), but in the electronic discovery arena many of these predictions can happen if enough practitioners decide they want them to happen.  So, the future is fortunately within reach.

Clearwell Doubles Down on Review

Monday, August 22nd, 2011


(Editor’s note: This special guest post was written by Chitran
g Shah, Clearwell Principal Product Manager. He is an RIT alum and avid hiker who works with our engineering team and lead customers to optimize the product for large-scale review. – Kurt)

As we’ve previously shared, our product strategy throughout 2009 and 2010 was to expand the product footprint across the EDRM as customers were demanding a single, end-to-end eDiscovery product. During this period we successfully expanded from our roots in processing, search and analysis to review and production (August 2009), identification and collection (September 2010) and legal hold workflow (March 2011). Over the last several months, our focus has been to go deep in each of these modules and provide features that deliver even greater return on investment to our customers.

Today, I am excited to announce significant new features and feature enhancements to the Clearwell Review and Production Module and say a few words about what motivated us to build these features and how they enable our customers to further streamline their legal review workflow.

There are several exciting features in this release, but I would to like to highlight three in particular:

1. Ability to seamlessly import production load files

Most matters require reviewing relevant documents alongside the documents received from third parties, opposing parties, and even previous litigations. With the new load file import feature, users can now streamline the process of importing load files with three simple steps.

In Step 1, a step-by-step wizard-like interface guides users though the selection of formatting information such as field delimiters and nested value delimiters, metadata information such as bates numbers, family relationships, tags, folders and any number of custom attributes, and content information such as images, extracted text and native files. When the load file has both extracted texts and native files, the wizard gives users an option to specify which content should be used for searching.

In Step 2, the system performs a deep validation of the load file and generates a report documenting any inconsistencies such as missing bates numbers or missing values for required fields found in the load file. As a result, customers have the ability to quickly find and fix any issues with the load file before the import begins.

In Step 3, the system imports the documents and builds analytics. Once this step completes, the imported documents, including all metadata and content, are available for viewing and searching.

All the analytics capabilities customers are familiar with, such as discussion threads and concept search, are also available for documents imported from load files. This allows users to quickly discover documents in the load file that are conceptually similar to natively processed documents, for example.

2. Support for large scale reviews and productions

As the volume of electronically stored information (ESI) continues to grow, our customers find themselves reviewing and exporting more and more documents, and they need a solution that can cope with the massive growth in data. At the same time, they don’t want to spend large sums of money building a server farm in anticipation of the growth. They want the flexibility to add capacity when needed and remove it when not needed.

Clearwell’s scale-out architecture enables administrators to easily add appliances and allocate them to a particular matter and to a specific task using a point-and-click interface.

For example, if an administrator needs to increase the number of reviewers from 200 to 400 in order to meet a tight deadline, he or she can easily add 2 appliances to the cluster and assign them for review. Once the review completes, the administrator can now easily re-assign these appliances for production, allowing users to easily meet deadlines while reducing their overall hardware costs.

This flexibility allows our customers to maximize the use of their hardware resources while providing infinite review, export and production scalability.

3. Streamlined management of exports and productions

Clearwell provides powerful export options, and while our customers use them extensively for creating a variety of different production formats, they typically standardize on a few. Clearwell’s new case export and production templates provide a quick and easy way for case administrators to define the export format once and use it across multiple cases. When exporting documents, users can simply select a template from the list of visible templates in that case. This capability significantly reduces the overhead associated with managing export formats and allows our customers to produce documents in a consistent format across multiple matters.

Additionally, new production pre-mediation reports automatically identify problem documents and group them by issue type for quick resolution. This enables users to preemptively identify and resolve document production issues without delaying entire productions.

Says Wendy Butler Curtis, chair of Orrick, Herrington & Sutcliffe’s eDiscovery Working Group, “Legal review is one of the most challenging phases of the eDiscovery process. As electronic data volumes continue to grow, it is increasingly important to leverage technologies that can streamline and improve legal review, ensure defensibility and reduce costs. Solutions like the Clearwell eDiscovery Platform enable legal teams to create an iterative eDiscovery workflow that allows for more efficient and effective large-scale review.”

We will be showcasing the new features at ILTA (Booth 816) this week in Nashville, so come see us and let us know what you think.

(Chitrang Shah is a Principal Product Manager at Clearwell Systems, now a part of Symantec, and the lead Product Manager for Clearwell’s Processing & Analysis and Review & Production Modules)

Clearwell Expands Its E-Discovery Platform with New Modules for Pre-Processing, Review, and Production

Monday, August 17th, 2009

Earlier today, Clearwell announced Version 5.0 of its e-discovery platform. Unlike prior versions which focused on processing, early case analysis, and first-pass review, this release extends Clearwell’s capabilities in two directions: upstream, by adding pre-processing; and downstream, by adding document-by-document review and production. I wanted to say a few words about what motivated these changes, and why the new release greatly increases Clearwell’s value to enterprises, government agencies, law firms, and litigation support service providers.

Over the past year, the benefits of early case analysis and first pass review have driven hundreds of companies to adopt Clearwell. They have saved huge amounts of money and time, and often become evangelists for the product. But despite that, we continually hear that the overall e-discovery process remains expensive, unpredictable, and risky. When we investigated why, we found the problem lies less in the features of the products being used than in the number of products used.

Once data is collected, a typical e-discovery process today may involve as many 4 different tools: one for filtering by custodians or date range, another for de-duplication and keyword search, another for load file creation, and yet another for review and production. Each time data moves between these tools, and there’s a handoff from one to another, there’s the risk that document counts do not tie out, data does not convert correctly, or any of a hundred other things go wrong. This risk is magnified by the fact that e-discovery is highly iterative: custodians are often added or keywords changed as new information comes to light, forcing people to redo many steps of the process. As a result, timelines are unpredictable and it’s hard to stick to a budget, even with extensive project management which itself is not cheap.

Since the problem lies in the handoffs between different products, it’s impossible to solve this problem by making any one part of the process better. The only solution is to have a single product that can manage collected data from soup (filtering / pre-processing) to nuts (production). Prior to today’s announcement, that product did not exist: there was no single, integrated product that could do everything from process data to review and produce it. And that, in summary, is why Clearwell is releasing Version 5.0.

With Clearwell’s new product, there are no handoffs, no uncertainty about how long it will take to export out of one tool and into another. There’s no need to cobble together a string of different products or train lawyers on multiple different interfaces and workflows. As a result, the risks of cost overruns or missed deadlines are greatly reduced.

To our mind, this is just part of a natural evolutionary process that affects many markets, not just e-discovery. Who wants to carry a Palm Pilot, iPod, and a mobile phone when you can carry a single device like the iPhone? Who wants a cable receiver and a TiVo when you can get both in a single set-top box?  As markets mature, there develops a logical package of functionality that customers prefer to buy from a single, integrated provider.

You can sign up for a product demonstration at our website, or come see the product at ILTA next week (Booth 606). Take a look – and let us know what you think.