Posts Tagged ‘ediscovery software’

Enterprise Strategy Group (ESG)’s Legal Trends Survey Reveals Alarming Inattention to eDiscovery Spending

Monday, December 5th, 2011

In their latest survey, entitled “E-Discovery Market Trends: A View from the Legal Department,” Enterprise Strategy Group (ESG) analysts Brian Babineau and Katey Wood analyze a number of interesting statistics and provide a range of insightful conclusions.  By surveying general counsel from large, mid-market (500-999 employees) and enterprise-class organizations in North America they were able to dive into a range of eDiscovery topics, including pain points, operational expenses and prioritizations on a go-forward basis.  Some are more intuitive than others, but in either case the results serve as good calibration metrics for those who endeavor to understand the corporate eDiscovery state of the nation.

“Most corporations are not tracking e-discovery spending…” In what may be the most notable finding of this ESG report, 60% of survey respondents claim that they did not track annual eDiscovery spending in 2010.  The authors correctly note that the eDiscovery process, “which can be highly unpredictable due to its project-by-project nature to begin with, has historically been outsourced to service providers charging at variable rates and often billed back to companies via their law firms.”  Despite the significant challenges of tracking eDiscovery spending, it’s nevertheless irresponsible for organizations to keep their heads in the sand regarding such a significant operational expense.

As the old saw goes, “you can’t manage what you can’t measure,” so it’s almost inconceivable that so many organizations aren’t tracking such a significant expense category.  For organizations that want to create a repeatable business process, as opposed to the fire-drill chaos that is typically associated with eDiscovery, it’s vitally important to accurately capture core eDiscovery metrics.  For starters, it’s useful to understand basic collection parameters, such as the typical number of key custodians, average data volumes per custodian, data expansion rates, de-duplication statistics, etc.  Once these metrics are in place, it then becomes possible to manage the process and reduce costs.
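To make the collection parameters above concrete, here is a minimal sketch of how they might be computed per matter.  The record layout, the field names, and the sample figures are all hypothetical and are not drawn from the ESG survey; a real program would pull these numbers from whatever collection and processing tools are in use.

```python
from dataclasses import dataclass


@dataclass
class CustodianCollection:
    """One custodian's collection on a matter (hypothetical record layout)."""
    custodian: str
    collected_gb: float   # volume as collected
    processed_gb: float   # volume after archives and attachments are expanded
    total_files: int      # files before de-duplication
    unique_files: int     # files remaining after de-duplication


def summarize(collections):
    """Compute the basic collection metrics for one matter."""
    if not collections:
        raise ValueError("at least one custodian collection is required")
    collected = sum(c.collected_gb for c in collections)
    processed = sum(c.processed_gb for c in collections)
    files = sum(c.total_files for c in collections)
    unique = sum(c.unique_files for c in collections)
    return {
        "custodians": len(collections),
        "avg_gb_per_custodian": collected / len(collections),
        "expansion_rate": processed / collected,  # e.g. 1.5 means 50% growth
        "dedup_rate": 1 - unique / files,         # share of files removed
    }


# Made-up sample data for illustration only.
matter = [
    CustodianCollection("custodian_a", 4.0, 6.0, 40_000, 30_000),
    CustodianCollection("custodian_b", 6.0, 9.0, 60_000, 40_000),
]
metrics = summarize(matter)
print(metrics)
```

Tracked across several matters, even a simple summary like this gives a baseline for estimating the volume, and therefore the likely cost, of the next one.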

Katey went on to expound in an exclusive quote for EDD 2.0:

“E-discovery can be managed as a strategic business process with an understanding of costs, performance and outcomes. When there’s no basis for reporting or comparison, it’s pin the tail on the donkey.  Corporate litigants won’t ever know they’re getting their money’s worth if they don’t even know what they’re spending.”

“E-Discovery accuracy/efficiency isn’t being measured, in large part.” Similar to the failure to measure eDiscovery costs, a full two thirds of GCs (67%) aren’t tracking the “efficiency and/or accuracy of e-discovery document review.” Until corporate counsel can link expectations of competency/efficiency with oversight and performance metrics, outside law firms will likely avoid having their feet held to the fire.  This passive stance makes transparency and process improvement difficult at best.  Additionally, this model of having expectations for efficiency, with low or no accountability, doesn’t bode well for the quick adoption of enabling technologies like predictive coding, since the driver has to inherently be the need/desire for increased efficiency (which axiomatically equals lower law firm review bills).

“Corporate information governance and litigation readiness (especially defensible deletion) are a priority, but not yet a reality.” From an internal prioritization perspective, more than two thirds (69%) of respondents identified their desire to expire/delete data more consistently, “thereby limiting unnecessary data retention for future litigation requests.”  Savvy enterprises correctly recognized the “multi-prong threat of unregulated data retention: the large amounts of irrelevant data ultimately produced for legal review, the greater difficulty of hanging onto potentially litigious documents past their required retention periods.”

This finding is very encouraging, and it ties into the upward momentum the industry is seeing regarding information governance generally – particularly linking the reactive (right) side of the EDRM with the logically connected and proactive (left) side of the EDRM.  As a good first step it’s critical to see organizations now associating good information governance hygiene with lower costs and better eDiscovery response times.  The ESG finding also triangulates with results from the recent Information Retention and eDiscovery Survey, which found that companies having good information governance hygiene were often able to respond much faster and more successfully to eDiscovery/investigation requests, often suffering fewer negative consequences.

The only downside to the positive information governance trend, as reported by the survey, was that,

“while there are great benefits to defensible deletion, internal initiatives for implementing it too often are stymied by difficulty in obtaining cross functional consensus and authorization, particularly as it touches so many other critical processes like regulatory compliance and legal hold.”

“Legal hold processes are still very manual.” Another similar question revealed that many companies are attempting to get their information governance house in order, but are still in the very early stages.  When asked about their  current legal hold notification and tracking process, a whopping 69% of organizations said that they are using a “manual process performed by internal staff using e-mail and spreadsheets, etc.”  And, another 6% said they either had no formal process or tracking mechanism.

Given the risks attendant to flaws in the preservation process this area is ripe for improvement.  The good news is that 54% of survey respondents are intending to improve their legal hold process, with 25% planning improvement within the next 12 months.  This is a healthy acknowledgement that there is risk, and with a modicum of investment (time, personnel, procedures, and technology) the legal hold area can be brought up to current best practices.

The ESG survey is a welcome temperature gauge into the state of corporate legal departments.  It notes, in conclusion, “with the staggering growth, diversity and dispersion of data, the pain e-discovery is currently causing large and serial litigants are only a symptom of the larger problem of unwieldy and under-developed information management affecting all businesses.”  With data insights from the ESG survey, it’s becoming clear that foundational information governance elements (like deploying auditable legal hold procedures, tracking eDiscovery spending, updating data maps, etc.) are desperately needed by the many organizations that want to turn eDiscovery into a repeatable business process.  The good news is that many of these organizations have improvements in mind for the next 12 months, and the challenge will be to make sure these proactive projects maintain the same level of organizational urgency that is often present for more reactive tasks.

Gibson Dunn’s Mid-Year eDiscovery Report Highlights Changes in Sanctions Landscape

Monday, August 15th, 2011

In past years we’ve covered Gibson Dunn’s Mid-Year E-Discovery Report, which is always a good read, chock full of take-aways about the eDiscovery market.  In my mind, they do an excellent job of synthesizing the ever-expanding volume of case law and comparing those trends with historical averages.  This year’s report is no exception, and for those who don’t get to read all the cases, this is a stellar way to keep up on eDiscovery trends.  Without trying to summarize the entire 23-page document, there were a number of findings that stood out and should be perused by anyone with even a passing interest in the space.

Legal Holds/Preservation. As we all know, eDiscovery sanctions (at least here in the US) are critical business/legal drivers, particularly with regard to the legal hold area (which is the riskiest part of the EDRM).  As the Gibson report points out, the actual award of sanctions has remained relatively flat (56% in the first half of 2011 versus 55% for the full year in 2010) –  but, more important than this relatively stable metric, it’s very clear that the plaintiff’s bar has caught on to the ability to win cases by revealing shoddy (or just undocumented) legal hold procedures, even in some instances where data isn’t lost.  This is why the report notes a dramatic increase in the seeking of eDiscovery sanctions – 68 at mid-year 2011 versus 31 at mid-year 2010.  This doubling of attempts to pierce an entity’s legal hold regime should be a wake-up call to in-house practitioners and chief legal officers, since the attempt and success rates will likely only increase over time.

While there is still some considerable debate, at least for those following Judge Scheindlin’s Pension Committee logic, anything less than a formal, written legal hold policy is per se negligent.  Although it’s conceivable that  a reviewing court won’t use this rigorous standard, anything less formal will strike most organizations as simply too risky.  Ongoing compliance with the legal hold process is also another difficult task for many organizations, one which is considerably easier with an automated solution that is able to track acknowledgements and send reminders over time.  It’s all too easy for companies to think that once they’ve discharged their initial legal hold duty they’re in the clear – but as these obligations morph (with more custodians/data types) and elongate (from months to years) over time, keeping on top of the legal hold processes becomes that much more important.
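As a rough illustration of what tracking acknowledgements and sending reminders over time can look like, here is a minimal sketch.  The record fields, the 7-day grace period, and the 90-day refresh interval are assumptions chosen for the example; they are not taken from the Gibson Dunn report or from any particular legal hold product.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import Optional


@dataclass
class HoldNotice:
    """Legal hold notice sent to one custodian (hypothetical record layout)."""
    custodian: str
    sent: date
    acknowledged: Optional[date] = None
    last_reminder: Optional[date] = None


def needs_reminder(notice, today, ack_grace_days=7, refresh_days=90):
    """Decide whether a custodian is due for a reminder.

    Unacknowledged notices are chased after a short grace period;
    acknowledged ones are refreshed periodically while the hold stays open.
    """
    last_contact = notice.last_reminder or notice.sent
    if notice.acknowledged is None:
        return today - last_contact >= timedelta(days=ack_grace_days)
    return today - last_contact >= timedelta(days=refresh_days)


# Made-up sample data for illustration only.
today = date(2011, 8, 15)
notices = [
    HoldNotice("custodian_a", sent=date(2011, 8, 1)),
    HoldNotice("custodian_b", sent=date(2011, 8, 1), acknowledged=date(2011, 8, 2)),
    HoldNotice("custodian_c", sent=date(2011, 3, 1), acknowledged=date(2011, 3, 3)),
]
due = [n.custodian for n in notices if needs_reminder(n, today)]
print(due)
```

The point of the periodic refresh is the one made above: the duty does not end with the initial notice, so the tracking has to keep running for as long as the hold does.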

Sanctions. The Gibson report also importantly points out that there’s currently a split among jurisdictions: some courts will levy sanctions only on a showing of bad faith, while others require merely proof of negligence.  Here, the important take-away is that a defendant entity doesn’t typically get to forum shop and therefore can’t really tell which type of jurisdiction it will end up in as a litigant.  So, it needs to build its eDiscovery processes to meet the high water (i.e., most rigorous) standard.  In most cases, it’s therefore prudent to assume that merely negligent conduct can draw sanctions – a less rigorous process may turn out to be safe, but that risk calculation needs to be considered carefully.

The other perilous part of the equation is that once sanctions are deemed warranted, the court has almost unlimited discretion to levy whatever blend of sanctions it thinks is appropriate.  In Green v. Blitz, for example, the court ordered a laundry list of sanctions, some of which were pretty unfathomable:

1. Defendant had to pay plaintiff $250,000

2. Defendant had to provide a copy of the court’s order to plaintiffs “in every lawsuit proceeding against it” for the past two years

3. Defendant had to file the court’s order in every case that it is involved in for the next 5 years

The bottom line is that sanctions, despite the fear factor, can be used to drive positive proactive conduct – namely in the shape of eDiscovery best practices.

Outside Counsel Duties. Here, the Gibson report notes that outside counsel’s Zubulake duties continue to increase over time, with a number of cases continuing the trend of holding attorneys responsible for ensuring that their clients properly implement legal holds, institute sound sampling protocols and conduct sufficient quality control steps.  This line of discussion can be useful when talking to outside counsel where we’re starting to see how their increasing responsibilities can lead to malpractice exposure, as seen in the recent McDermott case.

Search/Analysis. Lately there’s been a ton of buzz about predictive coding, but (despite the hype) it still doesn’t appear ready for prime time.  The Gibson report noted that there were no reported cases that addressed the use of predictive coding or other advanced search technologies.  My sense is that without some semblance of judicial approval or strong client backing, outside counsel (who are concerned about their malpractice exposure, per above) aren’t quickly going to be the first ones into the pool.  Unless an enterprise client demands that they use this type of technology, most will wait for judicial approval and that’s probably still a way off.  While next generation search technologies are more promise than reality right now, there is still a mandate to implement a defensible search methodology.  Such a methodology is needed initially to demonstrate transparency in the eDiscovery process and then to withstand the challenges levied by counsel in the case of an inadvertent production.

In sum, the Gibson report shows the ongoing maturation of the eDiscovery space.  But, any niche market led by case law and/or attorneys deciding to adopt new technologies won’t be quick to change.  In many instances, therefore, the best practices will be decided by a combination of standards bodies and vendors who are being pushed by their more forward thinking clients to get and stay on the cutting edge.

Bit by Bit: Building a Better eDiscovery Collection Solution

Friday, July 29th, 2011

Is there a place in eDiscovery today for hard drive imaging and bit by bit copies, which collect deleted items or slack/unused hard disk space?  The answer is yes with some important limitations.  For the vast majority of matters, ESI can be collected without imaging drives or utilizing proprietary container files.  However, I occasionally still encounter folks who are victims of the dated and costly misconception that eDiscovery always requires the bit-level imaging of hard drives.

There are situations, though, where the existence of data (as opposed to its content) is central to the matter – when companies suspect employees of stealing proprietary information or when employees leave a company under suspicious circumstances.  In these and other similar situations, it may make sense to have the employee’s workstation hard drive imaged for full forensic analysis.  Even in these scenarios, I find that companies are more likely to hire an external investigator to perform this task to allay suspicions of tampering or bias, and the company generally would prefer that this investigator be the one to testify about this sensitive data acquisition.  Then, for ESI beyond the target employee’s hard drive, other collection methods may be used.  As we’re now midway through 2011 – a year in which I expect to see eDiscovery fully embraced by many corporations as a true business process – I wanted to analyze why the forensic disk image myth still exists, where it came from, and what the law really requires of an eDiscovery collections process.

Traditionally, cases that mentioned full forensic imaging of hard drives began their captions with United States v. or State v. because they were criminal matters.  In traditional civil litigation – even the behemoth eDiscovery cases that get all the bloggers blogging – forensic imaging simply is not required or needed.  In fact, in most cases, it will dramatically increase the cost associated with electronic discovery – this process adds unnecessary complexity in downstream phases of eDiscovery and leads to vast over-collection.  Why collect the Microsoft Office suite 50 times when what you are really required to preserve and collect are the files created with those programs?  When using disk imaging, program files are collected, which drives up storage costs and requires the post-collection step of deNISTing (removing system files based on the NIST list).  Why not leave those system files behind and perform a targeted collection of only user-created content?  In addition, the primary rules governing civil litigation – the Federal Rules of Civil Procedure and Federal Rules of Evidence – simply do not require exact duplication of electronic files.  I am amazed that there are so many experts who are still pushing full forensic imaging and duplication in every case.  In fact, this goes against best practices published by The Sedona Conference, EDRM, and in the E-Discovery textbook co-authored by Judge Shira A. Scheindlin.
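For readers who have not seen the deNISTing step in practice, the idea can be sketched in a few lines: hash each collected file and drop the ones whose hash appears on a list of known program and system files.  This is only an illustration under stated assumptions.  The file names are made up, the known-file set is built on the spot rather than loaded from the actual NIST list, and SHA-1 is used purely as an example; the algorithm must match whatever the reference list was built with.

```python
import hashlib
import tempfile
from pathlib import Path


def sha1_of(path):
    """Hash a file's contents in chunks so large files are not read at once."""
    digest = hashlib.sha1()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def drop_known_files(paths, known_hashes):
    """Keep only the files whose hash is NOT in the known-file set."""
    return [p for p in paths if sha1_of(p) not in known_hashes]


# Made-up files standing in for a disk-image collection.
with tempfile.TemporaryDirectory() as tmp:
    program_file = Path(tmp) / "winword.exe"
    user_file = Path(tmp) / "safety_memo.doc"
    program_file.write_bytes(b"stand-in for a vendor-shipped binary")
    user_file.write_bytes(b"stand-in for a user-created memo")

    known_hashes = {sha1_of(program_file)}  # hypothetical known-file list
    kept = drop_known_files([program_file, user_file], known_hashes)
    kept_names = [p.name for p in kept]

print(kept_names)
```

A targeted collection avoids most of this work by never picking up the program files in the first place.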

In comment 8c of the Sedona Principles, the authors call making forensic image backups of computers “the first step of an expensive, complex, and difficult process of data analysis that can divert litigation into side issues and satellite disputes involving the interpretation of potentially ambiguous forensic evidence.”  The comment goes on to say that “it should not be required unless exceptional circumstances warrant the extraordinary cost and burden.”  In a whitepaper authored for EDRM by three eDiscovery experts from KPMG, LLC, the authors discussed the high cost of forensic bit-level imaging and, instead, suggested that targeted collection of ESI would be sufficient in the vast majority of non-criminal matters.  They state, “[t]he challenge of Smart EDM [Evidence and Discovery Management] is to obtain targeted files in a forensically sound manner – chain-of-custody established, proven provenance, and metadata intact – without having to resort to drive imaging.”

In Electronic Discovery and Digital Evidence: Cases and Materials, written by Judge Shira A. Scheindlin, Daniel J. Capra, and The Sedona Conference, the authors state that,

“because imaging software is commonly available, and because the vast majority of training programs in the field of electronic discovery revolve around forensics, there is a growing tendency to want to ‘image everything.’  But unless an argument can be made that the matter at hand will benefit from a forensic collection and additional examination, there is no reason to do a forensic collection just because the technology exists to do it.”

So, with the top experts in the field saying the days of “image everything” should be over, why does it still happen?  Why are the victims of this antiquated workflow still paying the exorbitant costs of a solution that does not really meet their requirements?  Perhaps a historical perspective will be helpful in explaining.

Why Drive Imaging and Proprietary Containers?

I do not think there is any debate on the benefit of having a bit-level image of a hard drive in a criminal investigation.  However, traditionally, the investigators using these methods needed a way to get the imaged drive safely back to a lab for further analysis.  Companies or law enforcement agencies that hired third-party investigators to image drives had to transport the data, maintaining chain of custody, and preserving all contents in an un-alterable state through several phases of the investigation.  And, in criminal matters, it was especially important to maintain the integrity of the evidence when the electronic evidence was central to the government’s case.  Remember, the burden of proof in a criminal matter is “beyond a reasonable doubt” (along with a host of constitutional considerations).  Alteration of key evidence could certainly create reasonable doubt and hose the prosecution’s case (or, worse, the evidence gets tossed by the Court before the trial even begins).  The container file ensures that no matter who handles the evidence, checksums can prove that the contents were not altered since the initial imaging.

Many vendors now offer logical image containers as an alternative to doing a full bit-level image of the drive.  However, in corporate eDiscovery, this is still overkill because the tools and solutions being used downstream still have to unpack or parse these proprietary container formats for processing and analysis.  In fact, even software from the vendors who created these container formats must “crack them open” to get to the contents within.  This seems to add a layer of complexity that has not been needed since the days of the external examiner coming in with her forensic toolkit to do drive images. The format was created to solve a very specific problem, and little thought was given to the use of this format in a holistic process like what is typically seen in civil eDiscovery.   There is no longer a need for a container for portability of evidence because it is most likely going to be processed in place after collection while residing on a secure evidence store on the company’s network.  I have heard “what if our collections methods are challenged?”  And to that, I would respond that we are not in criminal court and that the requirement in civil court is reasonableness, not perfection.  Now, if an employee is suspected of wrongdoing and the potential deletion of files will dramatically alter the case, then by all means, hire a forensic investigator and follow all of the protocols established over the last several decades in computer forensic science.

Fast forward to the 21st century

Corporations are bringing eDiscovery in-house; they are building a business process around it to minimize risk and drive enormous cost savings, and in today’s world of civil litigation, there simply is not a need for these drive images or proprietary containers.  First of all, the burden of proof in a civil matter is “by a preponderance of the evidence.”  What this means is that the burden is satisfied if there is a greater than 50% chance that a proposition is true.  This is a much lower standard than in criminal cases.  But, burden of proof goes more to the weight evidence is given by the court or jury.  Before that is even considered, evidence must pass several hurdles of admissibility.  As we will explore, these standards of admissibility have also been the recipients of significant bolstering from vendors over the years.

The Path to Admissibility

There are several hurdles to admissibility for any type of evidence, and because they are not within the scope of this post, I will forego any discussion of relevance, FRE 403, or the hearsay rules.  I will focus on the issues that tend to be associated with electronic evidence: authentication and the “best evidence rule”.  There are some examiners and perhaps even vendors that would argue electronic evidence is simply not admissible if not collected using bit-level imaging (and sometimes 2 copies – one that is referred to by examiners as the “best evidence” copy and another “working copy” to be analyzed).  This is simply not true.  What we will find is that the collection method will go more to the weight of the evidence rather than the minimum showing needed for admissibility (hence, the discussion of burden of proof above).

All evidence must be authenticated pursuant to FRE 901.  This is a “don’t pass Go” threshold requirement for admissibility.  FRE 901 is satisfied by “evidence sufficient to support a finding that the matter in question is what its proponent claims.”  Notwithstanding a “self-authenticating” piece of evidence pursuant to FRE 902, the proponent must establish the identity of the exhibit by stipulation, circumstantial evidence, or the testimony of a witness with knowledge of its identity and authorship.  Typically, objections to this process would tend to go toward whether the exhibit is an original, was altered, or the witness with whom the proponent is attempting to authenticate the exhibit is not able to do so based on lack of personal knowledge or some other defect.  Mostly these objections deal with the authenticity of the contents of the exhibit, and the rules in Article X of the FRE are helpful here.  Rule 1001 defines an “original” with respect to data stored in a computer or similar device as “any printout or other output readable by sight, shown to reflect the data accurately.”  This is a far cry from a bit-by-bit forensic image!  Rule 1002 – often referred to as the “Best Evidence Rule” – requires that “[t]o prove the content of a writing, recording, or photograph, the original writing, recording, or photograph is required, except as otherwise provided in these rules or by Act of Congress.”  Not only do these rules not require exact duplication of the electronic files, but they do not require imaging the entire 80GB hard drive to collect the 100MB of files that are potentially relevant to the case.  What they do require, though, is the ability to show that a document being proffered is the same document that was originally created.  In Re Vee Vinhnee, 336 B.R. 437, 444 (B.A.P. 9th Cir. 2005). Also, Judge Grimm sets out an extremely comprehensive analysis of what is required for the admissibility of electronic evidence in civil litigation in Lorraine v. Markel American Insurance Company, 241 F.R.D. 534 (D.Md. May 4, 2007).  In Lorraine, he notes that In Re Vee Vinhnee may set out the most demanding test for admissibility of ESI.

Maintaining Forensic Integrity

So, how do I combat the claims that “they must have altered that document” or “Your Honor, I swear that line about ‘acceptable losses’ was not in the safety memo when I created it”?  This is where hash value becomes a wonderful thing.  Computing the hash of an electronic file, or computing a hexadecimal checksum based on analysis of the contents of an electronic document, is essentially like recording the DNA of an electronic file.  If the file is altered, its hash value would be different.  So, by computing the hash value at the source, in transit, and at the destination, I can ensure that the electronic file is in exactly the same state as it was at the source (or, that the collected document is the same as the document originally created).  Now, add the ability to report on that information and those container files and full forensic disk images really do become extreme overkill.
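The hash-at-each-stage idea described above can be sketched with nothing more than a standard library.  This is a minimal illustration, not a description of any vendor’s collection tool: the function names, the record layout, and the choice of SHA-256 are my assumptions, and a real process would also log who collected the file and when.

```python
import hashlib
import shutil
import tempfile
from pathlib import Path


def sha256_of(path):
    """Hash a file's contents in chunks so large files are not read at once."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def collect(source, destination):
    """Copy one file and record its hash at the source and at the destination."""
    source_hash = sha256_of(source)
    shutil.copy2(source, destination)  # copy2 also tries to preserve timestamps
    destination_hash = sha256_of(destination)
    if source_hash != destination_hash:
        raise ValueError(f"hash mismatch while collecting {source}")
    return {
        "file": Path(source).name,
        "source_sha256": source_hash,
        "destination_sha256": destination_hash,
    }


# Made-up file standing in for a document on a custodian's laptop.
with tempfile.TemporaryDirectory() as tmp:
    source = Path(tmp) / "safety_memo.txt"
    destination = Path(tmp) / "evidence_store_copy.txt"
    source.write_text("Original memo text.")

    record = collect(source, destination)

    # Any later change to the preserved copy shows up as a different hash.
    destination.write_text("Original memo text, plus one altered line.")
    altered_detected = sha256_of(destination) != record["source_sha256"]

print(record["source_sha256"] == record["destination_sha256"], altered_detected)
```

The record returned by the hypothetical `collect` function is the kind of information that can be reported on later to show that a produced document is in the same state as the document originally collected.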

The important distinction here is that the term “forensic” does not refer to a type of technology or the products of a specific vendor – despite claims and propaganda to the contrary.  Forensic refers to the methodology used by the person collecting the evidence – whether it is fingerprints from a weapon or electronic files from an employee’s laptop.  Forensic imaging, however, refers to the process by which an entire hard disk is copied bit by bit to create an exact duplicate of that hard drive in a forensic manner.  It is entirely possible for a collection of ESI to be “forensically sound” by simply employing the technique described above of taking hash values at each stage of the process to be able to prove that the files were not altered during collection.  As long as chain of custody is also maintained (much easier to do now that we are not using multiple tools, vendors, locations, and people to do the job), then the process should meet the threshold admissibility requirements of the Federal Rules of Evidence.

Opponents will still bring up claims that the evidence must have been altered, or the expert familiar only with forensic imaging technologies will try to use the argument that only vendor X’s technology is “court vetted,” so any other method is not acceptable.  But, to these opponents, I would argue two points:

  1. No technology is “court vetted”.  The operator’s use of the technology in the specific case (in a specific jurisdiction) was acceptable to the court to meet the threshold showings required by FRE 901, 1001, and 1002 – as well as any rules of procedure governing the production of discovery in either a civil or criminal matter.  Wow – that would be a very long footnote on a marketing slide…probably why it is not usually mentioned.
  2. The process is forensically sound, and you can prove that the documents were not altered from collection through production by referencing the hash value and maintaining copies of the original native files analyzed on a secured preservation store.  This would exceed the requirements of FRE 901, 1001, and 1002 – but would provide protection against claims going to the “weight” of the evidence by opponents who would cry foul.

What Now?

So, where does all of this leave us?  First, in the vast majority of civil litigation matters where electronic discovery is being performed, forensic bit by bit imaging of computer hard drives is simply not required.  Vendors have promoted this practice over the years, but all this has done is over-complicate the eDiscovery process for many unsuspecting litigants and dramatically increase costs because the model simply does not scale.  Moreover, the effort and cost required to deal with these full drive images downstream in the process is often overlooked by these vendors and overzealous consultants.  Next, we now know there is a better way – targeted, forensically-sound collection of ESI using streamlined and automated solutions that maintain custodian relationship – even for shared data sources – throughout the eDiscovery lifecycle, preventing form of production disputes and other calamities that have plagued this industry for the last decade.  There is a better way to collect ESI that will provide exponential cost savings all the way to production.

Clearwell Is Now Officially Part of Symantec

Monday, July 11th, 2011

Today, I am delighted to report that Clearwell Systems has become part of Symantec. We have, of course, been working closely together since obtaining regulatory approval for the acquisition last month, but this makes it official: Symantec can now offer customers Clearwell’s market-leading eDiscovery platform as well as its market-leading Symantec Enterprise Vault archiving solution. We are excited to be part of the Symantec team, and to work alongside so many talented people to create the next generation of eDiscovery and information governance solutions.

There are already a large number of joint customers using the Clearwell and Symantec solutions as part of an integrated eDiscovery and archiving workflow, and we are well underway towards building more robust integration between Clearwell and Symantec Enterprise Vault. In updating our product roadmaps, all our decisions are guided by feedback from customers who have told us over and over again that they want to:

  • Reduce costs across all phases represented in the Electronic Discovery Reference Model, from information management through review and production
  • Reduce risk by improving the defensibility and repeatability of their archiving and eDiscovery processes
  • Streamline their end to end archiving and eDiscovery lifecycle to meet legal and regulatory deadlines
  • Start managing information and conducting eDiscovery in as little as one day; whether on-premise, as a hosted solution or in the cloud
  • Meet their enterprise-wide archiving and eDiscovery needs, whether they have fewer than 25 or more than one million users

As we’ve discussed before, our plan as part of Symantec is to deliver a seamless, integrated archiving and eDiscovery management workflow that benefits all our customers. To keep everyone in the loop, we will continue to post updates and answer questions on the integrated product portfolio here and on the Symantec eDiscovery blog.

For more on the acquisition, and the response from our customers, partners and the industry at large, visit: http://www.symantec.com/clearwell.

Clearwell Lives On, But It’s Farewell To “Clearwell Systems Inc.”

Thursday, June 23rd, 2011

Very soon, Clearwell Systems will become part of Symantec and cease to exist as an independent company. This will bring to a close 6 ½ wonderful years, during which Clearwell has grown from the two founders into a profitable, 240-person company. All told, our team has shipped 6 major versions of the Clearwell E-Discovery Platform, signed over 400 customers and 75 partners in 14 different countries, and become widely recognized as leaders in our industry. As a result, Clearwell’s valuation has increased from effectively zero to the $410 million which Symantec is paying our shareholders to acquire the company, making this by far the largest acquisition of an e-discovery software company to date.

For 6 of Clearwell’s 6 ½ years in existence, it has been my privilege to lead the company as its CEO. These have been, by far, the most rewarding, stressful, exhausting, and exhilarating years of my career. So in this, my final blog post, I would like to reflect on how we got here, and take this opportunity to thank some of the many people who made it possible.

***

In my view, there’s no single thing that makes a company successful. Rather, it’s a distinctive mixture of the right idea at the right time, executed the right way, by the right team, which gets the right lucky breaks and is propelled forward ahead of the competition by surging customer demand. That, in summary, is the story of Clearwell.

Right idea at the right time:

In the early days of a company’s life, when there’s no product and no hint of a customer, the only thing that you have is the idea. This is not the specific idea of what the company will do (that comes later); it’s the idea that there’s a huge change, a shift in the tectonic plates, that creates the opportunity to build a substantial new company. Much of this is about timing. Many changes are obvious over a 10-year timeframe, but it’s very hard to gauge which of them will occur in the 2-4 years that investors are willing to fund a startup venture.

The founding team at Clearwell was attracted by two big trends which combined to produce a profound change. One trend was that, by the mid-2000s, almost all communication within an organization had started to flow through email, as opposed to voicemail, memos, or hallway conversations. The other was that storage costs had fallen to the point where it was almost free to store all the email that people were generating. We realized that these two trends in combination had resulted in the creation of a user-generated written record of everything happening within an organization – something which had never existed before. Our hunch was that there had to be some way of unlocking value from this written record, while still respecting privacy.

Executed the right way:

We came to Clearwell with very specific ideas about how to build a world-class software company. These are too numerous and varied to capture here, but I will give you a few examples. In product development, we have always sought to build our enterprise products as if they were consumer products, so we made sure that they are intuitive and easy to use without any training. We designed them with the sales process in mind, by making them very easy to install and evaluate, so that prospects can try them out for free prior to purchase.  When it comes to marketing, we sought to promote a better way of doing e-discovery, rather than just pitch features, by championing the importance of early case assessments (ECA). With respect to pricing, we made the entry-point price as low as possible to encourage adoption, and pegged it to a metric that scales in line with value.  Strategically, we chose processing, analysis, and review as our entry point into the e-discovery market, because that’s where software provides the biggest, most immediate ROI.

In every area of the business, we brought a distinctive approach, all centered around our view of the ideal customer experience – the experience we would want to have, if we were our customers.

Right team:

The standard playbook for recruiting is to hire people who have done it before, ideally in the same domain. We took a different approach, and instead hired primarily based on personal qualities. Some of our team had no prior experience in enterprise software; many (including me) had never worked in e-discovery before coming to Clearwell. But we all share one thing in common: a relentless drive to win in the marketplace by building better products and providing better service than anyone else.

That hunger to win will trump experience every time. It’s the reason why engineers work through the weekend to resolve customer issues without being asked, or why a salesperson will travel 4 days out of every week to call on customers. It’s something that gets built into the company culture and then self-perpetuates. Our team tripled in size in the space of 18 months, and I never cease to be impressed by the fresh ideas and boundless energy coming from the new generations of “first-timers”.

Right lucky breaks:

Every successful company needs the rub of the green, and there have been many occasions when I’ve marveled at our good fortune. But perhaps our biggest break was that the Federal Rules of Civil Procedure (FRCP) changed for the first time in 38 years in December 2006, defining rules for the treatment of electronic information in the courts. This accelerated the movement from paper to electronically stored information and coincided perfectly with our entry into the market, drawing us into the electronic discovery domain.

Surging customer demand:

It’s an amazing feeling when you achieve “product/market fit”, as we did at the beginning of 2009. The user community among law firms and litigation support firms embraced our technology for ECA, taking our user base from hundreds to thousands. Enterprises woke up to the money that could be saved by bringing electronic discovery in-house, proactively issuing RFPs and creating new positions specifically responsible for e-discovery. Federal agencies began to adopt e-discovery solutions to sift through the vast quantities of data coming to them as part of their regulatory and investigative duties. Essentially, e-discovery became a core business process, just like finance, sales or HR – it became something that every organization had to do. And just as other departments use applications like salesforce.com (sales), Success Factors (HR), or NetSuite (finance) to manage those business processes, so it was that legal departments realized that they needed an application like Clearwell to manage the e-discovery process.

All of a sudden, the business accelerated, sales took off, and we felt ourselves being pulled in every direction at once. In response, we expanded our platform, moving from 1 product to an integrated platform of 4 products, and we increased our geographic coverage by building out the sales team across North America and establishing beachheads in Europe and Asia. The Clearwell team worked around the clock to respond to customer demand, while at the same time recruiting and training as we added people at a furious pace. We learned that hyper-growth can be painful, but in a good way.

***

When things go well, the CEO often takes a disproportionate share of the credit. I must confess, it would be nice to think that the company’s success is due to some kind of brilliance or magic touch on my part, but the reality is quite different. This has been a team effort from beginning to end and there is a very long list of people who deserve recognition. It’s impossible to capture them all, but I’m going to do my best, by saying a heart-felt “thank you” to:

  • Venkat Rangan and Charu Rudrakshi who started the company, raised the first round of funding, and set the DNA of the engineering team;
  • Jim Goetz at Sequoia Capital who acted more as co-founder than investor in the company’s first year, and has since been incredibly supportive of the management team;
  • Tom Dyal at Redpoint Ventures for his support and insightful advice on strategy; Bill Coughran at Google for helping us think through how best to scale engineering; John Dillon at EngineYard for teaching me what it means to sell software; and, Scott Dettmer at Gunderson Dettmer for his finesse and deft touch in managing the most delicate negotiations;
  • Andy Byrne, Anup Singh, Kamal Shah, Ryan Snyder, Soumitro Tagore, Trevor Eddy, and Venkat Rangan for creating a truly outstanding management team built on trust and mutual understanding – it is quite remarkable that in 6 years, the company has only ever had 1 VP Business Development, 1 CFO, 1 VP Marketing, 1 VP Sales, and 1 CTO;
  • Amar Laud, Amy Johnson, Andy Kashyap, Aruna Mantripragada, Bill Duffy, Brandon Cook, Cat Lee, Chitrang Shah, Cris Barrett, Dave Fraleigh, David Speicher, Dean Gonsowski, Donna Hui, Doug Kaminski, Ed Hinton, Jason Montgomery, Jason Reeve, Joe Schwartz, Krista Jones, Kurt Leafstrand, Malay Desai, Manish Sampat, Mark Wentworth, Mike Lee, Peter McLaughlin, Sangeeta Relan, Sean Wilcox, Steve Rapp, Subbu Gooty, Teddy Cha, Tom Kennedy, Tom Wells and Umair Hamid for being the leaders who have really defined the company, and without whom we would never have got anything done;
  • Clearwell “Class of 2005” for their super-human efforts in shipping Version 1 and launching the company; Clearwell “Class of 2006, 2007 and 2008” for tirelessly iterating until we cracked the code for a profitable business model; and, Clearwell “Class of 2009, 2010, and 2011” for driving the huge expansion of our operations, both in the US and overseas;
  • John Petruzzi from Constellation Energy, Joe Tawasha from Charles Schwab, Don McLaughlin from Qwest, Pallab Chakraborty at Oracle, Jesse Hartman at the Department of Health and Human Services, and Ron Best at MTO for being bleeding edge customers who took a chance on a fledgling technology;
  • Jeff Fehrman from Onsite; Greg Mazares, Keith Lieberman and the infamous Taylor brothers at Encore; and Paul Tombleson at KPMG UK – for being the first service providers to embrace Clearwell’s technology;
  • Debra Logan and John Bace at Gartner; Barry Murphy and Greg Buckles at eDiscovery Journal; Brian Babineau and Katey Wood at ESG; Brian Hill at Forrester; Chris Dale of the eDisclosure Information Project; George Socha and Tom Gelbmann; Nick Patience at 451Group; and Vivian Tero at IDC – for doing so much to help define e-discovery software as a space and make it intelligible to end-customers;
  • Deepak Mohan and Brian Dye at Symantec for sponsoring an acquisition that will massively accelerate the adoption of Clearwell’s technology; and,
  • Finally, Enrique Salem and the entire Symantec M&A and Integration Teams for giving us such a warm welcome into the Symantec family.

***

It has been a remarkable journey. I feel proud, and humbled, to have been a part of it.

Staying on Target in Electronic Discovery

Thursday, June 23rd, 2011

Clearwell just announced major enhancements to our Identification and Collection Module that together usher in a new generation of targeted collection capabilities for e-discovery. Why are we excited about this? Because it promises to provide our customers with a dramatic increase in their ability to perform quick and efficient collections across the enterprise with a small fraction of the cost and effort traditionally required.

Before Clearwell, vendors could only rely on building their own indexes when attempting to collect content by keyword from unstructured document sources. They did this in one of two ways.

The first method was to build one-off indexes with each collection, indexing content and then discarding the index after collection is complete. This minimized the amount of infrastructure required to maintain the index, but was painfully slow and wasteful of computing and network resources. These sorts of solutions came from vendors who originally focused on the forensic investigation side of the world, whose tools had been designed around small-scale collection from individual devices and hard drives. Unfortunately, they simply don’t scale to meet the demands of today’s large enterprises with their ever-increasing data volumes.

The second method was to attempt to create an uber-index of all of the information in an enterprise and keep it continually updated so that it would be ready at a moment’s notice for your collection needs. This approach proved to be incredibly challenging to implement, required a huge amount of infrastructure to maintain, and, worst of all, didn’t really work: creating the uber-index, as it turns out, was uber-difficult.

In talking with hundreds of customers over the last couple of years, we realized that there was a better “third way,” which combined the lightweight nature of the first method with the comprehensiveness of the second. How? By leveraging the indexes that enterprises already have in place. From comprehensive, robust archiving solutions like Symantec Enterprise Vault to the fully-searchable indexes found on Microsoft SharePoint, Exchange, and file servers, the way of finding the information you need quickly for e-discovery is, by and large, already out there. It’s simply a matter of building an e-discovery platform sophisticated enough to leverage those indexes and, when necessary, be intelligent enough to build its own when not available from another source. That’s exactly what we’ve done with Clearwell’s targeted keyword collection feature.
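To make the "third way" concrete, here is a minimal sketch in Python. It is not Clearwell's implementation: the `ExistingIndex` class, the `build_index` fallback, and the shape of the `sources` dictionary are all hypothetical names invented for illustration. The only point it tries to show is the decision described above: reuse an index when a source already has one, and build a lightweight one only when it does not.

```python
from collections import defaultdict


class ExistingIndex:
    """Hypothetical stand-in for a source that already maintains a searchable index."""

    def __init__(self, postings):
        self.postings = postings  # keyword -> set of document ids

    def search(self, keyword):
        return set(self.postings.get(keyword.lower(), set()))


def build_index(documents):
    """Fallback: build a simple inverted index when the source has none."""
    postings = defaultdict(set)
    for doc_id, text in documents.items():
        for word in text.lower().split():
            postings[word.strip('.,;:!?"')].add(doc_id)
    return ExistingIndex(postings)


def targeted_collect(sources, keyword):
    """Return matching document ids per source, reusing indexes where present."""
    hits = {}
    for name, source in sources.items():
        # Prefer the index the enterprise already has; index on demand otherwise.
        index = source.get("index") or build_index(source["documents"])
        hits[name] = sorted(index.search(keyword))
    return hits


# Illustrative data only: one source with an index, one without.
sources = {
    "archive": {"index": ExistingIndex({"merger": {"a1", "a7"}})},
    "file_share": {"documents": {"f1": "Draft merger terms.", "f2": "Lunch menu"}},
}
result = targeted_collect(sources, "merger")
print(result)
```

In this toy setup the archive is never re-indexed, while the file share gets a throwaway index built only because nothing better was available.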

One of the most exciting things about this approach is that, while it works great for today’s enterprise information infrastructure, it is perhaps even more powerful in tomorrow’s. As your company’s information stores gradually shift toward the cloud, leveraging the indexes in the cloud becomes essential to accessing the information that lives there quickly and efficiently. It’s simply not feasible to use the “one-off” or “uber-index” approaches when data lives in a cloud infrastructure, since data access is often slower over a wide-area network. Last year, Clearwell was the first e-discovery platform to support direct access to cloud Exchange and SharePoint environments, and now with keyword collection we have made another great stride forward in achieving our customers’ vision for next-generation e-discovery. And there’s still more to come as we accelerate our product development by integrating with Symantec’s world-class information management team. Stay tuned!

McDermott Sued Over Alleged Electronic Discovery Gaffes

Wednesday, June 22nd, 2011

The electronic discovery world is buzzing about the malpractice case filed against Amlaw 100 firm McDermott Will & Emery.  There are a few good summaries here and here, but the gist of the complaint is that McDermott failed to properly supervise the electronic discovery efforts for their client J-M Manufacturing (J-M) in response to a qui tam investigation.  According to a lawsuit filed by J-M in a California state court, McDermott inadvertently produced 3,900 privileged documents that were handed over to the federal government (and subsequently to a third party).

In terms of the nitty-gritty, the complaint alleges that McDermott used electronic discovery vendor Stratify (formerly part of Iron Mountain, now absorbed into Autonomy) to process and host the data.  Then, McDermott apparently retained a bevy of contract attorneys to review collected ESI from the 160 custodians, ultimately producing 250,000 documents that were presumably relevant, but not privileged.  The complaint contains the following particulars:

“12. Defendants owed PLAINTIFF a duty to render legal services competently. Defendants breached that duty by, inter alia, producing privileged documents to parties adverse to JME in litigation without obtaining its informed consent, failing to supervise attorneys and vendors MWE contracted with to perform the review and production of documents, and charging JME fees and costs for performance of such work that was not properly performed, or not performed at all.”

Surprisingly, this entire discussion is about a mere complaint filed against a large firm, who assuredly will wage numerous procedural challenges.   Thus, it’s questionable whether this case even sees the light of day.  So, why is it showing up on the radar of so many experts and pundits?  First of all, as Ralph Losey notes:

“This malpractice suit is an important and widely talked about event because it represents the first time, to my knowledge, that a law firm has been sued for e-discovery malpractice. We have all been waiting for this to happen. It was inevitable.”

But novelty alone doesn’t usually make headlines unless there’s also a sense that where there’s smoke, there’s probably fire.  Given the rise in electronic discovery sanctions against counsel, it has long been a foregone conclusion that a corporate client who experienced spoliation sanctions or an inadvertent production would start pointing fingers at other participants in the process, including the law firm that directed the e-discovery effort or the service provider who hosted the review process.  A recent Duke article noted that “[c]onsistent with the overall increase in sanction cases,…counsel sanctions for e-discovery have steadily increased since 2004.”  The article identified various levels of misconduct as the basis for counsel sanctions — “four cases involved negligence, seven cases involved gross negligence, nine cases involved reckless disregard, and ten cases involved intentional conduct or bad faith.”  Significantly, the article also noted that sanctions can be based on the “counsel’s personal execution of discovery tasks or on the counsel’s role in coordinating and overseeing the client’s discovery.”  That latter element seems to be the case with the claims against McDermott, and coupled with an inadvertent production (the third rail of electronic discovery) it doesn’t seem too shocking that a malpractice action would get filed.

This lawsuit does serve as a cautionary tale for those firms that continue to do things the old-fashioned (i.e., 1.0) way.  While not an exhaustive list, this means some or all of the following: employing custodian self-collections, using blind keyword searches, failing to do sufficient data sampling (at the search and production phases), opting not to utilize early case assessment approaches, lacking a search strategy and iteration, failing to optimize the review process, etc.  Surprisingly, old-school approaches to electronic discovery are staggeringly common.  In fact, I’ve recently talked to some well-traveled practitioners who’ve actually felt like their firms have gone backwards in recent years as prices for basic, blocking-and-tackling e-discovery services have plummeted.

If nothing else, we know that attorneys are hyper-vigilant about their malpractice insurance.  And it’s not too hard to see how premiums may go up with increasing e-discovery claims, successful or not.  So, while it’s unclear what will happen to McDermott, if it can happen to an Amlaw 28 firm (with roughly 1,000 lawyers) it can probably happen to any firm that’s not being as diligent as it should be.

As a final note of supreme irony, McDermott will likely have to conduct electronic discovery as they defend their electronic discovery malpractice claims.  I wonder if they’ll use Stratify and outside contract attorneys.  I’d guess not.

Apple, Code Name K48 and E-Discovery

Wednesday, June 22nd, 2011

According to a complaint filed by the U.S. government, the FBI secretly recorded an employee at one of Apple’s suppliers passing confidential information about the soon-to-be-released Apple iPad in an October 2009 telephone conversation.  The recording, along with other evidence, led to the arrest of the employee and others on December 16, 2010, on charges of wire fraud and conspiracy to commit securities fraud, as part of a major insider-trading investigation.  In the conversation, a director for Flextronics named Walter Shimoon is heard saying:

“they [Apple] have a code name for something new … It’s … It’s totally … It’s a new category altogether… It doesn’t have a camera, what I figured out. So I speculated that it’s probably a reader. … Something like that. Um, let me tell you, it’s a very secretive program … It’s called K, K48. That’s the internal name. So, you can get, at Apple you can get fired for saying K48.”

Four months later, the first Apple iPad, code named K48, was unveiled to the public.  For more on the case background, see the press release issued by the U.S. Attorneys’ Office on December 16, 2010.

The case is interesting from an eDiscovery standpoint because it highlights challenges related to finding critical evidence as part of an investigation or lawsuit when people are intentionally using code words to hide information.  Finding or overlooking important documents that have been disguised can make or break your case, so determining whether or not key players are using code words is an important part of a thorough investigation.  Equally important is quickly separating relevant from irrelevant documents, before key evidence is lost or destroyed, without having to conduct a painstaking page-by-page review of each document.

How Does Technology Help?

The good news is that even though technology innovation has resulted in massive data growth requiring the review and analysis of more documentary evidence during lawsuits and investigations, advances in eDiscovery technology have also made sifting through this information faster and easier.  In other words, technology can help solve the data growth problem technology created.

One of the newest advances is the use of “transparent concept search” technology to find important electronic files in lieu of basic “keyword” or “traditional” concept searching technology.  In many situations investigators or lawyers simply aren’t aware code words are being used to hide activity, so critical evidence is often overlooked.  For example, in the present case assume the investigator is unaware that “K48” is the internal code name used for the first iPad.  A simple keyword search for the term “iPad” may not retrieve critical documents about the “iPad” because the code name K48 is being used to disguise the product name.  If this is the only search methodology used, information could easily be overlooked during the investigation due to the limitations of simple keyword search technology.

On the other hand, running the same search using a traditional concept searching tool is likely to retrieve documents containing the word “iPad” as well as other conceptually related documents.  The problem is that the user has no ability to control the breadth of the search using traditional concept searching technology.  That means even though a traditional concept search for the term “iPad” is likely to include documents containing the terms “K48” and “iPad,” it is also likely to retrieve a large number of irrelevant documents containing terms like “iPod,” “iTouch” and “iTunes” that may appear to be conceptually related to the search term “iPad.”  The problem may seem trivial initially, but when investigators are required to read hundreds or thousands of irrelevant documents about the iPod, iTouch or iTunes in an effort to find relevant documents about the iPad, the time and cost of the investigation can skyrocket.

Next Generation Transparent Concept Search Technology

To solve this problem, next generation transparent concept search technology takes traditional concept searching a step further by empowering investigators to reap the advantages of traditional concept searching while actually reducing instead of increasing e-discovery expenses.  The secret is that transparent concept searching technology significantly reduces the time and expense resulting from over-inclusive document retrieval by allowing users to eliminate documents containing concepts that are not relevant to the intended search.  This is accomplished by providing a transparent view of concepts related to a search so that users can actually visualize and select (or deselect) the range of concepts to be included in a search before the search is executed.

For example, using transparent concept search technology to search for the term “iPad” would reveal conceptually related terms like “K48” just like traditional concept searching.  However, a transparent concept search would also provide a list of all concepts related to the keyword “iPad” prior to the search, such as “K48,” “iPod,” “iTouch,” “Shimoon,” “iTunes,” etc.  Prior to executing the search, the user could de-select irrelevant concepts and limit the search to “iPad”, “Shimoon”, “internal” and “K48” to make sure only the most relevant documents are retrieved. (See Figure 1).  In addition to decreasing the cost associated with segregating relevant and irrelevant documents, the transparent approach to concept searching results in strategic advantages for investigators and legal teams because the most relevant evidence is found quickly so cases can be assessed faster, with more accuracy, and before evidence disappears.

Figure 1: Transparent concept search reveals all concepts related to the keyword “iPad” so users can not only identify key documents they may have otherwise overlooked, but they can also select which concepts (“internal” “K48” “Shimoon”) to include in the search so only the most relevant documents are retrieved.
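The difference between the three search styles can be sketched in a few lines of Python. Everything below is an assumption made for illustration: the `RELATED` table stands in for whatever a real concept engine would compute, the five documents are invented, and the function names are hypothetical. It is a toy, not a description of any product's internals.

```python
# Hypothetical output of a concept model: terms it considers related to "ipad".
RELATED = {"ipad": ["k48", "ipod", "itouch", "shimoon", "itunes", "internal"]}

# Invented documents for illustration only.
DOCS = {
    1: "The iPad launch date is set",
    2: "K48 is the internal name, Shimoon said",
    3: "New iPod colors announced",
    4: "iTunes gift card promotion",
    5: "iTouch battery complaints",
}


def tokens(text):
    return {w.strip('.,;:!?"').lower() for w in text.split()}


def search(docs, terms):
    wanted = {t.lower() for t in terms}
    return sorted(doc_id for doc_id, text in docs.items() if tokens(text) & wanted)


def keyword_search(docs, term):
    # Matches only documents containing the literal term.
    return search(docs, [term])


def concept_search(docs, term):
    # Traditional concept search: every related concept is pulled in,
    # and the user has no control over the breadth.
    return search(docs, [term] + RELATED.get(term.lower(), []))


def transparent_concept_search(docs, term, selected):
    # Transparent concept search: the user reviews RELATED[term] first
    # and keeps only the concepts they selected.
    kept = [c for c in RELATED.get(term.lower(), []) if c in selected]
    return search(docs, [term] + kept)


keyword_hits = keyword_search(DOCS, "iPad")
concept_hits = concept_search(DOCS, "iPad")
transparent_hits = transparent_concept_search(
    DOCS, "iPad", {"k48", "shimoon", "internal"}
)
print(keyword_hits, concept_hits, transparent_hits)
```

With this toy data the keyword search misses the K48 document, the traditional concept search returns all five documents, and the transparent search returns only the iPad and K48 documents.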

Conclusion

Not knowing what to search for as part of eDiscovery or investigations is often the biggest organizational challenge that basic keyword and traditional concept search technology has not been able to solve.  Next generation transparent concept search technology overcomes the inherent limitations of basic keyword and traditional concept searching technology by empowering users to uncover, assess, and review evidence faster and with more accuracy, thereby giving litigators or investigators new strategic advantages on every case.

E-Discovery Goes Mainstream

Tuesday, June 21st, 2011

These days, being mentioned on a late-night talk show is pretty much a stamp of “going mainstream”. This is true of celebrities (notably the One-Man Band that is Charlie Sheen), public figures (Captain “Sully” Sullenberger, who piloted the US Airways plane to a safe landing on the Hudson River), and even infomercial goods (who isn’t familiar by now with the Snuggie?)

In the e-discovery world, we realized just how mainstream this industry is becoming when we got a mention on The Daily Show with Jon Stewart. With guest star Fareed Zakaria, fresh off the release of his new book, on set to discuss the American economy and the impact of technology on corporations, audiences were treated to this nugget:

Zakaria:   Machines can do things that people used to. There’s now computer programs that can do stuff that lawyers used to be able to do – discovery and things like that. May not be such a bad thing…

Stewart:   What can lawyers do that computers can’t do?

Lawyer jokes are never in short supply, and leave it to Jon Stewart not to miss a timely jab when one can be thrown. But we took notice because, of all the examples Zakaria could have used for technology’s impact on businesses everywhere — he chose to highlight the role of e-discovery software.

This was far from the first “mainstream” move for the e-discovery industry. In March, The New York Times published a featured – and top-emailed – article on advances in electronic discovery software. In May, leading analyst firm Gartner published the Magic Quadrant for E-Discovery Software, its first Magic Quadrant on the electronic discovery industry. And then in June, there it was: electronic discovery, right alongside CNN’s Fareed Zakaria and all Jon Stewart’s comedic antics on The Daily Show. Taken together, it’s clear that e-discovery is a hot topic on the minds of business folks and, increasingly, mainstream audiences. We’re eager to see where it comes up next – and secretly hoping the SNL sketch team is taking note.

Patents and Innovation in Electronic Discovery

Monday, June 13th, 2011

In the world of technology we live in, a huge amount of benefit is created when people apply certain well-known techniques to solve problems and create value for the broader community. Such techniques are often the result of painstakingly long and laborious research, driven primarily by academic institutions, with private industry either funding such research directly or co-opting it in its own work. When the industry as a whole recognizes a certain methodology, it gains popular usage.

In information retrieval, searching and retrieving relevant content from unstructured text has been a vexing problem, and we’ve had decades of the brightest minds applying their collective intelligence and the rigors of peer review to validate and establish the most effective way to solve a retrieval problem. And, research forums such as TREC, SIGIR and other information retrieval conferences establish a venue for advancing the state of the art. So, when Recommind announced that they have been issued a patent on Predictive Coding, I took notice, especially since it touches a nerve with those who believe research should be openly shared.

The patent lists six claims that describe a workflow whereby humans review and code a document and the coding decisions applied to the document sample are projected or applied to the larger collection of documents. Anyone who has even the slightest exposure to information retrieval research will recognize this as a very common interactive relevance feedback mechanism. Relevance feedback as a way to perform information retrieval has been studied for well over forty years, with a paper as early as 1968 by Rocchio J.J., titled Relevance Feedback in Information Retrieval. It falls under a category of methods broadly known as machine learning.
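For readers who have not seen relevance feedback before, the classic Rocchio update fits in a few lines: the query vector is moved toward the centroid of documents a reviewer marked relevant and away from the centroid of those marked non-relevant. The sketch below is a plain-Python illustration of that textbook formula, not the workflow of any particular product. The weights `alpha`, `beta` and `gamma` are illustrative defaults, the term vectors are invented, and dropping negative weights is a common convention rather than a requirement.

```python
def rocchio(query, relevant, non_relevant, alpha=1.0, beta=0.75, gamma=0.15):
    """Return an updated query vector (term -> weight) from reviewer feedback."""
    terms = set(query)
    for doc in relevant + non_relevant:
        terms |= set(doc)

    def centroid(docs, term):
        # Mean weight of the term across the given documents.
        return sum(d.get(term, 0.0) for d in docs) / len(docs) if docs else 0.0

    updated = {}
    for term in terms:
        weight = (
            alpha * query.get(term, 0.0)
            + beta * centroid(relevant, term)
            - gamma * centroid(non_relevant, term)
        )
        if weight > 0:  # negative weights are dropped
            updated[term] = weight
    return updated


# Invented example: the reviewer codes two documents relevant, one not.
query = {"ipad": 1.0}
relevant = [{"ipad": 1.0, "k48": 1.0}, {"k48": 1.0, "shimoon": 1.0}]
non_relevant = [{"ipod": 1.0, "itunes": 1.0}]

new_query = rocchio(query, relevant, non_relevant)
print(sorted(new_query.items()))
```

After one round of feedback the query has picked up "k48" and "shimoon" from the relevant documents, which is the same human-codes-a-sample, system-projects-to-the-collection loop described above.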

Any supervised machine learning system involves creating a training sample and using that sample to project into a larger population. The fact that one could claim patentable ideas on something that is so widely known and used is puzzling.  Any workflow that employs machine learning would include the steps of creating an initial control set, coding that by human review, and applying the learned tags to a larger population.  In fact, the Wiki article Learning to rank describes precisely the workflow that is claimed in the patent. Moreover, as part of our participation in the TREC Legal Track 2009, Clearwell submitted a paper describing iterative sampling-based evaluation and automatic expansion of the initial query.  In that paper, we describe exactly the workflow postulated by the six claims of the patent.

In terms of other prior art that would potentially invalidate the patent, the list is long. Let’s start with Text Classification. Text Classification using Support Vector Machines (SVM) was first published by Thorsten Joachims in 1998, in the Proceedings of Sixteenth International Conference on Machine Learning, as well as in his book Learning to Classify Text Using Support Vector Machines: Methods, Theory and Algorithms, published in The Springer International Series in Engineering and Computer Science.  Joachims is now a well-recognized Professor of Computer Science at Cornell University, and that work is widely cited as seminal in the area of machine learning and text classification. Interestingly, this work was cited by the Patent Examiner as prior art, but the inventors missed listing it. Nevertheless, that work and further work by several academics such as Leopold and Kindermann have already established the use of Support Vector Machines as a useful technique for machine learning. To claim the novelty of its use in automatically coding documents is, in my opinion, a hollow claim.

Another technology mentioned in passing is Latent Semantic Indexing (LSI). This was proposed as a retrieval technique by Deerwester, S., Dumais, S.T., Furnas, G.W., Landauer, T.K., and Harshman, R. in their paper, Indexing by Latent Semantic Analysis, in Journal of the ASIS, 41(6):391-407, 1990. The use of LSI for semantic analysis, concept searching and text classification is also very widespread, and once again, it seems ridiculous to claim that it is something novel or innovative.

Next, let’s examine the use of sampling to validate the initial control set. Use of sampling for validation of a control set of documents is in fact such a widely known technique that most e-discovery productions employ sampling. In fact, the Sedona Commentary on Achieving Quality and the EDRM Search Guide recommend use of sampling to validate automated searches. Furthermore, several e-discovery opinions such as Judge Grimm’s opinion in Victor Stanley [Victor Stanley, Inc. v. Creative Pipe, Inc., 2008 WL 2221841 (D. Md., May 29, 2008)] suggest that any technique that reduces the universe of documents produced must employ sampling to validate automated searches.
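To show how ordinary this kind of validation is, here is a minimal sketch of sampling a document set and estimating the proportion of relevant documents with a rough 95% confidence interval. It assumes a simple random sample and the normal-approximation interval, and the names (`sample_estimate`, `is_relevant`) are hypothetical. In practice `is_relevant` would be a human reviewer's coding decision rather than a function, and the sample size and interval method would be chosen to suit the matter.

```python
import math
import random


def sample_estimate(population, is_relevant, sample_size, seed=0, z=1.96):
    """Estimate the relevant fraction of `population` from a random sample.

    Returns (rate, low, high), where low/high bound an approximate 95%
    confidence interval when z=1.96 (normal approximation).
    """
    rng = random.Random(seed)  # fixed seed so the draw can be reproduced
    sample = rng.sample(population, sample_size)
    found = sum(1 for doc in sample if is_relevant(doc))
    rate = found / sample_size
    margin = z * math.sqrt(rate * (1 - rate) / sample_size)
    return rate, max(0.0, rate - margin), min(1.0, rate + margin)


# Invented example: 10,000 document ids left out of a production, of which
# every 50th is (unknown to the reviewer) actually relevant.
withheld = list(range(10_000))
rate, low, high = sample_estimate(
    withheld, lambda doc_id: doc_id % 50 == 0, sample_size=400
)
print(f"estimated relevant fraction: {rate:.3f} (approx. 95% CI {low:.3f} to {high:.3f})")
```

If the estimated fraction of relevant documents in the withheld set is too high, the search that produced the cut gets revisited, which is the iteration the commentary and case law above are asking for.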

In short, we think the claims issued in the patent and the associated workflow are so commonly used that the workflow is neither novel nor non-obvious to a trained practitioner, and there is enough prior art on each of the individual technologies to warrant a re-examination and eventual invalidation of the patent. In any event, it is fairly easy for anyone to pick up existing prior art and devise a similar workflow that achieves the same or a better outcome, and any attempt to enforce the patent will likely be challenged.

But there is an even bigger issue at stake here beyond the status of Recommind’s patent: namely, shouldn’t the e-discovery vendor community continue to work, as it has for years, toward what is in the best interest of the legal community and, more broadly, the justice system? Recommind’s thinly veiled threats about requiring industry participants to license their technology are an affront to those who have invested years developing the technology and practicing the approach in real-world e-discovery cases. Spend a few minutes trolling (no pun intended) around on archive.org and you’ll see that early predictive coding companies like H5 were practicing machine learning and predictive workflows in e-discovery over two years before Recommind announced their first version of Axcelerate.

Wouldn’t a better outcome be for corporations and law firms to benefit from the innovation that comes from free competition in the marketplace, while still honoring the sort of novel, non-obvious innovation that warrants patent protection? Legitimate patents that actually encourage and protect investments by an organization are fine, but process patents that attempt to patent a workflow are bad for business. With such an approach, the promise of automated document review (which, as any truly honest vendor should admit, still has much more room to grow and develop) can be fully realized in a way that both provides vendors with the fair and just economic rewards they deserve and helps the legal system become radically more efficient.