
What’s Different About E-Discovery Search?

Monday, May 5th, 2008

In his latest article, Craig Ball argues that lawyers “need to learn more about the science of search.” Craig says that at least part of the reason for this is that searching in e-discovery is challenging and different from the searching to which lawyers are accustomed.

“Lawyers believe themselves adept at keyword search in e-discovery because they’ve mastered keyword search in online legal research. The correlation is superficial at best. Unlike the crazy quilt of ESI, the language of reported cases is precise, consistent and structured. Misspellings are rare. Legal research is Disneyland. E-discovery is Baghdad.”

Last month, after my last post, I had a conversation with Ron Friedman in which he made a similar argument about lawyers needing to learn e-discovery search tools.[1]

I think Craig and Ron make excellent points. E-Discovery search is different, and it’s important for lawyers, investigators, litigation support professionals and other practitioners to understand how. The natural questions that arise from their arguments are: what is different about e-discovery search? How is it different from other familiar searches, such as web search and legal research search? The answers are important because they can help guide e-discovery experts on how to train lawyers and even guide attorneys during review. They are also important for developing e-discovery best practices and e-discovery search software.

I think the first step in answering these questions is to agree on the definition of e-discovery search, or, better said, the types of e-discovery search, since there are several. To address this appropriately would take at least another full post or a paper. As a result, I will leave the detailed discussion of these matters to another time. For this discussion, I will focus on searches used to identify potentially relevant documents for purposes of matter assessment (i.e., understanding the nature of the case: who did what, where, when and why) and for document production to the opposing party.

I have observed five major characteristics of e-discovery search that as a whole differentiate it from other searches. I would be interested to hear additional views on what is different about e-discovery search, so please comment on this post.

Recall
First, the cost of missing a relevant document, or low recall, can be very high in e-discovery. Missing a document that you should have produced could result in sanctions and adversely impact the case outcome. Missing key documents could also affect your legal strategy, causing you to make sub-optimal decisions. Missing relevant documents can be costly in other searches as well. For example, in legal research, not identifying case law that is critical to your case could also have a detrimental impact on your legal strategy. However, low recall is on average costlier and more likely in e-discovery. In contrast to e-discovery and legal research searchers, web search users are typically not very concerned with missing relevant documents. For the most part, they are interested in the most relevant documents, not all of the relevant documents. This is why Google rarely actually provides all the results for a search (you can try this yourself by paging to the end).

Precision
Second, the cost of returning false positives, otherwise known as low precision, in e-discovery searches is high. The results of e-discovery searches, including false positives, are typically produced and reviewed by humans at costs as high as several dollars per document. On the other hand, false positives have a minimal cost in web search because users either won’t see them if they are ranked low or will ignore them after minimal review. False positives can be costly during legal research in certain scenarios, such as when the stakes and nature of the case are such that many search results need to be exhaustively reviewed, but typically the costs are lower.
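
To make the two measures concrete, here is a minimal sketch of how recall and precision could be computed for a single search, along with the review cost attributable to false positives. The document IDs, the function names and the $2.00 per-document review cost are all hypothetical values chosen for illustration; in a real matter the full set of relevant documents is not known in advance and would have to be estimated, for example by sampling.

```python
def recall(retrieved: set[str], relevant: set[str]) -> float:
    """Fraction of the relevant documents that the search actually found."""
    if not relevant:
        return 1.0  # nothing to find, so nothing was missed
    return len(retrieved & relevant) / len(relevant)


def precision(retrieved: set[str], relevant: set[str]) -> float:
    """Fraction of the retrieved documents that are actually relevant."""
    if not retrieved:
        return 1.0  # nothing retrieved, so no false positives
    return len(retrieved & relevant) / len(retrieved)


def false_positive_review_cost(
    retrieved: set[str], relevant: set[str], cost_per_doc: float
) -> float:
    """Money spent reviewing documents that turn out not to be relevant."""
    return len(retrieved - relevant) * cost_per_doc


# Hypothetical example: four relevant documents exist, the search returns six.
relevant = {"doc1", "doc2", "doc3", "doc4"}
retrieved = {"doc1", "doc2", "doc5", "doc6", "doc7", "doc8"}

print(f"recall:    {recall(retrieved, relevant):.2f}")
print(f"precision: {precision(retrieved, relevant):.2f}")
# Assumed review cost of $2.00 per document (illustrative only).
print(f"wasted review cost: ${false_positive_review_cost(retrieved, relevant, 2.00):.2f}")
```

In this made-up example the search misses half of the relevant documents and two thirds of what it returns is noise, so both costs described above show up at once.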

Varied Language
Third, documents searched during e-discovery often include personal emails and files, and frequently use varied language including jargon, slang, abbreviations, technical terminology, misspellings, and machine-created junk. This is Craig’s “Baghdad” point. In contrast, as Craig points out, documents searched during legal research, such as opinions and motions, are typically well-structured documents with few misspellings and relatively consistent language. Even web sites are generally “cleaner” than typical e-discovery documents.
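
One way to see why varied language matters is to look at what an exact keyword match would miss. The sketch below is only an illustration, not a description of how any particular e-discovery tool works: it uses Python's standard `difflib` module to find words in a small, made-up set of emails that are close in spelling to a search term. The sample emails, the `expand_term` helper and the 0.8 similarity cutoff are all assumptions.

```python
import difflib
import re


def vocabulary(documents: list[str]) -> set[str]:
    """Collect every distinct lowercase word that appears in the documents."""
    words: set[str] = set()
    for text in documents:
        words.update(re.findall(r"[a-z0-9']+", text.lower()))
    return words


def expand_term(term: str, documents: list[str], cutoff: float = 0.8) -> list[str]:
    """Return words from the documents that are spelled similarly to `term`.

    The cutoff is an assumed threshold; lower values catch more variants
    but also pull in more unrelated words.
    """
    vocab = sorted(vocabulary(documents))
    return difflib.get_close_matches(term.lower(), vocab, n=10, cutoff=cutoff)


# Hypothetical emails, including a misspelling and an abbreviation.
emails = [
    "Pls send the recievables report asap",
    "Receivables look fine this qtr",
    "Lunch on Friday?",
]

variants = expand_term("receivables", emails)
print(variants)
```

A search for the exact word "receivables" would miss the first email. Spelling similarity picks up the misspelling, but it does nothing for abbreviations such as "qtr" or for slang and jargon, which need other techniques (for example, term lists built with people who know the matter).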

Complexity
Fourth, users are often looking for different information when performing searches during discovery. E-Discovery searches are often aimed at comprehensively understanding “who did what, when, where and why” in a matter where the people involved may be trying to hide this information and where there may be no single “starting point”. As a result, e-discovery searchers often adopt strategies that involve large numbers of queries, and will follow the evidence and iteratively refine their searches for combinations of topics, people, places, etc. Legal research searches can also be fairly complex, but, as with the other differences, this one is a matter of degree. Legal research searches typically don’t involve hundreds of queries and terms, are often more narrowly defined and have a “starting point”. Web searches tend to be even simpler. Most are one or two words.

Transparency
Finally, e-discovery search is part of a legal process. The searches themselves are subject to negotiation with and review by opposing counsel and the court. This process can also take place over long time frames. As such, there is a great need for transparency in the development and execution of e-discovery searches. It is also important for e-discovery searchers to develop a defensible audit trail to prove what searches were run and what results were produced when. This is not the case in web search or legal research.
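
As a rough illustration of what such an audit trail might record, the sketch below logs each search with a timestamp, the person who ran it, the query text, and a fingerprint (a SHA-256 hash) of the result set, so that it is possible to show later which results a given query produced at a given time. The field names, the query syntax and the document IDs are hypothetical, and a real system would also need tamper-resistant storage, which is not shown here.

```python
import hashlib
import json
from datetime import datetime, timezone


def log_search(log: list[dict], query: str, result_ids: list[str], searcher: str) -> dict:
    """Append one audit entry describing a search and its results."""
    ordered = sorted(result_ids)  # sort so the same set always hashes the same
    digest = hashlib.sha256("\n".join(ordered).encode("utf-8")).hexdigest()
    entry = {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "searcher": searcher,
        "query": query,
        "result_count": len(ordered),
        "result_digest": digest,
    }
    log.append(entry)
    return entry


# Hypothetical usage: one search run by one reviewer.
audit_log: list[dict] = []
first = log_search(
    audit_log,
    query="receivables AND restate*",  # illustrative query syntax only
    result_ids=["DOC-0042", "DOC-0007"],
    searcher="reviewer_a",
)
print(json.dumps(first, indent=2))
```

Hashing the sorted list of document IDs means that two runs returning the same documents produce the same fingerprint, while any change in the results is visible in the log.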

These differences have a number of implications for e-discovery search best practices, training, software and more. I will discuss these in more detail in future posts. However, I think these differences make clear why Craig and Ron are right to suggest that people who are new to e-discovery can benefit from specialized training and tools. Similarly, for those of us who are deeply involved in e-discovery, I believe these differences show that there is still a lot of work to be done in developing best practices and software to make it easier for lawyers and other users to perform e-discovery searches effectively.

[1] Ron also wrote another interesting post on this topic, which can be found at PrismLegal.com.