
Deciding when to use technology-assisted review

Amid the increasing use of artificial intelligence, I am frequently asked about predictive coding, a form of technology-assisted review (TAR), and its applications: when to use it, what value it offers as a tool, and what pitfalls it may present, among other questions. Simply put, predictive coding is a search and review tool that uses AI to identify responsive documents based on human inputs and decisions. Predictive coding, or TAR, is most frequently used during the search and review phase of a case to find responsive documents within a larger set (typically, and ideally, a much larger set). According to TAR expert Dr. Maura Grossman, TAR is most aptly described as:

A process for prioritizing or categorizing an entire collection of documents using computer technologies that harness human judgments of one or more subject matter expert(s) on a small subset of the documents, and then extrapolate those judgments to the remaining documents in the collection.1

Predictive coding is not the only form of TAR; knowledge engineering, or the “rule-based” approach, is another. Dr. Grossman describes predictive coding, which relies on machine learning, as requiring a training set of documents marked responsive or non-responsive to teach the algorithm to distinguish between the two. These training sets can vary substantially depending on the provider and how the TAR process is executed.
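For readers curious about the mechanics, the machine-learning approach can be illustrated with a toy sketch. This is an illustration only, not any vendor’s actual algorithm: a simple naive Bayes text classifier is trained on a handful of expert-labeled documents and then extrapolates those judgments to an unreviewed document, just as the Grossman definition describes at much larger scale.

```python
from collections import Counter
import math

# Toy "training set": a subject-matter expert labels a small subset of the
# collection as responsive or non-responsive. (Invented example documents.)
training_set = [
    ("merger agreement draft attached for review", "responsive"),
    ("acquisition terms and closing schedule", "responsive"),
    ("lunch order for the office party", "non-responsive"),
    ("parking garage pass renewal notice", "non-responsive"),
]

def train(labeled_docs):
    # Count how often each word appears under each label.
    counts = {"responsive": Counter(), "non-responsive": Counter()}
    totals = Counter()
    for text, label in labeled_docs:
        for word in text.lower().split():
            counts[label][word] += 1
            totals[label] += 1
    return counts, totals

def score(text, counts, totals):
    # Compare log-probabilities under each label, with add-one smoothing
    # so unseen words do not zero out a score.
    vocab = set(counts["responsive"]) | set(counts["non-responsive"])
    results = {}
    for label in counts:
        logp = 0.0
        for word in text.lower().split():
            logp += math.log(
                (counts[label][word] + 1) / (totals[label] + len(vocab))
            )
        results[label] = logp
    return max(results, key=results.get)

counts, totals = train(training_set)
# The model extrapolates the expert's judgments to an unreviewed document:
print(score("please review the attached merger closing terms", counts, totals))
# → responsive
```

In a real TAR workflow the model ranks or categorizes millions of documents this way, with the highest-scoring ones routed to human reviewers first; commercial tools use far more sophisticated models, but the train-then-extrapolate structure is the same.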

In recent years, TAR has become a widely accepted method. In Da Silva Moore v. Publicis Groupe, 287 F.R.D. 182 (S.D.N.Y. 2012), Hon. Andrew J. Peck wrote, “This judicial opinion now recognizes that computer-assisted review is an acceptable way to search for relevant ESI in appropriate cases.” Global Aerospace, Inc. v. Landow Aviation, L.P., No. CL 61040 (Va. Cir. Ct. 2012), was the first state court case to permit TAR; the resulting production went undisputed. Since 2012, TAR’s reputation as a valuable ESI review tool has grown.

Predictive coding can be a practical, cost-effective, and time-saving tool—when it’s used correctly in appropriate cases. Several factors deserve consideration in deciding whether TAR is the right approach, but I will examine a few key questions. First, how many documents or text files are up for review? In Global Aerospace, Inc. v. Landow Aviation, L.P., two million documents were collected.2 In that instance, manual review would have been cost-prohibitive and extraordinarily time-consuming. Imagine attempting to read and identify responsive items out of a population of two million documents! Leveraging machine learning while limiting (though not completely eliminating) human intervention was reasonable given the sheer number of documents under review. Second, from how many custodians will data be collected? When many documents and many custodians are involved, TAR is often advisable, depending on the cost structure.

But I have also seen cases in which TAR was considered or used without a truly justifiable reason. In one instance, TAR was being assessed as an option for reviewing about 160 documents collected from about five custodians. There, the cost and time needed to implement an effective predictive coding process would have outweighed the benefit of avoiding a manual relevancy review. It should also be noted that while predictive coding is useful for identifying responsive documents, emails, and many other kinds of text files, it is not appropriate for cases involving structured or non-textual ESI, such as databases, audio or video files, or certain kinds of images.

In deciding whether to use TAR, it is important to weigh the potential benefits against the overall value of the case, the budget, the kinds and amount of data involved, and the number of custodians. If TAR is ultimately determined to be the preferred choice, carefully selecting the right approach, provider, and process is critical to ensuring the best outcome. Like any technology, predictive coding and TAR tools evolve and can carry risks of their own. Balancing human involvement and oversight with the conveniences this approach affords is crucial.


MARK LANTERMAN is CTO of Computer Forensic Services. A former member of the U.S. Secret Service Electronic Crimes Task Force, Mark has 28 years of security/forensic experience and has testified in over 2,000 matters. He is a member of the MN Lawyers Professional Responsibility Board.


Notes

1 Maura R. Grossman & Gordon V. Cormack, The Grossman-Cormack Glossary of Technology-Assisted Review with Foreword by John M. Facciola, U.S. Magistrate Judge, 7 FED. COURTS L. REV. 1 (2013), http://www.fclr.org/fclr/articles/html/2010/grossman.pdf  

2 https://www.ediscoverylaw.com/files/2013/11/MemoSupportPredictiveCoding.pdf