Episode 17: Do Portable Models (Brains) Actually Work in TAR?

A number of TAR (technology-assisted review) vendors are offering what they call “portable models” for different types of cases. The claim is that the models are already trained on your topics and can therefore find relevant documents with minimal training. Do these work? At what cost or security risk? Sadly, none of the vendors are backing their claims with scientific research.

Tune in to our latest TAR Talk podcast to learn more about portable models. You may be surprised at what you learn.

Two leading TAR experts, Dr. Jeremy Pickens and Tom Gricks, Esq., recently presented a peer-reviewed paper entitled “On the Effectiveness of Portable Models versus Human Expertise under Continuous Active Learning” at an international information retrieval conference.* It compares the use of a portable model with a human-led continuous active learning training process to see which approach might be more effective.

The authors did their research using a set of about 300,000 emails that Jeb Bush made available from his two terms as governor of Florida. These emails have been used to test machine learning algorithms at several of the Text Retrieval Conferences (TREC) sponsored by the National Institute of Standards and Technology (NIST). The TREC administrators reviewed the emails and created 35 topics for testing in the legal track. Coders marked individual documents as relevant to each of the topics, thus providing a baseline for testing the effectiveness of different machine learning algorithms.

For their research, Pickens and Gricks gave the portable model several advantages over human training that rarely, if ever, obtain in real life:

They trained the model on documents from the same collection they tested on, splitting the collection randomly in two so that the training set would not overlap with the test set.
Human training was limited to a half hour per topic. None of the individuals involved had prior knowledge of the topic.
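
For readers who want to picture the random split described above, here is a minimal sketch in Python. It is our own illustration, not the authors' code: the function name split_collection, the fixed seed, and the made-up document IDs are all assumptions, and the paper should be consulted for the actual procedure.

```python
import random


def split_collection(doc_ids, seed=0):
    """Randomly split document IDs into two disjoint halves.

    Hypothetical helper for illustration only; the paper's actual
    splitting procedure may differ.
    """
    shuffled = list(doc_ids)               # copy so the caller's list is untouched
    random.Random(seed).shuffle(shuffled)  # seeded shuffle for repeatability
    midpoint = len(shuffled) // 2
    return shuffled[:midpoint], shuffled[midpoint:]


# Made-up IDs standing in for the email collection.
doc_ids = [f"email-{i:06d}" for i in range(1000)]
train_ids, test_ids = split_collection(doc_ids)

# No document appears in both halves, and none is lost.
assert set(train_ids).isdisjoint(test_ids)
assert len(train_ids) + len(test_ids) == len(doc_ids)
```

Because both halves come from the same collection, a model trained on one half sees documents very much like those in the other, which is the kind of advantage a portable model rarely enjoys in practice.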

They also tested against random seed training, which was typical in a TAR 1.0 process.

In our TAR Talk program, we discuss the results of these tests, which are set out in detail in Pickens and Gricks’ paper. In brief, they found that the portable model did slightly better in some cases, about the same in others, and much worse in still others. They also pointed out the security implications of sharing a model based on one set of client data with others not privy to that data. It turns out there are techniques for analyzing machine learning models that can allow a sophisticated voyeur to deduce the data that trained the original model.

Our hope is that vendors will test their models and share the results so legal professionals can determine whether the models provide value or perhaps pose a security risk. We certainly found this a fascinating topic, one to which we will return as more data surfaces.

*Proceedings of the Second International Workshop on AI and Intelligent Assistance for Legal Professionals in the Digital Workplace (LegalAIIA 2021), held in conjunction with ICAIL 2021. June 21, 2021. São Paulo, Brazil. Copyright ©2021 for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0). Published at http://ceur-ws.org.