Building a Better Training Protocol for AI Review
How ReviewPartner turns protocol gaps into structured questions for the review manager
By John Tredennick
Every document review starts with a training protocol. The review manager uses it to teach reviewers about the case, the legal issues, the key people, entities and time periods, and what is being requested for production.
For an AI review, the training protocol is a set of instructions (a prompt) that an AI reviewer applies to every document in the collection. Get it right and the review works. Get it wrong, and every document in the collection pays the price.
The problem is that the training protocol is usually written before anyone has worked through the documents. Legal teams do their best, but every collection has its own language and contains documents no one could have anticipated. The real questions surface once reviewers begin work, and by then the protocol is already in use. Changing it means retraining the team and rechecking earlier calls.
Generic AI review makes this worse, not better. To be sure, the AI will apply whatever criteria it is given, faster and more consistently than a human. But when the protocol is wrong, the wrong rules will be applied to every document in the collection. The question is not whether AI can read and classify documents. The question is whether the criteria it is applying have been tested against the documents being reviewed.
A Different Approach
ReviewPartner starts with an initial training protocol, like every other review. The difference is what happens next.
Sampling
Before the AI reviewers are turned loose on the full collection, ReviewPartner samples and analyzes hundreds of documents against the protocol instructions. The sample is built to include clearly responsive documents, clearly non-responsive documents, and the gray-zone documents in between, where the protocol is most likely to be tested.
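The article does not say how ReviewPartner builds its sample, so the following is only a minimal sketch of one way a three-band sample could be drawn. It assumes each document already carries a preliminary relevance score between 0 and 1. The function names, the 0.3 and 0.7 thresholds, and the document IDs are all hypothetical.

```python
import random
from collections import defaultdict


def band(score, low=0.3, high=0.7):
    """Map a preliminary score in [0, 1] to a band (thresholds are assumptions)."""
    if score < low:
        return "non_responsive"
    if score > high:
        return "responsive"
    return "gray_zone"


def stratified_sample(scored_docs, per_band, seed=0):
    """Draw up to `per_band` document IDs from each band.

    scored_docs: iterable of (doc_id, score) pairs.
    """
    rng = random.Random(seed)  # fixed seed so the sample is reproducible
    groups = defaultdict(list)
    for doc_id, score in scored_docs:
        groups[band(score)].append(doc_id)

    sample = {}
    for name in ("responsive", "gray_zone", "non_responsive"):
        docs = groups.get(name, [])
        # Never ask for more documents than the band actually holds.
        sample[name] = rng.sample(docs, min(per_band, len(docs)))
    return sample


# Hypothetical collection of 1,000 documents with evenly spread scores.
docs = [(f"DOC-{i:04d}", i / 999) for i in range(1000)]
sample = stratified_sample(docs, per_band=100)
print({name: len(ids) for name, ids in sample.items()})
```

Sampling each band separately keeps gray-zone documents from being crowded out by the clearly non-responsive documents that tend to dominate a collection.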
Review Reports
For each sampled document, ReviewPartner provides a relevance score, a document summary, responsiveness comments, and a request for clarification where the training protocol is ambiguous or hard to apply.
As an example, here is a report on a document sampled from a hypothetical matter involving Jeb Bush and the Florida hurricane insurance crisis.
ReviewPartner provides this level of analysis on each document it reviews.
Responsiveness Assessment Report
After finishing the sample set, ReviewPartner analyzes the results and creates a responsiveness assessment for the review manager. It typically starts with an overview of its findings.
The report then turns to definitional deficiencies that should be addressed before the review can proceed.
Each section of the report includes links to representative documents along with their relevance scores.
Refinement Questions
Based on the issues identified in its report, ReviewPartner generates a set of questions for the review manager and the legal team to answer. Each is built around a specific protocol weakness, with the analytical work already done. The system asks focused questions anchored in documents from the sample. The review manager is not being asked to figure out what is wrong. The review manager is given options to fix it.
A typical question describes the ambiguity in plain language, shows which part of the existing protocol is unclear, links to the specific documents that illustrate the issue, and offers pre-formulated answer options representing the reasonable interpretations the system identified.
Here is an example from our hypothetical review project:
The review manager can pick from the choices offered or insert their own answer. The pre-formulated options are not a substitute for judgment. They are a starting point that lets the review manager move quickly to the substantive call.
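To make the parts of a refinement question concrete, here is a small illustrative data structure. It is not ReviewPartner's actual schema. The class name, field names, and sample values are assumptions chosen to mirror the elements described above: the ambiguity, the unclear protocol text, the linked documents, the offered options, and the manager's choice or custom answer.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class RefinementQuestion:
    """Hypothetical record for one protocol question (not the product's schema)."""
    ambiguity: str               # plain-language description of the problem
    protocol_excerpt: str        # the part of the protocol that is unclear
    example_doc_ids: list[str]   # sampled documents that illustrate the issue
    options: list[str]           # pre-formulated interpretations
    selected_option: Optional[int] = None
    custom_answer: Optional[str] = None

    def answer(self, option=None, custom=None):
        """Record either a chosen option index or a free-text answer, not both."""
        if (option is None) == (custom is None):
            raise ValueError("provide exactly one of option or custom")
        if option is not None and not 0 <= option < len(self.options):
            raise IndexError("option index out of range")
        self.selected_option = option
        self.custom_answer = custom

    def resolution(self):
        """Return the text that would feed the protocol revision, or None."""
        if self.custom_answer is not None:
            return self.custom_answer
        if self.selected_option is not None:
            return self.options[self.selected_option]
        return None


# Invented placeholder values, for illustration only.
q = RefinementQuestion(
    ambiguity="The protocol does not say whether draft documents count.",
    protocol_excerpt="Produce all documents concerning the subject matter.",
    example_doc_ids=["DOC-0012", "DOC-0348"],
    options=["Treat drafts as responsive", "Treat only final versions as responsive"],
)
q.answer(option=1)
print(q.resolution())
```

Keeping the custom answer alongside the offered options reflects the point that the options are a starting point rather than a substitute for judgment.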
These are the questions that surface late in a traditional review and get resolved inconsistently across a large reviewer team. In ReviewPartner, they surface before full review begins and get resolved once, by the people who should be making the call.
Refining the Training Protocol
Once the review manager and the legal team answer the questions, ReviewPartner again takes the lead. It uses the answers, backed by its analysis, to refine the training protocol. It presents a revised protocol to the review manager, with the option to view it in redline format or edit directly.
Testing the Refined Protocol
The next step is to test the refined protocol against the sampled documents to confirm the changes. ReviewPartner provides a new report and, if appropriate, asks further questions to clarify the protocol. The cycle typically repeats two or three times. With each pass, the number of uncertain determinations decreases and the protocol becomes more precise. The cycle ends when additional rounds produce diminishing improvement and the protocol converges.
At this point, the protocol is ready for validation. It has been refined against the documents and proven through repeated engagement with the actual collection. It is no longer a starting point. It is a finished training protocol built specifically for the matter.
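The stopping rule for the refinement cycle, ending when further rounds produce diminishing improvement, can be sketched as a simple check on the number of uncertain determinations per round. The article gives no numeric threshold, so the 10 percent cutoff, the function name, and the counts below are assumptions made purely for illustration.

```python
def has_converged(uncertain_counts, min_relative_gain=0.10):
    """Return True when the latest round cut uncertain calls by less than
    `min_relative_gain` relative to the previous round (threshold is assumed)."""
    if len(uncertain_counts) < 2:
        return False  # need at least two rounds to measure improvement
    previous, latest = uncertain_counts[-2], uncertain_counts[-1]
    if previous == 0:
        return True  # nothing left to resolve
    return (previous - latest) / previous < min_relative_gain


# Hypothetical counts of uncertain determinations after each testing round.
history = []
for count in [120, 45, 18, 17]:
    history.append(count)
    print(len(history), count, has_converged(history))
```

With these invented counts the check stays false through the third round and turns true on the fourth, where the improvement drops from 18 to 17.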
Validation Confirms the Protocol Works
Once the training protocol is ready, ReviewPartner runs a formal validation step against a fresh sample, typically around 1,000 documents drawn from the collection. The AI analyzes each document and produces a determination with reasoning. A human reviewer, typically the review manager or another senior reviewer, reads each document and either agrees or disagrees with the AI’s call.
The validation process produces accuracy figures the review manager can act on: overall agreement between the human reviewer and the AI, plus precision and recall against the validation sample. If the validation results are satisfactory, the review proceeds to full scale. If not, the review manager runs another refinement cycle, repeating until the protocol meets the requirements of the matter.
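The three validation figures follow the standard definitions. The sketch below treats the human reviewer's call as the reference, which is an assumption about how the comparison is set up. The function name and the sample counts are hypothetical and are not results from any real review.

```python
def validation_metrics(pairs):
    """Compute agreement, precision, and recall.

    pairs: iterable of (ai_responsive, human_responsive) booleans,
    with the human call treated as the reference.
    """
    tp = fp = fn = tn = 0
    for ai, human in pairs:
        if ai and human:
            tp += 1   # both say responsive
        elif ai and not human:
            fp += 1   # AI says responsive, human disagrees
        elif not ai and human:
            fn += 1   # AI missed a responsive document
        else:
            tn += 1   # both say non-responsive

    total = tp + fp + fn + tn
    return {
        "agreement": (tp + tn) / total if total else 0.0,
        "precision": tp / (tp + fp) if (tp + fp) else 0.0,
        "recall": tp / (tp + fn) if (tp + fn) else 0.0,
    }


# Hypothetical 1,000-document validation sample.
pairs = (
    [(True, True)] * 180
    + [(True, False)] * 20
    + [(False, True)] * 20
    + [(False, False)] * 780
)
print(validation_metrics(pairs))
```

For these invented counts, agreement is 960 of 1,000 (0.96), and precision and recall are each 180 of 200 (0.90). Reporting precision and recall separately matters because overall agreement can look high in a collection where most documents are non-responsive.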
What This Means for the Review
The question-and-answer process is what makes everything else work. Without it, AI document review is just a faster way to apply whatever criteria the legal team drafted at the start of the matter. With it, the training protocol gets built against the actual documents, refined in response to real ambiguities, and validated before a single production document is coded.
Human judgment is concentrated where it matters most. The review manager and the trial team are not supervising a large reviewer team. They are answering the questions that determine what the criteria should be. Everything downstream, the application of those criteria across hundreds of thousands of documents, happens with the speed and consistency that AI provides.
The questions are not a feature of the platform. They are the mechanism through which a better training protocol gets built, and a better training protocol is what makes a better review.
About the Author
John Tredennick (jt@merlin.tech) is CEO and Founder of Merlin Search Technologies, a company pioneering AI-powered document intelligence for legal professionals. A former trial lawyer and founder of Catalyst Repository Systems, he is recognized by the American Lawyer as a top six ediscovery pioneer and has been involved in legal technology and document review for more than 30 years.