Eval Site
Your guide to the DiscoveryPartner evaluation site.

Welcome to the DiscoveryPartner Evaluation Site
DiscoveryPartner is a full-featured, automated discovery workflow platform designed to help manage complex investigations and e-discovery workflows. It automates critical stages of the Electronic Discovery Reference Model (EDRM), from data processing and loading through search, analysis, review and production.
At the heart of DiscoveryPartner is Sherlock AI, our powerful machine learning and generative AI engine. Sherlock AI transforms how legal professionals interact with vast document collections, making the process of finding, summarizing, and analyzing relevant information faster and more intuitive than ever before.
Key features of DiscoveryPartner include:
Advanced Search Capabilities: Combine traditional keyword search with AI-powered algorithms to quickly identify relevant documents, even in large, complex datasets.
Intelligent Document Analysis: Leverage generative AI to automatically summarize documents, extract key information, and identify critical patterns across your document set.
Interactive Visualization: Gain insights through dynamic timelines, charts, and word clouds that help you understand the big picture of your case.
Automated Reporting: Generate comprehensive investigation reports, witness kits, and complex analyses in seconds, saving countless hours of manual work.
Multilingual Support: Analyze documents in multiple languages without the need for manual translation.
Customizable Workflows: Create tailored review processes that fit your specific case needs, from linear review to continuous active learning (CAL).
Secure Cloud Infrastructure: Benefit from our unique single-tenant architecture that ensures your data is never commingled with other clients’ information.
Cost-Effective “On/Off” Cloud Utility Pricing: Save up to 60% on hosting costs by turning your site off when not in use, aligning with both budget constraints and sustainability goals.
During this evaluation, we invite you to explore how DiscoveryPartner can revolutionize your discovery and investigation processes. Whether you’re handling complex litigation, conducting an internal investigation, or managing regulatory compliance, DiscoveryPartner provides the tools you need to uncover critical insights quickly and accurately.
We encourage you to test drive our platform using the sample datasets provided, which include both the Jeb Bush email collection and the Enron dataset. These real-world document sets will allow you to experience firsthand the power and efficiency of Sherlock AI in action.
Ready to transform your discovery workflow? Let’s get started with DiscoveryPartner.
How DiscoveryPartner Works
Although DiscoveryPartner is a powerful investigation platform, we have worked hard to make it as easy to use and intuitive as possible. Here is a quick overview of the basic search, analyze and review capabilities of the platform, with a focus on our cutting-edge Generative AI features.
Keyword Search Capabilities: Start your investigation with our advanced search functionality. Enter keywords or phrases using AND, OR, or NOT connectors. Use the Field wizard to add criteria such as date range, custodian, sender, recipient, or subject. The Tag or Folder Search Wizard allows you to filter on tags or search specific folders.
For detailed guidance, refer to our one-page keyword search guide.
AI-Integrated Search: While DiscoveryPartner supports extensive Boolean search capabilities, you can also leverage Sherlock AI’s freeform search to find relevant information without struggling to build complex keyword searches. Simply enclose your search statement with [square brackets], sort by Relevance (high to low), and choose the number of results to return. Our search engine ignores syntax and punctuation as it strives to bring back the most relevant documents.
Read more about Freeform Search here.
Using Sherlock AI to Find More Relevant Documents: Once you’ve identified one or more relevant documents, send them to Sherlock AI to find even more. When you find a relevant document, simply click the “Send To Sherlock” icon. Within milliseconds, Sherlock analyzes your document, builds a machine learning model, and applies it to the entire document set. It then presents new potentially relevant documents for your review.
Read more about using Sherlock AI here.
Interactive Learning: As you review documents, use the “Thumbs Up” or “Thumbs Down” buttons to provide feedback. Sherlock learns from each interaction, continuously refining its understanding of relevance. This process works similarly to Pandora Internet Radio, but instead of finding great music, Sherlock finds great documents.
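Sherlock AI’s actual model is not described in this guide. As a rough illustration of the general idea behind thumbs-up and thumbs-down feedback, the following toy sketch uses a Rocchio-style term-weight profile: words from liked documents raise a term’s weight, words from disliked documents lower it, and the remaining documents are ranked by similarity to that profile. Every name here (`term_vector`, `build_profile`, `rank`, the `penalty` value) is hypothetical and chosen only for this example.

```python
import math
import re
from collections import Counter


def term_vector(text):
    """Lowercased word counts for one document."""
    return Counter(re.findall(r"[a-z0-9']+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse term-weight vectors."""
    dot = sum(weight * b[term] for term, weight in a.items() if term in b)
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def build_profile(liked, disliked, penalty=0.5):
    """Add term counts from liked documents; subtract a fraction for disliked ones."""
    profile = Counter()
    for text in liked:
        profile.update(term_vector(text))
    for text in disliked:
        for term, count in term_vector(text).items():
            profile[term] -= penalty * count
    # Keep only terms that still carry positive weight (an assumption of this sketch).
    return Counter({term: w for term, w in profile.items() if w > 0})


def rank(documents, liked, disliked=()):
    """Score unreviewed documents against the feedback profile, best first."""
    profile = build_profile(liked, disliked)
    scored = [
        (cosine(profile, term_vector(doc)), doc)
        for doc in documents
        if doc not in liked and doc not in disliked
    ]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)
```

Calling `rank` again after each new thumbs-up or thumbs-down rebuilds the profile, which is the sense in which a loop like this "learns" from every interaction.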
GenAI Analysis: Consider copying the top relevant documents to an Analyze folder. DiscoveryPartner harnesses the power of Generative AI to transform how you interact with your document set. Our GenAI capabilities allow you to:
Ask questions across hundreds of transcripts, thousands of pages of medical reports, and months’ worth of chat/instant messaging discussions, receiving concise answers rather than just search hits.
Generate comprehensive timelines, investigation reports, and witness kits in seconds.
Synthesize information across multiple documents, providing detailed analysis with links to original sources.
Automatically summarize key documents or sets of documents, saving hours of manual review time.
You can read more about using Sherlock GenAI here.
Note: If you don’t see the Analyze folders, we may not have enabled Sherlock GenAI for your site. If you want to try it, speak to your Merlin representative about turning this functionality on.
Multilingual Analysis: Sherlock AI and our Generative AI features work across multiple languages, allowing you to analyze, summarize, and extract insights from multilingual document sets without the need for manual translation.
Production and Export: Once your review is complete, DiscoveryPartner streamlines the production process, allowing for native, image, and text productions of various file types including emails, instant messages, documents, and media files.
Throughout your investigation, you can leverage our unique “On/Off” Cloud Utility Pricing. This feature allows you to turn off the site when not in use, significantly reducing hosting costs while supporting green computing initiatives.
DiscoveryPartner represents the next generation of legal technology, seamlessly blending traditional search methods with advanced AI and GenAI capabilities. By combining the familiarity of keyword search with the power of machine learning and Generative AI, we provide a faster, more intuitive, and more cost-effective way to find, analyze, and leverage relevant information in large document sets. Experience the future of legal discovery and investigations with DiscoveryPartner.
For more detailed information on specific features, please refer to our dedicated guides on Freeform Search, using Sherlock AI, and Sherlock GenAI capabilities.
About the Eval Site
The site contains more than 2.7 million documents from two different collections. The first consists of about two million documents from Governor Jeb Bush’s archives during his two terms as governor of Florida (1999 to 2007). The second, about 700,000 documents, is the native format version of the EDRM Enron V2 test collection. Let’s explore each of these collections in more detail.
The Jeb Bush Collection
The site contains a collection of about 1.9 million emails taken from Governor Jeb Bush’s archives during his two terms as governor of Florida (1999 to 2007). The emails reflect communications to and from Governor Bush and others in his administration. Not surprisingly, they contain plenty of information about issues that his administration faced during his years in office, such as a bid for the 2000 Olympics, the Bush v. Gore election, political battles over keeping Elian Gonzalez in the United States or sending him back to Cuba, and a whole lot more.
A smaller set of these emails (about 290,000) was used as a basis for testing AI engines for the Total Recall Track during the 2016 TREC (Text Retrieval Conference) program, which is sponsored by NIST (National Institute of Standards and Technology). The program brought together academics and legal and AI professionals from around the world who wanted to test their algorithms and search techniques in a controlled environment against other algorithms and approaches.
The test used a locked-down server to record each participant’s progress in finding relevant documents pertaining to any one of 34 separate topics. The goal was to see which methods or algorithms are the most effective at finding all of the relevant documents in the population. You can read about the participants and their successes in the final report from the 2016 conference here.
The Jeb Bush Topics
Participants in the TREC Total Recall Track used different machine learning methods to find all of the relevant documents relating to these topics. Gordon Cormack led a team of reviewers who went through the Jeb Bush collection to identify these topics.
When testing Sherlock AI-powered search capabilities, try any of these topics. With Freeform Search, simply paste in the topic statement enclosed in [square brackets]. Remember to set the sort option to Relevance.
Bacardi Trademark Lobbying — Documents related to the Jeb Bush administration’s involvement in a trademark dispute between Bacardi and the U.S. Patent and Trademark Office.
Bottled Water — All documents concerning the extraction of water in Florida for bottling by commercial enterprises.
Summer Olympics — All documents concerning a bid to host the Summer Olympic Games in Florida.
Save the Manatee — All documents about this program, but not about Manatee County itself.
Space — All documents concerning the space industry, the space program, space travel (whether manned or unmanned, public or private), and the study or exploration of space in Florida.
Eminent Domain — All documents concerning the legality or morality of expropriating land in Florida for commercial development.
Felon Disenfranchisement — All documents concerning the right of felons to vote in Florida, including but not limited to voter purges and reinstatement of voter rights. Individual clemency cases in Florida are not relevant.
Faith-Based Initiatives — All documents concerning grants or other initiatives in Florida to offload social services to so-called faith-based agencies. Services include but are not limited to education, prisons, and emergency relief.
Invasive Species — All documents concerning the problem of invasive species in Florida, that is, non-native plants or animals that threaten the Florida ecosystem.
Climate Change — All documents concerning climate change, global warming, or carbon emissions, whether in Florida or otherwise.
Condominiums — All documents concerning the rules and organizations governing Florida condominium associations and conflicts between owners and managers in Florida. Relevant documents include those concerning the establishment of the Florida office of ombudsman, and issues relating to hiring and firing the ombudsman.
“Stand Your Ground” — All documents concerning a Florida bill permitting the use of deadly force to protect one’s self or one’s property.
2000 Recount — All documents concerning the contested result of the 2000 presidential election.
James V. Crosby — All documents concerning James V. Crosby, including but not limited to his relationship with Governor Bush before being appointed as Florida Secretary of Corrections, his role as Secretary, his firing, and any criminal allegations against Mr. Crosby.
Medicaid Reform — All documents concerning efforts to reform Medicaid.
Marketing — All documents concerning advertising or marketing efforts undertaken by the Florida Governor’s office or any other institution of the State of Florida.
Lost Foster Child Rilya Wilson — All documents concerning the disappearance of lost foster child, Rilya Wilson, and the impact or aftermath in Florida resulting from the loss.
Billboards — All documents concerning rights and control of billboards in Florida. Different legislative efforts should be considered to be separate sub-categories.
Traffic Cameras — All documents involving discussions of the use of unattended cameras to enforce traffic laws in Florida.
Non-Resident Aliens (NRA) — All documents involving discussions of the non-resident alien issue. Documents concerning the National Rifle Association are not relevant.
National Rifle Association (NRA) — All documents concerning the National Rifle Association, its members, and its influences. Documents concerning the non-resident alien issue are not relevant.
Gulf Drilling — All documents involving discussions of off-shore drilling for oil or gas. Drilling of wells for water is not relevant.
Civil Rights Act of 2003 — All documents involving discussions of the Florida Civil Rights Act of 2003.
Jeffrey Goldhagen — All documents related to Jeffrey Goldhagen’s role in the Bush administration, his firing, and reinstatement.
Slot Machines — All documents concerning the definition, legality, and licensing of “slot machines” in Florida.
New Stadiums and Arenas — All documents involving discussions of the construction of new sports stadiums or arenas in Florida.
Cuban Child, Elian Gonzalez — All documents involving discussions of the Cuban child, Elian Gonzalez, and his whereabouts or status.
Restraints and Helmets — All documents involving discussions of seat belts, child seats, and helmet mandates.
Gay Adoption — All documents involving discussions of the gay adoption issue in Florida.
Abstinence — All discussions of abstinence and abstinence-only programs in Florida to supplant birth control or sex education.
For those with Analyze privileges, take the top 100 most relevant results and copy them to an Analyze folder. You can then explore the document contents and have Sherlock GenAI answer questions and prepare reports.
The Enron Collection
The Eval site also has about 750,000 Enron documents, including email, Word documents, spreadsheets and PowerPoint presentations. Before Bush released his emails, the TREC conference used the Enron documents for AI testing.
In this case, we loaded the native format version of the EDRM Enron V2 test collection. The EDRM Enron V2 collection was first used in the 2010 TREC Legal Track. It was derived from the EDRM Enron Dataset V2 prepared by ZL Technologies in consultation with the Legal Track coordinators, and hosted by EDRM.
ZL acquired the full collection of 1.3 million Enron email messages from Lockheed Martin (formerly Aspen Systems), which captured and maintained the dataset on behalf of FERC. After deduplication, the collection came to 455,499 messages plus 230,143 attachments.
Search Exercises for the Enron Collection
These topics were taken from those used at TREC in 2009 and 2010. They all relate to the Enron collection.
All documents relating to the Company’s engagement in structured commodity transactions known as “prepay transactions.”
All documents relating to the Company’s engagement in transactions that the Company characterized as compliant with FAS 140 (or its predecessor FAS 125).
All documents relating to whether the Company had met, or could, would, or might meet its financial forecasts, models, projections, or plans at any time after January 1, 1999.
All documents relating to any intentions, plans, efforts, or activities involving the alteration, destruction, retention, lack of retention, deletion, or shredding of documents or other evidence, whether in hard‐copy or electronic form.
All documents relating to energy schedules and bids, including but not limited to, estimates, forecasts, descriptions, characterizations, analyses, evaluations, projections, plans, and reports on the volume(s) or geographic location(s) of energy loads.
All documents relating to any discussions, communications, or contacts with financial analysts, or with the firms that employ them, regarding (i) the Company’s financial condition, (ii) analysts’ coverage of the Company and/or its financial condition, (iii) analysts’ rating of the Company’s stock, or (iv) the impact of an analyst’s coverage of the Company on the business relationship between the Company and the firm that employs the analyst.
All documents relating to fantasy football, gambling on football, and related activities, including but not limited to, football teams, football players, football games, football statistics, and football performance.
All documents relating to onshore or offshore oil and gas drilling or extraction activities, whether past, present or future, actual, anticipated, possible or potential, including, but not limited to, all business and other plans relating thereto, all anticipated revenues therefrom, and all risk calculations or risk management analyses in connection therewith.
All documents relating to actual, anticipated, possible or potential responses to oil and gas spills, blowouts or releases, or pipeline eruptions, whether past, present or future, including, but not limited to, any assessment, evaluation, remediation or repair activities, contingency plans and/or environmental disaster, recovery or clean-up efforts.
All documents relating to activities, plans or efforts (whether past, present or future) aimed, intended or directed at lobbying public or other officials regarding any actual, pending, anticipated, possible or potential legislation, including but not limited to, activities aimed, intended or directed at influencing or affecting any actual, pending, anticipated, possible or potential rule, regulation, standard, policy, law or amendment thereto.
All documents relating to the design, development, operation, or marketing of Enron Online, or any other online service offered, provided, or used by the Company (or any of its subsidiaries, predecessors, or successors-in-interest), for the purchase, sale, trading, or exchange of financial or other instruments or products, including but not limited to, derivative instruments, commodities, futures, and swaps.
All documents relating to whether the purchase, sale, trading, or exchange of over-the-counter derivatives, or any other actual or contemplated financial instruments or products, is, was, would be, or will be legal or illegal, or permitted or prohibited, under any existing or proposed rules, regulations, laws, standards, or other proscriptions, whether domestic or foreign.
All documents relating to the environmental impact of any activity or activities undertaken by the Company, including any measures taken to conform to, comply with, avoid, circumvent, or influence any existing or proposed rule(s), regulations, laws, standards, or other proscriptions, such as those governing environmental emissions, spills, pollution, noise, and/or animal habitats.
When using Sherlock AI for these topics, you may want to remove some of the non-substantive words in the request simply to avoid sending the AI engine in the wrong direction. This aspect of our machine learning engine is focused on keyword frequency.
Here is an example of a pruned Freeform Search using the last Enron topic.
[environmental impact activities undertaken Enron measures conform to, comply with, avoid, circumvent, influence existing proposed rule(s), regulations, laws, standards proscriptions, governing environmental emissions, spills, pollution, noise, animal habitats]
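If you want to prune several topics the same way, a small script can do the first pass for you. The following is a minimal sketch, assuming a hand-picked list of non-substantive words; the list, the `prune_topic` name, and the tokenizing rule are all hypothetical and are not part of DiscoveryPartner. It drops the listed words and punctuation and wraps the result in the [square brackets] that Freeform Search expects. Review the output before searching, since a fixed word list will not always match the judgment used in the hand-pruned example above.

```python
import re

# Hypothetical list of non-substantive words; extend or trim it for your own requests.
NON_SUBSTANTIVE = {
    "a", "all", "an", "and", "any", "by", "concerning", "documents",
    "for", "in", "of", "on", "or", "relating", "the", "to",
}


def prune_topic(statement, stopwords=NON_SUBSTANTIVE):
    """Drop non-substantive words and punctuation, then wrap in square brackets."""
    words = re.findall(r"[A-Za-z0-9'’-]+", statement)
    kept = [word for word in words if word.lower() not in stopwords]
    return "[" + " ".join(kept) + "]"


if __name__ == "__main__":
    topic = (
        "All documents concerning the extraction of water in Florida "
        "for bottling by commercial enterprises."
    )
    print(prune_topic(topic))
    # prints: [extraction water Florida bottling commercial enterprises]
```

Paste the printed line into the search box and sort by Relevance, just as you would with a hand-written Freeform Search.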
Thank you for taking the time to explore DiscoveryPartner. We hope this overview has given you a glimpse into how our platform can revolutionize your approach to investigations and e-discovery. As you test our system using the provided Jeb Bush and Enron datasets, we encourage you to experience firsthand the power of our integrated keyword search, Sherlock AI, and GenAI capabilities. We look forward to hearing your questions, comments, and insights about your experience with DiscoveryPartner. Our team is eager to discuss how we can further assist you with your specific investigation and discovery needs. Please don’t hesitate to reach out – we’re here to help you unlock the full potential of AI-powered legal technology in your practice.