Overview

Last updated: 2026-03-23

TAR validation uses statistical sampling to answer a simple question: is your review good enough? L1 (Control Set) samples documents tagged as responsive to measure precision and recall. L2 (Elusion Test) samples the discard pile to confirm you're meeting recall targets.

TAR Validation dialog

The dialog provides Create and Continue tabs for managing validation tests. Key settings include:

  • Select Tag
  • Validation Type
  • Confidence Level
  • Margin of Error

What is technology-assisted review validation?

Your team tags documents as responsive or not responsive, but how do you know those calls are correct? TAR validation answers that question statistically. Instead of re-reviewing every document, it draws a random sample and has qualified reviewers grade each one independently. Those grades are compared against the existing tag assignments to produce standard metrics.

Statistical sampling for TAR validation

Hintyr calculates the required sample size automatically based on the confidence level, margin of error, and total document population you specify. It draws the sample randomly from the relevant population, so the results represent the entire collection. You don't need to pick specific documents; Hintyr handles random selection for you.
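The random draw itself is ordinary sampling without replacement. A minimal Python sketch (the document IDs and sample size here are hypothetical; Hintyr performs this selection internally):

```python
import random

# Hypothetical document IDs for the population being validated
population = [f"DOC-{i:06d}" for i in range(1, 100_001)]

# Draw a uniform random sample without replacement, the same kind of
# selection Hintyr performs when picking documents for grading.
# A fixed seed is used here only to make the example reproducible.
rng = random.Random(42)
sample = rng.sample(population, k=383)

print(len(sample))        # 383 documents to grade
print(len(set(sample)))   # 383 — no duplicates, since sampling is without replacement
```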

The confidence level (typically 95%) tells you how certain you can be that sample results reflect the true population. The margin of error (typically 5%) defines the acceptable range above and below your measured value. Together, these two parameters determine how many documents you'll need to grade.
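The relationship between these parameters and the required sample size can be sketched with the standard normal-approximation formula plus a finite population correction. This is the common textbook approach, not necessarily Hintyr's exact formula:

```python
import math

# z-scores for common confidence levels (standard normal quantiles)
Z_SCORES = {0.90: 1.645, 0.95: 1.96, 0.99: 2.576}

def sample_size(confidence: float, margin_of_error: float, population: int) -> int:
    """Estimate how many documents must be graded for the given
    confidence level, margin of error, and population size."""
    z = Z_SCORES[confidence]
    # p = 0.5 maximizes variance, giving the most conservative estimate
    n0 = (z ** 2) * 0.25 / (margin_of_error ** 2)
    # finite population correction: smaller collections need smaller samples
    n = n0 / (1 + (n0 - 1) / population)
    return math.ceil(n)

print(sample_size(0.95, 0.05, 100_000))  # 383
print(sample_size(0.95, 0.05, 1_000))    # 278 — the correction matters for small populations
```

Note how the sample size stays manageable even as the population grows, which is why validation scales well to large collections.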

L1 - Control Set validation for precision and recall

The Control Set (TAR 1.0) draws a random sample from documents tagged as responsive. Each sampled document goes to a reviewer who grades it as Responsive or Not Responsive without seeing the original tag. From those grades, Hintyr calculates:

  • Precision - The share of documents tagged responsive that actually are responsive. High precision means few irrelevant documents slipped in.
  • Recall - The share of all truly responsive documents your team correctly identified. High recall means few responsive documents were missed.
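As a sketch of how these two metrics fall out of graded samples, assume each graded document is recorded as a (tagged, truth) pair, where tagged is the team's original call and truth is the reviewer's independent grade. The data shape is hypothetical; Hintyr's internal representation isn't documented here:

```python
def precision_recall(graded):
    """Compute precision and recall from (tagged_responsive, truly_responsive) pairs."""
    tp = sum(1 for tagged, truth in graded if tagged and truth)      # correct responsive calls
    fp = sum(1 for tagged, truth in graded if tagged and not truth)  # irrelevant docs that slipped in
    fn = sum(1 for tagged, truth in graded if not tagged and truth)  # responsive docs that were missed
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall

# 3 correct responsive tags, 1 false positive, 1 missed responsive document
sample = [(True, True)] * 3 + [(True, False)] + [(False, True)]
print(precision_recall(sample))  # (0.75, 0.75)
```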

L2 - Elusion Test for recall validation

The Elusion Test (TAR 2.0) draws a random sample from documents not tagged as responsive, sometimes called the discard pile. Reviewers grade each one to check whether responsive documents were missed. From those grades, Hintyr calculates:

  • Elusion rate - The share of discard-pile documents that are actually responsive. A low elusion rate means very few responsive documents were missed.
  • Recall validation - Hintyr uses the elusion rate to calculate whether your recall target has been met. If the elusion rate is low enough, the validation passes.

L2 also requires you to set a target recall (for example, 75%). Once grading is complete, Hintyr reports whether measured recall meets or exceeds that target.
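One common way to turn an elusion rate into a recall estimate is to project the sampled rate across the full discard pile. Whether Hintyr uses exactly this formula is not documented here, so treat the sketch below as an illustration of the general technique:

```python
def elusion_recall(responsive_in_sample, sample_size, discard_size,
                   tagged_responsive, target_recall):
    """Estimate recall from an elusion sample and check it against a target."""
    elusion_rate = responsive_in_sample / sample_size
    # project the sampled rate over the whole discard pile
    est_missed = elusion_rate * discard_size
    est_recall = tagged_responsive / (tagged_responsive + est_missed)
    return elusion_rate, est_recall, est_recall >= target_recall

# 2 responsive docs found in a 400-doc sample of a 50,000-doc discard pile,
# with 10,000 docs tagged responsive and a 75% recall target
rate, recall, passed = elusion_recall(2, 400, 50_000, 10_000, 0.75)
print(rate, round(recall, 3), passed)  # 0.005 0.976 True
```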

When to run a TAR validation test

TAR validation is most valuable at specific milestones during your review:

  • During review - Run an L1 Control Set periodically to monitor precision and catch inconsistencies early.
  • Before production - Run an L2 Elusion Test to confirm that recall targets have been met before producing documents to the opposing party.
  • After major changes - If you adjust review criteria, retrain reviewers, or add a large batch of new documents, run a fresh validation to confirm that quality has not degraded.
  • For court reporting - Validation metrics provide defensible evidence that your review process was thorough and statistically sound.

Frequently asked questions

What is technology-assisted review validation?
TAR validation is a statistical quality-control process that draws random document samples from your tagged populations and has human reviewers grade them. The results produce metrics such as precision, recall, and elusion rate that measure review quality. Courts have recognized these metrics as defensible evidence of review thoroughness in cases such as Da Silva Moore v. Publicis Groupe and Rio Tinto PLC v. Vale S.A.
When should I run a validation?
Run an L1 (Control Set) during or after review to measure precision. Run an L2 (Elusion Test) before production to confirm your recall target is met before producing documents under FRCP Rule 34. You can also re-validate after major changes to your review criteria or document population.
Do I need to run both L1 and L2?
No. You can run L1 only, L2 only, or both together depending on your needs. L1 focuses on precision while L2 focuses on recall validation. For a complete picture, running both is recommended.
How many documents do I need to grade?
The sample size is calculated automatically based on your confidence level, margin of error, and total document population. Typical configurations (95% confidence, 5% margin of error) result in sample sizes that are manageable even for large collections.

Related articles