E-Discovery

What Is Technology-Assisted Review?

A defensible TAR protocol has fourteen years of judicial approval behind it. GenAI document review, by contrast, does not. That gap matters when you certify a production.

Alexander Cohan, Ph.D.

Legal technology researcher and data scientist specializing in AI governance for litigation teams. Expertise in NLP and AI-assisted document review.

[Image: technology-assisted review concept, with a classifier ranking the document collection while an attorney reviews and validates the output]
"First Principles"

What Technology-Assisted Review Actually Means

Technology-assisted review is a process, not a product. The algorithm ranks documents by predicted relevance. Attorneys make the actual coding calls and audit the results. Courts that approve TAR consistently emphasize attorney involvement alongside the algorithm, not the algorithm in isolation.

That distinction matters. Manual linear review puts human eyes on every document, which is expensive, and its recall is lower than reviewers want to admit. Keyword search runs Boolean queries against text, which is cheap and misses anything written in different words. TAR sits between them. It uses statistical classification to rank a collection, then routes attorney attention to the documents most likely to be responsive. The output is still attorney work product. The machine just chose the order.

You’re already familiar with the broader e-discovery workflow. TAR fits inside the review phase of that workflow, after collection and processing. What’s worth pinning down is why courts care about the process language. Federal Rule of Civil Procedure 26(g) requires a producing party to certify that its search was reasonable. Reasonableness is a methodology question, not a tool question. A judge evaluating a TAR protocol is asking whether the steps were sound and the validation was real.

Sedona Principle 6 anchors the framing here. The producing party gets to choose its own search methodology, including TAR. That principle has held up across jurisdictions for more than a decade. It’s the framing most TAR opinions reference.

The reason this post exists: TAR has matured into a stable body of practice. GenAI document review and TAR e-discovery sit in different states of maturity. Treat them as related cousins, not the same thing.

"Predictive Coding"

TAR 1.0: Predictive Coding and the Da Silva Moore Era

TAR 1.0 follows a clean four-step pattern. Senior attorneys code a seed set of documents. The classifier trains on those decisions. Reviewers then code a control set drawn from the same population, which the system uses to measure stability. When precision and recall stop improving across training rounds, the model has reached its stabilization point and ranks the rest of the collection in one pass.
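To make the train-once-then-rank pattern concrete, here is a minimal Python sketch, assuming a TF-IDF vectorizer and a logistic-regression scorer. Commercial platforms use their own feature engineering and models; the documents and labels below are placeholders.

```python
# Minimal TAR 1.0 sketch: train once on an attorney-coded seed set,
# then rank every unreviewed document by predicted relevance.
# Illustrative only; the seed documents and labels are placeholders.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression

seed_docs = ["Draft merger term sheet attached for review.",
             "Reminder: holiday party starts at 6pm."]       # attorney-coded seed set
seed_labels = [1, 0]                                          # 1 = responsive, 0 = not
unreviewed_docs = ["Revised pricing schedule for the merger.",
                   "Lunch order for the team offsite."]       # rest of the collection

vectorizer = TfidfVectorizer()
model = LogisticRegression()
model.fit(vectorizer.fit_transform(seed_docs), seed_labels)

# One pass over the collection: score every document, review in descending order.
scores = model.predict_proba(vectorizer.transform(unreviewed_docs))[:, 1]
for score, doc in sorted(zip(scores, unreviewed_docs), key=lambda t: t[0], reverse=True):
    print(f"{score:.3f}  {doc}")
```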

That workflow drove the first generation of predictive coding deployments, and it produced the validation methodology courts are still applying today. Da Silva Moore v. Publicis Groupe, 287 F.R.D. 182 (S.D.N.Y. 2012), gave TAR its first judicial blessing and set the template for what defensible validation looks like. The protocol there used a 2,399-document validation sample drawn at 95% confidence with a 2% margin of error. Magistrate Judge Peck wrote that perfection is not required, only reasonableness. That language has been quoted in most subsequent TAR opinions. Those parameters were tied to that case’s collection size and the parties’ stipulation. Sample size scales with the corpus and the confidence level you target.
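For readers who want to see where a number like 2,399 comes from, here is a sketch of the standard sample-size formula for estimating a proportion, assuming worst-case 50% prevalence and a finite-population correction. The collection size in the example is an assumption chosen for illustration, not the actual Da Silva Moore corpus.

```python
# Sample size for a proportion estimate at a given confidence level and
# margin of error, with finite-population correction. Illustrative only.
import math
from statistics import NormalDist

def validation_sample_size(confidence: float, margin: float, population: int,
                           prevalence: float = 0.5) -> int:
    """Worst-case (p = 0.5) sample size for estimating a proportion."""
    z = NormalDist().inv_cdf(0.5 + confidence / 2)       # two-sided critical value
    n0 = (z ** 2) * prevalence * (1 - prevalence) / margin ** 2
    n = n0 / (1 + (n0 - 1) / population)                 # finite-population correction
    return math.ceil(n)

# 95% confidence, 2% margin of error, on an assumed multi-million document collection
print(validation_sample_size(0.95, 0.02, 3_000_000))
# -> roughly 2,399 under these assumptions; the figure shifts with the population size
```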

The empirical case had already been made. Maura Grossman and Gordon Cormack published their landmark study in 2011, comparing technology-assisted processes against exhaustive manual review on TREC Legal Track datasets. TAR achieved 76.7% recall and 84.7% precision. Manual review came in at 59.3% recall and 31.7% precision. Read that again. Manual review, the gold standard small firms still pay reviewers $3 per document to perform, hit barely 60% recall on responsive documents. The TREC dataset is a controlled benchmark rather than a typical production corpus. Subsequent peer-reviewed work has consistently replicated the recall advantage of statistically validated review over exhaustive manual review.

TAR 1.0 has real downsides. Seed-set composition matters enormously, and weak seeds propagate into weak classifiers. The training-then-ranking split means reviewers can’t learn from the data while they work. Disputes over seed transparency consumed entire meet-and-confers in the early years. Vendors mooted the dispute by changing the workflow.

"TAR 2.0"

Continuous Active Learning: The TAR 2.0 Turn

CAL restructured the workflow. There’s no separate training phase. Reviewers start coding documents the system has prioritized, and the model re-ranks the queue after every decision. The most likely-responsive documents bubble back to the top. Reviewers see them next. The classifier keeps learning until responsive documents stop appearing in the queue, at which point the team moves to validation sampling.
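Here is a minimal sketch of that loop, again assuming a TF-IDF plus logistic-regression scorer. The attorney_codes callback is a placeholder for real reviewer decisions (returning 1 for responsive, 0 otherwise), and production platforms handle batching, deduplication, and stopping criteria with far more care.

```python
# Minimal CAL loop sketch: no separate training phase. The model re-ranks
# the unreviewed queue after each coded batch, and review stops once
# recent batches surface nothing responsive. Illustrative only.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression

def cal_review(docs, attorney_codes, batch_size=50, stop_after_lean_batches=2):
    X = TfidfVectorizer().fit_transform(docs)
    labeled, labels = [], []                     # indices coded so far, and their codes
    unreviewed = list(range(len(docs)))
    lean_batches = 0

    while unreviewed and lean_batches < stop_after_lean_batches:
        if len(set(labels)) < 2:                 # bootstrap: no trained model yet
            batch = unreviewed[:batch_size]
        else:
            model = LogisticRegression().fit(X[labeled], labels)
            scores = model.predict_proba(X[unreviewed])[:, 1]
            order = sorted(range(len(unreviewed)), key=lambda i: scores[i], reverse=True)
            batch = [unreviewed[i] for i in order[:batch_size]]

        batch_labels = [attorney_codes(docs[i]) for i in batch]   # attorneys make the calls
        labeled += batch
        labels += batch_labels
        batch_set = set(batch)
        unreviewed = [i for i in unreviewed if i not in batch_set]
        lean_batches = lean_batches + 1 if sum(batch_labels) == 0 else 0

    return labeled, labels, unreviewed           # coded docs plus the null set to validate
```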

That sounds incremental. It wasn’t. CAL did three useful things at once. The seed-set transparency fight disappeared, because there was no fixed seed to argue about. Reviewers got a continuously improving queue, with documents arriving in priority order rather than trickling in randomly. The math also became harder to game, since recall is now measured against a held-out random sample rather than the same control set used for training.

Courts noticed quickly. Rio Tinto PLC v. Vale S.A., 306 F.R.D. 125, 127 (S.D.N.Y. 2015), called TAR’s defensibility “black letter law” by the time CAL was rolling through major matters. Judge Peck, writing in Rio Tinto, made the further point at 128 that CAL’s evolving training pool defuses the seed-set disclosure debate that had dominated TAR 1.0 disputes. Hyles v. New York City, No. 10 Civ. 3119 (AT)(AJP), 2016 WL 4077114 (S.D.N.Y. Aug. 1, 2016), came a year later and confirmed the corollary. The producing party chooses the methodology. Opposing counsel cannot force TAR over your objection, and the inverse holds when you choose to use it.

CAL doesn’t make validation optional. It shifts where validation happens. Instead of measuring stabilization on a control set during training, you sample at the end. Random samples test recall directly against the population. Elusion testing samples the documents the model classified as non-responsive and confirms that the rate of missed responsive documents stays within a tolerable bound. Those numbers go in your validation report. They are what defensibility looks like in 2026.
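Here is a sketch of what those end-of-review numbers look like, assuming an elusion sample over the null set and a normal-approximation confidence interval. The counts are invented for illustration, and formal protocols often draw a direct recall sample rather than inferring recall from elusion.

```python
# Sketch of end-of-review validation: sample the null set (documents the
# model classified as non-responsive), estimate the elusion rate with a
# confidence interval, and back into an approximate recall figure.
# All counts below are illustrative.
import math

def proportion_ci(hits: int, n: int, z: float = 1.96):
    """Point estimate and normal-approximation 95% CI for a sampled proportion."""
    p = hits / n
    half_width = z * math.sqrt(p * (1 - p) / n)
    return p, max(0.0, p - half_width), min(1.0, p + half_width)

null_set_size = 180_000       # docs the model classified as non-responsive
elusion_sample = 1_500        # random sample drawn from the null set
responsive_found = 9          # responsive docs the sample turned up
produced_responsive = 42_000  # responsive docs coded during the review

elusion, lo, hi = proportion_ci(responsive_found, elusion_sample)
missed_estimate = elusion * null_set_size
recall_estimate = produced_responsive / (produced_responsive + missed_estimate)

print(f"Elusion rate ~ {elusion:.2%} (95% CI {lo:.2%} to {hi:.2%})")
print(f"Estimated responsive documents left in the null set: ~{missed_estimate:,.0f}")
print(f"Implied recall ~ {recall_estimate:.1%}")
```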

That validation work is where agentic document review fits in. Hintyr automates the random sampling, recall estimation with confidence intervals, and elusion testing that any defensible protocol requires, so the validation report stays defensible regardless of which classifier produced the underlying ranking.

That brings us to the question every litigator is fielding from clients now. What about GenAI?

"Generative AI"

GenAI: Promising, Unproven, Distinct

LLM-based AI document review works differently from anything in the TAR family. There’s no classifier trained on attorney decisions and no fixed seed set. The training corpus isn’t assembled from the production at all. A large language model reads each document and produces a classification, often with a written rationale for the call. Vendors emphasize that rationale capability.
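As a rough sketch of that per-document call, the snippet below assumes a placeholder call_llm function and an invented JSON response shape. It is not any vendor’s API, and the attorney still verifies both the call and the rationale.

```python
# Sketch of per-document LLM classification with a written rationale.
# `call_llm` is a placeholder for whatever model endpoint a platform uses;
# the prompt wording and JSON fields are illustrative, not a vendor API.
import json

PROMPT = """You are assisting with a document responsiveness review.
Request for production: {request}

Document text:
{document}

Answer in JSON with two fields:
  "responsive": true or false
  "rationale": one or two sentences explaining the call
"""

def classify_document(document: str, request: str, call_llm) -> dict:
    raw = call_llm(PROMPT.format(request=request, document=document))
    return json.loads(raw)   # {"responsive": bool, "rationale": str}; attorney verifies both
```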

The vendor argument deserves a fair hearing. GenAI is, in one defensible reading, a new classification engine plugged into the same TAR process model courts have already approved. You still validate. The sampling looks similar, and the methodology still gets documented for the production you certify. Underneath, the classifier is a language model reading prose rather than a statistical scorer over engineered text features. The added benefit is that LLMs produce per-document rationales that an attorney can review and audit, which is structurally different from the relevance scores that TAR 1.0 and CAL output. Practitioner commentary in the Sedona Conference Journal in 2024 has worked through versions of this argument.

But here’s the gap. No widely cited peer-reviewed benchmark has been published comparing GenAI-based responsiveness review against TAR or CAL on common datasets. The TREC Legal Track studies that gave TAR its empirical foundation simply do not exist for GenAI. We are not aware of an on-point opinion, federal or state, approving or rejecting LLM-based responsiveness review over objection as of April 2026. So when something goes wrong, you carry the verification burden under Rule 11.

That’s not hypothetical. Mata v. Avianca, Inc., 678 F. Supp. 3d 443 (S.D.N.Y. 2023), and Kohls v. Ellison, 2025 WL 66514 (D. Minn. Jan. 10, 2025), both involved attorneys who relied on generative AI without verification. Mata’s lawyers got sanctioned for fabricated citations in a brief. In Kohls, an expert declaration was excluded after fabricated citations surfaced. Neither case is about document responsiveness review. Both are about the verification duty that follows the lawyer no matter what tool generated the output. ABA Formal Op. 512 (July 29, 2024), “Generative Artificial Intelligence Tools,” reaffirmed that point in clear terms.

"Side by Side"

TAR 1.0 vs. CAL vs. GenAI: Side by Side

Each methodology carries a distinct court record and a distinct validation burden. TAR 1.0 has the longest case-law track and the most rigid workflow. CAL has become the most commonly recommended production methodology among practitioners running TAR workflows in 2026, with judicial approval that builds on TAR 1.0’s foundation. GenAI document review is the new entrant, with no on-point precedent for responsiveness work and no peer-reviewed benchmarks. For a small-firm litigator weighing options on a specific matter, the practical question is rarely which methodology is “best” in the abstract. It is which methodology aligns with the volume, the budget, and the level of judicial scrutiny you expect on the back end. The cards below map the practical differences side by side, with court status and validation burden as the load-bearing distinctions, since those two columns drive most defensibility arguments at the meet-and-confer.

TAR 1.0 / Predictive Coding

Workflow: Attorneys code a seed set; the system trains on it once, then ranks every remaining document by predicted relevance.
Validation: Stabilization point determined via control sets, with a documented elusion test before production.
Best for: Large, diverse corpora where a representative seed set can be assembled before review begins.
Court status: Approved as a defensible methodology since Da Silva Moore v. Publicis Groupe, 287 F.R.D. 182 (S.D.N.Y. 2012).

CAL / Continuous Active Learning

Workflow: Reviewers code documents in real time, and the system re-ranks the queue after each decision until responsive documents stop appearing.
Validation: No discrete training phase; recall is sampled at the end, and seed-set composition becomes far less significant.
Best for: Low-prevalence collections, time-pressed reviews, and matters where proportionality anchors the cost ceiling.
Court status: Recognized as black-letter law in Rio Tinto PLC v. Vale S.A., 306 F.R.D. 125, 127 (S.D.N.Y. 2015).

GenAI / LLM-Based Review

Workflow: A large language model classifies each document directly, often with a written rationale; no seed set is required.
Validation: No published peer-reviewed benchmark against TAR or CAL on common datasets; Rule 11 verification falls fully on the attorney.
Best for: Early case assessment, prioritization, and matters where reasoning transparency is more useful than reliable recall numbers.
Court status: As of April 2026, no published U.S. federal opinion has approved or rejected LLM-based responsiveness review over objection.

"Doctrinal Anchors"

The Case Law Every Litigator Should Know

Four cases carry most of the weight. Read them in order.

Da Silva Moore v. Publicis Groupe, 287 F.R.D. 182 (S.D.N.Y. 2012), is the foundation. Magistrate Judge Peck approved a predictive coding protocol over objection and articulated the standard that has held ever since. Perfection is not required. Reasonableness is. The opinion at 191 frames the duty as a process question, which is why every defensibility argument since has focused on documenting methodology rather than proving the algorithm.

In re Biomet M2A Magnum Hip Implant Prods. Liab. Litig., 2013 WL 6405156 (N.D. Ind. Aug. 21, 2013), did the proportionality work. Plaintiffs wanted Biomet to redo its TAR review with a different methodology. The court refused. Proportionality under Rule 26 blocks compulsory redo where the original methodology was reasonable. The court also held that there’s no general obligation to disclose seed-set documents. That holding still controls in most jurisdictions when seed transparency comes up.

Rio Tinto PLC v. Vale S.A., 306 F.R.D. 125, 127 (S.D.N.Y. 2015), is the case that called TAR’s defensibility “black letter law.” It’s the line you cite when opposing counsel pretends TAR is still controversial. Judge Peck went further at 128 and noted that CAL specifically defuses the seed-set disclosure dispute, because there is no fixed seed to disclose.

Hyles, 2016 WL 4077114, at *3, confirmed Sedona Principle 6 directly. The producing party chooses methodology. The court won’t compel TAR over objection. Judge Peck added forward-looking dicta worth quoting in full: “There may come a time when TAR is so widely used that it might be unreasonable for a party to decline to use TAR. We are not there yet.” That language is dicta, not holding. Judge Peck did not require TAR. He observed that the day might come when refusing TAR becomes unreasonable.

One caveat. These cases approve TAR but do not require it, and none of them resolves GenAI’s defensibility question. The pattern of judicial reasoning here, particularly Rule 26(b)(2)(C) proportionality, gives a roadmap for how courts may eventually approach LLM-based review. A roadmap is not the same as a precedent.

"Small-Firm Playbook"

What This Means for Small-Firm Practice

TAR isn’t a BigLaw-only technology. The pricing argument that kept it out of small firms collapsed years ago. Cloud platforms now charge between $0.11 and $0.50 per document for GenAI-assisted review, per ComplexDiscovery’s Winter 2026 pricing survey, with the spread driven by task complexity, model selection, and quality-control overhead. CAL workflows on the same platforms typically run a touch lower since the model is cheaper to operate. Manual linear review at $3 per document, where attorney time dominates the cost, can run 10 to 50 times more than automated workflows on the same corpus. The right comparison depends on attorney rate, responsive prevalence, and platform pricing.

That economics shift shows up in the data. The ILTA 2025 Technology Survey, covering 580 firms and 152,000 attorneys, found 80% of firms using or actively exploring GenAI tools. Nineteen percent still have no formal AI policy in place. Adoption is broad. Governance is patchy.

A practical question for a litigator at a fifteen-attorney firm: at what document volume does TAR pay back? The rough answer is around 10,000 documents. At 10,000 documents, manual review at $3 per document runs $30,000 in document-touch costs alone. CAL on a typical platform runs roughly $4,000 in document fees plus a flat platform minimum, often in the $3,000 to $5,000 range for small-matter tiers. The crossover sits where the platform minimum stops dominating, usually between 8,000 and 12,000 documents in standard configurations. Below that, traditional review with strong keyword targeting often reaches production faster on smaller sets. Above it, the math tilts toward TAR fast.
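The break-even arithmetic is easy to rerun with your own numbers. The sketch below uses illustrative figures drawn from the ranges above ($3 per document for eyes-on review, $0.40 per document in CAL fees, a $4,000 platform minimum) plus an assumed keyword cull that leaves about 30% of the collection for manual review in the traditional workflow.

```python
# Back-of-envelope crossover math. Every input is an illustrative assumption;
# plug in your own quotes before relying on the output.
def keyword_plus_manual_cost(docs: int, review_rate: float = 3.00,
                             post_cull_fraction: float = 0.30) -> float:
    """Traditional workflow: keyword culling, then eyes-on review of what remains."""
    return docs * post_cull_fraction * review_rate

def cal_cost(docs: int, per_doc: float = 0.40, platform_minimum: float = 4_000) -> float:
    """CAL workflow: per-document fees plus a flat platform minimum."""
    return docs * per_doc + platform_minimum

for volume in (5_000, 8_000, 10_000, 12_000, 25_000):
    trad, cal = keyword_plus_manual_cost(volume), cal_cost(volume)
    cheaper = "CAL" if cal < trad else "keyword + manual"
    print(f"{volume:>6,} docs   keyword+manual ${trad:>8,.0f}   CAL ${cal:>8,.0f}   -> {cheaper}")
```

With these particular assumptions the lines cross right around 8,000 documents, at the low end of the range quoted above; a heavier keyword cull or a lower platform minimum moves the crossover accordingly.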

The small-firm playbook, whatever the methodology, comes down to three moves. Document what you chose and why. Couple that documentation with sample-based validation, because reasonableness without statistics is a story rather than a defense. And keep one foot in TAR’s case-law track until GenAI gets its peer-reviewed benchmark. In practice, that means using GenAI for prioritization and rationale generation while running TAR or CAL for the responsiveness call you certify. Hintyr is agentic document review with TAR validation tooling built in: the sampling and elusion testing run alongside any underlying review process rather than waiting as a separate certification step. If you’re working through the buying decision between GenAI and TAR, validation comes first.

"Common Questions"

Frequently Asked Questions

What is the difference between TAR 1.0 and TAR 2.0?

TAR 1.0 trains a classifier on a fixed seed set, then ranks the rest of the collection in one pass. TAR 2.0, also called continuous active learning, learns continuously as reviewers code documents and re-ranks the queue after each decision. The practical effect: TAR 2.0 doesn’t need a separate training phase, and seed-set composition matters far less.

Can opposing counsel force me to use TAR?

No. Under Sedona Principle 6, confirmed in Hyles v. New York City, the responding party chooses its own search methodology. Courts will not compel TAR over a producer’s objection. They also won’t compel a producer to abandon TAR if that’s the methodology you want to use.

Is GenAI document review the same as TAR?

No. TAR is a documented process with judicial approval reaching back to 2012. GenAI review uses a large language model to classify each document directly, often with a written rationale. As of April 2026, no published federal or state opinion has approved or rejected GenAI-based responsiveness review over objection. The two are related, not interchangeable.

How much does TAR cost compared with manual review?

Cloud platforms now charge between $0.11 and $0.50 per document for GenAI-assisted review, with continuous active learning workflows typically a touch lower. Manual review at $3 per document costs roughly 10 to 50 times more on the same corpus, depending on attorney rate and complexity.

What does a defensible TAR process require?

Document the methodology, run random-sample validation on recall and precision, share enough in meet-and-confer to demonstrate good faith, and keep a quality-control plan. Da Silva Moore set the standard at 287 F.R.D. 182, 191 (S.D.N.Y. 2012): reasonableness, not perfection. The line has held.

Should small firms use TAR or stick with keyword search?

Keyword screening misses documents that lack the chosen terms but share the meaning. TAR ranks every document, including those that don’t trigger your keywords. For matters above 10,000 documents, TAR usually costs less than keyword-plus-manual review and produces higher recall. Below that volume, the math gets closer, and the right answer depends on the case.

This article is for general informational purposes only. It does not constitute legal advice and does not create an attorney-client relationship. Statements about case law and rules reflect publicly available sources as of April 2026 and may not address your jurisdiction or matter. Consult qualified counsel before acting on any of the topics discussed.

Run TAR-grade validation on any review workflow.

Hintyr runs the random sampling, recall estimates with confidence intervals, and elusion testing your validation report needs, whether the underlying review is TAR 1.0, CAL, GenAI, or a hybrid. Or compare review platforms side by side.