E-Discovery

What Is E-Discovery? An AI-Era Guide for Small Law Firms

The 2006 ESI amendments and the 2015 proportionality rewrite still govern; what changed is the review tooling, and a 2026 small-firm litigator has to handle both at once.

Alexander Cohan, Ph.D.

Legal technology researcher and data scientist specializing in AI governance for litigation teams. Expertise in NLP and AI-assisted document review.

"The short answer"

What Is E-Discovery? A Working Definition for Litigators

What is e-discovery? Short answer: the part of litigation where parties identify, preserve, collect, process, review, and produce electronically stored information (ESI), also called electronic discovery. The Federal Rules of Civil Procedure folded ESI into formal discovery in the 2006 amendments to Rules 16, 26, 33, 34, 37, and 45, and the 2015 proportionality rewrite tightened scope. The 2006 Advisory Committee Note to Rule 34 treated “ESI” as deliberately broad. Email, spreadsheets, photos, audio, mobile chat, Slack channels, cloud decks, the prompt logs of generative AI: all of it counts. So does the metadata.

The reference frame practitioners use is the Electronic Discovery Reference Model. EDRM was developed by George Socha in 2005 and is now stewarded by Mary Mack and Kaylee Walstad. It maps the workflow into nine phases. EDRM is the most widely cited industry model, but it isn’t a court rule, and EDRM compliance does not by itself equal FRCP compliance.

The Sedona Conference’s working group on ESI sets the practitioner consensus on top of the rules. Their Sedona Principles, Third Edition (19 Sedona Conf. J. 1 (2018)) is the document most courts cite when ESI questions get hard. Working definition: e-discovery is litigation’s information layer, governed by the FRCP, organized by EDRM, and policed through Rule 37(e). What changed in the AI era is that verification became a new layer over an unchanged rules framework.

"The reference model"

The Ediscovery Process: EDRM in Nine Phases

EDRM lays out nine stages from Information Governance through Presentation. It’s a map, not a march; revisit earlier stages as the matter evolves. Each stage below pairs what AI changes with what it doesn’t.

5.1 Information Governance

This is the work before any complaint lands: deciding what to keep, delete, and where to store the rest. Rule 34(a)(1) sweeps in “any designated documents or electronically stored information,” and the 2006 Advisory Committee Note treats that phrase as broad by design.

What AI changes: classifiers tag privileged or sensitive content at rest, before any hold exists. The classifier sorts; counsel owns the log.

5.2 Identification

Identification asks: who has the documents, and where do they live? Rule 26(f)(3)(C) requires early conferral on ESI. Zubulake V told counsel they “must take affirmative steps to monitor compliance so that all sources... are identified and searched.” 229 F.R.D. 422, 432 (S.D.N.Y. 2004).

What AI changes: an agentic search across mailboxes can surface custodians your interviews missed, like the contractor cc’d twice or the regional manager who only appears in metadata. The agent finds names; you decide which ones matter and verify each before it shapes scope. Op. 512’s verification duty applies here too, since custodian identification is review-adjacent work.

5.3 Preservation (legal hold)

Once you “reasonably anticipate litigation,” the duty kicks in. Zubulake IV is canonical: a party “must suspend its routine document retention/destruction policy and put in place a ‘litigation hold.’” 220 F.R.D. 212, 218 (S.D.N.Y. 2003). Rule 37(e)(1) allows curative measures; (e)(2) reserves the worst sanctions for parties that “acted with the intent to deprive.”

What AI changes: honestly, little. Preservation is a human policy job. Issuing the hold, suspending auto-delete, supervising compliance: each demands attorney judgment, and an LLM can’t form intent or supervise people. The $3 million order in GN Netcom, Inc. v. Plantronics, Inc., 2016 WL 3792833 (D. Del. July 12, 2016) turned on a senior officer’s deliberate email destruction and counsel’s failure to supervise the hold; that is the Rule 37(e)(2) intent posture, one year after the 2015 rewrite. Hold-monitoring tools flag deletions; counsel owns supervision.

5.4 Collection

Collection is pulling the data: forensic images, mailbox exports, Slack archives, mobile extractions, cloud captures. The standard is defensibility: preserved metadata, hash verification, documented chain of custody.

What AI changes: almost nothing in the act itself. The forensic technician still images the laptop. AI shows up after collection: classifiers triage custodians, route audio to transcription, surface foreign-language content. Each flag needs attorney review before it shapes the review queue. Op. 512’s verification duty applies as soon as classifier output drives review decisions.
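The defensibility standard above (hash at acquisition, re-verify on every copy) reduces to a few lines of Python. A minimal sketch; the function names are illustrative, not drawn from any forensic suite:

```python
import hashlib
from pathlib import Path

def sha256_file(path: Path) -> str:
    """Stream the file through SHA-256; identical digests mean identical bytes."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    return h.hexdigest()

def verify_collection(path: Path, expected_sha256: str) -> bool:
    """Re-hash a collected file against the hash recorded at acquisition.

    A mismatch means the copy is not bit-for-bit identical to the source,
    and the collection is not defensible as-is.
    """
    return sha256_file(path) == expected_sha256.lower()
```

The hash recorded at acquisition goes into the chain-of-custody log; re-running `verify_collection` on every subsequent copy is what makes the chain documented rather than asserted.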

5.5 Processing

Processing decides whether your review is sane. Standard moves: de-duplication, de-NISTing (dropping system files via the NIST reference library), file-type filtering, OCR, email-thread suppression. Rule 34(b)(2)(E)(ii) requires production in a “reasonably usable form.”

What AI changes: a fair amount. Vision models do better OCR on bad scans. Email families thread automatically. Multimodal extraction handles audio and spreadsheet pulls. OCR errors compound downstream, so attorney spot-checks remain part of Op. 512’s verification duty before output enters the queue.
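The de-duplication and de-NISTing moves above are, at bottom, hashing plus set membership. A minimal Python sketch under simplified assumptions (documents held in memory, a toy stand-in for the NIST reference set, illustrative function names):

```python
import hashlib

def dedupe(documents: dict, nist_hashes: set) -> dict:
    """Group document IDs by content hash, dropping known system files.

    documents:   doc_id -> raw bytes
    nist_hashes: SHA-1 digests from the NIST reference library (de-NISTing)
    Returns digest -> list of doc_ids; the first ID in each group is the
    representative that goes to review, the rest are exact duplicates.
    """
    groups = {}
    for doc_id, blob in documents.items():
        digest = hashlib.sha1(blob).hexdigest()
        if digest in nist_hashes:
            continue  # known OS/application file, not user content
        groups.setdefault(digest, []).append(doc_id)
    return groups
```

Real processing platforms de-duplicate on normalized metadata fields as well as raw bytes, but the review-queue effect is the same: one representative per group, duplicates suppressed.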

5.6 Review (TAR and predictive coding)

Review is the centerpiece and the most expensive line on the budget. Technology-Assisted Review entered the federal record in 2012 when Magistrate Judge Andrew Peck held “computer-assisted review is an acceptable way to search for relevant ESI in appropriate cases.” Da Silva Moore v. Publicis Groupe, 287 F.R.D. 182, 183 (S.D.N.Y. 2012). Grossman and Cormack had made the empirical case earlier: TAR can “yield results superior to... exhaustive manual review.” 17 Rich. J.L. & Tech. 11 (2011).

What AI changes: a great deal. TAR 1.0, TAR 2.0, and generative review with citations are different animals. The first two are statistical classifiers; the third is a language model that reads documents, answers questions, and points to the source page. ABA Op. 512 puts the duty on the lawyer to supervise the AI: reliance on GAI output without “independent verification” can violate the competence duty. Op. 512, at 3-4 (2024). The Stanford 2024 study by Magesh and colleagues, scoped to legal-research tools, found hallucination rates between 17% and 33%. Document review carries its own error surface. The agent surfaces candidates with citations. But the attorney confirms each one before responsiveness lands.

5.7 Analysis

Analysis is what you do with the responsive set: build the timeline, map the witnesses, find the contradiction. Rule 26(b)(1) keeps proportionality the gatekeeper. Judge Campbell put it plainly in In re Bard IVC Filters: “Relevancy alone is no longer sufficient.” 317 F.R.D. 562, 564 (D. Ariz. 2016).

What AI changes: event extraction accelerates. An agent can scan ten thousand emails, pull every date and dollar figure, and assemble a draft chronology overnight. Contradiction detection flags a deposition statement against an earlier email. Every output needs attorney review before it lands in a brief. Mata v. Avianca emphasized counsel’s “gatekeeping role... to ensure the accuracy” of any filing. 678 F. Supp. 3d 443, 448 (S.D.N.Y. 2023). The agent drafts; you verify.

5.8 Production

Production is the formal handoff: Bates-numbered documents, load files, privilege log, redactions. Rule 34(b)(2)(E)(ii) governs form. Counsel for Paul Manafort filed a January 2019 brief in United States v. Manafort, No. 1:17-cr-00201-ABJ (D.D.C.) whose black-bar redactions failed under copy-paste. It became national news.

What AI changes: privilege and PII redaction is the workflow most genuinely improved by current AI. A classifier proposes privileged and PII redactions across the production set; the attorney accepts, edits, or rejects each one before burn-in. Bates numbering and load-file generation stay scripted. The Manafort lesson stands: a redaction isn’t real until flattened. Op. 512 also reminds counsel to read the terms of any GAI tool, since output flowing between matters is a risk. Op. 512, at 6 (2024).
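Bates numbering really is a scripting job: a prefix plus a zero-padded counter. A minimal sketch, with an assumed seven-digit width and an illustrative prefix:

```python
def bates_numbers(prefix: str, start: int, count: int, width: int = 7):
    """Yield sequential Bates stamps, e.g. ACME0000001, ACME0000002, ..."""
    for i in range(start, start + count):
        yield f"{prefix}{i:0{width}d}"
```

The one rule that matters: the counter never resets and never reuses a number across productions in the same matter, so keep the last-used number in the audit log.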

5.9 Presentation

Presentation is the courtroom exhibit, the demonstrative, the clip played during cross-examination. By the time you get here, the legal work is done; this stage is about which document tells the story.

What AI changes: very little. The courtroom is a human stage. Judges rule on objections in real time, juries watch attorneys not screens, and that’s the right answer. AI helps prepare exhibit packets, demonstratives, and witness outlines. None of that crosses the bar. The trial lawyer carries the exhibit and the duty of candor under Model Rule 3.3.

Da Silva Moore does not validate any vendor or any specific tool, and Judge Peck was clear that no review method, manual or assisted, guarantees perfection. The opinion approved TAR; it didn’t bless GenAI.

"The rule map"

FRCP Rules Every Small Firm Must Know

Five rules carry the weight of e-discovery, and you’ll see them in every order, brief, and meet-and-confer. Start with Rule 26(b)(1). Discovery must be “relevant to any party’s claim or defense and proportional to the needs of the case.” The rule names six factors: importance of issues, amount in controversy, parties’ relative access, parties’ resources, importance of discovery in resolving issues, and burden vs. benefit. That’s the proportionality math the court will run the moment a dispute breaks.

Rule 26(f)(3)(C) is the procedural twin. Your discovery plan must address “any issues about disclosure, discovery, or preservation of electronically stored information, including the form or forms in which it should be produced.” This is where the ESI protocol gets negotiated. Get it in writing early or spend months litigating what you could have agreed to in twenty minutes.

The safety valve sits in Rule 26(b)(2)(B). A party need not produce ESI from sources it identifies as “not reasonably accessible because of undue burden or cost.” The court can still order production on good cause, subject to the proportionality factors. For a small firm facing a request for legacy backup tapes or decommissioned cloud archives, this rule lets you push back without conceding the document is gone.

Form is the second thing you’ll argue about, and Rule 34(b)(2)(E)(ii) governs it. If a request specifies a form, give them that form. Otherwise, produce “in a form or forms in which it is ordinarily maintained or in a reasonably usable form.” Native versus TIFF-with-load-file is the fight; pick one, document why, stick to it.

Sanctions live in Rule 37(e). (e)(1) lets a court order curative measures when ESI is lost and the other party is prejudiced. (e)(2) reserves adverse-inference instructions, presumptions, and default for parties who “acted with the intent to deprive.” Don’t conflate the two. Klipsch v. ePRO shows the cost reality. The Second Circuit affirmed a $2.68 million sanction in a case the parties valued near $20,000, tying the figure to costs Klipsch incurred chasing ePRO’s misconduct rather than to the claim value; that is proportionality applied to discovery abuse, not abandoned. 880 F.3d at 631. The same opinion warns courts shouldn’t condone “excessive and disproportionate discovery demands”: proportionality cuts both ways.

"The verification era"

What Changes When AI Joins the Workflow

The bench has been asking the same question since 2023: did the lawyer read the brief before signing? But Mata v. Avianca is the case everyone learned from. Two attorneys filed a brief with six fictitious citations generated by ChatGPT. Judge Castel was direct: “existing rules impose a gatekeeping role on attorneys to ensure the accuracy of their filings.” 678 F. Supp. 3d 443, 448 (S.D.N.Y. 2023). The same opinion conceded “nothing inherently improper about using a reliable artificial intelligence tool.” Id. The rule isn’t “don’t use AI.” It’s verify, and own what you sign.

The Second Circuit reinforced the point in Park v. Kim, 91 F.4th 610, 612 (2d Cir. 2024): citing a fake opinion is an abuse of the adversary system. Sanctions followed in Wadsworth v. Walmart and Gauthier v. Goodyear. Lawyers don’t get a discount on gatekeeping because the hallucination came from a model.

ABA Formal Opinion 512 (July 2024) wrote the ethics floor. GAI tools are “prone to ‘hallucinations,’ providing ostensibly plausible responses that have no basis in fact or reality,” and reliance without “independent verification” can “violate the duty to provide competent representation.” Op. 512, at 3-4. The Stanford 2024 study by Magesh and colleagues, scoped to legal-research tools rather than document-review systems, found hallucination rates between 17% and 33%; the figure does not transfer directly to review classifiers, but it sets the order of magnitude a verification regime has to contain.

The TAR-versus-GenAI distinction matters. Da Silva Moore approved TAR in 2012; Rio Tinto called the rule black-letter law in 2015. No court has blessed GenAI review under a Rule 26(g) challenge, so certification rests on counsel’s signature. Recent orders treating GenAI prompts and outputs as discoverable ESI (e.g., In re OpenAI, Inc., Copyright Infringement Litig., S.D.N.Y. 2025) address the discoverability of GenAI artifacts, not its use as a review tool; do not let opposing counsel collapse the two. When a firm uses GenAI for responsiveness or privilege calls, the Rule 26(g)(1) certification that the response is “complete and correct” rests on the signing attorney. Model Rule 1.1 cmt. [8] (technological competence, adopted in some form by 39 states as of 2023 per Robert Ambrogi’s tracker) sets the floor. Model Rule 5.3 treats GAI tools as nonlawyer assistance, a supervisory duty. Counsel owns the call.

Disclosure: Hintyr publishes this guide and builds an agentic document review tool for small firms. Verification is the workflow’s center. Whatever tool a firm picks, the architecture is the same: the agent surfaces candidates with citations to the source page so counsel can confirm each one against the record before any responsiveness, privilege, or production call. Flags are proposals for review, never final calls. Op. 512 puts the gatekeeping duty on counsel, not the tool.

"The practitioner moves"

An Ediscovery Playbook for Small Law Firms

Six moves your firm can run on the next matter. None require a specialist team. All require the verification habit Op. 512 sets.

First, get the ESI protocol in writing at the Rule 26(f) conference. Cover scope, custodians, date ranges, search methodology, form of production, and AI tool use. The 2024 ABA Litigation TechReport found only 7% of small firms use predictive coding and just 23% use AI-assisted search; the gap closes faster when you negotiate AI methodology up front. Document the model version, the seed set, and the validation plan. The Rule 26(b)(1) proportionality playbook covers the factors. Negotiate cost-allocation on not-reasonably-accessible sources at the same conference; Rule 26(b)(2)(B) is harder to invoke later.

Second, issue the legal hold the day your duty triggers, document it, and re-issue as the case grows. Track acknowledgments and follow up on silence. Rule 37(e)(2) reserves the worst sanctions for intent to deprive; (e)(1) measures still bite when steps weren’t reasonable. Rotate the reminder and capture any auto-delete changes in the audit log.

Third, secure a Rule 502(d) clawback order at the start of any case with a non-trivial production. It’s the cheapest privilege insurance, and the right backstop when AI-assisted review misses something. Pair it with a written redaction protocol so opposing counsel knows your verification steps in advance.

Fourth, run a small-firm AI review workflow with verification built in. Counsel reviews seed sets, samples validation output, signs off on responsiveness, and verifies each citation the agent surfaces. Read the tradeoffs between GenAI and TAR 2.0 before you commit; the defensibility posture differs even when outputs look similar.

Fifth, treat redaction as a two-step process: AI proposal, attorney review, then flatten. The Bates number conventions shouldn’t be an afterthought. Run a copy-paste test on every redacted PDF before serving; the Manafort lesson is that flattening makes the redaction real.
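The copy-paste test can be automated. A minimal sketch that checks text extracted from the final PDF (with any extraction library, e.g. pypdf’s `extract_text`) against the terms that were supposed to be redacted; the function name is illustrative:

```python
def leaked_terms(extracted_text: str, redacted_terms: list) -> list:
    """Return any supposedly redacted terms still present in the text layer.

    A black bar drawn over text does not remove it; only flattening does.
    Run this on text extracted from the as-served PDF, not the working copy.
    """
    haystack = extracted_text.casefold()
    return [t for t in redacted_terms if t.casefold() in haystack]
```

A non-empty return means the redaction is cosmetic, not real: the production fails the Manafort test and must not go out the door.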

Sixth, log the workflow as you go: date, custodian, search syntax, model version, attorney sign-off. Defensibility is documented, not asserted. And a Rule 26(g) certification gets easier when the audit trail is already there.
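The log above works well as one JSON line per decision point, appended as you go. A minimal sketch; the field names follow the list above and are otherwise illustrative:

```python
import json
from datetime import datetime, timezone

def log_review_step(log_path, *, custodian, search_syntax, model_version, attorney):
    """Append one audit-trail entry per review decision point (JSON Lines)."""
    entry = {
        "date": datetime.now(timezone.utc).isoformat(),
        "custodian": custodian,
        "search_syntax": search_syntax,
        "model_version": model_version,
        "attorney_signoff": attorney,
    }
    with open(log_path, "a") as f:
        f.write(json.dumps(entry) + "\n")
    return entry
```

Append-only JSON Lines is a deliberate choice: nothing is edited in place, so the file itself shows the sequence of decisions when a Rule 26(g) certification is challenged.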

This post is for informational purposes and does not constitute legal advice. State bar guidance on generative AI varies materially by jurisdiction; consult your jurisdiction’s authority before adopting an AI workflow.

Bring AI-Era Document Review to Your Next Matter

Ask questions in plain English, get cited answers from your case documents, and run redaction, Bates stamping, and production in one place. Built for solo and small-firm litigators who carry the verification duty themselves; the defensibility of any production still rests on counsel’s signature.