E-Discovery

AI Redaction Failures That Expose Privileged Data

Relativity, Everlaw, and GoldFynch all have documented redaction defects. Courts have imposed millions of dollars in sanctions for these failures. Here are the AI redaction e-discovery best practices that prevent privilege exposure.

Alexander Cohan, Ph.D.

Computational scientist with a Ph.D. from UC Irvine and peer-reviewed research in NLP, deep learning, and large-scale data modeling. Over a decade of experience building systems that process complex document sets at scale. Founded Hintyr to bring defensible AI workflows to litigation teams navigating document review, redaction, and production.

Key Takeaways

  • The same copy-paste redaction failure has been exposing privileged data for twenty years—and AI tools introduce new failure modes on top of it.
  • Relativity, Everlaw, and GoldFynch all have documented redaction defects in their own product documentation.
  • Courts have imposed sanctions exceeding $2.5 million for redaction and production failures, and clawback agreements won’t always protect you.
  • An 8-step pre-production QC checklist—mapped to real sanctions cases—catches the failures that AI misses.
  • The duty to protect privileged information is personal and nondelegable; vendor blame has never worked as a defense.

The Copy-Paste Test That Keeps Failing

Twenty years. That’s how long the same redaction failure has been exposing confidential information in court filings and productions. And it still happens. AI redaction in e-discovery demands best practices that most firms still don’t follow.

The pattern is simple. A lawyer draws a black box over sensitive text in a PDF. The text looks hidden. But it isn’t removed. Anyone can select the blacked-out area, copy it, and paste it into a text editor. Every word is there.

In January 2019, Paul Manafort’s defense team filed a document in the D.D.C. with black bars covering details about sharing Trump campaign polling data with Russian-linked associate Konstantin Kilimnik. Reporters at The Guardian copy-pasted the “redacted” text and published it within hours.

The document had been processed using PDFium, which does not support true redaction. It draws rectangles on top of text. That’s all.

In Apple v. Samsung (N.D. Cal., 2014), Quinn Emanuel filed an expert report that failed to properly redact Apple’s “Highly Confidential, Attorney Eyes Only” licensing agreements. The confidential terms reached over 200 Samsung employees in violation of a protective order. A Samsung in-house lawyer reportedly told counterparts that “all information leaks.” Sanctions followed, reportedly exceeding $2 million.

This isn’t ancient history. During the Meta FTC antitrust trial (D.D.C., April 2025), Meta’s legal team filed documents with cosmetic redactions that journalists bypassed to expose confidential data from Apple, Snap, and Google. In early 2026, the DOJ released 3.5 million pages of Epstein-related documents with thousands of redaction failures affecting nearly 100 victims. A Wall Street Journal review found at least 43 victims’ full names exposed, including minors. One victim received death threats after 51 entries included her private banking information.

If you’re a solo practitioner or small-firm attorney handling your own productions, this is the single most dangerous mistake you can make. The black box is a lie unless your tool removes the underlying text. Drawing a rectangle is annotation. Removing content is redaction. Confusing the two has cost lawyers millions in sanctions.

How AI Redaction Actually Breaks

The copy-paste problem is the one everyone knows about. But modern e-discovery platforms have subtler failure modes that don’t show up until opposing counsel opens your production in a text editor or loads your spreadsheet into a different tool. These are capable platforms that handle most redaction scenarios correctly. But they have specific documented gaps that matter when they hit.

Relativity

Relativity is the most widely used platform in large-scale litigation, and its own Known Issues page documents the gaps. Defect REL-1233528 (disclosed January 9, 2026): when Relativity’s Redact tool is applied to Microsoft Excel XLS files containing hyperlinks, the redacted content remains fully accessible if opened in a text viewer like Notepad. The redaction looks correct in Excel. It does nothing in Notepad.

A separate Relativity bug, REL-1275064, causes the “Remove annotations” mass operation to apply to more documents than intended. REL-1211591 exports more documents than selected. And redactions applied in Relativity may not carry over when documents transfer to another e-discovery tool.

Everlaw

Everlaw has no publicly reported breaches. But its own documentation reveals two critical limits. Spreadsheet redactions applied at the last row or column are silently dropped on production. They just disappear. And batch redactions exceeding 200,000 per spreadsheet fail without redacting that document. No error message. No warning. The document goes out unredacted.

GoldFynch

GoldFynch has a design choice that could be catastrophic if you don’t read the fine print. Native-format productions include zero redactions applied within the platform. If you redact a document in GoldFynch and select native format for production, your redactions don’t exist in the output.

Concordance

Concordance, a legacy platform launched in 1984, carries over 30 documented known issues including data truncation to 64 characters during indexing. The ABA’s 2022 Litigation TechReport called it “perplexing” that firms still use it. You can compare document review tools to see how current platforms differ.

Cross-Platform AI Failures

Then there are the AI-specific failures that cut across every platform. OCR accuracy hits 98-99% on clean printed text. But it drops to roughly 60% on poor-quality scans. When OCR renders a zero as the letter “O” or the digit “1” as a lowercase “l,” pattern-matching tools can’t find SSNs or account numbers. Studies of manual redaction find that 5-10% of sensitive information survives the process.
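A minimal sketch of the problem and one common mitigation: normalize known OCR letter-for-digit confusions before pattern matching. The character map and regex below are illustrative only, not any platform's actual implementation, and a real pipeline would also map normalized matches back to positions in the original text before redacting.

```python
import re

# Common OCR confusions: digits rendered as visually similar letters.
OCR_DIGIT_FIXES = str.maketrans({"O": "0", "o": "0", "l": "1", "I": "1", "S": "5", "B": "8"})

SSN_RE = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")

def find_ssns(text: str) -> list[str]:
    """Search the raw text AND an OCR-normalized copy for SSN patterns."""
    hits = set(SSN_RE.findall(text))                              # clean text
    hits.update(SSN_RE.findall(text.translate(OCR_DIGIT_FIXES)))  # OCR-damaged text
    return sorted(hits)
```

On `"SSN: 123-45-678O"` the raw regex finds nothing, because the trailing character is a letter; only the normalized pass surfaces the number. That is exactly the gap a pattern-matcher without normalization falls into.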

Named entity recognition faces a different problem: “Grace,” “Hope,” “Chase,” and “Wells” are all personal names that double as common English words. And metadata is the threat nobody sees on the page: tracked changes, hidden spreadsheet cells, EXIF geolocation data, and file paths all create exposure vectors that face-of-document review will never catch.

Every one of these failures has the same root cause. The tool looks like it worked. The redaction appears on screen. But somewhere in the pipeline, between the platform and the production, the protection vanishes.

What Courts Expect After a Redaction Failure

When privileged documents slip into a production, the question isn’t whether you made a mistake. It’s whether your process was reasonable.

Victor Stanley, Inc. v. Creative Pipe, Inc., 250 F.R.D. 251 (D. Md. 2008) drew the line. Magistrate Judge Paul Grimm found privilege waived for 165 electronically stored documents because the producing party’s keyword search was unreasonable. The opinion didn’t punish imperfection. It punished the inability to explain what was done and why. The defendants couldn’t identify the keywords they used, the rationale for selecting them, whether Boolean operators were applied, or whether anyone tested the results. That silence was fatal.

The consequences scale. In DR Distributors, LLC v. 21 Century Smoking, Inc., 513 F. Supp. 3d 839 (N.D. Ill. 2021), Judge Johnston produced a 256-page sanctions opinion. A separate order awarded $2,526,744.76 in attorney’s fees. Former defense counsel was ordered to complete eight hours of e-discovery CLE. These are basic malpractice risks that compound fast when left unchecked.

Clawback agreements won’t always save you, either. In Irth Solutions, LLC v. Windstream Communications LLC, 2017 WL 3276021 (S.D. Ohio 2017), Windstream produced over 2,200 pages including 43 privileged documents, then produced the same privileged documents again weeks later. The court called it “reckless” and held that clawback agreements cannot “shield a party from the consequences of reckless conduct.”

But there is insurance worth having. FRE 502(d) orders can protect you even when technology fails in ways you didn’t anticipate. In Brookfield Asset Management v. AIG Financial Products, 2013 WL 142503 (S.D.N.Y. 2013), privileged information had been redacted from the face of five draft board minutes, but it remained visible in the metadata. The 502(d) stipulation preserved privilege despite the failure. Multiple courts have gone so far as to call failure to obtain a 502(d) order “akin to malpractice.”

The standard courts apply is reasonableness, not perfection. But reasonableness demands a documented process: what tools you used, what settings you applied, what validation you performed, and what results you recorded. If you can’t explain your workflow under oath, you don’t meet the standard.

How Hintyr Does Redaction Differently

The failures documented above share a common root: redaction tools that treat redaction as a visual layer rather than a content operation. A black box covers the text on screen, but the text itself stays in the file.

Hintyr takes a different approach. When you redact content in Hintyr, the underlying text is removed from the document. Not covered. Not hidden. Gone. There’s no copy-paste test to fail because there’s nothing left to copy. You can read more about the technical approach in our redaction overview.

For PII and PHI detection, Hintyr uses AI-assisted redaction that surfaces sensitive content for attorney review. The AI does the tedious work of scanning thousands of documents for Social Security numbers, medical records, and financial data. But an attorney reviews every suggestion before it’s applied. The AI proposes. A human disposes. That’s the difference between a defensible redaction log and a malpractice claim.

Batch redaction works across document sets, but with scope controls that show you exactly what will change before you commit. You see a preview of every affected document. You confirm the scope. Then the system executes. Scope controls flag exceptions rather than silently skipping files.

Metadata stripping is built into the production pipeline, not bolted on as an afterthought. Tracked changes, hidden spreadsheet cells, EXIF data, embedded comments: all cleaned automatically on export. And redactions hold across formats, whether you export as PDF, TIFF, or native file.

We won’t claim perfection. No tool can guarantee zero errors across every file type and every malformed document a custodian hands over. But Hintyr’s design prioritizes transparency and verification at every step. You can see what the AI flagged, confirm what gets redacted, and verify the output before it leaves your hands.

Hintyr’s redaction interface: AI flags sensitive content, attorneys review each suggestion, and the “Apply” button commits true content removal.

AI Redaction E-Discovery Best Practices: A Pre-Production QC Checklist

The EDRM model and Sedona Conference guidelines both emphasize quality control before production. But “QC your redactions” isn’t actionable advice. Here’s a concrete checklist, mapped to the failures and sanctions discussed above, that you can run before any production leaves your office.

  1. Verify Document Counts and Bates Sequences

    Compare your production volume against the expected count from your review platform. Check that Bates number sequences are continuous with no gaps or duplicates. Missing documents are easier to catch here than in a meet-and-confer three weeks later; Hintyr’s gap detection report can flag sequence breaks automatically.

  2. Run the Copy-Paste Test

    Select all content. Copy. Paste into a plain text editor. If you can read anything that should have been redacted, stop production immediately. This is the test that caught the Manafort PDF and the Relativity Excel bug. It takes seconds per document.

  3. Confirm Text Layers Match the Image Layer

    Redacting the visible layer of a PDF doesn’t redact the extracted text or OCR layer underneath. Open the extracted text files for your production and search for terms that should have been removed.

  4. Audit Metadata Fields

    Check author names, file paths, “last modified by” fields, tracked changes, and document properties. A properly redacted document body means nothing if the metadata still reads “Draft - Attorney Work Product - Privileged” in the file properties. Hintyr’s production exports handle metadata stripping automatically, but if you’re using other tools, this step is manual and essential.

  5. Validate Load Files Against ESI Protocol

    Test that your production load files (DAT, OPT, or EDRM XML) match the specifications you agreed to in the ESI protocol. Mismatched load files cause delays and, in some courts, sanctions.

  6. Run a Final Privilege Sweep

    Search the production set for privilege terms, attorney names, and law firm domains. This is your last chance to catch a privileged document that slipped through review.

  7. Generate and Preserve Your Audit Trail

    Document what was redacted, when, by whom, and under what authority. Courts increasingly expect this. In DR Distributors, the absence of a clear audit trail contributed to the sanctions. Hintyr’s TAR validation workflow produces defensible documentation of your review and redaction decisions.

  8. Review Every AI Flag

    If you used AI to identify sensitive content, don’t treat its suggestions as final. Review each flagged item before applying redactions. Clear PII patterns (SSNs, credit card numbers) are usually accurate, but contextual flags (medical references in general discussion, for example) need a human eye. This is where the proportionality arguments from Rule 26(b)(1) come into play: your QC effort should be proportional to the sensitivity of the content and the stakes of the case.
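Several of the checklist steps reduce to simple, scriptable checks. A hedged sketch in Python, using hypothetical Bates labels, privilege terms, and a placeholder firm domain; a real production would substitute the terms, attorney names, and domains from its own privilege log:

```python
import re

def bates_gaps(labels: list[str]) -> list[int]:
    """Step 1: find missing sequence numbers in Bates labels like 'ABC000123'."""
    nums = sorted(int(re.search(r"(\d+)$", label).group(1)) for label in labels)
    return sorted(set(range(nums[0], nums[-1] + 1)) - set(nums))

# Steps 3 and 6: markers that should never survive into extracted text.
# These terms and the domain are placeholders, not a recommended search list.
PRIV_TERMS = ["privileged", "attorney-client", "work product", "@examplefirm.com"]

def privilege_hits(text: str) -> list[str]:
    """Return privilege markers still present in a document's extracted text."""
    lowered = text.lower()
    return [term for term in PRIV_TERMS if term in lowered]

def metadata_flags(properties: dict[str, str]) -> list[str]:
    """Step 4: run the same sweep over metadata fields (author, title, comments)."""
    return privilege_hits(" ".join(properties.values()))
```

For example, `bates_gaps(["ABC000001", "ABC000002", "ABC000004"])` reports `[3]`, and `metadata_flags({"Title": "Draft - Attorney Work Product - Privileged"})` flags the file even when the document body is clean.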

No checklist eliminates all risk. But running these eight steps before every production closes the gaps that have led to sanctions, bar complaints, and regulatory penalties.

Why the Vendor Blame Defense Fails

“Your e-discovery vendor made the error. Surely the court will consider that.”

It won’t. Or more precisely, it will consider it and hold you responsible anyway. The duty to ensure accurate production is personal and nondelegable.

J-M Manufacturing Co. v. McDermott Will & Emery (L.A. Superior Court, Case No. BC462832) set the template. E-discovery vendor Stratify Inc. forgot to pass 180,000 records through a privilege filter, failed to submit another 10,000 documents for attorney review, and in a separate incident released approximately 9,650 files without review at all. Roughly 3,900 privileged documents ended up in the production. Judge George Wu put it bluntly: “I can understand the error once. But this is an error twice, three times, four times...”

In DR Distributors, counsel tried to shift blame to their ESI vendor. A replacement vendor found over 15,000 responsive documents that had never been collected. The court was “particularly annoyed” by the blame-shifting and ruled that attorneys must have “a reasonable understanding of the client’s information systems.” The $2.5 million sanctions bill went to counsel, not the vendor.

Prompt disclosure can soften the blow, but only modestly. In Fluor Federal Solutions v. BAE Systems (W.D. Va., 2023), a vendor de-duplication error caused approximately 79,000 documents to be mistakenly withheld. The court found the production untimely but declined additional sanctions because BAE disclosed the error quickly. Disclosure mitigated. It did not excuse.

ABA Model Rule 1.1, Comment 8 requires lawyers to “keep abreast of changes in the law and its practice, including the benefits and risks associated with relevant technology.” Forty states have formally adopted this duty of technology competence. You can’t delegate it to a vendor and call it compliance.

The regulatory exposure compounds the risk. A redaction failure that exposes protected health information triggers HIPAA penalties up to $2,190,294 per violation. GDPR fines reach 4% of global annual turnover. Under the CCPA, intentional violations carry $7,500 per incident. These penalties fall on the data controller, not the software vendor who processed the files.

Frequently Asked Questions

What causes AI redaction to fail in e-discovery productions?

The most common failures are cosmetic-only redactions (black boxes overlaid on text that remains selectable underneath), OCR misalignment where the AI redacts the wrong coordinates on a scanned document, and metadata stripping failures where privileged content persists in hidden document layers. Relativity, Everlaw, and GoldFynch all have documented defects in their own product documentation.

Can a law firm be sanctioned for an AI redaction failure?

Yes. Courts have imposed sanctions exceeding $2.5 million when redaction and production failures exposed privileged information. In Irth Solutions v. Windstream, the court held that clawback agreements cannot shield reckless production practices. The standard is reasonableness, not perfection, but reasonableness requires a documented process.

What QC steps should firms take before producing AI-redacted documents?

Run the copy-paste test on every redacted document. Confirm text layers match image layers. Audit metadata fields for privileged content. Validate load files against ESI protocol specs. Run a final privilege sweep. Generate an audit trail. For AI-assisted redaction, review every flag before applying.

Does using an AI redaction vendor protect a firm from liability?

No. Courts consistently hold that the duty to protect privileged information is personal and nondelegable. In J-M Manufacturing, counsel was sanctioned despite vendor error. In DR Distributors, the court was “particularly annoyed” by attempts to blame an ESI vendor. The $2.5 million sanctions bill went to counsel, not the vendor.

What should firms look for in an e-discovery platform’s redaction features?

True content removal (not overlay annotation), human-in-the-loop review where attorneys approve every AI-flagged redaction before it’s applied, scope controls that preview batch operations before execution, automated metadata stripping in the production pipeline, and cross-format consistency so redactions hold whether exporting as PDF, TIFF, or native format.

See how Hintyr’s redaction removes content instead of covering it → Learn more

Disclaimer: This blog post is published by Hintyr for informational purposes only and does not constitute legal advice. The discussion of case law, ethics rules, and vendor incidents is general in nature and may not reflect the rules applicable in your jurisdiction. Vendor defects cited here are sourced from publicly available documentation, court records, and news reports; platforms may have addressed these issues in subsequent releases. Attorneys should consult their state bar’s ethics opinions and qualified legal counsel before making production and technology decisions. No attorney-client relationship is created by reading this post.

Ready to stop guessing whether your redactions hold?

Hintyr removes content, not just covers it. Every AI-flagged redaction gets attorney review before it’s applied, and metadata is stripped automatically on export.