What Is a Bates Number?
It’s 4 pm Thursday. Opposing counsel wants the production by Friday, the load file just kicked back two errors, and someone is asking whether SMITH-04217 was the email or the attachment. Every produced page gets a unique, sequential Bates number, and the conventions behind that ID decide whether your privilege log holds, your load file validates, and your Friday production goes out clean.

Legal technology researcher and data scientist specializing in AI governance for litigation teams. Expertise in NLP and AI-assisted document review.

Bates Number Definition: The Four-Corner Answer
A Bates number is the unique, sequential alphanumeric identifier stamped on each page of a document produced in litigation. It serves as a control number, a chain-of-custody marker, and a cross-reference index that lets counsel, witnesses, and the court point at the same page. That definition comes almost verbatim from the District of Maryland’s 2024 ruling in Bobb v. FinePoints, which has become the cleanest federal articulation of what the marker actually does. Bates labeling and Bates stamping are used interchangeably in modern practice, both meaning the application of that sequential identifier to each produced page.
Practitioners distinguish a Bates number from the internal DocID a review platform assigns at ingestion. The DocID stays with a document across the matter; the Bates is what ships in the production. Some teams reuse the DocID as the Bates root. Most don’t, because Bates restarts with each producing party while DocIDs stay stable across the matter.
The unit of stamping is the page, not the document. A 40-page brief might be pages 1 to 40 internally and SMITH-04217 to SMITH-04256 in your production. Internal page numbers reset at the start of every file. Bates numbers don’t.
Bates is not, strictly speaking, mandated by the Federal Rules of Civil Procedure. Rule 34(b)(2)(E) sets the floor: produce documents in a “reasonably usable form.” Bates is the working convention courts and parties use to satisfy that floor. ESI protocols, local rules, and meet-and-confer agreements fill in the rest.
Modern Bates Number Format: Prefix, Padding, and Length
A modern Bates number has three parts. A party prefix. A delimiter. A zero-padded sequence. SMITH-04217. ACME_000123. JONES000456.
Padding is where most workflows quietly break. SMITH-1000 sorts before SMITH-200 in any alphanumeric system you’d actually use. Zero-pad to a fixed width and the sort behaves. Six digits supports up to 999,999 pages, which covers most matters. Seven digits gives breathing room for re-productions and supplemental batches above that.
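The sorting problem is easy to reproduce. Here is a minimal sketch in Python, assuming a hyphen delimiter and a six-digit width; the helper name format_bates is hypothetical, not any platform's API.

```python
def format_bates(prefix: str, number: int, width: int = 6, delimiter: str = "-") -> str:
    """Build a zero-padded Bates string, refusing numbers that overflow the width."""
    if number < 1 or number >= 10 ** width:
        raise ValueError(f"{number} does not fit in {width} digits")
    return f"{prefix}{delimiter}{number:0{width}d}"


# Plain string sorting compares character by character, so "1" beats "2".
unpadded = sorted(["SMITH-200", "SMITH-1000"])
padded = sorted([format_bates("SMITH", 200), format_bates("SMITH", 1000)])

print(unpadded)  # ['SMITH-1000', 'SMITH-200']
print(padded)    # ['SMITH-000200', 'SMITH-001000']
```

Raising on overflow, rather than silently widening the number, is a design choice: a production that outgrows its padding should stop and get a decision, not ship mixed widths.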
Prefix conventions vary by firm. Party initials. Matter abbreviations. Custodian codes. Some teams stack them, like SMITH-EML-04217 to mark email custodians inside a larger production. Pick one rule and write it into your ESI protocol.
Drift is the silent killer. Batch one ships as SMITH-04217. Batch two comes back from a co-counsel hand-off as SMITH_04217 with the delimiter swapped. Batch three drops the delimiter entirely. Three weeks later your privilege log won’t join cleanly to the production register, and a paralegal spends a Saturday rewriting strings in a text editor.
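Drift can be caught before it reaches the privilege log. The sketch below compares every Bates string against the shape written into the protocol. It assumes letters-only prefixes and a hyphen, an underscore, or no delimiter; the function names and batch labels are illustrative.

```python
import re

# Assumed shape: letters-only prefix, optional "-" or "_" delimiter, then digits.
BATES_RE = re.compile(r"^(?P<prefix>[A-Za-z]+)(?P<delim>[-_]?)(?P<digits>\d+)$")


def bates_shape(bates: str) -> tuple[str, str, int]:
    """Return (prefix, delimiter, digit width) for one Bates string."""
    m = BATES_RE.match(bates)
    if m is None:
        raise ValueError(f"unrecognized Bates format: {bates!r}")
    return m["prefix"], m["delim"], len(m["digits"])


def find_drift(batches: dict[str, list[str]], expected: tuple[str, str, int]) -> dict[str, list[str]]:
    """Map each batch name to the Bates strings whose shape differs from the protocol."""
    drift = {}
    for name, values in batches.items():
        odd = sorted(b for b in values if bates_shape(b) != expected)
        if odd:
            drift[name] = odd
    return drift


batches = {
    "batch1": ["SMITH-04217", "SMITH-04218"],
    "batch2": ["SMITH_04219"],  # delimiter swapped
    "batch3": ["SMITH04220"],   # delimiter dropped
}
drift = find_drift(batches, expected=("SMITH", "-", 5))
print(drift)  # {'batch2': ['SMITH_04219'], 'batch3': ['SMITH04220']}
```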
How Bates Numbers Work in Document Productions
A Bates number alone is just a string. The production around it is a structured envelope: stamped images, a delimited metadata file, an image cross-reference file, and the native files where the protocol calls for them. Each piece has to agree with the others.
BegBates, EndBates, BegAttach: How Load Files Tie It Together
BegBates is the first page of a document. EndBates is the last. Together they define the Bates range for that document. A 40-page brief stamped SMITH-04217 through SMITH-04256 has BegBates SMITH-04217 and EndBates SMITH-04256, and any reference to “the brief” in your privilege log points at that range.
BegAttach and EndAttach define the family range across an email and all its attachments. If SMITH-04217 is the parent email and SMITH-04218 through SMITH-04256 are three attachments, BegAttach is SMITH-04217 and EndAttach is SMITH-04256 for every member of the family. Lose that, and a deponent’s three-attachment email arrives on opposing counsel’s desk as four floating documents with no parent-child marker.
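The family example above can be sketched in a few lines. The page counts (a one-page email and three 13-page attachments) are assumptions chosen to land on SMITH-04256; LoadRow and stamp_family are hypothetical names.

```python
from dataclasses import dataclass


@dataclass
class LoadRow:
    begbates: str
    endbates: str
    begattach: str
    endattach: str


def stamp_family(prefix: str, start: int, page_counts: list[int], width: int = 5):
    """Assign Bates ranges to a parent (first entry) and its attachments.

    Returns the load-file rows and the next unused sequence number.
    """
    def fmt(n: int) -> str:
        return f"{prefix}-{n:0{width}d}"

    spans = []
    cursor = start
    for pages in page_counts:
        spans.append((cursor, cursor + pages - 1))
        cursor += pages

    family_first, family_last = spans[0][0], spans[-1][1]
    # Every family member carries the same BegAttach and EndAttach.
    rows = [LoadRow(fmt(b), fmt(e), fmt(family_first), fmt(family_last)) for b, e in spans]
    return rows, cursor


rows, next_number = stamp_family("SMITH", 4217, [1, 13, 13, 13])
for row in rows:
    print(row)
```

Returning the next unused number gives the following document a starting point, which keeps the sequence from restarting between batches.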
The two files that carry it are the .DAT and the .OPT. The .DAT is delimited metadata, one row per document, with field names like BEGBATES, ENDBATES, BEGATTACH, CUSTODIAN, SENTDATE. The .OPT is the image cross-reference, mapping each Bates page to a single TIFF or JPEG path on disk.
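For a sense of what the two files look like on disk, here is a rough sketch. It assumes Concordance-style delimiters in the .DAT (ASCII 20 between fields, þ around values) and the seven-column Opticon layout in the .OPT (image key, volume, path, document break, folder break, box break, page count). Those conventions are common but not stated in this article, so check them against your ESI protocol. The volume, folder, custodian, and date values are made up.

```python
FIELD_SEP = "\x14"  # assumption: Concordance-style field delimiter (ASCII 20)
QUOTE = "\xfe"      # assumption: Concordance-style text qualifier (þ)


def dat_line(values: list[str]) -> str:
    """Wrap each value in the text qualifier and join with the field delimiter."""
    return FIELD_SEP.join(f"{QUOTE}{v}{QUOTE}" for v in values)


def opt_lines(prefix: str, beg: int, end: int, volume: str = "VOL001",
              folder: str = "IMAGES", width: int = 5) -> list[str]:
    """Emit one cross-reference line per page; only the first page marks the document break."""
    lines = []
    for n in range(beg, end + 1):
        key = f"{prefix}-{n:0{width}d}"
        doc_break = "Y" if n == beg else ""
        page_count = str(end - beg + 1) if n == beg else ""
        lines.append(f"{key},{volume},{folder}\\{key}.tif,{doc_break},,,{page_count}")
    return lines


header = dat_line(["BEGBATES", "ENDBATES", "BEGATTACH", "ENDATTACH", "CUSTODIAN", "SENTDATE"])
row = dat_line(["SMITH-04217", "SMITH-04219", "SMITH-04217", "SMITH-04219", "Smith, J.", "2024-03-01"])
pages = opt_lines("SMITH", 4217, 4219)
print("\n".join(pages))
```

One row in the .DAT maps to three lines in the .OPT here, which is the relationship the receiving platform has to be able to reconstruct.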
Field-name variants are where the .DAT refuses to ingest. PRODBEG instead of BEGBATES. STARTBATES instead of BEGBATES. Different review platforms emit different headers, and the receiving platform doesn’t always normalize. Your image keys must match the Bates filenames in the .OPT exactly, character for character, or the viewer breaks at the first page.
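Both failure modes can be checked before the load file goes out. This sketch assumes you keep your own alias table; PRODBEG and STARTBATES come from the examples above, PRODEND is a hypothetical addition, and the function names are illustrative.

```python
# Hypothetical alias table; extend it with whatever headers your platforms emit.
ALIASES = {
    "PRODBEG": "BEGBATES",
    "STARTBATES": "BEGBATES",
    "PRODEND": "ENDBATES",
}


def normalize_headers(headers: list[str]) -> list[str]:
    """Map known variants to canonical field names and reject collisions."""
    normalized = [ALIASES.get(h.strip().upper(), h.strip().upper()) for h in headers]
    dupes = sorted({h for h in normalized if normalized.count(h) > 1})
    if dupes:
        raise ValueError(f"headers collapse to the same field: {dupes}")
    return normalized


def mismatched_image_keys(expected_pages: list[str], opt_keys: list[str]):
    """Compare exactly, character for character. Returns (missing, unexpected)."""
    expected, actual = set(expected_pages), set(opt_keys)
    return sorted(expected - actual), sorted(actual - expected)


print(normalize_headers(["ProdBeg", "ENDBATES", "Custodian"]))
# ['BEGBATES', 'ENDBATES', 'CUSTODIAN']

missing, unexpected = mismatched_image_keys(
    ["SMITH-04217", "SMITH-04218"],
    ["SMITH-04217", "SMITH_04218"],  # delimiter drift in the .OPT
)
print(missing, unexpected)  # ['SMITH-04218'] ['SMITH_04218']
```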
In Branhaven v. BeefTek (D. Md. 2013), a producing party delivered roughly 112,000 pages as a single PDF without working Bates. The court called it “not reasonably usable” and assessed sanctions of more than $51,000.
The lesson held into the Bobb line of cases. Bates is the working metadata layer of a production at scale, and a PDF dump without it fails the Rule 34(b)(2)(E) test.
Pollock v. Superior Court (Cal. App. 2023) carved a narrow exception under California Code of Civil Procedure 2031.280, holding that documents need not be Bates-labeled to constitute a code-compliant response. The federal default still treats Bates as the working metadata layer, and Rule 26 proportionality questions push toward maintaining the convention even where state practice diverges. See our docs on Bates numbers in production exports for the field-by-field schema.
Page-Level vs. File-Level Numbering for Native Files
Spreadsheets don’t have pages. Audio doesn’t have pages. Video, databases, complex CAD files. None of them paginate the way a Word document does, and the moment one shows up in a production, page-level Bates numbering breaks down.
Two conventions handle this. The first uses a TIFF placeholder slip-sheet that carries a single Bates number and a “produced in native form” notice, with the actual native file saved to a filename that matches that Bates number, like SMITH-04217.xlsx. The second goes document-level: assign one Bates per file, skip the slip-sheet, and rely on the load file metadata to anchor the family. Either way, the Bates number and the native filename match. Under the slip-sheet convention, the SMITH-04217.tif placeholder pairs to the SMITH-04217.xlsx native, and the load file row anchors both.
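The pairing rule is simple enough to verify mechanically. A minimal sketch, assuming native files are named with the Bates number plus the original extension; the function name and the sample filenames are hypothetical.

```python
from pathlib import PurePath


def unpaired_natives(placeholder_bates: list[str], native_filenames: list[str]):
    """Return (placeholders with no native file, natives with no placeholder)."""
    stems = {PurePath(name).stem for name in native_filenames}
    placeholders = set(placeholder_bates)
    return sorted(placeholders - stems), sorted(stems - placeholders)


missing, orphans = unpaired_natives(
    ["SMITH-04217", "SMITH-04300"],
    ["SMITH-04217.xlsx", "SMITH-04301.wav"],
)
print(missing)  # ['SMITH-04300']
print(orphans)  # ['SMITH-04301']
```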
Page-level numbering on an Excel file produces unreliable output. A 30-tab workbook with macros has no canonical pagination, and any attempt to flatten it loses the formulas, the conditional formatting, and the audit trail.
In Blevins-Clark v. Beacon Communities (E.D. Ky. Sept. 26, 2025), the court rejected a PDF-only production of native ESI as not “reasonably usable” under Rule 34. Magistrate Judge Stinnett described the result as the equivalent of manufacturing a car without an engine: a shell without much use.
The takeaway: protect the native, stamp the placeholder, and watch for redaction failures during production that can re-expose privileged content when the placeholder ships without the redaction layer applied.
How to Bates Stamp a PDF (Manual and Automatic Methods)
The Origin of Bates Numbering: From 1890s Stamp to PDF Footer
Edwin G. Bates of New York filed US Patent 484,391 in 1892 for an automatic numbering machine. He filed US 676,082 in 1901 for the modern self-inking version that survived into the rubber-stamp drawers of every law firm document room for the next ninety years. The mechanical stamp solved one problem cleanly: cheap, durable, sequential identification across paper.
The function is older than the litigation it now supports. The format keeps moving.
Adobe Acrobat handles batch Bates for sub-1,000-page matters acceptably. Open the file, run the Bates tool, set prefix and start, write to the footer. For a small motion exhibit set, that’s the right tool. For productions at scale it fails, and the failure modes are predictable. Acrobat’s cover-sheet skip rule increments the counter on cover sheets that don’t carry a stamp, so SMITH-04217 logs as produced but lives on a page that never got marked. Stamping over signatures, court footers, and prior third-party Bates is the most common error a re-production catches at the last minute.
Production-at-scale needs a tool that emits matched .DAT and .OPT alongside the stamped image, with the field names the receiving platform expects. PDF-only Bates without matched load files becomes harder to defend as production volume and load-file requirements grow. The Branhaven sanctions are the canonical reminder.
Common Bates Numbering Mistakes That Break a Production
Seven things go wrong, plus one nasty edge case that surfaces only after a Rule 502(d) clawback. The list runs in rough order of how often we see them in small-firm e-discovery workflows.
- Restarting the sequence mid-production. Counter resets to one when a paralegal opens a new batch in Acrobat instead of continuing from the last EndBates. Duplicate and gap detection die on arrival, because every joining tool now sees two SMITH-00001 pages and refuses to load.
- Insufficient zero-padding. SMITH-200 sorts after SMITH-1000 in alphanumeric order. Pad to six digits at minimum so privilege logs join cleanly.
- Padding ceiling miscalibration. Pick four digits for what becomes a 50,000-page matter and the next batch overflows the ceiling. Plan for the highest plausible volume.
- Overwriting the native filename without retaining the original. A spreadsheet renamed SMITH-04217.xlsx loses the witness-named original. When a deponent says “I called it q3-actuals,” the original filename recorded alongside the Bates-named file is the only trail back.
- Stamping over text. The footer collides with signatures, court footers, and prior third-party Bates from documents produced in earlier matters. Once stamped, the obscured text is gone from the image. Re-OCR won’t recover it.
- Bates without matching metadata in the load file. Stamps land on pages, but the .DAT row never gets written, or it lists the wrong custodian. Industry commentary has flagged this pattern in tools that overlay text on images without emitting matched .DAT and .OPT, and the receiving platform can’t reconcile.
- Family relationships flattened. BegAttach and EndAttach end up identical to BegBates and EndBates, so the parent email and its three attachments arrive as four standalone documents with no family link. Deposition prep falls apart fast.
The plus-one is the Bates collision after a Rule 502(d) clawback. You produce SMITH-04217 through SMITH-04219, opposing counsel returns SMITH-04218 as inadvertently privileged, you re-produce, and the next batch starts at SMITH-04220 without re-issuing 04218. Three weeks later a supplemental production reuses 04218 for a different document. Two documents now share a Bates. The motion to clarify the record is unpleasant.
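Gaps, restarts, and reused numbers all show up in a single pass over the production register. The sketch below assumes the register holds one (begin, end) pair of sequence numbers per document; audit_ranges is a hypothetical name.

```python
def audit_ranges(ranges: list[tuple[int, int]]):
    """Return (gaps, overlaps) across a list of (begin, end) page ranges."""
    gaps, overlaps = [], []
    if not ranges:
        return gaps, overlaps
    ordered = sorted(ranges)
    highest_end = ordered[0][1]
    for beg, end in ordered[1:]:
        if beg > highest_end + 1:
            # Numbers between the last page seen and this document were never issued.
            gaps.append((highest_end + 1, beg - 1))
        elif beg <= highest_end:
            # This document starts on a number that is already taken.
            overlaps.append((beg, min(end, highest_end)))
        highest_end = max(highest_end, end)
    return gaps, overlaps


# After the clawback: 04218 is withdrawn and the next batch starts at 04220.
after_clawback = [(4217, 4217), (4219, 4219), (4220, 4225)]
print(audit_ranges(after_clawback))  # ([(4218, 4218)], [])

# Register still lists the original 04218, and a supplemental batch reuses it.
with_reuse = [(4217, 4217), (4218, 4218), (4219, 4219), (4220, 4225), (4218, 4218)]
print(audit_ranges(with_reuse))      # ([], [(4218, 4218)])
```

A gap is not always an error. A clawed-back number should stay retired, so the useful habit is to log intentional gaps and treat every unexplained gap or overlap as a stop sign.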
Bates Numbering for Small Firms: A Faster Workflow
Manual Acrobat math is unforgiving. Thirty seconds per file across 4,000 files is 33 hours, before any QC pass. Saturday gap-verification by hand is the workflow tax small firms quietly pay every quarter. Re-production after a clawback rebuilds the load file in a text editor, line by line, while a paralegal who started Friday at 9 am tries to keep her cursor on the right row at 11 pm.
Hintyr is Agentic Document Review. It applies Bates numbers and emits matched .DAT and .OPT load files in one pass. The same agent flags prefix drift across batches, catches gaps and overlapping ranges before the production leaves the firm, and carries family ranges (BegAttach and EndAttach) through stamping without flattening them. It also flags pages where the Bates footer overlaps an existing footer or a redaction layer. See Hintyr Bates numbering for the workflow and AI-assisted Bates numbering for the QC layer.
Bates Number FAQ
What does a Bates number look like?
A party prefix, a delimiter, and a zero-padded sequence. SMITH-04217 is typical: SMITH for the producing party, a hyphen, then a five-digit running counter that does not reset across the production. Some firms use an underscore, some use no delimiter at all. Pick one and write it into the ESI protocol.
Who invented Bates numbering?
Edwin G. Bates patented the original automatic numbering machine in 1892 (US 484,391) and the modern self-inking version in 1901 (US 676,082). The 1893 date that circulates online is a common error.
Is a Bates number the same as a page number?
No. Page numbers reset at the start of each document. Bates numbers run sequentially across the entire production. A 40-page brief might be pages 1 to 40 internally and SMITH-04217 to SMITH-04256 in the production. Counsel and witnesses cite the Bates range, not the internal pagination.
How many digits should a Bates number have?
Six or seven digits is the working convention for productions over 100,000 pages. Pad with leading zeros so SMITH-000200 sorts before SMITH-001000. Insufficient padding breaks alphanumeric sorting in every common review platform, and the fix once a production is out is rarely cheap.
Do native files get Bates numbers?
Yes, usually through a TIFF placeholder slip-sheet that carries a single Bates and a “produced in native form” notice, with the native saved to a filename matching that Bates. Excel files and audio do not have pages to stamp directly. The best AI document review tools for law firms now apply that placeholder convention automatically when the protocol calls for it.
What is the difference between BegBates and EndBates?
BegBates is the first page of a document. EndBates is the last. Together they define the Bates range for that document. BegAttach and EndAttach define the family range across an email and all its attachments. The four together let a load file represent any production cleanly.
This article is for informational purposes and does not constitute legal advice. Federal and state rules on production form, including FRCP 34(b)(2)(E) and California Code of Civil Procedure 2031.280, vary by jurisdiction and matter. Consult counsel admitted in the relevant jurisdiction before relying on the conventions described above.