Processing

Processing in e-discovery is the stage where collected electronically stored information is prepared for review by extracting text, metadata, and embedded objects, filtering out irrelevant or duplicate files, and converting documents into formats suitable for analysis. It bridges the gap between raw data collection and efficient document review.

What is processing?

Processing sits between Collection and Review in the Electronic Discovery Reference Model (EDRM). During processing, raw collected data is ingested into a review platform. Text is extracted from documents, metadata fields such as author, date, and file type are cataloged, and embedded objects like email attachments and linked files are broken out as separate reviewable items.
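
To make the extraction step concrete, here is a minimal sketch of how a single email could be split into text, metadata, and attachment records using Python's standard `email` package. It is an illustration only, not how any particular review platform works; the function names `process_email` and `build_sample` and the shape of the returned dictionary are assumptions made for this example.

```python
from email import message_from_bytes, policy
from email.message import EmailMessage


def process_email(raw: bytes) -> dict:
    """Split one raw email into text, metadata, and attachment records."""
    msg = message_from_bytes(raw, policy=policy.default)

    # Prefer the plain-text body; get_body returns None if there is none.
    body = msg.get_body(preferencelist=("plain",))
    text = body.get_content() if body is not None else ""

    # Catalog a few header fields as metadata.
    metadata = {
        "author": str(msg["From"] or ""),
        "date": str(msg["Date"] or ""),
        "subject": str(msg["Subject"] or ""),
    }

    # Each attachment is broken out as its own item tied to the parent.
    attachments = [
        {
            "filename": part.get_filename(),
            "content_type": part.get_content_type(),
            "size": len(part.get_payload(decode=True) or b""),
        }
        for part in msg.iter_attachments()
    ]
    return {"text": text, "metadata": metadata, "attachments": attachments}


def build_sample() -> bytes:
    """Build a small test message (hypothetical data) with one attachment."""
    msg = EmailMessage()
    msg["From"] = "alice@example.com"
    msg["To"] = "bob@example.com"
    msg["Subject"] = "Q3 forecast"
    msg.set_content("Forecast attached.")
    msg.add_attachment(
        b"a,b\n1,2\n",
        maintype="application",
        subtype="octet-stream",
        filename="forecast.csv",
    )
    return msg.as_bytes()
```

Calling `process_email(build_sample())` returns the body text, the three header fields, and one attachment record for `forecast.csv`. A production pipeline would also handle HTML bodies, nested messages, and proprietary formats, which this sketch leaves out.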

Processing also includes culling steps that reduce data volume before human reviewers see it. Date range filtering removes documents outside the relevant time period. Deduplication identifies exact or near-duplicate copies so each unique document is reviewed only once. File type exclusion removes system files, executables, and other non-responsive categories. These steps can dramatically cut the number of documents requiring manual review.
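
The three culling steps described above can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions: the `Doc` record, the `cull` function, and the `EXCLUDED_EXTENSIONS` list are hypothetical names invented for the example, and only exact duplicates (identical bytes) are detected. Near-duplicate detection needs a different technique and is not shown.

```python
import hashlib
import os
from dataclasses import dataclass
from datetime import date

# Illustrative exclusion list (an assumption); real matters define their own.
EXCLUDED_EXTENSIONS = {".exe", ".dll", ".sys"}


@dataclass
class Doc:
    name: str
    modified: date
    content: bytes


def cull(docs, start: date, end: date):
    """Apply date filtering, file type exclusion, and exact deduplication."""
    seen_hashes = set()
    kept = []
    for doc in docs:
        # Date range filtering: skip documents outside the relevant period.
        if not (start <= doc.modified <= end):
            continue
        # File type exclusion: skip extensions on the exclusion list.
        extension = os.path.splitext(doc.name)[1].lower()
        if extension in EXCLUDED_EXTENSIONS:
            continue
        # Exact deduplication: keep only the first copy of identical content.
        digest = hashlib.sha256(doc.content).hexdigest()
        if digest in seen_hashes:
            continue
        seen_hashes.add(digest)
        kept.append(doc)
    return kept


# Hypothetical sample collection.
SAMPLE = [
    Doc("memo.docx", date(2021, 3, 1), b"memo"),
    Doc("memo_copy.docx", date(2021, 3, 2), b"memo"),  # exact duplicate
    Doc("setup.exe", date(2021, 3, 5), b"bin"),        # excluded file type
    Doc("old.txt", date(2015, 1, 1), b"old"),          # outside date range
]
```

With this sample, `cull(SAMPLE, date(2021, 1, 1), date(2021, 12, 31))` keeps only `memo.docx`. Hashing the content rather than comparing file names is what lets the renamed copy be recognized as a duplicate.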

"A party need not provide discovery of electronically stored information from sources that the party identifies as not reasonably accessible because of undue burden or cost." -- Federal Rules of Civil Procedure, Rule 26(b)(2)(B)

Processing by the numbers

According to EDRM data, processing typically reduces total document volume by 30 to 60 percent through deduplication, date filtering, and file type exclusion.
Industry estimates place traditional processing costs between $25 and $75 per gigabyte, depending on data complexity and the level of culling required.
The Sedona Conference identifies proportionate processing as a key factor in satisfying the cost-benefit analysis required by FRCP Rule 26(b)(1).
Email archives, compressed files, and container formats (such as ZIP and PST) require recursive extraction during processing, expanding a single source file into dozens or hundreds of reviewable items.
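
Recursive extraction, mentioned in the last point above, can be illustrated with Python's standard `zipfile` module. The sketch below handles ZIP containers only; PST and other mail archives need dedicated parsers that the standard library does not provide. The names `expand` and `make_zip` and the `max_depth` limit are assumptions made for this example.

```python
import io
import zipfile


def expand(name: str, data: bytes, max_depth: int = 5):
    """Yield (path, bytes) for every non-container item inside the data.

    Note: zipfile.is_zipfile also returns True for ZIP-based formats such
    as .docx, so a real pipeline would decide which types to unpack.
    """
    if max_depth > 0 and zipfile.is_zipfile(io.BytesIO(data)):
        with zipfile.ZipFile(io.BytesIO(data)) as archive:
            for info in archive.infolist():
                if info.is_dir():
                    continue
                # Recurse so that archives nested in archives are unpacked.
                child_path = f"{name}/{info.filename}"
                yield from expand(child_path, archive.read(info), max_depth - 1)
    else:
        # Not a container (or depth limit reached): emit as a single item.
        yield name, data


def make_zip(files: dict) -> bytes:
    """Build an in-memory ZIP from {filename: bytes} (helper for the demo)."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        for filename, content in files.items():
            archive.writestr(filename, content)
    return buffer.getvalue()


# One source file that contains a nested archive.
inner = make_zip({"a.txt": b"A", "b.txt": b"B"})
outer = make_zip({"inner.zip": inner, "c.txt": b"C"})
items = list(expand("outer.zip", outer))
```

Here the single source file `outer.zip` expands into three items: `outer.zip/inner.zip/a.txt`, `outer.zip/inner.zip/b.txt`, and `outer.zip/c.txt`. The depth limit is a simple guard against pathologically nested archives; it is not a complete defense against malicious input.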

Processing in Hintyr

Hintyr processes uploaded files automatically. When you upload files, the platform extracts text, metadata, and embedded content without any manual configuration. Processing details and status are covered in the processing documentation.

You can monitor the processing status of each file in the uploads tab of the document browser. Once processing is complete, documents become searchable and available for analysis by the AI agent, which can answer questions, identify key documents, and perform review tasks across the processed collection.

Frequently asked questions

How long does processing take?

Processing time depends on the number and size of files, their formats, and how many embedded objects they contain. Most individual files process within seconds. Large batches with complex container formats like email archives may take several minutes.

What happens to duplicate files during processing?

Hintyr can identify exact duplicates during processing based on file content. Deduplication settings are configurable per case, allowing you to keep or remove duplicates depending on your review requirements.

Can I search documents before processing is complete?

Documents become searchable as soon as their text has been extracted. Files still in the processing queue are not yet available for search or AI analysis.

Does processing alter the original file?

No. Processing extracts information from the file and creates searchable indexes and viewable renditions. The original native file is preserved unchanged.
