A Glossary of eDiscovery Terms

We lawyers have a tendency to confuse our terminology when discussing technology.  This is especially true when discussing the technical aspects of eDiscovery. 

I was inspired by the excellent Grossman-Cormack Glossary of Technology-Assisted Review, which attempted to define the terminology surrounding TAR.  I prepared the following short glossary of commonly used eDiscovery terms to help our group understand some of the jargon used by eDiscovery professionals.  The terms can be found after the break.

• Litigation Hold – The initial letter sent to a client to ensure compliance with the duty to preserve.

• Preservation – The efforts that follow the litigation hold.  Preservation involves turning off automatic deletion processes, identifying key custodians, and working with those custodians and the client’s IT manager.

• Collection – The gathering of documents, whether paper or electronically stored information (ESI), from the client.

o Self Collection – A self collection occurs when the client chooses what documents will be collected for eventual review and production.  The Court of Chancery has strongly advised against this.  See Roffe v. Eagle Rock.

o Targeted Collection – A collection of a client’s data using narrow parameters such as date restrictions and search terms.  This is normally done in an attempt to cut down on costs.  This type of collection has its own risks: if it occurs before a meet and confer, not all custodians and search terms may be known, and those that are known may be subject to change.

o Full Collection – A collection of a client’s entire mailbox and/or imaging of their hard drive.  This ensures that all possible data is collected but could lead to greater cost down the road.

• Processing – This involves deduplicating documents, extracting metadata, and making the documents searchable and ready for review.

• OCR – Optical Character Recognition, the process of converting document images into searchable text.

• Deduplication – The process by which duplicate documents are eliminated, thus cutting down on the total number of documents to review.

o Global Deduplication – Deduplicating across an entire collection.

o By Custodian – Deduplicating on a custodian-by-custodian basis.
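The difference between the two deduplication scopes can be sketched in a few lines of Python.  This is only an illustration under stated assumptions: it treats two documents as duplicates when their contents match byte for byte (compared by SHA-256 hash), and the custodian names, document IDs and function names are hypothetical.  Actual processing tools may decide what counts as a duplicate differently.

```python
import hashlib
from collections import defaultdict


def content_hash(data: bytes) -> str:
    """Fingerprint a document by its exact contents (an assumption of this sketch)."""
    return hashlib.sha256(data).hexdigest()


def dedupe_global(docs):
    """Keep the first copy of each document across the entire collection."""
    seen = set()
    kept = []
    for custodian, doc_id, content in docs:
        h = content_hash(content)
        if h not in seen:
            seen.add(h)
            kept.append((custodian, doc_id))
    return kept


def dedupe_by_custodian(docs):
    """Keep the first copy of each document within each custodian's data."""
    seen = defaultdict(set)  # custodian -> hashes already kept for that custodian
    kept = []
    for custodian, doc_id, content in docs:
        h = content_hash(content)
        if h not in seen[custodian]:
            seen[custodian].add(h)
            kept.append((custodian, doc_id))
    return kept


# Hypothetical collection: (custodian, document ID, contents)
docs = [
    ("Smith", "DOC-1", b"Quarterly report"),
    ("Smith", "DOC-2", b"Quarterly report"),
    ("Jones", "DOC-3", b"Quarterly report"),
    ("Jones", "DOC-4", b"Lunch on Friday?"),
]

print(dedupe_global(docs))        # Jones's copy of the report is dropped
print(dedupe_by_custodian(docs))  # Jones's copy of the report is kept
```

In this example global deduplication leaves two documents to review, while custodian-level deduplication leaves three, because the same report is kept once for each custodian who held it.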

• Search Terms – Terms that are developed for targeted collection or review of documents…or both.  Search terms are used to cut down on the costs associated with reviewing a full collection of documents.

• Publishing/Migration/Loading – The process by which search terms are applied to a collection and the results are loaded to a review database. 
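As a rough illustration of applying search terms to a collection and loading only the hits, consider the Python sketch below.  It assumes simple case-insensitive, whole-word matching and uses a plain list to stand in for the review database; the function names, document IDs and terms are hypothetical, and real review platforms have their own search syntax.

```python
import re


def matching_terms(text, terms):
    """Return the search terms that appear in the text (case-insensitive, whole words)."""
    hits = []
    for term in terms:
        pattern = r"\b" + re.escape(term) + r"\b"
        if re.search(pattern, text, flags=re.IGNORECASE):
            hits.append(term)
    return hits


def publish(collection, terms):
    """Apply the search terms and load only the documents that hit."""
    review_db = []  # stand-in for a review database
    for doc_id, text in collection:
        hits = matching_terms(text, terms)
        if hits:
            review_db.append({"doc_id": doc_id, "hits": hits})
    return review_db


# Hypothetical collection and search terms
collection = [
    ("DOC-1", "The merger agreement was signed on Monday."),
    ("DOC-2", "Lunch on Friday?"),
    ("DOC-3", "Please forward the Merger escrow schedule."),
]
terms = ["merger", "escrow"]

review_db = publish(collection, terms)
for record in review_db:
    print(record["doc_id"], record["hits"])
```

Only DOC-1 and DOC-3 would be loaded for review here; DOC-2 contains none of the terms and stays out of the database.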

• Review – The analysis performed on documents after publishing to a database.  The review takes place in a review tool such as Concordance or Relativity.  The analysis usually includes responsiveness to document requests, attorney-client privilege and key issues. 

• Production – The process by which documents are selected for, and delivered to, opposing or co-counsel.  This involves OCR’ing, Bates stamping and delivery of TIFFs (images), OCR (text) and load files.

o TIFFs – The standard file format for the production of document images.

o Load Files – A file included with a production that allows it (including text, images and metadata) to be loaded into a review tool.
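To make Bates stamping and load files a little more concrete, here is a minimal Python sketch.  It assumes one Bates number per page, a hypothetical "ABC" prefix, and a simple comma-delimited load file with illustrative field names (BEGBATES, ENDBATES, CUSTODIAN, TEXTPATH).  The actual delimiters and fields depend on the review tool and the production specifications agreed in the case.

```python
import csv
import io


def bates_number(prefix, n, width=7):
    """Format a Bates number such as ABC0000001 (prefix and width are assumptions)."""
    return f"{prefix}{n:0{width}d}"


def build_load_file(docs, prefix="ABC", start=1):
    """Assign a Bates range to each document and return the load file as text."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow(["BEGBATES", "ENDBATES", "CUSTODIAN", "TEXTPATH"])
    next_page = start
    for doc in docs:
        beg = bates_number(prefix, next_page)
        end = bates_number(prefix, next_page + doc["pages"] - 1)
        # The text path points to the OCR/extracted text delivered with the images.
        writer.writerow([beg, end, doc["custodian"], f"TEXT/{beg}.txt"])
        next_page += doc["pages"]
    return buf.getvalue()


# Hypothetical production set: a three-page document and a one-page document
docs = [
    {"custodian": "Smith", "pages": 3},
    {"custodian": "Jones", "pages": 1},
]

load_file = build_load_file(docs)
print(load_file)
```

The first document occupies ABC0000001 through ABC0000003 and the second picks up at ABC0000004, so every produced page carries a unique number that the load file ties back to its custodian and text.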

• Production Specifications – A set of options that typically appear in Requests for Production of Documents.  Specifications cover the metadata fields to be produced, types of load files and whether documents will be produced in TIFF form.