De-duplication (or “de-duping”) is the process of comparing electronic documents based on their characteristics and removing or marking duplicate documents within the data set. There are various methods of de-duplication, and it is important to understand the benefits and ramifications of each before your files are processed.
For example, vertical de-duping (or “custodian de-duping”) refers to identifying duplicates within the data of a single custodian or source. Horizontal de-duping (“case de-duping” or “cross-custodian de-duping”) covers the entire production set, identifying duplicate documents and emails across every custodian and source involved. There is also the question of whether you want duplicates re-inserted prior to production.
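The difference between the two scopes can be illustrated with a minimal Python sketch. It assumes each document has already been reduced to a content hash; the record layout, the sample data, and the `dedupe` function are hypothetical and serve only to illustrate the idea, not to describe any particular processing tool.

```python
# Hypothetical records: (custodian, document id, content hash).
DOCS = [
    ("alice", "A-1", "h1"),
    ("alice", "A-2", "h1"),  # same content as A-1, same custodian
    ("bob", "B-1", "h1"),    # same content as A-1, different custodian
    ("bob", "B-2", "h2"),
]


def dedupe(docs, scope="horizontal"):
    """Return (kept, duplicates) as lists of document ids.

    scope="vertical":   compare documents only within each custodian.
    scope="horizontal": compare documents across all custodians.
    The first occurrence of each document is kept; later ones are marked.
    """
    if scope not in ("vertical", "horizontal"):
        raise ValueError("scope must be 'vertical' or 'horizontal'")
    seen = set()
    kept, duplicates = [], []
    for custodian, doc_id, digest in docs:
        # Vertical de-duping makes the custodian part of the comparison key.
        key = (custodian, digest) if scope == "vertical" else digest
        if key in seen:
            duplicates.append(doc_id)
        else:
            seen.add(key)
            kept.append(doc_id)
    return kept, duplicates
```

In this sample, vertical de-duping marks only A-2 as a duplicate, while horizontal de-duping also marks B-1, because its content already appears under another custodian. Marking duplicates rather than deleting them, as this sketch does, leaves open the option of re-inserting them before production.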
The primary purpose of de-duplication is to allow attorneys and paralegals to make the most of their time by focusing on the most relevant documents. Further, it helps create a clean, consistent data set in which all copies of the same document are marked with identical designations (responsive, privileged, confidential, etc.). De-duplication also reduces the number of files that must be loaded into a database. All of these factors translate to greater efficiency and reduced costs.
Lighthouse Document Technologies uses the most advanced file hashing technology (or “digital fingerprinting”) to identify and, as needed, remove or mark exact or near duplicates within any set of data. With our knowledge and experience, we can guide you to the appropriate de-duplication strategy for the needs of your particular case.
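As a rough illustration of digital fingerprinting in general, the Python sketch below hashes file contents and groups files that share a digest. It is an assumption-laden example, not a description of any vendor's process: the choice of SHA-256 and the helper names `fingerprint` and `find_exact_duplicates` are hypothetical. A cryptographic hash of this kind only matches byte-identical content, so near-duplicate detection requires other techniques that are not shown here.

```python
import hashlib


def fingerprint(data: bytes) -> str:
    """Return a hex digest that acts as a fingerprint of the content.

    SHA-256 is an assumption made for this sketch.
    """
    return hashlib.sha256(data).hexdigest()


def find_exact_duplicates(files: dict[str, bytes]) -> dict[str, list[str]]:
    """Group file names by content digest; keep groups with 2+ members."""
    groups: dict[str, list[str]] = {}
    for name, data in files.items():
        groups.setdefault(fingerprint(data), []).append(name)
    return {digest: names for digest, names in groups.items() if len(names) > 1}
```

Given three hypothetical files where two contain exactly the same bytes and the third differs by a single trailing space, only the first two are grouped as duplicates, which shows how sensitive exact hashing is to small differences.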
