E-Discovery Glossary
DAC (Digital to Analog Converter) Converts digital data to analog data.
DAD (Digital Audio Disc) Another term for compact disc.
DAT (Digital Audio Tape) A magnetic tape generally used to record audio but can hold up to 40 gigabytes (or 60 CDs) of data if used for data storage. Has the disadvantage of being a serial access device. Often used for backup.
Data Any information stored on a computer. All software is divided into two general categories: data and programs. Programs are collections of instructions for manipulating data. In database management systems, data files are the files that store the database information. Other files, such as index files and data dictionaries, store administrative information, known as metadata.
Data Categorization The categorization and sorting of ESI such as foldering by "concept," content, subject, taxonomy, etc. through the use of technology such as search and retrieval software or artificial intelligence to facilitate review and analysis.
Data Collection See Harvesting.
Data Controller (as used with regard to the EU Data Protection Act) The natural or legal person who alone or jointly with others determines the purposes for which and the manner in which any Personal Data are to be processed.
Data Element A combination of characters or bytes referring to one separate piece of information, such as name, address, or age.
Data Encryption Standard (DES) A form of private key encryption developed by IBM in the late 1970s.
Data Extraction The process of retrieving data from documents (hard copy or electronic). The process may be manual or electronic.
Data Field See Field.
Data Filtering The process of identifying for extraction specific data based on specified parameters.
Data Formats The organization of information for display, storage or printing. Data is sometimes maintained in certain common formats so that it can be used by various programs, which may only work with data in a particular format, e.g., PDF or HTML.
Data Harvesting See Harvesting.
Data Mining Data mining generally refers to knowledge discovery in databases (structured data); often techniques for extracting summaries and reports from databases and data sets. In the context of electronic discovery, this term often refers to the processes used to cull through a collection of ESI to extract evidence for production or presentation in an investigation or in litigation. See also Text Mining.
Data Processor (as used with regard to the EU Data Protection Act) A natural or legal person (other than an employee of the Data Controller) who processes Personal Data on behalf of the Data Controller.
Data Set A named or defined collection of data. See also Production Data Set and Privilege Data Set.
Data Subject (as used with regard to the EU Data Protection Act) An individual who is the subject of Personal Data.
Data Verification Assessment of data to ensure it has not been modified. The most common method of verification is hash coding by a method such as MD5. See also Digital Fingerprint, File Level Binary Comparison, and Hash Coding.
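As an illustration of verification by hash coding, the following minimal Python sketch records an MD5 hash for some data and later checks copies against it. The function name md5_hex and the sample byte strings are hypothetical and chosen only for this example; real workflows hash whole files and follow the tool and procedure agreed for the matter.

```python
import hashlib


def md5_hex(data: bytes) -> str:
    """Return the MD5 hash of a byte string as hexadecimal text."""
    return hashlib.md5(data).hexdigest()


# Hash recorded when the data was collected (sample content is made up).
original = b"Quarterly report, final version"
recorded_hash = md5_hex(original)

# Later, copies are re-hashed and compared with the recorded value.
unchanged_copy = b"Quarterly report, final version"
altered_copy = b"Quarterly report, final version."  # one byte added

unchanged_ok = md5_hex(unchanged_copy) == recorded_hash  # hashes match
altered_ok = md5_hex(altered_copy) == recorded_hash      # hashes differ
```

A matching hash indicates the content is unchanged; any difference in the bytes, however small, yields a different hash.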
Database Management System (DBMS) A software system used to access and retrieve data stored in a database.
Database In electronic records, a database is a set of data elements consisting of at least one file, or of a group of integrated files, usually stored in one location and made available to several users. Databases are sometimes classified according to their organizational approach, with the most prevalent approach being the relational database, a tabular database in which data is defined so that it can be reorganized and accessed in a number of different ways. Another popular organizational structure is the distributed database, which can be dispersed or replicated among different points in a network. Computer databases typically contain aggregations of data records or files, such as sales transactions, product catalogs and inventories, and customer profiles. SQL (Structured Query Language) is a standard computer language for making interactive queries from and updates to a database.
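To make the reference to SQL queries and updates concrete, here is a small sketch using Python's built-in sqlite3 module with an in-memory relational database. The table name, column names and sample rows are invented for this example.

```python
import sqlite3

# An in-memory relational database; nothing is written to disk.
conn = sqlite3.connect(":memory:")

# Define a table (hypothetical schema) and load a few sample records.
conn.execute(
    "CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT, city TEXT)"
)
conn.executemany(
    "INSERT INTO customers (name, city) VALUES (?, ?)",
    [("Ada", "London"), ("Grace", "New York"), ("Alan", "London")],
)

# An update: change one record in place.
conn.execute(
    "UPDATE customers SET city = ? WHERE name = ?", ("Manchester", "Alan")
)

# A query: retrieve the names of customers in a given city.
london_customers = [
    row[0]
    for row in conn.execute(
        "SELECT name FROM customers WHERE city = ? ORDER BY name", ("London",)
    )
]
```

The same rows can be queried in other ways (by name, by id, grouped by city) without restructuring the stored data, which is the flexibility the relational approach is known for.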
Date/Time Normalization See Normalization.
Daubert (challenge) Daubert v. Merrell Dow Pharmaceuticals, 509 U.S. 579 (1993), addresses the admission of scientific expert testimony to ensure that the testimony is reliable before it is considered for admission pursuant to Rule 702. The court assesses the testimony by analyzing the methodology and applicability of the expert's approach. Faced with a proffer of expert scientific testimony, the trial judge must determine first, pursuant to Rule 104(a), whether the expert is proposing to testify to (1) scientific knowledge that (2) will assist the trier of fact to understand or determine a fact at issue. This involves a preliminary assessment of whether the reasoning or methodology is scientifically valid and whether it can be applied to the facts at issue. Daubert suggests an open approach and provides a list of four potential factors: (1) whether the theory can be or has been tested; (2) whether the theory has been subjected to peer review or publication; (3) the known or potential rate of error of that particular technique and the existence and maintenance of standards controlling the technique's operation; and (4) consideration of general acceptance within the scientific community. 509 U.S. at 593-94.
DDE (Dynamic Data Exchange) A form of interprocess communications used by Microsoft Windows to support the exchange of commands and data between two simultaneously running applications.
DEB (Digital Evidence Bag) A standardized electronic "wrapper" or "container" for electronic evidence to preserve and transfer evidence in an encrypted or protected form that prevents deliberate or accidental alteration. The secure "wrapper" provides metadata concerning the collection process and context for the contained data.
Decompression To expand or restore compressed data back to its original size and format. See Compression.
Decryption Transformation of encrypted (or scrambled) data back to original form.
De-Duplication De-Duplication ("De-Duping") is the process of comparing electronic records based on their characteristics and removing or marking duplicate records within the data set. The definition of "duplicate records" should be agreed upon, i.e., whether an exact copy from a different location (such as a different mailbox, server tapes, etc.) is considered to be a duplicate. De-duplication can be selective, depending on the agreed-upon criteria. See also Case De-Duplication, Content Comparison, Cross-Custodian De-Duplication, Custodian De-Duplication, Data Verification, Digital Fingerprint, File Level Binary Comparison, Hash Coding, Horizontal De-Duplication, Metadata Comparison, Near De-Duplication, and Production De-Duplication.
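The effect of the agreed definition of "duplicate" can be sketched in Python. This is an assumption-laden illustration, not a description of any particular tool: records are modeled as (custodian, content) pairs, duplicates are detected by a SHA-256 hash of the content, and the hypothetical flag across_custodians switches between comparing across the whole data set and comparing only within each custodian's records.

```python
import hashlib


def deduplicate(records, across_custodians=True):
    """Split records into first occurrences and marked duplicates.

    records: iterable of (custodian, content_bytes) pairs (hypothetical model).
    across_custodians: if True, identical content held by different
    custodians counts as a duplicate; if False, only repeats within the
    same custodian's records do.
    """
    seen = set()
    unique, duplicates = [], []
    for custodian, content in records:
        digest = hashlib.sha256(content).hexdigest()
        key = digest if across_custodians else (custodian, digest)
        if key in seen:
            duplicates.append((custodian, content))
        else:
            seen.add(key)
            unique.append((custodian, content))
    return unique, duplicates


sample = [
    ("alice", b"memo"),
    ("bob", b"memo"),      # same content, different custodian
    ("alice", b"memo"),    # same content, same custodian
    ("bob", b"invoice"),
]

unique_all, dupes_all = deduplicate(sample, across_custodians=True)
unique_per, dupes_per = deduplicate(sample, across_custodians=False)
```

With the sample data, comparing across custodians keeps two records and marks two as duplicates, while comparing within each custodian keeps three and marks one.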
De-Fragment ("de-frag") Use of a computer utility to reorganize files so they are more contiguous on a hard drive or other storage medium, if the files or parts thereof have become fragmented and scattered in various locations within the storage medium in the course of normal computer operations. Used to optimize the operation of the computer, it will overwrite information in unallocated space. See Fragmented.
Deleted Data Deleted Data is data that existed on the computer as live data and that has been deleted by the computer system or end-user activity. Deleted data may remain on storage media in whole or in part until it is overwritten or "wiped." Even after the data itself has been wiped, directory entries, pointers or other information relating to the deleted data may remain on the computer. "Soft deletions" are data marked as deleted (and not generally available to the end-user after such marking), but not yet physically removed or overwritten. Soft-deleted data can be restored with complete integrity.
Deleted File A file with disc space that has been designated as available for reuse; the deleted file remains intact until it is overwritten.
Deletion Deletion is the process whereby data is removed from active files and other data storage structures on computers and rendered inaccessible except through the use of special data recovery tools designed to recover deleted data. Deletion occurs on several levels in modern computer systems: (a) File level deletion renders the file inaccessible to the operating system and normal application programs and marks the storage space occupied by the file's directory entry and contents as free and available for reuse for data storage; (b) Record level deletion occurs when a record is rendered inaccessible to a database management system (DBMS) (usually marking the record storage space as available for reuse by the DBMS, although in some cases the space is never reused until the database is compacted) and is also characteristic of many email systems; and (c) Byte level deletion occurs when text or other information is deleted from the file content (such as the deletion of text from a word processing file); such deletion may render the deleted data inaccessible to the application intended to be used in processing the file, but may not actually remove the data from the file's content until a process such as compaction or rewriting of the file causes the deleted data to be overwritten.
De-NIST The use of an automated filter program that screens files against the NIST list of computer file types to separate those generated by a system from those generated by a user. See NIST List.
Descenders The portion of a character that falls below the main part of the letter (e.g. g, p, q).
Deshading Removing shaded areas to render images more easily recognizable by OCR. Deshading software typically searches for areas with a regular pattern of tiny dots.
Deskewing The process of straightening skewed (tilted) images. Deskewing is one of the image enhancements that can improve OCR accuracy. Documents often become skewed when scanned or faxed.
Desktop Generally refers to the working area of the display on an individual PC.
Despeckling Removing isolated speckles from an image file. Speckles often develop when a document is scanned or faxed. See Speckle.
DIA/DCA (Document Interchange Architecture) An IBM standard for transmission and storage of voice, text or video over networks.
Digital Information stored as a string of ones and zeros (numeric). Opposite of analog.
Digital Certificate Electronic records that contain keys used to decrypt information, especially information sent over a public network like the Internet.
Digital Fingerprint A fixed-length hash code that uniquely represents the binary content of a file. See also Data Verification, File Level Binary Comparison, and Hash Coding.
Digital Signature A way to ensure the identity of the sender, utilizing public key cryptography and working in conjunction with certificates. See Certificate and PKI Digital Signature.
Digitize The process of converting an analog value into a digital (numeric) representation.
Directory A simulated file folder or container used to organize files and directories in a hierarchical or tree-like structure. UNIX and DOS use the term "directory," while Mac and Windows use the term "folder."
Dirty Text OCR output reflecting text as read by the OCR engine(s) with no clean up.
Disaster Recovery Tapes Portable media used to store data for backup purposes. See Backup Data/Backup Tapes.
Disc mirroring A method of protecting data from a catastrophic hard disc failure or for long term data storage. As each file is stored on the hard disc, a "mirror" copy is made on a second hard disc or on a different part of the same disc. See also Mirroring and Mirror Image.
Disc Partition A section of a hard drive consisting of a set of consecutive cylinders.
Disc/Disk Round, flat storage media with layers of material that enable the recording of data.
Discovery Discovery is the process of identifying, locating, securing and producing information and materials for the purpose of obtaining evidence for utilization in the legal process. The term is also used to describe the process of reviewing all materials that may be potentially relevant to the issues at hand and/or that may need to be disclosed to other parties, and of evaluating evidence to prove or disprove facts, theories or allegations. There are several ways to conduct discovery, the most common of which are interrogatories, requests for production of documents and depositions.
Discwipe Utility that overwrites existing data. Various utilities exist with varying degrees of efficiency; some wipe only named files or unallocated space of residual data, so unsophisticated users who try to wipe evidence may leave behind files of which they are unaware.
Disposition The final business action carried out on a record. This action generally is to destroy or archive the record. Electronic record disposition can include "soft deletions" (see Deletion), "hard deletions," "hard deletions with overwrites," "archive to long-term store," "forward to organization," and "copy to another media or format and delete (hard or soft)."
Distributed Data Distributed Data is that information belonging to an organization that resides on portable media and non-local devices such as remote offices, home computers, laptop computers, personal digital assistants ("PDAs"), wireless communication devices (e.g., Blackberry) and Internet repositories (including email hosted by Internet service providers or portals and web sites). Distributed data also includes data held by third parties such as application service providers and business partners. Note: Information Technology organizations may define distributed data differently (for example, in some organizations distributed data includes any non-server-based data, including workstation disc drives).
Dithering In printing, dithering is usually called halftoning, and shades of gray are called halftones. The more dither patterns that a device or program supports, the more shades of gray it can represent. Dithering is the process of converting grays to different densities of black dots, usually for the purposes of printing or storing color or grayscale images as black and white images.
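One simple way to convert grays to differing densities of black dots is ordered dithering, sketched below in Python. The entry above does not name a specific method, so the technique, the 2x2 threshold matrix and the function name ordered_dither are assumptions made for this illustration. Pixel values run from 0 (black) to 255 (white); in the output, 0 is a black dot and 1 is white.

```python
# 2x2 Bayer-style threshold pattern, tiled across the image.
BAYER_2X2 = [[0, 2],
             [3, 1]]


def ordered_dither(gray):
    """Convert a grayscale image (rows of 0-255 values) to black and white.

    Each pixel is compared with a threshold taken from the tiled pattern,
    so a uniform gray area becomes a regular mix of black and white dots
    whose proportion reflects the original shade.
    """
    out = []
    for y, row in enumerate(gray):
        out_row = []
        for x, value in enumerate(row):
            # Scale pattern entries 0..3 to thresholds spread over 0..255.
            threshold = (BAYER_2X2[y % 2][x % 2] + 0.5) / 4 * 255
            out_row.append(1 if value > threshold else 0)
        out.append(out_row)
    return out


# A flat mid-gray 4x4 patch: half of the output pixels become black dots.
mid_gray = [[128] * 4 for _ in range(4)]
dithered = ordered_dither(mid_gray)
```

Darker input shades produce a higher proportion of black dots and lighter shades a lower one; larger threshold patterns allow more distinct shades to be represented.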
DLT (Digital Linear Tape) A type of backup tape that can hold up to 80 GB depending on the data file format.
Document (or Document Family) A collection of pages or files produced manually or by a software application, constituting a logical single communication of information, but consisting of more than a single standalone record. Examples include a fax cover, the faxed letter, and an attachment to the letter, with the fax cover being the "Parent" and the letter and attachment each being a "Child." See also Attachment, Load File, Message Unit, and Unitization - Physical and Logical.
Document Date The original creation date of a document. For an email, the document date is indicated by the date stamp of the email.
Document Imaging Programs Software used to store, manage, retrieve and distribute documents quickly and easily on the computer.
Document Metadata Properties about the file stored in the file, as opposed to document content. Often this data is not immediately viewable in the software application used to create/edit the document but often can be accessed via a "Properties" view. Examples include document author and company, and create and revision dates. Contrast with File System Metadata and Email Metadata. See also Metadata.
Document Type or Doc Type A typical field used in bibliographical coding. Typical doc type examples include correspondence, memo, report, article and others.
DoD 5015 Department of Defense standard addressing records management.
Domain A subnetwork of servers and computers within a LAN. Domain information is useful when restoring backup tapes, particularly of email.
Domino Database Another name for Lotus Notes Databases versions 5.0 or higher. See NSF.
DOS See MS-DOS.
Dot Pitch Distance of one pixel in a CRT to the next pixel on the vertical plane. The smaller the number, the higher quality display.
Double Byte Language See Unicode.
Download To copy data from another computer to one's own, usually over a network or the Internet.
DPI (Dots Per Inch) The measurement of the resolution of display in printing systems. A typical CRT screen provides 96 dpi, which provides 9,216 dots per square inch (96x96). When a paper document is scanned, the resolution, or level of detail, at which the scanning was performed is expressed in DPI. Typically, documents are scanned at 200 or 300 DPI.
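The arithmetic behind these figures can be checked with a short Python sketch. The function names are hypothetical, and the 8.5 x 11 inch page size is an assumption used only to show how scanning resolution translates into image dimensions.

```python
def dots_per_square_inch(dpi: int) -> int:
    """Dots in one square inch at a given linear resolution."""
    return dpi * dpi


def scan_dimensions(width_in: float, height_in: float, dpi: int) -> tuple:
    """Pixel width and height of a page scanned at the given DPI."""
    return round(width_in * dpi), round(height_in * dpi)


screen_density = dots_per_square_inch(96)     # the 96 dpi CRT example

# An assumed 8.5 x 11 inch page at the two typical scanning resolutions.
page_at_200 = scan_dimensions(8.5, 11, 200)
page_at_300 = scan_dimensions(8.5, 11, 300)
```

Here screen_density is 9,216, matching the figure above, and the assumed page comes out at 1700 x 2200 pixels at 200 DPI and 2550 x 3300 pixels at 300 DPI.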
Draft Record A draft record is a preliminary version of a record before it has been completed, finalized, accepted, validated or filed. Such records include working files and notes. Records and information management policies may provide for the destruction of draft records upon finalization, acceptance, validation or filing of the final or official version of the record. However, draft records generally must be retained if (1) they are deemed to be subject to a legal hold; or (2) a specific law or regulation mandates their retention. Policies should recognize such exceptions.
Drag-and-Drop The movement of on-screen objects by dragging them with the mouse, and dropping them in another place.
DRAM Dynamic Random Access Memory, a memory technology that is periodically "refreshed" or updated, as opposed to "static" RAM chips that do not require refreshing. The term is often used to refer to the memory chips themselves.
Drive Geometry A computer hard drive is made up of a number of rapidly rotating platters that have a set of read/write heads on both sides of each platter. Each platter is divided into a series of concentric rings called tracks. Each track is further divided into sections called sectors, and each sector is subdivided into bytes. Drive geometry refers to the number and positions of each of these structures.
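Under the simplified geometry described in this entry, a drive's capacity is the product of the counts of these structures. The Python sketch below uses made-up figures and a hypothetical function name, and it assumes every track holds the same number of sectors, which is a simplification.

```python
def drive_capacity_bytes(platters: int, tracks_per_surface: int,
                         sectors_per_track: int, bytes_per_sector: int) -> int:
    """Capacity implied by a simple drive geometry.

    Assumes one read/write head per platter side (two surfaces per
    platter) and the same number of sectors on every track.
    """
    surfaces = platters * 2
    return surfaces * tracks_per_surface * sectors_per_track * bytes_per_sector


# Hypothetical geometry: 4 platters, 1,000 tracks per surface,
# 63 sectors per track, 512 bytes per sector.
capacity = drive_capacity_bytes(4, 1000, 63, 512)
```

With these example figures the capacity works out to 258,048,000 bytes (8 surfaces x 1,000 tracks x 63 sectors x 512 bytes).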
Driver A driver is a computer program that controls various devices such as the keyboard, mouse, monitor, etc.
Drop-Down Menu A menu window that opens on-screen to display context-related options. Also called pop-up menu or pull-down menu.
DSP (Digital Signal Processor/Processing) A special purpose computer (or technique) which digitally processes signals and electrical/analog waveforms.
DTP (Desktop Publishing) PC applications used to prepare direct print output or output suitable for printing presses.
Duplex Scanners vs. Double-Sided Scanning Duplex scanners automatically scan both sides of a double-sided page, producing two images at once. Double-sided scanning uses a single-sided scanner to scan double-sided pages, scanning one collated stack of paper, then flipping it over and scanning the other side.
Duplex Two-sided page(s).
DVD (Digital Video Disc or Digital Versatile Disc) A plastic disc, like a CD, on which data can be written and read. DVDs are faster, can hold more information, and can support more data formats than CDs.