E-Discovery Glossary
SaaS (Software as a Service) Software application delivery model where a software vendor develops a web-native software application and hosts and operates (either independently or through a third party) the application for use by its customers over the Internet. Customers pay not for owning the software itself but for using it. See Application Service Provider.
Sampling Sampling usually (but not always) refers to the process of testing a database or a large volume of ESI for the existence or frequency of relevant information. It can be a useful technique in addressing a number of issues relating to litigation, including decisions about what repositories of data are appropriate to search in a particular litigation, and determinations of the validity and effectiveness of searches or other data extraction procedures.
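As an illustration of sampling for frequency, the sketch below draws a simple random sample from a collection and estimates the proportion of relevant documents, with a normal-approximation 95% confidence interval. The function name `estimate_prevalence`, the `is_relevant` reviewer callback, and the toy collection are hypothetical; an actual protocol may call for a different sampling design or interval method.

```python
import math
import random


def estimate_prevalence(documents, is_relevant, sample_size, seed=0):
    """Estimate the share of relevant documents from a simple random sample.

    Returns (point_estimate, lower_bound, upper_bound), using a
    normal-approximation 95% confidence interval.
    """
    rng = random.Random(seed)  # fixed seed so the draw is repeatable
    sample = rng.sample(documents, sample_size)
    hits = sum(1 for doc in sample if is_relevant(doc))
    p = hits / sample_size
    margin = 1.96 * math.sqrt(p * (1 - p) / sample_size)
    return p, max(0.0, p - margin), min(1.0, p + margin)


# Toy collection: every tenth document contains the hypothetical term "merger".
collection = [
    f"doc {i} merger" if i % 10 == 0 else f"doc {i}" for i in range(10_000)
]
estimate, low, high = estimate_prevalence(
    collection, lambda doc: "merger" in doc, sample_size=400
)
print(f"estimated prevalence: {estimate:.3f} (95% CI {low:.3f} to {high:.3f})")
```

A larger sample narrows the interval, which is one way to weigh review cost against the precision needed for a decision about a given data repository.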
Sampling Rate The frequency at which analog signals are converted to digital values during digitization. The higher the rate, the more accurate the process.
SAN (Storage Area Network) A high-speed subnetwork of shared storage devices. A storage device is a machine that contains nothing but a disc or discs for storing data. A SAN's architecture works in a way that makes all storage devices available to all servers on a LAN or WAN. As more storage devices are added to a SAN, they too will be accessible from any server in the larger network. In this case, the server merely acts as a pathway between the end user and the stored data. Because stored data does not reside directly on any of a network's servers, server power is utilized for business applications, and network capacity is released to the end user. See also Network.
SAS 70 Statement on Auditing Standards (SAS) No. 70, Service Organizations: an auditing standard developed by the American Institute of Certified Public Accountants (AICPA), which includes an examination of an entity's "controls" over information technology and related processes.
SAS 70 Assessment Application of the standards of SAS 70 to demonstrate that adequate controls and safeguards are in place for hosted or processed data.
Scalability The capacity of a system to expand without requiring major reconfiguration or reentry of data. For example, multiple servers or additional storage can be easily added.
Scale-to-Gray An option to display a black and white image file in an enhanced mode, making it easier to view. A scale-to-gray display uses gray shading to fill in gaps or jumps (known as aliasing) that occur when displaying an image file on a computer screen. Also known as grayscale.
Scanner An input device commonly used to convert paper documents into images. Scanner devices are also available to scan microfilm and microfiche. See Flatbed Scanner.
Scanning Software Software that enables a scanner to deliver industry standard formats for images in a collection. Enables the use of OCR and coding of the images.
Schema A set of rules or conceptual model for data structure and content, such as a description of the data content and relationships in a database.
Scroll Bar The bar on the side or bottom of a window that allows the user to scroll up and down through the window's contents. Scroll bars have scroll arrows at both ends, and a scroll box, all of which can be used to scroll around the window.
SCSI (Small Computer System Interface) Pronounced "skuzzy." A common, industry standard, electronic interface (highway) between computers and peripherals, such as hard discs, CD-ROM drives and scanners. SCSI allows for up to 7 devices to be attached in a chain via cables. As of this writing, the current SCSI standard is "SCSI II," also known as "Fast SCSI."
SDLT (Super DLT) A type of backup tape that can hold up to 300 GB or 450 CDs, depending on the data file format. See DLT.
Search See Compliance Search, Concept Search, Contextual Search, Boolean Search, Full-Text Search, Fuzzy Search, Index, Keyword Search, Pattern Recognition, Proximity Search, QBIC, Sampling, and Search Engine.
Search Engine A program that enables search for keywords or phrases, such as on web pages throughout the World Wide Web, e.g. Google, Lycos, etc.
Sector A sector is normally the smallest individually addressable unit of information stored on a hard drive platter, and usually holds 512 bytes of information. Sectors are numbered sequentially starting with 1 on each individual track. Thus, Track 0, Sector 1 and Track 5, Sector 1 refer to different sectors on the same hard drive. The first PC hard discs typically held 17 sectors per track. Today, they can hold thousands of sectors per track.
Serial Line Internet Protocol (SLIP) A connection to the Internet in which the interface software runs in the local computer, rather than the Internet's.
Serial Port See Port.
Serif The little cross bars or curls at the end of strokes on certain type fonts.
Server Any central computer on a network that contains ESI or applications shared by multiple users of the network on their client PCs. A computer that provides information to client machines. For example, there are web servers that send out web pages, mail servers that deliver email, list servers that administer mailing lists, FTP servers that hold FTP sites and deliver ESI to requesting users, and name servers that provide information about Internet host names. See File Server.
Service-Level Agreement A service-level agreement is a contract that defines the technical support or business parameters that a service provider or outsourcing firm will provide its clients. The agreement typically spells out measures for performance and consequences for failure.
Session A lasting connection, usually involving the exchange of many packets between a user or host and a server, typically implemented as a layer in a network protocol, such as telnet or FTP.
SGML/HyTime A multimedia extension to SGML, sponsored by DoD.
SHA-1 Secure Hash Algorithm, for computing a condensed representation of a message or a data file, specified by FIPS PUB 180-1. See Hash.
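A minimal sketch of computing such a condensed representation with Python's standard `hashlib` module. The helper names `sha1_of_bytes` and `sha1_of_file` are illustrative and not part of any e-discovery tool.

```python
import hashlib


def sha1_of_bytes(data: bytes) -> str:
    """Return the 40-character hexadecimal SHA-1 digest of a byte string."""
    return hashlib.sha1(data).hexdigest()


def sha1_of_file(path, chunk_size=1024 * 1024):
    """Hash a file in chunks so large files need not fit in memory."""
    digest = hashlib.sha1()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


# "abc" is a standard SHA-1 test vector.
print(sha1_of_bytes(b"abc"))  # a9993e364706816aba3e25717850c26c9cd0d89d
```

Two files that produce the same digest can, for practical purposes, be treated as having identical content, which is why hash values are used to identify duplicate files and to verify that a copy is unaltered.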
Signature See Certificate.
SIMM (Single In-Line Memory Module) A mechanical package (with "legs") used to attach memory chips to printed circuit boards.
Simplex One-sided page(s).
Single Instance Storage When several files in a computer filesystem contain exactly the same data, single instance storage can replace the references to these identical files with references to a single stored copy of the file. This can potentially save large amounts of disk space in systems with many copies of the same file. Microsoft Exchange can use single instance storage to eliminate redundant copies of a message. The reduction occurs at the Microsoft Exchange Store level, so when mailboxes that receive a given message exist across Exchange Stores, each store will have one copy of the message.
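The idea behind single instance storage can be sketched as a store that keys content by its hash and keeps only references per file name. The `SingleInstanceStore` class and the mailbox paths below are hypothetical and do not describe how Microsoft Exchange implements the feature.

```python
import hashlib


class SingleInstanceStore:
    """Toy store that keeps one copy of each distinct content blob."""

    def __init__(self):
        self._blobs = {}  # content hash -> bytes (the single stored copy)
        self._names = {}  # file name -> content hash (a reference only)

    def put(self, name, data):
        key = hashlib.sha256(data).hexdigest()
        self._blobs.setdefault(key, data)  # store only if not already present
        self._names[name] = key

    def get(self, name):
        return self._blobs[self._names[name]]

    @property
    def stored_copies(self):
        return len(self._blobs)


store = SingleInstanceStore()
memo = b"Quarterly results attached."
for mailbox in ("alice", "bob", "carol"):
    store.put(f"{mailbox}/inbox/memo.msg", memo)
store.put("alice/inbox/other.msg", b"Lunch?")

print(store.stored_copies)  # 2: four file names, two distinct contents
```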
Skewed Tilted images. See Deskewing.
Slack/Slack Space The unused space on a cluster that exists when the logical file space is less than the physical file space. Also known as file slack. A form of residual data, the amount of on-disc file space from the end of the logical record information to the end of the physical disc record. Slack space can contain information soft-deleted from the record, information from prior records stored at the same physical location as current records, metadata fragments, and other information useful for forensic analysis of computer systems. See Cluster.
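The relationship between logical and physical file space can be worked through with a small calculation. The function name `file_slack_bytes` and the 4,096-byte default cluster size are assumptions for illustration; actual cluster sizes depend on the file system and how the volume was formatted.

```python
def file_slack_bytes(logical_size, cluster_size=4096):
    """Return the unused bytes between the end of a file and the end of its last cluster."""
    if logical_size < 0 or cluster_size <= 0:
        raise ValueError("sizes must be non-negative, cluster size positive")
    clusters = -(-logical_size // cluster_size)  # ceiling division
    physical_size = clusters * cluster_size
    return physical_size - logical_size


# A 10,000-byte file occupies three 4,096-byte clusters (12,288 bytes).
print(file_slack_bytes(10_000))  # 2288
```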
Smart Card A credit card size device that contains a microprocessor, memory and a battery.
SMTP (Simple Mail Transfer Protocol) The protocol widely implemented on the Internet for exchanging email messages.
Snapshot See Bit Stream Backup.
Software application See Application and Software.
Software Any set of coded instructions (programs) stored on computer-readable media that tells a computer what to do. Includes operating systems and software applications.
Speckle Imperfections in an image as a result of scanning paper documents that do not appear on the original. See Despeckling.
Splatter ESI that should be kept on one disc of a jukebox goes instead to multiple platters.
Spoliation Spoliation is the destruction of records or properties, such as metadata, that may be relevant to ongoing or anticipated litigation, government investigation or audit. Courts differ in their interpretation of the level of intent required before sanctions may be warranted.
SPP (Standard Parallel Port) See Port.
Spyware A data collection program that secretly gathers information about the user and relays it to advertisers or other interested parties. Adware usually displays banners or unwanted pop-up windows, but often includes spyware as well. See Malware.
SQL (Structured Query Language) A standard fourth generation programming language (4GL: a programming language that is closer to natural language and easier to work with than a high-level language). The popular standard for running database searches (queries) and reports.
Stand-Alone Computer A personal computer that is not connected to any other computer or network, except possibly through a modem.
Standard Generalized Markup Language (SGML) An informal industry standard for open systems document management that specifies the data encoding of a document's format and content. Has been virtually replaced by XML.
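A brief sketch of an SQL query, run here through Python's built-in `sqlite3` module against an in-memory database. The `documents` table, its columns, and the sample rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE documents ("
    "id INTEGER PRIMARY KEY, custodian TEXT, sent_date TEXT, subject TEXT)"
)
conn.executemany(
    "INSERT INTO documents (custodian, sent_date, subject) VALUES (?, ?, ?)",
    [
        ("Smith", "2005-11-02", "Budget"),
        ("Smith", "2006-03-15", "Merger draft"),
        ("Jones", "2006-04-01", "Merger draft v2"),
        ("Jones", "2006-05-20", "Lunch"),
    ],
)

# Query: count documents per custodian sent on or after a cutoff date.
rows = conn.execute(
    "SELECT custodian, COUNT(*) FROM documents "
    "WHERE sent_date >= ? GROUP BY custodian ORDER BY custodian",
    ("2006-01-01",),
).fetchall()
conn.close()

print(rows)  # [('Jones', 2), ('Smith', 1)]
```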
Status Bar A bar at the bottom of a window that is used to indicate the status of a task. For example, when an email message is sent, the status bar will fill with dots indicating that a message is being sent.
Steganography The hiding of information within a more obvious kind of communication. Although not widely used, digital steganography involves the hiding of data inside a sound or image file. Steganalysis is the process of detecting steganography by looking at variances between bit patterns and unusually large file sizes.
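One simple form of digital steganography, least-significant-bit (LSB) embedding, can be sketched as follows. This is an illustrative technique chosen for the example, not the only method; the `hide` and `reveal` helpers are hypothetical, and a plain byte string stands in for the pixel or audio sample data of a real carrier file.

```python
def hide(carrier: bytes, message: bytes) -> bytes:
    """Write each bit of the message into the lowest bit of successive carrier bytes."""
    bits = [(byte >> shift) & 1 for byte in message for shift in range(7, -1, -1)]
    if len(bits) > len(carrier):
        raise ValueError("carrier too small for message")
    out = bytearray(carrier)
    for i, bit in enumerate(bits):
        out[i] = (out[i] & 0xFE) | bit  # clear the lowest bit, then set it
    return bytes(out)


def reveal(carrier: bytes, length: int) -> bytes:
    """Read back `length` bytes from the lowest bits of the carrier."""
    result = bytearray()
    for i in range(length):
        value = 0
        for b in carrier[i * 8:(i + 1) * 8]:
            value = (value << 1) | (b & 1)
        result.append(value)
    return bytes(result)


cover = bytes(range(256))  # stand-in for image or sound sample values
stego = hide(cover, b"hi")
print(reveal(stego, 2))  # b'hi'
```

Each carrier byte changes by at most one, which is why the alteration is hard to notice by eye or ear and why steganalysis relies on statistical variances in bit patterns.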
Storage Device A device capable of storing ESI. The term usually refers to mass storage devices, such as disc and tape drives.
Storage Media See Magnetic or Optical Storage Media.
Streaming Indexing Real-time or near real-time indexing of data as it is being moved from one storage medium to another.
Structured Data Data stored in a structured format, such as databases or data sets. Contrast to Unstructured Data.
Subjective Coding The coding of a document using legal interpretation as the data that fills a field, versus objective data that is readily apparent from the face of the document, such as date, type, author, addresses, recipients and names mentioned. Usually performed by paralegals or other trained legal personnel.
Subtractive Colors Since the colors of objects are white light minus the color absorbed by the object, they are called subtractive. This is how ink on paper works. The subtractive colors of process ink are CMYK (Cyan, Magenta, Yellow and Black) and are specifically balanced to match additive colors (RGB).
Suspension Notice, Suspension Order See Legal Hold.
SVGA (Super Video Graphics Adapter) A graphics adapter that exceeds the minimum VGA standard of 640 by 480 by 16 colors. Can reach 1600 by 1280 by 256 colors.
Swap File A file used to temporarily store code and data for programs that are currently running. This information is left in the swap file after the programs are terminated, and may be retrieved using forensic techniques. Also referred to as a page file or paging file.
System A system is: (1) a collection of people, machines, and methods organized to perform specific functions; (2) an integrated whole composed of diverse, interacting, specialized structures and subfunctions; and/or (3) a group of subsystems united by some interaction or interdependence, performing many duties, but functioning as a single unit.
System Administrator ("sysadmin," or "sysop") The person in charge of keeping a network working.
System Files Files allowing computer systems to run; non-user-created files.
System Metadata See File System Metadata.
T1 A high speed, high bandwidth leased line connection to the Internet. T1 connections deliver information at 1.544 megabits per second.
T3 A high speed, high bandwidth leased line connection to the Internet. T3 connections deliver information at 44.736 megabits per second.
Tape Drive A hardware device used to store or backup ESI on a magnetic tape. Tape drives are usually used to back up large quantities of ESI due to their large capacity and cheap cost relative to other storage options.
Taxonomy The science of categorization, or classification, of things based on a predetermined system. In reference to Web sites and portals, a site's taxonomy is the way it organizes its ESI into categories and subcategories, sometimes displayed in a site map. Used in information retrieval to find documents that are related to a query by identifying other documents in the same category.
TCP/IP (Transmission Control Protocol/Internet Protocol) The first two networking protocols defined; enable the transfer of data upon which the basic workings of the features of the Internet operate. See Port.
Telnet (Telecommunications Network) A protocol for logging onto remote computers from anywhere on the Internet.
Telephony Converting sounds into electronic signals for transmission.
Templates, Document Sets of index fields for documents, providing framework for preparation.
Temporary ("Temp") File Files stored on a computer for temporary use only, often created by Internet browsers. These temp files store information about Web sites that a user has visited, and allow for more rapid display of the Web page when the user revisits the site. Forensic techniques can be used to track the history of a computer's Internet usage through the examination of these files. Temp files are also created by common office applications, such as word processing or spreadsheet applications.
Terabyte 1,099,511,627,776 bytes, or 1024^4 bytes (roughly a trillion bytes). See Byte.
Text Mining The application of data mining (knowledge discovery in databases) to unstructured textual data. Text mining usually involves structuring the input text (often parsing, along with application of some derived linguistic features and removal of others, and ultimate insertion into a database), deriving patterns within the data, and evaluating and interpreting the output, providing such ranking results as relevance, novelty, and interestingness. Also referred to as "Text Data Mining." See Data Mining.
TGA Targa format. This is a "scanned format" widely used for color-scanned materials (24-bit) as well as by various "paint" and desktop publishing packages.
Thin Client A networked user computer that acts only as a terminal and stores no applications or user files. May have little or no hard drive space. See Client.
Thread A series of communications, usually on a particular topic. Threads can be a series of bulletin board messages (for example, when someone posts a question and others reply with answers or additional queries on the same topic). A thread can also apply to emails or chats, where multiple conversation threads may exist simultaneously. See Email String.
Thumb Drive See Key Drive.
Thumbnail A miniature representation of a page or item for quick overviews to provide a general idea of the structure, content and appearance of a document. A thumbnail program may be a standalone or part of a desktop publishing or graphics program. Thumbnails provide a convenient way to browse through multiple images before retrieving the one needed. Programs often allow clicking on the thumbnail to retrieve it.
TIFF (Tagged Image File Format) One of the most widely used and supported graphic file formats for storing bit-mapped images, with many different compression formats and resolutions. File name has .TIF extension. Can be black and white, grayscale, or color. Images are stored in tagged fields, and programs use the tags to accept or ignore fields, depending on the application. The format originated in the mid-1980s.
TIFF Group III (compression) A one-dimensional compression format for storing black and white images that is utilized by many fax machines. See TIFF.
TIFF Group IV (compression) A two-dimensional compression format for storing black and white images. Typically compresses at a 20-to-1 ratio for standard business documents. See TIFF.
Time Zone Normalization See Normalization.
Toggle A switch that is either on or off, and reverses to the opposite when selected.
Tone Arm A device in a computer that reads to/from a hard drive.
Tool Kit Without An Interesting Name (TWAIN) A universal toolkit with standard hardware/software drivers for multimedia peripheral devices.
Toolbar The row of graphical or text buttons that perform special functions quickly and easily.
Topology The geometric arrangement of a computer system. Common topologies include a bus (network topology in which nodes are connected to a single cable with terminators at each end), star (local area network designed in the shape of a star, where all end points are connected to one central switching device, or hub), and ring (network topology in which nodes are connected in a closed loop; no terminators are required because there are no unconnected ends). Star networks are easier to manage than ring networks.
Track Each of the series of concentric rings contained on a hard drive platter.
TREC (Text Retrieval Conference) An ongoing series of workshops cosponsored by NIST and the U.S. Department of Defense.
Trojan A program that does something undocumented which the programmer intended, but that the user would not approve of if known to the user. Sometimes referred to as a "Trojan horse." See Malware.
True Resolution The "true" optical resolution of a scanner is the number of pixels per inch (without any software enhancements).
TWiki A "WikiWiki" that enables simple form-based web applications without programming, and granular access control (though it can also operate in the classic "no authentication" mode). Other enhancements include configuration variables, embedded searches, server-side includes, file attachments, and a plugin API that has spawned over 150 plugins to link into databases, create charts, sort tables, write spreadsheets, make drawings, track Extreme Programming projects, and so on.
Typeface There are over 10,000 typefaces available for computers. The general categories are: oldstyle (faces have slanted serifs, gradual thick-to-thin strokes and a slanted stress; the "O" appears slanted); modern (faces have thin, horizontal serifs, radical thick-to-thin strokes and a vertical stress; the "O" does not appear to slant); slab serif (faces have thick, horizontal serifs, little or no thick-to-thin transition in the strokes and a vertical stress; the "O" appears vertical); sans serif (faces have no serifs); script (from elaborate handwriting styles to casual, freeform, unconnected letter forms); and decorative (unusual fonts designed to be very different and attention getting).