data compression


[′dad·ə kəm‚presh·ən]
(computer science)
The technique of reducing the number of binary digits required to represent data.
McGraw-Hill Dictionary of Scientific & Technical Terms, 6E, Copyright © 2003 by The McGraw-Hill Companies, Inc.

Data compression

The process of transforming information from one representation to another, smaller representation from which the original, or a close approximation to it, can be recovered. The compression and decompression processes are often referred to as encoding and decoding. Data compression has important applications in the areas of data storage and data transmission. Besides compression savings, other parameters of concern include encoding and decoding speeds and workspace requirements, the ability to access and decode partial files, and error generation and propagation.

The data compression process is said to be lossless if the recovered data are assured to be identical to the source; otherwise the compression process is said to be lossy. Lossless compression techniques are requisite for applications involving textual data. Other applications, such as those involving voice and image data, may be sufficiently flexible to allow controlled degradation in the data.

Data compression techniques are characterized by the use of an appropriate data model, which selects the elements of the source on which to focus; data coding, which maps source elements to output elements; and data structures, which enable efficient implementation.

Information theory dictates that, for efficiency, fewer bits be used for common events than for rare events. Compression techniques are based on using an appropriate model for the source data in which defined elements are not all equally likely. The encoder and the decoder must agree on an identical model. See Information theory
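The principle that common events should receive fewer bits can be illustrated with a short Python sketch. It assumes the simplest possible model, in which each symbol's probability is its relative frequency in the source; the function names and the sample string are hypothetical and chosen only for illustration.

```python
import math
from collections import Counter

def ideal_code_lengths(text):
    """Map each symbol to its ideal code length, -log2(p), in bits."""
    counts = Counter(text)
    total = sum(counts.values())
    return {sym: -math.log2(n / total) for sym, n in counts.items()}

def entropy_bits_per_symbol(text):
    """Average ideal code length over the source (its empirical entropy)."""
    counts = Counter(text)
    total = sum(counts.values())
    return -sum((n / total) * math.log2(n / total) for n in counts.values())

sample = "aaaabbcd"  # hypothetical source: 'a' is common, 'c' and 'd' are rare
for symbol, bits in sorted(ideal_code_lengths(sample).items()):
    print(symbol, round(bits, 3))
print("average bits per symbol:", entropy_bits_per_symbol(sample))
```

For this sample the common symbol "a" is assigned 1 bit and the rare symbols "c" and "d" 3 bits each, for an average of 1.75 bits per symbol rather than the 2 bits a fixed-length code for four symbols would need.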

A static model is one in which the choice of elements and their assumed distribution is invariant. For example, the letter “e” might always be assumed to be the most likely character to occur. A static model can be predetermined, in which case the compression achieved on any particular source is unpredictable, or it can be built by the encoder by previewing the entire source data and determining element frequencies. The benefits of using a static model include the ability to decode without necessarily starting at the beginning of the compressed data.

An alternative dynamic or adaptive model assumes an initial choice of elements and distribution and, based on the beginning part of the source stream that has been processed prior to the datum presently under consideration, progressively modifies the model so that the encoding is optimal for data distributed similarly to recent observations. Some techniques may weight recently encountered data more heavily. Dynamic algorithms have the benefit of being able to adapt to changes in the ensemble characteristics. Most important, however, is the fact that the source is considered serially and output is produced directly without the necessity of previewing the entire source.
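A minimal Python sketch of an adaptive model follows. It assumes a simple frequency model in which every symbol of a known alphabet starts with a count of one and the count is incremented after each symbol is processed; the class and function names are hypothetical. The sketch only measures the ideal cost in bits that a coder driven by this model would approach; the coder itself (for example, an arithmetic coder) is omitted.

```python
import math

class AdaptiveModel:
    """Frequency model updated after every symbol (illustrative only)."""

    def __init__(self, alphabet):
        # Initial assumption: all symbols equally likely.
        self.counts = {symbol: 1 for symbol in alphabet}
        self.total = len(self.counts)

    def probability(self, symbol):
        return self.counts[symbol] / self.total

    def update(self, symbol):
        self.counts[symbol] += 1
        self.total += 1

def adaptive_cost_bits(data, alphabet):
    """Sum of -log2(p) where p comes from the model state before each symbol."""
    model = AdaptiveModel(alphabet)
    bits = 0.0
    for symbol in data:
        bits += -math.log2(model.probability(symbol))
        model.update(symbol)  # a decoder would apply the same update
    return bits

print(adaptive_cost_bits("aaaaaaaa", "ab"))  # well under 8 bits
print(adaptive_cost_bits("abababab", "ab"))  # close to 8 bits
```

Because the decoder can apply the same update after decoding each symbol, both sides hold identical models at every step without the distribution ever being transmitted, and the source is processed serially in a single pass.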

In a simple statistical model, frequencies of values (characters, strings, or pixels) determine the mapping. In the more general context model, the mapping is determined by the occurrence of elements, each consisting of a value which has other particular adjacent values. For example, in English text, although generally “u” is only moderately likely to appear as the “next” character, if the immediately preceding character is a “q” then “u” would be overwhelmingly likely to appear next.
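The "q" followed by "u" example can be reproduced with a small order-1 context model, sketched here in Python. The function names and the sample sentence are assumptions made for illustration; a practical context model would also need a rule for contexts that have never been seen.

```python
from collections import Counter, defaultdict

def order1_model(text):
    """Count, for each character, which characters follow it."""
    contexts = defaultdict(Counter)
    for previous, following in zip(text, text[1:]):
        contexts[previous][following] += 1
    return contexts

def conditional_probability(model, previous, following):
    """Estimate P(following | previous) from the counts; 0.0 if unseen."""
    seen = model[previous]
    total = sum(seen.values())
    return seen[following] / total if total else 0.0

text = "the quick queen quietly quit the quiz"  # hypothetical sample
model = order1_model(text)

print("P(u) overall    :", text.count("u") / len(text))
print("P(u | after 'q'):", conditional_probability(model, "q", "u"))
```

In this sample "u" accounts for only a small fraction of all characters, yet it follows every "q", so a coder using the context model could spend almost no bits on it in that position.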

The use of a model determines the intended sequence of values. An additional mapping via one coding technique or a combination of coding techniques is used to determine the actual output. Several data coding techniques are in common use.
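As one illustration of a coding technique driven by a static frequency model, the following Python sketch builds a Huffman code. It is a minimal version written for this entry: the function names are hypothetical, the bit stream is represented as a string of "0" and "1" characters for readability, and the code table is assumed to be shared with the decoder rather than transmitted.

```python
import heapq
from collections import Counter

def huffman_code(text):
    """Build a prefix code {symbol: bit string} from symbol frequencies."""
    counts = Counter(text)
    if not counts:
        return {}
    if len(counts) == 1:
        return {next(iter(counts)): "0"}
    # Heap entries: (weight, unique tie-breaker, {symbol: code so far}).
    heap = [(n, i, {sym: ""}) for i, (sym, n) in enumerate(counts.items())]
    heapq.heapify(heap)
    tie = len(heap)
    while len(heap) > 1:
        w1, _, left = heapq.heappop(heap)   # two least frequent subtrees
        w2, _, right = heapq.heappop(heap)
        merged = {s: "0" + c for s, c in left.items()}
        merged.update({s: "1" + c for s, c in right.items()})
        heapq.heappush(heap, (w1 + w2, tie, merged))
        tie += 1
    return heap[0][2]

def encode(text, code):
    return "".join(code[ch] for ch in text)

def decode(bits, code):
    inverse = {v: k for k, v in code.items()}
    out, buffer = [], ""
    for bit in bits:
        buffer += bit
        if buffer in inverse:  # prefix property: first match is the symbol
            out.append(inverse[buffer])
            buffer = ""
    return "".join(out)

source = "abracadabra"  # hypothetical source
code = huffman_code(source)
bits = encode(source, code)
print(code, len(bits), "bits")
```

For "abracadabra" the most frequent symbol "a" receives a 1-bit code and the whole string encodes in 23 bits, compared with 33 bits for a fixed 3-bit code over its five distinct symbols.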

Digitized audio and video signals

The information content of speech, music, and television signals can be preserved by periodically sampling at a rate of at least twice the highest frequency to be preserved. This is referred to as Nyquist sampling. However, speech, music, and television signals are highly redundant, and use of simple Nyquist sampling to code them is inefficient. Reduction of redundancy and application of more efficient sampling results in compression of the information rate needed to represent the signal without serious impairment to the quality of the reconstructed source signal at a receiver. For speech signals, redundancy evident in pitch periodicity and in the formant (energy-peak) structure of the signal's spectrum, along with aural masking of quantizing noise, is used to compress the information rate. In music, which has much wider bandwidth than speech and far less redundancy, time-domain masking and frequency-domain masking are principally used to achieve compression. For television, redundancy evident in the horizontal and vertical correlation of the pixels of individual frames and in the frame-to-frame correlation of a moving picture, combined with visual masking that obscures quantizing noise resulting from the coding at low numbers of bits per sample, is used to achieve compression. See Television

Compression techniques may be classified into two types: waveform coders and parametric coders. Waveform coders replicate a facsimile of a source-signal waveform at the receiver with a level of distortion that is judged acceptable. Parametric coders use a synthesizer at the receiver that is controlled by signal parameters extracted at the transmitter to remake the signal. The latter may achieve greater compression because of the information content added by the synthesizer model at the receiver.

Waveform compression methods include adaptive differential pulse-code modulation (ADPCM) for speech and music signals, audio masking for music, and differential encoding and sub-Nyquist sampling of television signals. Parametric encoders include vocoders for speech signals and encoders using orthogonal transform techniques for television.
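The idea behind differential encoding can be sketched in a few lines of Python. This is plain, lossless differential coding only, not ADPCM: it omits the quantizer and the step-size adaptation that ADPCM adds, and the function names and sample values are hypothetical.

```python
def delta_encode(samples):
    """Replace each sample with its difference from the previous sample."""
    previous = 0
    deltas = []
    for sample in samples:
        deltas.append(sample - previous)
        previous = sample
    return deltas

def delta_decode(deltas):
    """Rebuild the samples by accumulating the differences."""
    value = 0
    samples = []
    for delta in deltas:
        value += delta
        samples.append(value)
    return samples

samples = [100, 102, 105, 107, 106, 104]  # hypothetical slowly varying signal
deltas = delta_encode(samples)
print(deltas)                 # [100, 2, 3, 2, -1, -2]
print(delta_decode(deltas))   # the original samples
```

Because neighboring samples of a redundant signal are strongly correlated, the differences after the first value are small and can be represented with fewer bits than the samples themselves.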

McGraw-Hill Concise Encyclopedia of Engineering. © 2002 by The McGraw-Hill Companies, Inc.

data compression

(algorithm)
See compression. The longer term is probably used to distinguish it from (electronic) signal compression.
This article is provided by FOLDOC - Free Online Dictionary of Computing (foldoc.org)

data compression

There are two categories of data compression. The first reduces the size of a single file to save storage space and transmit faster. The second is for storage and transmission convenience.

#1 - Compressing a Single File
The JPEG image, MPEG video, MP3 audio and G.7xx voice formats are widely used "lossy" methods that analyze which pixels, video frames or sound waves can be removed forever without the average person noticing (see lossy compression). GIF images have no loss of pixels but may have a loss of colors (see GIF).

JPEG files can be reduced as much as 80%; MPEG enables a two-hour HD movie to fit on a single disc, and MP3 sparked a revolution by reducing CD music 90%. For a list of compression methods, see codec examples. See JPEG, GIF, MPEG, MP3, G.7xx and interframe coding.

#2 - Compressing a Group of Files (Archiving)
The second "lossless" category compresses and restores data without the loss of a single bit. Although this is widely used for documents, this method is not aware of the content's purpose. It merely looks for repeatable patterns of 0s and 1s, and the more patterns, the higher the compression ratio. Text documents compress the most, while binary and already-compressed files (JPEG, MPEG, etc.) compress the least.
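The difference between repetitive text and already-compressed data can be observed with Python's standard zlib module. In this sketch the sample text is hypothetical, and random bytes are used as a stand-in for already-compressed content, which is an assumption made for convenience rather than a real JPEG or MPEG file.

```python
import os
import zlib

def compression_ratio(data):
    """Compressed size divided by original size (smaller is better)."""
    return len(zlib.compress(data, 9)) / len(data)

text = b"the quick brown fox jumps over the lazy dog. " * 200
noise = os.urandom(len(text))  # stand-in for already-compressed data

print("repetitive text:", round(compression_ratio(text), 3))
print("random bytes   :", round(compression_ratio(noise), 3))

# Lossless: decompression restores every bit of the input.
assert zlib.decompress(zlib.compress(text)) == text
```

The repetitive text shrinks to a small fraction of its size, while the random bytes do not shrink at all and typically grow by a few bytes of overhead.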

Although lossless methods such as the ZIP format are used to reduce the size of a single, huge file, they are also widely used to compress several files into one "archive." It is convenient to store and considerably more convenient to transmit a single file than to keep track of multiple files. See lossless compression, archive, archive formats and capacity optimization.
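Bundling several files into one compressed archive can be sketched with Python's standard zipfile module. The helper names and file names below are hypothetical, and the archive is built in memory so that the example does not touch the disk.

```python
import io
import zipfile

def make_archive(files):
    """Pack {name: bytes} into a single ZIP archive and return its bytes."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", compression=zipfile.ZIP_DEFLATED) as zf:
        for name, data in files.items():
            zf.writestr(name, data)
    return buffer.getvalue()

def read_archive(blob):
    """Unpack a ZIP archive from bytes back into {name: bytes}."""
    with zipfile.ZipFile(io.BytesIO(blob)) as zf:
        return {name: zf.read(name) for name in zf.namelist()}

files = {
    "notes.txt": b"hello " * 100,   # hypothetical contents
    "data.csv": b"a,b,c\n" * 100,
}
archive = make_archive(files)
print(len(archive), "bytes for", len(files), "files")
print(read_archive(archive) == files)
```

The result is one object to store or transmit, and reading it back restores each member file exactly.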

Lossless Methods (Dictionary and Statistical)
The widely used dictionary method creates a list of repeatable phrases. For example, GIF images are compressed with this method (see LZW), as are ZIP and JAR archives, whose usual Deflate method combines a dictionary stage with Huffman coding. The statistical method converts characters into variable length strings of bits based on frequency of use (see Huffman coding).
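A dictionary method can be illustrated with a simplified LZW coder in Python. This is a teaching sketch under stated assumptions: codes are kept as a list of integers rather than packed into bits, the table grows without limit, and the variable-width codes and clear codes used by real formats such as GIF are left out. The function names are hypothetical.

```python
def lzw_compress(data):
    """Encode bytes as a list of integer codes using a growing phrase table."""
    table = {bytes([i]): i for i in range(256)}  # all single bytes
    next_code = 256
    current = b""
    codes = []
    for byte in data:
        candidate = current + bytes([byte])
        if candidate in table:
            current = candidate            # keep extending the phrase
        else:
            codes.append(table[current])   # emit longest known phrase
            table[candidate] = next_code   # learn the new phrase
            next_code += 1
            current = bytes([byte])
    if current:
        codes.append(table[current])
    return codes

def lzw_decompress(codes):
    """Rebuild the bytes, reconstructing the same table as the encoder."""
    if not codes:
        return b""
    table = {i: bytes([i]) for i in range(256)}
    next_code = 256
    previous = table[codes[0]]
    out = [previous]
    for code in codes[1:]:
        if code in table:
            entry = table[code]
        elif code == next_code:
            # Phrase defined by the step being decoded right now.
            entry = previous + previous[:1]
        else:
            raise ValueError("invalid LZW code")
        out.append(entry)
        table[next_code] = previous + entry[:1]
        next_code += 1
        previous = entry
    return b"".join(out)

data = b"TOBEORNOTTOBEORTOBEORNOT"
codes = lzw_compress(data)
print(len(data), "bytes ->", len(codes), "codes")
print(lzw_decompress(codes) == data)
```

The phrase table is never transmitted: the decoder rebuilds it from the codes it receives, which is why repeated phrases such as "TOBEOR" cost a single code after their first appearances.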


Copyright © 1981-2019 by The Computer Language Company Inc. All Rights reserved. THIS DEFINITION IS FOR PERSONAL USE ONLY. All other reproduction is strictly prohibited without permission from the publisher.