information theory

information theory

a collection of mathematical theories, based on statistics, concerned with methods of coding, transmitting, storing, retrieving, and decoding information
Collins Discovery Encyclopedia, 1st edition © HarperCollins Publishers 2005

information theory

[‚in·fər′mā·shən ‚thē·ə·rē]
(communications)
A branch of theory devoted to problems in communications which provides criteria for comparing different communication systems on the basis of signaling rate, using a numerical measure of the amount of information gained when the content of a message is learned.
(mathematics)
The branch of probability theory concerned with the likelihood of the transmission of messages, accurate to within specified limits, when the bits of information composing the message are subject to possible distortion.
McGraw-Hill Dictionary of Scientific & Technical Terms, 6E, Copyright © 2003 by The McGraw-Hill Companies, Inc.

Information theory

A branch of communication theory devoted to problems in coding. A unique feature of information theory is its use of a numerical measure of the amount of information gained when the contents of a message are learned. Information theory relies heavily on the mathematical science of probability. For this reason the term information theory is often applied loosely to other probabilistic studies in communication theory, such as signal detection, random noise, and prediction. See Electrical communications

In designing a one-way communication system from the standpoint of information theory, three parts are considered beyond the control of the system designer: (1) the source, which generates messages at the transmitting end of the system, (2) the destination, which ultimately receives the messages, and (3) the channel, consisting of a transmission medium or device for conveying signals from the source to the destination. The source does not usually produce messages in a form acceptable as input by the channel. The transmitting end of the system contains another device, called an encoder, which prepares the source's messages for input to the channel. Similarly the receiving end of the system will contain a decoder to convert the output of the channel into a form that is recognizable by the destination. The encoder and the decoder are the parts to be designed. In radio systems this design is essentially the choice of a modulator and a detector.
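
As a schematic illustration added here (the function names source, encode, channel, and decode are placeholders, not a standard API), the division of roles can be sketched in a few lines of Python: the source, the channel, and the destination are taken as given, and only the encoder and decoder are open to design.

# Illustrative sketch of a one-way system: the source, the channel, and the
# destination are taken as given; only encode() and decode() are designed.
import random

def source():                      # given: generates messages
    return random.choice(["YES", "NO", "MAYBE"])

def encode(message):               # to be designed: message -> channel input
    return {"YES": "00", "NO": "01", "MAYBE": "10"}[message]

def channel(signal):               # given: here an ideal (noiseless) binary channel
    return signal

def decode(signal):                # to be designed: channel output -> message
    return {"00": "YES", "01": "NO", "10": "MAYBE"}[signal]

message = source()
assert decode(channel(encode(message))) == message  # the destination recovers the message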

A source is called discrete if its messages are sequences of elements (letters) taken from an enumerable set of possibilities (alphabet). Thus sources producing integer data or written English are discrete. Sources which are not discrete are called continuous, for example, speech and music sources. The treatment of continuous cases is sometimes simplified by noting that a signal of finite bandwidth can be encoded into a discrete sequence of numbers.

The output of a channel need not agree with its input. For example, a channel might, for secrecy purposes, contain a cryptographic device to scramble the message. Still, if the output of the channel can be computed knowing just the input message, then the channel is called noiseless. If, however, random agents make the output unpredictable even when the input is known, then the channel is called noisy. See Communications scrambling, Cryptography

Many encoders first break the message into a sequence of elementary blocks; next they substitute for each block a representative code, or signal, suitable for input to the channel. Such encoders are called block encoders. For example, telegraph and teletype systems both use block encoders in which the blocks are individual letters. Entire words form the blocks of some commercial cablegram systems. It is generally impossible for a decoder to reconstruct with certainty a message received via a noisy channel. Suitable encoding, however, may make the noise tolerable.

Even when the channel is noiseless, a variety of encoding schemes exists and there is a problem of picking a good one. Of all encodings of English letters into dots and dashes, the Continental Morse encoding is nearly the fastest possible one. It achieves its speed by associating short codes with the most common letters. A noiseless binary channel (capable of transmitting two kinds of pulse, 0 and 1, of the same duration) provides the following example. Suppose one had to encode English text for this channel. A simple encoding might just use 27 different five-digit codes to represent word space (denoted by #), A, B, . . . , Z; say # 00000, A 00001, B 00010, C 00011, . . . , Z 11010. The word #CAB would then be encoded into 00000000110000100010. A similar encoding is used in teletype transmission; however, it places a third kind of pulse at the beginning of each code to help the decoder stay in synchronism with the encoder.
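
The five-digit code just described is easy to reproduce. The short Python sketch below, added for illustration and assuming the straightforward numbering # = 0, A = 1, …, Z = 26, rebuilds the table and encodes the word #CAB.

# The 27-symbol alphabet (word space "#", then A-Z) numbered 0-26 and written
# as five binary digits each.
alphabet = "#ABCDEFGHIJKLMNOPQRSTUVWXYZ"
code = {letter: format(index, "05b") for index, letter in enumerate(alphabet)}

assert code["#"] == "00000" and code["A"] == "00001" and code["C"] == "00011"

encoded = "".join(code[letter] for letter in "#CAB")
print(encoded)  # 00000000110000100020 is wrong; prints 00000000110000100010, matching the text above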

McGraw-Hill Concise Encyclopedia of Engineering. © 2002 by The McGraw-Hill Companies, Inc.

information theory

The study of encoding and transmitting information. From Claude Shannon's 1948 paper, "A Mathematical Theory of Communication," which proposed the use of binary digits for coding information. Shannon said that all information has a "source rate" that can be measured in bits per second and requires a transmission channel with a capacity equal to or greater than the source rate.
Copyright © 1981-2025 by The Computer Language Company Inc. All rights reserved.
The following article is from The Great Soviet Encyclopedia (1979). It might be outdated or ideologically biased.

Information Theory

 

the mathematical discipline that studies the processes of storage, transformation, and transmission of information. Information theory is an essential part of cybernetics.

At the basis of information theory lies a definite method for measuring the quantity of information contained in given data (“messages”). Information theory proceeds from the idea that the messages designated for retention in a storage device or for transmission over a communication channel are not known in advance with complete certainty. Only the set from which these messages may be selected is known in advance and, at best, how frequently certain of these messages are selected (that is, the probability of the messages). In information theory it is shown that the “uncertainty” encountered in such circumstances admits of a quantitative expression and that precisely this expression (and not the specific nature of the messages themselves) determines the possibility of their storage and transmission.

As such a “measure of uncertainty” in information theory one uses the number of binary digits (bits) necessary to record an arbitrary message from a given source. More precisely, one looks at all possible methods for representing the messages by sequences of the symbols 0 and 1 (binary codes) that satisfy two conditions: (a) different sequences correspond to different messages and (b) upon the transcription of a certain sequence of messages into coded form this sequence must be unambiguously recoverable. Then as a measure of the uncertainty one takes the average length of the coded sequence that corresponds to the most economical method of encoding; one binary digit serves as the unit of measurement.

For example, let certain messages x1, x2, and x3 appear with probabilities of ½, ⅜, and ⅛, respectively. Any code that is too short, such as

x1 = 0, x2 = 1, x3 = 01

is unsuitable since it violates condition (b): the sequence 01 can denote either x1x2 or x3. The code

x1 = 0, x2 = 10, x3 = 11

satisfies conditions (a) and (b). To it corresponds an average length of a coded sequence equal to

1·½ + 2·⅜ + 2·⅛ = 1.5

It is not hard to see that no other code can give a smaller value, that is, the code indicated is the most economical. In accordance with our choice of a measure for uncertainty, the uncertainty of the given information source should be taken equal to 1.5 binary units.
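
Both claims can be checked mechanically. The Python sketch below, added for illustration, shows that the first code is ambiguous on the sequence 01 and that the second code has average length 1·½ + 2·⅜ + 2·⅛ = 1.5 binary digits.

from fractions import Fraction

probabilities = {"x1": Fraction(1, 2), "x2": Fraction(3, 8), "x3": Fraction(1, 8)}

bad_code = {"x1": "0", "x2": "1", "x3": "01"}    # violates condition (b)
good_code = {"x1": "0", "x2": "10", "x3": "11"}  # satisfies (a) and (b)

# The sequence 01 is ambiguous under the bad code: it reads as x1 x2 or as x3.
assert bad_code["x1"] + bad_code["x2"] == bad_code["x3"] == "01"

# Average length of the good code, weighted by the message probabilities.
average_length = sum(p * len(good_code[x]) for x, p in probabilities.items())
print(average_length)  # 3/2, i.e. 1.5 binary digits per message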

Here it is appropriate to note that “message,” “communication channel,” and other terms are understood very broadly in information theory. Thus, from the viewpoint of information theory, an information source is described by enumerating the set x1, x2, … of possible messages (which can be the words of some language, results of measurements, or television pictures) and their respective probabilities p1, p2, ….

There is no simple formula expressing the exact minimum H′ of the average number of bits necessary for encoding the messages x1, x2, …, xn in terms of the probabilities p1, p2, …, pn of these messages. However, the specified minimum is not less than the value

H = p1 log2 (1/p1) + p2 log2 (1/p2) + … + pn log2 (1/pn)

(where log2 a denotes the logarithm of the quantity a to base 2) and may not exceed it by more than one unit. The quantity H (the entropy of the set of messages) possesses simple formal properties, and for all conclusions of information theory that are of an asymptotic character, corresponding to the case H′ → ∞, the difference between H and H′ is absolutely immaterial. Accordingly, the entropy is taken as the measure of the uncertainty of the messages from a given source. In the example above, the entropy is equal to

H = ½ log2 2 + ⅜ log2 (8/3) + ⅛ log2 8 ≈ 1.41 binary units
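
The entropy of the example source can be evaluated directly from this formula. The Python sketch below, added for illustration, reproduces the figure of roughly 1.41 binary units and confirms that it does not exceed the exact minimum H′ = 1.5 found above.

import math

def entropy(probabilities):
    # H = sum of p * log2(1/p) over the message probabilities p
    return sum(p * math.log2(1 / p) for p in probabilities if p > 0)

H = entropy([1/2, 3/8, 1/8])
print(round(H, 2))  # 1.41; the exact minimum found above is H' = 1.5, and H <= H' <= H + 1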

From the viewpoint stated, the entropy of an infinite aggregate, as a rule, turns out to be infinite. Therefore, when applied to an infinite collection it is treated differently: a certain precision level is assigned, and the concept of ε-entropy is introduced as the entropy of the information recorded with a precision of ε, if the message is a continuous quantity or function (for example, of time).

Just as with the concept of entropy, the concept of the amount of information contained in a certain random object (random quantity, random vector, or random function) relative to another is introduced at first for objects with a finite number of possible values. Then the general case is studied with the help of a limiting process. In contrast to entropy, the amount of information, for example, in a certain continuously distributed random variable relative to another continuously distributed variable, very often turns out to be finite.
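
For finite cases the computation is elementary. The Python sketch below, added for illustration with an arbitrary joint distribution of two binary random variables, evaluates the standard expression I(X;Y) = Σ p(x,y) log2[p(x,y)/(p(x)p(y))].

import math

# An arbitrary illustrative joint distribution p(x, y) for two binary variables.
joint = {(0, 0): 0.4, (0, 1): 0.1, (1, 0): 0.1, (1, 1): 0.4}

p_x = {x: sum(p for (a, b), p in joint.items() if a == x) for x in (0, 1)}
p_y = {y: sum(p for (a, b), p in joint.items() if b == y) for y in (0, 1)}

# Amount of information in one variable relative to the other:
# I(X;Y) = sum over (x, y) of p(x, y) * log2( p(x, y) / (p(x) * p(y)) )
info = sum(p * math.log2(p / (p_x[x] * p_y[y])) for (x, y), p in joint.items() if p > 0)
print(round(info, 3))  # about 0.278 bits for this joint distribution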

The concept of a communication channel is of an extremely general nature in information theory. In essence, a communication channel is given by specifying a set of “admissible messages” at the “channel input,” a set of “output messages,” and a collection of conditional probabilities for receiving one or another message at the output for a given input message. These conditional probabilities describe the effect of “noise” distorting the transmitted information. “Connecting” any information source to the channel, one may calculate the amount of information contained in the messages at the output relative to those at the input. The upper limit of these amounts of information, taken over all admissible sources, is termed the capacity of the channel. The capacity of a channel is its fundamental information characteristic. Regardless of the effect (possibly strong) of noise in the channel, almost error-free transmission is possible with suitable coding, provided the entropy of the incoming information does not exceed the channel capacity.
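
As a concrete illustration added here (not part of the original article), consider a binary symmetric channel that flips each input bit with probability 0.1. Searching over input distributions for the largest amount of information reproduces the well-known capacity 1 − H2(0.1) ≈ 0.531 bits per use.

import math

def binary_entropy(p):
    return 0.0 if p in (0.0, 1.0) else -(p * math.log2(p) + (1 - p) * math.log2(1 - p))

def information_through_bsc(q, flip):
    # Amount of information in the output about the input when the input is 1
    # with probability q and the channel flips each bit with probability `flip`.
    p_out_one = q * (1 - flip) + (1 - q) * flip
    return binary_entropy(p_out_one) - binary_entropy(flip)

flip = 0.1
# The capacity is the upper limit over all input distributions; a coarse grid
# search already lands on q = 0.5 and the closed form 1 - H2(flip).
capacity = max(information_through_bsc(q / 1000, flip) for q in range(1001))
print(round(capacity, 4), round(1 - binary_entropy(flip), 4))  # both about 0.531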

Information theory searches for methods for transmitting information that are optimal with respect to speed and reliability, having established theoretical limits to the quality attainable. Clearly, information theory is of an essentially statistical character; therefore, a significant portion of its mathematical methods is derived from probability theory.

The foundations of information theory were laid in 1948–49 by the American scientist C. Shannon. The Soviet scientists A. N. Kolmogorov and A. Ia. Khinchin contributed to its theoretical branches, and V. A. Kotel’nikov, A. A. Kharkevich, and others to the branches concerning applications.

REFERENCES

Iaglom, A. M., and I. M. Iaglom. Veroiatnost’ i informatsiia, 2nd ed. Moscow, 1960.
Shannon, C. “Statisticheskaia teoriia peredachi elektricheskikh signalov.” In Teoriia peredachi elektricheskikh signalov pri nalichii pomekh: Sb. perevodov. Moscow, 1953.
Goldman, S. Teoriia informatsii. Moscow, 1957. (Translated from English.)
Teoriia informatsii i ee prilozheniia: Sb. perevodov. Moscow, 1959.
Khinchin, A. Ia. “Poniatie entropii v teorii veroiatnostei.” Uspekhi matematicheskikh nauk, 1953, vol. 8, issue 3.
Kolmogorov, A. N. Teoriia peredachi informatsii. Moscow, 1956. (Academy of Sciences of the USSR. Session on the scientific problems of the automation of production. Plenary session.)
Peterson, W. W. Kody, ispravliaiushchie oshibki. Moscow, 1964. (Translated from English.)

IU. V. PROKHOROV

The Great Soviet Encyclopedia, 3rd Edition (1970-1979). © 2010 The Gale Group, Inc. All rights reserved.