## ‘A Mathematical Theory of Communication,’ pp. 379-423 in Bell System Technical Journal, Vol. 27, No. 3, July, 1948 and pp. 623-656 in ibid., No. 4, October, 1948.

New York: American Telephone and Telegraph Company, 1948.

First edition, journal issue, of “the most famous work in the history of communication theory” (*Origins of Cyberspace*, 880), and rare in such fine condition. “Probably no single work in this century has more profoundly altered man's understanding of communication than C. E. Shannon’s article, ‘A mathematical theory of communication’, first published in 1948” (Slepian). “Th[is] paper gave rise to ‘information theory’, which includes metaphorical applications in very different disciplines, ranging from biology to linguistics via thermodynamics or quantum physics on the one hand, and a technical discipline of mathematical essence, based on crucial concepts like that of channel capacity, on the other” (DSB). On the first page of the paper is the first appearance of the term ‘bit’ for ‘binary digit.’ “A half century ago, Claude Shannon published his epic paper ‘A Mathematical Theory of Communication.’ This paper [has] had an immense impact on technological progress, and so on life as we now know it … One measure of the greatness of the [paper] is that Shannon’s major precept that all communication is essentially digital is now commonplace among the modern digitalia, even to the point where many wonder why Shannon needed to state such an obvious axiom” (Blahut & Hajek). “In 1948 Shannon published his most important paper, entitled ‘A mathematical theory of communication’. This seminal work transformed the understanding of the process of electronic communication by providing it with a mathematics, a general set of theorems rather misleadingly called information theory. The information content of a message, as he defined it, has nothing to do with its inherent meaning, but simply with the number of binary digits that it takes to transmit it. Thus, information, hitherto thought of as a relatively vague and abstract idea, was analogous to physical energy and could be treated like a measurable physical quantity. 
His definition was both self-consistent and unique in relation to intuitive axioms. To quantify the deficit in the information content in a message he characterized it by a number, the entropy, adopting a term from thermodynamics. Building on this theoretical foundation, Shannon was able to show that any given communications channel has a maximum capacity for transmitting information. The maximum, which can be approached but never attained, has become known as the Shannon limit. So wide were its repercussions that the theory was described as one of humanity’s proudest and rarest creations, a general scientific theory that could profoundly and rapidly alter humanity’s view of the world. Few other works of the twentieth century have had a greater impact; he altered most profoundly all aspects of communication theory and practice” (*Biographical Memoirs of Fellows of the Royal Society*, Vol. 55, 2009). Remarkably, Shannon was initially not planning to publish the paper, and did so only at the urging of colleagues at Bell Laboratories. This journal issue of Shannon’s great work precedes, and is rarer in commerce than, the *Bell Telephone System Technical Publications* Monograph (#B-1598:1948), the first separate publication. No offprints of the BSTJ articles offered here are known.

*Provenance:* Regnar Svensson (ownership signature to front wrappers). Svensson was employed by the Dansk Industri Syndikat, a defence manufacturer in Copenhagen.

“Relying on his experience in Bell Laboratories, where he had become acquainted with the work of other telecommunication engineers such as Harry Nyquist and Ralph Hartley, Shannon published in two issues of the *Bell System Technical Journal* his paper ‘A Mathematical Theory of Communication.’ The general approach was pragmatic; he wanted to study ‘the savings due to statistical structure of the original message’ (p. 379), and for that purpose, he had to neglect the semantic aspects of information, as Hartley did for ‘intelligence’ twenty years before. For Shannon, the communication process was stochastic in nature, and the great impact of his work, which accounts for the applications in other fields, was due to the schematic diagram of a general communication system that he proposed. An ‘information source’ outputs a ‘message,’ which is encoded by a ‘transmitter’ into the transmitted ‘signal.’ The received signal is the sum of the transmitted signal and unavoidable ‘noise.’ It is recovered as a decoded message, which is delivered to the ‘destination.’ The received signal, which is the sum between the signal and the ‘noise,’ is decoded in the ‘receiver’ that gives the message to destination. His theory showed that choosing a good combination of transmitter and receiver makes it possible to send the message with arbitrarily high accuracy and reliability, provided the information rate does not exceed a fundamental limit, named the ‘channel capacity.’ The proof of this result was, however, nonconstructive, leaving open the problem of designing codes and decoding means that were able to approach this limit.

“The paper was presented as an ensemble of twenty-three theorems that were mostly rigorously proven (but not always, hence the work of A. I. Khinchin and later A.N. Kolmogorov, who based a new probability theory on the information concept). Shannon’s paper was divided into four parts, differentiating between discrete or continuous sources of information and the presence or absence of noise. In the simplest case (discrete source without noise), Shannon presented the [entropy] formula he had already defined in his mathematical theory of cryptography, which in fact can be reduced to a logarithmic mean. He defined the bit, the contraction of ‘binary digit’ (as suggested by John W. Tukey, his colleague at Bell Labs) as the unit for information. Concepts such as ‘redundancy,’ ‘equivocation,’ or channel ‘capacity,’ which existed as common notions, were defined as scientific concepts. Shannon stated a fundamental source-coding theorem, showing that the mean length of a message has a lower limit proportional to the entropy of the source. When noise is introduced, the channel-coding theorem stated that when the entropy of the source is less than the capacity of the channel, a code exists that allows one to transmit a message ‘so that the output of the source can be transmitted over the channel with an arbitrarily small frequency of errors.’ This programmatic part of Shannon’s work explains the success and impact it had in telecommunications engineering. The turbo codes (error correction codes) achieved a low error probability at information rates close to the channel capacity, with reasonable complexity of implementation, thus providing for the first time experimental evidence of the channel capacity theorem” (DSB).
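The source-coding bound described above can be illustrated with a minimal Python sketch. It is our own illustration, not code from Shannon's paper or the DSB entry: the function names `entropy_bits` and `mean_code_length`, the four-symbol source, and its prefix code are all assumptions chosen for the example.

```python
import math

def entropy_bits(probs):
    """Shannon entropy H = -sum(p * log2(p)), in bits per symbol."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

def mean_code_length(probs, codewords):
    """Average number of binary digits per symbol under a given code."""
    return sum(p * len(word) for p, word in zip(probs, codewords))

# Hypothetical source: four symbols with unequal probabilities.
probs = [0.5, 0.25, 0.125, 0.125]
# Hypothetical prefix code: shorter codewords for likelier symbols.
codewords = ["0", "10", "110", "111"]

h = entropy_bits(probs)
avg = mean_code_length(probs, codewords)
print(h, avg)  # prints "1.75 1.75"
```

For this particular source the code meets the entropy exactly, at 1.75 binary digits per symbol; a fixed-length code would need 2. For other sources a code of this kind can only approach the entropy from above.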

“The landmark event that established the discipline of information theory and brought it to immediate worldwide attention was the publication of Claude E. Shannon's classic paper ‘A Mathematical Theory of Communication’ in the Bell System Technical Journal in July and October 1948. Prior to this paper, limited information-theoretic ideas had been developed at Bell Labs, all implicitly assuming events of equal probability. Harry Nyquist's 1924 paper, ‘Certain Factors Affecting Telegraph Speed,’ contains a theoretical section quantifying ‘intelligence’ and the ‘line speed’ at which it can be transmitted by a communication system, giving the relation $W = K \log m$ (recalling Boltzmann's constant), where $W$ is the speed of transmission of intelligence, $m$ is the number of different voltage levels to choose from at each time step, and $K$ is a constant. Ralph Hartley's 1928 paper, ‘Transmission of Information,’ uses the word information as a measurable quantity, reflecting the receiver’s ability to distinguish one sequence of symbols from any other, thus quantifying information as $H = \log S^{n} = n \log S$, where $S$ was the number of possible symbols, and $n$ the number of symbols in a transmission. The unit of information was therefore the decimal digit … Alan Turing in 1940 used similar ideas as part of the statistical analysis of the breaking of the German second world war Enigma ciphers. Much of the mathematics behind information theory with events of different probabilities were developed for the field of thermodynamics by Ludwig Boltzmann and J. Willard Gibbs” (Wikipedia, accessed 20 October 2018).
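Hartley's measure quoted above is simple enough to check by hand, and a short Python sketch makes the choice of unit explicit. This is our illustration only: the function name `hartley_information` and the sample values are assumptions, not taken from Hartley's or Shannon's papers.

```python
import math

def hartley_information(num_symbols, length, base=10):
    """Hartley's measure H = n * log(S) for n symbols drawn from S choices.

    The logarithm base fixes the unit: base 10 gives decimal digits,
    base 2 gives binary digits (bits).
    """
    return length * math.log(num_symbols, base)

# Hypothetical message: three symbols, each one of ten equally likely choices.
decimal_digits = hartley_information(10, 3, base=10)
bits = hartley_information(10, 3, base=2)
print(decimal_digits, bits)
```

The same three-symbol message measures 3 decimal digits or just under 10 bits. The measure presumes equally likely symbols, which is the restriction Shannon's entropy later removed.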

“Shannon saw the communication process as essentially stochastic in nature. The meaning of information plays no role in the theory. In the Shannon paradigm, information from a "source" (defined as a stochastic process) must be transmitted through a "channel" (defined by a transition probability law relating the channel output to the input). The system designer is allowed to place a device called an "encoder" between the source and channel which can introduce a fixed though finite (coding) delay. A "decoder" can be placed at the output of the channel. The theory seeks to answer questions such as how rapidly or reliably can the information from the source be transmitted over the channel, when one is allowed to optimize with respect to the encoder/decoder?

“Shannon gives elegant answers to such questions. His solution has two parts. First, he gives a fundamental limit which, for example, might say that for a given source and channel, it is impossible to achieve a fidelity or reliability or speed better than a certain value. Second, he shows that for large coding delays and complex codes, it is possible to achieve performance that is essentially as good as the fundamental limit. To do this, the encoder might have to make use of a coding scheme that would be too slow or complicated to be used in practice.

“One of Shannon's most brilliant insights was the separation of problems like these (where the encoder must take both the source and channel into account) into two coding problems. He showed that with no loss of generality one can study the source and channel separately and assume that they are connected by a digital (say binary) interface. One then finds the (source) encoder/decoder to optimize the source-to-digital performance, and the (channel) encoder/decoder to optimize the performance of the channel as a transmitter of digital data. Solution of the source and channel problems leads immediately to the solution of the original joint source-channel problem. The fact that a digital interface between the source and channel is essentially optimal has profound implications in the modern era of digital storage and communication of all types of information.

“Thus the revolutionary elements of Shannon's contribution were the invention of the source-encoder-channel-decoder-destination model, and the elegant and remarkably general solution of the fundamental problems which he was able to pose in terms of this model. Particularly significant is the demonstration of the power of coding with delay in a communication system, the separation of the source and channel coding problems, and the establishment of fundamental natural limits on communication.

“In the course of developing the solutions to the basic communication problem outlined above, Shannon created several original mathematical concepts. Primary among these is the notion of the "entropy" of a random variable (and by extension of a random sequence), the "mutual information" between two random variables or sequences, and an algebra that relates these quantities and their derivatives. He also achieved a spectacular success with his technique of random coding, in which he showed that an encoder chosen at random from the universe of possible encoders will, with high probability, give essentially optimal performance” (Sloane & Wyner, pp. 3-4).
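The notions of a channel as a transition probability law and of mutual information, both mentioned above, can be made concrete with a small Python sketch. It is offered as an illustration under stated assumptions and is not drawn from the sources quoted: the function names are hypothetical, and the binary symmetric channel, which flips each binary digit with probability `p`, is chosen here simply as the most familiar textbook case.

```python
import math

def entropy_bits(probs):
    """Shannon entropy in bits of a discrete distribution."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

def mutual_information_bits(input_probs, transition):
    """I(X;Y) = H(Y) - H(Y|X), where transition[i][j] = P(Y=j | X=i)."""
    num_outputs = len(transition[0])
    # Output distribution induced by the input distribution and the channel.
    output_probs = [
        sum(px * row[j] for px, row in zip(input_probs, transition))
        for j in range(num_outputs)
    ]
    # Conditional entropy: average entropy of the channel's rows.
    h_y_given_x = sum(px * entropy_bits(row)
                      for px, row in zip(input_probs, transition))
    return entropy_bits(output_probs) - h_y_given_x

def bsc(p):
    """Transition law of an assumed binary symmetric channel."""
    return [[1 - p, p], [p, 1 - p]]

uniform = [0.5, 0.5]  # equally likely input digits
for p in (0.0, 0.11, 0.5):
    print(p, mutual_information_bits(uniform, bsc(p)))
```

For this symmetric channel the uniform input maximises the mutual information, so the printed values are its capacity in bits per use: one bit when the channel is noiseless, about half a bit at an error rate of 0.11, and nothing when each digit is flipped half the time.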

“Shannon had the foresight to overlay the subject of communication with a distinct partitioning into *sources*, *source encoders*, *channel encoders*, *channels*, and *associated channel and source decoders*. Although his formalization seems quite obvious in our time, it was not so obvious back then. Shannon further saw that channels and sources could and should be described using the notions of entropy and conditional entropy. He argued persuasively for the use of these notions, both through their characterization by intuitive axioms and by presentation of precise coding theorems. Moreover, he indicated how very explicit, operationally significant concepts such as the information content of a source or the information capacity of a channel can be identified using entropy and maximization of functions involving entropy.

“Shannon’s revolutionary work brought forth this new subject of information theory fully formed but waiting for the maturity that fifty years of aging would bring. It is hard to imagine how the subject could have been created in an evolutionary way, though after the conception its evolution proceeded in the hands of hundreds of authors to produce the subject in its current state of maturity …

“The impact of Shannon’s theory of information on the development of telecommunication has been immense. This is evident to those working at the edge of advancing developments, though perhaps not quite so visible to those involved in routine design. The notion that a channel has a specific information capacity, which can be measured in bits per second, has had a profound influence. On the one hand, this notion offers the promise, at least in theory, of communication systems with frequency of errors as small as desired for a given channel for any data rate less than the channel capacity. Moreover, Shannon’s associated existence proof provided tantalizing insight into how ideal communication systems might someday fulfil the promise. On the other hand, this notion also clearly establishes a limit on the communication rate that can be achieved over a channel, offering communication engineers the ultimate benchmark with which to calibrate progress toward construction of the ultimate communication system for a given channel.

“The fact that a specific capacity can be reached, and that no data transmission system can exceed this capacity, has been the holy grail of modern design for the last fifty years. Without the guidance of Shannon’s capacity formula, modern designers would have stumbled more often and proceeded more slowly. Communication systems ranging from deep-space satellite links to storage devices such as magnetic tapes and ubiquitous compact discs, and from high-speed internets to broadcast high-definition television, came sooner and in better form because of his work. Aside from this wealth of consequences, the wisdom of Claude Shannon’s insights may in the end be his greatest legacy” (Blahut & Hajek).

“The 1948 paper rapidly became very famous; it was published one year later as a book, with a postscript by Warren Weaver regarding the semantic aspects of information” (DSB). The book was titled *The Mathematical Theory of Communication*, a small but significant title change reflecting the generality of this work.

OOC 880. Blahut & Hajek, Foreword to the book edition, University of Illinois Press, 1998. Slepian (ed.), *Key Papers in the Development of Information Theory*, Institute of Electrical and Electronics Engineers, 1974. Brainerd, ‘Origin of the term *Bit*,’ *Annals of the History of Computing* 6 (1984), pp. 152-155. Sloane & Wyner (eds.), *Claude Elwood Shannon Collected Papers*, 1993.

Two complete journal issues, 8vo (228 x 153 mm), pp. 379-592; 593-751, [1]. Original printed wrappers (spines faded). Housed in custom cloth folder and morocco-backed slipcase.

Item #6187

**Price: $9,500.00**