## Theoria Combinationis Observationum Erroribus Minimis Obnoxiae. Pars prior [– Posterior]. [Offered with:] Supplementum Theoria Combinationis Observationum Erroribus Minimis Obnoxiae.

Göttingen: Dieterich, 1823-1828.

First edition, extremely rare separately-paginated offprints, of Gauss’s definitive presentation of the theory of least squares, which “was the dominant theme – the leitmotif – of nineteenth-century mathematical statistics” (Stigler, p. 11). “Surveying problems motivated Gauss to develop his ideas on least squares and more general problems of what is now called mathematical statistics. The result was the definitive exposition of his mature ideas in the *Theoria combinationis obseruationum erroribus minimis obnoxiae* (1823, with supplement in 1828)” (DSB). The *Supplementum* is particularly important for the new methods of linear algebra and numerical analysis which Gauss introduces in the course of his analysis of least squares. The principle of least squares arose from the problem of combining sets of overdetermined equations to form a square system that could be solved for the unknowns. Although Gauss had discovered the method of least squares during the last decade of the eighteenth century and used it regularly after 1801 in astronomical calculations, it was Legendre who first published it, in an appendix to his *Nouvelles méthodes pour la détermination des orbites des comètes* (1805), but he provided no justification for the method. In 1809, toward the end of his treatise *Theoria motus corporum coelestium*, Gauss gave a probabilistic justification of the method, in which he essentially showed that if the errors are normally distributed then least squares gives maximum likelihood estimates. However, his reasons for assuming normality were tenuous, and Gauss himself later rejected the approach. Shortly thereafter, Laplace turned to the subject and derived the method of least squares from the principle that the best estimate should have the smallest mean of the absolute value of the error. Since the mean absolute error does not lead directly to the least squares principle, Laplace gave an asymptotic argument based on his central limit theorem.
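The reduction described above — combining an overdetermined set of observation equations into a square system — is, in modern terms, the formation of the normal equations. A minimal sketch in Python; the line-fit data are invented purely for illustration:

```python
import numpy as np

# Overdetermined system: four observation equations, two unknowns.
# Hypothetical data: fit a line y = c0 + c1*t to four observations.
A = np.array([[1.0, 0.0],
              [1.0, 1.0],
              [1.0, 2.0],
              [1.0, 3.0]])
b = np.array([0.1, 1.9, 4.1, 5.9])

# The "square system" (normal equations) A^T A x = A^T b can be
# solved directly for the unknowns.
x = np.linalg.solve(A.T @ A, A.T @ b)
print(x)  # least squares estimates of c0, c1
```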
“In the 1820s Gauss returned to least squares in two memoirs, the first in two parts, published by the Royal Society of Göttingen under the common title *Theoria Combinationis Observationum Erroribus Minimis Obnoxiae*. In the *Pars Prior* of the first memoir, Gauss substituted the root mean square error for Laplace’s mean absolute error. This enabled him to prove his minimum variance theorem: of all linear combinations of measurements estimating an unknown, the least squares estimate has the greatest precision. The remarkable thing about this theorem is that it does not depend on the distributions of the errors, and, unlike Laplace’s result, it is not asymptotic. The second part of the first memoir is dominated by computational considerations. Among other things Gauss gives several formulas for the residual sum of squares, a technique for adding and deleting an observation from an already solved problem, and new methods for computing variances. The second memoir, called *Supplementum*, is a largely self-contained work devoted to the application of the least squares principle to geodesy. The problem here is to adjust observations so that they satisfy certain constraints, and Gauss shows that the least squares solution is optimal in a very wide sense” (Stewart, pp. ix-x). ABPC/RBH list one copy of the 1823 offprint (William Schab, Cat. 51, 1971) and none of the 1828 *Supplementum*.

In the 1823 work, “Gauss reviews the history of the method of least squares, beginning with his own early applications of the method, mentioning Legendre, and explaining his own argument (1809) for assuming a normal distribution of errors and choosing the most probable value in the posterior distribution of the parameter as estimate.

“He remarks that Laplace thereafter (from 1811 on) considered the problem from a new point of view, by seeking the most advantageous combination of the observations instead of the most probable value of the parameter, and that Laplace proved the curious result that the method of least squares always and regardless of the distribution of the errors leads to the most advantageous combination when the number of observations can be considered as infinitely large.

“Gauss says that both of these justifications leave something to be desired. He has, therefore, starting from the same point of view as Laplace, taken up the problem again, and using the mean square error of estimation as criterion, he has shown that the method of least squares generally leads to the best combination of the observations, not as an approximation but exactly, whatever be the error distribution and the number of observations. He states that the problem of finding the combination having the smallest estimating error ‘is unquestionably one of the most important problems in the application of mathematics to the natural sciences.’

“He mentions that the estimating error may be measured by any even power of the error and that it is arbitrary which one to choose. Like Laplace he likens the estimation of a parameter to a game of chance in which one can lose but not win. The risk of such a game is evaluated by means of the sum of the products of the possible losses by their probabilities. Gauss concludes that it is not clear in itself what losses to associate with errors; we can only say that the loss should be a positive and increasing function of the absolute value of the error. Among the infinite number of such functions he says that the simplest without doubt is the square.

“He criticises Laplace’s choice of the absolute error on the following grounds: (1) it is, like other choices, arbitrary; (2) it is not continuously differentiable; (3) it seems natural to let the importance of an error increase in greater proportion than the error itself because two errors of the same size are considered to be less serious than one error of double the size; (4) the results of using the absolute error are less simple and satisfactory than for the square.

“Gauss follows Laplace in developing a theory of estimation based on the frequency distribution of the estimates; he does not use inverse probability and the normal distribution as in 1809. Only the second moment of the error distribution is assumed to exist, and within the class of linear estimates the one with the smallest mean square error is chosen as the best. Like Laplace he considers only estimates that coincide with the true value when the observations are free of error so that the estimates are unbiased …

“Gauss’s real contribution is the extension of Laplace’s theory to finite samples and his elegant proofs of theorems on the estimation of linear functions of the parameters, estimation under restriction of the parameters, recursive least-squares estimation, and the unbiased estimation of the variance. He solved all the estimation problems for the linear model of full rank …

“Gauss prefers his second justification of the method of least squares to the first, as indicated in a letter (1839) to Bessel … Gauss follows Laplace in preferring the frequentist theory of estimation to the previous one based on inverse probability; he even goes so far as to characterize the latter as metaphysics. Nevertheless, in his lectures on the method of least squares, he continued to present both proofs” (Hald, pp. 466-7).

“It is appropriate to conclude with a list of what was new in Gauss’s treatment of random errors.

- The careful distinction between systematic and random errors.
- The use of the first moment of a distribution to measure precision.
- The use of the second moment to measure precision.
- A Chebyshev-like inequality.
- The correct formula for the expectation of a function of a random variable.
- The rate of convergence of the sample mean and variance.
- The correct formula for estimating the precision of observations from the residual sum of squares.

“The climax of the *Pars prior* is Gauss’s proof of his minimum variance theorem (arts. 19-21)” (Stewart, p. 223).
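Stated in modern notation (a paraphrase, now known as the Gauss–Markov theorem — Gauss worked with observation equations and weights, not matrices), the theorem says:

```latex
% Model: y = X\beta + \varepsilon with E(\varepsilon) = 0 and
% Var(\varepsilon) = \sigma^2 I; no assumption is made about the
% distribution of the errors. The least squares estimate
\[
  \hat\beta = (X^{\top}X)^{-1}X^{\top}y
\]
% is unbiased, and among all linear unbiased estimates
% \tilde\beta = Cy it minimizes the variance of every linear
% combination of the unknowns:
\[
  \operatorname{Var}\bigl(a^{\top}\hat\beta\bigr)
  \;\le\;
  \operatorname{Var}\bigl(a^{\top}\tilde\beta\bigr)
  \quad\text{for all } a .
\]
```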

“With Gauss’s 1823 paper a new style is introduced in mathematical statistics. His paper is divided into small sections each devoted to the proof of a single theorem. It is the great mathematician who presents his polished proofs with a minimum of motivation and explanation” (Hald, p. 467).

The *Supplement* (1828) is motivated by Gauss’s experience in analysing geodetic data: Gauss had participated in the triangulation of the Kingdom of Hanover from about 1820.

“The *Supplement* is a separate memoir devoted to the application of least squares to geodesy. Here the problem is not to estimate unknowns in an overdetermined set of equations but to adjust data to satisfy constraints. For example, the measured angles of a triangle must be adjusted to sum to 180° (with a correction for spherical excess). This memoir is given less than its due in the secondary literature, partly because Gauss’s commentators tend to run out of steam at this point, but also because Gauss’s oblique style is particularly demanding here. Nonetheless, it contains some of Gauss’s finest work on least squares …
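The triangle example admits a one-line illustration: with equal weights, the least squares adjustment under the single constraint simply distributes the misclosure equally among the three angles. A sketch with invented measurements (spherical excess ignored):

```python
import numpy as np

# Hypothetical measured angles of a plane triangle (degrees), equal weights.
measured = np.array([59.95, 60.03, 60.08])  # sums to 180.06

# Least squares adjustment under the constraint a1 + a2 + a3 = 180:
# minimize the sum of squared corrections subject to the constraint.
# With equal weights, the misclosure is spread equally over the angles.
misclosure = measured.sum() - 180.0
adjusted = measured - misclosure / 3.0
print(adjusted, adjusted.sum())
```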

“For the problems of adjusting data, Gauss proved a general minimum variance theorem that applies to all functions of the data. Gauss is also very close to generalizing the original minimum variance theorem of the *Pars prior* and proving that the most reliable estimate of any linear function of the unknowns is its value at the most reliable values of the unknowns … Whether Gauss made this connection is not easily decided … Arguably, he was aware of the general equivalence of the two problems and therefore was in possession of the generalized minimum variance theorem” (Stewart, pp. 232-5).

Stewart draws attention to the important advances in linear algebra and numerical analysis which Gauss makes in the course of his treatise on least squares.

“In deriving his results, Gauss proceeds with great economy … Gauss chose his notation so that relations between entire sets of equations stand out … But good notation is not enough. A more basic reason for Gauss’s success is that he had a clear notion of the operation of inversion … Gauss uses the phrase *eliminatio indefinita* for this operation. It refers to the process of obtaining the coefficients of the inverse of a linear system … The process is contrasted with *eliminatio definita*, which refers to the solution of a linear system by elimination. Thus, although Gauss did not possess the concept of a matrix, he had at hand one of the most important matrix operations – inversion – and he used it to good effect.

“Throughout the *Theoria combinationis* Gauss shows his ease with the ideas of linear algebra. For example, he proves (*Supplementum*, Art. 21) that the inverse of a symmetric system of equations is symmetric. In Art. 4 of the *Supplementum*, he makes a statement that is equivalent to saying that the null space of the transpose of a matrix is the orthogonal complement of its column space. However, Gauss’s strongest point in this regard was his ability to devise elegant, efficient algorithms to implement the procedures of least squares … With the introduction in the *Supplementum* (Art. 13) of the inner product forms of the elimination algorithm, Gauss will have said almost everything there is to say on algorithms for solving dense, positive-definite systems …
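The “inner product forms of the elimination algorithm” for positive-definite systems correspond, in modern terms, to what is now called the Cholesky factorization, in which each entry of the triangular factor is computed from an inner product of previously computed rows. A schematic Python rendering (modern notation, not Gauss’s; the matrix is invented):

```python
import numpy as np

def cholesky_inner_product(A):
    """Inner-product form of elimination for a symmetric
    positive-definite matrix: compute the lower-triangular factor L
    with A = L L^T, each entry formed from an inner product of
    previously computed rows."""
    n = A.shape[0]
    L = np.zeros_like(A, dtype=float)
    for i in range(n):
        for j in range(i + 1):
            s = A[i, j] - L[i, :j] @ L[j, :j]  # inner product of prior rows
            L[i, j] = np.sqrt(s) if i == j else s / L[j, j]
    return L

A = np.array([[4.0, 2.0],
              [2.0, 5.0]])
L = cholesky_inner_product(A)
print(L)  # lower-triangular factor with L @ L.T == A
```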

“Gauss partly anticipated another theme of modern matrix computations: the updating of previously computed quantities when the problem is changed. Specifically, he considers two problems. The first is to determine new estimates and variances from old ones when a new equation is added. The second is to do the same thing when the weight of an equation is changed …
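The first updating problem can be sketched in modern terms by maintaining the normal equations and folding in the new row — the ancestor of recursive least squares. The data below are invented for illustration:

```python
import numpy as np

def add_observation(AtA, Atb, a, y, weight=1.0):
    """Update the normal equations A^T A x = A^T b when a new
    (optionally weighted) observation equation a·x = y is added,
    then re-solve — no need to rebuild the problem from scratch."""
    AtA = AtA + weight * np.outer(a, a)
    Atb = Atb + weight * y * a
    return AtA, Atb, np.linalg.solve(AtA, Atb)

# Start from three observations of a line y = c0 + c1*t ...
A = np.array([[1.0, 0.0], [1.0, 1.0], [1.0, 2.0]])
b = np.array([0.1, 1.9, 4.1])
AtA, Atb = A.T @ A, A.T @ b

# ... then fold in a fourth observation at t = 3.
AtA, Atb, x = add_observation(AtA, Atb, np.array([1.0, 3.0]), 5.9)
print(x)  # same estimates as solving the four-row problem in one batch
```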

“Finally, Gauss may have been the first to solve linear systems iteratively. In Arts. 18-20 of the *Supplementum*, he introduces what we today would call a block Gauss-Seidel iteration. The idea is to partition the equations (actually constraints) into groups and adjust the estimates cyclically to satisfy one group after another. Gauss does not prove convergence of his method, and he suggests that making a good choice of groups is something of an art …
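The unblocked form of the iteration is easy to sketch in modern terms: sweep through the equations cyclically, solving each for its own unknown with the other components held at their latest values. The small system below is invented and chosen to be diagonally dominant so the sweep converges:

```python
import numpy as np

def gauss_seidel(A, b, x0=None, iters=50):
    """Gauss-Seidel iteration: cycle through the equations, solving
    the i-th equation for x[i] using the most recently updated values
    of the other components."""
    n = len(b)
    x = np.zeros(n) if x0 is None else x0.astype(float)
    for _ in range(iters):
        for i in range(n):
            x[i] = (b[i] - A[i, :i] @ x[:i] - A[i, i+1:] @ x[i+1:]) / A[i, i]
    return x

# Diagonally dominant system, for which the iteration converges.
A = np.array([[4.0, 1.0],
              [1.0, 3.0]])
b = np.array([9.0, 7.0])
x = gauss_seidel(A, b)
print(x)
```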

“Gauss was probably the one who popularized the idea of reducing quadratic forms. In its various generalizations, it led to the discovery of many matrix decompositions before the widespread use of matrices themselves … Gauss’s method for solving positive-definite systems had lasting influence. Gaussian elimination in Gauss’s notation continued to appear in astronomical and geodetic textbooks well into the twentieth century … The Gauss-Seidel iterative method and its variants also had a long and fruitful career. Because of its low storage requirements and its repetitive nature it came into its own in the early days of the digital computer and has not yet been entirely superseded” (Stewart, pp. 225-231).

Hald, *A History of Mathematical Statistics from 1750 to 1930*, 1998 (Hald devotes the whole of Chapter 21 to the *Theoria Combinationis*). Stewart (tr.), *Theory of the combination of observations least subject to error: part one, part two, supplement, by Carl Friedrich Gauss*, 1995. Stigler, *The History of Statistics*, 1986.

Two works bound in one, 4to (240 x 207mm), pp. [ii], [1], 2-58; [1-3], 4-44 (browning and foxing in the 1823 paper, a few brown spots and marginal pencil annotations in the Supplement). Contemporary marbled wrappers, waxed in the German style of the period (spine slightly worn). Very good copies of these extremely rare offprints.

Item #5422

**Price: $8,500.00**