Co-relations and their measurement, chiefly from anthropometric data by Francis GALTON on SOPHIA RARE BOOKS

GALTON, Francis.

Co-relations and their measurement, chiefly from anthropometric data.

London: Royal Society, 1888. / Hardcover.

First edition, journal issue in original printed wrappers, of Galton’s invention of the statistical concept of correlation, among the most fundamental and ubiquitous ideas in statistics. Stephen Stigler, the leading historian of statistics, has stressed that correlation did not emerge suddenly or in isolation, but represented the culminating step of a roughly twenty-year research programme in which Galton, driven in part by problems of heredity and variation, gradually assembled the conceptual and graphical machinery that correlation would unify. In more recent historiography, Stigler has cautioned against the tendency to treat Galton’s statistical innovations as a direct outgrowth of eugenics: although Galton coined the term “eugenics” in 1883, from the 1870s he pursued a line of inheritance research that was methodologically distinct, culminating in Natural Inheritance (1889), a work that does not even contain the word “eugenics” and includes nothing that could reasonably be classed as eugenical inquiry.

The correlation coefficient measures the strength of the linear relationship between two observed phenomena. It ranges from –1 to +1: the closer the coefficient is to +1 or –1, the stronger the relationship, and a value near 0 indicates little or no linear relationship. The correlation is positive if increases in one variable tend to accompany increases in the other; it is negative if increases in one tend to accompany decreases in the other. As Stigler notes in his account of Galton’s exposition, Galton largely conceived correlation in positive terms, and negative correlations play no role in his discussion. Clauser places the 1888 paper within Galton’s unusually broad legacy in quantitative method: if any one individual can be singled out as a founder of behavioural and educational statistics, it is Galton, who supplied the term correlation (from “co-relation”), uncovered regression toward the mean, and helped fix the letter r as the conventional symbol for the correlation coefficient (Clauser, p. 440).
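For readers who want to see the measure computed, the following is a minimal sketch of the modern product-moment formula. This is the form later standardized by Pearson, not Galton's own graphical procedure, and the measurements are invented for illustration rather than taken from the 1888 paper.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Product-moment correlation of two equal-length numeric sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length, at least 2 items")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of cross-products and squares of deviations from the means.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined when a variable is constant")
    return sxy / sqrt(sxx * syy)

# Hypothetical measurements in inches (illustration only, not Galton's data).
stature = [64.0, 66.0, 67.0, 68.0, 70.0, 71.0]
cubit = [17.2, 17.9, 18.1, 18.4, 18.9, 19.4]

r = pearson_r(stature, cubit)
print(f"r = {r:.3f}")
```

A result close to +1 indicates that taller men in this invented sample also tend to have longer forearms; reversing the order of one list would flip the sign.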

By 1886, as Stigler observes, the major components of what we now take to constitute correlation were already in place—most notably Galton’s “rather full” development of regression. Galton summarized much of this work in Natural Inheritance (1889). Yet correlation itself—both as a named concept and as a general, transportable measure—was still missing: strikingly, the word does not appear in Natural Inheritance (Stigler, p. 75). It was only late in 1888, after Galton had parted with the final proofs of that book, that the unifying idea crystallized.

Galton later recounted the discovery in a paper published in 1890 (“Kinship and correlation,” North American Review, vol. 150, pp. 419–431). He describes how he was then pursuing two investigations that appeared unrelated. One was an anthropological question: if a single thigh bone is recovered from an ancient grave, what does its length tell us about the stature of the individual? The other was a forensic question of criminal identification: what could be said about the relationships among measurements of different parts of the same person, which plainly are not independent for purposes of identification? Galton recognized that these problems were identical in structure and attacked them using a data set of measurements on 348 adult males. While plotting the data, he realized that these new problems were not only identical in principle with the earlier problem of kinship he had already solved, but that all three were special cases of a much more general problem—correlation.

There is, as Stigler remarks, an almost breathless urgency in Galton’s narrative. Galton feared that once Natural Inheritance appeared, the general idea would seem obvious to others and that he would be reproached for having overlooked it. He therefore rushed a paper to the Royal Society under the title “Correlation”; it was read before the book was published and appeared in print a few days ahead of it. The published title, however, is the present one, Co-relations and their measurement, chiefly from anthropometric data, and Galton soon adopted the more familiar spelling “correlation,” already common at the time and used in his subsequent writings. Later historical analysis has emphasized that this 1888 breakthrough arose not from eugenical study but from a practical problem in forensic anthropology: inferring the size of a missing bone from a partial skeleton.

To Galton, correlation meant what might now be called an intraclass correlation: two variables are correlated because they share a common set of influences. He illustrates this through its effect on dispersion, noting that the median difference in height between two random Englishmen is about 2.4 inches, whereas between two brothers it is about 1.4 inches. Galton offers three examples to make the concept concrete. One concerns kinship, which can appear unsatisfactory to modern readers accustomed to Mendelian genetics. The other two are strikingly lucid: the travel times of two clerks who share part of their journey on the same bus, and the stock portfolios of two investors who hold shares in some of the same ventures.
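The shared-influence idea behind the two clerks can be sketched numerically. The model below is an assumption made for illustration: each clerk's travel time is a common bus segment plus an independent walk, and the minute values are invented. Under that model the correlation equals the share of total variance contributed by the common segment.

```python
from itertools import product
from statistics import fmean, pvariance

# Hypothetical minutes, equally likely and independent (illustration only).
shared_bus = [10, 20, 30]   # segment both clerks ride together
walk_a = [0, 5, 10]         # clerk A's separate walk
walk_b = [0, 5, 10]         # clerk B's separate walk

# Enumerate every combination so the result is exact rather than simulated.
trips = [(s + a, s + b) for s, a, b in product(shared_bus, walk_a, walk_b)]
time_a = [t[0] for t in trips]
time_b = [t[1] for t in trips]

mean_a, mean_b = fmean(time_a), fmean(time_b)
cov = fmean([(x - mean_a) * (y - mean_b) for x, y in trips])
r = cov / (pvariance(time_a) * pvariance(time_b)) ** 0.5

# Share of one clerk's variance that comes from the common segment.
share = pvariance(shared_bus) / (pvariance(shared_bus) + pvariance(walk_a))

print(f"correlation = {r:.3f}, shared share of variance = {share:.3f}")
```

Lengthening the separate walks relative to the shared ride lowers both numbers together, which is the sense in which two brothers, sharing more influences than two random Englishmen, differ less in height.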

Galton uses these examples to emphasize key invariances. Correlation does not depend on the choice of origin. On scaling, he initially appears to hesitate, remarking that there may be a relation between stature and finger length but “no real correlation” because of differing scales. He quickly clarifies the point: by measuring quantities in units of their “probable error” (a median deviation for symmetric distributions), the relation becomes properly comparable, and correlation can be treated rigorously. He further insists that correlation applies only where variables have at least a “quasi-normal” distribution. Although enamoured of the “singularly beautiful” normal law, Galton nevertheless insists that distributional assumptions be checked—a caution too rarely imitated by later practitioners.

The paper includes what amounts to a definition of the correlation coefficient, though Galton calls it an “index of correlation” (the term “coefficient” being applied only later, by Edgeworth). Galton explains the definition by working through a numerical example and threading it through the already subtle notion of regression toward the mean. When variables are measured on the same scale, the ratio of regression measures correlation. When the scales differ, the situation is more complex. Considering the relationship between the length of the left middle finger and height, Galton shows that those whose finger length deviates from the average by one inch have heights that deviate by about 8.19 inches, whereas those whose heights deviate by one inch have finger lengths that deviate by only about 0.06 inches. The two regression lines are thus distinct relationships.
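The relation between the two regression figures and the index of correlation can be checked with a short calculation. It relies on a standard identity, not spelled out in the passage above, that the product of the two regression slopes equals the square of the correlation; the only inputs are the two figures quoted in the text, so the result inherits their rounding.

```python
from math import sqrt

# Regression figures as quoted in the description above.
slope_height_on_finger = 8.19   # inches of stature per inch of finger length
slope_finger_on_height = 0.06   # inches of finger length per inch of stature

# Product of the two slopes is r squared (both slopes are positive here).
r = sqrt(slope_height_on_finger * slope_finger_on_height)

# Ratio of the slopes is the squared ratio of the two variables' spreads.
spread_ratio = sqrt(slope_height_on_finger / slope_finger_on_height)

# Re-expressed in units of each variable's own spread, both slopes reduce to r.
standardized_1 = slope_height_on_finger / spread_ratio
standardized_2 = slope_finger_on_height * spread_ratio

print(f"implied r = {r:.2f}")
print(f"implied spread ratio (stature : finger) = {spread_ratio:.1f}")
print(f"standardized slopes = {standardized_1:.2f}, {standardized_2:.2f}")
```

This is the point of measuring each quantity in units of its own probable error: once the difference of scale is removed, the two apparently different regressions collapse to a single number.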

Returning to the anthropological question, Galton shows how correlation exposes the error of the proportional rescaling then commonly used. If a thigh bone is 5% longer than average, one should not infer that the individual was 5% taller than average. Such proportional inference ignores regression and therefore overestimates how far the stature departs from the average, increasingly so as correlation weakens.
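A small sketch makes the size of the error concrete. All figures are hypothetical, and the two measurements are assumed to have the same relative spread (4% of the mean) so that the proportional rule is exactly right only when the correlation is perfect; the function names are illustrative, not drawn from Galton.

```python
def regression_estimate(x, mean_x, sd_x, mean_y, sd_y, r):
    """Predict y from x by regression: shrink the standardized deviation by r."""
    return mean_y + r * sd_y * (x - mean_x) / sd_x

def proportional_estimate(x, mean_x, mean_y):
    """Naive rule: y departs from its mean by the same percentage as x."""
    return mean_y * x / mean_x

# Hypothetical population values in inches (illustration only).
MEAN_FEMUR, SD_FEMUR = 18.0, 0.72
MEAN_STATURE, SD_STATURE = 67.0, 2.68

femur = MEAN_FEMUR * 1.05  # a thigh bone 5% longer than average
naive = proportional_estimate(femur, MEAN_FEMUR, MEAN_STATURE)

excess = {}
for r in (1.0, 0.8, 0.5):
    est = regression_estimate(femur, MEAN_FEMUR, SD_FEMUR,
                              MEAN_STATURE, SD_STATURE, r)
    excess[r] = naive - est  # how much the proportional rule overshoots
    print(f"r={r}: regression {est:.2f} in, proportional {naive:.2f} in")
```

With these invented figures the proportional rule overshoots by roughly 0.7 inches at r = 0.8 and by roughly 1.7 inches at r = 0.5, and agrees with the regression estimate only at r = 1.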

The paper concludes by suggesting broad applicability, especially to social questions such as the relationship between poverty and crime, and by inviting further investigation: there exists, Galton writes, a vast field of topics lying open under the laws of correlation for any competent investigator.

Galton (1822–1911) was Charles Darwin’s half-cousin (they shared the same grandfather, Erasmus Darwin). His early adult life was devoted to intensive study and travel, including the exploration and charting of Damaraland (part of modern Namibia) from 1850 to 1852. After his marriage to Louisa Butler his travels ceased, though his scientific activity intensified. Inspired by Darwin’s On the Origin of Species (1859), Galton turned to the study of inheritance, publishing Hereditary Genius (1869), which met with mixed reviews and persistent criticism. He continued to collect family data across humans and animals, pioneered the measurement of individual differences in physical and mental traits, and introduced innovative questionnaires on topics ranging from twins to mental imagery. His Anthropometric Laboratory, founded in 1884, was a precursor of the Department of Applied Statistics at University College London, later established by his colleague Karl Pearson.

Galton died convinced that a scientifically enlightened society should favour the reproduction of the most physically and psychologically able. He left the residue of his estate to University College London to establish the Galton Professorship of Eugenics and a laboratory devoted to the study of national eugenics. Later historians, however, have emphasized that whatever the moral evaluation of Galton’s social views, his statistical innovations of the 1880s—regression and correlation chief among them—were not inspired by eugenics and must be understood within the history of quantitative method rather than political doctrine.

Clauser, “The life and labors of Francis Galton,” Journal of Educational and Behavioral Statistics, vol. 32 (2007), pp. 440–444.
Stigler, “Francis Galton’s Account of the Invention of Correlation,” Statistical Science, vol. 4 (1989), pp. 73–79.

8vo (216 x 137 mm), pp. 135-145 in: Proceedings of the Royal Society of London, vol. 45 (1888), no. 274. The entire issue offered here in its original printed wrappers, extremities with a little chipping, front wrapper detached.

Item #4698

Price: $4,500.00
