Peer-reviewed article

Sir David Cox: 1924–2022

2022; Royal Statistical Society; Volume: 185; Issue: 4; Language: English

10.1111/rssa.12964

ISSN

1467-985X

Authors

A. C. Davison, Valerie Isham, Nancy Reid

Topic(s)

Statistical Methods and Bayesian Inference

Abstract

[Figure: David speaking at the RSS Conference in 2016]

Sir David Cox FRS died on 18 January 2022 at the age of 97. He was the greatest statistician of his time, universally renowned for the breadth of his interests and the depth of his knowledge. He made fundamental contributions to almost all areas of applied probability and statistics. His work in statistics includes theoretical statistics and foundations of inference, experimental design, statistical methods, time series and involvement in a wide variety of applied domains, while his work in stochastic processes includes general theory, queues and point processes. He wrote some 20 books and well over 350 research papers, more than a third of which were written after he formally 'retired' in 1994, and he was still working up until his death. He was knighted by Queen Elizabeth in 1985. He received many academic prizes and honours including election as a Fellow of the Royal Society, as an Honorary Fellow of the British Academy and as a Foreign Member of the American and the Danish Academies of Sciences, as well as honorary membership of many other international academies and societies. He had honorary doctorates from 22 universities in more than a dozen countries, and had been president of the Bernoulli Society, the Royal Statistical Society (RSS) and the International Statistical Institute. He was awarded the Guy Medal in Silver by the RSS in 1961, and in Gold in 1973 'in recognition of his services to both the theory and practice of statistics over a wide range of topics, his services to the advancement of the subject under the aegis of the Royal Statistical Society and his standing as a statistician of international repute'. 
In 1990 he won the Kettering Prize and Gold Medal for Cancer Research 'for the development of the Proportional Hazard Regression Model' (the Cox regression model), and in 2010 he was awarded the Copley Medal of the Royal Society, which recognises outstanding research achievements across science, 'for his seminal contributions to the theory and applications of statistics' (earlier recipients include Charles Darwin, Paul Dirac, Ronald Fisher and Harold Jeffreys). Further major awards citing the Cox regression model were the BBVA Foundation Frontiers of Knowledge Award in basic sciences (2016), for the development of 'pioneering and hugely influential statistical methods that have proved indispensable for obtaining reliable results in a vast spectrum of disciplines from medicine to astrophysics, genomics or particle physics' and the first International Prize for Statistics (2017), the citation noting that 'his 1972 paper is one of the three most cited papers in statistics and is ranked 16th in Nature's list of the top 100 most cited papers of all time for all fields'. A numbered list of his publications is available online at Nuffield College Oxford, and references of the type [x] are to that list. David Roxbee Cox was born in Birmingham on 15 July 1924, and was educated at Handsworth Grammar School, before going on to St John's College Cambridge in 1942, where he read mathematics. After 2 years' exemption from military service, he went to the Royal Aircraft Establishment (RAE) at Farnborough in the Department of Structural and Mechanical Engineering, working mainly on the strengths of materials. Synchronicity played a big part in his next, life-changing, move at the end of the war. In 1946, immediately after reading an inspirational paper by Henry Daniels closely related to his RAE work, he happened to see an advertisement for a job in textile research working with Daniels at the Wool Industries Research Association (WIRA) in Leeds. 
Impulsively he chose not to return to Cambridge, and instead went to Leeds where he stayed until 1950, obtaining his PhD from the University of Leeds in 1949. When David was sent to the RAE as a statistician, despite having done almost no statistics in Cambridge, the idea was that any reasonably able mathematician could easily pick it up, an assumption that David described as 'totally false' (Reid, 1994), but that was amply justified in his case. His first book ([5]), with Alan Brearley, was published by WIRA in 1948 and ran to five editions by 1960. David's work at WIRA involved experimental design and data analysis as well as theoretical problems in applied mathematics and stochastic processes. It set him on a path he would follow for the rest of his life, addressing important practical problems with innovative models and methods that make elegant use of a broad range of mathematical techniques, as well as developing the underlying statistical theory and methodology. An early example is [6], which formed a chapter of his PhD thesis. This paper concerns properties of a model for periodic variations in thickness of a web of wool fibres that has passed through two sets of rollers moving at different speeds. It is a mixture of applied mathematics and time series analysis; the follow-up paper promised in the discussion never appeared. During his time at WIRA, David published extensively in wool-industry related journals and also in Biometrika, which he would later edit from 1966 to 1991, in the Proceedings of the Royal Society, of which he would be elected Fellow in 1973, and in the Journal of the RSS, of which he would become President from 1980 to 1982. The latter paper ([7]) includes the acknowledgement 'Mrs J. Cox (Miss J. Drummond) gave me a great deal of valuable help in the work'. David married Joyce Drummond in 1947, and they went on to have a daughter, Joan, and three sons, John, Andrew and Steven. 
For more than 70 years, Joyce loyally and steadfastly supported David through his long and distinguished career. This cannot always have been easy—her patience must sometimes have been sorely tried by the never-ending queues of people waiting for 'a quick word' with David at statistical gatherings—but her demeanour remained unfailingly serene. In 1950, David left WIRA to begin his academic career, taking up an appointment as an Assistant Lecturer with the University of Cambridge. In Reid (1994) he describes both the frustrations and difficulties of living on an assistant lecturer's salary without any security of tenure and the scientific excellence of colleagues and students. In 1955 he and Joyce moved to the United States, where David took up visiting positions at Princeton, the University of North Carolina, and the University of California at Berkeley. David particularly enjoyed the energy and enthusiasm of academic life in the United States, as well as the standard of living, which contrasted sharply with that of post-war Britain. However, in 1956 the family moved to London, where David would spend the next 32 years, first as Reader and then Professor of Statistics at Birkbeck College and then (from 1966) as Professor of Statistics at Imperial College. Between these two appointments, he returned to work in the United States for a short while at Bell Laboratories. David served as Head of the Mathematics Department at Imperial College from 1969 to 1973 and then in 1988, at close to conventional retirement age in UK universities, he took up the position of Warden of Nuffield College, Oxford, becoming an Honorary Fellow of the College upon his retirement in 1994. At this time his office became very much smaller and more cramped, with the space under the couch (and every other level surface) an invaluable place for filing papers, but the work continued unabated. 
If anything, he travelled more ('of course you know I never go anywhere, but …' was the invariable response to a question about imminent plans), wrote more (six books and over 150 papers), and responded even more to requests for advice and help. For a while 'the badgers' seemed to take over his life—he worked tirelessly for almost 10 years as a member of the Department of the Environment, Food and Rural Affairs (DEFRA)'s Independent Scientific Group on bovine tuberculosis; parts of the final report (Bourne et al., 2007) are obviously written by David. Characteristic features of David's research are innovation, elegance and deceptive simplicity. He had formidable powers of intuition and brought an exceptionally broad knowledge of applied mathematics, probability and statistics to bear on a huge variety of applications. This, his deep understanding and his capacious memory spurred his creative imagination: as one discussant of his 1955 read paper ([36]) said, 'on several occasions, on the first reading of the paper, I thought I had found a possible alternative not mentioned, but on turning to the next page there it was, with three others I had not thought of'. His papers and books are written in a concise and almost instantly recognisable style, where every word counts. David's extraordinary energy and industry, and his passion for statistics, will dominate our assessment of his contributions below. But David was not 'all work, all the time'—he was very widely read, particularly on science and scientific biographies, and on music. He loved attending operas (especially Mozart and Wagner), concerts and theatre in the United Kingdom and on his travels around the world. Vacations were sometimes 'tacked on' to conference visits but Yorkshire always held a very special place in his affections for family holidays. 
We have found it impossible to do anything approaching justice to his research contributions in this brief article, but have tried to give some flavour of them in the following sections. Following his move to Cambridge in 1950, much of David's early research was in the field of stochastic processes, an interest prompted by Henry Daniels and encouraged by attending lectures from Maurice Bartlett in Manchester while at WIRA, as well as by his applied experience. Looking back, he said 'In those days, while it was a difficult subject to work in, it wasn't highly technical' (Reid, 1994). Stochastic processes were to become a lifelong interest and the subject of five of his books. The seminal paper [36] focusses on statistical methods for events occurring randomly in time or space; the applications of such theory are extremely wide-ranging. The paper discusses in detail the stochastic point processes giving rise to these events and how their properties can be used for inference. The Poisson process, which plays a comparable role to that of the normal distribution in statistical theory (e.g., as the limiting process under superposition), is the natural starting point. In section 4 an extension to a random rate function is proposed and various special cases discussed. This doubly stochastic Poisson process would come to be known universally (except by David) as the Cox process. Arguably, this intuitively attractive, tractable and widely applicable process plays as important a role for stochastic processes as does the much-feted Cox proportional hazards model in survival analysis. Versions of such processes in which the logarithm of the rate function is a Gaussian process, so-called log-Gaussian Cox processes, are now routinely fitted using computational methods that were unthinkable when the paper was written. 
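As an illustration only, the doubly stochastic construction described above can be sketched in a few lines of Python. This is not taken from [36]: the time discretisation, the exponential covariance for the log-rate, and every name and parameter (`simulate_cox_process`, `corr_length`, and so on) are assumptions made for this sketch. A random rate is drawn first, and events are then generated as a Poisson process conditional on that rate, which gives a simple discretised log-Gaussian Cox process.

```python
import numpy as np


def simulate_cox_process(t_max, n_bins, mean_log_rate, sd_log_rate, corr_length, rng):
    """Simulate event times on [0, t_max] from a discretised log-Gaussian Cox process.

    The rate is treated as piecewise constant on n_bins equal bins (an
    approximation chosen for this sketch, not part of the original paper).
    """
    edges = np.linspace(0.0, t_max, n_bins + 1)
    mids = 0.5 * (edges[:-1] + edges[1:])
    width = t_max / n_bins

    # Stage 1: draw the random log-rate from a Gaussian process.
    # The exponential covariance is an assumed, illustrative choice.
    lags = np.abs(mids[:, None] - mids[None, :])
    cov = sd_log_rate ** 2 * np.exp(-lags / corr_length)
    log_rate = rng.multivariate_normal(np.full(n_bins, mean_log_rate), cov)
    rate = np.exp(log_rate)

    # Stage 2: conditional on the rate, bin counts are independent Poisson
    # variables and events are uniformly placed within each bin.
    counts = rng.poisson(rate * width)
    times = np.concatenate(
        [rng.uniform(edges[i], edges[i + 1], size=c) for i, c in enumerate(counts)]
    )
    return np.sort(times), mids, rate


if __name__ == "__main__":
    rng = np.random.default_rng(seed=1)
    times, mids, rate = simulate_cox_process(
        t_max=10.0, n_bins=50, mean_log_rate=0.0,
        sd_log_rate=0.5, corr_length=2.0, rng=rng,
    )
    print(len(times), "events; mean simulated rate", rate.mean())
```

Setting `sd_log_rate` to zero makes the rate constant, so the sketch reduces to an ordinary homogeneous Poisson process; the extra variability in event counts when `sd_log_rate` is positive is what distinguishes the doubly stochastic case.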
Other papers in the 1950s on stochastic processes included three papers by Cox and Smith ([26], [27], [33]) on renewal theory and on the superposition of point processes. Walter Smith was a Cambridge research student, supervised jointly by David Cox and Henry Daniels, who obtained his PhD in 1953 for his thesis on series of events. A few years later, a monograph by Cox and Smith on queueing theory appeared ([63]), and a pattern of forming a fruitful collaboration with an ex-research student, culminating in the publication of a research monograph, was established. This was to be repeated many times during David's career. Just a year later, David published a second monograph on renewal theory ([68]), and followed this not long afterwards with a third on the statistical analysis of series of events ([83], with Peter Lewis, an IBM computer scientist who had been on a doctoral fellowship supervised by David at Birkbeck College during 1962–1964). Like its 1955 precursor ([36]), this covers point process models, their properties, and statistical methodology. During the first half of the 1960s, David also wrote a ground-breaking textbook on stochastic processes ([76]) with Hilton Miller, a colleague of David's at Birkbeck and later at Imperial College. The text, intended for mathematics and statistics students and for research workers, introduces the main mathematical techniques useful for analysing stochastic processes that arise in a broad range of scientific applications. Thus, in the 5-year period between 1961 and 1966 and in addition to 23 (mostly single author) papers on a range of other topics, David published three monographs encapsulating the results of previous research in a highly succinct and accessible form, together with a substantial textbook on stochastic processes. A further monograph on point processes appeared later ([136]). These books were intended to be read and studied as well as used for reference, and they came complete with exercises. 
All have become literature standards, used by generations of postgraduate students learning the analytical tools needed to gain essential insights and understanding of the properties of stochastic processes. They remain as valuable today as when they were written more than half a century ago. Although the bulk of David's publications in the late 1960s concerned statistical theory and methodology, his interest in stochastic processes, and especially point processes, continued unabated, and a seminar series on the topic at Imperial College led to an influential conference in 1971 at the IBM Research Centre. The resulting conference volume (Lewis, 1972), which included [108], defined the state of the art in point processes. Other publications during this time on stochastic processes include [110] on multivariate point processes, an RSS read paper on multi-server queues with appointments ([103]) with David Hinkley, and one on low traffic approximations for queues ([105]) with Peter Bloomfield; both Hinkley and Bloomfield were research students supervised by David. The point-process theme continued later in the 1970s, in collaboration with Valerie Isham, another ex-research student, first with a paper on a longstanding problem of covariance counting in physics ([125]), another on a process of controlled variability used by seismologists ([131]), and their research monograph mentioned above ([136]). In 1988 David chaired the Department of Health Working Group on HIV infection and AIDS, which produced a report on short-term predictions for England and Wales ([181]), and a special edition of Philosophical Transactions of the Royal Society on the supporting methodology ([190], [191s], [191]). The incubation period, that is, the interval between infection with HIV and diagnosis of AIDS, is a key quantity in predicting the course of the epidemic, and further papers by David on its estimation appeared in a wide range of journals ([177], [186], [199]). 
A long and fruitful period of collaboration and friendship in a very different applied area followed a serendipitous meeting with Ignacio Rodríguez-Iturbe, a distinguished Venezuelan hydrologist who was awarded the Stockholm Water Prize in 2002. The first of a series of papers ([173]) considered a highly idealised model for the spatial distribution of total storm rainfall, in which rain cell origins are located according to a spatial point process, the cells have a random depth and spread function, and may overlap. This point-process model was extended to a temporal process at a single site ([178], [187]) where the cell origins cluster in time, and then to fully spatial-temporal cluster processes ([183], [228], [248]). An important feature of these models is that the cells have random but finite temporal durations and spatial extents, enabling realistic representation of dry periods and/or regions. These models have helped to solve a range of catchment-based hydrological problems linking stochastic models for rainfall fields to distributed models for soil run-off and stream flow, have been successfully applied in many different climatological regimes, and have had a substantial impact on hydrological practice and policy. Related work with hydrological application concerned soil moisture, starting from a purely temporal process in which random jump transitions (due, e.g., to rainfall) interrupt periods of deterministic decay ([170]), and further developed as a fully spatial-temporal process ([261], [307], [311]), allowing for the effects of different climate and vegetation regimes and incorporating biomass dynamics. 
In hydrology, as in much of his applied work, David's characteristic approach was analytical rather than numerical: first to abstract the essence of the problem, using a simple mathematical formulation of the underlying process, followed by clever and original use of a wide range of techniques from applied mathematics to determine properties algebraically that could then be used to solve important practical problems. While most of the stochastic process applications with which David was concerned focussed on modelling the timings of series of events, rather than the events per se, he also made contributions to the more classical analysis of time series. In particular [141] introduced the now-standard distinction between parameter- and observation-driven time series and provides a substantial discussion of data interpretation, with attention on the topic of long-range dependence, an interest of David's since the WIRA days. His publications on the topic include [162], [195], [200] and the delightful and insightful presentation on dependence in 'big data' at his 90th birthday conference [361]. Planning of Experiments ([45]), still in print, has been a remarkably influential book. As described in Reid (1994), David's first thought was to write up his rather theoretical notes from a course given in Cambridge, but he decided instead to address the book to scientists, focussing on the key ideas, and keeping the mathematics to a bare minimum. The Cambridge notes later formed the basis for the more theoretical treatment in [265]. David had varied practical experience in design during his time at WIRA, and that early training may well have helped him develop his remarkable intuition for all aspects of data collection. 
This infused his writing throughout his career: for example, his ingenious approach to identifying confidence sets of models in high-dimensional regression is based on balanced incomplete block designs ([371], [374]) and was foreshadowed by brief comments in [90]. Settings with 'p > n' were not unfamiliar to him: in 1962 he described computer-generated super-saturated factorial designs ([66]). His presentation to the RSS on the occasion of its 150th anniversary was entitled 'Present Position and Potential Developments: Some Personal Views. Design of Experiments and Regression' ([163]), and closes with 'outline statements of one or two open problems'; there are 22 problems listed. Although he was somewhat sceptical about the practical utility of optimal designs, [111] considers them in the context of both interpolation and extrapolation, with an emphasis on 'satisfying additional requirements such as leading to good estimates of parameters'. Design for model selection is discussed in [112]. David was sought after for many expert panels and working groups in different contexts; when once asked what it took to be knighted for services to statistics, he replied 'serve on an infinite number of committees in zero time'. For example, the pair of papers by Peto et al. ([120], [126]) gave a definitive tutorial on the design and analysis of clinical trials, and [288] summarises work on the risks associated with mobile phone use. His work for DEFRA mentioned above involved a design over heterogeneous spatial areas to monitor the effect of badger culling on the spread of bovine tuberculosis, a study that led to unanticipated results ([294], [310], [323], [324]), and was used to illustrate the importance of planning for the unexpected in Principles of Applied Statistics ([346]). 
In addition to his work on what might be called classical design of experiments, David made very many contributions to more general aspects of the design of investigations, including sampling in various contexts and the design of observational studies. Special mention is due the masterful book with Ruth Keogh on the design and analysis of case-control studies ([358]). David is best known for the 1972 RSS discussion paper that introduced the 'Cox model' (though he himself never used this term), but his other contributions to statistical methods were so extensive that it would be quicker to list the areas that he left untouched. Although the 1972 paper is widely and correctly seen as a startling breakthrough, in retrospect it can be seen to draw on several earlier themes of his work: point process models, likelihood inference and regression. In particular there is a natural connection between the complete intensity function of a point process (i.e., the conditional rate of events given its past history) and the proportional hazards model in survival analysis. The crucial insight, of basing a 'partial likelihood' on comparisons within risk sets at each observed failure, came to David while feverish. Its correctness and practical value were immediately clear and others rapidly made software available, but the theoretical details were not clarified until 1975 in another paper ([118]) that has also been much-cited both for itself and as a spur to the development of specialised likelihood functions. David modestly described the 1972 paper, the second most-cited in the statistical literature, as being cited a 'fairly large' number of times and added the typically modest disclaimer 'although no doubt read rather less often'. 
It led to his being awarded the 1990 Kettering Prize and Gold Medal for Cancer Research, and stimulated major streams of research over the following decades, including the now-standard use of point process and martingale methods in the theoretical study of methods for time-to-event data and semi-parametric likelihood inference. Although the proportional hazards model is David's best-known scientific contribution, he generally preferred parametric models to semi-parametric or non-parametric ones, on the grounds that a judicious, parsimonious, formulation of a statistical model would often lead to more insight. Indeed, his 1984 monograph on survival data with David Oakes ([165]) gives a broad non-technical discussion in which a good deal of space is devoted to parametric models. David's 1958 RSS discussion paper 'The regression analysis of binary sequences' ([48]) was the first to fully exploit the exponential family properties of logistic regression for dichotomous response data, and is a tour de force, using elements of sampling theory to simplify the then-onerous computations involved in testing and model-fitting, and describing the use of conditioning to eliminate nuisance parameters. Like many of his other papers, it contains much more than the idea for which it has become best-known, and includes, for example, models for binary time series, as arise in psychological experiments on animal learning. Subsequent papers ([49], [107]) consider binary matched pairs and multivariate binary data; the loss of information due to discarding consonant matched pairs was later investigated in [379]. 
The logistic regression model is the central focus of his 1970 monograph The Analysis of Binary Data ([98]), in which he advocated weighted least squares estimation with the empirical logistic transform to palliate computational difficulties; David later commented that this book should have been written earlier, but in view of his other activities it is hard to imagine how this could have been feasible. Maximum likelihood fitting was used more widely in a 1989 second edition written jointly with Joyce Snell that remains a standard text ([193]). Generalised linear modelling (Nelder & Wedderburn, 1972) had mushroomed between the two editions and the associated iterative weighted least squares algorithm facilitated the fitting of logistic and related models. An overview paper ([90]) read to the RSS in March 1968 points out that binomial, gamma and Poisson regression models are all linear exponential families, so the inferential simplicity that David described in [98] for logistic regression can be directly generalised; this may have inspired John Nelder, who was present. Just 1 week earlier David, jointly with Joyce Snell, had read 'A general definition of residuals' ([93]) which gave an implicit definition of residuals for general regression models and investigated corrections taking account of parameter estimation; this may be the only occasion on which an author has read papers to the RSS on two successive Wednesdays! Despite David's strong advocacy of the logistic model for its range of applicability, the direct interpretation of linear regression for binary responses gave it enduring appeal for him, and this too was later investigated ([376]). Another contribution to regression analysis that has had a long afterlife is the 1964 RSS discussion paper 'An analysis of transformations', jointly written with George Box ([64]), whose Bayesian perspective is interleaved with the likelihood approach generally preferred by David. 
After surveying the assumptions underlying the linear model and techniques for checking them, the paper considers the use of parametric transformations including the now-ubiquitous Box–Cox function, which encompasses power-law and logarithmic response transformations. The calculations are readily reformulated in terms of linear model fits, and this led to the rapid uptake of the idea. The Bayesian version requires a data-dependent prior, which David did not regard as an intrinsic flaw, but it seems to be much less used. Directness of interpretation of the resulting analysis, and in particular additivity of effects, is kept in mind throughout, and the paper comments that the method 'is not, of course, to be followed blindly'. Subsequent asymptotic analysis by Bickel and Doksum (1981) pointed out that the variances of parameter estimates can become very large if the transformation is treated as unknown, a finding seen by Box and Cox as 'qualitatively obvious and at the same time scientifically irrelevant' ([145]). The issue reared its head again in discussion of parameter orthogonality ([176]). David's early industrial experience gave him a deep appreciation of the influence of different sources of variability both in experiments and more widely, and the subtleties involved in the treatment of variance components infuse his writing. His book with Patty Solomon ([283]) provides a deep but mainly non-technical account of this topic, illustrated with applications from textile processing to microarray analysis. David's move to Nuffield College as Warden in 1988 introduced him to many aspects of quantitative social science and thus to problems that he often remarked were very intellectually stimulating. 
Around this time he began a long, wide-ranging and remarkably original collaboration with Nanny Wermuth on multivariate data from observational studies in areas such as psychology, education and medicine, best approached through their 1996 book Multivariate Dependencies ([235]) and their review paper ([224]). In addition to the development of models and methods for analysis of complex observational data involving both intermediate and endpoint responses, their joint work included discussions of causality (e.g., [273], [274], [298]) and several key results in theoretical statistics, such as quadratic exponential models for binary data ([229]), likelihood factorizations with discrete and continuous variables ([260]), aspects of marginalization ([292]), and constructions of special graphs ([266]). Much of David's theoretical and methodological work arose out of his interest in applications; where the gestation of an idea is directly related to an applied problem, the latter is sketched in the resulting article and an acknowledgement given, as for example to an 'anonymous comment at a meeting of the General Applications Section, Royal Statistical Society' ([152]). As mentioned above, he was involved in substantial applications throughout his career and he published in a wide range of non-statistical journals. Lessons drawn from his applied experience were summarized in the opening article in the inaugural issue of Annals of Applied Statistics ([319]), and at greater length in two books. Applied Statistics ([144]), published in 1981 jointly with Joyce Snell and stemming from teaching for the MSc Statistics at Imperial College, consists of a concise and highly illuminating discussion of general principles followed by a series of carefully chosen case studies. The second, Principles of Applied Statistics ([346]) with Christl Donnelly, appeared in 2011. 
This book covers topics rarely included in similar texts, includes a huge range of outline examples, and well repays careful reading and re-reading. All his general writing on applied statistics stresses both the 'desirability of an intimate union between subject-matter and statistical aspects of an investigation' ([319]) and that arriving at secure conclusions depends more on scientific good sense than on technical mastery of complex methods. David's 1958 paper ([46]) was his first to concentrate on the foundations and theory of inference. Exceptionally clearly written, and bursting with new ideas, it continues to be regularly cited. In the abstract he writes modestly 'It consists of some general comments, few of them new, abou

Reference(s)