Yesterday’s slight detour [i] presents an opportunity to (re)read Lindley’s “Philosophy of Statistics” (2000) (see also an earlier post). I recommend the full article and discussion. There is actually much here on which we agree.
The Philosophy of Statistics
Dennis V. Lindley
The Statistician (2000) 49:293-319
Summary. This paper puts forward an overall view of statistics. It is argued that statistics is the study of uncertainty. The many demonstrations that uncertainties can only combine according to the rules of the probability calculus are summarized. The conclusion is that statistical inference is firmly based on probability alone. Progress is therefore dependent on the construction of a probability model; methods for doing this are considered. It is argued that the probabilities are personal. The roles of likelihood and exchangeability are explained. Inference is only of value if it can be used, so the extension to decision analysis, incorporating utility, is related to risk and to the use of statistics in science and law. The paper has been written in the hope that it will be intelligible to all who are interested in statistics.
Around eight pages in we get another useful summary:
Let us summarize the position reached.
(a) Statistics is the study of uncertainty.
(b) Uncertainty should be measured by probability.
(c) Data uncertainty is so measured, conditional on the parameters.
(d) Parameter uncertainty is similarly measured by probability.
(e) Inference is performed within the probability calculus, mainly by equations (1) and (2) (301).
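Equations (1) and (2) are not reproduced in this post. As a rough sketch of what (c) through (e) amount to, in notation that is assumed here rather than quoted from Lindley ($x$ for the data, $\theta$ for the parameter of interest, $\alpha$ for a nuisance parameter), the two standard moves within the probability calculus are Bayes's theorem and marginalization:

```latex
p(\theta \mid x) \;=\; \frac{p(x \mid \theta)\, p(\theta)}{\int p(x \mid \theta')\, p(\theta')\, d\theta'} ,
\qquad
p(\theta \mid x) \;=\; \int p(\theta, \alpha \mid x)\, d\alpha .
```

Here $p(x \mid \theta)$ is the data uncertainty of (c), $p(\theta)$ is the parameter uncertainty of (d), and the posterior $p(\theta \mid x)$ is the inference of (e).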
Then on 309:
“The position has been reached that the practical uncertainties should be described by probabilities, incorporated into your model and then manipulated according to the rules of the probability calculus. We now consider the implications that the manipulation within that calculus have on statistical methods, especially in contrast with frequentist procedures, thereby extending the discussion of significance tests and confidence intervals in section 6. It is sometimes said, by those who use Bayes estimates or tests, that all the Bayesian approach does is to add a prior to the frequentist paradigm. A prior is introduced merely as a device for constructing a procedure, that is then investigated within the frequentist framework, ignoring the ladder of the prior by which the procedure was discovered. This is untrue: the adoption of the full Bayesian paradigm entails a drastic change in the way that you think about statistical methods.” (309)
I agree (that the difference can be drastic). While frequentists (or sampling theorists or error statisticians) also assign probabilities to events, the role and interpretation of these probabilities for statistical inference differ notably from what Lindley advocates. Probability arises (i) to control, assess, and reduce the error rates of procedures (in behavioristic contexts); and (ii) to quantify how reliably probed, severely tested, or well-corroborated claims are (in scientific contexts). Details are explained elsewhere (the blog can be searched).
Yet much contemporary discussion of statistical foundations tends to minimize or discount the contrast, and would likely scoff at Lindley’s idea that a “drastic change” is associated with adopting any of the Bayesian (or frequentist) methodologies now on offer. This is a theme that has continually recurred in this blog.
For example, a common assertion is that in scientific practice, by and large, the frequentist sampling theorist (error statistician) ends up in essentially the “same place” as Bayesians, as if to downplay the importance of disagreements within the Bayesian family, as well as between Bayesians and frequentists. This renders any subsequent claim to prefer the frequentist philosophy just that: a matter of preference, without a pressing foundational imperative. Yet, even if one were to grant an agreement in numbers, it is altogether crucial to ascertain who or what is really doing the work. If we don’t understand what is really responsible for success stories, we cannot hope to improve methods, or get ideas for extending and developing tools in brand new arenas.
Many if not most will claim to be eclectic or pluralistic or the like, even with respect to methods for statistical inference (as distinct from model specification and decision). They say they do not have, and do not need, a clear statistical philosophy, even for the single context of scientific inference, which is my focus. That is fine, for practitioners. I would never claim there is any obstacle to practice in not having a clear statistical philosophy. But that is different from maintaining that practice calls for recognition of underlying foundational issues while also denying that Bayesian-frequentist issues are especially important to them. Even if one or the other paradigm is chosen (perhaps just for a particular problem), there are still basic issues of warrant and interpretation within that paradigm.
We noted a common tendency for “default” Bayesians to profess reverence for subjective Bayesianism deep down (BADD), at a core philosophical level. But if their practice is at odds with the underlying philosophy, they still need to tackle the consequences that Lindley brings out (failing at any of (a) – (e)). So on this Lindley and I agree. Nor do I think it suffices to describe their methods as approximations to a Bayesian normative ideal. I think the methods in practice require their own principles or philosophy or whatever one likes to call it.
Some readers might say that it’s only because I’m a philosopher of science that I think foundations matter. Maybe. But I think we must admit some fairly blatant “tensions” that sneak into day-to-day practice. For example, there seems to be a confusion between our limits in achieving the goal of adequately capturing a given data generating mechanism, and making the goal itself be to capture our knowledge of, or degrees of belief in (or about), the data generating mechanism. The former may be captured by severity assessments (or something similar), but these are not posterior probabilities (even if one agrees with Lindley that the latter could be). Then there are some slippery slopes about objective/subjective, deduction/induction, and truth/idealizations, deliberately discussed on this blog. These are all philosophical issues, and they are clearly illuminated within frequentist-Bayesian contrasts.
Then there is the rationale for introducing priors. While one group of Bayesians insists we must introduce prior probability distributions (on an exhaustive set of hypotheses) if we are to properly take account of background knowledge (see Oct. 31 post), subjectively elicited priors are often seen as so hard to get, and so rarely to be trusted, that much work goes into developing conventional “default” priors that are not supposed to be expressions of uncertainty, ignorance, or degree of belief. We are back to the question Fisher asked long ago (1934, 287): if prior probabilities in hypotheses are intended to allow subjective background beliefs to influence statistical assessments of hypotheses, then why do we want them? If the priors are designed to have minimal influence on any inferences, then why do we need them? As remarked in Cox and Mayo (2010, p. 301):
“Reference priors yield inferences with some good frequentist properties, at least in one-dimensional problems – a feature usually called matching. Although welcome, it falls short of showing their success as objective methods. First, as is generally true in science, the fact that a theory can be made to match known successes does not redound as strongly to that theory as did the successes that emanated from first principles or basic foundations. This must be especially so where achieving the matches seems to impose swallowing violations of its initial basic theories or principles.
Even if there are some cases where good frequentist solutions are more neatly generated through Bayesian machinery, it would show only their technical value for goals that differ fundamentally from their own. But producing identical numbers could only be taken as performing the tasks of frequentist inference by reinterpreting them to mean confidence levels and significance levels, not posteriors.” (Cox and Mayo 2010)
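To make the "identical numbers" point concrete, here is a minimal Python sketch. It is not taken from Cox and Mayo; it assumes the simplest textbook case (normal data with known standard deviation, and a flat prior on the mean as the default prior), and the data values and function names are made up for illustration.

```python
from math import sqrt
from statistics import NormalDist, fmean


def confidence_interval(data, sigma, level=0.95):
    """Frequentist interval for a normal mean with known sigma."""
    n = len(data)
    xbar = fmean(data)
    z = NormalDist().inv_cdf(0.5 + level / 2)  # standard normal quantile
    half_width = z * sigma / sqrt(n)
    return (xbar - half_width, xbar + half_width)


def flat_prior_credible_interval(data, sigma, level=0.95):
    """Equal-tailed posterior interval for the mean under a flat prior.

    With a flat prior and known sigma, the posterior for the mean is
    Normal(xbar, sigma / sqrt(n)).
    """
    n = len(data)
    posterior = NormalDist(mu=fmean(data), sigma=sigma / sqrt(n))
    tail = (1 - level) / 2
    return (posterior.inv_cdf(tail), posterior.inv_cdf(1 - tail))


# Hypothetical measurements, chosen only for illustration.
data = [4.9, 5.3, 5.1, 4.7, 5.6, 5.0, 5.2, 4.8]

ci = confidence_interval(data, sigma=0.3)
cred = flat_prior_credible_interval(data, sigma=0.3)

print("confidence interval:", ci)
print("credible interval:  ", cred)
```

The two intervals agree up to floating-point rounding, which is the "matching" in question. What the code cannot show is the difference in interpretation: the first number reports a coverage property of the procedure (a confidence level), the second a posterior probability for the parameter.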
I invite your comments, remarks and queries.
[i] Which got the highest # of hits of any post.