Chance, rational beliefs, decision, uncertainty, probability, error probabilities, truth, random sampling, resampling, opinion, expectations. These are some of the concepts we bandy about by giving various interpretations to mathematical statistics, to statistical theory, and to probabilistic models. But are they real? The question of “ontology” asks about such things, and given the “Ontology and Methodology” conference here at Virginia Tech (May 4, 5), I’d like to get your thoughts (for possible inclusion in a Mayo-Spanos presentation).* Also, please consider attending**.
Interestingly, I noticed that the posts that have garnered the most comments have touched on philosophical questions of the nature of entities and processes behind statistical idealizations (e.g., http://errorstatistics.com/2012/10/18/query/).
1. When an interpretation is supplied for a formal statistical account, its theorems may well turn out to express approximately true claims, and the interpretation may be deemed useful, but this does not mean the concepts give correct descriptions of reality. The interpreted axioms, and inference principles, are chosen to reflect a given philosophy, or set of intended aims: roughly, to use probabilistic ideas (i) to control error probabilities of methods (Neyman-Pearson, Fisher), or (ii) to assign and update degrees of belief, actual or rational (Bayesian). But this does not mean its adherents have to take seriously the realism of all the concepts generated. In fact, we often (on this blog) see supporters of various stripes of frequentist and Bayesian accounts running far away from taking their accounts literally, even as those interpretations are, or at least were, the basis and motivation for the development of the formal edifice (“we never meant this literally”). But are these caveats on the same order? Or do some threaten the entire edifice of the account?
Starting with the error statistical account, recall Egon Pearson in his “Statistical Concepts in Their Relation to Reality” making it clear to Fisher that the business of controlling erroneous actions in the long run, acceptance sampling in industry and 5-year plans, only arose with Wald, and was never really part of the original Neyman-Pearson tests (declaring that the behaviorist philosophy was Neyman’s, not his). The paper itself may be found here. I was interested to hear (Mayo 2005) Neyman’s arch opponent, Bruno de Finetti, remark (quite correctly) that the expression “inductive behavior…that was for Neyman simply a slogan underlining and explaining the difference between his, the Bayesian and the Fisherian formulations” became with Abraham Wald’s work, “something much more substantial” (de Finetti 1972, 176).
Granted, it has not been obvious to people just how to interpret N-P tests “evidentially” or “inferentially” (the subject of my work over many years). But there always seemed to me to be enough hints and examples to see what was intended: A statistical hypothesis H assigns probabilities to possible outcomes, and the warrant for accepting H as adequate, for an error statistician, is in terms of how well corroborated H is: how well H has stood up to tests that would have detected flaws in H, at least with very high probability. So the grounds for holding or using H are error statistical. The control and assessment of error probabilities may be used inferentially to determine the capabilities of methods to detect the adequacy/inadequacy of models, and express the extent of the discrepancies that have been identified. We also employ these ideas to detect gambits that make it too easy to find evidence for claims, even if the claims have been subjected to weak tests and biased procedures. A recent post is here.
The account has never professed to supply a unified logic, or any kind of logic for inference. The idea that there was a single rational way to make inferences was ridiculed by Neyman (whose birthday is April 16).
2. Proposed (“we never meant this literally”) withdrawals from the Bayesian interpretations do not seem so innocuous. Perhaps some will say this just shows my bias. Let me grant that the popular idea of interpreting prior probability distributions as non-subjective, in some sense or other, is not so radical (though I’d still want to know how to interpret posteriors and why). But what we usually see now is some blurring of the two: touting the advantage of Bayesian methods because they incorporate background beliefs, while also advertising “conventional” (default, reference, or “objective”) priors as having minimal influence on inference. [1] See “Grace and amen Bayesianism” within this deconstruction. Also relevant: Irony and Bad Faith: Deconstructing Bayesians.
Perhaps the most popular view nowadays regards the prior as some kind of uninterpreted mathematical construct, merely serving to get a posterior. Some of these same Bayesians advocate “testing” the prior, but this is hard to grasp if we do not know what the priors are intended to be, or stand for. Then there are those Bayesians, perhaps a radical (but influential) subgroup, who deny the machinery of updating by Bayes’ theorem altogether. In Gelman (2011) (our special topic of RMM):
“Our key departure from the mainstream Bayesian view (as expressed, for example, [in Wikipedia]) is that we do not attempt to assign posterior probabilities to models or to select or average over them using posterior probabilities. Instead, we use predictive checks to compare models to data and use the information thus learned about anomalies to motivate model improvements.” (p. 71).
In Gelman and Robert (2013), we hear that a major source of Bayesian criticism comes from assuming “that Bayesians actually seem to believe their assumptions rather than merely treating them as counters in a mathematical game.” (p. 3) This comes as a surprise to those of us who thought the Bayesians really meant it. So what is the game being played?
[W]e make strong assumptions and use subjective knowledge in order to make inferences and predictions that can be tested by comparing to observed and new data (see Gelman and Shalizi, 2012, or Mayo, 1996 for a similar attitude coming from a non-Bayesian direction). (p. 3)
So maybe some kind of a “non-Bayesian checking of Bayesian models” would offer a more promising foundation, at least for Gelman’s brand of “Bayesian falsificationism” (Gelman 2011). See my 2013 Comments on Gelman and Shalizi [2]. On the face of it, any inference, whether to the adequacy of a model (for a given purpose), or to a posterior probability, can be said to be warranted just to the extent that the inference has withstood severe testing: one with a high probability of having found flaws were they present. The ontology matters less than the epistemology.
Thus, the severity idea could conceivably illuminate what’s going on with Gelman’s model checking; I find the idea promising, but do not really know what he thinks.
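To make “a high probability of having found flaws were they present” concrete, here is a minimal sketch for one textbook setup, which is an assumption of this example and not something specified in the post: a one-sided test about a Normal mean with known standard deviation. The function name and the numbers are hypothetical. The severity with which the claim “mu > mu1” passes, given the observed sample mean, is taken to be the probability of observing a sample mean no larger than the one actually observed, were mu equal to mu1.

```python
from math import sqrt
from statistics import NormalDist


def severity_mu_greater(xbar, mu1, sigma, n):
    """Severity for the claim 'mu > mu1' given observed mean xbar.

    Assumes (for illustration only) iid Normal data with known sigma.
    Computed as P(sample mean <= xbar; mu = mu1).
    """
    standard_error = sigma / sqrt(n)
    return NormalDist().cdf((xbar - mu1) / standard_error)


# Hypothetical data summary: n = 100 observations, sigma = 2, observed mean 0.4.
sigma, n, xbar = 2.0, 100, 0.4
for mu1 in (0.0, 0.2, 0.4, 0.6):
    sev = severity_mu_greater(xbar, mu1, sigma, n)
    print(f"claim mu > {mu1:.1f}: severity = {sev:.3f}")
```

The same observed mean warrants “mu > 0” with high severity but “mu > 0.6” with low severity: the assessment attaches to the particular claim and the particular discrepancy, not to the hypothesis as a whole.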
But to pursue such an avenue still requires reckoning with a fundamental issue at the foundations of Bayesian method: the interpretation of and justification for the prior probability distribution. Error statisticians use idealizations, but they are tightly constrained by the need for error probabilities, in a statistical model, to approximate the actual ones, even if only hypothetical, or checked by simulation. We are modeling real processes, not knowledge of processes.
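The requirement that model-based error probabilities approximate the actual ones can be checked by simulation. The sketch below is only an illustration under assumptions of my choosing: a one-sided z-test at a nominal 5% level that presumes independent Normal observations with variance 1, applied first to data that satisfy that assumption and then to autocorrelated (AR(1)) data that violate it. All function names and settings are hypothetical.

```python
import random
from math import sqrt
from statistics import NormalDist, fmean


def rejection_rate(draw_sample, n, sigma=1.0, alpha=0.05, reps=4000, seed=0):
    """Relative frequency with which a one-sided z-test of mu = 0 rejects."""
    rng = random.Random(seed)
    critical_value = NormalDist().inv_cdf(1 - alpha)
    rejections = 0
    for _ in range(reps):
        sample = draw_sample(rng, n)
        z = fmean(sample) / (sigma / sqrt(n))  # statistic assumes independence
        rejections += z > critical_value
    return rejections / reps


def iid_normal(rng, n):
    """Data that satisfy the test's assumptions: iid N(0, 1)."""
    return [rng.gauss(0.0, 1.0) for _ in range(n)]


def ar1_normal(rng, n, rho=0.5):
    """Stationary AR(1) data with mean 0 and marginal variance 1."""
    x = rng.gauss(0.0, 1.0)
    sample = [x]
    for _ in range(n - 1):
        x = rho * x + rng.gauss(0.0, sqrt(1 - rho**2))
        sample.append(x)
    return sample


iid_rate = rejection_rate(iid_normal, n=50)
ar1_rate = rejection_rate(ar1_normal, n=50)
print(f"nominal 0.05 | actual under iid: {iid_rate:.3f} | actual under AR(1): {ar1_rate:.3f}")
```

In both cases the null value mu = 0 is true, so every rejection is an error. When independence holds, the simulated rate should sit near the nominal level; under positive dependence it should come out noticeably higher, which is the sense in which the nominal error probability fails to approximate the actual one.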
Gelman and Robert (2013) allow:
“that many Bayesians over the years have muddied the waters by describing parameters as random rather than fixed. Once again, for Bayesians as much as for any other statistician, parameters are (typically) fixed but unknown. It is the knowledge about these unknowns that Bayesians model as random” (p. 4).
Bayesians will …assign a probability distribution to a parameter that one could not possibly imagine to have been generated by a random process, parameters such as the coefficient of party identification in a regression on vote choice, or the overdispersion in a network model, or Hubble’s constant in cosmology. There is no inconsistency in this opposition once one realizes that priors are not reflections of a hidden “truth” but rather evaluations of the modeler’s uncertainty about the parameter. (p. 3)
The choice, of course, is not between modeling a “hidden ‘truth’” and modeling “the modeler’s uncertainty”. Actually, in the majority of the examples I have seen, it seems better to imagine the parameter being generated by a random process. On the other hand, “the modeler’s uncertainty about the parameter” is one of the most unclear parts of Bayesian modeling. It is not that we can’t see measuring the degree of evidence, corroboration, severity of test, or the like, that is accorded a claim about a fixed parameter. We can and do. It is just that those measures will not be well represented as posterior or prior probabilities, obeying the probability calculus.
Possibly an idea I once proposed, a variation on a view held by the frequentist Reichenbach, can work (EGEK 1996, ch. 4). Reichenbach suggested that scientists might eventually be able to assess the relative frequency with which a given type of hypothesis or theory is true. This might provide it a frequentist probability assignment. I don’t see how one could get such a relative frequency (or rather I can see many different reference sets that could be used), nor why knowing such quantities would be useful in appraising the evidence for a given hypothesis H. My variation (Chapter 4, “Duhem, Kuhn, and Bayes,” pp. 120-4) is to consider the relative frequency with which evidence of a certain strength (e.g., passing k tests with increasingly impressive error probabilities) is generated, despite H being false. This is attainable. But that, of course, takes us to an error probabilistic assessment!
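One way to cash out that relative frequency, offered only as a sketch under assumptions not in the text: suppose “H false” means each test statistic is a standard Normal draw, the k tests are independent, and “passing” test i means exceeding the cutoff for significance level alpha_i. The function name and the levels used are hypothetical.

```python
import random
from statistics import NormalDist


def freq_passing_all_tests_when_h_false(alphas, reps=100_000, seed=1):
    """Simulated relative frequency of passing every test although H is false.

    Assumes independent tests whose z statistics are N(0, 1) when H is false.
    """
    rng = random.Random(seed)
    cutoffs = [NormalDist().inv_cdf(1 - a) for a in alphas]
    passes = 0
    for _ in range(reps):
        # A fresh standard Normal draw per test; all() stops at the first failure.
        if all(rng.gauss(0.0, 1.0) > cutoff for cutoff in cutoffs):
            passes += 1
    return passes / reps


alphas = (0.10, 0.05)  # hypothetical, increasingly stringent levels
estimate = freq_passing_all_tests_when_h_false(alphas)
exact = alphas[0] * alphas[1]  # product rule holds under the independence assumption
print(f"simulated: {estimate:.4f}, exact under independence: {exact:.4f}")
```

The quantity estimated is an ordinary error probability of the testing procedure, with no frequency attached to H itself.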
Maybe this style of Bayesianism doesn’t need a clear ontology so long as it’s got a clear epistemology. But does it?***
What do readers think?
*To see the full list of speakers: “Ontology and Methodology” conference. Actually our presentation will likely take a different tack, but I still want to hear your thoughts.
**Registration is free, but required, by April 20-25.
***I should say right off (for those who do not know) that my work is not in metaphysics, but on philosophical problems about inductive-statistical inference, experiment and evidence. My colleague (and co-conference organizer) Ben Jantzen is the “ontology” guy, and the third colleague involved, Lydia Patton, does O & M as well as HPS.
For further references, see those within posts and papers linked here, or search this blog.
De Finetti, B. (1972), Probability, Induction, and Statistics: The Art of Guessing. NY, Wiley.
Gelman, A. (2011). Induction and deduction in Bayesian data analysis. Rationality, Markets and Morals (RMM) 2, 67–78.
Gelman, A. and Shalizi, C. (Article first published online: 24 Feb 2012). Philosophy and the practice of Bayesian statistics (with discussion). British Journal of Mathematical and Statistical Psychology (BJMSP).
Gelman, A. and Robert, C. (2013). Not only defended but also applied: The perceived absurdity of Bayesian inference.
http://www.stat.columbia.edu/~gelman/research/published/feller8.pdf
Kass, R. and Wasserman, L. (1996). The Selection of Prior Distributions by Formal Rules. Journal of the American Statistical Association 91, 1343-1370.
Mayo, D. G. (1996).[EGEK] Error and the growth of experimental knowledge. Chicago: University of Chicago Press.
_____ (2005). Evidence as passing severe tests: Highly probable vs. highly probed hypotheses. In P. Achinstein (Ed.), Scientific Evidence (pp. 95-127). Baltimore: Johns Hopkins University Press.
_____ (2011). Statistical science and philosophy of science: where do/should they meet in 2011 (and beyond)? Rationality, Markets and Morals (RMM) 2, Special Topic: Statistical Science and Philosophy of Science, 79–102.
_____ (2013). Comments on A. Gelman and C. Shalizi: Philosophy and the practice of Bayesian statistics. British Journal of Mathematical and Statistical Psychology, forthcoming.
Mayo, D. and Cox, D. (2010). Frequentist statistics as a theory of inductive inference. In D. Mayo and A. Spanos (Eds.), Error and Inference: Recent Exchanges on Experimental Reasoning, Reliability and the Objectivity and Rationality of Science (pp. 247-275). Cambridge: Cambridge University Press. This paper appeared in The Second Erich L. Lehmann Symposium: Optimality, 2006, Lecture Notes-Monograph Series, Volume 49, Institute of Mathematical Statistics, 247-275.
Mayo, D. and Spanos, A. (2011). Error statistics. In P. Bandyopadhyay and M. Forster (Volume Eds.); D. M. Gabbay, P. Thagard and J. Woods (General Eds.). Philosophy of statistics: Handbook of philosophy of science Vol 7 (pp. 1-46). The Netherlands: Elsevier.
Pearson, E. S. (1955). Statistical concepts in their relation to reality. Journal of the Royal Statistical Society B 17, 204-207.
Senn, S. (2011). You may believe you are a Bayesian but you are probably wrong. Rationality, Markets and Morals (RMM) 2, Special Topic: Statistical Science and Philosophy of Science, 48-66.
[1] For a thorough account of problems with the latter, see Kass and Wasserman (1996).
[2] I take Gelman-Shalizi (2012) to be an attempt at a meeting of the minds between Bayesian Gelman and error statistical Shalizi. I may be wrong.