Mayo Commentary on Gelman & Robert

The following is my commentary on a paper by Gelman and Robert, forthcoming (in early 2013) in The American Statistician* (submitted October 3, 2012).

_______________________

Discussion of Gelman and Robert, “Not only defended but also applied: The perceived absurdity of Bayesian inference”
Deborah G. Mayo

1. Introduction

I am grateful for the chance to comment on the paper by Gelman and Robert. I welcome seeing statisticians raise philosophical issues about statistical methods, and I entirely agree that methods should not only be applicable but also be capable of being defended at a foundational level. As the authors put it, “It is doubtful that even the most rabid anti-Bayesian of 2010 would claim that Bayesian inference cannot apply” (Gelman and Robert 2012, p. 6). This is clearly correct; in fact, it is not far off the mark to say that the majority of statistical applications nowadays are placed under the Bayesian umbrella, even though the goals and interpretations found there are extremely varied. There is a plethora of international societies, journals, post-docs, and prizes with “Bayesian” in their name, and a wealth of impressive new Bayesian textbooks and software is available. Even before the latest technical advances and the rise of “objective” Bayesian methods, leading statisticians were calling for eclecticism (e.g., Cox 1978), and most will claim to use a smattering of Bayesian and non-Bayesian methods, as appropriate. George Casella (to whom their paper is dedicated) and Roger Berger in their superb textbook (2002) exemplify a balanced approach.

What about the issue of the foundational defense of Bayesianism? That is the main subject of these comments. Whereas many practitioners see the “rising use of Bayesian methods in applied statistical work” as being in support of a corresponding Bayesian philosophy, Gelman and Shalizi (2012) declare that “most of the standard philosophy of Bayes is wrong” (p. 2). The widespread use of Bayesian methods does not underwrite the classic subjective inductive philosophy that Gelman associates (correctly) with the description of Bayesianism found on Wikipedia: “Our key departure from the mainstream Bayesian view (as expressed, for example, [in Wikipedia]) is that we do not attempt to assign posterior probabilities to models or to select or average over them using posterior probabilities. Instead, we use predictive checks to compare models to data and use the information thus learned about anomalies to motivate model improvements.” (p. 71).

From the standpoint of this departure, Gelman and Robert defend their Bayesian approach against Feller’s view “that Bayesian methods are absurd—not merely misguided but obviously wrong in principle” (p. 2).

Given that Bayesian methods have inundated all teaching and applications, a reader might at first be puzzled by the authors’ choice to consider Feller’s 1950 introduction to probability, the text of which gives a page or two to “Bayes Rule.” Noting that “before the ascendance of the modern theory, the notion of equal probabilities was often used as synonymous for ‘no advance knowledge,’” Feller questions the “‘law of succession of Laplace’ connected with this” (Feller 1950, pp. 124-125 of the 1970 edition). The authors readily concede: “[I]t would be accurate, we believe, to refer to Bayesian inference as being an undeveloped subfield in statistics at that time, with Feller being one of the many academics who were aware of some of the weaker Bayesian ideas but not of the good stuff” (p. 4).
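For reference, the rule at issue (my gloss, not the authors’): starting from a uniform prior on an unknown success probability and observing k successes in n independent trials, Laplace’s rule of succession gives

\[
P(\text{success on trial } n+1 \mid k \text{ successes in } n \text{ trials}) = \frac{k+1}{n+2}.
\]

It is this route from “no advance knowledge” to a definite posterior prediction that Feller found objectionable.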

Yet the authors have a deeper reason to examine Feller. As they reiterate, what strikes them “about Feller’s statement was not so much his stance as his apparent certainty” (p. 3). They “doubt that Feller came to his own considered judgment about the relevance of Bayesian inference…. Rather, we suspect that it was from discussions with one or more statistician colleagues that he drew his strong opinions about the relative merits of different statistical philosophies” (p. 6).

Whether or not their suspicion of Feller is correct, they have identified a common tendency in foundational discussions of statistics simply to be swayed by colleagues and oft-repeated criticisms, rather than arriving at one’s own considered conclusion. Also to their credit, their defense is not “defensive.” Indeed, in some ways they raise stronger criticisms of Bayesian standpoints than Feller himself:

In the last half of the twentieth century, Bayesians had the reputation (perhaps deserved) of being philosophers who were all too willing to make broad claims about rationality, with optimality theorems that were ultimately built upon questionable assumptions of subjective probability, in a denial of the garbage-in-garbage-out principle, thus defying all common sense. In opposition to this nonsense, Feller (and others of his time) favored a mixture of Fisher’s rugged empiricism and the rigorous Neyman-Pearson theory, which “may be not only defended but also applied.” (p. 17)

Perhaps Bayesians have gotten over the reputation cited by the authors of “being philosophers who were all too willing to make broad claims about rationality,” but, by and large, philosophers have not. I regard the most important message of their paper as being a call for a change from all players (p. 15).

2. Probabilism in contrast to sampling theory standpoints

Tellingly, the authors begin their article by observing that “[y]ounger readers of this journal may not be fully aware of the passionate battles over Bayesian inference among statisticians in the last half of the twentieth century” (p. 2). They are undoubtedly correct, and that alone attests to the predominance of Bayesian methods and pro-Bayesian arguments in statistics courses. By contrast, few readers are unaware of the litany of criticisms repeatedly raised regarding statistical significance tests, confidence intervals, and the frequentist sampling-theory justifications for these tools. We heartily share their sentiment:

At the very least, we hope Feller’s example will make us wary of relying on the advice of colleagues to criticize ideas we do not fully understand. New ideas by their nature are often expressed awkwardly and with mistakes—but finding such mistakes can be an occasion for modifying and improving these ideas rather than rejecting them. (p. 17)

The construal of Neyman-Pearson statistics that is so widely lampooned reflects Neyman and Pearson’s very early attempt to develop a formalism that would capture the Fisherian and other methods used at the time. As Pearson remarks in his response to Fisher’s (1955) criticisms: “Indeed, from the start we shared Professor Fisher’s view that in scientific enquiry, a statistical test is ‘a means of learning’” (Pearson 1955, 206).

Underlying one of the philosophers’ examples Gelman and Robert discuss (the doomsday argument) “is the ultimate triumph of the idea, beloved among Bayesian educators, that our students and clients don’t really understand Neyman-Pearson confidence intervals and inevitably give them the intuitive Bayesian interpretation.” The idea “beloved among Bayesian educators” does not merely assert that probability should enter to provide posterior probabilities, an assumption we may call probabilism; it assumes that the frequentist error statistician also shares this goal. Thus, whenever error probabilities, be they p-values or confidence levels, disagree with a favored Bayesian posterior, this is alleged to show that frequentist methods are self-contradictory, and thus unsound.

For example, the fact that a frequentist p-value can differ from a Bayesian posterior (in two-sided testing, assuming one or another prior) has been regarded as showing that p-values overestimate the evidence against a (point) null (e.g., Berger 2003). That a sufficiently large sample size can result in rejecting a null deemed plausible by a Bayesian is thought to show the logical unsoundness of significance testers (Howson 1997a, 1997b).[i] Assuming that confidence levels are to give posterior probabilities to the resulting interval estimate, Jose Bernardo declares that non-Bayesians “should be subject to some re-education using well known, standard counter-examples such as the fact that conventional 0.95-confidence regions may actually consist of the whole real line” (2008, 453). The situation with all of these alleged “counterexamples” looks very different when error probabilities associated with methods are employed to assess which parameter values are or are not well indicated by the data (e.g., Mayo 2003, 2005, 2010). Error probabilities are not posteriors, but refer to the distribution of a statistic d(X)—the so-called sampling distribution (hence the term sampling theory). Admittedly, this alone is often claimed to be at odds with mainstream (at least subjective) Bayesian methods, where consideration of outcomes other than the one observed is disallowed (i.e., the likelihood principle [LP]), at least once the data are available. As Jay Kadane puts it in his recent text: “Neyman-Pearson hypothesis testing violates the likelihood principle, because the event either happens or does not; and hence has probability one or zero” (Kadane 2011, 439).
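To make the first of these disagreements concrete, here is a minimal numerical sketch (mine, not the authors’) of the textbook two-sided test of a Normal point null; the equal prior weights and the N(0, 1) prior under the alternative are illustrative assumptions:

    import math
    from scipy import stats

    # Sketch (mine, not the authors'): two-sided test of H0: mu = 0 vs
    # H1: mu != 0, with X_1, ..., X_n ~ N(mu, 1).  Assumed conventional
    # prior: P(H0) = 1/2, and mu ~ N(0, 1) under H1.
    n = 1000
    xbar = 2.0 / math.sqrt(n)               # sample mean chosen so z = 2.0
    z = math.sqrt(n) * xbar

    p_value = 2 * (1 - stats.norm.cdf(z))   # about 0.046: "reject at 0.05"

    # Marginal densities of xbar: N(0, 1/n) under H0, N(0, 1 + 1/n) under H1.
    m0 = stats.norm.pdf(xbar, scale=math.sqrt(1 / n))
    m1 = stats.norm.pdf(xbar, scale=math.sqrt(1 + 1 / n))
    posterior_H0 = m0 / (m0 + m1)           # about 0.81 for n = 1000

    print(f"p-value = {p_value:.3f}, P(H0 | data) = {posterior_H0:.3f}")

The p-value rejects the null at the 0.05 level while the null’s posterior probability is high; whether this indicts the p-value or the prior is precisely what is at issue in the text.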

It often goes unrecognized that criticisms of frequentist statistical methods assume a certain philosophy about statistical inference (probabilism), and often allege that error-statistical methods can achieve only radical behavioristic goals, wherein only long-run error rates matter. Feller, in declaring that “the modern method of statistical tests and estimation is less intuitive but more realistic,” also reveals the common tendency to assume a philosophy of probabilism (Feller 1950, pp. 124-125 of the 1970 edition). Our own intuitions go in a different direction: what is intuitively required are ways to quantify how well tested claims are, and how precisely and accurately they are indicated. Still, we admit that good error probabilities, while necessary, do not automatically suffice to satisfy the goal of capturing the well-testedness of inferences.

However, when we try to block the unintuitive inferences, for example, by conditioning on error properties that are relevant for assessing well-testedness, “there is a catch” (Ghosh, Delampady, and Samanta 2006, 38): we seem to be led toward violating other familiar frequentist principles (sufficiency, weak conditionality), at least according to a famous argument (by Allan Birnbaum in 1962). Once again, critics place us in a self-contradictory position, but we argue that the frequentist is simply presented with a false dilemma, and that “the ‘dilemma’ argument is therefore an illusion” (Cox and Mayo 2010).
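For readers unfamiliar with the argument, here is a compact statement of the principles involved (my paraphrase of the standard formulations; Ev(E, x) denotes the evidential import of outcome x from experiment E):

\[
\begin{aligned}
(\mathrm{S})\;&: \text{if } T \text{ is sufficient for } E, \text{ then } \mathrm{Ev}(E,x)=\mathrm{Ev}(E^{T},T(x)),\\
(\mathrm{WCP})\;&: \text{if } E \text{ mixes } E_{1}, E_{2} \text{ by a known random device, then } \mathrm{Ev}(E,(i,x_{i}))=\mathrm{Ev}(E_{i},x_{i}),\\
(\mathrm{LP})\;&: \text{if } x^{*} \text{ from } E^{*} \text{ and } y \text{ from } E \text{ give proportional likelihoods for } \theta, \text{ then } \mathrm{Ev}(E^{*},x^{*})=\mathrm{Ev}(E,y).
\end{aligned}
\]

Birnbaum claimed that (S) and (WCP) jointly entail (LP); since error probabilities depend on the sampling distribution, they violate (LP), whence the apparent dilemma. The rebuttal cited above (Cox and Mayo 2010; Mayo 2010) denies that the entailment goes through.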

While the text by Gelman et al. (2003) is a noteworthy exception, it is standard for texts to list, in addition to the above “counterexamples,” an assortment of classic fallacies (conflating statistical and substantive significance, fallacies of insignificant results, fallacies of rejection), which, to echo the authors’ point about Feller, stem from often-heard strong opinions about frequentist methods, overlooking how frequentists have responded. The current situation in statistical foundations may present an opportunity to reconsider these criticisms, free of the traditional frameworks of both Bayesian and frequentist statistics. The appeal to a testing notion may also be relevant to justifying the Bayesian account that Gelman and Robert advance.

3. A Testing Defense for Bayesianism? 

The authors correctly suspect that what has bothered mathematicians such as Feller comes from assuming “that Bayesians actually seem to believe their assumptions rather than merely treating them as counters in a mathematical game. . . . [T]his interpretation may be common among probabilists, whereas we see applied statisticians as considering both prior and data models as assumptions to be valued for their use in the construction of effective statistical inferences” (p. 8).

Rather than believing their assumptions, the authors suggest that they test them:

[W]e make strong assumptions and use subjective knowledge in order to make inferences and predictions that can be tested by comparing to observed and new data (see Gelman and Shalizi, 2012, or Mayo, 1996 for a similar attitude coming from a non-Bayesian direction). (p. 9)

So perhaps some kind of “non-Bayesian checking of Bayesian models” (Gelman and Shalizi 2012, 11) would offer more promise than attempts at a reconciliation of Bayesian and frequentist ideas by way of long-run performance properties.

To pursue such an avenue, one still must reckon with a fundamental issue at the foundations of the Bayesian method: the interpretation of and justification for the prior probability distribution, the use of which is arguably what distinguishes it from frequentist error statistics. To their credit, the authors concede “that many Bayesians over the years have muddied the waters by describing parameters as random rather than fixed. Once again, for Bayesians as much as for any other statistician, parameters are (typically) fixed but unknown. It is the knowledge about these unknowns that Bayesians model as random” (pp. 15-16).

Although many illustrations enable an intuitive grasp of what they seem to have in mind, the idea of modeling knowledge of fixed unknowns as random, if it is to sit at the foundations, calls for explication. The authors are right to observe that most statisticians are comfortable with probability models:

Bayesians will go the next step and assign a probability distribution to a parameter that one could not possibly imagine to have been generated by a random process, parameters such as the coefficient of party identification in a regression on vote choice, or the overdispersion in a network model, or Hubble’s constant in cosmology. There is no inconsistency in this opposition once one realizes that priors are not reflections of a hidden “truth” but rather evaluations of the modeler’s uncertainty about the parameter. (pp. 9-10; emphasis mine)

But it is precisely the introduction of “the modeler’s uncertainty about the parameter” that is so much at the heart of questions involving the understanding and justification of Bayesian methods. It would be illuminating to hear the authors’ take on the different conceptions of and debates about this “modeler’s uncertainty” about a parameter. Arguably, the predominant uses of Bayesian methods come from those who advocate “objective” or “default” or “reference” priors (we use the neutral term “conventional” Bayesians, but any preferred term will do). Yet contemporary conventional Bayesians have worked assiduously to develop priors that are not supposed to be considered expressions of uncertainty, ignorance, or degree of belief; they are “mathematical concepts” of some sort used to obtain posterior probabilities. While subjective Bayesians urge us to incorporate background information into the analysis of a given set of data by means of a prior probability on alternative hypotheses (perhaps attained through elicitations of degrees of belief), some of the most influential Bayesian methods in practice invite us to employ conventional priors that have the most minimal influence on resulting inferences, letting the data dominate. Conventional priors, unlike what might be expected from measures of initial uncertainty in parameters, are model-dependent, resulting in Bayesian incoherence, “leading to violations of basic principles, such as the likelihood principle and the stopping rule principle” (Berger 2006, 394). Even within the conventional Bayesian school, there are many priors from which to choose: priors based on asymptotic model-averaged information differences (between the prior and the posterior), matching priors that yield optimal frequentist methods, and others besides (Berger 2006; Kass and Wasserman 1996). Cox (2006) summarizes some of the concerns he has often articulated:

[T]he prior distribution for a particular parameter may well depend on which is the parameter of primary interest or even on the order in which a set of nuisance parameters is considered. Further, the simple dependence of the posterior on the data only via the likelihood regardless of the probability model is lost. If the prior is only a formal device and not to be interpreted as a probability, what interpretation is justified for the posterior as an adequate summary of information. (p. 77)
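The model dependence Cox and Berger describe can be made concrete with a standard illustration (my sketch, not from the paper): the Jeffreys prior for a success probability p is Beta(1/2, 1/2) under binomial sampling, but proportional to p^(-1)(1-p)^(-1/2) under negative-binomial sampling, so the same data yield different posteriors even though the two likelihoods are proportional.

    from scipy.stats import beta

    # Hypothetical data: r successes in n Bernoulli trials.  Under binomial
    # sampling the Jeffreys prior is Beta(1/2, 1/2); under negative-binomial
    # sampling (observe until r successes) it is proportional to
    # p^(-1) (1-p)^(-1/2).  The likelihoods are proportional, so an adherent
    # of the likelihood principle would demand identical posteriors.
    r, n = 3, 12

    post_binom = beta(r + 0.5, n - r + 0.5)   # posterior, binomial Jeffreys
    post_negbin = beta(r, n - r + 0.5)        # posterior, neg.-binomial Jeffreys

    for name, post in [("binomial", post_binom), ("neg-binomial", post_negbin)]:
        lo, hi = post.ppf(0.025), post.ppf(0.975)
        print(f"{name:>13}: mean {post.mean():.3f}, "
              f"95% interval ({lo:.3f}, {hi:.3f})")
    # The two posteriors differ: the "conventional" prior depends on the
    # sampling model, illustrating the violations Berger mentions.

The conventional-prior Bayesian accepts this difference as the price of model-based “objectivity,” which is exactly Berger’s point above about likelihood-principle and stopping-rule violations.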

Bayesian testing seems to be in a state of flux. The authors’ invitation to test Bayesian models, including priors, is welcome; but the results of testing are clearly going to depend on explicating the intended interpretation of whatever is being tested.

Elsewhere it is suggested that there need not be a uniquely correct conventional or subjective prior; what we have instead is a “combination of the prior distribution and the likelihood, each of which represents some compromise among scientific knowledge, mathematical convenience, and computational tractability” (Gelman and Shalizi 2012, 13). (Without presuming Robert concurs, we assume the authors endorse some latitude in interpreting priors.) There is no problem with the prior serving many functions, so long as its particular role is pinned down for the case at hand (Mayo 2013). These authors correctly argue that the assumptions of the likelihood are also just that, assumptions, but we still need to understand what is being represented. If the prior and likelihood are regarded as a holistic model, it is still possible to test for adequacy; but to pinpoint the source of any misfits would seem to require more.

Finally, if we agree with these authors that the key goal is “to make inferences and predictions that can be tested by comparing to observed and new data,” we need a notion of adequate/inadequate tests. A basic intuition is that a test should have a good capacity, or at least some capability, of detecting inadequacies and flaws in whatever is being tested. The philosophy of statistics we favor employs frequentist error probabilities to appraise and ensure the probative capacity, or severity, of tests, in a way that is sensitive to the actual data and the claim to be inferred. Admittedly, in developing this statistical philosophy, mistakes and shortcomings in the typical behavioristic construal of frequentist methods were used as “an occasion for modifying and improving these ideas rather than rejecting them,” to echo these authors. Possibly this can offer a non-traditional avenue for a philosophical defense of the Bayesian testing these authors advance.
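As a minimal sketch of what such a severity assessment looks like in the simplest case (a one-sided Normal test with known variance; the numbers are illustrative, and Mayo and Spanos 2011 give the general treatment):

    import math
    from scipy.stats import norm

    # Sketch (mine) of a severity assessment for the one-sided Normal test
    # of H0: mu <= 0 vs H1: mu > 0, with X_i ~ N(mu, sigma^2), sigma known.
    sigma, n = 1.0, 25
    xbar_obs = 0.4                              # hypothetical observed mean
    d_obs = math.sqrt(n) * xbar_obs / sigma     # observed statistic: 2.0, "reject"

    def severity(mu1):
        """SEV(mu > mu1) after rejecting H0 with xbar_obs: the probability
        of a result less discordant with H0 than the one observed, were mu
        only mu1, i.e. P(Xbar < xbar_obs; mu = mu1)."""
        return norm.cdf(math.sqrt(n) * (xbar_obs - mu1) / sigma)

    for mu1 in (0.0, 0.2, 0.4, 0.6):
        print(f"SEV(mu > {mu1}) = {severity(mu1):.3f}")
    # 0.977, 0.841, 0.500, 0.159: "mu > 0" passes severely with these data,
    # while the stronger claim "mu > 0.6" is poorly indicated.

The same error probabilities thus serve an inferential, not merely behavioristic, role: they calibrate which discrepancies from the null are and are not warranted by the particular outcome.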

4. Concluding remarks

Bayesian methods are widely applied, but when the discussion turns to foundations there is some question as to whether the success stories are properly credited to mainstream philosophical subjective Bayesianism. Gelman and Robert, if we understand them, deny this. Failure to consider an alternative defense for widely used Bayesian methods is at the heart of criticisms that continue. Stephen Senn (2011, p. 58) calls attention to “[a] very standard form of argument … frequently encountered in many applied Bayesian papers where the first paragraphs laud the Bayesian approach on various grounds, in particular its ability to synthesize all sources of information,” while in the rest of the paper the authors engage in inexplicit, non-Bayesian reasoning. The objection loses its force if some non-standard or even non-Bayesian defense is involved, but that is something that requires development. We do not deny that there is an epistemological foundation for the authors’ Bayesian approach; we claim only that the foundations for Bayesian testing are in some flux and deserve attention.

Our take-home message in a nutshell is this: contemporary Bayesianism is in need of new foundations, whether they are to be found in non-Bayesian testing or elsewhere. Hopefully, philosophers of probability will turn their attention to these tantalizing problems of statistics. In contrast to the heady golden era of philosophy of statistics of 25 or 40 years ago, contemporary philosophers of science are far more focused on probability than on statistics. While some of the issues have trickled down to the philosophers, by and large we see ‘formal epistemology’ assuming the traditional justifications for probabilism that are being questioned by contemporary statisticians, Bayesian and non-Bayesian. Gelman and Robert are among the philosophically-minded statisticians who are taking the lead[ii]. For practitioners, it suffices that their methods are useful and widely applied; we philosophical under-laborers should be helping to make explicit the underlying philosophical defenses.

*Some very small editorial corrections are missing from what was first posted (e.g., it’s their paper, and not the whole issue, that is dedicated to Casella). Elbians will correct and update this.

References

 Berger, J. O. (2003). Could Fisher, Jeffreys and Neyman have agreed on testing? Statistical Science, 18, 1–12.

_____ (2006). The case for objective Bayesian analysis; and Rejoinder. Bayesian Analysis, 1(3), 385–402; 457–464.

Bernardo, J. M. (2008). Comment on article by Gelman. Bayesian Analysis, 3(3), 451–454.

Birnbaum, A. (1962). On the foundations of statistical inference. In S. Kotz & N. Johnson (Eds.), Breakthroughs in statistics (Vol. 1, pp. 478-518). Springer Series in Statistics, New York: Springer-Verlag. First published (with discussion) in Journal of the American Statistical Association, 57, 269–306.

Casella, G., and Berger, R. L. (2002).  Statistical inference (2nd ed.). Pacific Grove, CA: Duxbury Press.

Cox, D. R. (1978). Foundations of statistical inference: The case for eclecticism. Australian Journal of Statistics, 20(1), 43-59. Knibbs Lecture, Statistical Society of Australia, 1977.

_____ (2006). Principles of statistical inference. Cambridge: Cambridge University Press.

_____ and Mayo, D. G. (2010). Objectivity and conditionality in frequentist inference. In D. Mayo & A. Spanos (Eds.), Error and inference: Recent exchanges on experimental reasoning, reliability, and the objectivity and rationality of science (pp. 276-304). Cambridge: Cambridge University Press.

Feller, W. (1950). An introduction to probability theory and its applications. New York: Wiley.

Fisher, R. A. (1934). Two new properties of mathematical likelihood. Proceedings of the Royal Society, A, 144, 285-307.

_______ (1955). Statistical methods and scientific induction. Journal of the Royal Statistical Society, B, 17, 69-78.

Gelman, A. (2011). Induction and deduction in Bayesian data analysis. Rationality, Markets and Morals (RMM), 2, 67–78.

_______, Carlin, J. B., Stern, H. S., and Rubin, D. B. (2003). Bayesian data analysis (2nd ed.). London: Chapman and Hall.

_______ and Shalizi, C. (2012). Philosophy and the practice of Bayesian statistics (with discussion). British Journal of Mathematical and Statistical Psychology. Article first published online: 24 February 2012.

_______ and Robert, C. (forthcoming). Not only defended but also applied: The perceived absurdity of Bayesian inference. The American Statistician.

Ghosh, J. K., Delampady, M., and Samanta, T. (2006). An introduction to Bayesian analysis. New York: Springer.

Howson, C. (1997a). A logic of induction. Philosophy of Science, 64, 268–290.

_______ (1997b). Error probabilities in error. Philosophy of Science, 64, 194.

Kadane, J. (2011). Principles of uncertainty. Boca Raton: Chapman & Hall.

Kass, R. (2011). Statistical inference: The big picture. Statistical Science, 26, 1-9.

_______ and Wasserman, L. (1996). The selection of prior distributions by formal rules. Journal of the American Statistical Association, 91, 1343-1370.

Mayo, D. G. (1996). Error and the growth of experimental knowledge. Chicago: University of Chicago Press.

_____ (2003). Could Fisher, Jeffreys and Neyman have agreed on testing? Commentary on J. Berger’s Fisher address. Statistical Science 18, 19-24.

_____ (2005). Evidence as passing severe tests: Highly probable vs. highly probed hypotheses. In P. Achinstein (Ed.), Scientific Evidence (pp. 95-127). Baltimore: Johns Hopkins University Press.

_____ (2010). An error in the argument from conditionality and sufficiency to the likelihood principle. In D. Mayo and A. Spanos (Eds.), Error and inference: Recent exchanges on experimental reasoning, reliability, and the objectivity and rationality of science (pp. 305-314). Cambridge: Cambridge University Press.

_____ (2011). Statistical science and philosophy of science: Where do/should they meet in 2011 (and beyond)? Rationality, Markets and Morals (RMM), 2, Special Topic: Statistical Science and Philosophy of Science, 79–102.

_____ (2013). Comments on A. Gelman and C. Shalizi: Philosophy and the practice of Bayesian statistics. British Journal of Mathematical and Statistical Psychology, forthcoming.

_______ and Cox, D. (2010). Frequentist statistics as a theory of inductive inference. In D. Mayo and A. Spanos (Eds.), Error and Inference: Recent Exchanges on Experimental Reasoning, Reliability and the Objectivity and Rationality of Science (pp. 247-275). Cambridge: Cambridge University Press. This paper appeared in The Second Erich L. Lehmann Symposium: Optimality, 2006, Lecture Notes-Monograph Series, Volume 49, Institute of Mathematical Statistics, 247-275.

_______ and Spanos, A. (2011). Error statistics. In P. Bandyopadhyay and M. Forster (Volume Eds.), D. M. Gabbay, P. Thagard and J. Woods (General Eds.), Philosophy of statistics: Handbook of philosophy of science, Vol. 7 (pp. 1-46). The Netherlands: Elsevier.

Pearson, E. S. (1955). Statistical concepts in their relation to reality. Journal of the Royal Statistical Society, B, 17, 204-207.

Senn, S. (2011). You may believe you are a Bayesian but you are probably wrong. Rationality, Markets and Morals (RMM) 2, Special Topic: Statistical Science and Philosophy of Science, 48-66.


[i] Relevant references are far too numerous, but are well known; please see, for example, Mayo 1996; Mayo and Spanos 2011.

[ii] An incomplete, contemporary list includes G. Casella, D. R. Cox, J. Berger, R. Berger, J. Bernardo, R. Kass, S. Senn, C. Shalizi, and L. Wasserman.


24 thoughts on “Mayo Commentary on Gelman & Robert”

  1. Very nice discussion.
    One small disagreement:
    You write:

    “Given that Bayesian methods have inundated all teaching and applications”

    I don’t think this is true.
    In both education and practice, I think Bayesian inference
    remains a (vocal) minority position.

    –Larry

    • Larry: Thanks. All I can say is that I think one has to read my entire commentary as a whole, together with, or after reading, Gelman and Robert’s paper.

      • On the subject, I would be interested to hear recommendations of any new general/introductory, graduate level texts in statistics that would be considered to take a frequentist (error statistical) stance. Of course “inundated” has its meaning…

  2. TheIronLady

    Uh … ok, I nominate the Laplace-Jeffreys-Jaynes line of the theory for the “new” Bayesian foundation. The book “Probability Theory: The Logic of Science” gives a decent philosophical exposition of it.

    Interestingly, Laplace’s initial applications concerned whether an observed wobble resided in a heavenly body or in the telescope itself. Since the above summary, fair and just as far as it goes, shows no awareness of Jaynes, the wobble you detect in the foundations is in the telescope and not in Bayesian Statistics itself.

    Since you like testing, you’ll be especially glad to know that Jaynes tests his philosophical understanding of probability in a very wide range of circumstances far away from the usual simple examples found in introductory textbooks (something which definitely cannot be said for ‘severity’).

    I believe there are still major results left to be discovered in Statistics and that philosophy will play a decisive role in their discovery. This is probably a controversial opinion for several reasons (most people’s research seems to be a minor – albeit difficult – iteration on the same old ideas). But I’m convinced both that it’s true and that those discoveries will be made by those developing the Laplace-Jeffreys-Jaynes line.

    If you disagree, I invite you to look at Jaynes’s book and papers and compare them to any exposition of Error Statistics (which, philosophically speaking, is the state of the art in the foundations of Frequentist Statistics).

    • You’ve moved to the “iron lady” now?

      • TheIronLady

        I’m a fan. Since you’re the Iron Lady of Frequentist Statistics, I assume your response goes something like: “You turn if you want to. The lady’s not for turning!”

        • I didn’t like the movie. Far too much emphasis on describing a senile woman, never mind how amazing Streep’s make-up was.

  3. Christian Hennig

    I think that your discussion is very good and hits the nail on the head.

  4. Christian Hennig

    TheIronLady: Regardless of the high quality of Jaynes’s work, I think that the key issue in the context of the present discussion is whether the “pragmatic” Bayesian approach, potentially informed by error probabilities, as advocated by Gelman and Robert and applied by many statisticians, can be seen as “Jaynes’s philosophy in practice”. I very much doubt that. If you look at most applied statistical work coming from this (one could call it) “new wave” of Bayesians, you hardly find justifications of priors along the lines of Jaynes’s work.

    • Christian: Are you suggesting the “pragmatic” Bayesian line that informs much of applied (Bayesian) work is “informed by error probabilities” in the manner that we/I would advocate? or merely that the pragmatic Bayesians take into account “performance of some long-run”? You’re right of course about Jaynes.

      • Christian Hennig

        Mayo: When I wrote “potentially informed by error probabilities”, I made reference to what Gelman explicitly wrote in some places, which you know. So there is nothing that you don’t know already in my remark, I’m afraid.

    • TheIronLady

      Jaynes and Gelman only sometimes touch on the same stuff, but when they do they’re in strong agreement. Gelman has been explicit about Jaynes’s influence on his posterior model checking, for example. Subject to the usual caveats that one or both may get some technical matter wrong, I don’t think Jaynes would have any problem with Gelman’s work at all.

      “you hardly find justifications of priors along the lines of Jaynes’s work.”

      You don’t see people justifying their distributions much at all regardless of what kind they are. They just say “assume NIID”. It’s always been a mystery why such assumptions work in practice. Even Frequentists like Feller admitted that they work much better in practice than they should according to their official Frequentist justification.

      This is fine to some extent, but if you want to extend the applicability of statistics, at some point you’ll have to know the real reason why such assumptions work. That’s where Jaynes (or some development thereof) comes in.

      • Iron Lady: I don’t know/care if you are the Iron Lady, Fisher, Neyman, Pearson, Anon, Guest, Gelman or someone else, but the bottom line is that you are not taking up the challenges posed in my post. You repeat that your hero and one true love is Jaynes, and you say here that Gelman is like Jaynes (so far as they’re doing the same thing, which might be very little; see Hennig’s post below), but recall that Gelman also purports to be striving to satisfy error statistical criteria of the sort I espouse (see also Gelman and Shalizi on this blog). Gelman pokes fun at the assumption “beloved among Bayesian educators” that probability should enter to provide posterior probabilities, an assumption we may call probabilism, and in this he would seem to also be poking fun at Jaynes. Gelman, at least, has often claimed to reject the idea of inductive updating to reach a posterior (see also his paper in the RMM volume discussed on this blog), and further he claims to be doing something akin to Mayo-style error probabilities (one finds it mentioned also in the Gelman–Robert paper, although I don’t think it sits with Robert’s view). The bottom line is: Jaynes, so far as I can see, is seriously at odds with these points. If you’re just looking for a space to give Jaynes yet another shout out, as appears, then you’re not seriously advancing the discussion on this blog at all. Shifting your names around has not helped in that regard.

  5. Christian Hennig

    TheIronLady: For the moment the “Probability Theory” book is my main source of information about Jaynes, and I don’t find anything on calibration or model checking in it. These seem to be quite central for Gelman. Of course I’m happy if you broaden my horizon by pointing me to where Jaynes wrote on these issues.

    It is true that the problem that model assumptions are often not sufficiently justified is not an exclusive Bayesian one, but at least frequentists don’t have to justify a prior (so they only have half the work justifying their models) and it is pretty clear how a frequentist model choice could be rejected by data. It is also clear in principle how a consistent de Finettian or Jaynesian *should* justify a prior, but I’m not so sure about Gelman and “new wave” Bayes.

    • Corey

      Christian: Andrew has written of Jaynes:

      …some of his work inspired me greatly. In particular, I like his approach of assuming a strong model and then fixing it when it does not fit the data… I don’t think Jaynes ever stated this principle explicitly but he followed it in his examples. I remember one example of the probability of getting 1,2,3,4,5,6 on a roll of a die, where he discussed how various imperfections of the die would move you away from a uniform distribution… he didn’t just try to fit the data; rather, he used model misfit as information to learn more about the physical system under study.

    • Christian: How *should* a consistent de Finettian or Jaynesian justify a prior? Would the former introspect?

  6. Christian Hennig

    Corey: Thanks for this. However, no testing done in this paper involves a prior distribution of parameters. Neither is such a distribution stated, nor tested, nor is there any calibration considered where models fit on past data are evaluated against new data. One could take this as a fine and insightful paper on frequentist testing (namely the use of entropy as a test statistic) were it not for a few philosophical remarks.

    Of course this may have inspired Gelman to come up with something else, but this is very different from claiming that Jaynes’s philosophy somehow explains what Gelman is doing.

  7. Corey

    Christian: I agree with you; but it seems to me that your comment fails to address any of the points raised by TheIronLady (if that was your intention). Jaynes is pretty clear in the intro to PTLOS that he intends his philosophy to encompass both Bayesian updating and maximum entropy within the notion of probability theory as the logic of science.

  8. Christian Hennig

    Corey: In a nutshell, Mayo called for a new philosophy underlying/explaining the new Bayesian approach advocated by Gelman and Robert. TheIronLady claimed that Jaynes provides this. My point was that Jaynes may explain his own Bayesianism very well but not that of Gelman/Robert, which has a number of elements that don’t have to do with Jaynes.

    • E.Berk

      Christian: I am not understanding how Mayo can be calling “for a new philosophy underlying/explaining the new Bayesian approach advocated by Gelman and Robert.” I may be mistaken, but Gelman and Robert sound very dissimilar. They are quoted here saying that priors are “evaluations of the modeler’s uncertainty about the parameter”, but Gelman comes up on this blog as denying this. Even here, in Mayo’s commentary, “Gelman and Shalizi (2012) declare that ‘most of the standard philosophy of Bayes is wrong’ (p. 2). The widespread use of Bayesian methods does not underwrite the classic subjective inductive philosophy.” Would Robert agree to this? There was a deconstruction not too long ago where Gelman says that “a Bayesian wants everybody else to be a non-Bayesian” and not to factor in their knowledge states. Not sure Robert concurs with this. Gelman denies Bayesian induction; Robert appears to endorse it. Their article here includes two contrasting positions. This does not invalidate their case against Feller, but it does speak against looking for a meaningful, unified philosophy.

      • Christian Hennig

        OK, you can then even ask for more than one. 😉

        • E.Berk

          Christian: Ha! But if they do not cohere with each other, this is problematic for understanding what is being called for, and what is being advocated.

  9. Corey

    Christian: When I started writing this reply I was under the impression that you were being overly specific about what Mayo was calling for and therefore unfair to TheIronLady regarding her failure to provide it. Upon rereading the comment thread, I see that you are right.
