Philosophy of Statistics

Blogging Birnbaum: on Statistical Methods in Scientific Inference

I said I’d make some comments on Birnbaum’s letter (to Nature), (linked in my last post), which is relevant to today’s Seminar session (at the LSE*), as well as to (Normal Deviate‘s) recent discussion of frequentist inference–in terms of constructing procedures with good long-run “coverage”. (Also to the current U-Phil).

NATURE VOL. 225 MARCH 14, 1970 (1033)

LETTERS TO THE EDITOR

Statistical methods in Scientific Inference

 It is regrettable that Edwards’s interesting article[1], supporting the likelihood and prior likelihood concepts, did not point out the specific criticisms of likelihood (and Bayesian) concepts that seem to dissuade most theoretical and applied statisticians from adopting them. As one whom Edwards particularly credits with having ‘analysed in depth…some attractive properties” of the likelihood concept, I must point out that I am not now among the ‘modern exponents” of the likelihood concept. Further, after suggesting that the notion of prior likelihood was plausible as an extension or analogue of the usual likelihood concept (ref.2, p. 200)[2], I have pursued the matter through further consideration and rejection of both the likelihood concept and various proposed formalizations of prior information and opinion (including prior likelihood).  I regret not having expressed my developing views in any formal publication between 1962 and late 1969 (just after ref. 1 appeared). My present views have now, however, been published in an expository but critical article (ref. 3, see also ref. 4)[3] [4], and so my comments here will be restricted to several specific points that Edwards raised. Continue reading

Categories: Likelihood Principle, Statistics, U-Phil | 5 Comments

Likelihood Links [for 28 Nov. Seminar and Current U-Phil]

old blogspot typewriterDear Reader: We just arrived in London[i][ii]. Jean Miller has put together some materials for Birnbaum LP aficionados in connection with my 28 November seminar. Great to have ready links to some of the early comments and replies by Birnbaum, Durbin, Kalbfleish and others, possibly of interest to those planning contributions to the current “U-Phil“.  I will try to make some remarks on Birnbaum’s 1970 letter to the editor tomorrow.

November 28th reading

Categories: Birnbaum Brakes, Likelihood Principle, U-Phil | Leave a comment

Announcement: 28 November: My Seminar at the LSE (Contemporary PhilStat)

28 November: (10 – 12 noon):
Mayo: “On Birnbaum’s argument for the Likelihood Principle: A 50-year old error and its influence on statistical foundations”
PH500 Seminar, Room: Lak 2.06 (Lakatos building). 
London School of Economics and Political Science (LSE)

Background reading: PAPER

See general announcement here.

Background to the Discussion: Question: How did I get involved in disproving Birnbaum’s result in 2006?

Answer: Appealing to something called the “weak conditionality principle (WCP)” arose in avoiding a classic problem (arising from mixture tests) described by David Cox (1958), as discussed in our joint paper:

Cox D. R. and Mayo. D. (2010). “Objectivity and Conditionality in Frequentist Inference” in Error and Inference: Recent Exchanges on Experimental Reasoning, Reliability and the Objectivity and Rationality of Science (D Mayo & A. Spanos eds.), CUP 276-304. Continue reading

Categories: Announcement, Likelihood Principle, Statistics | 12 Comments

Irony and Bad Faith: Deconstructing Bayesians-reblog

 The recent post by Normal Deviate, and my comments on it, remind me of why/how I got back into the Bayesian-frequentist debates in 2006, as described in my first “deconstruction” (and “U-Phil”) on this blog (Dec 11, 2012):

Some time in 2006 (shortly after my ERROR06 conference), the trickle of irony and sometime flood of family feuds issuing from Bayesian forums drew me back into the Bayesian-frequentist debates.1 2  Suddenly sparks were flying, mostly kept shrouded within Bayesian walls, but nothing can long be kept secret even there. Spontaneous combustion is looming. The true-blue subjectivists were accusing the increasingly popular “objective” and “reference” Bayesians of practicing in bad faith; the new O-Bayesians (and frequentist-Bayesian unificationists) were taking pains to show they were not subjective; and some were calling the new Bayesian kids on the block “pseudo Bayesian.” Then there were the Bayesians somewhere in the middle (or perhaps out in left field) who, though they still use the Bayesian umbrella, were flatly denying the very idea that Bayesian updating fits anything they actually do in statistics.3 Obeisance to Bayesian reasoning remained, but on some kind of a priori philosophical grounds. Doesn’t the methodology used in practice really need a philosophy of its own? I say it does, and I want to provide this. Continue reading

Categories: Likelihood Principle, objective Bayesians, Statistics | Tags: , , , , | 33 Comments

Comments on Wasserman’s “what is Bayesian/frequentist inference?”

What I like best about Wasserman’s blogpost (Normal Deviate) is his clear denial that merely using conditional probability makes the method Bayesian (even if one chooses to call the conditional probability theorem Bayes’s theorem, and even if one is using ‘Bayes’s’ nets). Else any use of probability theory is Bayesian, which trivializes the whole issue. Thus, the fact that conditional probability is used in an application with possibly good results is not evidence of (yet another) Bayesian success story [i].

But I do have serious concerns that in his understandable desire (1) to be even-handed (hammers and screwdrivers are for different purposes, both perfectly kosher tools), as well as (2) to give a succinct sum-up of methods,Wasserman may encourage misrepresenting positions. Speaking only for “frequentist” sampling theorists [ii], I would urge moving away from the recommended quick sum-up of “the goal” of frequentist inference: “Construct procedures with frequency guarantees”. If by this Wasserman means that the direct aim is to have tools with “good long run properties”, that rarely err in some long run series of applications, then I think it is misleading. In the context of scientific inference or learning, such a long-run goal, while necessary is not at all sufficient; moreover, I claim, that satisfying this goal is actually just a byproduct of deeper inferential goals (controlling and evaluating how severely given methods are capable of revealing/avoiding erroneous statistical interpretations of data in the case at hand.) (So I deny that it is even the main goal to which frequentist methods direct themselves.) Even arch behaviorist Neyman used power post-data to ascertain how well corroborated various hypotheses were—never mind long-run repeated applications (see one of my Neyman’s Nursery posts). Continue reading

Categories: Error Statistics, Neyman's Nursery, Philosophy of Statistics, Statistics | 21 Comments

PhilStat: So you’re looking for a Ph.D dissertation topic?

Maybe you’ve already heard Hal Varian, Google’s chief economist: “The next sexy job in the next ten years will be statisticians.” Even Larry Wasserman declares that “statistics is sexy.” In that case, philosophy of statistics must be doubly so!

Thus one wonders at the decline of late in the lively and long-standing exchange between philosophers of science and statisticians. If you are a graduate student wondering how you might make your mark in a philosophy of science area, philosophy of statistical science, fairly brimming over with rich and open philosophical problems, may be the thing for you!* Surprising, pressing, intriguing, and novel philosophical twists on both traditional and cutting-edge controversies are going begging for analysis—they not only bear on many areas of popular philosophy but also may offer you ways of getting out in front of them.

I came across a spotty blog by Pitt graduate student Gregory Gandenberger awhile back (not like his new, frequently updated one) where he was wrestling with a topic for his masters thesis, and some years later, wrangling over dissertation topics in philosophy of statistics. After I started this blog, I looked for it again, and now I’ve invited him to post, on the topic of his choice, as he did here, and I invite other graduate students though the U-Phil call. Continue reading

Categories: Error Statistics, philosophy of science, Philosophy of Statistics | 3 Comments

U-Phil: Blogging the Likelihood Principle: New Summary

U-Phil: I would like to open up this post, together with Gandenberger’s (Oct. 30, 2012), to reader U-Phils, from December 6- 19 (< 1000 words) for posting on this blog (please see # at bottom of post).  Where Gandenberger claims, “Birnbaum’s proof is valid and his premises are intuitively compelling,” I have shown that if Birnbaum’s premises are interpreted so as to be true, the argument is invalid.  If construed as formally valid, I argue, the premises contradict each other. Who is right?  Gandenberger doesn’t wrestle with my critique of Birnbaum, but I invite you (and Greg!) to do so. I’m pasting a new summary of my argument below.

 The main premises may be found on pp. 11-14. While these points are fairly straightforward (and do not require technical statistics), they offer an intriguing logical, statistical and linguistic puzzle. The following is an overview of my latest take on the Birnbaum argument. See also “Breaking Through the Breakthrough” posts: Dec. 6 and Dec 7, 2011.

Gandenberger also introduces something called the methodological likelihood principle. A related idea for a U-Phil is to ask: can one mount a sound, non-circular argument for that variant?  And while one is at it, do his methodological variants of sufficiency and conditionality yield plausible principles?

Graduate students and others invited!

______________________________________________________

New Summary of Mayo Critique of Birnbaum’s Argument for the SLP
Deborah Mayo
See also a (draft) of the full PAPER corresponding to this summary, a later and more satisfactory draft is here. Yet other links to the Strong Likelihood Principle SLP: Mayo 2010; Cox & Mayo 2011 (appendix).

Continue reading

Categories: Likelihood Principle, Statistics, U-Phil | 19 Comments

Guest Post: Greg Gandenberger, “Evidential Meaning and Methods of Inference”

Evidential Meaning and Methods of Inference

Greg Gandenberger
PhD student, History and Philosophy of Science
Master’s student, Statistics
University of Pittsburgh

Bayesian methods conform to the Likelihood Principle, while frequentist methods do not.  Thus, proofs of the Likelihood Principle* such as Birnbaum’s (1962) appear to be threats to frequentist positions.  Deborah Mayo has recently argued that Birnbaum’s proof is no threat to frequentist positions because it is invalid (Ch. 7(III) in Mayo and Spanos 2010).  In my view, Birnbaum’s proof is valid and his premises are intuitively compelling.  Nevertheless, I agree with Professor Mayo that the proof, properly understood, does not imply that frequentist methods should not be used.

There are actually at least two different Likelihood Principles: one, which I call the Evidential Likelihood Principle, says that the evidential meaning of an experimental outcome with respect to a set of hypotheses depends only on its likelihood function for those hypothesis (i.e., the function that maps each of those hypotheses to the probability it assigns to that outcome, defined up to a constant of proportionality); the other, which I call the Methodological Likelihood Principle, says that a statistical method should not be used if it can generate different conclusions from outcomes that have the same likelihood function, without a relevant difference in utilities or prior probabilities. Continue reading

Categories: Likelihood Principle | 17 Comments

Reblogging: Oxford Gaol: Statistical Bogeymen

Reblogging 1 year ago in Oxford: Oxford Jail is an entirely fitting place to be on Halloween!

Moreover, rooting around this rather lavish set of jail cells (what used to be a single cell is now a dressing room) is every bit as conducive to philosophical reflection as is exile on Elba!  My goal (while in this gaol—as the English sometimes spell it) is to try and free us from the bogeymen and bogeywomen often associated with “classical” statistics. As a start, the very term “classical statistics” should I think be shelved, not that names should matter.

In appraising statistical accounts at the foundational level, we need to realize the extent to which accounts are viewed through the eyeholes of a mask or philosophical theory.  Moreover, the mask some wear while pursuing this task might well be at odds with their ordinary way of looking at evidence, inference, and learning. In any event, to avoid non-question-begging criticisms, the standpoint from which the appraisal is launched must itself be independently defended.   But for Bayesian critics of error statistics the assumption that uncertain inference demands a posterior probability for claims inferred is thought to be so obvious as not to require support. Critics are implicitly making assumptions that are at odds with the frequentist statistical philosophy. In particular, they assume a certain philosophy about statistical inference (probabilism), often coupled with the allegation that error statistical methods can only achieve radical behavioristic goals, wherein all that matters are long-run error rates (of some sort) Continue reading

Categories: Error Statistics, Philosophy of Statistics | Tags: , | Leave a comment

Mayo: (section 7) “StatSci and PhilSci: part 2″

Here is the final section (7) of my paper: “Statistical Science Meets Philosophy of Science Part 2” SS & POS 2.* Section 6 is in my last post.

7. Can/Should Bayesian and Error Statistical Philosophies Be Reconciled?

Stephen Senn makes a rather startling but doubtlessly true remark:

The late and great George Barnard, through his promotion of the likelihood principle, probably did as much as any statistician in the second half of the last century to undermine the foundations of the then dominant Neyman-Pearson framework and hence prepare the way for the complete acceptance of Bayesian ideas that has been predicted will be achieved by the De Finetti-Lindley limit of 2020. (Senn 2008, 459)

Many do view Barnard as having that effect, even though he himself rejected the likelihood principle (LP). One can only imagine Savage’s shock at hearing that contemporary Bayesians (save true subjectivists) are lukewarm about the LP! The 2020 prediction could come to pass, only to find Bayesians practicing in bad faith. Kadane, one of the last of the true Savage Bayesians, is left to wonder at what can only be seen as a Pyrrhic victory for Bayesians.

Continue reading

Categories: Error Statistics, Philosophy of Statistics, Statistics | Leave a comment

Mayo: (section 6) “StatSci and PhilSci: part 2″

Here is section 6 of my paper: “Statistical Science Meets Philosophy of Science Part 2: Shallow versus Deep Explorations” SS & POS 2. Section 5 is in my last post.

6. Some Knock-Down Criticisms of Frequentist Error Statistics

 With the error-statistical philosophy of inference under our belts, it is easy to run through the classic and allegedly damning criticisms of frequentist errorstatistical methods. Open up Bayesian textbooks and you will find, endlessly reprised, the handful of ‘counterexamples’ and ‘paradoxes’ that make up the charges leveled against frequentist statistics, after which the Bayesian account is proffered as coming to the rescue. There is nothing about how frequentists have responded to these charges; nor evidence that frequentist theory endorses the applications or interpretations around which these ‘chestnuts’ revolve.

If frequentist and Bayesian philosophies are to find common ground, this should stop. The value of a generous interpretation of rival views should cut both ways. A key purpose of the forum out of which this paper arises is to encourage reciprocity.

Continue reading

Categories: Error Statistics, philosophy of science, Philosophy of Statistics | Leave a comment

Mayo: (section 5) “StatSci and PhilSci: part 2”

Here is section 5 of my new paper: “Statistical Science Meets Philosophy of Science Part 2: Shallow versus Deep Explorations” SS & POS 2. Sections 1 and 2 are in my last post.*

5. The Error-Statistical Philosophy

I recommend moving away, once and for all, from the idea that frequentists must ‘sign up’ for either Neyman and Pearson, or Fisherian paradigms. As a philosopher of statistics I am prepared to admit to supplying the tools with an interpretation and an associated philosophy of inference. I am not concerned to prove this is what any of the founders ‘really meant’.

Fisherian simple-significance tests, with their single null hypothesis and at most an idea of  a directional alternative (and a corresponding notion of the ‘sensitivity’ of a test), are commonly distinguished from Neyman and Pearson tests, where the null and alternative exhaust the parameter space, and the corresponding notion of power is explicit. On the interpretation of tests that I am proposing, these are just two of the various types of testing contexts appropriate for different questions of interest. My use of a distinct term, ‘error statistics’, frees us from the bogeymen and bogeywomen often associated with ‘classical’ statistics, and it is to be hoped that that term is shelved. (Even ‘sampling theory’, technically correct, does not seem to represent the key point: the sampling distribution matters in order to evaluate error probabilities, and thereby assess corroboration or severity associated with claims of interest.) Nor do I see that my comments turn on whether one replaces frequencies with ‘propensities’ (whatever they are). Continue reading

Categories: Error Statistics, philosophy of science, Philosophy of Statistics, Severity | 5 Comments

Mayo: (first 2 sections) “StatSci and PhilSci: part 2”

Here are the first two sections of my new paper: “Statistical Science Meets Philosophy of Science Part 2: Shallow versus Deep Explorations” SS & POS 2. (Alternatively, go to the RMM page and scroll down to the Sept 26, 2012 entry.)

1. Comedy Hour at the Bayesian Retreat[i]

 Overheard at the comedy hour at the Bayesian retreat: Did you hear the one about the frequentist…

 “who defended the reliability of his radiation reading, despite using a broken radiometer, on the grounds that most of the time he uses one that works, so on average he’s pretty reliable?”

or

 “who claimed that observing ‘heads’ on a biased coin that lands heads with probability .05 is evidence of a statistically significant improvement over the standard treatment of diabetes, on the grounds that such an event occurs with low probability (.05)?”

Such jests may work for an after-dinner laugh, but if it turns out that, despite being retreads of ‘straw-men’ fallacies, they form the basis of why some statisticians and philosophers reject frequentist methods, then they are not such a laughing matter. But surely the drubbing of frequentist methods could not be based on a collection of howlers, could it? I invite the reader to stay and find out. Continue reading

Categories: Error Statistics, Philosophy of Statistics, Severity | 2 Comments

Query

I was reviewing blog comments and various links people have sent me. I have noticed a kind of comment often arises about a type of (subjective?) Bayesian who does not assign probabilities to a general hypothesis H but only to observable events. In this way, it is claimed, one can avoid various criticisms but retain the Bayesian position, label it (A):

(A) the warrant accorded to an uncertain claim is in terms of probability assignments (to events).

But what happens when H’s predictions are repeatedly and impressively born out in a variety of experiments? Either one can say nothing about the warrant for H (having assumed A), or else one seeks a warrant for H other than a probability assignment to  H*.

Take the former. In that case what good is it to have passed many of H’s predictions? We cannot say we have grounds to accept H in some non-probabilistic sense (since that’s been ruled out by (A)). We also cannot say that the impressive successes in the past warrant predicting that future successes are probable because events do not warrant other events. It is only through some general claim or statistical hypothesis that we may deduce predicted probabilities of events. Continue reading

Categories: Philosophy of Statistics, Statistics | 50 Comments

RMM-8: New Mayo paper: “StatSci and PhilSci: part 2 (Shallow vs Deep Explorations)”

A new article of mine,  “Statistical Science and Philosophy of Science Part 2: Shallow versus Deep Explorations” has been published in the on-line journal, Rationality, Markets, and Morals (Special Topic: Statistical Science and Philosophy of Science: Where Do/Should They Meet?”).

The contributions to this special volume began with the conference we ran in June 2010. (See web poster.)   My first article in this collection was essentially just my introduction to the volume, whereas this new one discusses my work. If you are a reader of this blog, you will recognize portions from early posts, as I’d been revising it then.

The sections are listed below. I will be posting portions in the next few days. We invite comments for this blog, and for possible publication in this special volume of RMM, if received before the end of this year.

This is the 8th RMM announcement. Many thanks to Sailor for digging up the previous 7, and listing them at the end*. (The paper’s title stemmed from the Deepwater Horizon oil spill of spring 2010**).

Abstract:

Inability to clearly defend against the criticisms of frequentist methods has turned many a frequentist away from venturing into foundational battlegrounds. Conceding the distorted perspectives drawn from overly literal and radical expositions of what Fisher, Neyman, and Pearson ‘really thought’, some deny they matter to current practice. The goal of this paper is not merely to call attention to the howlers that pass as legitimate criticisms of frequentist error statistics, but also to sketch the main lines of an alternative statistical philosophy within which to better articulate the roles and value of frequentist tools.

Statistical Science Meets Philosophy of Science Part 2:
Shallow versus Deep Explorations

 1. Comedy Hour at the Bayesian Retreat

2. Popperians Are to Frequentists as Carnapians Are to Bayesians
2.1 Severe Tests
2.2 Another Egregious Violation of the Severity Requirement
2.3 The Rationale for Severity is to Find Things Out Reliably
2.4 What Can Be Learned from Popper; What Can Popperians Be Taught?

3. Frequentist Error-Statistical Tests
3.1 Probability in Statistical Models of Experiments
3.2 Statistical Test Ingredients
3.3. Hypotheses and Events
3.4. Hypotheses Inferred Need Not Be Predesignated Continue reading

Categories: Philosophy of Statistics, Statistics | Tags: , | Leave a comment

Mayo Responds to U-Phils on Background Information

Thanks to Emrah Aktunc and Christian Hennig for their U-Phils on my September 12 post: “How should ‘prior information’ enter in statistical inference?” and my subsequent deconstruction of Gelman[i] (starting here, and ending with part 3).  I’ll begin with some remarks on Emrah Aktunc’s contribution.

First, we need to avoid an ambiguity that clouds prior information and prior probability. In a given experiment, prior information may be stronger than the data: to take but one example, say that we’ve already falsified Newton’s theory of gravity in several domains, but in our experiment the data (e.g., one of the sets of eclipse data from 1919) accords with the Newtonian prediction (of half the amount of deflection as that predicted by Einstein’s general theory of relativity [GTR]). The pro-Newton data, in and of itself, would be rejected because of all that we already know. Continue reading

Categories: Background knowledge, Error Statistics, Philosophy of Statistics, Statistics, U-Phil | Tags: , , | 4 Comments

U-Phils: Hennig and Aktunc on Gelman 2012

I am posting two U-Phils I received in relation to the 9/12 call call on Andrew Gelman’s (2012): “Ethics and the statistical use of prior information”

A Deconstruction of Gelman by Mayo in 3 parts:
(10/5/12) Part 1: “A Bayesian wants everybody else to be a non-Bayesian”
(10/7/12) Part 2: Using prior Information
(10/9/12) Part 3: beauty and the background knowledge

Comments on “How should prior information enter in statistical inference”

Christian Hennig 
Department of Statistical Science
University College London

Reading the blog entries on this topic, the Cox-Mayo Conversation and the linked paper by Gelman, I appreciate the valuable thoughts in both, which to me all make sense, specifying situations where prior information is rather not desired to enter, or rather in the Bayesian way.

Thinking more about the issue, however, I find both the frequentist and the Bayesian approach seriously wanting in this respect (and I don’t have a better one myself either).

A difference between the approaches seems to be that Cox/Mayo rather look at the analysis of data in an isolated situation whereas Gelman rather writes about conclusions from not only analysing a particular data set, but from aggregating all the information available.

Cox/Mayo do not advocate to ignore prior knowledge, but they prefer to keep it out of the process of actually analysing the data. Mayo talks of a piecemeal approach in which results from different data analyses can be put together in order to get an overall picture. Continue reading

Categories: Background knowledge, Error Statistics, Philosophy of Statistics, Statistics, Testing Assumptions, U-Phil | 11 Comments

Last part (3) of the deconstruction: beauty and background knowledge

Please see parts 1 and 2 and links therein. The background began in my Sept 12 post.

Gelman (2012) considers a case where the overall available evidence, E, is at odds with the indication of the results x from a given study:

Consider the notorious study in which a random sample of a few thousand people was analyzed, and it was found that the most beautiful parents were 8 percentage points more likely to have girls, compared to less attractive parents. The result was statistically significant (p<.05) and published in a reputable journal. But in this case we have good prior information suggesting that the difference in sex ratios in the population, comparing beautiful to less-beautiful parents, is less than 1 percentage point. A (non-Bayesian) design analysis reveals that, with this level of true difference, any statistically-significant observed difference in the sample is likely to be noise. At this point, you might well say that the original analysis should never have been done at all—but, given that it has been done, it is essential to use prior information (even if not in any formal Bayesian way) to interpret the data and generalize from sample to population.

Where did Fisher’s principle go wrong here? The answer is simple—and I think Cox would agree with me here. We’re in a setting where the prior information is much stronger than the data. (p. 3)

Let me simply grant Gelman that this prior information warrants (with severity) the hypothesis H:

H: “difference in sex ratios in the population, comparing beautiful to less-beautiful parents, is less than 1 percentage point,” (ibid.)

especially given my suspicions of the well-testedness of claims to show the effects of “beautiful to less-beautiful” on anything. I will simply take it as a given that it is well-tested background “knowledge.” Presumably, the well-tested claim goes beyond those individuals observed, and is generalizing at least to some degree. So we are given that the hypothesis H is one for which there is strong evidence. Continue reading

Categories: Background knowledge, Error Statistics, Philosophy of Statistics, Statistics, U-Phil | 14 Comments

Deconstructing Gelman part 2: Using prior Information

(Please see part 1 for links and references):

A Bayesian, Gelman tells us, “wants everybody else to be a non-Bayesian” (p. 5). Despite appearances, the claim need not be seen as self-contradictory, at least if we interpret it most generously, as Rule #2 (of this blog) directs. Whether or not “a Bayesian” refers to all Bayesians or only non-standard Bayesians (i.e., those wearing a hat of which Gelman approves), his meaning might be simply that when setting out with his own inquiry, he doesn’t want your favorite priors (be that beliefs or formally derived constructs) getting in the way. A Bayesian, says Gelman (in this article) is going to make inferences based on “trying to extract information from the data” in order to determine what to infer or believe (substitute your preferred form of output) about some aspect of a population (or mechanism) generating the data, as modeled. He just doesn’t want the “information from the data” muddied by your particular background knowledge. He would only have to subtract out all of this “funny business” to get at your likelihoods. He would only have to “divide away” your prior distributions before getting to his own analysis (p. 5). As in Gelman’s trial analogy (p. 5.), he prefers to combine your “raw data,” and your likelihoods, with his own well-considered background information. We can leave open whether he will compute posteriors (at least in the manner he recommends here) or not (as suggested in other work). So perhaps we have arrived at a sensible deconstruction of Gelman, free of contradiction. Whether or not this leaves texts open to some charge of disingenuity, I leave entirely to one side.

Now at this point I wonder: do Bayesian reports provide the ingredients for such “dividing away”?  I take it that they’d report the priors, which could be subtracted out, but how is the rest of the background knowledge communicated and used? It would seem to include assorted background knowledge of instruments, of claims that had been sufficiently well corroborated to count as knowledge, of information about that which was not previously well tested, of flaws and biases and threats of error to take into account in future designs, etc. (as in our ESP examples 9/22 and 9/25). The evidence for any background assumptions should also be made explicit and communicated (unless it consists of trivial common knowledge). Continue reading

Categories: Background knowledge, Error Statistics, Philosophy of Statistics, Statistics | 12 Comments

Deconstructing Gelman, Part 1: “A Bayesian wants everybody else to be a non-Bayesian.”

I was to have philosophically deconstructed a few paragraphs from (the last couple of sections) in a column Andrew Gelman sent me on “Ethics and the statistical use of prior information”[i]. The discussion begins with my Sept 12 post, and follows through several posts over the second half of September (see [ii]), all by way of background. But I got called away before finishing the promised deconstruction, and it was only this evening that I tried to wade through a circuitous swamp of remarks. I will just post the first part (of 2 or perhaps 3?), which is already too long.

Since I have a tendency to read articles from back to front, on a first read at least, let me begin with his last section titled:  “A Bayesian wants everybody else to be a non-Bayesian.”  Surely that calls for philosophical deconstruction, if anything does. It seems at the very least an exceptional view. Whether it’s widely held I can’t say (please advise). But suppose it’s true: Bayesians are publicly calling on everybody to use Bayesian methods, even though, deep down, they really, really hope everybody else won’t blend everything together before they can use the valid parts from the data—and they really, really hope that everybody else will provide the full panoply of information about what happened in other experiments, and what background theories are well corroborated, and about the precision of the instruments relied upon, and about other experiments that appear to conflict with the current one and with each other, etc., etc. Suppose that Bayesians actually would prefer, and are relieved to find, that, despite their exhortations, “everybody else” doesn’t report their posterior probabilities (whichever version of Bayesianism they are using) because then they can introduce their own background and figure out what is and is not warranted (in whatever sense seems appropriate).

At first glance, I am tempted to say that I don’t think Gelman really believes this statement himself if it were taken literally. Since he calls himself a Bayesian, at least of a special sort, then if he is wearing his Bayesian hat when he advocates others be non-Bayesian, then the practice of advocating others be non-Bayesian would itself be a Bayesian practice (not a non-Bayesian practice). But we philosophers know the danger of suggesting that authors under our scrutiny do not mean what they say—we may be missing their meaning and interpreting their words in a manner that is implausible. Though we may think, through our flawed interpretation, that they cannot possibly mean what they say, what we have done is substitute a straw view for the actual view (the straw man fallacy). (Note: You won’t get that I am mirroring Gelman unless you look at the article that began this deconstruction here.) Rule #2 of this blog[iii] is to interpret any given position in the most generous way possible; to do otherwise is to weaken our critical evaluation of it. This requires that we try to imagine a plausible reading, taking into account valid background information (e.g., other writings) that might bolster plausibility. This, at any rate, is what we teach our students in philosophy. So to begin with, what does Gelman actually say in the passage (in Section 4)?

“Bayesian inference proceeds by taking the likelihoods from different data sources and then combining them with a prior distribution (or, more generally, a hierarchical model). The likelihood is key. . . . No funny stuff, no posterior distributions, just the likelihood. . . . I don’t want everybody coming to me with their posterior distribution—I’d just have to divide away their prior distributions before getting to my own analysis. Sort of like a trial, where the judge wants to hear what everybody saw—not their individual inferences, but their raw data.” (p.5)

So if this is what he means by being a non-Bayesian, then his assertion that “a Bayesian wants everybody else to be a non-Bayesian” seems to mean that Bayesians want others to basically report their likelihoods. But again, if Gelman is wearing his Bayesian hat when he advocates others not wear theirs, i.e., be non-Bayesian, then his advising that everybody else not be Bayesian (in the sense of not combining priors and likelihoods), is itself a Bayesian practice (not a non-Bayesian practice). So either Gelman is not wearing his Bayesian hat when he recommends this, or his claim is self-contradictory—and I certainly do not want to attribute an inconsistent position to him. Moreover, I am quite certain that he would not advance any such inconsistent position.

Now, I do have some background knowledge. To ignore it is to fail to supply the most generous interpretation. Our background information—that is, Gelman’s (2011) RMM paper [iv]—tells me that he rejects the classic inductive philosophy that he has (correctly) associated with the definition of Bayesianism found on Wikipedia:

“Our key departure from the mainstream Bayesian view (as expressed, for example, [in Wikipedia]) is that we do not attempt to assign posterior probabilities to models or to select or average over them using posterior probabilities. Instead, we use predictive checks to compare models to data and use the information thus learned about anomalies to motivate model improvements” (p. 71).

So now Gelman’s assertion that “a Bayesian wants everybody else to be a non-Bayesian” makes sense and is not self-contradictory. Bayesian, in the term non-Bayesian, would mean something like a standard inductive Bayesian (where priors can be subjective or non-subjective). Gelman’s non-standard Bayesian wants everybody else not to be standard inductive Bayesians, but rather, something more akin to a likelihoodist. (I don’t know whether he wants only the likelihoods rather than the full panoply of background information, but I will return to this.) If Gelman’s Bayesian is not going to assign posterior probabilities to models, or select or average over them using posterior probabilities, then it’s pretty clear he will not find it useful to hear a report of your posterior probabilities. To allude to his trial analogy, the judge surely doesn’t want to hear your posterior probability in Ralph’s guilt, if he doesn’t even think it’s the proper way of couching inferences. Perhaps the judge finds it essential to know whether mistaken judgments of the pieces of evidence surrounding Ralph’s guilt have been well or poorly ruled out.That would be to require an error probabilistic assessment.

But a question might be raised: By “a Bayesian,” doesn’t Gelman clearly mean Bayesians in general, and not just one? And if he means all Bayesians, it would be wrong to think, as I have, that he was alluding to non-standard Bayesians (i.e., those wearing a hat of which Gelman approves). But there is no reason to suppose he means all Bayesians rather than all Bayesians who reject standard, Wiki-style Bayesianism, but instead favor something closer to the view in Gelman 2011, among other places.

Having gotten this far, however, I worry about using the view in Gelman 2011 to deconstruct the passages in the current article, in which, speaking of a Bayesian combining prior distributions and likelihoods, Gelman sounds more like a standard Bayesian. It would not help that he may be alluding to Bayesians in general for purposes of the article, because it is in this article that we find the claim: “A Bayesian wants everybody else to be a non-Bayesian.” So despite my attempts to sensibly deconstruct him, it appears that we are back to the initial problem, in which his claim that a Bayesian wants everybody else to be a non-Bayesian looks self-contradictory or at best disingenuous—and this in a column on ethics in statistics!

But we are not necessarily led to that conclusion!  Stay tuned for part 2, and part 3…..

(On how to do a philosophical analysis see here.)

[i]Gelman, A. “Ethics and the statistical use of prior information”

[ii] The main posts, following the first one, were:

More on using background info (9/15/12)
Statistics and ESP research (Diaconis) (9/22/12)
Insevere tests and pseudoscience (9/25/12)
Levels of inquiry (9/26/12)

[iii] This the Philosopher’s rule of “generous interpretation”, first introduced in this post.

[iv] Gelman, A. (2011).  “Induction and Deduction in Bayesian Data Analysis“, Rationality,  Markets, and Morals (RMM) 2, 67-78.

Categories: Background knowledge, Philosophy of Statistics, Statistics | 2 Comments

Blog at WordPress.com.