The search for an agreement on numbers across different statistical philosophies is an understandable pastime in foundations of statistics. Perhaps identifying matching or unified numbers, apart from what they might mean, would offer a glimpse of shared underlying goals? Jim Berger (2003) assures us there is no sacrilege in agreeing on methodology without philosophy, claiming “while the debate over interpretation can be strident, statistical practice is little affected as long as the reported numbers are the same” (Berger, 2003, p. 1).
Do readers agree?
Neyman and Pearson (or perhaps it was mostly Neyman) set out to determine when tests of statistical hypotheses may be considered “independent of probabilities a priori” (p. 201). In such cases, frequentists and Bayesians may agree on a critical or rejection region.
The agreement between “default” Bayesians and frequentists in the case of one-sided Normal (IID) testing (known σ) is very familiar. As noted in Ghosh, Delampady, and Samanta (2006, p. 35), if we wish to reject a null value when “the posterior odds against it are 19:1 or more, i.e., if posterior probability of H0 is < .05”, then the rejection region matches that of the corresponding frequentist test of H0 at the .05 level. By contrast, they go on to note the also-familiar fact that the frequentist and the Bayesian would disagree if one were instead testing the two-sided H0: μ = μ0 vs. H1: μ ≠ μ0 with known σ. In fact, the same outcome that would be regarded as evidence against the null in the one-sided test (for the default Bayesian and frequentist) can result in statistically significant results being construed by the Bayesian as no evidence against the null, or even as evidence for it (due to a spiked prior).[i]
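The one-sided agreement is easy to check numerically. A minimal sketch, with assumed numbers (μ0 = 0, σ = 1, n = 25, observed mean 0.4) and assuming the “default” Bayesian uses the improper uniform prior on μ, under which the posterior of μ given the sample mean is Normal:

```python
import math

def Phi(z):
    """Standard Normal CDF via the error function."""
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))

# Assumed numbers for illustration: one-sided H0: mu <= mu0 vs H1: mu > mu0,
# Normal data, known sigma = 1, n = 25, observed sample mean xbar = 0.4.
mu0, sigma, n, xbar = 0.0, 1.0, 25, 0.4
se = sigma / math.sqrt(n)
z = (xbar - mu0) / se

# Frequentist one-sided p-value
p_value = 1 - Phi(z)

# Default-Bayes posterior P(H0 | xbar) under the (improper) uniform prior
# on mu, which makes the posterior of mu equal to N(xbar, se^2)
post_H0 = Phi((mu0 - xbar) / se)

print(p_value, post_H0)  # the two numbers coincide
```

So “posterior probability of H0 < .05” picks out exactly the .05 rejection region. Note that this matching relies on the improper flat prior, which foreshadows Hartigan’s point below.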
J. A. Hartigan (1971), commenting on David Bartholomew, gives a five-line argument that while Bayesian and frequentist intervals may sometimes agree under improper priors, they never exactly agree under proper priors (see below [ii]). But improper priors are not considered to provide degrees of belief (not even being proper probabilities). This would seem to suggest that when frequentists and Bayesians agree on numbers, the prior cannot be construed as a proper degree-of-belief assignment.
What say you?
Berger, J. (2003), “Could Fisher, Jeffreys and Neyman Have Agreed on Testing?”, Statistical Science 18, 1–12.
Neyman, J. and Pearson, E. S. (1967), “The Testing of Statistical Hypotheses in Relation to Probabilities a priori”, in Joint Statistical Papers of J. Neyman and E. S. Pearson, 186–202.
Bartholomew, D. J. (1971), “A Comparison of Frequentist and Bayesian Approaches to Inferences With Prior Knowledge,” in Godambe and Sprott (eds.), Foundations of Statistical Inference, 417–429.
Ghosh, J. K., Delampady, M., and Samanta, T. (2006), An Introduction to Bayesian Analysis: Theory and Methods, Springer.
Mayo, D. G. (2003), Comment on J. O. Berger’s “Could Fisher, Jeffreys and Neyman Have Agreed on Testing?”, Statistical Science 18, 19–24.
[i] But not all default Bayesians endorse the spiked priors here, meaning there is a lack of agreement on numbers even within the same philosophical school.
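To see how a spiked prior drives the two-sided divergence, here is a sketch with assumed numbers (σ = 1, n = 100, P(H0) = 1/2, and μ ~ N(μ0, 1) under H1): a result just significant at the .05 level can leave the posterior probability of H0 above 1/2.

```python
import math

def Phi(z):
    """Standard Normal CDF via the error function."""
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))

def norm_pdf(x, mean, sd):
    return math.exp(-0.5 * ((x - mean) / sd) ** 2) / (sd * math.sqrt(2 * math.pi))

# Assumed numbers: two-sided H0: mu = 0 vs H1: mu != 0, known sigma = 1,
# n = 100; spiked prior P(H0) = 1/2, with mu ~ N(mu0, tau^2), tau = 1, under H1.
mu0, sigma, n, tau = 0.0, 1.0, 100, 1.0
se = sigma / math.sqrt(n)
z = 1.96                        # just significant at the .05 level (two-sided)
xbar = mu0 + z * se

p_value = 2 * (1 - Phi(z))      # about .05

# Marginal density of xbar under H0, and under H1 with mu integrated out
f0 = norm_pdf(xbar, mu0, se)
f1 = norm_pdf(xbar, mu0, math.sqrt(se**2 + tau**2))
bf01 = f0 / f1
post_H0 = bf01 / (1 + bf01)     # posterior probability of H0 at prior odds 1:1

print(round(p_value, 3), round(post_H0, 2))  # significant p, yet P(H0|x) > 1/2
```

With these numbers the statistically significant outcome is, for this Bayesian, evidence *for* the null; increasing n makes the conflict more extreme.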
[ii] Here are the five lines:
We need P(θ < θα(x) | θ) = α for all θ.
Assume θα(x) has positive density over the line, for all θ.
Then P(θ < θα(x) | θ, θα(x) > 0) = α(θ) > α.
So P(θ < θα(x) | θα(x) > 0) > α, averaging over θ.
So P(θ < θα(x) | x) = α is impossible.
(J. A. Hartigan, comment on D. J. Bartholomew (1971), “Comparison of Frequentist and Bayesian Approaches to Inference with Prior Knowledge”, in Godambe and Sprott (eds.), Foundations of Statistical Inference, p. 432)
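Hartigan’s conclusion can be checked numerically in a toy case of my own (not his notation): with X ~ N(θ, 1) and the proper prior θ ~ N(0, 1), the posterior is N(x/2, 1/2), so the upper 95% credible bound is x/2 + z.95/√2, and its frequentist coverage depends on θ rather than holding at .95 for every θ:

```python
import math

def Phi(z):
    """Standard Normal CDF via the error function."""
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))

# Assumed toy setup: X ~ N(theta, 1), proper prior theta ~ N(0, 1).
# Posterior: N(x/2, 1/2); upper 95% credible bound: x/2 + z95/sqrt(2).
z95 = 1.6449  # upper .95 standard Normal quantile

def coverage(theta):
    # Frequentist coverage P(theta < X/2 + z95/sqrt(2) | theta)
    #   = P(X > 2*theta - sqrt(2)*z95 | theta) = 1 - Phi(theta - sqrt(2)*z95)
    return 1 - Phi(theta - math.sqrt(2) * z95)

for theta in (0.0, 1.0, 2.0, 3.0):
    print(theta, round(coverage(theta), 3))
```

Coverage is near .99 at θ = 0 but falls to about .25 at θ = 3, so it cannot equal α for all θ, just as the five lines conclude for any proper prior.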