Below are the slides from my Rutgers seminar for the Department of Statistics and Biostatistics yesterday, since some people have been asking me for them. The abstract is here. I don’t know how explanatory a bare outline like this can be, but I’d be glad to try and answer questions[i]. I am impressed at how interested in foundational matters I found the statisticians (both faculty and students) to be. (There were even a few philosophers in attendance.) It was especially interesting to explore, prior to the seminar, possible connections between severity assessments and confidence distributions, where the latter are along the lines of Min-ge Xie (some recent papers of his may be found here.)

**“Probing with Severity: Beyond Bayesian Probabilism and Frequentist Performance”**

[i]They had requested a general overview of some issues in philosophical foundations of statistics. Much of this will be familiar to readers of this blog.

Mayo, Thank you for visiting us. We really enjoyed your talk. It received much positive feedback from both our faculty and students. Statisticians and philosophers often work in isolation, even though we share many common ideas and have common goals. I think more communication will definitely benefit both our disciplines. I look forward to keeping up our communications. Cheers to all!

I will check out the slides and the abstract this weekend

Mayo, Read these slides online, with interest. For me it was the clearest explanation of your positions I’ve seen so far. http://www.slideshare.net/jemille6/mayo-2014-rutgers-1?redirected_from=save_on_embed In case of interest, I noted a minor error on 51: You say “you would erroneously interpret data with probability 0.0000003” but you should have added “if H0 were true”, which you mention on 52. Also, there seems to be another omission on 63: “Probativism says H is not justified unless something has been done to probe ways we can be wrong about H.” Only “something”? Like reading chicken entrails? Surely you require something more than just “something” (which reads as if it were anything at all). Best Wishes, Sander

Hi Sander: Good to hear from you.

First I should say that these slides were just notes, and I got through them (without rushing) in a mere 45 minutes, as intended. In short, I left out a lot and just told the story in my own words. I really shouldn’t post things like this because they are just rather sloppy outlines for me (and the audience), and not intended to be quoted. It’s only because some people asked for them. There’s a paper or two or three with most of this material that I can link to (e.g., from the RMM volume).

I guess I’m both pleased and disappointed that this was the clearest explanation of my position that you’ve seen so far. Granted I tried out one new spin: we get Popperian hopes for probativeness first and then Probabilism/Performance afterwards. Do you think that’s what helped? Or maybe after rubbing it in enough times, it seeps through (I hope)?

On the corrections, I grant that “something has been done” is the very weakest. On the earlier, p. 51, I think it’s OK because it’s giving a rule. It says if you follow the rule of rejecting the null whenever you have 5 sigma bumps, then you erroneously reject with probability .0000003, or however many 0’s that was. The “erroneously” already signifies the null is true.
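A quick numeric check of the rule described above (a minimal sketch in Python, using only the standard library; the 5-sigma cutoff is the one from the slides):

```python
from math import erfc, sqrt

# P(Z >= 5) for a standard Normal: the probability of a 5-sigma "bump"
# under the null, i.e. the error probability of the rule "reject the
# null whenever the observed effect exceeds 5 sigma".
p_5sigma = 0.5 * erfc(5 / sqrt(2))
print(p_5sigma)  # roughly 2.9e-07, i.e. about 0.0000003
```

So the figure on the slide, with its string of zeros, is just the upper tail of the standard Normal beyond 5.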

We love Dr. Mayo’s Popperian premises and her analysis of the importance of severity. Our only (friendly) criticism is this: why can’t Bayesian methods be used to assess the severity of scientific tests? In other words, we agree with Dr Mayo’s point about the need to assess the severity of a given test (i.e. the need to test our tests). We would only add that the Bayesian approach is one such way of making this assessment. In short, we update our Bayesian priors up or down depending on the “severity” or “genuineness” of the method being used to test a hypothesis or claim. The more severe or genuine a test is, the more we are justified in updating our priors in a certain direction …

Enrique: I’m very happy that you like my analysis of the importance of severity. Your interesting and provocative remark about a Bayesian-severity connection deserves an analysis of at least 3 or 4 levels of depth, but now I’m locked into some deadlines + travel and only have time for 1-2: I should perhaps turn your comment into a post, so that we can officially address it. Maybe later.

First, a link to a response I gave to a similar remark by George Casella: http://www.phil.vt.edu/dmayo/personal_website/evidence_MAYO_opt.pdf

pp. 105-9.

Mayo, D. (2004). “An Error-Statistical Philosophy of Evidence,” in M. Taper and S. Lele (eds.) The Nature of Scientific Evidence: Statistical, Philosophical and Empirical Considerations. Chicago: University of Chicago Press: 79-118.


Second, I don’t know what you mean when you suggest you will “update our Bayesian priors up or down depending on the “severity” or “genuineness” of the method being used to test a hypothesis or claim. The more severe or genuine a test is, the more we are justified in updating our priors in a certain direction.”

It sounds like a call for something above and beyond existing Bayesian methods–which doesn’t mean it couldn’t be developed. How do you measure the “severity” or “genuineness” of the testing method? Let’s say you do it the way I am recommending. Hypothesis H passes a test with low severity according to Mayo. Does that mean your prior probability for H goes down? A central point of error statistics is that H’s plausibility/warrant (choose your term) is distinct from how well tested H is by data x. For me, there’s evidence against H only if “not-H” has passed with reasonable severity. Moreover, H might pass with low severity because of various selection effects that influence error probabilities but not likelihoods. Or priors. Low severity does not correspond to a Bayesian disconfirmation, in other words.

By contrast, there’s smoother sailing when H passes with high severity. Say a properly used significance test rejects the null H0 in the one-sided Normal testing example of the slides, so its denial, H’, passes with high severity. Are you saying that you then assign H’ a high probability? Is that probability equal to the error probability? Almost certainly not (remember the differences between p-values and posteriors: https://errorstatistics.com/2014/07/23/continuedp-values-overstate-the-evidence-against-the-null-legit-or-fallacious-revised/)
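For concreteness, the severity assessment in the one-sided Normal example can be sketched as follows (a hypothetical setup, not the slides’ exact numbers: known σ, testing H0: μ ≤ 0 against μ > 0; severity for the claim μ > μ1 is the probability the observed mean would have been no larger, were μ exactly μ1):

```python
from math import erf, sqrt

def normal_cdf(z):
    """Standard Normal CDF."""
    return 0.5 * (1.0 + erf(z / sqrt(2)))

def severity(xbar, mu1, sigma, n):
    """Severity with which the claim mu > mu1 passes, given observed
    sample mean xbar: P(Xbar <= xbar; mu = mu1)."""
    return normal_cdf((xbar - mu1) / (sigma / sqrt(n)))

# Hypothetical numbers: sigma = 1, n = 25, observed xbar = 0.4
# (a 2-sigma result, so the test rejects H0 at about the 0.025 level).
print(severity(0.4, 0.0, 1.0, 25))  # "mu > 0" passes with severity ~0.977
print(severity(0.4, 0.3, 1.0, 25))  # "mu > 0.3" passes with severity only ~0.69
```

Note what the numbers attach to: how well the test probed particular discrepancies from the null, not a prior or posterior probability for H’.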

But you could still use the severity assessment to guide the “direction” as you say.

There’s only one thing: what have you done but give (an informal) voice to the products of severity labors? No priors or posteriors were given, just a declaration that if H passes a Mayo severe test, then I give H a Bayes boost. But I may be missing your intended interpretation.

It seems there must be many ways to, for example, penalize a Bayesian model for poor severity. But it also seems this breaks fundamentally with the LP (as defined by many) and with Bayesian philosophy (as defined by most). This would be heresy of the type espoused in Gelman and Shalizi, no?

More heretical than that I should think.


“Your interesting and provocative remark about a Bayesian-severity connection deserves an analysis of at least 3 or 4 levels of depth, but now I’m locked into some deadlines + travel and only have time for 1-2: I should perhaps turn your comment into a post, so that we can officially address it. Maybe later.”

A full posting on this would certainly also be of strong interest to me.

I’m writing a paper on frequentist methods for the evaluation of chemical kinetic parameters.

I find your powerpoint presentation extremely useful. 🙂