Bayesian/frequentist

Lucien Le Cam: “The Bayesians Hold the Magic”

Today is the birthday of Lucien Le Cam (Nov. 18, 1924-April 25, 2000): Please see my updated 2013 post on him.

 

Categories: Bayesian/frequentist, Statistics | Leave a comment

Oxford Gaol: Statistical Bogeymen

Memory Lane: 3 years ago. Oxford Jail (also called Oxford Castle) is an entirely fitting place to be on (and around) Halloween! Moreover, rooting around this rather lavish set of jail cells (what used to be a single cell is now a dressing room) is every bit as conducive to philosophical reflection as is exile on Elba! (It is now a boutique hotel, though many of the rooms are still too jail-like for me.)  My goal (while in this gaol—as the English sometimes spell it) is to try and free us from the bogeymen and bogeywomen often associated with “classical” statistics. As a start, the very term “classical statistics” should, I think, be shelved, not that names should matter.

In appraising statistical accounts at the foundational level, we need to realize the extent to which accounts are viewed through the eyeholes of a mask or philosophical theory. Moreover, the mask some wear while pursuing this task might well be at odds with their ordinary way of looking at evidence, inference, and learning. In any event, to avoid question-begging criticisms, the standpoint from which the appraisal is launched must itself be independently defended. But for (most) Bayesian critics of error statistics, the assumption that uncertain inference demands a posterior probability for claims inferred is thought to be so obvious as not to require support. Critics are implicitly making assumptions that are at odds with the frequentist statistical philosophy. In particular, they assume a certain philosophy about statistical inference (probabilism), often coupled with the allegation that error statistical methods can only achieve radical behavioristic goals, wherein all that matters are long-run error rates (of some sort).

Criticisms then follow readily, in the form of one or both:

  • Error probabilities do not supply posterior probabilities in hypotheses; interpreted as if they do (and some say we just can’t help it), they lead to inconsistencies.
  • Methods with good long-run error rates can give rise to counterintuitive inferences in particular cases.

I have proposed an alternative philosophy that replaces these tenets with different ones:

  • The role of probability in inference is to quantify how reliably or severely claims (or discrepancies from claims) have been tested.
  • The severity goal directs us to the relevant error probabilities, avoiding the oft-repeated statistical fallacies due to tests that are overly sensitive, as well as those insufficiently sensitive to particular errors.
  • Control of long-run error probabilities, while necessary, is not sufficient for good tests or warranted inferences.
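To make the severity idea a little more concrete, here is a minimal sketch for one simple setting that is assumed purely for illustration: a one-sided Normal test of a mean with known standard deviation, where the data have led to rejecting the null. In that setting, the severity with which the claim "mu > mu1" passes is commonly computed as the probability of a sample mean no larger than the one observed, were mu equal to mu1. The function name and all numbers below are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def severity_mu_greater(xbar, mu1, sigma, n):
    """Severity for the claim 'mu > mu1', given an observed sample mean xbar.

    Assumed setting (not taken from the post): one-sided Normal test with
    known sigma. Severity is P(sample mean <= xbar; mu = mu1).
    """
    standard_error = sigma / sqrt(n)
    return NormalDist().cdf((xbar - mu1) / standard_error)


# Hypothetical numbers: sigma = 10 and n = 100 give a standard error of 1;
# the observed mean is 152.
for mu1 in (150, 151, 152, 153):
    print(mu1, round(severity_mu_greater(152, mu1, 10, 100), 3))
```

With these made-up numbers, the claim mu > 150 passes with severity of about 0.98, while mu > 153 passes with severity of only about 0.16: the same data warrant some discrepancies from the null far better than others.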


Categories: 3-year memory lane, Bayesian/frequentist, Philosophy of Statistics, Statistics | 30 Comments

Gelman recognizes his error-statistical (Bayesian) foundations


From Gelman’s blog:

“In one of life’s horrible ironies, I wrote a paper “Why we (usually) don’t have to worry about multiple comparisons” but now I spend lots of time worrying about multiple comparisons”


Exhibit A: [2012] Why we (usually) don’t have to worry about multiple comparisons. Journal of Research on Educational Effectiveness 5, 189-211. (Andrew Gelman, Jennifer Hill, and Masanao Yajima)

Exhibit B: The garden of forking paths: Why multiple comparisons can be a problem, even when there is no “fishing expedition” or “p-hacking” and the research hypothesis was posited ahead of time, in press. (Andrew Gelman and Eric Loken) (Shortened version is here.)

 

The “forking paths” paper, in my reading, basically argues that mere hypothetical possibilities about what you would or might have done had the data been different (in order to secure a desired interpretation) suffice to alter the characteristics of the analysis you actually did. That’s an error statistical argument, maybe even stronger than what some error statisticians would say. What’s really being condemned are overly flexible ways to move from statistical results to substantive claims. The p-values are illicit when taken to provide evidence for those claims because an actual p-value requires Prob(P < p; H0) = p (and the actual p-value has become much greater by design). The criticism makes perfect sense if you’re scrutinizing inferences according to how well or severely tested they are. Actual error probabilities are accordingly altered or unable to be calculated. However, if one is going to scrutinize inferences according to severity then the same problematic flexibility would apply to Bayesian analyses, whether or not they have a way to pick up on it. (It’s problematic if they don’t.) I don’t see the magic by which a concern for multiple testing disappears in Bayesian analysis (e.g., in the first paper) except by assuming some prior takes care of it.
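The requirement Prob(P < p; H0) = p, and how flexibility breaks it, can be checked with a small simulation. The sketch below is only an illustration under assumptions that are not in the papers discussed: each "path" is an independent two-sided z-test on fresh Normal data with known standard deviation, the null is true on every path, and the analyst reports the smallest p-value. All names and settings are hypothetical.

```python
import random
from math import sqrt
from statistics import NormalDist, fmean

STD_NORMAL = NormalDist()


def two_sided_p(sample):
    """Two-sided z-test p-value for H0: mean = 0, with sigma = 1 assumed known."""
    z = fmean(sample) * sqrt(len(sample))
    return 2 * (1 - STD_NORMAL.cdf(abs(z)))


def actual_error_rate(n_paths, nominal=0.05, n_obs=10, trials=4000, seed=1):
    """Estimate Prob(reported P < nominal; H0) when the smallest of
    n_paths independent p-values is the one that gets reported."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        p_values = [
            two_sided_p([rng.gauss(0, 1) for _ in range(n_obs)])
            for _ in range(n_paths)
        ]
        if min(p_values) < nominal:
            hits += 1
    return hits / trials


if __name__ == "__main__":
    for k in (1, 5, 20):
        print(k, actual_error_rate(k))
```

With a single pre-specified path the estimated rate should sit near the nominal 0.05. With independent paths it should approach 1 - 0.95^k, roughly 0.23 for five paths, which is the sense in which the actual p-value is much greater than the one reported.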

Categories: Error Statistics, Gelman | 15 Comments

Oy Faye! What are the odds of not conflating simple conditional probability and likelihood with Bayesian success stories?


Faye Flam

Congratulations to Faye Flam for finally getting her article, “The odds, continually updated,” published in the Science Times section of the New York Times, after months of reworking and editing, interviewing and reinterviewing. I’m grateful, too, that one remark from me remained. Seriously, I am. A few comments: The Monty Hall example is simple probability, not statistics, and finding that fisherman who floated on his boots at best used likelihoods. I might note, too, that critiquing that ultra-silly example about ovulation and voting (a study so bad they actually had to pull it at CNN due to reader complaints[i]) scarcely required more than noticing the researchers didn’t even know the women were ovulating[ii]. Experimental design is an old area of statistics developed by frequentists; on the other hand, these ovulation researchers really believe their theory, so the posterior checks out.

The article says, Bayesian methods can “crosscheck work done with the more traditional or ‘classical’ approach.” Yes, but on traditional frequentist grounds. What many would like to know is how to cross check Bayesian methods—how do I test your beliefs? Anyway, I should stop kvetching and thank Faye and the NYT for doing the article at all[iii]. Here are some excerpts:

Statistics may not sound like the most heroic of pursuits. But if not for statisticians, a Long Island fisherman might have died in the Atlantic Ocean after falling off his boat early one morning last summer.


Categories: Bayesian/frequentist, Statistics | 47 Comments

Continued: “P-values overstate the evidence against the null”: legit or fallacious?


continued…

Categories: Bayesian/frequentist, CIs and tests, fallacy of rejection, highly probable vs highly probed, P-values, Statistics | 39 Comments

“P-values overstate the evidence against the null”: legit or fallacious? (revised)

0. July 20, 2014: Some of the comments to this post reveal that using the word “fallacy” in my original title might have encouraged running together the current issue with the fallacy of transposing the conditional. Please see a newly added Section 7.


Categories: Bayesian/frequentist, CIs and tests, fallacy of rejection, highly probable vs highly probed, P-values, Statistics | 71 Comments

Higgs Discovery two years on (1: “Is particle physics bad science?”)


July 4, 2014 was the two-year anniversary of the Higgs boson discovery. As the world was celebrating the “5 sigma!” announcement, and we were reading about the statistical aspects of this major accomplishment, I was aghast to be emailed a letter, purportedly instigated by Bayesian Dennis Lindley, through Tony O’Hagan (to the ISBA). Lindley, according to this letter, wanted to know:

“Are the particle physics community completely wedded to frequentist analysis?  If so, has anyone tried to explain what bad science that is?”

Fairly sure it was a joke, I posted it on my “Rejected Posts” blog for a bit until it checked out [1]. (See O’Hagan’s “Digest and Discussion”) Continue reading

Categories: Bayesian/frequentist, fallacy of non-significance, Higgs, Lindley, Statistics | 4 Comments

Big Bayes Stories? (draft ii)

“Wonderful examples, but let’s not close our eyes,” is David J. Hand’s apt title for his discussion of the recent special issue (Feb 2014) of Statistical Science called “Big Bayes Stories” (edited by Sharon McGrayne, Kerrie Mengersen and Christian Robert). For your Saturday night/weekend reading, here are excerpts from Hand, another discussant (Welsh), scattered remarks of mine, along with links to papers and background. I begin with David Hand:

[The papers in this collection] give examples of problems which are well-suited to being tackled using such methods, but one must not lose sight of the merits of having multiple different strategies and tools in one’s inferential armory. (Hand [1])

…. But I have to ask, is the emphasis on ‘Bayesian’ necessary? That is, do we need further demonstrations aimed at promoting the merits of Bayesian methods? … The examples in this special issue were selected, firstly by the authors, who decided what to write about, and then, secondly, by the editors, in deciding the extent to which the articles conformed to their desiderata of being Bayesian success stories: that they ‘present actual data processing stories where a non-Bayesian solution would have failed or produced sub-optimal results.’ In a way I think this is unfortunate. I am certainly convinced of the power of Bayesian inference for tackling many problems, but the generality and power of the method is not really demonstrated by a collection specifically selected on the grounds that this approach works and others fail. To take just one example, choosing problems which would be difficult to attack using the Neyman-Pearson hypothesis testing strategy would not be a convincing demonstration of a weakness of that approach if those problems lay outside the class that that approach was designed to attack.

Hand goes on to make a philosophical assumption that might well be questioned by Bayesians: Continue reading

Categories: Bayesian/frequentist, Honorary Mention, Statistics | 62 Comments

Deconstructing Andrew Gelman: “A Bayesian wants everybody else to be a non-Bayesian.”

At the start of our seminar, I said that “on weekends this spring (in connection with Phil 6334, but not limited to seminar participants) I will post some of my ‘deconstructions’ of articles”. I began with Andrew Gelman’s note “Ethics and the statistical use of prior information”[i], but never posted my deconstruction of it. So since it’s Saturday night, and the seminar is just ending, here it is, along with related links to Stat and ESP research (including me, Jack Good, Persi Diaconis and Pat Suppes). Please share comments, especially in relation to current-day ESP research. Continue reading

Categories: Background knowledge, Gelman, Phil6334, Statistics | 35 Comments

You can only become coherent by ‘converting’ non-Bayesianly

Mayo looks at Bayesian foundations

“What ever happened to Bayesian foundations?” was one of the final topics of our seminar (Mayo/Spanos Phil 6334). In the past 15 years or so, not only have (some? most?) Bayesians come to accept violations of the Likelihood Principle, they have also tended to disown Dutch Book arguments, and the very idea of inductive inference as updating beliefs by Bayesian conditionalization has evanesced. In one of Thursday’s readings, by Bacchus, Kyburg, and Thalos (1990)[1], it is argued that under certain conditions, it is never a rational course of action to change belief by Bayesian conditionalization. Here’s a short snippet for your Saturday night reading (the full paper is http://errorstatistics.files.wordpress.com/2014/05/bacchus_kyburg_thalos-against-conditionalization.pdf): Continue reading

Categories: Bayes' Theorem, Phil 6334 class material, Statistics | 29 Comments

Phil 6334: Foundations of statistics and its consequences: Day #12

We interspersed key issues from the reading for this session (from Howson and Urbach) with portions of my presentation at the Boston Colloquium (Feb, 2014): Revisiting the Foundations of Statistics in the Era of Big Data: Scaling Up to Meet the Challenge. (Slides below)*.

Someone sent us a recording (mp3) of the panel discussion from that Colloquium (there’s a lot on “big data” and its politics), including: Mayo, Xiao-Li Meng (Harvard), Kent Staley (St. Louis), and Mark van der Laan (Berkeley).


*There’s a prelude here to our visitor on April 24: Professor Stanley Young from the National Institute of Statistical Sciences.

 

Categories: Bayesian/frequentist, Error Statistics, Phil6334 | 43 Comments

Phil 6334: Notes on Bayesian Inference: Day #11 Slides

 


A. Spanos Probability/Statistics Lecture Notes 7: An Introduction to Bayesian Inference (4/10/14)

Categories: Bayesian/frequentist, Phil 6334 class material, Statistics | 11 Comments

Phil 6334: Duhem’s Problem, highly probable vs highly probed; Day #9 Slides

 

April 3, 2014: We interspersed discussion with slides; these cover the main readings of the day (check syllabus): Duhem’s Problem and the Bayesian Way, and “Highly probable vs Highly Probed”. (See syllabus four.) Slides are below (followers of this blog will be familiar with most of this, e.g., here). We also did further work on misspecification testing.

Monday, April 7, is an optional outing, “a seminar class trip” you might say, here at Thebes, at which time we will analyze the statistical curves of the mountains, pie charts of pizza, and (seriously) study some experiments on the problem of replication in “the Hamlet Effect in social psychology”. If you’re around please bop in!

“Thebes”, Blacksburg, VA

Mayo’s slides on Duhem’s Problem and more from April 3 (Day#9):

 

 

Categories: Bayesian/frequentist, highly probable vs highly probed, misspecification testing | 8 Comments

Who is allowed to cheat? I.J. Good and that after dinner comedy hour….

It was from my Virginia Tech colleague I.J. Good (in statistics), who died five years ago (April 5, 2009), at 93, that I learned most of what I call “howlers” on this blog. His favorites were based on the “paradoxes” of stopping rules. (I had posted this last year here.)

“In conversation I have emphasized to other statisticians, starting in 1950, that, in virtue of the ‘law of the iterated logarithm,’ by optional stopping an arbitrarily high sigmage, and therefore an arbitrarily small tail-area probability, can be attained even when the null hypothesis is true. In other words if a Fisherian is prepared to use optional stopping (which usually he is not) he can be sure of rejecting a true null hypothesis provided that he is prepared to go on sampling for a long time. The way I usually express this ‘paradox’ is that a Fisherian [but not a Bayesian] can cheat by pretending he has a plane to catch like a gambler who leaves the table when he is ahead” (Good 1983, 135) [*]
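Good’s point about optional stopping can be illustrated with a short simulation. This is only a sketch under assumed settings: observations are drawn from a standard Normal (so the null of zero mean is true), the “sigmage” is the usual z statistic recomputed after every observation, and 1.96 stands in for a nominal two-sided 0.05 cutoff. The function names and defaults are hypothetical.

```python
import random
from math import sqrt


def rejects_true_null(max_n, rng, cutoff=1.96):
    """Draw N(0, 1) observations one at a time (H0: mean = 0 is true) and
    stop as soon as |z| exceeds the cutoff, or give up after max_n draws."""
    total = 0.0
    for n in range(1, max_n + 1):
        total += rng.gauss(0, 1)
        z = total / sqrt(n)  # the "sigmage" after n observations
        if abs(z) > cutoff:
            return True
    return False


def rejection_rate(max_n, trials=2000, seed=2):
    """Proportion of simulated experiments that end by rejecting the true null."""
    rng = random.Random(seed)
    return sum(rejects_true_null(max_n, rng) for _ in range(trials)) / trials


if __name__ == "__main__":
    for max_n in (1, 10, 200):
        print(max_n, rejection_rate(max_n))
```

With a fixed sample size (max_n = 1 here) the rejection rate should stay near 0.05, but it climbs steadily as the experimenter is allowed to keep sampling and checking. That climb is why the error statistician insists that the stopping rule alters the actual error probabilities.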


Categories: Bayesian/frequentist, Comedy, Statistics | 18 Comments

Phil 6334: Day #3: Feb 6, 2014


Day #3: Spanos lecture notes 2, and reading/resources from Feb 6 seminar 

6334 Day 3 slides: Spanos-lecture-2

___

Crupi & Tentori (2010). Irrelevant Conjunction: Statement and Solution of a New Paradox, Phil Sci, 77, 1–13.

Hawthorne & Fitelson (2004). Re-Solving Irrelevant Conjunction with Probabilistic Independence, Phil Sci 71: 505–514.

Skyrms (1975) Choice and Chance 2nd ed. Chapter V and Carnap (pp. 206-211), Dickenson Pub. Co.

Mayo posts on the tacking paradox: Oct. 25, 2013: “Bayesian Confirmation Philosophy and the Tacking Paradox (iv)*” &  Oct 25.

An update on this issue will appear shortly in a separate blogpost.

_

READING FOR NEXT WEEK
Selection (pp. 35-59) from: Popper (1962). Conjectures and Refutations: The Growth of Scientific Knowledge. Basic Books.

Categories: Bayes' Theorem, Phil 6334 class material, Statistics | Leave a comment

Objective/subjective, dirty hands and all that: Gelman/ Wasserman blogolog (ii)

Andrew Gelman says that as a philosopher, I should appreciate his blog today in which he records his frustration: “Against aggressive definitions: No, I don’t think it helps to describe Bayes as ‘the analysis of subjective beliefs’…” Gelman writes:

I get frustrated with what might be called “aggressive definitions,” where people use a restrictive definition of something they don’t like. For example, Larry Wasserman writes (as reported by Deborah Mayo):

“I wish people were clearer about what Bayes is/is not and what frequentist inference is/is not. Bayes is the analysis of subjective beliefs but provides no frequency guarantees. Frequentist inference is about making procedures that have frequency guarantees but makes no pretense of representing anyone’s beliefs.”

I’ll accept Larry’s definition of frequentist inference. But as for his definition of Bayesian inference: No no no no no. The probabilities we use in our Bayesian inference are not subjective, or, they’re no more subjective than the logistic regressions and normal distributions and Poisson distributions and so forth that fill up all the textbooks on frequentist inference.

To quickly record some of my own frustrations*: First, I would disagree with Wasserman’s characterization of frequentist inference, but as is clear from Larry’s comments to (my reaction to him), I think he concurs that he was just giving a broad contrast. Please see Note [1] for a remark from my post: Comments on Wasserman’s “what is Bayesian/frequentist inference?” Also relevant is a Gelman post on the Bayesian name: [2].

Second, Gelman’s “no more subjective than…” evokes  remarks I’ve made before. For example, in “What should philosophers of science do…” I wrote:

Arguments given for some very popular slogans (mostly by non-philosophers), are too readily taken on faith as canon by others, and are repeated as gospel. Examples are easily found: all models are false, no models are falsifiable, everything is subjective, or equally subjective and objective, and the only properly epistemological use of probability is to supply posterior probabilities for quantifying actual or rational degrees of belief. Then there is the cluster of “howlers” allegedly committed by frequentist error statistical methods repeated verbatim (discussed on this blog).

I’ve written a lot about objectivity on this blog, e.g., here, here and here (and in real life), but what’s the point if people just rehearse the “everything is a mixture…” line, without making deeply important distinctions? I really think that, next to the “all models are false” slogan, the most confusion has been engendered by the “no methods are objective” slogan. However much we may aim at objective constraints, it is often urged, we can never have “clean hands” free of the influence of beliefs and interests, and we invariably sully methods of inquiry by the entry of background beliefs and personal judgments in their specification and interpretation. Continue reading

Categories: Bayesian/frequentist, Error Statistics, Gelman, Objectivity, Statistics | 41 Comments

Mascots of Bayesneon statistics (rejected post)

(see rejected posts)

Categories: Bayesian/frequentist, Rejected Posts | Leave a comment

U-Phil: Deconstructions [of J. Berger]: Irony & Bad Faith 3

Memory Lane: 2 years ago:
My efficient Errorstat Blogpeople[1] have put forward the following 3 reader-contributed interpretive efforts[2] as a result of the “deconstruction” exercise from December 11 (mine, from the earlier blog, is at the end) of what I consider:

“….an especially intriguing remark by Jim Berger that I think bears upon the current mindset (Jim is aware of my efforts):

Too often I see people pretending to be subjectivists, and then using “weakly informative” priors that the objective Bayesian community knows are terrible and will give ridiculous answers; subjectivism is then being used as a shield to hide ignorance. . . . In my own more provocative moments, I claim that the only true subjectivists are the objective Bayesians, because they refuse to use subjectivism as a shield against criticism of sloppy pseudo-Bayesian practice. (Berger 2006, 463)” (From blogpost, Dec. 11, 2011)
_________________________________________________
Andrew Gelman:

The statistics literature is big enough that I assume there really is some bad stuff out there that Berger is reacting to, but I think that when he’s talking about weakly informative priors, Berger is not referring to the work in this area that I like, as I think of weakly informative priors as specifically being designed to give answers that are _not_ “ridiculous.”

Keeping things unridiculous is what regularization’s all about, and one challenge of regularization (as compared to pure subjective priors) is that the answer to the question, What is a good regularizing prior?, will depend on the likelihood.  There’s a lot of interesting theory and practice relating to weakly informative priors for regularization, a lot out there that goes beyond the idea of noninformativity.

To put it another way:  We all know that there’s no such thing as a purely noninformative prior:  any model conveys some information.  But, more and more, I’m coming across applied problems where I wouldn’t want to be noninformative even if I could, problems where some weak prior information regularizes my inferences and keeps them sane and under control. Continue reading

Categories: Gelman, Irony and Bad Faith, J. Berger, Statistics, U-Phil | 3 Comments

A. Spanos lecture on “Frequentist Hypothesis Testing”


Aris Spanos

I attended a lecture by Aris Spanos to his graduate econometrics class here at Va Tech last week[i]. This course, which Spanos teaches every fall, gives a superb illumination of the disparate pieces involved in statistical inference and modeling, and affords clear foundations for how they are linked together. His slides follow the intro section. Some examples with severity assessments are also included.

Frequentist Hypothesis Testing: A Coherent Approach

Aris Spanos

1 Inherent difficulties in learning statistical testing

Statistical testing is arguably the most important, but also the most difficult and confusing chapter of statistical inference for several reasons, including the following.

(i) The need to introduce numerous new notions, concepts and procedures before one can paint, even in broad brushes, a coherent picture of hypothesis testing.

(ii) The current textbook discussion of statistical testing is both highly confusing and confused. There are several sources of confusion.

  • (a) Testing is conceptually one of the most sophisticated sub-fields of any scientific discipline.
  • (b) Inadequate knowledge by textbook writers who often do not have the technical skills to read and understand the original sources, and have to rely on second-hand accounts of previous textbook writers that are often misleading or just outright erroneous. In most of these textbooks hypothesis testing is poorly explained as an idiot’s guide to combining off-the-shelf formulae with statistical tables like the Normal, the Student’s t, the chi-square, etc., where the underlying statistical model that gives rise to the testing procedure is hidden in the background.
  • (c) The misleading portrayal of Neyman-Pearson testing as essentially decision-theoretic in nature, when in fact the latter has much greater affinity with the Bayesian rather than the frequentist inference.
  • (d) A deliberate attempt to distort and cannibalize frequentist testing by certain Bayesian drumbeaters who revel in (unfairly) maligning frequentist inference in their attempts to motivate their preferred view on statistical inference.

(iii) The discussion of frequentist testing is rather incomplete in so far as it has been beleaguered by serious foundational problems since the 1930s. As a result, different applied fields have generated their own secondary literatures attempting to address these problems, but often making things much worse! Indeed, in some fields like psychology it has reached the stage where one has to correct the ‘corrections’ of those chastising the initial correctors!

In an attempt to alleviate problem (i), the discussion that follows uses a sketchy historical development of frequentist testing. To ameliorate problem (ii), the discussion includes ‘red flag’ pointers (¥) designed to highlight important points that shed light on certain erroneous interpretations or misleading arguments. The discussion will pay special attention to (iii), addressing some of the key foundational problems.

[i] It is based on Ch. 14 of Spanos (1999) Probability Theory and Statistical Inference. Cambridge[ii].

[ii] You can win a free copy of this 700+ page text by creating a simple palindrome! http://errorstatistics.com/palindrome/march-contest/

Categories: Bayesian/frequentist, Error Statistics, Severity, significance tests, Statistics | 36 Comments

The error statistician has a complex, messy, subtle, ingenious, piece-meal approach

A comment today by Stephen Senn leads me to post the last few sentences of my (2010) paper with David Cox, “Frequentist Statistics as a Theory of Inductive Inference”:

“A fundamental tenet of the conception of inductive learning most at home with the frequentist philosophy is that inductive inference requires building up incisive arguments and inferences by putting together several different piece-meal results; we have set out considerations to guide these pieces[i]. Although the complexity of the issues makes it more difficult to set out neatly, as, for example, one could by imagining that a single algorithm encompasses the whole of inductive inference, the payoff is an account that approaches the kind of arguments that scientists build up in order to obtain reliable knowledge and understanding of a field.” (273)[ii]

A reread for Saturday night?

[i]The pieces hang together by dint of the rationale growing out of a severity criterion (or something akin but using a different term.)

[ii]Error and Inference: Recent Exchanges on Experimental Reasoning, Reliability and the Objectivity and Rationality of Science (D Mayo and A. Spanos eds.), Cambridge: Cambridge University Press: 1-27. This paper appeared in The Second Erich L. Lehmann Symposium: Optimality, 2006, Lecture Notes-Monograph Series, Volume 49, Institute of Mathematical Statistics, pp. 247-275.

Categories: Bayesian/frequentist, Error Statistics | 20 Comments
