Update: Feb. 21, 2014 (slides at end): Ever find when you begin to “type” a paper to which you gave an off-the-cuff title months and months ago that you scarcely know just what you meant or feel up to writing a paper with that (provocative) title? But then, pecking away at the outline of a possible paper crafted to fit the title, you discover it’s just the paper you’re up to writing right now? That’s how I feel about “Is the Philosophy of Probabilism an Obstacle to Statistical Fraud Busting?” (the impromptu title I gave for my paper for the Boston Colloquium for the Philosophy of Science):
The conference is called: “Revisiting the Foundations of Statistics in the Era of Big Data: Scaling Up to Meet the Challenge.”
Here are some initial chicken-scratchings (draft (i)). Share comments, queries. (I still have 2 weeks to come up with something*.)
“Is the Philosophy of Probabilism an Obstacle to Statistical Fraud Busting?”
(1) Statistical debunking, and the promotion of non-fraudulent uses of statistics, call for an account capable of registering gambits that alter the error-probing capacities of methods. Both the avoidance and the detection of flaws and fallacies turn on error-statistical considerations that are absent in accounts falling under the umbrella of “probabilism”. Probabilisms, be they in terms of Bayesian posteriors, Bayes ratios, or likelihood ratios, do not adequately pick up on error probabilities (Likelihood Principle)[i]. Examples of such gambits are: stopping rules (optional stopping), data-dependent selections (cherry picking, significance seeking, multiple testing, post hoc subgroup analysis), flawed randomization, and violated statistical model assumptions.
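To make the first gambit concrete, here is a minimal simulation sketch (my own illustration, not from the paper) of why optional stopping is an error-statistical matter: under a true null hypothesis, “peeking” at the data after each observation and stopping at the first nominally significant z-score drives the actual Type I error rate far above the nominal 5%, even though the likelihood-based import of the final data is unchanged.

```python
# Illustrative simulation (not from the post): optional stopping vs. a
# fixed sample size, both testing a true null (standard normal data).
import random
import math

def optional_stopping_rejects(n_max, z_crit=1.96, rng=random):
    """True if a nominally 'significant' z-score occurs at ANY interim look."""
    total = 0.0
    for n in range(1, n_max + 1):
        total += rng.gauss(0.0, 1.0)  # data generated under H0
        z = total / math.sqrt(n)
        if abs(z) > z_crit:
            return True  # stop and declare significance
    return False

def fixed_n_rejects(n, z_crit=1.96, rng=random):
    """True if the z-score is 'significant' at the single planned look."""
    total = sum(rng.gauss(0.0, 1.0) for _ in range(n))
    return abs(total / math.sqrt(n)) > z_crit

random.seed(1)
trials = 2000
stop_rate = sum(optional_stopping_rejects(100) for _ in range(trials)) / trials
fixed_rate = sum(fixed_n_rejects(100) for _ in range(trials)) / trials
print(f"fixed-n rejection rate:           {fixed_rate:.3f}")  # near the nominal 0.05
print(f"optional-stopping rejection rate: {stop_rate:.3f}")   # well above 0.05
```

An account that registers only the final likelihoods treats both procedures as evidentially equivalent; an error-statistical account does not, which is the point of listing optional stopping among the gambits.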
(2) If little has been done to rule out errors in construing the data as evidence for claim H, then H “passes” a test that lacks severity. Methods that scrutinize a test’s capabilities according to its severity I call error statistical. Probabilistic methods are used to qualify claims in terms of how well or poorly probed they are. Using probability to discern probativeness differs from the goals of probabilism, as well as from those of long-run performance.[ii] Frequentist error probabilities can (but need not) provide ways to assess and control the error-probing capacities of methods.
(3) Tests of model assumptions are based on error-statistical methods (e.g., simple significance tests), as are the latest fraud busting procedures. Probabilists, at times, echo criticisms that presuppose a statistical philosophy at odds with their own probabilism. [Are p-values to be trusted (only) to show that p-values cannot be trusted?] This is either inconsistent or disingenuous.
(4) Assuming probabilism often leads to presupposing that p-values, confidence levels and other error statistical properties are misinterpreted. Reinterpreting them as degrees of belief, support, plausibility (absolute or comparative) begs the question against the error statistical goals at the heart of debunking. If taken seriously, such twists turn out to license, rather than hold accountable, the original questionable science (she showed the greatest integrity in promoting her beliefs in H), while some of the most promising research (e.g., discovery of the Higgs boson) is misunderstood and/or dubbed pseudoscientific.
(5) A new paradigm, a hybrid of probabilism and performance, sets out to assess “science-wise rates of false discoveries” but (unintentionally) institutionalizes howlers, abuses, and cookbook statistics. [(5) is just a placeholder. I am referring to the movement described here: “Beware of questionable front page articles warning you to beware of questionable front page articles“.]
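For readers unfamiliar with the “science-wise false discovery rate” the hybrid paradigm computes, a back-of-the-envelope sketch (my own, with made-up input numbers purely for illustration): given a prior proportion of true effects among tested hypotheses, a significance level, and a power, it reports the expected share of false findings among nominal discoveries.

```python
# Hypothetical illustration (not from the post): the arithmetic behind
# "science-wise false discovery rate" claims. All inputs are made up.
def science_wise_fdr(prior_true, power, alpha):
    """Expected share of false findings among nominally significant results."""
    false_pos = alpha * (1 - prior_true)  # true nulls wrongly rejected
    true_pos = power * prior_true         # true effects correctly detected
    return false_pos / (false_pos + true_pos)

# Suppose only 10% of tested hypotheses are true, tests run at alpha = 0.05
# with 80% power:
fdr = science_wise_fdr(prior_true=0.10, power=0.80, alpha=0.05)
print(f"{fdr:.3f}")  # 0.045 / (0.045 + 0.08) = 0.360
```

The calculation itself is elementary; the objection in (5) is to treating such screening rates, which depend on an assumed urn of hypotheses, as assessments of the evidential warrant of any particular finding.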
Slides (draft from Feb 21, 2014)
[i] Feb 5 note: Let me be clear that Bayesian techniques may be used within an error statistical philosophy (error-statistical Bayes?). Gelman and Shalizi 2013, in their “meeting of the minds”, even suggest as much. They end their rejoinder: “Bayesian methods have seen huge advances in the past few decades. It is time for Bayesian philosophy to catch up, and we see our paper as the beginning, not the end, of that process.” (Gelman and Shalizi, Rejoinder, p. 79).
As a philosopher, it is precisely that philosophical process that I am interested in.
[ii] Feb 5 note: The core of “probabilism” is the supposition that claims are warranted by being true or probably true (absolute or comparative). Many would stop at capturing a comparative increase in warrant, but then when is a claim well or poorly warranted? Of the two, I’m most interested in capturing “poorly warranted”. Saying H has passed a highly insevere test, one that has scarcely done anything at all to subject H to a “hard time”, is not adequately captured by saying the test hasn’t firmed up the evidence for H. You can’t deem someone culpable of bad science, much less fraud, for merely giving us an uninformative test. Even Bayesians should see the need for the distinct assessment I am suggesting.
*As Nietzsche said some place: “I have to be unprepared to be master of myself”.
Cosponsored by the Department of Mathematics & Statistics at Boston University.