Monthly Archives: January 2013

U-Phil: J. A. Miller: Blogging the SLP

Jean Miller

Jean A. Miller, PhD
Department of Philosophy
Virginia Tech


Mayo, in her “rejected” post (12/27/12), briefly points out how Mark Chang, in his book Paradoxes of Scientific Inference (2012, pp. 137-139), took pieces from the two distinct variations she gives of Birnbaum’s arguments, either of which shows the unsoundness of Birnbaum’s purported proof, and illegitimately combined them. He then mistakenly maintains that it is Mayo’s conclusions that are “faulty” rather than Birnbaum’s argument. In this note, I just want to fill in some of the missing pieces of what is going on here, so that others will not be misled. I put together some screen shots so you can read exactly what he wrote on pp. 137-139. (See also Mayo’s note to Chang on Xi’an’s blog here.) Continue reading

Categories: Statistics, strong likelihood principle, U-Phil

U-Phil: S. Fletcher & N. Jinn

Samuel Fletcher

“Model Verification and the Likelihood Principle” by Samuel C. Fletcher
Department of Logic & Philosophy of Science (PhD Student)
University of California, Irvine

I’d like to sketch an idea concerning the applicability of the Likelihood Principle (LP) to non-trivial statistical problems. By “non-trivial statistical problems” I mean those involving substantive modeling assumptions, where there could be any doubt that the probability model faithfully represents the mechanism generating the data. (Understanding exactly how scientific models represent phenomena is subtle and important, but it will not be my focus here.) In such cases, it is crucial for the modeler to verify, inasmuch as it is possible, the sufficient faithfulness of those assumptions.

But the techniques used to verify these statistical assumptions are themselves statistical. One can then ask: do techniques of model verification fall under the purview of the LP?  That is: are such techniques a part of the inferential procedure constrained by the LP?  I will argue the following:

(1) If they are—what I’ll call the inferential view of model verification—then there will be in general no inferential procedures that satisfy the LP.

(2) If they are not—what I’ll call the non-inferential view—then there are aspects of any evidential evaluation that inferential techniques bound by the LP do not capture. Continue reading

Categories: Statistics, strong likelihood principle, U-Phil

Coming up: December U-Phil Contributions….

Dear Reader: You were probably* wondering about the December U-Phils (blogging the strong likelihood principle (SLP)). They will be posted, singly or in pairs, over the next few blog entries. Here is the initial call, and the extension. The details of the specific U-Phil may be found here, but also look at the post from my 28 Nov. seminar at the London School of Economics (LSE), which was on the SLP. Posts were to be in relation to the guest graduate student post by Gandenberger, my discussion/argument and reactions to it, or both. Earlier U-Phils may be found here, and more by searching this blog. “U-Phil” is short for “you philosophize”.

If you have ideas for future “U-Phils,” post them as comments to this blog or send them to

*This is how I see “probability” mainly used in ordinary English, namely as expressing something like “here’s a pure guess made without evidence or with little evidence,” be it sarcastic or quite genuine.


Categories: Announcement, Likelihood Principle, U-Phil

P-values as posterior odds?

METABLOG QUERY: I don’t know how to explain to this economist blogger that he is erroneously using p-values when he claims that “the odds are” (1 – p)/p that a null hypothesis is false. Maybe others want to jump in here?

On significance and model validation (Lars Syll)

Let us suppose that we as educational reformers have a hypothesis that implementing a voucher system would raise the mean test results with 100 points (null hypothesis). Instead, when sampling, it turns out it only raises it with 75 points and having a standard error (telling us how much the mean varies from one sample to another) of 20. Continue reading
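
The arithmetic behind the excerpt above can be sketched as follows. This is only an illustration under stated assumptions: a Normal sampling distribution for the mean and a one-sided test in the direction of the shortfall, neither of which is spelled out in the excerpt, and the variable names are made up here. It computes the standardized distance between the observed 75 and the hypothesized 100, the corresponding p-value, and the quantity (1 – p)/p that the query says is being misread as odds.

```python
from statistics import NormalDist

# Figures taken from the excerpt; the variable names are illustrative only.
hypothesized_gain = 100.0   # mean gain stated by the null hypothesis
observed_gain = 75.0        # mean gain found in the sample
standard_error = 20.0       # standard error of the sample mean

# Standardized distance of the observation from the null value.
z = (observed_gain - hypothesized_gain) / standard_error

# One-sided p-value: probability of a sample mean this low or lower,
# computed under the null. Normality is an assumption of this sketch.
p_one_sided = NormalDist().cdf(z)

# (1 - p)/p is just an arithmetic transform of the p-value. Reading it as
# "the odds that the null is false" is the usage the query objects to,
# because p is computed assuming the null is true.
naive_ratio = (1 - p_one_sided) / p_one_sided

print(f"z = {z:.2f}")
print(f"one-sided p-value = {p_one_sided:.3f}")
print(f"(1 - p)/p = {naive_ratio:.2f}")
```

Nothing in this calculation uses a prior probability for the null, which is one way to see why the ratio cannot be a posterior odds.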

Categories: fallacy of non-significance, Severity, Statistics

New PhilStock

See Rejected Posts: Beyond luck or method.

Categories: PhilStock, Rejected Posts

Saturday Night Brainstorming and Task Forces: (2013) TFSI on NHST

Saturday Night Brainstorming: The TFSI on NHST–reblogging with a 2013 update. Please see most recent 2015 update.

Each year leaders of the movement to reform statistical methodology in psychology, social science and other areas of applied statistics get together around this time for a brainstorming session. They review the latest from the Task Force on Statistical Inference (TFSI), propose new regulations they would like the APA publication manual to adopt, and strategize about how to institutionalize improvements to statistical methodology. 

While frustrated that the TFSI has still not banned null hypothesis significance testing (NHST), despite attempts going back to at least 1996, the reformers have created, and very successfully published in, new meta-level research paradigms designed expressly to study (statistically!) a central question: have the carrots and sticks of reward and punishment been successful in decreasing the use of NHST, and promoting instead use of confidence intervals, power calculations, and meta-analysis of effect sizes? Or not?

This year there are a couple of new members who are pitching in to contribute what they hope are novel ideas for reforming statistical practice. Since it’s Saturday night, let’s listen in on part of an (imaginary) brainstorming session of the New Reformers. This is a 2013 update of an earlier blogpost. Continue reading

Categories: Comedy, reformers, statistical tests, Statistics

New Kvetch/PhilStock

TSA to remove nudie scanners from airports. See Rejected Posts

Categories: Rejected Posts

Ontology & Methodology: Second call for Abstracts, Papers

Deadline for submission of (abstracts for) contributed papers*:
February 1, 2013

Dates of Conference: May 4-5, 2013
Blacksburg, Va

  Special invited speakers:

David Danks (CMU), Peter Godfrey-Smith (CUNY), Kevin Hoover (Duke), Laura Ruetsche (U. Mich.), James Woodward (Pitt)

Virginia Tech speakers:
Benjamin Jantzen, Deborah Mayo, Lydia Patton, Aris Spanos

*Accommodation costs will be covered for accepted contributed papers.

  • How do scientists’ initial conjectures about the entities and processes under their scrutiny influence the choice of variables, the structure of mature scientific theories, and methods of interpretation of those theories?
  • How do methods of data generation, statistical modeling, and analysis influence the construction and appraisal of theories at multiple levels?
  • How does historical analysis of the development of scientific theories illuminate the interplay between scientific methodology, theory building, and the interpretation of scientific theories?

This conference brings together prominent philosophers of science, biology, cognitive science, causation, economics, and physics with philosophically minded scientists engaged in research into these interconnected methodological and ontological questions.

We invite (extended abstracts for) contributed papers that illuminate these issues as they arise in general philosophy of science, in causal explanation and modeling, in the philosophy of experiment and statistics, and in the history and philosophy of science.

For further information on submitting a paper or extended abstract, please visit the conference website:

Organizers: Benjamin Jantzen, Deborah Mayo, Lydia Patton

Sponsors: The Virginia Tech Department of Philosophy and the Fund for Experimental Reasoning, Reliability, and the Objectivity and Rationality of Science (E.R.R.O.R.)

Categories: Announcement

Error Statistics Blog: Table of Contents

Organized by Jean Miller, Nicole Jinn

September 2011

Categories: Metablog, Statistics

Aris Spanos: James M. Buchanan: a scholar, teacher and friend

Aris Spanos
Wilson Schmidt Professor of Economics
Department of Economics, Virginia Tech

Although I have known of James M. Buchanan all of my academic career, I got to know him at a personal level as a colleague and a friend in 2000.

Looking back, our first meeting established the nature of our relationship since then. Jim walked into my office at Virginia Tech, and began to introduce himself. I felt somewhat uncomfortable and interrupted him, saying that I knew who he was. Of course he did not know who I was and asked me what area of economics I have been working in. I replied that I was ‘an econometrician, working with actual data aiming to learn about economic phenomena of interest using statistical modeling and inference’, and I hastened to add that our two areas of expertise were rather far apart. His immediate response took me by surprise: ‘From what I know, one cannot do statistical inference unless one’s data come from random samples, which is not the case in economics’. My reply was equally surprising to him: ‘Jim, where have you been for the last 50 years?’ I went on to elaborate that he was expressing an erroneous view that was held in economics in the 1930s. We spent the rest of that afternoon educating each other about our respective areas of expertise and discussing their potential overlap. Continue reading

Categories: Announcement, Statistics

James M. Buchanan

Yesterday, our colleague and friend, James Buchanan (Nobel prize-winner: 1986, Economics) died at 93.

From a NY Times obit [that runs a full half page]:

[He] was a leading proponent of public choice theory, which assumes that politicians and government officials, like everyone else, are motivated by self-interest — getting re-elected or gaining more power — and do not necessarily act in the public interest… He argued that their actions could be analyzed, and even predicted, by applying the tools of economics to political science in ways that yield insights into the tendencies of governments to grow, increase spending, borrow money, run large deficits and let regulations proliferate. Continue reading

Categories: Announcement, Statistics

RCTs, skeptics, and evidence-based policy

Senn’s post led me to investigate some links to Ben Goldacre (author of “Bad Science” and “Bad Pharma”) and the “Behavioral Insights Team” in the UK.  The BIT was “set up in July 2010 with a remit to find innovative ways of encouraging, enabling and supporting people to make better choices for themselves. A BIT blog is here”. A promoter of evidence-based public policy, Goldacre is not quite the scientific skeptic one might have imagined. What do readers think?  (The following is a link from Goldacre’s Jan. 6 blog.)

Test, Learn, Adapt: Developing Public Policy with Randomised Controlled Trials

‘Test, Learn, Adapt’ is a paper which the Behavioural Insights Team* is publishing in collaboration with Ben Goldacre, author of Bad Science, and David Torgerson, Director of the University of York Trials Unit. The paper argues that Randomised Controlled Trials (RCTs), which are now widely used in medicine, international development, and internet-based businesses, should be used much more extensively in public policy.
 …The introduction of a randomly assigned control group enables you to compare the effectiveness of new interventions against what would have happened if you had changed nothing. RCTs are the best way of determining whether a policy or intervention is working. We believe that policymakers should begin using them much more systematically. Continue reading

Categories: RCTs, Statistics

Guest post: Bad Pharma? (S. Senn)

Professor Stephen Senn*
Full Paper: Bad JAMA?
Short version–Opinion Article: Misunderstanding publication bias
Video below

Data filters

The student undertaking a course in statistical inference may be left with the impression that what is important is the fundamental business of the statistical framework employed: should one be Bayesian or frequentist, for example? Where does one stand as regards the likelihood principle and so forth? Or it may be that these philosophical issues are not covered but that a great deal of time is spent on the technical details, for example, depending on framework, various properties of estimators, how to apply the method of maximum likelihood, or, how to implement Markov chain Monte Carlo methods and check for chain convergence. However much of this work will take place in a (mainly) theoretical kingdom one might name simple-random-sample-dom. Continue reading

Categories: Statistics, Stephen Senn

Severity Calculator

SEV calculator (with comparisons to p-values, power, CIs)

In the illustration in the Jan. 2 post,

H0: μ ≤ 0 vs. H1: μ > 0

and the standard deviation SD = 1, n = 25, so σx = SD/√n = .2.
Setting α to .025, the cut-off for rejection is .39 (can round to .4).

Let the observed mean X = .2, a statistically insignificant result (p-value = .16).
SEV (μ < .2) = .5
SEV(μ <.3) = .7
SEV(μ <.4) = .84
SEV(μ <.5) = .93
SEV(μ <.6*) = .975
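
For readers who want to crunch these numbers without the spreadsheet, here is a minimal sketch in Python. It assumes the usual computation for a non-significant result in this one-sided Normal test, namely SEV(μ < μ1) = P(X > observed mean; μ = μ1); that formula is not written out in this post but reproduces the values listed above to rounding (the last comes out nearer .977 than .975). The function and variable names are illustrative and have nothing to do with the Excel calculator.

```python
from math import sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard Normal: mean 0, standard deviation 1


def standard_error(sd: float, n: int) -> float:
    """Standard deviation of the sample mean."""
    return sd / sqrt(n)


def severity_upper(mu1: float, observed_mean: float, se: float) -> float:
    """SEV(mu < mu1): probability of a sample mean larger than the one
    observed, computed as if the true mean were mu1 (assumed formula)."""
    return STD_NORMAL.cdf((mu1 - observed_mean) / se)


se = standard_error(sd=1.0, n=25)                 # SD / sqrt(n)
cutoff = STD_NORMAL.inv_cdf(1 - 0.025) * se       # rejection cut-off at alpha = .025
observed_mean = 0.2
p_value = 1 - STD_NORMAL.cdf((observed_mean - 0.0) / se)

print(f"standard error = {se:.2f}, cut-off = {cutoff:.2f}, p-value = {p_value:.2f}")
for mu1 in (0.2, 0.3, 0.4, 0.5, 0.6):
    sev = severity_upper(mu1, observed_mean, se)
    print(f"SEV(mu < {mu1:.1f}) = {sev:.3f}")
```

Changing `observed_mean` or the grid of `mu1` values shows how the severity for each upper bound moves with the data, which is what the shaded areas in the calculator display.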

Some students asked about crunching some of the numbers, so here’s a rather rickety old SEV calculator*. It is limited, rather scruffy-looking (nothing like the pretty visuals others post) but it is very useful. It also shows the Normal curves, how shaded areas change with changed hypothetical alternatives, and gives contrasts with confidence intervals. Continue reading

Categories: Severity, statistical tests

Severity as a ‘Metastatistical’ Assessment

Some weeks ago I discovered an error* in the upper severity bounds for the one-sided Normal test in section 5 of: “Statistical Science Meets Philosophy of Science Part 2” SS & POS 2.  The published article has been corrected.  The error was in section 5.3, but I am blogging all of 5.  

(* μo was written where xo should have been!)

5. The Error-Statistical Philosophy

I recommend moving away, once and for all, from the idea that frequentists must ‘sign up’ for either Neyman and Pearson, or Fisherian paradigms. As a philosopher of statistics I am prepared to admit to supplying the tools with an interpretation and an associated philosophy of inference. I am not concerned to prove this is what any of the founders ‘really meant’.

Fisherian simple-significance tests, with their single null hypothesis and at most an idea of  a directional alternative (and a corresponding notion of the ‘sensitivity’ of a test), are commonly distinguished from Neyman and Pearson tests, where the null and alternative exhaust the parameter space, and the corresponding notion of power is explicit. On the interpretation of tests that I am proposing, these are just two of the various types of testing contexts appropriate for different questions of interest. My use of a distinct term, ‘error statistics’, frees us from the bogeymen and bogeywomen often associated with ‘classical’ statistics, and it is to be hoped that that term is shelved. (Even ‘sampling theory’, technically correct, does not seem to represent the key point: the sampling distribution matters in order to evaluate error probabilities, and thereby assess corroboration or severity associated with claims of interest.) Nor do I see that my comments turn on whether one replaces frequencies with ‘propensities’ (whatever they are). Continue reading

Categories: Error Statistics, philosophy of science, Philosophy of Statistics, Severity, Statistics
