Monthly Archives: November 2017

The Conversion of Subjective Bayesian, Colin Howson, & the problem of old evidence (i)


“The subjective Bayesian theory as developed, for example, by Savage … cannot solve the deceptively simple but actually intractable old evidence problem, whence as a foundation for a logic of confirmation at any rate, it must be accounted a failure.” (Howson, (2017), p. 674)

What? Did the “old evidence” problem cause Colin Howson to recently abdicate his decades-long position as a leading subjective Bayesian? It seems to have. I was so surprised to come across this in a recent perusal of Philosophy of Science that I wrote to him to check if it is really true. (It is.) I thought perhaps it was a different Colin Howson, or the son of the one who co-wrote 3 editions of Howson and Urbach: Scientific Reasoning: The Bayesian Approach, espousing hard-line subjectivism since 1989.[1] I am not sure which of the several paradigms of non-subjective or default Bayesianism Howson endorses (he’d argued for years, convincingly, against any one of them), nor how he handles various criticisms (Kass and Wasserman 1996); I put that aside. Nor have I yet worked through his rather complex paper to the extent necessary. What about the “old evidence” problem, made famous by Clark Glymour (1980)? What is it?

Consider Jay Kadane, a well-known subjective Bayesian statistician. According to Kadane, the probability statement: Pr(d(X) ≥ 1.96) = .025

“is a statement about d(X) before it is observed. After it is observed, the event {d(X) ≥ 1.96} either happened or did not happen and hence has probability either one or zero” (2011, p. 439).

Knowing d0 = 1.96 (the specific value of the test statistic d(X)), Kadane is saying, there’s no more uncertainty about it.* But would he really give it probability 1? If the probability of the data x is 1, Glymour argues, then Pr(x|H) is also 1, but then Pr(H|x) = Pr(H)Pr(x|H)/Pr(x) = Pr(H), so there is no boost in probability for a hypothesis or model arrived at after x. So does that mean known data don’t supply evidence for H? (Known data are sometimes said to violate temporal novelty: data are temporally novel only if the hypothesis or claim of interest came first.) If x has probability 1, any such evidential boost seems to be blocked. That’s the old evidence problem. Subjective Bayesianism is faced with the old evidence problem if known evidence has probability 1, or so the argument goes.
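To make Glymour's arithmetic concrete, here is a minimal sketch of the Bayes' theorem step in the paragraph above. The function name `posterior` and all the numbers (a prior of 0.3, and the contrasting case with Pr(x|H) = 0.9 and Pr(x) = 0.5) are illustrative assumptions, not values from Glymour, Kadane, or Howson.

```python
def posterior(prior_h, prob_x_given_h, prob_x):
    """Bayes' theorem: Pr(H|x) = Pr(H) * Pr(x|H) / Pr(x)."""
    if not 0.0 < prob_x <= 1.0:
        raise ValueError("Pr(x) must lie in (0, 1]")
    return prior_h * prob_x_given_h / prob_x


prior = 0.3  # hypothetical Pr(H), chosen only for illustration

# Old evidence: x is already known, so Pr(x) = 1 and therefore Pr(x|H) = 1.
old = posterior(prior, 1.0, 1.0)

# Contrast: x not yet known, with hypothetical Pr(x|H) = 0.9 and Pr(x) = 0.5.
new = posterior(prior, 0.9, 0.5)

print(f"Pr(H) = {prior}, Pr(H|x) with old evidence = {old}")
print(f"Pr(H|x) when x is still uncertain = {new:.2f}")
```

In the first case the posterior equals the prior, which is the "no boost" result; in the second, the same hypothesis gets a boost only because Pr(x) is less than 1.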

What’s the accepted subjective Bayesian solution to this? (I’m really asking.) One attempt is to subtract out, or try to, the fact that x is known, and envision being in a context prior to knowing x. That’s not very satisfactory or realistic, in general. Subjective Bayesians in statistics, I assume, just use the likelihoods and don’t worry about this: known data are an instance of a general random variable X, and you just use the likelihood once it’s known that {X = x}. But can you do this and also hold, with Kadane, that it’s an event with probability 1? I’ve always presumed that the problem was mainly for philosophers who want to assign probabilities to statements in a language, rather than focusing on random variables and their distributions, or statistical models (a mistake in my opinion). I also didn’t think subjective Bayesians in statistics were prepared to say, with Kadane, that an event has probability 1 after it’s observed or known. Yet if probability measures your uncertainty in the event, Kadane seems right. So how does the problem of old evidence get solved by subjective Bayesian practitioners? I asked Kadane years ago, but did not get a reply.
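For readers who want to see what “just use the likelihoods” could amount to, here is a rough sketch under assumptions that are mine and not from Kadane or Howson: d(X) is taken to be Normal(0, 1) under H0 and Normal(2, 1) under a hypothetical alternative H1, with equal prior weight on each. The update uses the sampling density evaluated at the observed d0 = 1.96, rather than treating the known outcome as an event of probability 1.

```python
import math


def normal_pdf(x, mean, sd=1.0):
    """Density of a Normal(mean, sd) distribution at x."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def posterior_h1(d_obs, prior_h1, mean_h0=0.0, mean_h1=2.0):
    """Posterior of H1 from the likelihoods at the observed value d_obs.

    mean_h0 and mean_h1 are hypothetical model choices for illustration.
    """
    like_h0 = normal_pdf(d_obs, mean_h0)
    like_h1 = normal_pdf(d_obs, mean_h1)
    numerator = prior_h1 * like_h1
    return numerator / (numerator + (1.0 - prior_h1) * like_h0)


post = posterior_h1(1.96, 0.5)  # observed d0 = 1.96, prior Pr(H1) = 0.5
print(f"Pr(H1 | d0 = 1.96) = {post:.3f}")
```

Here the posterior moves away from the prior even though d0 is known, because the calculation conditions on the model for X rather than on the after-the-fact probability of the event. Whether that is consistent with also assigning the observed event probability 1 is exactly the question raised above.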

Any case where the data are known prior to constructing or selecting a hypothesis to accord with them would, strictly speaking, count as a case where data are known, or so it seems.** The best-known cases in philosophy allude to a known phenomenon, such as Mercury’s perihelion, as evidence for Einstein’s GTR. (The perihelion was long known as anomalous for Newton, yet GTR’s predicting it, without adjustments, is widely regarded as evidence for GTR.)[2] You can read some attempted treatments by philosophers in Howson’s paper; I discuss Garber’s attempt in Chapter 10, Mayo 1996 [EGEK], 10.2.[3] I’d like to hear from readers, regardless of statistical persuasion, how it’s handled in practice (or why it’s deemed unproblematic).

But wait, are we sure it isn’t also a problem for non-subjective or default Bayesians? In this paradigm (and there are several varieties), the prior probabilities in hypotheses are not taken to express degrees of belief but are given by various formal assignments, so as to have minimal impact on the posterior probability. Although the holy grail of finding “uninformative” default priors has been given up, default priors are at least supposed to ensure the data dominate in some sense.[4] A true blue subjective Bayesian like Kadane is unhappy with non-subjective priors. Rather than quantify prior beliefs, non-subjective priors are viewed as primitives or conventions or references for obtaining posterior probabilities. How are they to be interpreted? It’s not clear, but let’s put this aside to focus on the “old evidence” problem.

OK, so how do subjective Bayesians get around the old evidence problem?

*I thank Jay Kadane for noticing I used the inequality in my original post 11/27/17. I haven’t digested his general reaction yet, stay tuned.
**There’s a place where Glymour (or Glymour, Scheines, Spirtes, and Kelly 1987) slyly argues that, strictly speaking, the data are always known by the time you appraise some model–or so I seem to recall. But I’d have to research that or ask him.

[1] I’ll have to add a footnote to my new book (Statistical Inference as Severe Testing: How to Get Beyond the Statistics Wars, CUP, 2018), as I allude to him as a subjectivist Bayesian philosopher throughout.
[2] I argue that the reason it was good evidence for GTR is precisely because it was long known, and yet all attempts to explain it were ad hoc, so that they failed to pass severe tests. The deflection effect, by contrast, was new and no one had discerned it before, let alone tried to explain it. Note that this is at odds with the idea that novel results count more for a theory H when H (temporally) predicts them, than when H accounts for known results. (Here the known perihelion of Mercury is thought to be better evidence for GTR than the novel deflection effect.) But the issue isn’t the novelty of the results, it’s how well-tested H is, or so I argue. (EGEK Chapter 8, p. 288).
[3] I don’t see that the newer attempts avoid the key problem in Garber’s. I’m not sure if Howson is rescinding the remark I quote from him in EGEK, p. 333. Here he was trying to solve it by subtracting the data out from what’s known.
[4] Some may want to use “informative” priors as well, but their meaning/rationale is unclear. Howson mentions Wes Salmon’s style of Bayesianism in this paper, but Salmon was a frequentist.


-Glymour, C. (1980), Theory and Evidence, Princeton University Press. I’ve added a link to the relevant chapter, “Why I am Not a Bayesian” (from Fitelson resources). The relevant pages are 85-93.
-Howson, C. (2017), “Putting on the Garber Style? Better Not”, Philosophy of Science, 84 (October 2017), pp. 659–676.
-Kass, R. and Wasserman, L. (1996), “The Selection of Prior Distributions By Formal Rules”, JASA 91: 1343-70.

Further References to Solutions (to this or Related problems): 

-Garber, Daniel. 1983. “Old Evidence and Logical Omniscience in Bayesian Confirmation Theory.” In Minnesota Studies in the Philosophy of Science, ed. J. Earman, 99–131. Minneapolis: University of Minnesota Press.
-Hartmann, Stephan, and Branden Fitelson. 2015. “A New Garber-Style Solution to the Problem of Old Evidence.” Philosophy of Science 82 (4): 712–17.
-Seidenfeld, T., Schervish, M., and Kadane, J. 2012. “What kind of uncertainty is that?” Journal of Philosophy, pp. 516–533.

Categories: Bayesian priors, objective Bayesians, Statistics | Tags: | 25 Comments

Erich Lehmann’s 100th Birthday: Neyman Pearson vs Fisher on P-values

Erich Lehmann 20 November 1917 – 12 September 2009

Erich Lehmann was born 100 years ago today! (20 November 1917 – 12 September 2009). Lehmann was Neyman’s first student at Berkeley (Ph.D. 1942), and his framing of Neyman-Pearson (NP) methods has had an enormous influence on the way we typically view them.*


I got to know Erich in 1997, shortly after publication of EGEK (1996). One day, I received a bulging, six-page, handwritten letter from him in tiny, extremely neat scrawl (and many more after that).  He began by telling me that he was sitting in a very large room at an ASA (American Statistical Association) meeting where they were shutting down the conference book display (or maybe they were setting it up), and on a very long, wood table sat just one book, all alone, shiny red.

He said, “I wonder if it might be of interest to me!” So he walked up to it…. It turned out to be my Error and the Growth of Experimental Knowledge (1996, Chicago), which he reviewed soon after[0]. (What are the chances?) Some related posts on Lehmann’s letter are here and here.


Categories: Fisher, P-values, phil/history of stat | 3 Comments


3 years ago...


MONTHLY MEMORY LANE: 3 years ago: November 2014. I mark in red 3-4 posts from each month that seem most apt for general background on key issues in this blog, excluding those reblogged recently[1], and in green 3-4 others of general relevance to philosophy of statistics (in months where I’ve blogged a lot)[2]. Posts that are part of a “unit” or a group count as one (11/1/14 & 11/09/14 and 11/15/14 & 11/25/14 are grouped). The comments are worth checking out.


November 2014

  • 11/01 Philosophy of Science Assoc. (PSA) symposium on Philosophy of Statistics in the Higgs Experiments “How Many Sigmas to Discovery?”
  • 11/09 “Statistical Flukes, the Higgs Discovery, and 5 Sigma” at the PSA
  • 11/11 The Amazing Randi’s Million Dollar Challenge
  • 11/12 A biased report of the probability of a statistical fluke: Is it cheating?
  • 11/15 Why the Law of Likelihood is bankrupt–as an account of evidence


  • 11/18 Lucien Le Cam: “The Bayesians Hold the Magic”
  • 11/20 Erich Lehmann: Statistician and Poet
  • 11/22 Msc Kvetch: “You are a Medical Statistic”, or “How Medical Care Is Being Corrupted”
  • 11/25 How likelihoodists exaggerate evidence from statistical tests

[1] Monthly memory lanes began at the blog’s 3-year anniversary in Sept, 2014.

[2] New Rule, July 30, 2016, March 30, 2017: a very convenient way to allow data-dependent choices (note why it’s legit in selecting blog posts, on severity grounds).

Categories: 3-year memory lane | 1 Comment

Yoav Benjamini, “In the world beyond p < .05: When & How to use P < .0499…”


These were Yoav Benjamini’s slides, “In the world beyond p < .05: When & How to use P < .0499…”, from our session at the ASA 2017 Symposium on Statistical Inference (SSI): A World Beyond p < 0.05. (Mine are in an earlier post.) He begins by asking:

However, it’s mandatory to adjust for selection effects, and Benjamini is one of the leaders in developing ways to carry out the adjustments. Even calling out the avenues for cherry-picking and multiple testing, long known to invalidate p-values, would make replication research more effective (and less open to criticism).

Categories: Error Statistics, P-values, replication research, selection effects | 22 Comments
