For the first time, I’m excerpting all of Excursion 1 Tour II from SIST (2018, CUP).
1.4 The Law of Likelihood and Error Statistics
If you want to understand what’s true about statistical inference, you should begin with what has long been a holy grail–to use probability to arrive at a type of logic of evidential support–and in the first instance you should look not at full-blown Bayesian probabilism, but at comparative accounts that sidestep prior probabilities in hypotheses. An intuitively plausible logic of comparative support was given by the philosopher Ian Hacking (1965)–the Law of Likelihood. Fortunately, the Museum of Statistics is organized by theme, and the Law of Likelihood, together with the related Likelihood Principle, is a big one.
Law of Likelihood (LL): Data x are better evidence for hypothesis H1 than for H0 if x is more probable under H1 than under H0: Pr(x; H1) > Pr(x; H0), that is, the likelihood ratio LR of H1 over H0 exceeds 1.
H0 and H1 are statistical hypotheses that assign probabilities to the values of the random variable X. A fixed value of X is written x0, but we often want to generalize about this value, in which case, following others, I use x. The likelihood of the hypothesis H, given data x, is the probability of observing x, under the assumption that H is true or adequate in some sense. Typically, the ratio of the likelihood of H1 over H0 also supplies the quantitative measure of comparative support. Note that when X is continuous, the probability is assigned to a small interval around x to avoid probability 0.
Does the Law of Likelihood Obey the Minimal Requirement for Severity?
Likelihoods are vital to all statistical accounts, but they are often misunderstood because the data are fixed and the hypothesis varies. Likelihoods of hypotheses should not be confused with their probabilities. There are two ways to see this. First, suppose you discover all of the stocks in Pickrite’s promotional letter went up in value (x)–all winners. A hypothesis H to explain this is that their method always succeeds in picking winners. H entails x, so the likelihood of H given x is 1. Yet we wouldn’t say H is therefore highly probable, especially without reason to rule out that they culled the winners post hoc. For a second way, at any time, the same phenomenon may be perfectly predicted or explained by two rival theories; so both theories are equally likely on the data, even though they cannot both be true.
Suppose Bristol-Roach, in our Bernoulli tea tasting example, got two correct guesses followed by one failure. The observed data can be represented as x0 = <1,1,0>. Let the hypotheses be different values for θ, the probability of success on each independent trial. The likelihood of the hypothesis H0: θ = 0.5, given x0, which we may write as Lik(0.5), equals (½)(½)(½) = 1/8. Strictly speaking, we should write Lik(θ; x0), because it’s always computed given data x0; I will do so later on. The likelihood of the hypothesis θ = 0.2 is Lik(0.2) = (0.2)(0.2)(0.8) = 0.032. In general, the likelihood in the case of Bernoulli independent and identically distributed trials takes the form Lik(θ) = θ^s(1 − θ)^f, 0 < θ < 1, where s is the number of successes and f the number of failures. Infinitely many values for θ between 0 and 1 yield positive likelihoods; clearly, then, likelihoods do not sum to 1, or to any number in particular. Likelihoods do not obey the probability calculus.
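The arithmetic above is easy to check. Here is a minimal sketch in Python (the function name `lik` and the grid of θ values are my own illustration, not the book’s):

```python
# Hypothetical helper (not from SIST): Bernoulli likelihood for s successes
# and f failures in independent trials with success probability theta.
def lik(theta, s, f):
    # Lik(theta) = theta^s * (1 - theta)^f
    return theta ** s * (1 - theta) ** f

# Bristol-Roach's data x0 = <1,1,0>: s = 2 successes, f = 1 failure
print(lik(0.5, 2, 1))   # 1/8 = 0.125
print(lik(0.2, 2, 1))   # approximately 0.032

# Likelihoods over many values of theta need not sum to 1:
grid = [i / 100 for i in range(1, 100)]
print(sum(lik(t, 2, 1) for t in grid))  # roughly 8.3, nowhere near 1
```

The last line makes the closing point concrete: summing Lik(θ) over a grid of θ values gives a number with no probabilistic meaning, since likelihoods over rival hypotheses do not obey the probability calculus.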
The Law of Likelihood (LL) will immediately be seen to fail on our minimal severity requirement – at least if it is taken as an account of inference. Why? There is no onus on the Likelihoodist to predesignate the rival hypotheses – you are free to search, hunt, and post-designate a more likely, or even maximally likely, rival to a test hypothesis H0.
Consider the hypothesis that θ = 1 on trials one and two and 0 on trial three. That makes the probability of x maximal. For another example, hypothesize that the observed pattern would always recur in three trials of the experiment (I. J. Good said that in his cryptanalysis work these were called “kinkera”). Hunting for an impressive fit, or trying and trying again, one is sure to find a rival hypothesis H1 much better “supported” than H0 even when H0 is true. As George Barnard puts it, “there always is such a rival hypothesis, viz. that things just had to turn out the way they actually did” (1972, p. 129).
Note that for any outcome of n Bernoulli trials, the likelihood of H0: θ = 0.5 is (0.5)^n, so is quite small. The likelihood ratio (LR) of a best-supported alternative compared to H0 would be quite high. Since one could always erect such an alternative,
(*) Pr(LR in favor of H1 over H0; H0) = maximal.
Thus the LL permits BENT evidence. The severity for H1 is minimal, though the particular H1 is not formulated until the data are in hand. I call such maximally fitting, but minimally severely tested, hypotheses Gellerized, since Uri Geller was apt to erect a way to explain his results in ESP trials. Our Texas sharpshooter is analogous because he can always draw a circle around a cluster of bullet holes, or around each single hole. One needn’t go to such an extreme rival, but it suffices to show that the LL does not control the probability of erroneous interpretations.
What do we do to compute (*)? We look beyond the specific observed data to the behavior of the general rule or method, here the LL. The output is always a comparison of likelihoods. We observe one outcome, but we can consider that for any outcome, unless it makes H0 maximally likely, we can find an H1 that is more likely. This lets us compute the relevant properties of the method: its inability to block erroneous interpretations of data. As always, a severity assessment is one level removed: you give me the rule, and I consider its latitude for erroneous outputs. We’re actually looking at the probability distribution of the rule, over outcomes in the sample space. This distribution is called a sampling distribution. It’s not a very apt term, but nothing has arisen to replace it.

For those who embrace the LL, once the data are given, it’s irrelevant what other outcomes could have been observed but were not. Likelihoodists say that such considerations make sense only if the concern is the performance of a rule over repetitions, but not for inference from the data. Likelihoodists hold to “the irrelevance of the sample space” (once the data are given). This is the key contrast between accounts based on error probabilities (error statistical) and logics of statistical inference.
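A rough simulation makes (*) concrete (a sketch of my own, not from SIST). Let H1 be the post-designated Gellerized rival that sets θ = 1 on each observed success and θ = 0 on each observed failure, so the observed sequence has probability 1 under H1. Under H0: θ = 0.5, every sequence of n trials has probability (0.5)^n, so the LR in favor of the hunted-up H1 is 2^n on every repetition, even though H0 generated the data:

```python
import random

def gellerized_lr(outcomes):
    # Post-designated rival H1: theta_i = 1 on each observed success,
    # 0 on each observed failure, so Pr(outcomes; H1) = 1.
    lik_h1 = 1.0
    # Under H0: theta = 0.5, every sequence of n trials has probability (0.5)^n.
    lik_h0 = 0.5 ** len(outcomes)
    return lik_h1 / lik_h0

random.seed(1)
n, reps = 10, 1000
lrs = []
for _ in range(reps):
    x = [random.randint(0, 1) for _ in range(n)]  # data generated under H0
    lrs.append(gellerized_lr(x))

# The rule favors some H1 over H0 maximally on every repetition,
# even though H0 is true: Pr(LR in favor of H1 over H0; H0) = 1.
print(all(lr == 2 ** n for lr in lrs))  # True
```

Looking at the rule’s behavior over the sample space, rather than at the one observed likelihood comparison, is exactly the move the Likelihoodist declines to make.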
To continue reading Excursion 1 Tour II, go here.
This excerpt comes from Statistical Inference as Severe Testing: How to Get Beyond the Statistics Wars (Mayo, CUP 2018).
Earlier excerpts and mementos from SIST up to Dec 31, 2018 are here.
Jan 10, 2019 Excerpt from SIST is here, Jan 27 is here, and Feb 23 here.
Jan 13, 2019 Mementos from SIST (Excursion 4) are here. These are summaries of all 4 tours.
March 5, 2019 Blurbs of all 16 Tours can be found here.
Your ‘kinkera’, where the probability of outcomes varies between trials, is not available among the parameter values within the statistical model for Bernoulli trials that you describe in the paragraph starting with “Suppose Bristol-Roach”. That means that a kinkera is not among the ‘hypotheses’ that a likelihoodist is free to designate as a rival to the parameter values of theta. Therefore your criticism is at least incomplete, and probably false, as I have argued on several occasions on this blog.
I have an arXiv paper that explores the issue in full: https://arxiv.org/abs/1507.08394