First, Edwards specifies that likelihoods and the likelihood principle are model-bound. It seems to me that he is unarguably correct, and if we insist that likelihoods have to be from a single (statistical) model in order to be comparable, and thus subject to the likelihood principle, then many of your objections to the likelihood principle disappear.

The evil demon arguments like “things had to turn out the way they did” in your section 2 have no power if the likelihoods in question have scope limited to their respective models. The “things had to” model is not a statistical model in the same way as the (0.5)^k model is.

Hacking’s concern about the interpretation of likelihood ratios for tanks and diffraction gratings is easily disarmed once one recognizes that the likelihoods come from two different models. (I would point out that he could obtain a different pair of likelihood ratios simply by choosing a different model for estimation of the tank serial number and for the distribution of diffraction grating sizes.)

Second, according to Royall, the likelihood principle and the law of likelihood do not say anything about the interpretation of the relative evidential meanings of simple versus composite hypotheses. The (0.5)^k coin hypothesis H0 is simple but, if it is anything, the “things had to” hypothesis H1 is composite.

]]>‘In terms of rejection:

“An hypothesis should be rejected if and only if there is some rival hypothesis much better supported [i.e., much more likely] than it is.” (Hacking 1965, 89)’

Bayes theory doesn’t reject based on likelihood ratios. It uses Bayes factors, which can be greater or less than 1 regardless of what the LR is.

If this is an example of the “law of likelihood” then Bayes breaks it.

]]>Bayesian theory uses P(H|x) in its equations to reason about H in light of x, not the LR in isolation. Bayesians aren’t guilty of the crimes you claim for those who use the LR and nothing else.

]]>“It can hardly be called an “assumption” when there are entire books defending it!” Hmm, like the books reviewed in this post? As philosophers, of course, our job is to question the laws with problematic consequences, especially when carved into stone. And of course I know you know this.

]]>This post suggests Bayesians are guilty of this.

Bayesians are not. They judge the model in light of the data using P(H|x) which causes them to maximize a different quantity than the likelihood.

The quantity they do maximize over is formally identical to the penalized maximum (log) likelihood procedures which are commonly used in practice to avoid over-fitting.
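
To make the formal identity concrete, here is a minimal sketch, assuming a coin-tossing model with a Beta prior. The data (9 heads in 10 tosses), the Beta(5, 5) prior, and the helper names are all hypothetical choices for illustration. The log posterior is the log likelihood plus the log prior, so the log prior plays the role of the penalty term:

```python
import math

def log_likelihood(theta, k, n):
    """Binomial log-likelihood, up to a constant, for k heads in n tosses."""
    return k * math.log(theta) + (n - k) * math.log(1 - theta)

def log_prior(theta, a, b):
    """Beta(a, b) log-density, up to a constant; this is the 'penalty'."""
    return (a - 1) * math.log(theta) + (b - 1) * math.log(1 - theta)

def argmax_on_grid(objective, steps=10_000):
    """Crude maximizer over an evenly spaced grid strictly inside (0, 1)."""
    grid = [i / steps for i in range(1, steps)]
    return max(grid, key=objective)

# Hypothetical data and prior, chosen only for illustration.
k, n = 9, 10
a, b = 5, 5

# Maximum likelihood: maximize the log-likelihood alone.
mle = argmax_on_grid(lambda t: log_likelihood(t, k, n))

# MAP: maximize log-likelihood + log-prior (a penalized log-likelihood).
map_est = argmax_on_grid(
    lambda t: log_likelihood(t, k, n) + log_prior(t, a, b)
)

print(f"MLE = {mle:.4f}, MAP = {map_est:.4f}")
```

Under these assumptions the penalized estimate is pulled from the raw frequency 9/10 toward the prior's centre at 1/2, which is the sense in which the prior acts against over-fitting.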

]]>To anon-fan: A Bayesian obeys the Law of Likelihood in the sense that the odds of H1 against H2 increase upon conditioning on E if and only if E favors H1 over H2 according to the Law of Likelihood:

Pr(H1|E)/Pr(H2|E)=[Pr(H1)/Pr(H2)][Pr(E|H1)/Pr(E|H2)]
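
The identity can be checked numerically. What follows is a minimal sketch with made-up priors and likelihoods over three hypotheses (H3 is included only to show that H1 and H2 need not be exhaustive); none of the numbers come from the discussion above:

```python
from fractions import Fraction

def posterior(priors, likelihoods):
    """Bayes' theorem over a finite set of mutually exclusive hypotheses."""
    joint = {h: priors[h] * likelihoods[h] for h in priors}
    total = sum(joint.values())
    return {h: joint[h] / total for h in joint}

# Hypothetical numbers, chosen only for illustration.
priors = {"H1": Fraction(1, 10), "H2": Fraction(6, 10), "H3": Fraction(3, 10)}
likelihoods = {"H1": Fraction(8, 10), "H2": Fraction(2, 10), "H3": Fraction(5, 10)}

post = posterior(priors, likelihoods)

prior_odds = priors["H1"] / priors["H2"]
likelihood_ratio = likelihoods["H1"] / likelihoods["H2"]
posterior_odds = post["H1"] / post["H2"]

# Posterior odds equal prior odds times the likelihood ratio, exactly.
assert posterior_odds == prior_odds * likelihood_ratio

# The odds of H1 against H2 rise exactly when the likelihood ratio exceeds 1.
assert (posterior_odds > prior_odds) == (likelihood_ratio > 1)
```

Exact rational arithmetic is used so that the equality holds without floating-point tolerance.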

]]>Translating “Data x are better evidence for hypothesis H1 than for H0” into Bayesian terms gives:

P(H1|x) is greater than P(H0|x)

directly. The posterior (not the likelihood ratio directly) is what’s used by Bayesians in decision analysis, hypothesis testing, parameter estimation, you name it.

This isn’t a technicality. As stated before, the Bayesian breaking of the “law of likelihood” is directly related to the most common ways of avoiding over-fitting in practice, over-fitting being the chief negative consequence of adhering religiously to the law of likelihood.

]]>The pos result gives comparatively more support to disease than no disease; posterior for disease goes up, even if still lower than its denial.
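
A small worked example may help here. It is only a sketch: the prevalence, sensitivity, and false-positive rate below are assumed values for illustration, not figures from the post.

```python
def posterior_disease(prevalence, sensitivity, false_positive_rate):
    """P(disease | positive test) by Bayes' theorem."""
    true_pos = prevalence * sensitivity
    false_pos = (1 - prevalence) * false_positive_rate
    return true_pos / (true_pos + false_pos)

# Assumed values: 1% prevalence, 90% sensitivity, 10% false-positive rate.
prior = 0.01
post = posterior_disease(prior, sensitivity=0.9, false_positive_rate=0.1)

print(f"prior = {prior:.3f}, posterior = {post:.3f}")
```

With these assumed numbers the positive result is nine times more probable under disease than under no disease, and the probability of disease rises from 0.01 to about 0.083, yet "no disease" remains far more probable.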

This is the whole distinction between “making more firm” and having a high posterior. e.g.

http://errorstatistics.com/2013/10/19/bayesian-confirmation-philosophy-and-the-tacking-paradox-in-i/

I’m not sure how this bears on the point I’ve already noted: group (1) differs from group (2).

Will be away the rest of the day.

]]>You were careful in the post and comments to say you were talking about the “law of likelihood” not the likelihood principle. I’ve been talking about the former, which is what I thought you were talking about as well.

Regardless of anything else, the import of the data x for hypothesis H is judged by Bayesians through P(H|x). So it is possible to adhere to the likelihood principle and still violate the Law of Likelihood.

]]>“Data x are better evidence for hypothesis H1 than for H0 if x are more probable under H1 than under H0.”

]]>If a Bayesian takes “data x are better evidence for H1 than for H0” to mean “the posterior for H1 is larger than that for H0”, then Bayesians do not adhere to the “law of likelihood”.

]]>Doesn’t follow at all, there’d be no reason to add the priors to (1) if they didn’t make a difference. But the data import, with which the priors are combined, would still come through the likelihoods/LRs or the like.

Didn’t say anything about the value or disvalue of “penalties” for overfitting in model selection.

]]>