It’s always an insurmountable challenge to get Peirce right – see below.

My comment was motivated by Peirce’s work (via Paul Forster, U of Ottawa), and I carefully said “experiences of reality” rather than reality itself – but my “constraints or [of] reality” was sloppy (along with the typo of “or” for “of”).

On keeping to rational numbers for underlying rates (which may be continuous in reality): I should have said that unless our powers of discrimination continue to increase without bound, our experiences of reality will no longer provide any sense of being wrong somewhere, somehow – if investigators hit upon a rational number close enough. But then, 100 billion years from now is not a big concern for me 😉

But I agree reality may be continuous, and I believe Peirce thought so.

I have yet to read Jérôme Havenel’s “Peirce’s Clarifications of Continuity” (http://www.jstor.org/stable/40321237?seq=1#page_scan_tab_contents), which argues that Peirce changed his view in 1908 – nor, perhaps, has JL Bell.

Keith O’Rourke

(If you do read it perhaps comment here or email me)

I don’t generally have any big issues with your approach on the simple examples I’ve seen – though I still fail to see how you intend it to apply to anything beyond a normal distribution with unknown mean. My biggest problem is how you present other approaches, such as the Bayesian and Likelihoodist ones. If people were to rely on your accounts rather than read the best presentations by the main advocates, they would get a very misleading view (of course, the same applies to e.g. many accounts of Frequentist inference by Bayesians).

I’m interested in an honest appraisal of the merits and troubles of the various approaches by those with a genuine desire to understand. That these discussions end up being about debate tactics, rhetoric, picking a team, etc. is very disheartening.

People like Sanders are rightly commended for giving each approach proper consideration, being open to other points of view and testing out methods by putting them to ‘severe test’ – seeing how they stand up against non-trivial problems.

Thanks for your comment. I agree that the continuity is with respect to possibilities and that our direct observations may typically be discrete. On the broader philosophical view of e.g. Evans, you and Jaynes – prioritising the discrete/finite as fundamental – I can’t really agree (not that it matters here!). There are numerous theoretical physicists who would say the discrete arises from the continuous, not vice versa. I find this general view of continuity as primary aesthetically pleasing. Incidentally, I believe Peirce would likely agree?

But of course we don’t really know either way. An interesting survey of some of the issues is JL Bell’s ‘The Continuous and the Infinitesimal in Mathematics and Philosophy.’

From memory he has some Peirce references.

Some discussion of nuisance parameters (starting at Section 9):

http://www.phil.vt.edu/dmayo/personal_website/ch%207%20cox%20&%20mayo.pdf

None of this discussion pertains to the topic of the post.

If you are not aware, Richard McElreath also uses information theory in his intro book and course videos http://xcelab.net/rm/statistical-rethinking/

‘Continuity of analysis’ – I agree, but I think it needs to be highlighted that the continuity is for the _space of possibilities_, not actualities.

Possibilities are various entertained representations of reality, or models, whereas actualities are our (more) brute-force experiences of reality, or likely constraints of reality (i.e. observations being of limited accuracy and hence discrete, and constraints making, say, underlying rates of infection limited to rational numbers).

So likelihoods are properly defined as probabilities of observations (hence discrete), and parameter spaces of unknowns in reality are actually discrete. But continuous models are far more convenient and productive for the first, and required in principle for the second, as _we_ can’t rule out any value beforehand and so must allow them all as possibilities.
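The point that a recorded observation is really a discrete event, with the continuous model as a convenient stand-in, can be sketched numerically (the Normal model, the rounding width delta and the values below are my own illustrative assumptions, not anything from the discussion):

```python
import math

# Assumed illustration: an observation recorded to precision delta is really
# the discrete event (x - delta/2, x + delta/2], so its likelihood is a
# probability. For a Normal(mu, sigma) model that probability is closely
# approximated by density(x) * delta, which is why the continuous model is
# such a convenient stand-in for the discrete one.

def normal_cdf(x, mu, sigma):
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))

def normal_pdf(x, mu, sigma):
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))

mu, sigma = 0.0, 1.0   # hypothetical model values
delta = 0.1            # measurement recorded to one decimal place
x = 0.7                # the rounded observation

exact = normal_cdf(x + delta / 2, mu, sigma) - normal_cdf(x - delta / 2, mu, sigma)
approx = normal_pdf(x, mu, sigma) * delta

print(exact, approx)   # the two agree to about three decimal places
```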

(Now, this was long, but it relates to Mike Evans’s view insisting on continuous models that remain good approximations to finite models.)

Keith O’Rourke

As I’ve said, I have my issues with pure Likelihoodist approaches, but I do find them quite appealing in many respects – and certainly suggestive of fruitful directions, even if you end up with something else.

As Hacking’s review concludes:

“I hope Edwards’s book will encourage others to enter the labyrinth and see where it goes.”

https://errorstatistics.files.wordpress.com/2014/08/hacking_review-likelihood.pdf

I realize we’ve gotten way off topic.

I would suggest instead that the easiest way to improve on *your account* of existing ‘Likelihoodlums’ would be to read Edwards and Pawitan in detail, work through their examples, and give a detailed comparison of your severity analysis on those specific examples.

Also, when you say

> He does say that he can give the likelihood ratios of “the totality of possible values of θ” (p. 20) so if he were to allow composite inferences, with the qualifications of which alternatives are better supported than the null, it seems we’re not so different after all, except for two things…

This suggests that you are surprised that a Likelihoodist would present a likelihood function – i.e. plot the (normalised) likelihood as a function of the parameter, for all possible values of θ. This is in fact the standard likelihood approach, especially if the goal is not a comparison with frequentist tests but simply working within the paradigm.
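For concreteness, here is a minimal sketch of that standard move – the normalised likelihood over a grid of all possible θ values, for assumed Bernoulli data (k = 6 successes in n = 20 trials is my own made-up example, not one from Royall or Edwards):

```python
# Hypothetical data: k successes in n Bernoulli trials.
k, n = 6, 20
theta_hat = k / n   # maximum likelihood estimate

def lik(theta):
    return theta ** k * (1.0 - theta) ** (n - k)

# Normalised likelihood L(theta) / L(theta_hat) over a grid of all possible
# theta values - this curve is what a Likelihoodist would present.
grid = [i / 100 for i in range(1, 100)]
norm_lik = [lik(t) / lik(theta_hat) for t in grid]

# The curve peaks at 1 at the MLE; Edwards-style support intervals come from
# cutoffs such as 1/8 or 1/32 of the maximum.
peak_theta = grid[norm_lik.index(max(norm_lik))]
print(peak_theta)  # 0.3
```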

Royall contrasts his view with significance tests by means of a test of H0: θ ≤ 0.2 vs H1: θ > 0.2, where θ is the “success probability associated with a new treatment” (in a Bernoulli model). Royall considers a result that reaches statistical significance at level ~.003. While the significance tester takes this as indicating evidence for H1, “the law of likelihood does not allow the characterization of these observations as strong evidence for H1 over H0” (Royall 1997, p. 20). [He uses H1 and H2.] Why? The problem is that even though the likelihood of θ = 0.2 is small, there are values within the composite alternative H1 that are even less likely on the data, for instance θ = 0.9.

This is supposed to show a problem with significance tests, because significance tests are not comparative (even though they have an alternative, the composite one). It would be to say that rejecting H0: θ ≤ 0.2 and inferring H1: θ > 0.2 is to assert that every parameter point within H1 is more likely (or better supported) than every point in H0. That seems an idiosyncratic meaning to attach to “infer evidence of θ > 0.2”. It doesn’t explain what the problem is for the significance tester who just takes it to mean what it says:

To reject H0: θ ≤ 0.2 is to infer some positive discrepancy from .2.

And a proper use of tests (admittedly Fisher didn’t require it) would indicate that there’s terrible evidence for θ = 0.9: you’d expect, with probability ~1, observed proportions greater than the one observed were the data to have come from a θ that large. (I’m not giving the specifics of the outcome; you can check the page.)
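Royall’s actual numbers are on his p. 20 and are not reproduced here; with made-up stand-in data chosen to be significant against H0 at roughly that level (k = 32 successes in n = 100 trials is my assumption, not his), both points above can be checked directly:

```python
import math

# Stand-in data (my assumption, not Royall's figures): 32 successes in 100
# Bernoulli trials, statistically significant against H0: theta <= 0.2.
n, k = 100, 32

def log_lik(theta):
    return k * math.log(theta) + (n - k) * math.log(1.0 - theta)

# Significance level of the result under theta = 0.2: P(X >= 32 | theta = 0.2).
p_value = sum(math.comb(n, i) * 0.2 ** i * 0.8 ** (n - i) for i in range(k, n + 1))
print(p_value)  # small - in the ballpark of the ~.003 in Royall's example

# theta = 0.9, inside H1, is far *less* likely on the data than theta = 0.2:
print(log_lik(0.2) > log_lik(0.9))  # True

# Yet there is terrible evidence for theta = 0.9: were theta that large,
# you'd see more than 32 successes with probability ~1.
prob_at_most_k = sum(math.comb(n, i) * 0.9 ** i * 0.1 ** (n - i) for i in range(k + 1))
print(prob_at_most_k)  # essentially zero
```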

He does say that he can give the likelihood ratios of “the totality of possible values of θ” (p. 20), so if he were to allow composite inferences, with the qualifications of which alternatives are better supported than the null, it seems we’re not so different after all – except for two things: (1) hypotheses to which he’ll give maximum comparative support are ones for which we’d say there’s poor evidence, and (2) he will not differentiate cases where the alternative is a data-dependent selected hypothesis. To take his example, success associated with a new treatment can be measured in scads of different ways. That’s why it’s usually required to preregister which features are going to be measured in determining success. Otherwise there’s a high probability of finding success on some feature or other, even if it’s due to chance variability. Royall could not be more dismissive of such error statistical reasoning: if you don’t like the data-dependent alternative, it must be because you give it a low degree of belief. But it’s not belief – it’s the method that bothers us.

With 2 prespecified hypotheses there’s error control (and to some extent with finitely many prespecified hypotheses – yet the error control goes down the more there are), but if it’s open-ended there is not. That’s why binary choices are the typical form – but there’s no onus of predesignation. Now Lew will say that I’m entering the land of “action”, but I’m interested in inference. Even if we suppose there’s no issue of selection effects or stopping rules (which is to sidestep a central problem of statistical inference), we at most have a report of data and are still in need of an account of inference. By Royall’s own admission, his other categories do not count as inferences about what’s warranted by evidence. Perhaps this is the place where Lew can improve on existing Likelihoodlums.
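The contrast can be simulated. The setup below – 20 Bernoulli trials per endpoint, 20 independent null endpoints, a likelihood-ratio cutoff of 8 – is entirely my own illustrative assumption: per prespecified comparison the chance of misleading strong evidence is bounded by 1/k, but the freedom to report whichever of many “measures of success” shows strong evidence erodes that control.

```python
import random

# Assumed illustrative setup (not Royall's or anyone's actual numbers): data
# from a true H0: theta = 0.5 Bernoulli model, n = 20 trials per endpoint.
# For two prespecified simple hypotheses, P(likelihood ratio >= k favouring
# the false alternative) <= 1/k. With m endpoints and the freedom to report
# whichever shows "strong evidence", that bound no longer protects us.

random.seed(7)
n, m, k_cut, sims = 20, 20, 8.0, 5000

def lr(successes):
    # likelihood ratio for theta = 0.7 vs the true theta = 0.5
    return (0.7 / 0.5) ** successes * (0.3 / 0.5) ** (n - successes)

per_endpoint = some_endpoint = 0
for _ in range(sims):
    hits = [lr(sum(random.random() < 0.5 for _ in range(n))) >= k_cut
            for _ in range(m)]
    per_endpoint += hits[0]       # one prespecified comparison
    some_endpoint += any(hits)    # best of m data-selected "measures of success"

print(per_endpoint / sims)   # ~0.02, well under the 1/k = 0.125 bound
print(some_endpoint / sims)  # ~0.35: searching across endpoints manufactures "strong evidence"
```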

Sorry this got long, and no time to proof it.
