“Fisher, although originally recommending the use of such levels, later strongly attacked any standard choice.”[p. 1248] As a matter of fact, Fisher spent a lot of time after 1935 renouncing his older positions (the ones Neyman tried so hard to capture in his tests and confidence intervals) because he was disgruntled with Neyman, who a) wouldn’t use his book and b) dared to point out inconsistencies in his fiducial frequencies. So Fisher’s war with Neyman is largely a war with himself, and a refusal to admit his mistaken claims about fiducial probabilities.

In a nutshell, take a specific .95 lower confidence bound for mu (sigma known or estimated) in a Normal distribution: CI lower (.95). Fisher called this the fiducial 5% limit for mu, and claimed that the probability that mu < CI lower (.95) is .05. This probability holds for the CI (or fiducial) estimatOR, but once you substitute the data and get a specific value for the lower bound, the probability no longer holds. At most you could claim, as Fisher himself does, that the aggregate of outputs of the form mu < CI lower (.95) has 5% false claims. But when Neyman described them this way,* Fisher said he was turning his methods into acceptance sampling devices. See the top half of p. 75 of Fisher’s contribution to what I call the “triad” (1955):

http://www.phil.vt.edu/dmayo/personal_website/Fisher-1955.pdf
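The "aggregate of outputs" reading can be checked with a small simulation. This is only a sketch under assumptions that are mine, not Fisher's or Neyman's: sigma is taken as known, and the true mean, sample size, number of trials and the function names are all made up for illustration. It computes the .95 lower bound for many simulated samples and counts how often the claim "mu is at least the bound" turns out false.

```python
import random
from statistics import NormalDist, fmean


def lower_bound(sample, sigma, level=0.95):
    """One-sided lower confidence bound for mu when sigma is known."""
    z = NormalDist().inv_cdf(level)  # about 1.645 for level=0.95
    return fmean(sample) - z * sigma / len(sample) ** 0.5


def false_claim_rate(mu=0.0, sigma=1.0, n=25, trials=20000, seed=1):
    """Fraction of simulated samples whose lower bound lies above the true mu.

    mu, sigma, n, trials and seed are illustrative assumptions.
    """
    rng = random.Random(seed)
    false_claims = 0
    for _ in range(trials):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        # The claim "mu >= bound" is false exactly when mu < bound.
        if mu < lower_bound(sample, sigma):
            false_claims += 1
    return false_claims / trials


if __name__ == "__main__":
    # The long-run rate should sit near 0.05.
    print(false_claim_rate())
```

The rate near .05 is a property of the procedure over repeated samples. For any single computed bound, mu < bound is simply true or false, which is the point about the estimatOR versus the specific value.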

This whole issue is complicated, but revelatory, and I will say more about it in a post at some point.

*Neyman was only trying to offer a revised wording of Fisher’s 1930 paper on fiducial frequencies, to avoid falsehoods which he attributed to accidental "lapses of language" common in describing a new idea.

“The value of P is between .02 and .05, so that sex difference in the classification by hair colours is probably significant as judged by this district alone.”

Also:

“The χ2 test does not attempt to measure the degree of association, but as a test of significance it is independent of all additional hypotheses as to the nature of the association.”

http://psychcentral.com/classics/Fisher/Methods/chap4.htm

I’m interested that you suggest this quote from Fisher (1946) shows that he favoured “Neyman-Pearsonian” critical-value, yes-or-no decision-making (neither of us really likes my term “absolutist” for this – http://wp.me/p5x2kS-cR):

*“If P is between .1 and .9, there is certainly no reason to suspect the hypothesis tested. If it is below .02, it is strongly indicated that the hypothesis fails to account for the whole of the facts. We shall not often be astray if we draw a conventional line at .05 and consider that higher values of [chi square] indicate a real discrepancy.”*

This seems to me to suggest a framework with at least three (likely 4) interesting regions: p<0.02 is "strongly indicated", p>0.1 is "no reason to suspect", and p<0.05 is "real discrepancy". Implicitly, then, 0.05<p<0.1 is short of "real discrepancy" but better than "no reason to suspect", or, as some might say today, "nearly significant", while the 0.02<p<0.05 part of "real discrepancy" is not quite enough for "strong indication". Perhaps I'm reading too much into a short quote, but it seems a short walk from here to a completely continuous interpretation of the p-value – am I wrong?
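To make the regions concrete, here is a minimal sketch of that reading as a lookup. The function name, the label for the 0.05 to 0.1 band, and the treatment of values falling exactly on a boundary are my own assumptions; the quote also stops at .9 at the upper end, which this sketch ignores.

```python
def fisher_region(p):
    """Map a p-value to the verbal regions read off the 1946 quote.

    Boundary handling and the 'in between' label are assumptions;
    the quote itself does not settle them.
    """
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must lie in [0, 1]")
    if p < 0.02:
        return "strongly indicated"
    if p < 0.05:
        return "real discrepancy"
    if p < 0.1:
        return "in between"  # no label is given in the quote
    return "no reason to suspect"


if __name__ == "__main__":
    for p in (0.01, 0.03, 0.07, 0.5):
        print(p, fisher_region(p))
```

With four bands already implied by three cut-offs, adding more cut-offs only moves this closer to treating p as a continuous measure.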

The more I learn about the history of all this, the more interesting it gets. Thanks!

Relatedly, I searched the other day to see if there was any good philosophical literature on the physics/applied math ideas of renormalization etc.

I found this paper in Synthese (which is pretty reputable, right?). It was an interesting read:

‘The conceptual foundations and the philosophical aspects of renormalization theory’

I also like parts of Spanos’ ‘inductive step’ view. I take a more hierarchical view of theories tho – see hierarchical Bayes or even the phil lit on physicists’ renormalization group theory.

This world is not necessarily ‘regular’ so we can’t carry out naive induction. All is not lost – we can embed this observed world in various expanded, ‘regularised’ contexts. For reasons I’ve tried to explain many times I view the parameter embedding in a possibility space as an important task and a prior as one way of achieving this.

Because hierarchical theories always use temporary approximate ‘closures’ (think eg moment closure in the BBGKY hierarchy of statistical mechanics) these are always open to expansion when internal contradictions are reached. We have to ‘go to the next level’ in the hierarchy when this happens.

Luckily, sequences can have rates of convergence so we can often order and improve our models. ‘Turtles all the way down’ ignores the mathematical concept of a limit and the tools developed to speak about them. Bifurcations and phase transitions challenge global stability, however. Even the simplest of dynamical systems can exhibit complex behaviour; the geometry of the ‘space of scientific theories’ is probably quite complicated – if past experience is anything to go by (induction joke).

I have a very simple high school example in my latest blog post but also have links to ideas of regularisation and renormalization. It’s a bit subtle probably since it’s based on a high school math problem so seems straightforward…

In summary – generally, these days I tend to take the ideas of well-posed problem, regularisation, stability, bifurcation, formal/constructive dualism, static/dynamic dualism as fundamental (as does Laurie Davies, in my interpretation) tho’ am open to different implementations and interpretations. ‘Performance’ may be (and seems to be) a useful perspective but I can’t take it as fundamental, unless it means the same sort of thing as regularisation or regularity condition. Which it might.
