This continues my previous post: “Can’t take the fiducial out of Fisher…” in recognition of Fisher’s birthday, February 17. I supply a few more intriguing articles you may find enlightening to read and/or reread on a Saturday night.
Move up 20 years to the famous 1955/56 exchange between Fisher and Neyman. Fisher clearly connects Neyman’s adoption of a behavioristic-performance formulation to his denying the soundness of fiducial inference. When “Neyman denies the existence of inductive reasoning, he is merely expressing a verbal preference. For him ‘reasoning’ means what ‘deductive reasoning’ means to others.” (Fisher 1955, p. 74).
Fisher was right that Neyman’s calling the outputs of statistical inferences “actions” merely expressed Neyman’s preferred way of talking. Nothing earth-shaking turns on the choice to dub every inference “an act of making an inference”.[i] The “rationality” or “merit” goes into the rule. Neyman, much like Popper, had a good reason for drawing a bright red line between his use of probability (for corroboration or probativeness) and its use by ‘probabilists’ (who assign probability to hypotheses). Fisher’s fiducial probability was in danger of blurring this very distinction. Popper said, and Neyman would have agreed, that he had no problem with our using the word induction so long as it was kept clear it meant testing hypotheses severely.
In Fisher’s next few sentences, things get very interesting. In reinforcing his choice of language, Fisher continues, Neyman “seems to claim that the statement (a) ‘μ has a probability of 5 per cent. of exceeding M’ is a different statement from (b) ‘M has a probability of 5 per cent. of falling short of μ’.” There’s no problem about equating these two so long as M is a random variable. But watch what happens in the next sentence. According to Fisher,
Neyman violates ‘the principles of deductive logic [by accepting a] statement such as
[1] Pr{(M – ts) < μ < (M + ts)} = α,
as rigorously demonstrated, and yet, when numerical values are available for the statistics M and s, so that on substitution of these and use of the 5 per cent. value of t, the statement would read
[2] Pr{92.99 < μ < 93.01} = 95 per cent.,
to deny to this numerical statement any validity. This evidently is to deny the syllogistic process of making a substitution in the major premise of terms which the minor premise establishes as equivalent (Fisher 1955, p. 75).
But the move from (1) to (2) is fallacious! Could Fisher really be committing this fallacious probabilistic instantiation? I.J. Good (1971) describes how many felt, and often still feel:
…if we do not examine the fiducial argument carefully, it seems almost inconceivable that Fisher should have made the error which he did in fact make. It is because (i) it seemed so unlikely that a man of his stature should persist in the error, and (ii) because he modestly says (…[1959], p. 54) his 1930 explanation ‘left a good deal to be desired’, that so many people assumed for so long that the argument was correct. They lacked the daring to question it.
In responding to Fisher, Neyman (1956, p. 292) declares himself at his wit’s end in trying to find a way to convince Fisher of the inconsistencies in moving from (1) to (2).
When these explanations did not suffice to convince Sir Ronald of his mistake, I was tempted to give up. However, in a private conversation David Blackwell suggested that Fisher’s misapprehension may be cleared up by the examination of several simple examples. They illustrate the general rule that valid probability statements regarding relations involving random variables may cease and usually do cease to be valid if random variables are replaced by their observed particular values. (p. 292)[ii]
“Thus if X is a normal random variable with mean zero and an arbitrary variance greater than zero, we may agree” that Pr(X < 0) = .5. But observing, say, X = 1.7 yields Pr(1.7 < 0) = .5, which is clearly illicit. “It is doubtful whether the chaos and confusion now reigning in the field of fiducial argument were ever equaled in any other doctrine. The source of this confusion is the lack of realization that equation (1) does not imply (2)” (Neyman 1956).
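Neyman’s simple example is easy to check numerically. What follows is only a minimal sketch in Python: the standard deviation of 2, the seed, and the function name are arbitrary choices of mine, not anything in Neyman’s note. The probability statement describes the behaviour of the random variable X over repeated draws, whereas substituting the observed 1.7 leaves an ordinary arithmetic claim that is simply false.

```python
import random

def fraction_below_zero(n_draws, sigma=2.0, seed=0):
    """Relative frequency of X < 0 for X ~ Normal(0, sigma^2)."""
    rng = random.Random(seed)
    hits = sum(rng.gauss(0.0, sigma) < 0 for _ in range(n_draws))
    return hits / n_draws

# Statement about the random variable X: it holds over repeated draws.
freq = fraction_below_zero(100_000)

# Statement after substituting the observed value: no randomness is left.
observed = 1.7
instantiated = observed < 0  # plainly False, not "true with probability .5"

print(f"relative frequency of X < 0: {freq:.3f}")
print(f"is 1.7 < 0? {instantiated}")
```

The relative frequency should land close to .5 for any positive variance, which is the sense in which Pr(X < 0) = .5 is a valid statement about X.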
For decades scholars have tried to figure out what Fisher might have meant, and while the matter remains unsettled, this much is agreed: The instantiation that Fisher is yelling about 20 years after the creation of N-P tests and the break with Neyman, is fallacious. Fiducial probabilities can only properly attach to the method. Keeping to “performance” language is a sure way to avoid the illicit slide from (1) to (2). Once the intimate tie-ins with Fisher’s fiducial argument are recognized, the rhetoric of the Neyman-Fisher dispute takes on a completely new meaning. When Fisher says “Neyman only cares for acceptance sampling contexts” as he does after around 1950, he’s really saying Neyman thinks fiducial inference is contradictory unless it’s viewed in terms of properties of the method in (actual or hypothetical) repetitions. The fact that Neyman (with the contributions of Wald, and later Robbins) went overboard in his behaviorism [iii], to the extent that even Egon wanted to divorce him, ending his 1955 reply to Fisher with the claim that inductive behavior was “Neyman’s field rather than mine”, is a different matter.
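To see what it means for the probability to attach to the method, here is a rough simulation sketch. Everything numerical in it other than the interval 92.99 to 93.01 is my own assumption for illustration: samples of size 10, a standard deviation of 0.01, the tabulated t value 2.262 for 9 degrees of freedom, and two hypothetical “true” values of μ. The interval-generating rule covers μ about 95 per cent of the time whichever value is true, while the single realized interval either contains μ or it does not.

```python
import random
import statistics

T_975_DF9 = 2.262  # two-sided 5 per cent. value of t for 9 degrees of freedom

def interval(sample):
    """Return (M - t*s, M + t*s), with s the estimated standard error of M."""
    m = statistics.mean(sample)
    s = statistics.stdev(sample) / len(sample) ** 0.5
    return m - T_975_DF9 * s, m + T_975_DF9 * s

def coverage(mu, sigma=0.01, n=10, n_repeats=5_000, seed=1):
    """Fraction of repeated samples whose interval contains the fixed mu."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_repeats):
        low, high = interval([rng.gauss(mu, sigma) for _ in range(n)])
        hits += low < mu < high
    return hits / n_repeats

realized = (92.99, 93.01)  # the numerical interval in statement (2)
results = {}
for mu in (93.0, 93.02):  # two hypothetical "true" values of mu
    results[mu] = (coverage(mu), realized[0] < mu < realized[1])
    print(mu, results[mu])
```

The first entry of each pair is a property of the procedure in repetitions; the second is a yes-or-no fact about one particular interval, which is why no 95 per cent attaches to it on the performance reading.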
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[i] Fisher also commonly spoke of the output of tests as actions. Neyman rightly says that he is only following Fisher. As the years went by, Fisher comes to renounce things he himself had said earlier in the midst of polemics against Neyman.
[ii] But surely this is the kind of simple example that would have been brought forward right off the bat, before the more elaborate, infamous cases (Fisher-Behrens). Did Fisher ever say “oh now I see my mistake” as a result of these simple examples? Not to my knowledge. So I find this statement of Neyman’s about the private conversation with Blackwell a little curious. Anyone know more about it?
[iii] At least in his theory, but not in his practice. A relevant post is “distinguishing tests of statistical hypotheses and tests of significance might have been a lapse of someone’s pen”.
Fisher, R.A. (1955). “Statistical Methods and Scientific Induction”.
Good, I.J. (1971b), In reply to comments on his “The probabilistic explication of information, evidence, surprise, causality, explanation and utility”. In Godambe and Sprott (1971).
Neyman, J. (1956). “Note on an Article by Sir Ronald Fisher”.
Pearson, E.S. (1955). “Statistical concepts in Their Relation to Reality”.
I find it difficult to keep the fiducial argument clear in my mind, and I suspect that others similarly find it elusive. However, that difficulty is not a sufficient reason to criticise the argument using a single claim of “fallacious” and a couple of calls on dead authorities. Just putting up a single instantiation of the use of the fiducial argument is an insufficient basis for reasoned criticism. I understand that the fiducial argument (probably I misunderstand it, I will admit, but tell me how) gives license to the act of moving the ‘random variableness’ from M-ts and M+ts onto mu. If that is correct, then Fisher’s move from 1 to 2 is entirely legal within the resulting framework.
Of course Fisher’s switch from equation 1 to equation 2 is illegal when it is viewed in a Frequentist light, but that doesn’t make it fallacious in light of the fiducial argument that Fisher assumed valid. (Bayesian approaches seem illicit when viewed through a Frequentist prism, just as Frequentist approaches seem silly when viewed in Bayesian terms.)
What is your understanding of the fiducial argument itself?
Michael: There had been an enormous amount of discussion (e.g., in the 30s) as to whether some “new principle” was needed. Little agreement emerged, though Fisher seemed to say “no”. But Yates (close friend of Fisher) declared that either we need a new principle or a prior of some kind is assumed.
But notice, and here’s why I quote the 1955 paper, Fisher is claiming the move from (1) to (2) is a mere deductive instantiation as in syllogistic reasoning. That’s the really peculiar thing. That’s what makes the reference to Blackwell curious.
What I mean is, after 20+ years, it seems very odd that Fisher would claim it was a proper deductive instantiation, expressing surprise that it wasn’t obvious to Neyman (as opposed to something that made sense only if you accepted Fisher’s view, whatever it was, of fiducial probability).
Hi Michael: see the comments below for some links to ‘structural probability’. I think this would accord with your general worldview
Wow! I’ve had a quick look at the introduction of the Fraser (1966) paper and, yes, it does seem that I might be a structuralist as well as a likelihoodlum. Unfortunately there is quite a lot of text in that paper that on my first read comes into my head as “blah blah blah” for me to comment in any detail, but I love this part that is emphasised in the original:
“x is known, [theta] is unknown, the distribution of [theta] position with respect to x is known, and the origin of the scale of measurement is conventional.”
That seems to match with my ephemeral understanding of Fisher’s fiducial argument, and is much more understandable than how Fisher puts it (in any of the versions that I have read). People should re-read Mayo’s original post with that quoted text in mind. It seems to me to be both attractive and workable.
(I’m posting this comment at the bottom of the threads as well as here, as it is likely to be missed here.)
“We could only obtain an inverse inference, Fisher explains, by considering μ to have been selected from a superpopulation of μ‘s with known distribution. But then the inverse inference (posterior probability) would be a deductive inference and not properly inductive. Here, Fisher is quite clear, the move is inductive”
In some ways a Bayesian following the ‘deductivising’ route seems closer to Popper by denying inductive inference beyond a model and/or ‘detaching’ a deductive. At least this gives an explicit model of all the assumptions required for the inference to hold.
I never really understood Fisher’s argument either btw but Fraser’s writings were the closest to helping me understand.
(*detaching an inductive inference, oops)
Om: Popper is like Neyman actually. Formal logic is deductive, but there’s such a thing as well-corroborated, severely tested, claims. He too was contrasting his view with “probabilists”–confirmation theorists, Bayesians. (Carnapians are to Bayesians as Popperians are to N-P theorists.) The only big difference is that Popper, officially, was forced to deny there were any reliable methods (at most, their reliability had been corroborated, which for him entitled nothing about the future). Neyman, by contrast, developed empirically reliable methods. (Lakatos was right that N-P theory exemplified (Popperian) methodological falsification.) In other words, Popper was short-shrifted by inadequate knowledge of statistics (as he admitted to me once), and being unable to entirely break out of the logical empiricist paradigm everyone was caught up in. But that seems accidental almost, so it doesn’t take away that much from Popper if you imagine supplementing him a bit.
By the way, I’d be interested to hear your take on Fraser. I’ve long tried to figure him out, but feel I’m only guessing, even after asking him directly.
See p. 107 of Michael Evans’ book – Popper’s 1983 proposal of a measure of corroboration is a 1-1 increasing function of the relative belief measure of evidence.
RE: Fraser, it’s been a while since I read him, so I’ll have to take another look and get back to you. His group-theoretic approach seemed particularly interesting, but in general I think I liked that he gives fairly explicit mathematical constructions even if I don’t always agree with the broader philosophical claims.
Om: I’m sorry but Popper could not have been more opposed to the idea of measuring corroboration as belief or changes in belief or wanting high belief or probability. It’s best not to try and reinvent a philosopher according to one’s favorite statistical philosophy, especially when it’s so at odds with the philosopher.
I wouldn’t claim that Popper would endorse it. I’m just saying it’s a mathematical fact.
And rightly or wrongly there are a number of Bayesians influenced by Popper – Gelman, Tarantola, Dawid…Senn mentions the similarities of de Finetti and Popper in his Dicing book too.
Om: I’m missing your drift. I was trying to clarify a very much misunderstood view of Popperians, critical rationalists, Neyman, Peirce. You can find a lot on Popperian falsification on this blog, and what I say about Gelman’s deductivism in my response to Gelman and Shalizi, also searchable from this blog.
I’ve read most of your writing on this. And I’ve read Gelman & Shalizi and your response and some of Peirce and Popper.
In fact I was surprised when I actually tried manipulating Popper’s confirmation measure how similar it was to the likelihood approach. It made even more sense when I realised it was essentially equivalent to the calculations of a number of Bayesians, whether or not people would admit it.
Reading Mike Evans made it even clearer. Also Tarantola, who described a ‘Popper-Bayes’ algorithm, Gelman’s Falsificationist Bayes and Senn’s Popperian interpretation of de Finetti.
So my point is: just because Popper wouldn’t endorse an interpretation doesn’t mean there aren’t strong mathematical similarities, even equivalences, behind the different approaches.
If people prefer to stick to a team then that’s fine but for me ‘the meaning is in the use’ and as far as I can tell many people are doing the same thing and calling it something else. A 1-1 increasing transformation of a temperature scale is still a temperature scale.
Om: I neglected to reiterate that none of Popper’s attempted quantitative measures of corroboration can be used. They are all “evidential-relation” measures. That’s what my point about Popper never taking the error statistical turn is all about! He gets credit for listing essentially all the confirmation measures people have dreamt up then and since. None are error probabilistic. Popperian Alan Chalmers says somewhere, that the only real difference between Mayo and Popper is that she has a better notion of severity. (No time to get quote). The point is that Popper had the right idea–most but not all of the time–but he never had the statistics to capture his notion of severe testing. Filling in this gap was one of the contributions of my work, as in Error and the Growth of Experimental Knowledge (EGEK 1996). Sorry for your confusion.
Oh, and while I’m at it, you should equally abjure his attempts to quantify something he called “verisimilitude”. It’s now regarded as a mistake even by Popperians (as Musgrave told me in 2006).
Fair enough – for now 🙂 See below for more on topic.
PS more on topic – the first two pages of this Fraser paper (34.pdf) seem very relevant to your original post.
Another interesting discussion: ‘Statistical inference: fiducial and structural vs. likelihood’ by Bunke
http://www.tandfonline.com/doi/abs/10.1080/02331887508801245#.VspIY5N97JI
I think I agree with his point that the key distinction to make seems to be between ‘functional’ (structural) and distributional models (also Fraser’s point I believe). I think I tend to favour functional models as default, however, in contrast to Bunke. Probably my background in physical science style problems.
Also – I think this functional – distributional distinction lies at the heart of our disagreements about confirmation/evidence measures. In my view they are valid given structural assumptions but not without these. This is possibly Gelman’s ‘within/without’ distinction too.
Om:
You might be interested in Fraser’s “Is Bayes posterior just quick and dirty confidence?” (fraser_is-bayes-posterior-just-quick-and-dirty-confidence.pdf).
His short comment on my likelihood principle paper also talks about conditioning (5-fraser_comment-on-mayo.pdf).
Yes another aspect I like of his writings is his emphasis on the role of continuity (regularity) assumptions as *additional* principles – not just mathematical side issues. Again *regularisation* is shown to play a crucial role!
I like this:
> If continuity is included as an ingredient of many model-data combinations, then, as we have indicated, likelihood analysis produces p-values and confidence intervals, and these are not available from the likelihood function alone.
> This thus demonstrates that with such continuity-based conditioning the likelihood principle is not a consequence of sufficiency and conditioning principles. But if we omit the continuity then we are directly faced with the issue addressed by Mayo.
BTW Continuity/regularisation can also be related to assumptions of invariances under transformations (hence the link between his structural probability and his group theoretic treatment mentioned e.g. in the first link I posted).
Another quick note. This appears to me to be one way of looking at the fiducial/structural probability inversion argument:
First, for motivation, consider a constrained deterministic problem where the goal is to determine the x satisfying the conditions.
Given:
(1) u = g(I)
(2) y = f(x,u) = y0 for some fixed y0.
For deterministic functions f,g
These determine the condition:
(3) y0 = f(x,g(I))
The solution set for x is {x | f(x,g(I)) = y0}
Now consider the analogous constrained stochastic problem.
Given:
(1)’ u = p(I)
(2) y = f(x,u) = y0
These determine a stochastic relation since u is stochastic:
(3)’ y = f(x,p(I)) = y0
Where (1) and (3) have been modified to (1)’ and (3)’ but (2) is the same.
Now solve this by:
generating a collection of u ~ p(I)
fix y0
for each u in the collection accept x such that (x,u) = y0
We then get a histogram/multiset for x indicating the number of times x was accepted after running through all u values.
Call this p(x|I,f,y0), the ‘structural/fiducial’ probability of x.
‘for each u in the collection accept x such that (x,u) = y0’
should of course be
‘for each u in the collection accept x such that f(x,u) = y0’
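The corrected recipe can be sketched in a few lines of Python. This is a toy illustration under loud assumptions of mine, not Fraser’s or Fisher’s own construction: I take f(x,u) = x + u (a simple location model), u standard normal, and y0 = 3.0, and the helper names are hypothetical. With this f there is exactly one x accepted for each u, namely x = y0 - u, so the “histogram” of accepted x values is just the u distribution reflected and shifted to sit around y0.

```python
import random
import statistics

def f(x, u):
    """Hypothetical structural equation: y = x + u (a location model)."""
    return x + u

def solve_for_x(u, y0):
    """The x satisfying f(x, u) = y0 for this particular choice of f."""
    return y0 - u

def structural_sample(y0, n_draws, seed=0):
    """Accepted x values, one per generated u, all consistent with y = y0."""
    rng = random.Random(seed)
    us = [rng.gauss(0.0, 1.0) for _ in range(n_draws)]  # generate u ~ p(u|I)
    xs = [solve_for_x(u, y0) for u in us]               # accept x with f(x,u) = y0
    # Sanity check: every accepted x really does satisfy the constraint.
    assert all(abs(f(x, u) - y0) < 1e-9 for x, u in zip(xs, us))
    return xs

xs = structural_sample(y0=3.0, n_draws=50_000)
print(f"mean of accepted x: {statistics.mean(xs):.3f}")
print(f"sd of accepted x:   {statistics.stdev(xs):.3f}")
```

Note that u is drawn without any reference to y0. Whether that step is legitimate is exactly the substantive question; the code simply assumes it.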
Last comment for a bit:
A key implicit assumption (made clearer through writing it out in terms of probability calculations) is that the steps:
generating a collection of u ~ p(I)
fix y0
Should be exchangeable. That is, fixing y0 should not change what info you have about u, equivalently:
p(u|I,y0) = p(u|I)
[I should have written p(u|I) instead of p(I) before as well.]
This is also noted by Fraser in the paper I linked.
Om: I don’t really understand your two bulleted points from Fraser’s comment on me. If continuity is included we get things like p-values that we couldn’t have gotten from likelihood alone. But w/o continuity we get to “the issue addressed by Mayo”. What’s your understanding?
In this particular case (I’ll have to check the paper he cites as containing the details) I believe it is something like the following (and which is I think similar to aspects of his quick and dirty paper):
Likelihood is evaluated at fixed data. Frequentists want to answer ‘what if the data had been different’ ie stability under perturbations of the observed data.
If we assume likelihood has a sufficiently smooth dependence on the data – ie take y0new = y0+delta – then for sufficiently small delta we can calculate the local change in likelihood (to its value at y0new) as a function of the likelihood evaluated at y0 *as well as its derivatives evaluated at y0* by using a Taylor series. Thus we can approximate variations in data space (of particular interest to frequentists) by local knowledge of the likelihood and its higher derivatives.
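Written out, the expansion being described would look something like the following. The notation is my own and assumes scalar data for simplicity: L(θ; y) is the likelihood of θ viewed as a function of the data y, and δ is the perturbation delta, truncated here at second order.

```latex
L(\theta;\, y_0 + \delta) \;\approx\; L(\theta;\, y_0)
  + \delta \left.\frac{\partial L(\theta;\, y)}{\partial y}\right|_{y = y_0}
  + \frac{\delta^{2}}{2} \left.\frac{\partial^{2} L(\theta;\, y)}{\partial y^{2}}\right|_{y = y_0}
```

The first term on the right is all that the likelihood function at the observed data supplies; the derivative terms are the extra, continuity-based ingredients.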
I read the quick and dirty paper as saying something like if you care most about variations in data space then you can only get good approximations by varying the parameters through a prior under linear mappings from parameter space to the data space. One immediate criticism without thinking too hard is that this neglects eg hierarchical bayes models and other Bayesian methods capable of working more directly in data space and/or that the method of frequentist evaluation should be applied to Bayes etc etc.
(I believe he does something like compare a likelihood expansion chosen to match the confidence variation in data space with the implied data variation of a likelihood + prior combo for a given prior. Note tho that many Bayesians would actually be happy to directly specify a ‘prior’ on the data space and calculate the implied parameter prior if that’s what they ‘believed’ most.)
(In response to omaclaren’s urging to me to read Fraser’s paper on structural probability.)
Wow! I’ve had a quick look at the introduction of the Fraser (1966) paper and, yes, it does seem that I might be a structuralist as well as a likelihoodlum. Unfortunately there is quite a lot of text in that paper that on my first read comes into my head as “blah blah blah” for me to comment in any detail, but I love this part that is emphasised in the original:
“x is known, [theta] is unknown, the distribution of [theta] position with respect to x is known, and the origin of the scale of measurement is conventional.”
That seems to match with my ephemeral understanding of Fisher’s fiducial argument, and is much more understandable than how Fisher puts it (in any of the versions that I have read). People should re-read Mayo’s original post with that quoted text in mind. It seems to me to be both attractive and workable.
Note also how in the Bunke paper he relates the failure to distinguish structural vs distributional assumptions to apparent marginalisation paradoxes.
Jaynes’ proposed resolution of Dawid et al.’s marginalisation paradoxes (in his book, can’t remember chapter off top of my head) similarly appears to involve appeal to structural assumptions.
Furthermore, Judea Pearl’s proposed resolution of Simpson’s paradox involves ‘causal structure’ expressed via a DAG/set of structural equations. So, same again.
In my view these seem to be strong arguments for the structural approach.
The journal Mathematische Operationsforschung und Statistik is not available to me 😦
I’ll link you a pdf in a sec. (Dawid cited it in one of his later papers)
Try this: https://omaclaren.files.wordpress.com/2016/02/bunke-1975-statistical-inference-fiducial-and-structural-vs-likelihood.pdf
Another way I’ve tried to express it is as being analogous to
P: All men are mortal
Q: Socrates is a man
Doesn’t lead to
R: Socrates is mortal
If P, Q and R are simple propositions with no additional structure.
Going to the next ‘level’ allows one to express the structure, however:
P(x): x is mortal.
P: (for all men x in X) P(x)
Socrates s is a man in X
Therefore
P(s): Socrates is mortal.
This is obviously the principle of universal instantiation.
Similarly Mayo accuses Fisher of ‘fallacious probabilistic instantiation’.
This is not a fallacy however if one takes (eg Fraser’s) structural interpretation of Fisher in which he is essentially assuming the universally quantified (structural) premise.
Om: Problem is trying to perform probabilistic operations on the resulting instantiations. Can it work in Fraser’s account? (look at the more recent paper I posted). I asked Cox recently, and he said no (if I’ve understood him). That’s why I don’t really get Fraser’s quick and dirty confidence paper, although I see his point that the error statistical performance requirements for “confidence” are sustained (while not sustained for Bayesians). Fraser claims Lindley decided, because of this disagreement, to deny the use of “probability” in Fraser’s sense of confidence. Fraser says if the “confidence” construal is absent it’s false advertising or something like that.*
If I see him in April I will try to get an answer.
*Christian Robert, in his comment on Fraser, says conventional Bayesians aren’t interested in performance (in the sense of confidence).
I think it (or something like it) can work, meaning Fraser’s early paper and related work. I will write up an example properly when I get time (about to start teaching tho…). I think there is a common idea in the various structuralist approaches and I think I have a way to express it much more clearly than my sloppy attempt above.
I need to read Fraser’s more recent paper more carefully before I can comment on it but am not surprised that linearity vs nonlinearity (and differential geometry in general) can play a role here.
The way one of Fraser’s former students explained it in his graduate course (~ 30 years ago) was (roughly) – the observed data and model provide an unknown transformation of the true unknown parameter, the class of the transformation is known, as is the probability of all members of the class – so the probability is on the transformations not the parameters themselves.
Keith O’Rourke
Interesting. I think that is compatible with the following? Take a deterministic expression
(1) f(x,y,u) = 0.
Now make u a stochastic variable. If for each realization of u the (structural) equation (1) is assumed to hold then you now get an associated family of deterministic equations of the form (1), one for each u value. This gives the class of transformations and probability of each instance of the class that Keith refers to I think.
You might think of these as applications of the universal instantiation rule for u giving a stochastically generated set of possible constraints between particular instances. The structural relationship (class) is invariant but the particular instances are stochastic.
If you further fix one variable eg y=y0, then clearly you can solve each realization for the compatible x values. This gives a stochastically generated set of x values associated to each realization of the same structural form.
This is the same procedure used for eg solving differential equations with random coefficients. As long as the form of the equation is asserted to hold for any realization then a random coefficient gives a collection of instances of equations with that structure, and hence you can talk about the probability of a particular instance of that equation occurring.
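A tiny discrete case makes the “probability of a particular instance” talk concrete. The sketch below rests entirely on assumptions I am making up for illustration: the structural equation is f(x,y,u) = y - x - u = 0, u takes the values -1, 0, 1 with probabilities 1/4, 1/2, 1/4, and y is fixed at y0 = 5. Each realization of u gives one deterministic equation; solving each for x and carrying the probability of u across gives the probability of each instance exactly, with no simulation needed.

```python
from fractions import Fraction

# Hypothetical distribution for the stochastic variable u.
p_u = {-1: Fraction(1, 4), 0: Fraction(1, 2), 1: Fraction(1, 4)}

def solve_x(y, u):
    """Solve the assumed structural equation f(x, y, u) = y - x - u = 0 for x."""
    return y - u

def instance_distribution(y0):
    """Probability of each x value, carried over from the probability of u."""
    dist = {}
    for u, prob in p_u.items():
        x = solve_x(y0, u)
        dist[x] = dist.get(x, Fraction(0)) + prob
    return dist

dist = instance_distribution(y0=5)
for x in sorted(dist):
    print(f"x = {x} in {dist[x]} of the realizations")
```

Here x = 5 turns up in half of the realizations of the structure. How that proportion should be read, as a statement about the parameter or about the generating procedure, is left open by the arithmetic.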
Om: We can talk about the probability of an instance occurring (your last sentence), which, to me, is like the prob a method yields such and such result. But it’s not the probability of a parameter taking on a value. However, I’ve never felt that direct probability provided a measure of evidence or corroboration. So, unlike others, I see no reason to strive for that kind of posterior probability assignment in the first place. Instead one moves from a probabilistic qualification of a method to an assessment of how well tested claims are. So I wouldn’t rule out a fiducial-style probability offering a measure of corroboration or severity or the like.
It can be considered the probability of a parameter taking a value given a fixed structure. Which is basically Gelman’s approach. I don’t think it is too different from your approach – identify ‘model structure’ and ‘method’ and the results seem to be analogous. Gelman relativises probability to model structure, you to methods.
Om: I don’t get what you just wrote. The difficulty in moving around what’s getting the probability is the tendency to blur what may legitimately be said. The probability the method correctly/incorrectly interprets the data is one thing, the “probability of a parameter taking a value given a fixed structure” is another. But I’m willing to try to identify model structure and method, if you think it explains Fraser or fiducial.
I agree it’s a bit subtle. I mean something like ‘the parameter takes the value x0 in 67% of the realisations of the model structure/applications of the method’.
I do think a model structure – method identification is a good place to start to relate Fraser’s interp of Fisher, your methods and Gelman’s Bayes.
(I personally think Nozick’s work in epistemology is similar, as well as [maybe] Haack’s crossword metaphor but I’m not philosophically qualified enough to justify this).
Goodluck!
Om: ‘the parameter takes the value x0 in 67% of the realisations of the model structure/applications of the method’.
This sounds like the confidence idea. We still don’t have grounds for making an inference about the parameter in the case at hand, but at least this is understandable. It falls under the “performance” use of probability, with a bit of “rubbing off”. (see my first fiducial blog).
On Haack, she’s also a Peircean, but doesn’t delve into statistics. I once suggested to Haack an important revision to her Peircean crossword puzzle idea, by the way.
You can also find non-statistical “arguments from coincidence” discussed in Wes Salmon, Musgrave, Hacking, me.
I tend to assume that everyone accepts a strong argument from coincidence (e.g., from a coincidence of outputs from multiple, well understood measurement tools to a “real effect”). That’s how I start my new book. But why?
The next question is how to capture the intuition (using statistics).
Yes I agree, I think.
I’d be tempted, however, to just not make any one statement about the case at hand – instead present the information as is (a distribution over parameter values, given an ensemble of model structure realisations). Is a further interpretation required? At some point we stop interpreting in terms of other things and just point at something, right?
If the goal is to take further action based on this information – eg the action might be to guess a point estimate of the ‘true’ value – then I think risk/reward/decision theory and all that (maybe other things too – performance is perhaps part of decision theory or maybe distinct in your account eg an argument from coincidence?) become relevant.
Om: No, we don’t stop and point. We might start with ostensive definitions, but if pointing and shaking our heads are the upshot of all this statistical modeling and inference, it’s rather bankrupt.
Go back to the previous step, see what comes of it, and let me know.
Another quick point – as pointed out in the refs there are often multiple structural relationships, usually involving unobserved vars, for any given distributional relationship for observables. Some don’t like this.
My view is that you hence can only compare theories that correspond to different realisations of the same structure (compare within a model as Gelman would say where his model is my model structure).
‘Paradigm shifts’ are given by structural instability (a dynamical systems term).
Phan: OK, but can you do ordinary probability operations on the results? Perhaps you could do whatever can be done with error probabilities of methods.
Am I the only one who has no idea about what is going on?
What is going on where? In my post? Or in the discussion? The latter got off topic.
Perhaps this is where I can step back in to what is going on.
> However, I’ve never felt that direct probability provided a measure of evidence or corroboration. So, unlike others, I see no reason to strive for that kind of posterior probability assignment in the first place. Instead one moves from a probabilistic qualification of a method to an assessment of how well tested claims are.
Thanks, that’s clear.
> ‘the parameter takes the value x0 in 67% of the realisations of the model structure/applications of the method’
Not if x0 is a fixed value and applications of the method are to this “universe” – as nothing in a “representation” can determine the distribution of unknowns in this “universe”. Now if x0 is a random interval, then yes the true unknown constant will be contained in those intervals. This I believe was David Cox’s position – the math/prob stuff is fine – it’s the interpreting of the probabilities as (incorrectly) applying to the parameters.
Now, I am reaching back 30 years, but Fraser’s take on fiducial was as frequency error calculator par excellence, fully conditional, uniform over the full parameter space, no (really less suspicion of) relevant subsets and full accuracy of p_values and confidence levels.
On the other hand, Liu and Meng cast fiducial as a way to treat the prior as a nuisance parameter (in Meng’s JSM2015 talk) – how to reduce or even eliminate the influence of the prior but still get useful (rather than literal?) posterior probabilities.
Keith O’Rourke
Phanerono: Thank you, this is helpful. But when you talk about Fraser, I’m less clear:
“as frequency error calculator par excellence, fully conditional, uniform over the full parameter space, no (really less suspicion of) relevant subsets and full accuracy of p_values and confidence levels.”
On the Meng idea of treating the prior as a nuisance parameter, is this different from one of the conventional (default/reference) priors? I hadn’t heard of conventional Bayesians seeing their posteriors as useful but not literal. Given that many/most/all? conventional Bayesians view the prior as an undefined mathematical entity to get a posterior, why revive “fiducial”? Maybe the answer is that the fiducialists still require a frequentist construal of the method that “rubs off” on the particular case, and conventional Bayesians interpret posteriors Bayesianly (as degrees of belief or the like). I know they don’t always get the same answers.
It is only with great reluctance that I can bring myself to read discussions on Fisher and Neyman and what they wrote and meant. Tukey is never mentioned in all of this. Did he ever write anything on the topic? The discussions have now been going on for 80 years without any signs of a conclusion being reached. On the one hand my reluctance can be explained by the feeling that it has nothing to do with the way I think about statistics. On the other hand it can also be explained by the fact that I simply do not understand the arguments. It is difficult to understand why one doesn’t understand something but I will make a first attempt. One reason could be the fact that the discussion lacks precision: the terms involved are not well defined.
(a) “μ has a probability of 5 per cent. of exceeding M” is a different
statement from (b) “M has a probability of 5 per cent. of falling
short of μ”. There’s no problem about equating these two so long as M
is a random variable. But watch what happens in the next sentence. According to Fisher,
Neyman violates ‘the principles of deductive logic [by accepting a] statement such as
[1] Pr{(M – ts) < μ < (M + ts)} = α,
as rigorously demonstrated, and yet, when numerical values are
available for the statistics M and s, so that on substitution of
these and use of the 5 per cent. value of t, the statement would read
[2] Pr{92.99 < μ < 93.01} = 95 per cent.,
to deny to this numerical statement any validity. This evidently
is to deny the syllogistic process of making a substitution in the
major premise of terms which the minor premise establishes as
equivalent (Fisher 1955, p. 75).
The standard probability model is a sample space \Omega, a sigma-field \mathcal{F} of subsets F of \Omega, a probability measure P over \mathcal{F}, a parameter space \Theta and a random variable M_{\mu}, \mathcal{F}-measurable from \Omega to the real line, with \mu in \Theta. The statement [1] is P(M_{\mu} < \mu) = 0.05, which is a legitimate expression within the mathematical model. The expression [2] is not legitimate within the model.
However it seems that the discussion does not take place within a mathematical model. For example 'there is no problem in equating the two when M is a random variable'. Here 'random variable' does not mean a measurable function from Omega into the real line. It means something else. This also applies to probability Pr and to the parameter mu. This means that probability, the parameter mu and the random variable M_{mu} require an interpretation, in other words some semantics are required. As far as I can see there have been no semantics in the discussion so far. The word probability has several meanings: I will probably go to the theatre on Saturday; the coin turned up heads in 48% of the throws; Great Britain will probably leave the European Union. Take the amount of copper in a sample of water. We speculatively identify mu with the real amount of copper in the water. In an interlaboratory test the samples are prepared so, given the speculative identification, mu is known. Here [2] does not make sense. The institute conducting the test will not always prepare water samples with the same amount of copper. Given repetitions of the interlaboratory test both [1] and [2] can make sense. There is no need to treat mu in [2] as a random variable, it is simply a frequentist statement based on the empirical data. Finally it is difficult to give the word random a precise interpretation. The institute preparing the samples could decide on the following values for the quantity of copper: 5.44, 3.96, 5.22, 5.75, 9.00, 5.43, 2.95 …. This looks random but it is deterministic, being taken from the decimal expansion of pi*(log 2)/4 = 0.54439652257590054329… .
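For anyone who wants to check the copper example, the following is a small sketch of how such a “random-looking” sequence could be generated deterministically. It assumes the values are read off as consecutive three-digit blocks of the decimal expansion of pi*(log 2)/4, with log the natural logarithm; that reading is inferred from the digits quoted above. The names `pi_decimal` and `copper_values` are made up for the illustration, and the pi series is the recipe given in the Python `decimal` documentation.

```python
from decimal import Decimal, getcontext, localcontext

def pi_decimal():
    """Compute pi to the current decimal precision (recipe from the decimal docs)."""
    getcontext().prec += 2  # extra digits for intermediate steps
    three = Decimal(3)
    lasts, t, s, n, na, d, da = 0, three, 3, 1, 0, 0, 24
    while s != lasts:
        lasts = s
        n, na = n + na, na + 8
        d, da = d + da, da + 32
        t = (t * n) / d
        s += t
    getcontext().prec -= 2
    return +s  # unary plus rounds back to the working precision

def copper_values(count=6, precision=50):
    """Read `count` three-digit blocks of pi*ln(2)/4 as copper amounts 'd.dd'."""
    with localcontext() as ctx:
        ctx.prec = precision
        x = pi_decimal() * Decimal(2).ln() / 4
        digits = str(x).split(".")[1]  # digits after the decimal point
    blocks = [digits[i:i + 3] for i in range(0, 3 * count, 3)]
    return [f"{b[0]}.{b[1:]}" for b in blocks]

if __name__ == "__main__":
    print(copper_values())
```

Nothing in the output distinguishes this sequence from one produced by a chance mechanism, which is the point being made about the word “random”.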
Laurie: I now see your full comment, so I add to this below*. I’m totally prepared to share your exasperation in going back to what Neyman and Fisher said. Unfortunately, there is a baked-on, accepted view of the Fisher vs Neyman dispute that keeps coming into today’s discussions. Knee-jerk positions continue to fill current-day advisories and glossaries, based on the supposition that p-values are “evidential” while N-P tests and confidence intervals are merely for acceptance sampling. The basis for these claims? That Fisher castigated Neyman for invoking the probabilities associated with a method, rather than a parameter, overlooking the fact that these remarks reflect Fisher’s supposition that the move from (1) to (2) was a proper probabilistic instantiation. It matters not that various fiducialists have erected ways to make sense of the move. I’m revealing important aspects underlying the rhetoric between Neyman and Fisher (say from 1935-1960 and longer) that have been taken out of context.
*Whatever disagreements arose, it was assumed that a formal notion of probability was being used, not an informal notion, as in “I probably should have checked the chicken after 4 hours”. A central issue was whether some “new principle” was needed to make sense out of fiducial probability, assuming (as Fisher insisted) there was no prior probability. I take it that it’s agreed that some “new principle” (if one is to call it that) is needed, but even so (I think) the probability is attached to the method on the order of a confidence distribution. That’s how the discussion of Fraser arose in the comments. I’m in no way claiming to settle anything in current-day fiducial approaches.
Laurie – Tukey is cited in the Bunke paper as confusing structural and distribution models.
Keith – x0 can take different values relative to a *varying model* even if it is fixed with respect to ‘this universe’ or the ‘truth’.
I continue to like the structural-distribution distinction. I started writing up a more formal/less ambiguous note that might be more acceptable to Laurie but I’ve also just hit the start of semester and teaching…will finish one day.
PS I agree that this particular quoted statement by Fisher is wrong. It seems likely that he had something like the structural interpretation in mind and chose his words poorly in this case.
Fraser mentions somewhere that Fisher gave approval to his group theoretical formulation of the structural probability interpretation of the Fiducial argument. We have a family of probability models with common structure; one way of expressing this is through group invariance. Another is given by Fraser in the paper I linked.
The probability interpretation appears to be of confidence type in both cases. Either the probability of a parameter relative to model realisations or of model realisations relative to a fixed parameter. They are equivalent representations of the same thing. Moving relative to a fixed frame or stationary relative to a moving frame – no difference. Hence the invariance argument.
Om: not sure about “probability of a parameter relative to model realisations” unless it really is just what’s meant in the corresponding confidence statement.
Like I said, they seem to me to be the same, based on a structural interpretation. Which is perhaps what Fisher was trying to say (but expressed incorrectly in this case).
I noticed Shalizi had a post last week that cites the same paper by Fraser and the fact that Bayesian betting rates needn’t have much to do with warranting frequentist performance. But the part about it that I find most interesting is his discussion of how people “Bayesify” (a new term for me I think) results carried out by non-Bayesian means. I’ll try to get him to comment here.
http://bactra.org/weblog/1135.html
The link for Fisher 1955 actually goes to Neyman 1956.
Thank you! It is fixed now.