“If you make a critique, it cannot be based on the post-data properties.”

It’s not; it’s a pre-data animal that could warrant saying a significant result is good evidence of a discrepancy 4 SE greater than the null, say.

“You need a different kind of argument to say that the rejection ratio is not a pre-data frequentist method.”

No, the situation is very simple. You can make up all kinds of functions of power and alpha – square them, add and subtract whatever – that simply ARE NOT methods from any frequentist school, just as not all functions of probabilities you can dream up obey the probability calculus. Now, as I explained in my post, the rejection ratio could look sensible – in an affirming-the-consequent kind of way. That is, I’m not saying it is just any old arbitrary invention (see footnote 1), but it’s not a method from a frequentist methodology. I don’t know how to say this any more clearly.
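To make the "function of power and alpha" point concrete, here is a minimal sketch (my own illustration, assuming a one-sided z-test with known SE; the function name is mine) of the BBBS rejection ratio (1 − β)/α. Every ingredient is fixed before any data are seen:

```python
from statistics import NormalDist

# Illustrative sketch (assumed setup): the BBBS "rejection ratio"
# R = (1 - beta) / alpha for a one-sided z-test of H0: mu <= 0.
# Every input is fixed before the data are seen.
z = NormalDist()
alpha = 0.05
crit = z.inv_cdf(1 - alpha)  # rejection cutoff in SE units, ~1.645

def rejection_ratio(discrepancy_in_se: float) -> float:
    """Power at the assumed discrepancy, divided by alpha."""
    power = 1 - z.cdf(crit - discrepancy_in_se)
    return power / alpha

# The ratio depends only on alpha and the assumed alternative,
# never on the observed data.
print(round(rejection_ratio(3.0), 2))
```

Of course, that something can be computed from pre-data ingredients is just the point at issue: being a function of power and alpha does not by itself make the result a method of any frequentist school.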

By the way, my first comment is still awaiting moderation and I think our exchange is not visible to anyone else.

From that paper: “Seidenfeld argues that while the ‘best’ NP confidence interval may be reasonable before observing the data (i.e., on the ‘forward look’) it may no longer be reasonable once the data is observed (i.e., on the ‘backward look’)”

You argue, reasonably, that he cannot apply post-data arguments to criticize NP intervals. For the same reason, you shouldn’t use post-data arguments to criticize the pre-experimental rejection ratio as defined by BBBS.

But it’s not a “measure of initial strength of evidence provided by a significant result” – whatever that means.

I think we pretty much agreed here: https://errorstatistics.com/2016/05/22/frequentstein-whats-wrong-with-1-%CE%B2%CE%B1-as-a-measure-of-evidence-against-the-null/#comment-141011

Readers may be interested in the full discussion.

*Bayesians – if they hold to inference by Bayes theorem, and thus the likelihood principle – can’t directly pick up on biasing selection effects that alter the sampling distribution, e.g., optional stopping, multiple testing, enabling them to be wrong with high probability.*

There’s an extensive literature on Bayesian multiple testing. The most widely used multiple-testing techniques all have close Bayesian analogs that are widely applicable; they are not artifacts of convenient choices of prior. So I don’t think this particular claim holds up.

Optional stopping is more contentious, though if study design is allowed to inform the parameters on which one makes inference, rapprochement is possible.
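The effect at issue with optional stopping can be shown by simulation. A hypothetical sketch (sample sizes, thresholds, and names are my own choices for illustration): peeking at a z-test after every new observation under a true null drives the overall rejection rate well above the nominal 5%.

```python
import random

# Hypothetical sketch of optional stopping under a true null:
# peek at the z-statistic after each new observation (from n = 10 on)
# and stop at the first |z| > 1.96.
random.seed(0)

def stops_significant(max_n: int = 100) -> bool:
    total = 0.0
    for n in range(1, max_n + 1):
        total += random.gauss(0, 1)        # data generated under H0
        if n >= 10 and abs(total / n**0.5) > 1.96:
            return True                    # declared "significant"
    return False

sims = 2000
rate = sum(stops_significant() for _ in range(sims)) / sims
print(f"rejection rate under H0 with peeking: {rate:.2f}")
```

The stopping rule alters the sampling distribution of the final statistic, which is exactly why the two camps assess such designs differently.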

*Inference by Bayes theorem doesn’t statistically falsify (unless you add a falsification rule which goes beyond Bayesian inference), doesn’t give means to test assumptions of models, and fails to control error probabilities.*

See my earlier comments, in this discussion, on practical means to test assumptions in the Bayesian framework.

Regarding failure to control error probabilities, this is certainly not immediate with Bayes, but nothing stops one from considering (and doing inference on) how frequently errors would be made in replicate studies, and whether their frequency would be controlled by a procedure of interest.
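As a hypothetical sketch of what that could look like (the model, prior, and decision rule below are my own assumptions, not anything proposed in this exchange): take the Bayesian rule “assert mu > 0 when the posterior P(mu > 0) exceeds 0.95” under a N(0, 1) prior with N(mu, 1) data, and simulate how often it errs in replicate studies generated with mu = 0.

```python
import random
from statistics import NormalDist

# Hypothetical sketch: frequentist error rate, in replicate studies,
# of a Bayesian decision rule (conjugate normal model, all assumed).
random.seed(1)
z = NormalDist()
n, sims = 10, 5000

def posterior_prob_mu_positive(xbar: float) -> float:
    post_mean = n * xbar / (n + 1)       # N(0, 1) prior, sigma = 1
    post_sd = (1 / (n + 1)) ** 0.5
    return z.cdf(post_mean / post_sd)

errors = sum(
    posterior_prob_mu_positive(random.gauss(0, 1 / n**0.5)) > 0.95
    for _ in range(sims)
)
print(f"simulated error rate under mu = 0: {errors / sims:.3f}")
```

Here the rule’s error frequency can simply be measured; whether a given Bayesian procedure controls it becomes an empirical question about that procedure rather than something settled by the formalism.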

As we’ve come a long way from where we started (what a posterior probability means) I think this is an appropriate place to sign off. Thanks for the interesting discussion, I hope you found it helpful too.

I look forward to the new book.

The problem is exactly that anything goes in a Bayesian system. You can find a prior to sustain any claim you want. Bayesians – if they hold to inference by Bayes theorem, and thus the likelihood principle – can’t directly pick up on biasing selection effects that alter the sampling distribution, e.g., optional stopping, multiple testing, enabling them to be wrong with high probability. Inference by Bayes theorem doesn’t statistically falsify (unless you add a falsification rule which goes beyond Bayesian inference), doesn’t give means to test assumptions of models, and fails to control error probabilities.

As I say, deductive systems are there for free use by any inference account; the trouble will be to show soundness that doesn’t merely say, “I define this as rational belief,” or some other non-testable claim. We all know, or should know, by now the shortcomings of the Cox–Savage attempts to stipulate an inductive logic by fiat. I realize some are still enamored of them. Go easy on the Kool-Aid.

You say, “I think you’re up against an axiom there; it’s a baked-in part of the Bayesian approach…”.

No empirical claim “comes up against an axiom,” because an axiom is a mere piece of syntax that only gets its meaning, and thus the possibility of being true and open to appraisal, by being given a semantics. The creator of a formal system sets out axioms by fiat, and the grammar is also a piece of syntax. The Bayesian meaning you say is “baked in” to the approach is degree of knowledge, or how much “information.” What needs to be substantiated is that any resulting assertions, now being given meanings, latch up with what is taken to be true about information, inference, and knowledge. A more common semantics is in terms of betting, and people argue as to whether the resulting claims hold true for bets (I’m not saying that’s had great success), but you have nixed that. So you’d have to show why your semantics is adequate for expressing truths about knowledge and inference, or whatever you have in mind. Few people, these days, think probability logic captures how people actually make inferences, but it might be alleged, as it often is, that they ought to reason this way if they were rational, or the like. Is that what you’d aver in the case of your semantics?

No one is ever robbed of using a formal system or logic, and (having started life as a formal logician) I’m aware of quite a few logics out there that could be contenders for representing epistemic stances – given the right semantics. Logics can be assured to be deductively valid (so long as they’re constructed consistently), but that means very little when it comes to soundness. The trick is to show that what’s derivable from your axioms captures what’s true about the domain of interest – and conversely.

*I don’t understand how these probabilities describe what is known.*

I think you’re up against an axiom there; it’s a baked-in part of the Bayesian approach that information can be encoded in this way, i.e., that there is some fixed total amount of knowledge that is spread out over possible (but unknown) true states of Nature, and that the support for the union of disjoint sets of states of Nature is the sum of their supports.

With regard to “cashing out” (which I interpret as just considering the consequences of these axioms) it’s fairly obvious that this grammar means one cannot end up supporting states of Nature one did not support *a priori*, which seems like a weakness. But it is possible to end up with Bayesian inferences along the lines of “here’s what we now believe about Nature, and also that the data we observed is nevertheless nothing like what we’d expect under those states of Nature”. So this weakness is perhaps not so bad as it first seems; at least in a practical sense models can still be checked against data, and perhaps rejected.
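A minimal posterior-predictive sketch of that kind of check (entirely my own toy example; the model, data, and test statistic are assumed): a N(mu, 1) model is fit to data whose spread is in fact much larger, and replicated datasets drawn from the posterior predictive almost never match the observed standard deviation, flagging the misfit.

```python
import random
from statistics import mean, pstdev

# Toy posterior-predictive check (all choices assumed): the model says
# N(mu, 1), but the data actually have sd = 3.
random.seed(2)
data = [random.gauss(0, 3) for _ in range(50)]
n, obs_sd = len(data), pstdev(data)

post_mean = n * mean(data) / (n + 1)   # conjugate update, N(0, 1) prior
post_sd = (1 / (n + 1)) ** 0.5

reps, extreme = 1000, 0
for _ in range(reps):
    mu = random.gauss(post_mean, post_sd)          # draw from posterior
    rep = [random.gauss(mu, 1) for _ in range(n)]  # replicate under model
    extreme += pstdev(rep) >= obs_sd
print(f"posterior predictive p-value: {extreme / reps:.3f}")
```

A p-value near zero says the believed states of Nature could not plausibly have produced data like those observed – the “believe it, yet the data are nothing like what we’d expect” situation.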

Other consequences of the axioms are, well, all of probability theory. Despite their simplicity, the axioms are known for providing an immensely rich grammar; it’s therefore reasonable to require those who maintain that Bayes can’t provide the inferences they seek to strongly justify their claims.

Forgoing the probability grammar for describing knowledge, one can either have no grammar at all, or some alternative. With no grammar, there seems (to me) to be no principle against which to evaluate different procedures, i.e., no way of saying whether inferences are appropriate or not. With some other, non-Bayesian grammar it is extremely difficult to rule out incoherence, which seems a serious shortcoming for a scientific system.

“Probability is just a set of rules – a grammar, if you like – for describing what’s known about fixed parameters. If the arguments for it being a sensible grammar don’t convince, pragmatists will note it happens to be a grammar that makes it very easy to update knowledge in the light of data that wasn’t previously used.”

I don’t understand how these probabilities describe what is known, and if they fail to represent what’s known, it’s not obviously pragmatically relevant that they make it easy to update what’s known. Even in an uncontroversial case of the probability of an event (not a statistical hypothesis), the number needs to be cashed out – generally in terms of how frequently the event would occur in some hypothetical population. How does this work in describing “what is known” about parameters? I’m not sure how describing it as a “grammar” helps to make the numbers relevant for empirical inquiry.

“Regarding well-testedness – which I do think is interesting – it would really help to have examples where this differs importantly from reasonable interpretations of sensibly-constructed confidence (or credible) intervals, which give an indication of how different the truth would have to be in order to give a different testing outcome.”

Since this is to provide an inferential interpretation and rationale of existing (error-statistical) methods like CIs, it wouldn’t make sense for an intuitively plausible assessment of well-testedness to be at odds numerically with a good or best CI – even if there are differences in interpretation. It’s what is poorly tested that will differ. For example, points within a CI are non-rejectable at a given level, but they may be very poorly tested.
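A rough numerical sketch of that contrast (my own illustration, with an assumed normal model, known sigma, and made-up numbers; “severity” is computed here, in the spirit of the error-statistical account, as the probability of a smaller observed mean were mu equal to mu1):

```python
from statistics import NormalDist

# Illustrative sketch (assumed numbers): a mu1 inside the 95% CI is
# non-rejectable, yet the claim "mu > mu1" is very poorly tested.
z = NormalDist()
n, sigma, xbar = 100, 1.0, 0.2
se = sigma / n**0.5

def severity_mu_greater_than(mu1: float) -> float:
    """P(X-bar < observed x-bar; mu = mu1), assessing 'mu > mu1'."""
    return z.cdf((xbar - mu1) / se)

lo, hi = xbar - 1.96 * se, xbar + 1.96 * se  # 95% CI: (0.004, 0.396)
mu1 = xbar + 1.5 * se                        # 0.35, inside the CI
print(f"mu1 = {mu1:.2f} lies in ({lo:.3f}, {hi:.3f}); "
      f"severity of 'mu > mu1' = {severity_mu_greater_than(mu1):.3f}")
```

In this toy setup, mu1 = 0.35 survives in the non-rejection sense, yet the claim that mu exceeds it has severity of only about 0.07 – the sense in which a point inside a CI can nevertheless be poorly tested.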

If there were no controversy surrounding the right interpretation and justification of existing methods, there would perhaps be no need to recognize a different goal, but that’s not the case. If it turns out that well-testedness underlies the implicit aim of existing methods, lying below their stated goals – that’s all to the good; that’s my hope. But I’ve never seen it defended or used to adjudicate current criticisms.

“It therefore seems unlikely that one could show that a particular way of proceeding provides all your goals”.

This is a piecemeal approach; I didn’t say a single application of a statistical procedure would do all this. I was just trying to suggest a view that stands as a rival to an aim that is thought to be uncontroversial but which isn’t argued for, e.g., formal degrees of belief and updating them. (The “growth of knowledge” goes beyond formal statistical aims.) Explaining what’s wrong with inferring poorly tested claims, and why we’d want claims that survive stringent testing, isn’t at all difficult. What’s the reason for desiring a grammar to assign numbers, when I don’t know what they mean or how to use them, or why it’s thought one particular choice of grammar expresses knowledge?
