**MONTHLY MEMORY LANE: 3 years ago: October 2013.** I mark in **red** **three** posts from each month that seem most apt for general background on key issues in this blog, excluding those reblogged recently[1], and in **green** up to 3 others I’d recommend[2]. Posts that are part of a “unit” or a pair count as one.

**October 2013**

- **(10/3) Will the Real Junk Science Please Stand Up? (critical thinking)**
- **(10/5) Was Janina Hosiasson pulling Harold Jeffreys’ leg?**
- **(10/9) Bad statistics: crime or free speech (II)? Harkonen update: Phil Stat / Law /Stock**
- **(10/12) Sir David Cox: a comment on the post, “Was Hosiasson pulling Jeffreys’ leg?” (10/5 and 10/12 are a pair)**
- (10/19) Blog Contents: September 2013
- **(10/19) Bayesian Confirmation Philosophy and the Tacking Paradox (iv)\***
- **(10/25) Bayesian confirmation theory: example from last post… (10/19 and 10/25 are a pair)**
- **(10/26) Comedy hour at the Bayesian (epistemology) retreat: highly probable vs highly probed (vs what ?)**
- **(10/31) WHIPPING BOYS AND WITCH HUNTERS** (interesting to see how things have changed and stayed the same over the past few years, share comments)

**[1]** Monthly memory lanes began at the blog’s 3-year anniversary in Sept, 2014.

**[2]** New rule, July 30, 2016: very convenient.

Is the tacking paradox still something that philosophers take seriously?

Om: yes.

Interesting. It seems like there are numerous simple solutions in the statistical literature from either likelihood or Bayes perspective but all involve treating any given ‘hypothesis’ as one parameter value within a set of possible values. The philosophical literature seems to treat hypotheses as isolated simple propositions. Only this latter approach seems to obscure the issue.

OM: Look at the comments, e.g., by Corey and others. In general, if B entails E, E increases the probability of B, if it’s not 1 already.

That’s exactly what I mean by obscuring the issue.

A non-rigorous argument (but mostly correct) is as follows. This was essentially given by Basu.

Define unrelated parameters to mean p(y;a,b) factors as

p(y;a,b) = f(y)p(y;a)p(y;b)

The given is

p(y;a,b) = p(y;a)

Hence

p(y;b) = f(y) = constant for fixed data.

Hence likelihood inference on b involves a flat likelihood and hence all values are equally well supported (or not supported).

On the other hand inference for the parameter a is unaffected by the presence of b.

Oops, that should be p(y;b) = 1/f(y); everything else follows as before.

The argument is even cleaner from a Bayesian perspective: use an independent prior p(a,b) = p(a)p(b) and calculate the marginals. There is no support for an unrelated and non-predictive parameter b, as expected.
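The likelihood and Bayes arguments above can be checked numerically. The following is only a minimal sketch under assumptions that are not in the original comment: a single observation `y_obs`, a normal data model whose mean is `a`, a parameter `b` that the model ignores, and grid priors chosen purely for illustration.

```python
import math

def normal_pdf(y, mean, sd=1.0):
    z = (y - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))

def likelihood(y, a, b):
    # Assumption: the data depend on a only; b is the "tacked on" parameter.
    return normal_pdf(y, mean=a)

def normalise(weights):
    total = sum(weights)
    return [w / total for w in weights]

y_obs = 1.2                                   # hypothetical observation
a_grid = [i / 10 for i in range(-30, 31)]     # candidate a values, -3.0 to 3.0
b_grid = [i / 10 for i in range(-30, 31)]     # candidate b values, -3.0 to 3.0

# Independent prior p(a, b) = p(a) p(b); both shapes are illustrative choices.
prior_a = normalise([1.0 for _ in a_grid])
prior_b = normalise([normal_pdf(b, mean=0.5, sd=2.0) for b in b_grid])

# Unnormalised joint posterior on the product grid, then the two marginals.
joint = [[pa * pb * likelihood(y_obs, a, b) for b, pb in zip(b_grid, prior_b)]
         for a, pa in zip(a_grid, prior_a)]
evidence = sum(sum(row) for row in joint)
post_a = [sum(row) / evidence for row in joint]
post_b = [sum(joint[i][j] for i in range(len(a_grid))) / evidence
          for j in range(len(b_grid))]

# Likelihood view: profile out a for each b; the result is flat in b.
profile_b = [max(likelihood(y_obs, a, b) for a in a_grid) for b in b_grid]

max_shift_b = max(abs(p - q) for p, q in zip(post_b, prior_b))
a_map = a_grid[max(range(len(a_grid)), key=post_a.__getitem__)]

print("largest change in the marginal for b:", max_shift_b)
print("posterior mode for a:", a_map)
```

In this sketch the marginal posterior for b reproduces its prior up to rounding, and the profile likelihood in b is constant, while the marginal for a concentrates near the observation.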

Om: It’s the conjunction that gets a B-boost given a conjunct is supported.

So:

The combination of a good theory with parameter ‘a’ & an irrelevant (neither good nor bad) theory with a parameter ‘b’ gets a boost overall.

When the boost is ‘localised’ to the parameters of each theory, using standard likelihood or Bayes arguments, to see where the boost ‘comes from’, we find that the good theory is boosted and the irrelevant theory is not.

No paradox as far as I can see.

On the other hand, the ‘paradox’ is usually generated by using an additional hand-wavy ‘philosophical’ principle (which is contra statistical practice) that ‘a&b is boosted therefore b is boosted’. This is stated in the post as

“However, it is also plausible to hold :

(2) Entailment condition: If x confirms T, and T entails J, then x confirms J

”

So, it appears that the ‘paradox’ is really a reductio of the suspect ‘entailment condition’ which is not supported by any statistical or probability theory but is rather a ‘philosophical intuition’.

Using the standard statistical methods of dealing with nuisance parameters to ‘localise’ inferences on the other hand gives sensible conclusions.

The ‘logical’ language used in the philosophical literature, and used to ‘justify’ the entailment condition, obscures the fact that you can’t generally go from information about a function of two variables f(a,b) to information about a function of one variable without e.g. making an assumption on how the two-variable function factorises into two one-variable functions.
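To make the reductio concrete, here is a small sketch with exact arithmetic. The numbers are hypothetical and chosen only for illustration: A and B are independent a priori, and the evidence x depends on A alone. Under those assumptions x raises the probability of the conjunction A & B, the conjunction entails B, and yet the probability of B does not move.

```python
from fractions import Fraction as F
from itertools import product

# Hypothetical numbers: A and B independent, evidence x depends on A only.
p_a = F(1, 2)
p_b = F(1, 2)
p_x_given_a = {True: F(9, 10), False: F(1, 5)}

def joint(a, b, x):
    pa = p_a if a else 1 - p_a
    pb = p_b if b else 1 - p_b
    px = p_x_given_a[a] if x else 1 - p_x_given_a[a]
    return pa * pb * px

def prob(event):
    # Sum the joint probability over all (a, b, x) worlds where the event holds.
    return sum(joint(a, b, x)
               for a, b, x in product((True, False), repeat=3)
               if event(a, b, x))

def cond(event, given):
    return prob(lambda a, b, x: event(a, b, x) and given(a, b, x)) / prob(given)

x_observed = lambda a, b, x: x
conjunction = lambda a, b, x: a and b
b_alone = lambda a, b, x: b

prior_conj = prob(conjunction)               # 1/4
post_conj = cond(conjunction, x_observed)    # 9/22
prior_b = prob(b_alone)                      # 1/2
post_b = cond(b_alone, x_observed)           # 1/2

print("A & B:", prior_conj, "->", post_conj)
print("B    :", prior_b, "->", post_b)
```

So if 'confirms' means 'raises the probability of', the entailment condition fails in this toy case: the boost to A & B comes entirely from A.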

Om: I don’t think this is the same as the claim about functions. Here you must allow that x confirms (A and B) even though x does not confirm B. It is an unusual thing to say, is it not? It is at the heart of criticisms of Bayes factors as not being measures of evidence. The link (Schervish) should be in the post.

http://www.cs.ubc.ca/~murphyk/Teaching/CS532c_Fall04/Papers/schervish-pvalues.pdf

“It is an unusual thing to say, is it not?”

My point is no it is not. It follows from standard statistical practice. The point is presumably that B is a ‘neutral’ parameter, neither good nor bad. If B was a false theory then this would be strange, but B is not contradicted by the data either.

Compare ‘predictions from a good physical theory and my hat is red’ vs ‘predictions from a bad physical theory and my hat is blue’. The former is better supported than the latter. We can also localise to see that it is the physical theory part doing the work.

Om: Of course, and insofar as the choice for confirmation is boost, Popper is saying, you can’t also have confirmation be identified with the posterior probability. Probability is, on the resulting view, not a good measure of confirmation so the search is on to find one. Popper, and more recently, Fitelson list around 10 or 20. Moreover, if you reject entailment (sometimes called special consequence) then various criticisms aimed at frequentist inference can’t be lodged. I’m fairly sure that this all came up in detail in the comments. By the way, I noticed that Briggs quotes my blogpost on this in his new book on uncertainty.

RE:

“if you reject entailment (sometimes called special consequence) then various criticisms aimed at frequentist inference can’t be lodged”

I have no particular desire to criticise frequentist inference and advocate for Bayes/likelihood or vice versa.

I just think this particular argument is a poor criticism of Bayes/likelihood, and think entailment is evidently a bad idea (regardless of whether some philosophical Bayesians argue for it – I don’t think any Bayesian statistician holds it since it contradicts probability theory and their methods for dealing with nuisance parameters).

Also, just one last point of emphasis: the parameter space should be taken to be the Cartesian product of a and b values. So yes, we are dealing with a function of two variables (e.g. the likelihood function is a function of two variables here).

That is the standard statistical approach, which appears different from the philosophical approach in which simple propositions are used. I find it unclear how the philosophers are formulating the question. Is the ‘confirmation’ function of two propositions defined for all T/F combinations? Ie (T,T), (T,F), (F,T), (F,F). And then compared over these combinations to see which case is supported? That is the analogue of the statistical approach.

A philosopher would perhaps call this a counterfactual approach; I would call it ‘using functions and variables to model things’.

A particular instance of the conjunction ‘A & B’ is then ‘parameter a takes the value a* and parameter b takes the value b*’ say.

We can compare all such pairs (a*,b*). To say something like ‘the value b* of b is supported’ is ill-defined unless an ‘a value’ is also given, for the simple reason that your function takes two arguments.

In the case of orthogonal parameters (i.e. when the product likelihood factorisation above holds) we can ‘localise’ inferences. A Bayesian can additionally assume a product factorisation of the prior for independent parameters.

All of this is straightforward if you adopt a ‘functions and variables’ formulation or, perhaps, at least define and compare the confirmation measure (or whatever) for all possible propositional (T/F) combinations.
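As a rough illustration of comparing all (T/F) combinations, the sketch below tabulates a likelihood P(x | a, b) on the product space and asks whether 'b is supported' has an answer that does not depend on a. Both tables are hypothetical: in the first, b is irrelevant; in the second, a and b interact, so the likelihood does not factorise.

```python
VALUES = (True, False)

# Hypothetical likelihood tables P(x | a, b), keyed by (a, b).
irrelevant_b = {(True, True): 0.9, (True, False): 0.9,
                (False, True): 0.2, (False, False): 0.2}
interacting_b = {(True, True): 0.9, (True, False): 0.3,
                 (False, True): 0.2, (False, False): 0.6}

def b_ratio(table, a):
    """Likelihood ratio for b=True against b=False at a fixed value of a."""
    return table[(a, True)] / table[(a, False)]

def b_support_is_well_defined(table, tol=1e-12):
    """True when the b ratio is the same for every a, so no a value is needed."""
    ratios = [b_ratio(table, a) for a in VALUES]
    return max(ratios) - min(ratios) < tol

def best_pairs(table):
    """All (a, b) pairs attaining the maximum likelihood."""
    top = max(table.values())
    return sorted(pair for pair, value in table.items() if value == top)

for name, table in (("irrelevant b", irrelevant_b), ("interacting b", interacting_b)):
    print(name, "| b support well defined:", b_support_is_well_defined(table),
          "| best pairs:", best_pairs(table))
```

With the first table both (T,T) and (T,F) are equally well supported, so the a value is doing all the work. With the second, whether b=True is favoured flips with the value of a, which is the sense in which 'the value b* is supported' is ill-defined without an a value.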

Against my better judgement (in terms of time management, at least) I wrote up some more thoughts on the tacking ‘paradox’ here:

https://omaclaren.com/2016/10/21/the-tacking-paradox-revisited-notes-on-the-dimension-and-ordering-of-propositional-space/

I’d be interested if you had any comments or response, since you (and other philosophers) apparently view this as a real issue for e.g. Bayes or Likelihood inference, while it seems like a misleading example to me.

Om: Will have a look.

Hi Mayo, have you had a chance to have a look yet? Saw you are doing a panel thing with Glymour soon. Does he still think the tacking paradox argument holds? Was he the originator, or (one of) the first to state it clearly?

Also, do you know if the usual Bayesian philosophers have made the same point as me, or do they address differently? For fun I note that a logician who finds constructive logic (or Kripke semantics etc) compelling would likely come to the equivalent conclusion as me.

Om: Sorry not to have done so yet. Glymour is the one who originated the “old evidence” problem. The tacking paradox has been around since the hypothetico-deductive approach. It was a problem for Popper, and one of the initial goals of the severity idea was to avoid it. I’ll tweet you a page shot from Popperian Chalmers where he claims severity dissolves the problem (not sure how to place it in a comment directly):


Thanks for this. As I mentioned on twitter, though, I’ve read that section before.

My main point is that Likelihoodists and Bayesians have had a natural response to this for years, and have used it in practice – they treat it as a problem of nuisance parameters.

A couple of more philosophical points

1. Chalmers’ discussion is slightly different in that he doesn’t use the entailment condition and focuses the ‘paradox’ on the fact that the conjunction ‘good theory & irrelevant theory’ can be confirmed.

Your solution is that the conjunction is not confirmed. The Bayesian/Likelihood solution, for example, is that only the conjunction, and not necessarily its ‘parts’, is confirmed.

This leads to the question – can we ‘confirm’ in some way or other a good theory with a few possibly eliminable (irrelevant) parts? Surely we do in fact want this (or something like this) – better a good theory with some irrelevant parts than no theory, right?

Otherwise no theory could ever be confirmed because it could be argued to depend on details beyond our measurement capacity e.g. whether string theory or some other theory of quantum gravity is ‘correct’ should not affect our evaluation of some macroscopic theory.

This would be another paradox – good theories are so sensitive that they can be destroyed by tacking on irrelevant propositions. But irrelevant propositions are always lurking about.

2. In terms of the argument (in your original post) that *does* use the entailment condition, my post basically argues that it is a ‘type’ error: a theory is (more like) a function from propositions to observations, i.e. AxB->Y, than just propositions, i.e. AxB.

So we cannot in general go from a function AxB->Y to a function A->Y without a B value, though we can go from a proposition on AxB to a new proposition C using the propositional operators and their ‘truth-functionality’.

3. In general I would prefer to use (similarly to Laurie, and many others) evaluations that specify a model as ‘adequate wrt’ or ‘consistent with’ some data, rather than ‘confirmed’. This is not that important for the present arguments though.

So with 1., the point is that we need to allow overall confirmation even if potential irrelevant parts exist, as long as there are *some* good parts to the theory.

With 2. we block inconsistent localisation via entailment by recognising that theories are functions not propositions.
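The 'type' point in 2. can be sketched in code. This is only an analogy under invented assumptions: `theory` is a hypothetical function of type A x B -> Y that happens to ignore its b argument, and `localise_to_a` is an illustrative helper, not anything from the original post.

```python
from functools import partial
from itertools import product
from typing import Callable

def theory(a: float, b: str) -> float:
    """Hypothetical theory of type A x B -> Y; b is accepted but ignored."""
    return 2.0 * a

def localise_to_a(t: Callable[[float, str], float],
                  b_value: str) -> Callable[[float], float]:
    """Going from A x B -> Y to A -> Y requires supplying some b value."""
    return partial(t, b=b_value)

predict_red = localise_to_a(theory, "red")    # 'my hat is red'
predict_blue = localise_to_a(theory, "blue")  # 'my hat is blue'

# Because this theory ignores b, every localisation gives the same predictions.
same_predictions = all(predict_red(a) == predict_blue(a)
                       for a in (0.0, 1.5, -2.0))

# Bare propositions behave differently: (A and B) -> B holds in every row
# of the truth table, with no further input needed.
conjunction_entails_b = all((not (a and b)) or b
                            for a, b in product((True, False), repeat=2))

print("predictions independent of b:", same_predictions)
print("(A and B) entails B:", conjunction_entails_b)
```

The contrast is that the propositional step is available for free, while the functional step needs a b value (or a factorisation assumption) before anything can be said about a alone.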

Om: I’d love to answer this in detail and will when I can. It seems clear to me that you’re endorsing the position of those who reject the entailment or special consequence condition while allowing the irrelevant conjunction to be confirmed. This has been deemed counterintuitive and problematic for a long time, hence, the tacking paradox. It is not a paradox raised by Glymour, but an old one. He offered a way to pinpoint portions confirmed, still in the spirit of the logical positivists. He doesn’t think it works (that was 1980). I have a different way that shares the spirit of his.

“Consistent with” is multiply ambiguous. Merely not contradicting, which is its strict meaning, is no longer to give a theory of evidence or inference (x can be utterly irrelevant to H while consistent with it; x can even support ~H and be consistent with H). Being “consistent with” in the sense used by Cox, say, wrt significance tests, or in ordinary usage, is actually much stronger. More later on in the week.

Thanks, I look forward to your detailed response. I hope it takes up the points as I have stated them, ie on my argument as given.

RE:’deemed counterintuitive’ – ironically my position is ‘consistent with’ those who adopt intuitionist logic!

(Though this is really just a result of prioritising trying to state it in a natural mathematical formulation instead of the ‘logical’ formulation. My biggest peeve with philsci is the use of artificial logical language where it doesn’t fit).

RE:’consistent’ – as mentioned I wanted to keep this as a side issue, an ‘irrelevant conjunct’ say. I have an explicit formulation in mind, but that would distract from the issue I think.