http://www.nytimes.com/2014/01/05/magazine/a-speck-in-the-sea.html?_r=0

Really a gripping story, as so many fishing-accident stories are. I am always in awe of those who fish from boats, and at the same time ashamed of how poorly we as a society treat fishers, though in this case, thankfully, there were resources to rescue one of them in need.

The clues that Sosinski put together were hugely valuable: “Sosinski had also been having second thoughts about the search area. After his initial conversation with Davis, he inspected the boat more carefully, and he found a few important clues. . . Together Sosinski and Winters came up with a new theory: Aldridge had gone overboard somewhere between the 40-fathom curve, about 25 miles offshore, and the Anna Mary’s first trawl, about 40 miles offshore.”

Then the inevitable happens: the Sarops computer crashes. The report is unclear about whether Sarops was generating new maps or whether the team was looking at a pre-crash map, and “Averill proposed a simple track-line search: the Jayhawk would head south-southeast for about 10 miles, straight through the main search area, then turn sharply to the north for another 10 miles, then veer north-northwest, which would take the crew straight back to Air Station Cape Cod. It wasn’t a conventional pattern, and it wasn’t Sarops-generated, but it would have to do.”

The key phrase to me in the story is “It wasn’t a conventional pattern, and it wasn’t Sarops-generated, but it would have to do.” This is a story of human ingenuity, and is yet another illustration of how powerful human minds are at computation, pattern recognition and problem solving, still more powerful than any computer algorithm, Bayesian or otherwise. Sosinski searched the boat for valuable clues, and Averill quickly crafted a search pattern that accommodated the helicopter’s position and fuel restrictions.

Using this story as some kind of proof that Bayesian methods are somehow superior is disingenuous. If the entire search effort had blindly followed Sarops patterns, who knows if Aldridge would be alive today.

As Simonsohn aptly notes, “if people misused or misunderstood one system, they would do just as badly with the other. Bayesian statistics, in short, can’t save us from bad science”, a point clearly exemplified by the Duke University cancer research fiasco.

---

Not surprised that Gelman wanted to correct what seems to have been attributed to him.

“Today, this kind of number is called a p-value, the probability that an observed phenomenon or one more extreme could have occurred by chance.”

Boy, if they were going to play out Gelman’s criticisms of frequentist statistics, you would think they could at least have defined a p-value correctly! The p-value is the probability, computed under the null hypothesis, of observing data at least as extreme as what was actually observed; the quoted version drops the conditioning on the null.

This was hardly a rush article for page one.

Nathan

---

Rule R: whenever the data differ from the null by more than k, regard the observed difference as statistically significant at level alpha (small, like .01).

But we MUST consider the properties of the test rule:

Prob(R regards the difference as significant at level alpha; Ho) must be approximately alpha.

This is an error probability, and it attaches to the method or rule.

So if people are using a test rule that would readily declare a difference “rare” under Ho, even when the difference is quite common under Ho, then that requirement is violated.

The relevant high probability concerns the test rule: that it probably would have warned us:

Prob(Rule would have warned me it’s mere chance, when it is mere chance) = high

For details, see the articles.
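The requirement on the rule can be checked by simulation. Here is a minimal sketch with my own illustrative numbers (not from the comment): a cutoff fixed in advance delivers roughly its nominal error rate under Ho, while a rule that compares the most extreme of many looks at the data to the same cutoff violates the requirement.

```python
import random

random.seed(1)
alpha = 0.01
k = 2.576            # two-sided z cutoff giving alpha ~ 0.01 under Ho
n_trials = 200_000

# Valid rule: cutoff fixed before seeing the data.
# Under Ho the statistic is N(0, 1), so Prob(R declares significance; Ho) ~ alpha.
valid_rate = sum(abs(random.gauss(0, 1)) > k for _ in range(n_trials)) / n_trials

# Violated rule: take the most extreme of 10 looks and compare it to the
# same cutoff.  This rule readily declares a difference "rare" under Ho
# even though such differences are common, so its error rate inflates.
inflated_rate = sum(
    max(abs(random.gauss(0, 1)) for _ in range(10)) > k
    for _ in range(n_trials)
) / n_trials

print(valid_rate)     # close to 0.01
print(inflated_rate)  # close to 1 - 0.99**10, i.e. roughly 0.096
```

The error probability attaches to the rule, not to any single observed difference, which is exactly why the second rule's nominal "alpha" is misleading.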

---

Or should it be, show me the prior probabilities?

---

“The boots gave Aldridge a chance to think.”

As for your other comments, I don’t know anyone who equates using a theorem on inverse probability with Bayesian statistics.

---

“But the whole thing is silly as some success story for Bayesian inference as opposed to a mere use of Bayes’ rule.”

Well, this is a curious standard, a kind of converse of the no-true-Scotsman fallacy: any use of Bayes’ rule that you’d license as legitimate is, by definition, not Bayesian. I could argue until I’m blue in the face that the SAROPS approach is grounded in ideas clearly expressed in Myron Tribus’s book Rational Descriptions, Decisions, and Designs, which is itself firmly in the Cox–Jaynes tradition, and you’ll just say that the provenance of the ideas doesn’t matter; it’s “mere” rudimentary probability.

---

We have had A LOT of discussion of the role of background on this blog, including several exchanges with Gelman. Please search the blog if interested.

All that said, I admit that “frequentist vs. subjective probability” fails to capture the central debate. But as to your remark about long runs: frequentist claims have short-run implications that are testable now.

---

There are more things in heaven and earth than are dreamt of in your philosophy. For you, probabilities are either frequencies or beliefs; you can’t imagine anything else.

There is a third option, which is mentioned here in hopes of avoiding further “talking past each other”. Probability distributions define a range of possibilities to consider for an unknown but true value.

For example, an error distribution defines a range of possibilities of the true errors that exist in the data taken.

This is not a frequency, even approximately. The “range” doesn’t specify frequencies of any kind. It is, in essence, a concrete representation of the uncertainty in the true errors existing in the data taken. The smaller the range, the better they are known.

This is not an “opinion”. Either the true errors lie in the region considered or they don’t. It’s objective.

This is testable. It’s significantly easier to “test” whether the true errors are in this range than it is to test frequentist beliefs about the limiting frequencies of future errors in measurements which will never be taken.
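A toy version of that check, with my own illustrative numbers (the error values and sigma are hypothetical, not from the comment): treat an error distribution N(0, sigma^2) as defining the range +/- 3*sigma for the true errors, and verify directly whether the errors in a given data set lie in it.

```python
sigma = 2.0
true_value = 10.0
# Hypothetical "true errors" in five measurements (illustrative numbers):
errors = [0.7, -1.4, 3.1, -0.2, 2.6]
data = [true_value + e for e in errors]

# The range-of-possibilities claim: every true error lies within +/- 3*sigma.
# Either the true errors lie in this region or they don't -- an objective,
# checkable statement about this data set, not a frequency of anything.
in_range = all(abs(e) <= 3 * sigma for e in errors)
print(in_range)   # True here, since 3 * sigma = 6.0 and the largest |error| is 3.1
```

No appeal to the limiting frequency of future measurements is needed; the statement is about these errors and this range.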

---

Just want to note that “finding that fisherman who floated on his boots at best used likelihoods” is false. This description of SAROPS isn’t great (it’s not detailed enough, and the stats jargon is misused), but anyone familiar with Bayesian decision theory in its online sequential form will recognize the paradigm.
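That sequential paradigm can be sketched in a few lines. The cells, numbers, and detection probability below are hypothetical, not SAROPS internals: a prior over grid cells, an unsuccessful sweep of a cell, and a Bayes'-rule update that shifts probability mass to the unsearched cells.

```python
# Hypothetical three-cell search grid; numbers are illustrative only.
prior = {"A": 0.5, "B": 0.3, "C": 0.2}   # prior probability the target is in each cell
pod = 0.8                                # assumed probability of detection per sweep

def update_after_failed_sweep(p, cell, pod):
    """Bayes' rule after an unsuccessful sweep of `cell`."""
    post = dict(p)
    post[cell] = p[cell] * (1 - pod)     # likelihood of 'not found' given target there
    total = sum(post.values())
    return {c: v / total for c, v in post.items()}

post = update_after_failed_sweep(prior, "A", pod)
# Probability mass shifts away from the searched cell toward the others;
# the next sweep is then allocated against the updated map, and so on.
```

Repeating the update sweep after sweep, and choosing each sweep to maximize detection probability, is precisely online sequential Bayesian decision-making, whatever one wants to call it.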

---

This is wrong; the p-value is a random variable that will take different “observed” values from experiment to experiment. The Type I error rate is a fixed value, i.e. a constant, set up in advance.

(2) P-values have frequentist meaning, not only because they are based on sampling distributions, but more importantly because, if the rejection rule is stated in advance as “reject when p < 0.05”, then overall 5% of true nulls will be rejected (when 100% of the nulls tested are true). However, if you do not state any rejection rule (or significance level) in advance, perform a single experiment, and obtain, say, p = 0.003, how can you say that your error rate is 0.003? That makes no sense at all, since your next experiment would almost certainly give a very different observed p. What would be the meaning of the post hoc statement “the Type I error rate is 0.003”?
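Both halves of this point show up in a simple simulation (a minimal sketch using a standard z-test; the sample size and trial count are my own choices): under a true null the observed p bounces around roughly uniformly, yet the pre-stated rule "reject when p < 0.05" rejects about 5% of true nulls.

```python
import math
import random

random.seed(2)

def one_p_value(n=30):
    # z-test of Ho: mu = 0 on N(0, 1) data; two-sided p-value.
    xs = [random.gauss(0, 1) for _ in range(n)]
    z = sum(xs) / math.sqrt(n)
    return math.erfc(abs(z) / math.sqrt(2))   # = 2 * P(Z > |z|)

pvals = [one_p_value() for _ in range(50_000)]
rejection_rate = sum(p < 0.05 for p in pvals) / len(pvals)
print(rejection_rate)          # close to 0.05, as the pre-stated rule promises
print(min(pvals), max(pvals))  # individual observed p's vary wildly
```

The 5% is a property of the rule over repetitions; any single observed p, like 0.003, is just one draw from that distribution.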

(3) P-values do not overstate the evidence compared with Bayesian posteriors in the case of an absolutely sharp null hypothesis. The (often) huge discrepancies are caused because Bayesians can only attack this problem by assigning positive probability mass to a single point that has Lebesgue measure zero. Therefore, it is the Bayesian who performs rituals and does mindless statistics, because his “god” Jeffreys advocated it!
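The discrepancy at issue here is the Jeffreys–Lindley effect. A minimal numerical sketch (sigma and tau are my own illustrative choices), with a point mass on Ho: mu = 0 and a N(0, tau^2) prior on mu under H1, shows the Bayes factor for Ho growing with n even as the p-value is held fixed near 0.05 (z = 1.96):

```python
import math

def bf01(z, n, sigma=1.0, tau=1.0):
    # x_bar ~ N(mu, sigma^2 / n); under H1 the marginal of x_bar is
    # N(0, tau^2 + sigma^2 / n).  BF01 is the ratio of the two marginal
    # densities of x_bar evaluated at the observed x_bar.
    s2 = sigma**2 / n
    xbar = z * math.sqrt(s2)
    v1 = tau**2 + s2
    return math.sqrt(v1 / s2) * math.exp(-0.5 * xbar**2 * (1 / s2 - 1 / v1))

for n in (10, 1_000, 100_000):
    print(n, round(bf01(1.96, n), 2))
# The same "significant" z favors Ho more and more strongly as n grows.
```

Whether one reads this as p-values overstating evidence, or as an artifact of the point mass on a measure-zero null, is exactly the dispute in this thread.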

(4) The P-value can then be interpreted as the smallest level of significance, that is, the ‘borderline level’, since the outcome observed would be judged significant at all levels greater than or equal to the P-value but not significant at any smaller levels.

Once again wrong, sorry Ms Jean Dickinson Gibbons and Mr Pratt.

The second (incredible) mistake here is that the observed outcome would be judged significant at all levels SMALLER than or equal to the P-value (not greater), but not significant at any GREATER (not smaller) levels.
