philosophy of science

Severity as a ‘Metastatistical’ Assessment

Some weeks ago I discovered an error* in the upper severity bounds for the one-sided Normal test in section 5 of “Statistical Science Meets Philosophy of Science Part 2” (SS & POS 2). The published article has been corrected. The error was in section 5.3, but I am blogging all of section 5.

(* μo was written where xo should have been!)

5. The Error-Statistical Philosophy

I recommend moving away, once and for all, from the idea that frequentists must ‘sign up’ for either Neyman and Pearson, or Fisherian paradigms. As a philosopher of statistics I am prepared to admit to supplying the tools with an interpretation and an associated philosophy of inference. I am not concerned to prove this is what any of the founders ‘really meant’.

Fisherian simple-significance tests, with their single null hypothesis and at most an idea of  a directional alternative (and a corresponding notion of the ‘sensitivity’ of a test), are commonly distinguished from Neyman and Pearson tests, where the null and alternative exhaust the parameter space, and the corresponding notion of power is explicit. On the interpretation of tests that I am proposing, these are just two of the various types of testing contexts appropriate for different questions of interest. My use of a distinct term, ‘error statistics’, frees us from the bogeymen and bogeywomen often associated with ‘classical’ statistics, and it is to be hoped that that term is shelved. (Even ‘sampling theory’, technically correct, does not seem to represent the key point: the sampling distribution matters in order to evaluate error probabilities, and thereby assess corroboration or severity associated with claims of interest.) Nor do I see that my comments turn on whether one replaces frequencies with ‘propensities’ (whatever they are). Read more »

Categories: Error Statistics, philosophy of science, Philosophy of Statistics, Severity, Statistics | 5 Comments

Seminars at the London School of Economics: Contemporary Problems in Philosophy of Statistics

As a visitor of the Centre for Philosophy of Natural and Social Science (CPNSS) at the London School of Economics and Political Science, I am leading 3 seminars in the department of Philosophy, Logic, and Scientific Method on Wednesdays from Nov. 28 to Dec. 12 on Contemporary Philosophy of Statistics under the PH500 rubric, Room: Lak 2.06 (Lakatos building). Interested individuals who have not yet contacted me, write: error@vt.edu.*
The Autumn seminars will also feature discussions with distinguished guest statisticians: Sir David Cox (Oxford); Dr. Stephen Senn (Competence Center for Methodology and Statistics, Luxembourg); Dr. Christian Hennig (University College, London).
  • 28 November: (10 – 12 noon): Mayo: On Birnbaum’s argument for the Likelihood Principle: A 50-year old error and its influence on statistical foundations (See my blog and links within.)

5 December and 12 December: Statistical Science meets philosophy of science: Mayo and guests:

  • 5 Dec (12 noon – 2 p.m.): Sir David Cox
  • 12 Dec (10 – 12): Dr. Stephen Senn;
    Dr. Christian Hennig: TBA

Topics, activities, readings: TBA (two 2012 Summer Seminars may be found here).

Blurb: Debates over the philosophical foundations of statistical science have a long and fascinating history marked by deep and passionate controversies that intertwine with fundamental notions of the nature of statistical inference and the role of probabilistic concepts in inductive learning. Progress in resolving decades-old controversies, which still shake the foundations of statistics, demands both philosophical and technical acumen, but gaining entry into the current state of play requires a roadmap that zeroes in on core themes and current standpoints. While the seminar will attempt to minimize technical details, it will be important to clarify key notions to fully contribute to the debates. Relevance for general philosophical problems will be emphasized. Because the contexts in which statistical methods are most needed are ones that compel us to be most aware of strategies scientists use to cope with threats to reliability, considering the nature of statistical method in the collection, modeling, and analysis of data is an effective way to articulate and warrant general principles of evidence and inference.
Room 2.06 Lakatos Building; Centre for Philosophy of Natural and Social Science
 London School of Economics
 Houghton Street
London WC2A 2AE
Administrator: T. R. Chivers@lse.ac.uk

For updates, details, and associated readings, please check the LSE PH500 page on my blog or write to me.
*It is not necessary to have attended the 2 sessions held during the summer of 2012.

Categories: Announcement, philosophy of science, Statistics | 28 Comments

PhilStat: So you’re looking for a Ph.D dissertation topic?

Maybe you’ve already heard Hal Varian, Google’s chief economist: “The next sexy job in the next ten years will be statisticians.” Even Larry Wasserman declares that “statistics is sexy.” In that case, philosophy of statistics must be doubly so!

Thus one wonders at the decline of late in the lively and long-standing exchange between philosophers of science and statisticians. If you are a graduate student wondering how you might make your mark in a philosophy of science area, philosophy of statistical science, fairly brimming over with rich and open philosophical problems, may be the thing for you!* Surprising, pressing, intriguing, and novel philosophical twists on both traditional and cutting-edge controversies are going begging for analysis—they not only bear on many areas of popular philosophy but also may offer you ways of getting out in front of them.

I came across a spotty blog by Pitt graduate student Gregory Gandenberger a while back (not like his new, frequently updated one) where he was wrestling with a topic for his master’s thesis, and some years later, wrangling over dissertation topics in philosophy of statistics. After I started this blog, I looked for it again, and now I’ve invited him to post, on the topic of his choice, as he did here, and I invite other graduate students through the U-Phil call.

Categories: Error Statistics, philosophy of science, Philosophy of Statistics | 3 Comments

Mayo: (section 6) “StatSci and PhilSci: part 2”

Here is section 6 of my new paper: “Statistical Science Meets Philosophy of Science Part 2: Shallow versus Deep Explorations” (SS & POS 2). Section 5 is in my last post.

6. Some Knock-Down Criticisms of Frequentist Error Statistics

With the error-statistical philosophy of inference under our belts, it is easy to run through the classic and allegedly damning criticisms of frequentist error-statistical methods. Open up Bayesian textbooks and you will find, endlessly reprised, the handful of ‘counterexamples’ and ‘paradoxes’ that make up the charges leveled against frequentist statistics, after which the Bayesian account is proffered as coming to the rescue. There is nothing about how frequentists have responded to these charges; nor evidence that frequentist theory endorses the applications or interpretations around which these ‘chestnuts’ revolve.

If frequentist and Bayesian philosophies are to find common ground, this should stop. The value of a generous interpretation of rival views should cut both ways. A key purpose of the forum out of which this paper arises is to encourage reciprocity.


Categories: Error Statistics, philosophy of science, Philosophy of Statistics | 1 Comment

Mayo: (section 5) “StatSci and PhilSci: part 2”

Here is section 5 of my new paper: “Statistical Science Meets Philosophy of Science Part 2: Shallow versus Deep Explorations” (SS & POS 2). Sections 1 and 2 are in my last post.*

5. The Error-Statistical Philosophy

I recommend moving away, once and for all, from the idea that frequentists must ‘sign up’ for either Neyman and Pearson, or Fisherian paradigms. As a philosopher of statistics I am prepared to admit to supplying the tools with an interpretation and an associated philosophy of inference. I am not concerned to prove this is what any of the founders ‘really meant’.

Fisherian simple-significance tests, with their single null hypothesis and at most an idea of  a directional alternative (and a corresponding notion of the ‘sensitivity’ of a test), are commonly distinguished from Neyman and Pearson tests, where the null and alternative exhaust the parameter space, and the corresponding notion of power is explicit. On the interpretation of tests that I am proposing, these are just two of the various types of testing contexts appropriate for different questions of interest. My use of a distinct term, ‘error statistics’, frees us from the bogeymen and bogeywomen often associated with ‘classical’ statistics, and it is to be hoped that that term is shelved. (Even ‘sampling theory’, technically correct, does not seem to represent the key point: the sampling distribution matters in order to evaluate error probabilities, and thereby assess corroboration or severity associated with claims of interest.) Nor do I see that my comments turn on whether one replaces frequencies with ‘propensities’ (whatever they are). Read more »

Categories: Error Statistics, philosophy of science, Philosophy of Statistics, Severity | 5 Comments

Insevere tests and pseudoscience

Against the PSI skeptics of this period (discussed in my last post), defenders of PSI would often erect means to take experimental results as success stories (e.g., if he failed to correctly predict the next card, maybe he was aiming at the second or third card). If the data could not be made to fit some ESP claim or other (e.g., through multiple end points) it might, as a last resort, be explained away as due to negative energy of nonbelievers (or being on the Carson show). They manage to get their ESP hypothesis H to “pass,” but the “test” had little or no capability of finding (uncovering, admitting) the falsity of H, even if H is false. (This is the basis for my term “Gellerization”.) In such cases, I would deny that the results afford any evidence for H. They are terrible evidence for H. Now any domain will have some terrible tests, but a field that routinely passes off terrible tests as success stories I would deem pseudoscientific. 

We get a kind of minimal requirement for a test result to afford any evidence of assertion H, however partial and approximate H may be:  If a hypothesis H is assured of having* “passed” a test T, even if H is false, then test T is a terrible test or no test at all.**

Far from trying to reveal flaws, it masks them or prevents them from being uncovered. No one would be impressed to learn their bank had passed a “stress test” if it turns out that the test had little or no chance of giving a failing score to any bank, regardless of its ability to survive a stressed economy. (Would they?)

There are a million different ways to flesh out the idea, and I welcome hearing others. Now you might say that no one would disagree with this. Great. Because a core requirement for an adequate account of inquiry, as I see it, is that it be able to capture this rationale for pretty terrible evidence and fairly pseudoscientific inquiry– and it should do so in such a way that affords a starting point for not-so-awful tests, and rather reliable learning.

* or very probably would have passed.

**QUESTION: I seek your input: which sounds better, or is more accurate: saying a test T passes a hypothesis H, or that a hypothesis H passes a test T? I’ve used both and want to settle on one.

Categories: Error Statistics, philosophy of science | 5 Comments

Statistics and ESP research (Diaconis)

In the early ‘80s, fresh out of graduate school, I persuaded Persi Diaconis, Jack Good, and Patrick Suppes to participate in a session I wanted to organize on ESP and statistics. It seems remarkable to me now—not only that they agreed to participate*, but the extent to which PSI research was taken seriously at the time. It wasn’t much later that all the recurring errors and loopholes, and the persistent self-delusion—despite earnest attempts to trigger and analyze the phenomena—would lead nearly everyone to label PSI research a “degenerating programme” (in the Popperian-Lakatosian sense).

(Though I’d have to check names and dates, I seem to recall that the last straw was when some of the Stanford researchers were found guilty of (unconscious) fraud. Jack Good continued to be interested in the area, but less so, I think. I do not know about the others.)

It is interesting to see how background information enters into inquiry here. So, even though it’s late on a Saturday night, here’s a snippet from one of the papers that caught my interest in graduate school: Diaconis’s (1978) “Statistical Problems in ESP Research”, in Science, along with some critical “letters”

Summary. In search of repeatable ESP experiments, modern investigators are using more complex targets, richer and freer responses, feedback, and more naturalistic conditions. This makes tractable statistical models less applicable. Moreover, controls often are so loose that no valid statistical analysis is possible. Some common problems are multiple end points, subject cheating, and unconscious sensory cueing. Unfortunately, such problems are hard to recognize from published records of the experiments in which they occur; rather, these problems are often uncovered by reports of independent skilled observers who were present during the experiment. This suggests that magicians and psychologists be regularly used as observers. New statistical ideas have been developed for some of the new experiments. For example, many modern ESP studies provide subjects with feedback—partial information about previous guesses—to reward the subjects for correct guesses in hope of inducing ESP learning. Some feedback experiments can be analyzed with the use of skill-scoring, a statistical procedure that depends on the information available and the way the guessing subject uses this information. (p. 131) Read more »

Categories: philosophy of science, Philosophy of Statistics, Statistics | 13 Comments

More on using background info

For the second* bit of background on the use of background info (for the new U-Phil for 9/25/12), I’ll reblog:

Background Knowledge: Not to Quantify, But To Avoid Being Misled By, Subjective Beliefs

…I am discovering that one of the biggest sources of confusion about the foundations of statistics has to do with what it means or should mean to use “background knowledge” and “judgment” in making statistical and scientific inferences. David Cox and I address this in our “Conversation” in RMM (2011)….

Insofar as humans conduct science and draw inferences, and insofar as learning about the world is not reducible to a priori deductions, it is obvious that “human judgments” are involved. True enough, but too trivial an observation to help us distinguish among the very different ways judgments should enter according to contrasting inferential accounts. When Bayesians claim that frequentists do not use or are barred from using background information, what they really mean is that frequentists do not use prior probabilities of hypotheses, at least when those hypotheses are regarded as correct or incorrect, if only approximately. So, for example, we would not assign relative frequencies to the truth of hypotheses such as (1) prion transmission is via protein folding without nucleic acid, or (2) the deflection of light is approximately 1.75” (as if, as Peirce puts it, “universes were as plenty as blackberries”). How odd it would be to try to model these hypotheses as themselves having distributions: to us, statistical hypotheses assign probabilities to outcomes or values of a random variable.

Categories: Background knowledge, philosophy of science, Philosophy of Statistics, Statistics | 21 Comments

After dinner Bayesian comedy hour….

Given it’s the first anniversary of this blog, which opened with the howlers in “Overheard at the comedy hour …” let’s listen in as a Bayesian holds forth on one of the most famous howlers of the lot: the mysterious role that psychological intentions are said to play in frequentist methods such as statistical significance tests. Here it is, essentially as I remember it (though shortened), in the comedy hour that unfolded at my dinner table at an academic conference:

 Did you hear the one about the researcher who gets a phone call from the guy analyzing his data? First the guy congratulates him and says, “The results show a statistically significant difference at the .05 level—p-value .048.” But then, an hour later, the phone rings again. It’s the same guy, but now he’s apologizing. It turns out that the experimenter intended to keep sampling until the result was 1.96 standard deviations away from the 0 null—in either direction—so they had to reanalyze the data (n=169), and the results were no longer statistically significant at the .05 level.

 Much laughter.

 So the researcher is tearing his hair out when the same guy calls back again. “Congratulations!” the guy says. “I just found out that the experimenter actually had planned to take n=169 all along, so the results are statistically significant.”

 Howls of laughter.

 But then the guy calls back with the bad news . . .

It turns out that failing to score a sufficiently impressive effect after n’ trials, the experimenter went on to n” trials, and so on and so forth until finally, say, on trial number 169, he obtained a result 1.96 standard deviations from the null.

It continues this way, and every time the guy calls in and reports a shift in the p-value, the table erupts in howls of laughter! From everyone except me, sitting in stunned silence, staring straight ahead. The hilarity ensues from the idea that the experimenter’s reported psychological intentions about when to stop sampling are altering the statistical results.
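For readers who want to see why the stopping rule in the joke bears on the error probabilities, here is a minimal simulation sketch. It is only an illustration under my own assumptions, not anything taken from the dinner-table story: the data are standard Normal with known σ = 1, the null is true (mean 0), the "try and try again" experimenter checks after every single observation, and sampling is capped at n = 169. The function name `rejection_rates` and its parameters are hypothetical.

```python
import numpy as np


def rejection_rates(max_n=169, z_crit=1.96, trials=20000, seed=0):
    """Estimate how often a true null (mean 0) is rejected under two plans.

    fixed:    sample exactly max_n observations, then test once.
    optional: test after every observation and stop at the first n
              (up to max_n) where |z| >= z_crit.
    """
    rng = np.random.default_rng(seed)
    # Each row is one simulated experiment generated under the null.
    x = rng.standard_normal((trials, max_n))
    n = np.arange(1, max_n + 1)
    # z-statistic after each observation: sum / sqrt(n), since sigma = 1.
    z = np.cumsum(x, axis=1) / np.sqrt(n)
    fixed = float(np.mean(np.abs(z[:, -1]) >= z_crit))
    optional = float(np.mean((np.abs(z) >= z_crit).any(axis=1)))
    return fixed, optional


fixed_rate, optional_rate = rejection_rates()
print(f"fixed n = 169:      rejection rate under the null = {fixed_rate:.3f}")
print(f"optional stopping:  rejection rate under the null = {optional_rate:.3f}")
```

With the fixed sample size the rejection rate under the null stays near the nominal .05, whereas under this stopping plan it comes out several times larger. That difference in the sampling distribution, not anyone's state of mind, is what the reanalysis in the story is responding to.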

Categories: Comedy, philosophy of science, Philosophy of Statistics, Statistics | 3 Comments

knowledge/evidence not captured by mathematical prob.


Equivocations between informal and formal uses of “probability” (as well as “likelihood” and “confidence”) are responsible for much confusion in statistical foundations, as is remarked in a famous paper I was rereading today by Allan Birnbaum:

“It is of course common nontechnical usage to call any proposition probable or likely if it is supported by strong evidence of some kind. .. However such usage is to be avoided as misleading in this problem-area, because each of the terms probability, likelihood and confidence coefficient is given a distinct mathematical and extramathematical usage.” (1969, 139 Note 4).

For my part, I find that I never use probabilities to express degrees of evidence (either in mathematical or extramathematical uses), but I realize others might. Even so, I agree with Birnbaum “that such usage is to be avoided as misleading in” foundational discussions of evidence. We know, infer, accept, and detach from evidence, all kinds of claims without any inclination to add an additional quantity such as a degree of probability or belief arrived at via, and obeying, the formal probability calculus.

It is interesting, as a little exercise, to examine scientific descriptions of the state of knowledge in a field. A few days ago, I posted something from Weinberg on the Higgs particle. Here are some statements, with some terms emphasized:

The general features of the electroweak theory have been well tested; their validity is not what has been at stake in the recent experiments at CERN and Fermilab, and would not be seriously in doubt even if no Higgs particle had been discovered.

I see no suggestion of a formal application of Bayesian probability notions.

Categories: philosophy of science, Philosophy of Statistics | 10 Comments

“Did Higgs Physicists Miss an Opportunity by Not Consulting More With Statisticians?”

On August 20 I posted the start of “Discussion and Digest” by Bayesian statistician Tony O’Hagan, an overview of responses to his letter (ISBA website) on the use of p-values in analyzing the Higgs data, prompted, in turn, by a query of subjective Bayesian Dennis Lindley. I now post the final section in which he discusses his own view. I think it raises many questions of interest both as regards this case, and more generally about statistics and science. My initial July 11 post is here.

“Higgs Boson – Digest and Discussion” By Tony O’Hagan

Discussion

So here are some of my own views on this.

There are good reasons for being cautious and demanding a very high standard of evidence before announcing something as momentous as H. It is acknowledged by those who use it that the 5-sigma standard is a fudge, though. They would surely be willing to make such an announcement if they were, for instance, 99.99% certain of H’s existence, as long as that 99.99% were rigorously justified. 5-sigma is used because they don’t feel able to quantify the probability of H rigorously. So they use the best statistical analysis that they know how to do, but because they also know there are numerous factors not taken into account by this analysis – the multiple testing, the likelihood of unrecognised or unquantified deficiencies in the data, experiment or statistics, and the possibility of other explanations – they ask for what on the face of it is an absurdly high level of significance from that analysis. Read more »

Categories: philosophy of science, Philosophy of Statistics, Statistics | 8 Comments

Scalar or Technicolor? S. Weinberg, “Why the Higgs?”

CERN’s Large Hadron Collider under construction, 2007

My colleague in philosophy at Va Tech, Ben Jantzen*, sent me this piece by Steven Weinberg on the Higgs. Even though it does not deal with the statistics, it manages to clarify some of the general theorizing more clearly than most of the other things I’ve read. (See also my previous post.)

Why the Higgs?
August 16, 2012
Steven Weinberg

The New York Review of Books

The following is part of an introduction to James Baggott’s new book Higgs: The Invention and Discovery of the “God Particle,” which will be published in August by Oxford University Press. Baggott wrote his book anticipating the recent announcement of the discovery at CERN near Geneva—with some corroboration from Fermilab—of a new particle that seems to be the long-sought Higgs particle. Much further research on its exact identity is to come.

It is often said that what was at stake in the search for the Higgs particle was the origin of mass. True enough, but this explanation needs some sharpening.

By the 1980s we had a good comprehensive theory of all observed elementary particles and the forces (other than gravitation) that they exert on one another. One of the essential elements of this theory is a symmetry, like a family relationship, between two of these forces, the electromagnetic force and the weak nuclear force. Electromagnetism is responsible for light; the weak nuclear force allows particles inside atomic nuclei to change their identity through processes of radioactive decay. The symmetry between the two forces brings them together in a single “electroweak” structure. The general features of the electroweak theory have been well tested; their validity is not what has been at stake in the recent experiments at CERN and Fermilab, and would not be seriously in doubt even if no Higgs particle had been discovered. Read more »

Categories: philosophy of science

Good Scientist Badge of Approval?

In an attempt to fix the problem of “unreal” results in science some have started a “reproducibility initiative”. Think of the incentive for being explicit about how the results were obtained the first time….But would researchers really pay to have their potential errors unearthed in this way?  Even for a “good scientist” badge of approval?

August 14, 2012

Fixing Science’s Problem of ‘Unreal’ Results: “Good Scientist: You Get a Badge!”

Carl Zimmer, Slate

As a young biologist, Elizabeth Iorns did what all young biologists do: She looked around for something interesting to investigate. Having earned a Ph.D. in cancer biology in 2007, she was intrigued by a paper that appeared the following year in Nature. Biologists at the University of California-Berkeley linked a gene called SATB1 to cancer. They found that it becomes unusually active in cancer cells and that switching it on in ordinary cells made them cancerous. The flipside proved true, too: Shutting down SATB1 in cancer cells returned them to normal. The results raised the exciting possibility that SATB1 could open up a cure for cancer. So Iorns decided to build on the research.

There was just one problem. As her first step, Iorns tried to replicate the original study. She couldn’t. Boosting SATB1 didn’t make cells cancerous, and shutting it down didn’t make the cancer cells normal again.

For some years now, scientists have gotten increasingly worried about replication failures. In one recent example, NASA made a headline-grabbing announcement in 2010 that scientists had found bacteria that could live on arsenic—a finding that would require biology textbooks to be rewritten. At the time, many experts condemned the paper as a poor piece of science that shouldn’t have been published. This July, two teams of scientists reported that they couldn’t replicate the results. Read more »

Categories: philosophy of science, Philosophy of Statistics | 12 Comments

Clark Glymour: The Theory of Search Is the Economics of Discovery (part 1)

The Theory of Search Is the Economics of Discovery:
Some Thoughts Prompted by Sir David Hendry’s Essay  *
in Rationality, Markets and Morals (RMM) Special Topic:
Statistical Science and Philosophy of Science

Part 1 (of 2)

Professor Clark Glymour

Alumni University Professor
Department of Philosophy[i]
Carnegie Mellon University

Professor Hendry* endorses a distinction between the “context of discovery” and the “context of evaluation” which he attributes to Herschel and to Popper and could as well have attributed also to Reichenbach and to most contemporary methodological commentators in the social sciences. The “context” distinction codes two theses.

1. “Discovery” is a mysterious psychological process of generating hypotheses; “evaluation” is about the less mysterious process of warranting them.

2. Of the three possible relations with data that could conceivably warrant a hypothesis—how it was generated, its explanatory connections with the data used to generate it, and its predictions—only the last counts.

Einstein maintained the first but not the second. Popper maintained the first but held that nothing warrants a hypothesis. Hendry seems to maintain neither: he has a method for discovery in econometrics, a search procedure briefly summarized in the second part of his essay, which is not evaluated by forecasts. Methods may be esoteric but they are not mysterious. And yet Hendry endorses the distinction. Let’s consider it.

As a general principle rather than a series of anecdotes, the distinction between discovery and justification or evaluation has never been clear, and what has been said in favor of its implied theses has not made much sense, ever. Let’s start with the father of one of Hendry’s endorsers, William Herschel. William Herschel discovered Uranus, or something. Actually, the discovery of the planet Uranus was a collective effort with what, subject to vicissitudes of error and individual opinion, was a rational search strategy. On March 13, 1781, in the course of a sky survey for double stars, Herschel reports in his journal the observation of a “nebulous star or perhaps a comet.” The object came to his notice because of how it appeared through the telescope, perhaps the appearance of a disc. Herschel changed the magnification of his telescope, and finding that the brightness of the object changed more than the brightness of fixed stars, concluded he had seen a comet or “nebulous star.” Observations that, on later nights, it had moved eliminated the “nebulous star” alternative and Herschel concluded that he had seen a comet. Why not a planet? Because lots of comets had been hitherto observed—Edmund Halley computed orbits for half a dozen including his eponymous comet—but never a planet. A comet was much the more likely on frequency grounds. Further, Herschel had made a large error in his estimate of the distance of the body based on parallax values using his micrometer. A planet could not be so close.


Categories: philosophy of science, Philosophy of Statistics, Statistics, U-Phil | 1 Comment

“Always the last place you look!”


This gets to a distinction I have tried to articulate, between explaining a known effect (like looking for a known object), and searching for an unknown effect (that may well not exist). In the latter, possible effects of “selection” or searching need to be taken account of. Of course, searching for the Higgs is akin to the latter, not the former, hence the joke in the recent New Yorker cartoon.

Categories: philosophy of science, Statistics | 20 Comments

Peter Grünwald: Follow-up on Cherkassky’s Comments

Peter Grünwald


A comment from Professor Peter Grünwald

Head, Information-theoretic Learning Group, Centrum voor Wiskunde en Informatica (CWI)
Part-time full professor  at Leiden University.

This is a follow-up on Vladimir Cherkassky’s comments on Deborah’s blog. First of all let me thank Vladimir for taking the time to clarify his position. Still, there’s one issue where we disagree and which, at the same time, I think, needs clarification, so I decided to write this follow-up.[related posts 1]

The issue is about how central VC (Vapnik-Chervonenkis)-theory is to inductive inference.

I agree with Vladimir that VC-theory is one of the most important achievements in the field ever, and indeed, that it fundamentally changed our way of thinking about learning from data. Yet I also think that there are many problems of inductive inference to which it has no direct bearing. Some of these are concerned with hypothesis testing, but even when one is concerned with prediction accuracy – which Vladimir considers the basic goal – there are situations where I do not see how it plays a direct role. One of these is sequential prediction with log-loss or its generalization, Cover’s loss. This loss function plays a fundamental role in (1) language modeling, (2) on-line data compression, (3a) gambling and (3b) sequential investment on the stock market (here we need Cover’s loss). [A superquick intro to log-loss, as well as some references, is given below under [A]; see also my talk at the Ockham workshop (slides 16-26 about weather forecasting!).]


Categories: philosophy of science, Statistics | 16 Comments

Is Particle Physics Bad Science?

I suppose[ed] this was somewhat of a joke from the ISBA, prompted by Dennis Lindley, but as I [now] accord the actual extent of jokiness to be only ~10%, I’m sharing it on the blog [i].  Lindley (according to O’Hagan) wonders why scientists require so high a level of statistical significance before claiming to have evidence of a Higgs boson.  It is asked: “Are the particle physics community completely wedded to frequentist analysis?  If so, has anyone tried to explain what bad science that is?”

Bad science?   I’d really like to understand what these representatives from the ISBA would recommend, if there is even a shred of seriousness here (or is Lindley just peeved that significance levels are getting so much press in connection with so important a discovery in particle physics?)

Well, read the letter and see what you think.

On Jul 10, 2012, at 9:46 PM, ISBA Webmaster wrote:

Dear Bayesians,

A question from Dennis Lindley prompts me to consult this list in search of answers.

We’ve heard a lot about the Higgs boson.  The news reports say that the LHC needed convincing evidence before they would announce that a particle had been found that looks like (in the sense of having some of the right characteristics of) the elusive Higgs boson.  Specifically, the news referred to a confidence interval with 5-sigma limits.

Now this appears to correspond to a frequentist significance test with an extreme significance level.  Five standard deviations, assuming normality, means a p-value of around 0.0000005.  A number of questions spring to mind.

1. Why such an extreme evidence requirement? We know from a Bayesian perspective that this only makes sense if (a) the existence of the Higgs boson (or some other particle sharing some of its properties) has extremely small prior probability and/or (b) the consequences of erroneously announcing its discovery are dire in the extreme. Neither seems to be the case, so why 5-sigma?

2.  Rather than ad hoc justification of a p-value, it is of course better to do a proper Bayesian analysis.  Are the particle physics community completely wedded to frequentist analysis?  If so, has anyone tried to explain what bad science that is? Read more »
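The letter's figure of "around 0.0000005" can be checked with a short calculation. The sketch below uses only the Python standard library; the function name is illustrative, and the calculation assumes an exactly Normal test statistic, as the letter does. It computes the tail area beyond five standard deviations, both one-sided and two-sided.

```python
import math

def upper_tail(z):
    """P(Z >= z) for a standard Normal Z, via the complementary error function."""
    return 0.5 * math.erfc(z / math.sqrt(2.0))

one_sided = upper_tail(5.0)    # area in the upper tail only
two_sided = 2.0 * one_sided    # area in both tails combined

print(f"one-sided: {one_sided:.3e}")
print(f"two-sided: {two_sided:.3e}")
```

The one-sided tail comes out near 2.9 × 10⁻⁷ and the two-sided value near 5.7 × 10⁻⁷, so the letter's number is closer to the two-sided calculation. The 5-sigma discovery standard in particle physics is usually quoted as the one-sided tail.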

Categories: philosophy of science, Statistics | 10 Comments

Vladimir Cherkassky Responds on Foundations of Simplicity

I thank Dr. Vladimir Cherkassky for taking up my general invitation to comment. I don’t have much to add to my original post[i], except to make two corrections at the end of this post.  I invite readers’ comments.

Vladimir Cherkassky

As I could not participate in the discussion session on Sunday, I would like to address several technical issues and points of disagreement that became evident during this workshop. All opinions are mine, and may not be representative of the “machine learning community.” Unfortunately, the machine learning community at large is not very much interested in the philosophical and methodological issues. This breeds a lot of fragmentation and confusion, as evidenced by the existence of several technical fields: machine learning, statistics, data mining, artificial neural networks, computational intelligence, etc.—all of which are mainly concerned with the same problem of estimating good predictive models from data.

Occam’s Razor (OR) is a general metaphor in the philosophy of science, and it has been discussed for ages. One of the main goals of this workshop was to understand the role of OR as a general inductive principle in the philosophy of science and, in particular, its importance in data-analytic knowledge discovery for statistics and machine learning.

Data-analytic modeling is concerned with estimating good predictive models from finite data samples. This is directly related to the philosophical problem of inductive inference. The problem of learning (generalization) from finite data was formally investigated in VC-theory about 40 years ago. This theory starts with a mathematical formulation of the problem of learning from finite samples, without making any assumptions about parametric distributions. This formalization is very general and relevant to many applications in machine learning, statistics, life sciences, etc. Further, this theory provides necessary and sufficient conditions for generalization. That is, a set of admissible models (hypotheses about the data) should be constrained, i.e., should have finite VC-dimension. Therefore, any inductive theory or algorithm designed to explain the data should satisfy VC-theoretical conditions. Read more »

Categories: philosophy of science, Statistics | 12 Comments

Comment on Falsification

The comment box was too small for my reply to Sober on falsification, so I will post it here:

I want to understand better Sober’s position on falsification. A pervasive idea to which many still subscribe, myself included, is that the heart of what makes inquiry scientific is the critical attitude: that if a claim or hypothesis or model fails to stand up to critical scrutiny it is rejected as false, and not propped up with various “face-saving” devices. Now Sober writes: “I agree that we can get rid of models that deductively entail (perhaps with the help of auxiliary assumptions) observational outcomes that do not happen. But as soon as the relation is nondeductive, is there ‘falsification’?”

My answer is yes, else we could scarcely retain the critical attitude for any but the most trivial scientific claims. While at one time philosophers imagined that “observational reports” were given, and could therefore form the basis for a deductive falsification of scientific claims, certainly since Popper, Kuhn and the rest of the post-positivists, we recognize that observations are error prone, as are appeals to auxiliary hypotheses. Here is Popper: Read more »

Categories: philosophy of science, Statistics | 11 Comments

Elliott Sober Responds on Foundations of Simplicity

Here are a few comments on your recent blog about my ideas on parsimony.  Thanks for inviting me to contribute!

You write that in model selection, “‘parsimony fights likelihood,’ while, in adequate evolutionary theory, the two are thought to go hand in hand.” The second part of this statement isn’t correct. There are sufficient conditions (i.e., models of the evolutionary process) that entail that parsimony and maximum likelihood are ordinally equivalent, but there are cases in which they are not. Biologists often have data sets in which maximum parsimony and maximum likelihood disagree about which phylogenetic tree is best.

You also write that “error statisticians view hypothesis testing as between exhaustive hypotheses H and not-H (usually within a model).”  I think that the criticism of Bayesianism that focuses on the problem of assessing the likelihoods of “catch-all hypotheses” applies to this description of your error statistical philosophy.  The General Theory of Relativity, for example, may tell us how probable a set of observations is, but its negation does not.  I note that you have “usually within a model” in parentheses.  In many such cases, two alternatives within a model will not be exhaustive even within the confines of a model and of course they won’t be exhaustive if we consider a wider domain.

Read more »

Categories: philosophy of science, Statistics | 13 Comments
