PhilStat: So you’re looking for a Ph.D. dissertation topic?

Maybe you’ve already heard Hal Varian, Google’s chief economist: “The next sexy job in the next ten years will be statisticians.” Even Larry Wasserman declares that “statistics is sexy.” In that case, philosophy of statistics must be doubly so!

Thus one wonders at the decline of late in the lively and long-standing exchange between philosophers of science and statisticians. If you are a graduate student wondering how you might make your mark in a philosophy of science area, philosophy of statistical science, fairly brimming over with rich and open philosophical problems, may be the thing for you!* Surprising, pressing, intriguing, and novel philosophical twists on both traditional and cutting-edge controversies are going begging for analysis—they not only bear on many areas of popular philosophy but also may offer you ways of getting out in front of them.

I came across a sporadically updated blog by Pitt graduate student Gregory Gandenberger a while back (unlike his new, frequently updated one) where he was wrestling with a topic for his master’s thesis and, some years later, wrangling over dissertation topics in philosophy of statistics. After I started this blog, I looked for it again, and I have now invited him to post on the topic of his choice, as he did here; I invite other graduate students through the U-Phil call.

New Role of Philosophy in Statistics?

Philosophy of statistical science deals not only with the philosophical foundations of statistics but also with questions about the nature of, and justification for, inductive-statistical learning more generally. So it is ironic that just as philosophy of science strives to immerse itself in, and be relevant to, scientific practice, statistical science and philosophy of science, so ahead of their time in combining the work of philosophers and practicing scientists, should see such dialogues become rather rare.  (See special topic here.)

At the same time, philosophical ideas are increasingly finding their way into discussions of statistical method, whether or not they are labeled as such. (Please search this blog for examples.) I have actually heard statisticians ask, at times: where are the philosophers?  The increased use of nonsubjective Bayesianism in general, and attempts at frequentist-Bayesian “reconciliations” in particular, have at least implicitly put foundational issues back on the map, even if philosophers have not always noticed. If nonsubjective (or default, or reference, or simply “conventional”) Bayesian methods with good frequentist properties are available, some ask, how are we to interpret posterior probabilities?
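
As a toy illustration of the frequentist-matching phenomenon behind that question (my own sketch with made-up numbers, not an example from the post): for a normal mean with known σ under a flat “noninformative” prior, the central 95% posterior credible interval coincides exactly with the 95% confidence interval, so its long-run coverage is the frequentist 95%, whatever one takes the posterior probability itself to mean.

```python
import math
import random

# Hypothetical setup: n observations from N(mu_true, sigma^2), sigma known.
random.seed(7)
sigma, n, mu_true = 1.0, 25, 3.0
z975 = 1.959963984540054  # 97.5th percentile of the standard normal

trials, hits = 4000, 0
for _ in range(trials):
    xbar = sum(random.gauss(mu_true, sigma) for _ in range(n)) / n
    # Under a flat prior the posterior is N(xbar, sigma^2/n), so the central
    # 95% credible interval is xbar +/- z975*sigma/sqrt(n) -- identical to
    # the 95% confidence interval.
    half = z975 * sigma / math.sqrt(n)
    if xbar - half <= mu_true <= xbar + half:
        hits += 1

coverage = hits / trials  # long-run frequency with which the interval covers mu_true
```

Running this gives coverage close to 0.95: the credible interval behaves exactly like a confidence procedure, which is precisely why some ask what the posterior probability adds interpretively.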

Interestingly, some contemporary statisticians are engaging with philosophy for pragmatic reasons: philosophical preconceptions create roadblocks to activities they value. For example, although some Bayesians would like to check their prior probabilities as part of a test of statistical models, the supposition that prior-probability distributions are modeling beliefs might seem to prohibit such checks. Others feel constricted by the supposition that once the data are in hand, the Bayesian paradigm disallows consideration of outcomes other than the one observed (the essence of the “strong” likelihood principle [SLP]). We had earlier discussed the SLP on this blog, and are now returning to it . . .

In many ways, current foundational issues seem to hit Bayesians harder than frequentists (sampling theorists, or users of error probabilities for inference). If so, it is undoubtedly because Bayesians had long been thought to enjoy sound axiomatic foundations growing precisely out of probability theory itself (along with, perhaps, decision theory), while sampling theorists had always been tarred with the criticism of offering a hodgepodge of methods with a variety of frequentist criteria. Bayesian methods offered themselves as a respite from such chaotic foundations. However, nonsubjective or conventional Bayesian methods, arguably the most popular form in current practice, permit violations of fundamental principles long held as integral to Bayesian foundations, notably the strong likelihood principle (SLP). The nonsubjective standpoint is regarded as seriously at odds with what subjective (or personalistic) Bayesians consider the “Bayesian standpoint.” Showing that a widely accepted argument for the likelihood principle (by Allan Birnbaum) is flawed is, I think, important for both Bayesians and frequentist sampling theorists.

This connects to one of the most dramatic shifts in current discussions of statistical foundations. As I state in the description of this blog, because it was long assumed that only subjective Bayesianism had a shot at genuine philosophical foundations, subjective Bayesians generally approached the newer nonsubjective, conventional Bayesians as needing to be brought around to their underlying subjective foundations. That position, although not entirely absent, is now much less prevalent. Bayesians themselves have come to question whether the widespread use of methods under the Bayesian umbrella, however useful, itself supports philosophical Bayesianism as its foundation! Once traditional Bayesian coherence goes by the board, other shibboleths may well follow.

Notice that it is statisticians, not philosophers, who are leading the movement away from subjective foundations (which isn’t to say that practitioners are overtly interested in philosophical foundations in general). This blog seeks, among other things, to open up some new avenues for philosophical participation. For practitioners, it suffices that their methods are useful and widely applied; we philosophers, by contrast, need to make underlying defenses explicit.

Then, of course, there is the long-lived movement in psychology and other social sciences to “reform” (or even ban) commonly used statistical significance tests (e.g., in favor of confidence intervals). Again, search this blog for examples.

Error-statistical Philosophy

Contrasting statistical philosophies reflect different answers to the basic questions:

  • What is the nature and role of probability in statistical inference?
  • What is required to warrant statistical inferences?

The general statistical philosophy out of which my own account grows may be called the error-statistical account. Probabilities arise in inference in order to assess and control the error probabilities of methods. (The connection to the term “sampling theory” is that error probabilities are based on the sampling distribution of observable statistics.) Why should long-run error rates matter in evaluating a particular inference?  I argue that in cases of scientific (or Birnbaum’s “informative”) inference, error probabilities allow us to determine how well or severely tested claims are, by the given test and data. The reasoning is based on the central premise: if data are in accordance with a hypothesis H, but the method would very probably have issued so good a fit even if H were false, then the data provide poor evidence for H.

A data-hypothesis accordance only counts as evidence for H if the test had a reasonable probability of resulting in a discordant result if H is false, or specifiably flawed. This single principle, which can be stated in all kinds of ways, and is open to numerous extensions and uses, is essentially the touchstone of the error-statistical philosophy. Together with the basic aim that we desire to find things out, and that we not be obstructed in this goal, the full statistical philosophy grows. This is obviously in the spirit of Karl Popper’s philosophy of corroboration, which he never succeeded in implementing. Using core ideas from frequentist error statistics, we can get much further.
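
A minimal numerical sketch of that touchstone (my own hypothetical numbers, using a one-sided test of a normal mean with known σ): the severity with which the claim μ > 0 passes, given observed mean x̄, is the probability that a worse fit (a smaller sample mean) would have occurred were μ = 0.

```python
import math

def std_normal_cdf(z):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def severity_mu_gt_0(xbar, n, sigma=1.0):
    """Severity for the claim mu > 0 given observed mean xbar:
    P(a worse fit, i.e. a smaller sample mean, would occur; mu = 0)."""
    return std_normal_cdf(xbar * math.sqrt(n) / sigma)

# The same small observed effect, at two sample sizes:
weak = severity_mu_gt_0(xbar=0.1, n=10)    # ~0.62: so good a "fit" is quite
                                           # probable even if mu = 0 -- poor evidence
strong = severity_mu_gt_0(xbar=0.1, n=400) # ~0.98: a worse fit would very
                                           # probably have occurred if mu = 0
```

The same small observed effect is poorly probative of μ > 0 at n = 10 but severely tested at n = 400; in this simple case the severity is just one minus the one-sided p-value.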

Of course these are just a few aspects of an approach to a very large, interdisciplinary field. Search this blog for more…

The roles of probability and statistics in scientific inquiry and learning are being debated on many sides. I really don’t know which side Greg will develop or provoke, but it is sure to be of interest, and perhaps will galvanize the graduate contingent toward PhilStat (the next sexy area). Share your comments.

*Ideally coupled with some statistics itself.

Categories: Error Statistics, philosophy of science, Philosophy of Statistics

3 thoughts on “PhilStat: So you’re looking for a Ph.D. dissertation topic?”

  1. Nicole Jinn

    Yes, I fully acknowledge and *agree* that there is *so much work to be done* in the area of philosophy of statistical science, which is one of the reasons *why I chose that area* for my master’s thesis. However, I am only one person and *more* researchers are needed in philosophy of statistical science if we truly want to influence the (proper) way researchers use statistical methods, and to generate more productive scrutiny of Bayesian and frequentist methods.

    • Nicole: Yes, and I expect that with your background in statistics, and increasingly philosophy, that you will be able to make important contributions! Keep in there!

  2. Christian Hennig

    As a somewhat idiosyncratic 😉 statistician interested in philosophy, I think that a key issue that I see as connected to Mayo’s overall “programme” (and that is based on a frequentist interpretation of probability) is model assumptions, and more precisely the effect checking model assumptions has on inference computed on the same data, and how parametric inference taking this into account compares to robust and nonparametric inference. Certainly there is something to be done from the statistical/mathematical side, but I think that it is also a philosophical issue of much interest.

    It is true that Mayo, Spanos and D. Hendry among others have already thought about this, and some results exist, mostly in the form of adjustments of parametric inference or invariance arguments in specific cases. But much remains to be done. On this blog I perceive a tendency toward standard parametric inference after ruling out competing models by misspecification testing. One can find a few words (mainly from Spanos, I think) justifying this vs. using methodology based on broader classes of models (either nonparametric or “neighbourhoods” of parametric models in Huber’s and Hampel’s robustness tradition), but I think that there is much more to be said and investigated in this direction.

    Let me combine this with another nudge for philosophers interested in statistics to have a look at
    Davies, P. L. (1995). Data features. Statistica Neerlandica, 49, 185–245.
    Davies, P. L. (2008). Approximating data (with discussion). Journal of the Korean Statistical Society, 37, 191–240.
    Hampel, F. R., Ronchetti, E. M., Rousseeuw, P. J., and Stahel, W. A. (1986). Robust Statistics: The Approach Based on Influence Functions. Wiley, New York.
    Some great statisticians are currently working on “Valid Post-Selection Inference”; look up Andreas Buja’s webpage for interesting material.
