# Posts Tagged With: Frequentist inference

## Guest Blog: ARIS SPANOS: The Enduring Legacy of R. A. Fisher

By Aris Spanos

One of R. A. Fisher’s (17 February 1890 – 29 July 1962) most remarkable, but least recognized, achievements was to initiate the recasting of statistical induction. Fisher (1922) pioneered modern frequentist statistics as a model-based approach to statistical induction, anchored on the notion of a statistical model, formalized by:

Mθ(x) = {f(x; θ); θ ∈ Θ}, x ∈ ℝⁿ, Θ ⊂ ℝᵐ, m < n;  (1)

where the distribution of the sample f(x; θ) ‘encapsulates’ the probabilistic information in the statistical model.
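As a concrete instance of (1), not part of Spanos’s text: the simple Normal model takes f(x; θ) to be the product of N(μ, σ²) densities over an IID sample, with θ = (μ, σ) ranging over Θ = ℝ × (0, ∞). A minimal Python sketch (function names are mine, purely illustrative):

```python
import math

def normal_density(x, mu, sigma):
    """Density of a single N(mu, sigma^2) observation."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2 * math.pi))

def joint_density(sample, mu, sigma):
    """f(x; theta) for an IID Normal sample: the product of the marginal
    densities, i.e. the 'distribution of the sample' in (1)."""
    prod = 1.0
    for xi in sample:
        prod *= normal_density(xi, mu, sigma)
    return prod

# As theta = (mu, sigma) ranges over Theta = R x (0, inf),
# joint_density traces out the family M_theta(x).
sample = [0.3, -1.2, 0.8, 0.1]
print(joint_density(sample, mu=0.0, sigma=1.0))
```

Each choice of θ picks out one candidate distribution; the model Mθ(x) is the whole family.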

Before Fisher, the notion of a statistical model was vague and often implicit, and its role was primarily confined to describing the distributional features of the data in hand using the histogram and the first few sample moments, implicitly imposing random (IID) samples. The problem was that statisticians at the time would use descriptive summaries of the data to claim generality beyond the data in hand x0 := (x1, x2, …, xn). As late as the 1920s, the problem of statistical induction was understood by Karl Pearson in terms of invoking (i) the ‘stability’ of empirical results for subsequent samples and (ii) a prior distribution for θ.

Fisher was able to recast statistical inference by turning Karl Pearson’s approach, proceeding from data x0 in search of a frequency curve f(x; ϑ) to describe its histogram, on its head. He proposed to begin with a prespecified Mθ(x) (a ‘hypothetical infinite population’) and view x0 as a ‘typical’ realization thereof; see Spanos (1999).

Categories: Fisher, Spanos, Statistics

## Aris Spanos: The Enduring Legacy of R. A. Fisher

More Fisher insights from A. Spanos, this from 2 years ago:


In my mind, Fisher’s most enduring contribution is his devising a general way to ‘operationalize’ errors by embedding the material experiment into Mθ(x), and taming errors via probabilification, i.e., defining frequentist error probabilities in the context of a statistical model. These error probabilities are (a) deductively derived from the statistical model, and (b) provide a measure of the ‘effectiveness’ of the inference procedure: how often a certain method will give rise to correct inferences concerning the underlying ‘true’ Data Generating Mechanism (DGM). This cast aside the need for a prior. Both of these key elements, the statistical model and the error probabilities, have been refined and extended by Mayo’s error statistical approach (EGEK 1996). Learning from data is achieved when an inference is reached by an inductive procedure which, with high probability, will yield true conclusions from valid inductive premises (a statistical model); see Mayo and Spanos (2011).
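Point (a), that error probabilities are deduced from the model itself, can be illustrated outside Spanos’s text with a small simulation: under a prespecified Normal model, the one-sided z test rejecting when z > 1.645 has a type I error probability of .05, a property of the procedure, not of any particular data set. A hedged Python sketch (names and parameter values are mine):

```python
import random

def one_sided_z_test(sample, mu0, sigma):
    """Reject H0: mu = mu0 against mu > mu0 when the z statistic
    exceeds the 5% critical value 1.645."""
    n = len(sample)
    xbar = sum(sample) / n
    z = (xbar - mu0) / (sigma / n ** 0.5)
    return z > 1.645

def type_I_error_rate(mu0=0.0, sigma=1.0, n=25, reps=20000, seed=1):
    """Estimate by simulation, under the model with mu = mu0, how often
    the procedure erroneously rejects: the error probability attaches
    to the method, derived from M_theta(x) alone."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(reps):
        sample = [rng.gauss(mu0, sigma) for _ in range(n)]
        if one_sided_z_test(sample, mu0, sigma):
            rejections += 1
    return rejections / reps

print(type_I_error_rate())  # close to the nominal 0.05
```

No prior enters anywhere: the 5% figure follows deductively (here, approximated by simulation) from the statistical model and the test rule.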

Categories: Fisher, phil/history of stat, Statistics

## Comedy hour at the Bayesian (epistemology) retreat: highly probable vs highly probed (vs B-boosts)

Since we’ll be discussing Bayesian confirmation measures in next week’s seminar (the relevant blogpost is here), let’s listen in to one of the comedy hours at the Bayesian retreat, as reblogged from May 5, 2012.

Did you hear the one about the frequentist error statistical tester who inferred a hypothesis H passed a stringent test (with data x)?

The problem was, the epistemic probability in H was so low that H couldn’t be believed!  Instead we believe its denial H’!  So, she will infer hypotheses that are simply unbelievable!

So it appears the error statistical testing account fails to serve as an account of knowledge or evidence (i.e., an epistemic account). However severely I might wish to say that a hypothesis H has passed a test, this Bayesian critic assigns a sufficiently low prior probability to H so as to yield a low posterior probability in H[i]. But this is no argument for why the conflict counts in favor of, rather than against, their particular Bayesian computation as an appropriate assessment of the warrant to be accorded to hypothesis H.

To begin with, in order to use techniques for assigning frequentist probabilities to events, their examples invariably involve “hypotheses” that consist of asserting that a sample possesses a characteristic, such as “having a disease” or “being college ready” or, for that matter, “being true.” This would not necessarily be problematic were it not for the fact that their criticism requires shifting the probability to the particular sample selected, for example, that a student, Isaac, is college-ready, or that this null hypothesis (selected from a pool of nulls) is true. This was, recall, the fallacious probability assignment that we saw in Berger’s attempt, later (perhaps) disavowed. Also, there are just two outcomes, say s and ~s, and no degrees of discrepancy from H.
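The critic’s arithmetic is easy to reproduce. In this illustrative sketch (the numbers are hypothetical, not from the post), data twenty times more probable under H than under its denial still leave H with a low posterior, once a sufficiently low prior is assigned:

```python
def posterior(prior_H, p_x_given_H, p_x_given_notH):
    """Bayes' theorem for the two-outcome setup: P(H | x)."""
    num = prior_H * p_x_given_H
    den = num + (1 - prior_H) * p_x_given_notH
    return num / den

# Hypothetical numbers: likelihood ratio of 20 in favor of H,
# yet a prior of 0.001 keeps the posterior below 0.02.
post = posterior(prior_H=0.001, p_x_given_H=0.8, p_x_given_notH=0.04)
print(round(post, 4))
```

The computation is not in dispute; what is in dispute is whether it, rather than the test’s severity, measures the warrant for H.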

## Comedy hour at the Bayesian (epistemology) retreat: highly probable vs highly probed (vs what?)

Our favorite high school student, Isaac, gets a better shot at showing his college readiness using one of the comparative measures of support or confirmation discussed last week. Their assessment thus seems more in sync with the severe tester, but they are not purporting that z is evidence for inferring (or even believing) an H to which z affords a high B-boost*. Their measures identify a third category that reflects the degree to which H would predict z (where the comparison might be predicting without z, or under ~H or the like).  At least if we give it an empirical, rather than a purely logical, reading. Since it’s Saturday night let’s listen in to one of the comedy hours at the Bayesian retreat as reblogged from May 5, 2012.


## Comedy Hour at the Bayesian (Epistemology) Retreat: Highly Probable vs Highly Probed

Bayesian philosophers (among others) have analogous versions of the criticism in my April 28 blogpost: error probabilities (associated with inferences to hypotheses) may conflict with chosen posterior probabilities in hypotheses. Since it’s Saturday night let’s listen in to one of the comedy hours at the Bayesian retreat (note the sedate philosopher’s comedy club backdrop):


## Matching Numbers Across Philosophies

The search for agreement on numbers across different statistical philosophies is an understandable pastime in the foundations of statistics. Perhaps identifying matching or unified numbers, apart from what they might mean, would offer a glimpse of shared underlying goals? Jim Berger (2003) assures us there is no sacrilege in agreeing on methodology without philosophy, claiming that “while the debate over interpretation can be strident, statistical practice is little affected as long as the reported numbers are the same” (p. 1).

Neyman and Pearson (or perhaps it was mostly Neyman) set out to determine when tests of statistical hypotheses may be considered “independent of probabilities a priori” (p. 201). In such cases, frequentist and Bayesian may agree on a critical or rejection region.

The agreement between “default” Bayesians and frequentists in the case of one-sided Normal (IID) testing with known σ is very familiar. As noted in Ghosh, Delampady, and Samanta (2006, p. 35), if we wish to reject a null value when “the posterior odds against it are 19:1 or more, i.e., if posterior probability of H0 is < .05,” then the rejection region matches that of the corresponding .05-level test of H0. By contrast, they go on to note the also familiar fact that frequentist and Bayesian would disagree if one were instead testing the two-sided H0: μ = μ0 vs. H1: μ ≠ μ0 with known σ. In fact, the same outcome that would be regarded as evidence against the null in the one-sided test (for the default Bayesian and frequentist) can result in statistically significant results being construed as no evidence against the null (for the Bayesian) or even as evidence for it, due to a spiked prior.[i]
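The agreement and disagreement Ghosh, Delampady, and Samanta describe can be sketched numerically. Assuming the standard conjugate setup (a flat default prior for the one-sided case; a 0.5 point mass on H0 with a N(μ0, σ²) prior on μ under H1 for the two-sided case), this Python sketch shows the one-sided posterior matching the p-value, while a “just significant” two-sided outcome leaves H0 with high posterior probability:

```python
import math

def phi(z):
    """Standard Normal CDF."""
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))

def one_sided(z):
    """One-sided H0: mu <= mu0. With a flat (improper) default prior,
    the posterior probability of H0 equals the p-value, so rejecting
    when the posterior is < .05 matches the .05-level test."""
    p_value = 1 - phi(z)
    posterior_H0 = 1 - phi(z)  # same quantity under the flat prior
    return p_value, posterior_H0

def two_sided_spike(z, n, spike=0.5):
    """Two-sided H0: mu = mu0 with a point-mass ('spiked') prior of
    `spike` on H0 and, under H1, a N(mu0, sigma^2) prior on mu.
    Returns posterior P(H0 | z) via the conjugate Bayes factor."""
    bf10 = math.exp(z * z * n / (2 * (n + 1))) / math.sqrt(n + 1)
    return 1 / (1 + (1 - spike) / spike * bf10)

z = 1.96  # "just significant" at the two-sided .05 level
p_two_sided = 2 * (1 - phi(z))
print(round(p_two_sided, 3))            # about 0.05
print(round(two_sided_spike(z, n=100), 2))  # posterior P(H0) well above 0.5
```

With n = 100 the two-sided posterior on H0 is roughly 0.6: the very outcome that is statistically significant at .05 is construed by this Bayesian analysis as favoring the null, which is the conflict noted above.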

## Guest Blogger: ARIS SPANOS: The Enduring Legacy of R. A. Fisher


Categories: Statistics