# Posts Tagged With: Models/Modelling

## Guest Blog: ARIS SPANOS: The Enduring Legacy of R. A. Fisher

By Aris Spanos

One of the most remarkable, but least recognized, achievements of R. A. Fisher (17 February 1890 – 29 July 1962) was to initiate the recasting of statistical induction. Fisher (1922) pioneered modern frequentist statistics as a model-based approach to statistical induction anchored on the notion of a statistical model, formalized by:

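$$\mathcal{M}_{\theta}(\mathbf{x}) = \{ f(\mathbf{x};\theta),\ \theta \in \Theta \},\quad \mathbf{x} \in \mathbb{R}^{n},\quad \Theta \subset \mathbb{R}^{m},\quad m < n, \tag{1}$$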

where the distribution of the sample $f(\mathbf{x};\theta)$ ‘encapsulates’ the probabilistic information in the statistical model.

Before Fisher, the notion of a statistical model was vague and often implicit, and its role was primarily confined to describing the distributional features of the data in hand using the histogram and the first few sample moments, implicitly imposing random (IID) samples. The problem was that statisticians at the time would use descriptive summaries of the data to claim generality beyond the data in hand $\mathbf{x}_0 := (x_1, x_2, \ldots, x_n)$. As late as the 1920s, the problem of statistical induction was understood by Karl Pearson in terms of invoking (i) the ‘stability’ of empirical results for subsequent samples and (ii) a prior distribution for $\theta$.

Fisher was able to recast statistical inference by turning Karl Pearson’s approach, which proceeded from data $\mathbf{x}_0$ in search of a frequency curve $f(x;\vartheta)$ to describe its histogram, on its head. He proposed to begin with a prespecified $\mathcal{M}_{\theta}(\mathbf{x})$ (a ‘hypothetical infinite population’), and view $\mathbf{x}_0$ as a ‘typical’ realization thereof; see Spanos (1999).

Categories: Fisher, Spanos, Statistics

## Aris Spanos: The Enduring Legacy of R. A. Fisher

More Fisher insights from A. Spanos, this from 2 years ago:

One of the most remarkable, but least recognized, achievements of R. A. Fisher (17 February 1890 – 29 July 1962) was to initiate the recasting of statistical induction. Fisher (1922) pioneered modern frequentist statistics as a model-based approach to statistical induction anchored on the notion of a statistical model, formalized by:

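$$\mathcal{M}_{\theta}(\mathbf{x}) = \{ f(\mathbf{x};\theta),\ \theta \in \Theta \},\quad \mathbf{x} \in \mathbb{R}^{n},\quad \Theta \subset \mathbb{R}^{m},\quad m < n, \tag{1}$$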

where the distribution of the sample $f(\mathbf{x};\theta)$ ‘encapsulates’ the probabilistic information in the statistical model.

Before Fisher, the notion of a statistical model was vague and often implicit, and its role was primarily confined to describing the distributional features of the data in hand using the histogram and the first few sample moments, implicitly imposing random (IID) samples. The problem was that statisticians at the time would use descriptive summaries of the data to claim generality beyond the data in hand $\mathbf{x}_0 := (x_1, x_2, \ldots, x_n)$. As late as the 1920s, the problem of statistical induction was understood by Karl Pearson in terms of invoking (i) the ‘stability’ of empirical results for subsequent samples and (ii) a prior distribution for $\theta$.

Fisher was able to recast statistical inference by turning Karl Pearson’s approach, which proceeded from data $\mathbf{x}_0$ in search of a frequency curve $f(x;\vartheta)$ to describe its histogram, on its head. He proposed to begin with a prespecified $\mathcal{M}_{\theta}(\mathbf{x})$ (a ‘hypothetical infinite population’), and view $\mathbf{x}_0$ as a ‘typical’ realization thereof; see Spanos (1999).

In my mind, Fisher’s most enduring contribution is his devising a general way to ‘operationalize’ errors by embedding the material experiment into $\mathcal{M}_{\theta}(\mathbf{x})$, and taming errors via probabilification, i.e. defining frequentist error probabilities in the context of a statistical model. These error probabilities are (a) deductively derived from the statistical model, and (b) provide a measure of the ‘effectiveness’ of the inference procedure: how often a certain method will give rise to correct inferences concerning the underlying ‘true’ Data Generating Mechanism (DGM). This cast aside the need for a prior. Both of these key elements, the statistical model and the error probabilities, have been refined and extended by Mayo’s error statistical approach (EGEK 1996). Learning from data is achieved when an inference is reached by an inductive procedure which, with high probability, will yield true conclusions from valid inductive premises (a statistical model); Mayo and Spanos (2011).
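To make the idea of model-based error probabilities concrete, here is a minimal simulation sketch. It is not taken from Spanos's text: the simple Normal model with known variance, the two-sided z-test, and all names and parameter values (`rejection_rate`, `mu_true`, `n`, `sigma`, `alpha`, `reps`) are assumptions chosen for illustration. It estimates how often the test rejects the hypothesis that the mean is zero, first when that hypothesis is true (the type I error probability) and then when the true mean is 0.5 (the power).

```python
import math
import random
from statistics import NormalDist, fmean


def rejection_rate(mu_true, n=20, sigma=1.0, alpha=0.05, reps=20000, seed=0):
    """Relative frequency with which a two-sided z-test of H0: mu = 0 rejects.

    Assumed statistical model (illustrative only): X_1, ..., X_n are IID
    Normal(mu, sigma^2) with sigma known.
    """
    rng = random.Random(seed)
    # Critical value derived deductively from the assumed Normal model.
    crit = NormalDist().inv_cdf(1 - alpha / 2)
    rejections = 0
    for _ in range(reps):
        # One hypothetical realization of the sample from the model.
        sample = [rng.gauss(mu_true, sigma) for _ in range(n)]
        z = math.sqrt(n) * fmean(sample) / sigma
        if abs(z) > crit:
            rejections += 1
    return rejections / reps


# Error probability when H0 is true: should be close to the nominal alpha.
type_one = rejection_rate(mu_true=0.0)
# Probability of rejecting when the true mean is 0.5 (the power at 0.5).
power = rejection_rate(mu_true=0.5)
print(f"simulated type I error: {type_one:.3f}")
print(f"simulated power at mu = 0.5: {power:.3f}")
```

The simulated frequencies only approximate the error probabilities that follow analytically from the assumed model, and they are informative about the procedure only to the extent that the model's assumptions (here Normality, independence, and identical distribution) are valid for the data.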

Categories: Fisher, phil/history of stat, Statistics

## Guest Blogger. ARIS SPANOS: The Enduring Legacy of R. A. Fisher

By Aris Spanos

One of the most remarkable, but least recognized, achievements of R. A. Fisher (17 February 1890 – 29 July 1962) was to initiate the recasting of statistical induction. Fisher (1922) pioneered modern frequentist statistics as a model-based approach to statistical induction anchored on the notion of a statistical model, formalized by:

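$$\mathcal{M}_{\theta}(\mathbf{x}) = \{ f(\mathbf{x};\theta),\ \theta \in \Theta \},\quad \mathbf{x} \in \mathbb{R}^{n},\quad \Theta \subset \mathbb{R}^{m},\quad m < n, \tag{1}$$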

where the distribution of the sample $f(\mathbf{x};\theta)$ ‘encapsulates’ the probabilistic information in the statistical model.

Before Fisher, the notion of a statistical model was vague and often implicit, and its role was primarily confined to describing the distributional features of the data in hand using the histogram and the first few sample moments, implicitly imposing random (IID) samples. The problem was that statisticians at the time would use descriptive summaries of the data to claim generality beyond the data in hand $\mathbf{x}_0 := (x_1, x_2, \ldots, x_n)$. As late as the 1920s, the problem of statistical induction was understood by Karl Pearson in terms of invoking (i) the ‘stability’ of empirical results for subsequent samples and (ii) a prior distribution for $\theta$.

Categories: Statistics

## If you try sometime, you find you get what you need!

 picking up the pieces
Categories: Statistics

## RMM-4: Special Volume on Stat Scie Meets Phil Sci

The article “Foundational Issues in Statistical Modeling: Statistical Model Specification and Validation*” by Aris Spanos has now been published in our special volume of the online journal Rationality, Markets, and Morals (Special Topic: “Statistical Science and Philosophy of Science: Where Do/Should They Meet?”).

Abstract:
Statistical model specification and validation raise crucial foundational problems whose pertinent resolution holds the key to learning from data by securing the reliability of frequentist inference. The paper questions the judiciousness of several current practices, including the theory-driven approach, and the Akaike-type model selection procedures, arguing that they often lead to unreliable inferences. This is primarily due to the fact that goodness-of-fit/prediction measures and other substantive and pragmatic criteria are of questionable value when the estimated model is statistically misspecified. Foisting one’s favorite model on the data often yields estimated models which are both statistically and substantively misspecified, but one has no way to delineate between the two sources of error and apportion blame. The paper argues that the error statistical approach can address this Duhemian ambiguity by distinguishing between statistical and substantive premises and viewing empirical modeling in a piecemeal way with a view to delineating the various issues more effectively. It is also argued that Hendry’s general-to-specific procedure does a much better job in model selection than the theory-driven and the Akaike-type procedures, primarily because of its error statistical underpinnings.

Categories: Philosophy of Statistics, Statistics

## SF conferences & E. Lehmann

I’m jumping off the Island for a bit. Destination: San Francisco, for a conference on “The Experimental Side of Modeling” (http://www.isabellepeschard.org/). Kuru makes a walk-on appearance in my presentation, “How Experiment Gets a Life of its Own”. It does not directly discuss statistics, but I will post my slides.

The last time I was in SF was in 2003 with my econometrician colleague, Aris Spanos. We were on our way to Santa Barbara to engage in an unusual powwow on statistical foundations at NCEAS*, and stopped off in SF to meet with Erich Lehmann and his wife, Julie Shaffer. We discussed, among other things, this zany idea of mine to put together a session for the Second Lehmann conference in 2004 that would focus on philosophical foundations of statistics. (Our session turned out to include David Freedman and D.R. Cox).