Posts Tagged With: statistical model

Guest Blog: ARIS SPANOS: The Enduring Legacy of R. A. Fisher

Posted on February 19, 2017 by Mayo

By Aris Spanos

One of R. A. Fisher’s (17 February 1890 – 29 July 1962) most remarkable, but least recognized, achievements was to initiate the recasting of statistical induction. Fisher (1922) pioneered modern frequentist statistics as a model-based approach to statistical induction anchored on the notion of a statistical model, formalized by:

M_θ(x)={f(x;θ); θ∈Θ}; x∈Rⁿ;Θ⊂R^m; m < n; (1)

where the distribution of the sample f(x;θ) ‘encapsulates’ the probabilistic information in the statistical model.
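To make (1) concrete, here is a minimal sketch of one familiar instance, the simple Normal model, in which the sample is IID and θ = (μ, σ²), so m = 2 < n. The choice of the Normal family, the parameter values, and the names `log_density`, `theta_true`, and `x0` are illustrative assumptions, not part of the original post.

```python
import math
import random


def log_density(x, mu, sigma2):
    """Log of f(x; theta) for an IID Normal sample, with theta = (mu, sigma2)."""
    n = len(x)
    sum_sq = sum((xk - mu) ** 2 for xk in x)
    return -0.5 * n * math.log(2 * math.pi * sigma2) - sum_sq / (2 * sigma2)


# Hypothetical 'true' parameter values, chosen only for illustration.
theta_true = (1.0, 4.0)

# View the data x0 as one 'typical' realization of the sample under the model.
rng = random.Random(0)
x0 = [rng.gauss(theta_true[0], math.sqrt(theta_true[1])) for _ in range(50)]

# Maximum likelihood estimates of (mu, sigma2) under this model.
mu_hat = sum(x0) / len(x0)
sigma2_hat = sum((xk - mu_hat) ** 2 for xk in x0) / len(x0)

print("log f(x0; theta_true):", log_density(x0, *theta_true))
print("log f(x0; theta_hat): ", log_density(x0, mu_hat, sigma2_hat))
```

Every θ in Θ picks out one member of the family; the data enter only through f(x₀;θ), which is the sense in which the distribution of the sample carries the probabilistic information in the model.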

Before Fisher, the notion of a statistical model was vague and often implicit, and its role was primarily confined to the description of the distributional features of the data in hand using the histogram and the first few sample moments; implicitly imposing random (IID) samples. The problem was that statisticians at the time would use descriptive summaries of the data to claim generality beyond the data in hand x₀:=(x₁,x₂,…,x_n). As late as the 1920s, the problem of statistical induction was understood by Karl Pearson in terms of invoking (i) the ‘stability’ of empirical results for subsequent samples and (ii) a prior distribution for θ.

Fisher was able to recast statistical inference by turning Karl Pearson’s approach, proceeding from data x₀ in search of a frequency curve f(x;ϑ) to describe its histogram, on its head. He proposed to begin with a prespecified M_θ(x) (a ‘hypothetical infinite population’), and view x₀ as a ‘typical’ realization thereof; see Spanos (1999). Continue reading →

Categories: Fisher, Spanos, Statistics | Tags: E S Pearson, Frequentist inference, induction, Jerzy Neyman, Models/Modelling, Ronald Fisher, statistical model | Leave a comment

Aris Spanos: The Enduring Legacy of R. A. Fisher

Posted on February 18, 2014 by Mayo

More Fisher insights from A. Spanos, this from 2 years ago:

M_θ(x)={f(x;θ); θ∈Θ}; x∈Rⁿ;Θ⊂R^m; m < n; (1)

where the distribution of the sample f(x;θ) ‘encapsulates’ the probabilistic information in the statistical model.

Before Fisher, the notion of a statistical model was vague and often implicit, and its role was primarily confined to the description of the distributional features of the data in hand using the histogram and the first few sample moments; implicitly imposing random (IID) samples. The problem was that statisticians at the time would use descriptive summaries of the data to claim generality beyond the data in hand x₀:=(x₁,x₂,…,x_n). As late as the 1920s, the problem of statistical induction was understood by Karl Pearson in terms of invoking (i) the ‘stability’ of empirical results for subsequent samples and (ii) a prior distribution for θ.

In my mind, Fisher’s most enduring contribution is his devising a general way to ‘operationalize’ errors by embedding the material experiment into M_θ(x), and taming errors via probabilification, i.e. to define frequentist error probabilities in the context of a statistical model. These error probabilities (a) are deductively derived from the statistical model, and (b) provide a measure of the ‘effectiveness’ of the inference procedure: how often a certain method will give rise to correct inferences concerning the underlying ‘true’ Data Generating Mechanism (DGM). This cast aside the need for a prior. Both of these key elements, the statistical model and the error probabilities, have been refined and extended by Mayo’s error statistical approach (EGEK 1996). Learning from data is achieved when an inference is reached by an inductive procedure which, with high probability, will yield true conclusions from valid inductive premises (a statistical model); Mayo and Spanos (2011). Continue reading →
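Points (a) and (b) can be illustrated with a small sketch. It assumes a simple Normal model with known σ and a two-sided test of μ = 0 at level α; the test, the parameter values, and all function names are assumptions made for illustration, not taken from the post. The error probabilities are first derived from the model, then compared with how often the procedure actually rejects over repeated simulated samples.

```python
import math
import random
from statistics import NormalDist

STD_NORMAL = NormalDist()


def derived_rejection_prob(mu, sigma, n, alpha):
    """Probability of rejecting mu = 0, deduced from the Normal model itself."""
    crit = STD_NORMAL.inv_cdf(1 - alpha / 2)
    shift = math.sqrt(n) * mu / sigma
    return (1 - STD_NORMAL.cdf(crit - shift)) + STD_NORMAL.cdf(-crit - shift)


def simulated_rejection_freq(mu, sigma, n, alpha, reps, seed):
    """Relative frequency of rejections over repeated samples from the model."""
    rng = random.Random(seed)
    crit = STD_NORMAL.inv_cdf(1 - alpha / 2)
    rejections = 0
    for _ in range(reps):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        z_stat = math.sqrt(n) * (sum(sample) / n) / sigma
        if abs(z_stat) > crit:
            rejections += 1
    return rejections / reps


n, sigma, alpha, reps = 25, 1.0, 0.05, 10_000

# Type I error probability: the data are generated with mu = 0.
type1_derived = derived_rejection_prob(0.0, sigma, n, alpha)
type1_simulated = simulated_rejection_freq(0.0, sigma, n, alpha, reps, seed=1)

# Power against the (hypothetical) alternative mu = 0.5.
power_derived = derived_rejection_prob(0.5, sigma, n, alpha)
power_simulated = simulated_rejection_freq(0.5, sigma, n, alpha, reps, seed=2)

print("type I error:", type1_derived, "vs simulated", type1_simulated)
print("power:       ", power_derived, "vs simulated", power_simulated)
```

No prior for μ appears anywhere: both error probabilities follow from the model's assumptions alone, which is also why they are only as trustworthy as those assumptions.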

Categories: Fisher, phil/history of stat, Statistics | Tags: E S Pearson, Frequentist inference, induction, Jerzy Neyman, Models/Modelling, Ronald Fisher, statistical model | 2 Comments

What’s in a Name? (Gelman’s blog)

Posted on July 31, 2012 by Mayo

I just noticed Andrew Gelman’s blog today; it is too good to let pass without a quick comment. He asks:

What is a Bayesian?

Deborah Mayo recommended that I consider coming up with a new name for the statistical methods that I used, given that the term “Bayesian” has all sorts of associations that I dislike (as discussed, for example, in section 1 of this article).

I replied that I agree on Bayesian, I never liked the term and always wanted something better, but I couldn’t think of any convenient alternative. Also, I was finding that Bayesians (even the Bayesians I disagreed with) were reading my research articles, while non-Bayesians were simply ignoring them. So I thought it was best to identify with, and communicate with, those people who were willing to engage with me.

More formally, I’m happy defining “Bayesian” as “using inference from the posterior distribution, p(theta|y)”. This says nothing about where the probability distributions come from (thus, no requirement to be “subjective” or “objective”) and it says nothing about the models (thus, no requirement to use the discrete models that have been favored by the Bayesian model selection crew). Based on my minimal definition, I’m as Bayesian as anyone else.

He may be “as Bayesian as anyone else,” but does he really want to be as Bayesian as anyone? (slight, deliberate equivocation). As a good Popperian, I concur (with Popper) that names should not matter, but Gelman’s remarks suggest he should distinguish himself, at least philosophically[i].

As in note [iv] of my Wasserman deconstruction: “Even where Bayesian methods are usefully applied, some say ‘most of the standard philosophy of Bayes is wrong’ (Gelman and Shalizi 2012, 2 n2)”.

In the paper Gelman today cites (from our RMM collection):

… we see science – and applied statistics – as resolving anomalies via the creation of improved models which often include their predecessors as special cases. This view corresponds closely to the error-statistics idea of Mayo (1996). (Gelman 2011, 70)

If the foundations for these methods are error statistical, then shouldn’t that come out in the description (error-statistical Bayes?)? It seems sufficiently novel to warrant some greater gesture than ‘this too is Bayesian’.

In that spirit I ended my deconstruction with the passage:

Ironically many seem prepared to allow that Bayesianism still gets it right for epistemology, even as statistical practice calls for methods more closely aligned with frequentist principles. What I would like the reader to consider is that what is right for epistemology is also what is right for statistical learning in practice. That is, statistical inference in practice deserves its own epistemology. (Mayo 2011, p. 100)

What do people think?

[i] To Gelman’s credit, he is one of the few contemporary statisticians to (openly) recognize the potential value of philosophy of statistics for statistical practice!

Categories: Statistics | Tags: Andrew Gelman, error-statistical Bayes, posterior distribution, statistical model | 2 Comments

A. Spanos: Jerzy Neyman and his Enduring Legacy

Posted on April 16, 2012 by Mayo

A Statistical Model as a Chance Mechanism

Aris Spanos

Jerzy Neyman (April 16, 1894 – August 5, 1981) was a Polish/American statistician[i] who spent most of his professional career at the University of California, Berkeley. Neyman is best known in statistics for his pioneering contributions in framing the Neyman-Pearson (N-P) optimal theory of hypothesis testing and his theory of Confidence Intervals.

One of Neyman’s most remarkable, but least recognized, achievements was his adapting of Fisher’s (1922) notion of a statistical model to render it pertinent for non-random samples. Fisher’s original parametric statistical model M_θ(x) was based on the idea of ‘a hypothetical infinite population’, chosen so as to ensure that the observed data x₀:=(x₁,x₂,…,x_n) can be viewed as a ‘truly representative sample’ from that ‘population’:

“The postulate of randomness thus resolves itself into the question, ‘Of what population is this a random sample?’” (ibid., p. 313), underscoring that “the adequacy of our choice may be tested a posteriori” (p. 314).

In cases where data x₀ come from sample surveys, or can be viewed as a typical realization of a random sample X:=(X₁,X₂,…,X_n), i.e. Independent and Identically Distributed (IID) random variables, the ‘population’ metaphor can be helpful in adding some intuitive appeal to the inductive dimension of statistical inference, because one can imagine using a subset of a population (the sample) to draw inferences pertaining to the whole population.

This ‘infinite population’ metaphor, however, is of limited value in most applied disciplines relying on observational data. To see how inept this metaphor is, consider the question: what is the hypothetical ‘population’ when modeling the gyrations of stock market prices? More generally, what is observed in such cases is a certain on-going process and not a fixed population from which we can select a representative sample. For that very reason, most economists in the 1930s considered Fisher’s statistical modeling irrelevant for economic data! Continue reading →

Categories: Statistics | Tags: chance mechanism, frequentist probability, induction, Jerzy Neyman, Ronald Fisher, statistical Generating Mechanism, statistical model | 2 Comments

Misspecification Tests: (part 4) and brief concluding remarks

Posted on February 28, 2012 by Mayo

The Nature of the Inferences From Graphical Techniques: What is the status of the learning from graphs? In this view, the graphs afford good ideas about the kinds of violations for which it would be useful to probe, much as looking at a forensic clue (e.g., footprint, tire track) helps to narrow down the search for a given suspect, or a fault-tree for a given cause. The same discernment can be achieved with a formal analysis (with parametric and nonparametric tests), perhaps more discriminating than can be accomplished by even the most trained eye, but the reasoning and the justification are much the same. (The capabilities of these techniques may be checked by simulating data deliberately generated to violate or obey the various assumptions.)
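The parenthetical suggestion about checking a technique's capabilities by simulation can be sketched as follows. This is a deliberately simple stand-in, not one of the misspecification tests used in the series: it flags dependence when the lag-1 sample autocorrelation is large relative to its approximate standard error under independence, and counts how often that happens for data generated to obey independence (ρ = 0) versus data generated to violate it (a first-order autoregression with ρ = 0.5). All names and settings are assumptions chosen for illustration.

```python
import math
import random


def generate_series(n, rho, rng):
    """AR(1) series x_t = rho * x_{t-1} + e_t; rho = 0 gives IID N(0, 1) data."""
    x = [rng.gauss(0.0, 1.0 / math.sqrt(1.0 - rho ** 2))]  # stationary start
    for _ in range(n - 1):
        x.append(rho * x[-1] + rng.gauss(0.0, 1.0))
    return x


def lag1_autocorrelation(x):
    """Sample autocorrelation of the series at lag 1."""
    mean = sum(x) / len(x)
    num = sum((x[t] - mean) * (x[t - 1] - mean) for t in range(1, len(x)))
    den = sum((xt - mean) ** 2 for xt in x)
    return num / den


def flags_dependence(x, crit=1.96):
    """Flag the series when sqrt(n) * r1 exceeds the approximate 5% cutoff."""
    return abs(math.sqrt(len(x)) * lag1_autocorrelation(x)) > crit


def detection_rate(rho, n=100, reps=1000, seed=0):
    """Share of simulated series flagged as dependent."""
    rng = random.Random(seed)
    hits = sum(flags_dependence(generate_series(n, rho, rng)) for _ in range(reps))
    return hits / reps


rate_when_obeyed = detection_rate(rho=0.0, seed=1)    # independence holds
rate_when_violated = detection_rate(rho=0.5, seed=2)  # independence violated

print("flag rate, assumption obeyed:  ", rate_when_obeyed)
print("flag rate, assumption violated:", rate_when_violated)
```

A check that rarely flags data obeying the assumption and almost always flags data violating it has the kind of capability the passage describes; the same exercise can be repeated for departures from Normality or from homogeneity.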

The combined indications from the graphs point to departures from the LRM in the direction of the DLRM, but only, for the moment, as indicating a fruitful model to probe further. We are not licensed to infer that it is itself a statistically adequate model until its own assumptions are subsequently tested. Even when they are checked and found to hold up – which they happen to do in this case – our inference must still be qualified. While we may infer that the model is statistically adequate, this should be understood only as licensing the use of the model as a reliable tool for primary statistical inferences, not necessarily as representing the substantive phenomenon being modeled.

Continue reading →

Categories: Intro MS Testing, Statistics | Tags: Aris Spanos, Duhem's problem, piecemeal inquiry, statistical model, testing model assumptions | 6 Comments

Guest Blogger. ARIS SPANOS: The Enduring Legacy of R. A. Fisher

Posted on February 15, 2012 by Mayo

By Aris Spanos

M_θ(x)={f(x;θ); θ∈Θ}; x∈Rⁿ;Θ⊂R^m; m < n; (1)

where the distribution of the sample f(x;θ) ‘encapsulates’ the probabilistic information in the statistical model.

Continue reading →

Categories: Statistics | Tags: E S Pearson, Frequentist inference, induction, Jerzy Neyman, Models/Modelling, Ronald Fisher, statistical model | 5 Comments

© Deborah G. Mayo, Error Statistics Philosophy, 2011-2018 All Rights Reserved.