**A Statistical Model as a Chance Mechanism**

**Aris Spanos**

**Jerzy Neyman** (April 16, 1894 – August 5, 1981) was a Polish/American statistician[i] who spent most of his professional career at the University of California, Berkeley. Neyman is best known in statistics for his pioneering contributions in framing the Neyman-Pearson (N-P) optimal theory of hypothesis testing and his theory of Confidence Intervals.

One of Neyman’s most remarkable, but least recognized, achievements was his adaptation of Fisher’s (1922) notion of a statistical model to make it applicable to non-random samples. Fisher’s original parametric statistical model M_{θ}(**x**) was based on the idea of ‘a hypothetical infinite population’, chosen so as to ensure that the observed data **x**_{0}:=(x_{1},x_{2},…,x_{n}) can be viewed as a ‘truly representative sample’ from that ‘population’:

“The postulate of randomness thus resolves itself into the question, ‘Of what population is this a random sample?’” (ibid., p. 313), underscoring that “the adequacy of our choice may be tested a posteriori” (p. 314).

In cases where the data **x**_{0} come from sample surveys, or can be viewed as a typical realization of a random sample **X**:=(X_{1},X_{2},…,X_{n}), i.e. Independent and Identically Distributed (IID) random variables, the ‘population’ metaphor can be helpful in adding some intuitive appeal to the inductive dimension of statistical inference, because one can imagine using a subset of a population (the sample) to draw inferences pertaining to the whole population.

This ‘infinite population’ metaphor, however, is of limited value in most applied disciplines relying on observational data. To see how ill-suited this metaphor is, consider the question: what is the hypothetical ‘population’ when modeling the gyrations of stock market prices? More generally, what is observed in such cases is a certain on-going process and not a fixed population from which we can select a representative sample. For that very reason, most economists in the 1930s considered Fisher’s statistical modeling irrelevant for economic data!

Due primarily to Neyman’s experience with empirical modeling in a number of applied fields, including genetics, agriculture, epidemiology, biology, astronomy and economics, his notion of a statistical model evolved in the 1930s beyond Fisher’s ‘infinite populations’ into Neyman’s frequentist ‘chance mechanisms’ (see Neyman, 1950, 1952):

Guessing and then verifying the ‘chance mechanism’, the repeated operation of which produces the observed frequencies. This is a problem of ‘frequentist probability theory’. Occasionally, this step is labeled ‘model building’. Naturally, the guessed chance mechanism is hypothetical. (Neyman, 1977, p. 99)

From my perspective, this was a major step forward for several reasons, including the following.

*First*, the notion of a statistical model as a ‘chance mechanism’ extended the intended scope of statistical modeling to include dynamic phenomena that give rise to data from non-IID samples, i.e. data that exhibit both dependence and heterogeneity, like stock prices.

*Second*, the notion of a statistical model as a ‘chance mechanism’ is not only of metaphorical value, but it can be operationalized in the context of a statistical model, formalized by:

M_{θ}(**x**)={f(**x**;θ), θ∈Θ}, **x**∈R^{n}, Θ⊂R^{m}, m << n,

where the distribution of the sample f(**x**;θ) describes the probabilistic assumptions of the statistical model. This takes the form of a statistical Generating Mechanism (GM), stemming from f(**x**;θ), that can be used to generate simulated data on a computer. An example of such a Statistical GM is:

X_{t} = α_{0} + α_{1}X_{t-1} + σε_{t}, *t=1,2,…,n*

This indicates how one can use *pseudo-random* numbers for the error term ε_{t} ~ NIID(0,1) to simulate data for the Normal, AutoRegressive [AR(1)] model. One can generate numerous sample realizations, say N=100000, of sample size *n* in a matter of seconds on a PC.
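As a rough illustration of the statistical GM above, the following sketch generates realizations of the AR(1) process using only Python's standard library. The function name `simulate_ar1`, the starting value `x0`, the seed, and the parameter values are illustrative assumptions, not taken from the post.

```python
import random

def simulate_ar1(n, alpha0, alpha1, sigma, x0=0.0, rng=None):
    """Return one realization (x_1, ..., x_n) of
    X_t = alpha0 + alpha1 * X_{t-1} + sigma * eps_t, with eps_t ~ NIID(0, 1).

    x0 is an assumed fixed starting value for X_0 (hypothetical choice).
    """
    rng = rng or random.Random()
    x_prev = x0
    path = []
    for _ in range(n):
        eps = rng.gauss(0.0, 1.0)  # pseudo-random standard Normal error
        x_prev = alpha0 + alpha1 * x_prev + sigma * eps
        path.append(x_prev)
    return path

# Illustrative values only: N = 1000 realizations, each of sample size n = 100.
rng = random.Random(12345)
realizations = [simulate_ar1(100, 1.0, 0.5, 1.0, rng=rng) for _ in range(1000)]
```

Each element of `realizations` is one simulated sample of size n; raising the number of realizations toward N=100000 only changes the range of the final loop.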

*Third*, the notion of a statistical model as a ‘chance mechanism’ puts a totally different spin on another metaphor widely used by uninformed critics of frequentist inference. This is the ‘long-run’ metaphor associated with the relevant error probabilities used to calibrate frequentist inferences. The operationalization of the statistical GM reveals that the temporal aspect of this metaphor is totally irrelevant for frequentist inference; remember Keynes’s catch phrase “In the long run we are all dead”? Instead, what matters in practice is its *repeatability in principle*, not over time! For instance, one can use the above statistical GM to generate the empirical sampling distribution of any test statistic, and thus render operational not only the pre-data error probabilities, such as the type I and type II error probabilities and the power of a test, but also the post-data probabilities associated with the severity evaluation; see Mayo (1996).
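To make the point about repeatability concrete, here is a minimal sketch that uses the AR(1) statistical GM to approximate pre-data error probabilities by simulation. It assumes a test of the hypothesis α_{1}=0 based on the least-squares t-ratio and the asymptotic Normal 5% critical value 1.96; the helper names, sample size, number of replications, seed and parameter values are all hypothetical choices made for illustration, and the severity evaluation is not attempted here.

```python
import math
import random

def simulate_ar1(n, alpha0, alpha1, sigma, rng, x0=0.0):
    """One realization of X_t = alpha0 + alpha1*X_{t-1} + sigma*eps_t (x0 assumed fixed)."""
    x_prev, path = x0, []
    for _ in range(n):
        x_prev = alpha0 + alpha1 * x_prev + sigma * rng.gauss(0.0, 1.0)
        path.append(x_prev)
    return path

def t_ratio_alpha1(path, alpha1_null=0.0):
    """t-ratio for alpha1 from the least-squares regression of X_t on (1, X_{t-1})."""
    y, z = path[1:], path[:-1]
    m = len(y)
    z_bar, y_bar = sum(z) / m, sum(y) / m
    szz = sum((zi - z_bar) ** 2 for zi in z)
    szy = sum((zi - z_bar) * (yi - y_bar) for zi, yi in zip(z, y))
    a1 = szy / szz                      # slope estimate of alpha1
    a0 = y_bar - a1 * z_bar             # intercept estimate of alpha0
    rss = sum((yi - a0 - a1 * zi) ** 2 for zi, yi in zip(z, y))
    se = math.sqrt(rss / (m - 2) / szz)  # estimated standard error of a1
    return (a1 - alpha1_null) / se

def rejection_frequency(true_alpha1, n=50, reps=2000, crit=1.96, seed=2024):
    """Relative frequency of |t-ratio| > crit over `reps` simulated realizations."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(reps):
        path = simulate_ar1(n, 0.0, true_alpha1, 1.0, rng)
        if abs(t_ratio_alpha1(path)) > crit:
            rejections += 1
    return rejections / reps

size = rejection_frequency(0.0)    # empirical type I error probability
power = rejection_frequency(0.5)   # empirical power at alpha1 = 0.5
print(f"empirical type I error: {size:.3f}, empirical power: {power:.3f}")
```

Nothing in this calculation refers to time: the relative frequencies are obtained by repeating the same chance mechanism on a computer, which is the sense of repeatability in principle.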

For further discussion on the above issues see:

Spanos, A. (2012), “A Frequentist Interpretation of Probability for Model-Based Inductive Inference,” forthcoming in *Synthese*:

http://www.econ.vt.edu/faculty/2008vitas_research/Spanos/1Spanos-2011-Synthese.pdf

Fisher, R. A. (1922), “On the mathematical foundations of theoretical statistics,” *Philosophical Transactions of the Royal Society* A, 222: 309-368.

Mayo, D. G. (1996), *Error and the Growth of Experimental Knowledge*, The University of Chicago Press, Chicago.

Neyman, J. (1950), *First Course in Probability and Statistics*, Henry Holt, NY.

Neyman, J. (1952), *Lectures and Conferences on Mathematical Statistics and Probability*, 2nd ed., U.S. Department of Agriculture, Washington.

Neyman, J. (1977), “Frequentist Probability and Frequentist Statistics,” *Synthese*, 36: 97-131.

[i] He was born in an area that was part of Russia.

One thing about Neyman (1977): While fascinating and unusual (if only because it’s in a philosophy journal and is relatively contemporary), it makes Neyman appear more behavioristic than he really is. In his attempt to forcefully deny that one needs “repeated sampling from the same population,” he emphasizes that the error probabilities hold even when lumping together lots of different studies. From this, J. Berger gets his “frequentist principle”. Then again, another example in Neyman (1977), as I recall, illuminates/endorses empirical Bayes.

In my experience physicists are the worst when it comes to bringing up the howler mentioned in the post about the “long run”. One asked me why it’s ok to say “the probability of rolling all sixes 100,000 times in a row is 1/6^100000” but not ok to say “the probability a neutrino has mass is x%”.

In the “long run” it’s not possible to repeat these dice rolling trials enough to verify that each 100,000-tuple outcome occurs with probability 1/6^100000, since the solar system will end before you can finish. The vast majority of outcomes will have relative frequency 0 when the sun explodes.

But you can at least imagine repeating the dice experiment enough times. That’s why it’s ok. It’s important that you can at least imagine the repeated trials; otherwise you’re just left with whatever is in Bayesians’ heads that allows them to assign probabilities.

He just claimed that if the String Theorists are right it might be possible to check the relative frequency of universes in the multiverse that have massless neutrinos. I don’t know anything about physics so I don’t know what to make of that.