Intro MS Testing

Return to Classical Epistemology: Sensitivity and Severity: Gardiner and Zaharatos (2022) (i)


Picking up where I left off in a 2023 post, I will (finally!) return to Gardiner and Zaharatos’s discussion of sensitivity in epistemology and its connection to my notion of severity. But before turning to Parts II (and III), I’d better reblog Part I. Here it is:

I’ve been reading an illuminating paper by Georgi Gardiner and Brian Zaharatos (Gardiner and Zaharatos, 2022; hereafter, G & Z), “The safe, the sensitive and the severely tested,” that forges links between contemporary epistemology and my severe testing account. It’s part of a collection published in Synthese on “Recent issues in Philosophy of Statistics”. Gardiner and Zaharatos were among the 15 faculty who attended the 2019 summer seminar in philstat that I ran (with Aris Spanos). The authors courageously jump over some high hurdles separating the two projects (whether a palisade or a ha-ha; see G & Z) and manage to bring them into close connection. The traditional epistemologist is largely focused on the analytic task of defining what is meant by knowledge (generally restricted to low-level perceptual claims, or claims about single events), whereas the severe tester is keen to articulate when scientific hypotheses are well or poorly warranted by data. Still, while severity grows out of statistical testing, I intend the account to hold for any case of error-prone inference. So it should stand up to the examples one meets in the jungles of epistemology. For all of the examples I’ve seen so far, it does. I will admit, the epistemologists have storehouses of thorny examples, many of which I’ll come back to. This will be part 1 of two, possibly even three, posts on the topic; revisions to this part will be indicated with ii, iii, etc., and no, I haven’t used a chatbot or anything in writing this. Continue reading

Categories: severity and sensitivity in epistemology | 1 Comment

Sensitivity and Severity: Gardiner and Zaharatos (2022) (i)


Categories: severity and sensitivity in epistemology | 2 Comments

Phil 6334: Misspecification Testing: Ordering From A Full Diagnostic Menu (part 1)


We’re going to be discussing the philosophy of m-s testing today in our seminar, so I’m reblogging this from Feb. 2012. I’ve linked the 3 follow-ups below. Check the original posts for some good discussion. (Note visitor*)

“This is the kind of cure that kills the patient!”

is the line of Aris Spanos that I most remember from when I first heard him talk about testing the assumptions of, and respecifying, statistical models in 1999. (The patient, of course, is the statistical model.) On finishing my book, EGEK (1996), I had been keen to fill its central gaps, one of which was fleshing out a crucial piece of the error-statistical framework of learning from error: how to validate the assumptions of statistical models. But the whole problem turned out to be far more philosophically—not to mention technically—challenging than I imagined. I will try (in 3 short posts) to sketch a procedure that I think puts the entire process of model validation on a sound logical footing. Continue reading

Categories: Intro MS Testing, Statistics | Tags: , , , , | 16 Comments

Misspecification Tests: (part 4) and brief concluding remarks

The Nature of the Inferences From Graphical Techniques: What is the status of the learning from graphs? In this view, the graphs afford good ideas about the kinds of violations for which it would be useful to probe, much as looking at a forensic clue (e.g., a footprint or tire track) helps to narrow down the search for a given suspect, or a fault tree for a given cause. The same discernment can be achieved with a formal analysis (with parametric and nonparametric tests), perhaps more discriminating than can be accomplished by even the most trained eye, but the reasoning and the justification are much the same. (The capabilities of these techniques may be checked by simulating data deliberately generated to violate or obey the various assumptions.)
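The parenthetical point can be made concrete. Here is a minimal sketch (in Python, not from the original posts): simulate one series that obeys the NIID assumptions and one deliberately generated with a trending mean, then check that a simple formal test can discriminate them. A hand-rolled Wald–Wolfowitz runs test stands in here for the fuller battery of parametric and nonparametric tests; the slope and seed are illustrative choices only.

```python
import numpy as np

def runs_test_z(x):
    """Wald-Wolfowitz runs test about the median: the normal-approximation
    z statistic. Values far from 0 indicate departures from IID; a trending
    mean produces very few runs, hence a large negative z."""
    s = np.sign(x - np.median(x))
    s = s[s != 0]                           # drop ties with the median
    runs = 1 + np.sum(s[1:] != s[:-1])      # count of sign runs
    n1, n2 = np.sum(s > 0), np.sum(s < 0)
    n = n1 + n2
    mu = 2 * n1 * n2 / n + 1                # expected runs under IID
    var = 2 * n1 * n2 * (2 * n1 * n2 - n) / (n ** 2 * (n - 1))
    return (runs - mu) / np.sqrt(var)

rng = np.random.default_rng(0)
t = np.arange(200)
niid = rng.normal(size=200)                      # obeys the NIID assumptions
trending = 0.05 * t + rng.normal(size=200)       # violates ID: trending mean

print(runs_test_z(niid))       # typically near 0 for an IID realization
print(runs_test_z(trending))   # strongly negative: the trend is detected
```

Running the same check on data generated to obey the assumptions (the first series) calibrates what the test reports when nothing is wrong, which is exactly the "capability check" described above.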

The combined indications from the graphs point to departures from the LRM in the direction of the DLRM, but only, for the moment, as flagging a fruitful model to probe further. We are not licensed to infer that it is itself a statistically adequate model until its own assumptions are subsequently tested. Even when they are checked and found to hold up – which they happen to be in this case – our inference must still be qualified. While we may infer that the model is statistically adequate, this should be understood only as licensing the use of the model as a reliable tool for primary statistical inferences, not necessarily as representing the substantive phenomenon being modeled.

Continue reading

Categories: Intro MS Testing, Statistics | Tags: , , , , | 6 Comments

Misspecification Testing: (part 3) Subtracting-out effects “on paper”


A Better Way  The traditional approach described in Part 2 did not detect the presence of mean-heterogeneity and so it misidentified temporal dependence as the sole source of misspecification associated with the original LRM.

On the basis of figures 1-3 we can summarize our progress in detecting potential departures from the LRM model assumptions to probe thus far:

LRM                                           Alternatives
(D) Distribution:    Normal                   ?
(M) Dependence:      Independent              ?
(H) Heterogeneity:   Identically Distributed  mean-heterogeneity

Discriminating and Amplifying the Effects of Mistakes  We could correctly assess dependence if our data were ID and not obscured by the influence of the trending mean. Although we cannot literally manipulate the relevant factors, we can ‘subtract out’ the trending mean in a generic way to see what it would be like if there were no trending mean. Here are the detrended xt and yt.

 

Fig. 4: Detrended Population (y - trend )

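The "subtracting out on paper" step is ordinary detrending. A minimal sketch in Python: since the actual population series is not reproduced here, a simulated stand-in with a linear trend is used, and the numbers are illustrative only.

```python
import numpy as np

rng = np.random.default_rng(1)
t = np.arange(35)                          # 35 annual observations, as in 1955-1989
y = 150 + 2.5 * t + rng.normal(scale=2, size=t.size)  # stand-in series with trending mean

# Fit the trend by least squares and subtract it "on paper"
coef = np.polyfit(t, y, deg=1)             # a linear trend suffices for this sketch
y_detrended = y - np.polyval(coef, t)

# With the trending mean removed, the residual series can be inspected for
# dependence without the trend masking (or mimicking) it.
print(y_detrended.mean())                  # essentially zero by construction
```

The same subtraction applied to each series (y and the "secret" x) yields the detrended plots shown above.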

Continue reading

Categories: Intro MS Testing, Statistics | Tags: , , , | 11 Comments

Misspecification Testing: (part 2) A Fallacy of Error “Fixing”


Part 1 is here.

Graphing t-plots (This is my first experiment with blogging data plots; they have been blown up a bit, so hopefully they are now sufficiently readable.)

Here are two plots (t-plots) of the observed data, where yt is the population of the USA in millions and xt our “secret” variable, to be revealed later on, both over time (1955-1989).

Fig 1: USA Population (y)



Fig. 2: Secret variable (x)


Figure 3: A typical realization of a NIID process.

Pretty clearly, there are glaring departures from IID when we compare a typical realization of a NIID process, in fig. 3, with the t-plots of the two series in figures 1-2. In particular, both data series show the mean is increasing with time – that is, strong mean-heterogeneity (trending mean).

Our recommended next step would be to continue exploring the probabilistic structure of the data in figures 1 and 2 with a view toward thoroughly assessing the validity of the LRM assumptions [1]-[5] (table 1). But first let us take a quick look at the traditional approach for testing assumptions, focusing just on assumption [4], traditionally viewed as error non-autocorrelation: E(u_t u_s) = 0 for t ≠ s, t, s = 1, 2, …, n. Continue reading
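To see how a check of assumption [4] alone can be misled, consider the Durbin-Watson statistic, the standard textbook test for first-order error autocorrelation (near 2 when there is none, near 0 under strong positive autocorrelation). A rough sketch in Python with simulated stand-in data, not the post's actual series: residuals from a model that ignores a trending mean register as "autocorrelated" even though the real departure is mean-heterogeneity.

```python
import numpy as np

def durbin_watson(resid):
    """Durbin-Watson statistic: near 2 when residuals show no first-order
    autocorrelation; near 0 under strong positive autocorrelation."""
    return np.sum(np.diff(resid) ** 2) / np.sum(resid ** 2)

rng = np.random.default_rng(2)
t = np.arange(35)                         # 35 annual observations, as above

# Residuals from a correctly specified (constant-mean, NIID) series
white = rng.normal(size=t.size)
dw_ok = durbin_watson(white - white.mean())

# Residuals from fitting a constant mean to a series whose mean trends upward
y = 0.2 * t + rng.normal(scale=0.5, size=t.size)
dw_trend = durbin_watson(y - y.mean())

print(dw_ok)     # typically near 2
print(dw_trend)  # far below 2: "autocorrelation" is flagged, though the
                 # true departure is the trending mean
```

This is the fallacy the post's title warns about: "fixing" the apparent autocorrelation (e.g., by modeling the errors as AR(1)) treats a symptom while misdiagnosing its source.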

Categories: Intro MS Testing, Statistics | Tags: , , , , , | Leave a comment

Intro to Misspecification Testing: Ordering From A Full Diagnostic Menu (part 1)

 

“This is the kind of cure that kills the patient!”

is the line of Aris Spanos that I most remember from when I first heard him talk about testing the assumptions of, and respecifying, statistical models in 1999. (The patient, of course, is the statistical model.) On finishing my book, EGEK (1996), I had been keen to fill its central gaps, one of which was fleshing out a crucial piece of the error-statistical framework of learning from error: how to validate the assumptions of statistical models. But the whole problem turned out to be far more philosophically—not to mention technically—challenging than I imagined. I will try (in 3 short posts) to sketch a procedure that I think puts the entire process of model validation on a sound logical footing. Thanks to attending several of Spanos’ seminars (and his patient tutorials, for which I am very grateful), I was eventually able to reflect philosophically on aspects of his already well-worked-out approach. (Synergies with the error statistical philosophy, of which this is a part, warrant a separate discussion.)

Continue reading

Categories: Intro MS Testing, Statistics | Tags: , , , , | 20 Comments
