This was initially posted as slides from our joint Spring 2014 seminar: “Talking Back to the Critics Using Error Statistics”. (You can enlarge them.) Related reading is Mayo and Spanos (2011)

At the start of our seminar, I said that “on weekends this spring (in connection with Phil 6334, but not limited to seminar participants) I will post some of my ‘deconstructions’ of articles”. I began with Andrew Gelman’s note “Ethics and the statistical use of prior information”[i], but never posted my deconstruction of it. So since it’s Saturday night, and the seminar is just ending, here it is, along with related links to Stat and ESP research (including me, Jack Good, Persi Diaconis and Pat Suppes). Please share comments, especially in relation to current-day ESP research. Continue reading

Categories: Background knowledge, Gelman, Phil6334, Statistics
35 Comments

Aris Spanos’ overview of error statistical responses to familiar criticisms of statistical tests. Related reading is Mayo and Spanos (2011)

**S. Stanley Young, PhD**

Assistant Director for Bioinformatics

National Institute of Statistical Sciences

Research Triangle Park, NC

Here are Dr. Stanley Young’s slides from our April 25 seminar. They contain several tips for unearthing deception by fraudulent p-value reports. Since it’s Saturday night, you might wish to perform an experiment with three 10-sided dice*, recording the results of 100 rolls (3 at a time) on the form on slide 13. An entry, e.g., (0,1,3), becomes an imaginary p-value of .013 associated with a hypothesis (type of tumor, male vs. female, old vs. young). You report only the hypotheses whose null is rejected at a “p-value” less than .05. Forward your results to me for publication in a peer-reviewed journal.

*Sets of 10-sided dice will be offered as a palindrome prize beginning in May.
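The dice exercise is easy to simulate if you lack the dice. Below is a minimal Python sketch (my own illustration, not part of Young’s slides) that rolls three imaginary 10-sided dice 100 times, reads each triple (a,b,c) as an imaginary p-value .abc, and “publishes” only the results below .05. Since the imaginary p-values are uniform on {.000, …, .999}, you should expect about 5 pure-noise “findings” per 100 hypotheses.

```python
import random

random.seed(1)  # any seed; fixed here only so the run is repeatable

# Roll three 10-sided dice (faces 0-9) 100 times; read each triple
# (a, b, c) as an imaginary p-value 0.abc, as in the slide-13 form.
rolls = [(random.randint(0, 9), random.randint(0, 9), random.randint(0, 9))
         for _ in range(100)]
p_values = [a / 10 + b / 100 + c / 1000 for a, b, c in rolls]

# "Publish" only the hypotheses whose imaginary p-value falls below .05;
# every one of these is a false positive by construction.
significant = [p for p in p_values if p < 0.05]
print(f"{len(significant)} of 100 pure-noise 'results' are significant at .05")
```

Run it a few times with different seeds: the count of “significant” noise hovers around 5, which is the whole point of the exercise.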

Categories: Phil6334, science communication, spurious p values, Statistical fraudbusting, Statistics
Tags: S. Stanley Young
12 Comments

We are pleased to announce our guest speaker at Thursday’s seminar (April 24, 2014), speaking on “Statistics and Scientific Integrity”:

**S. Stanley Young, PhD**

Assistant Director for Bioinformatics

National Institute of Statistical Sciences

Research Triangle Park, NC

Author of *Resampling-Based Multiple Testing*, Westfall and Young (1993), Wiley.

The main readings for the discussion are:

- Young, S. & Karr, A. (2011). “Deming, Data and Observational Studies.” *Significance* 8(3): 116–120.
- Begley, C. G. & Ellis, L. M. (2012). “Raise Standards for Preclinical Cancer Research.” *Nature* 483: 531–533.
- Ioannidis, J. P. A. (2005). “Why Most Published Research Findings Are False.” *PLoS Medicine* 2(8): e124.
- Peng, R. D., Dominici, F. & Zeger, S. L. (2006). “Reproducible Epidemiologic Research.” *American Journal of Epidemiology* 163(9): 783–789.

We interspersed key issues from the reading for this session (from Howson and Urbach) with portions of my presentation at the Boston Colloquium (Feb. 2014): “Revisiting the Foundations of Statistics in the Era of Big Data: Scaling Up to Meet the Challenge.” (Slides below.)*

*Someone sent us a recording (mp3) of the panel discussion from that Colloquium (there’s a lot on “big data” and its politics), including Mayo, Xiao-Li Meng (Harvard), Kent Staley (St. Louis), and Mark van der Laan (Berkeley).

See if this works: mp3

*There’s a prelude here to our visitor on April 24: Professor Stanley Young from the National Institute of Statistical Sciences.

Categories: Bayesian/frequentist, Error Statistics, Phil6334
43 Comments

A question came up in our seminar today about how to understand the duality between a simple one-sided test and a lower limit (LL) of a corresponding 1-sided confidence interval estimate. This is also a good route to SEV (i.e., severity). Here’s a quick answer: Continue reading
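For concreteness, here is a minimal numerical sketch of that duality (my numbers are hypothetical; this is not the seminar’s worked example). For a Normal mean with known σ, the one-sided test of H0: μ ≤ μ0 vs H1: μ > μ0 rejects at level α exactly when the lower limit of the corresponding one-sided (1 − α) interval exceeds μ0, and the same quantities yield SEV for claims of the form μ > μ1.

```python
from statistics import NormalDist

# Hypothetical setup: Normal data, known sigma, one-sided test of
# H0: mu <= mu0 vs H1: mu > mu0 at level alpha.
mu0, sigma, n, alpha = 0.0, 1.0, 25, 0.05
z_alpha = NormalDist().inv_cdf(1 - alpha)   # ~1.645 for alpha = .05
se = sigma / n ** 0.5                       # standard error of the mean

xbar = 0.4                                  # hypothetical observed mean

# The test rejects when xbar exceeds the cutoff mu0 + z_alpha * se.
reject = xbar > mu0 + z_alpha * se

# The dual one-sided (1 - alpha) interval has lower limit LL.
lower_limit = xbar - z_alpha * se

# Duality: the test rejects H0 exactly when LL exceeds mu0.
assert reject == (lower_limit > mu0)

# SEV(mu > mu1): probability of a less extreme result than xbar were mu = mu1.
mu1 = 0.1
sev = NormalDist().cdf((xbar - mu1) / se)
print(f"reject={reject}, LL={lower_limit:.3f}, SEV(mu > {mu1})={sev:.3f}")
```

With these numbers the cutoff is about 0.329, so xbar = 0.4 rejects, the lower limit is about 0.071 > μ0, and SEV(μ > 0.1) ≈ .933 — the three answers move together, as the duality requires.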

Categories: confidence intervals and tests, Phil6334
Leave a comment

Central Identification Laboratory

JPAC

*Guest, March 27, Phil 6334*

“Statistical Considerations of the Histomorphometric Test Protocol for Determination of Human Origin of Skeletal Remains”

By:

John E. Byrd, Ph.D. D-ABFA

Maria-Teresa Tersigni-Tarrant, Ph.D.

Categories: Phil6334, Philosophy of Statistics, Statistics
1 Comment

We spent the first half of Thursday’s seminar discussing the Fisher, Neyman, and E. Pearson “triad”[i]. So, since it’s Saturday night, join me in rereading for the nth time these three *very short* articles. The key issues were: error of the second kind, behavioristic vs. evidential interpretations, and Fisher’s mysterious fiducial intervals. Although we often hear exaggerated accounts of the differences between the Fisherian and Neyman-Pearson (N-P) methodologies, in fact N-P were simply providing Fisher’s tests with a logical ground (even though other foundations for tests are still possible), and Fisher welcomed this gladly. Notably, N-P showed that with only a single (null) hypothesis it was possible to have tests where the probability of rejecting the null when it is true exceeds the probability of rejecting it when it is false. Hacking called such tests “worse than useless”, and N-P developed a theory of testing that avoids such problems. Statistical journalists who report on the alleged “inconsistent hybrid” (a term popularized by Gigerenzer) should recognize the extent to which the apparent disagreements on method reflect professional squabbling between Fisher and Neyman after 1935. [A recent example is a Nature article by R. Nuzzo in ii below.] The two types of tests are best seen as asking different questions in different contexts. They both follow error-statistical reasoning. Continue reading

Categories: phil/history of stat, Phil6334, science communication, Severity, significance tests, Statistics
Tags: Nuzzo
35 Comments

Below are slides from March 6, 2014: (a) the 2nd half of “Frequentist Statistics as a Theory of Inductive Inference” (Selection Effects),* and (b) the discussion of the Higgs particle discovery and the controversy over 5 sigma.

We spent the rest of the seminar computing significance levels, rejection regions, and power (by hand and with the Excel program). Here is the updated syllabus (3rd installment).

A relevant paper on selection effects on this blog is here.

Categories: Higgs, P-values, Phil6334, selection effects
Leave a comment

Statistical power is one of the neatest [i], yet most misunderstood, statistical notions [ii]. So here’s a visual illustration (written initially for our 6334 seminar), but worth a look by anyone who wants an easy way to attain *the will to understand power*. (Please see notes below slides.)

[i] I was tempted to say power is one of the “most powerful” notions. It is. True, severity leads us to look not at the cut-off for rejection (as with power) but at the actual observed value, or observed p-value. But the reasoning is the same. Likewise for less artificial cases where the standard deviation has to be estimated. See Mayo and Spanos 2006.

[ii]

- Some say that to compute power requires either knowing the alternative hypothesis (whatever that means), or worse, the alternative’s prior probability! Then there’s the tendency (by reformers no less!) to transpose power in such a way as to get the appraisal of tests exactly backwards. An example is Ziliak and McCloskey (2008). See, for example, the will to understand power: https://errorstatistics.com/2011/10/03/part-2-prionvac-the-will-to-understand-power/
- Many allege that a null hypothesis may be rejected (in favor of alternative H’) with greater warrant, the greater the power of the test against H’, e.g., Howson and Urbach (2006, 154). But this is mistaken. The frequentist appraisal of tests is the reverse, whether Fisherian significance tests or those of the Neyman-Pearson variety. One may find the fallacy exposed back in Morrison and Henkel (1970)! See EGEK 1996, pp. 402-3.
- For a humorous post on this fallacy, see: “The fallacy of rejection and the fallacy of nouvelle cuisine”: https://errorstatistics.com/2012/04/04/jackie-mason/
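The reversal described in the second bullet is easy to exhibit numerically. The sketch below is a stand-in, not the Severity Excel Program, and all the numbers are hypothetical. It compares two sample sizes for a one-sided Normal test of H0: μ ≤ 0: for an observed mean just reaching the rejection cutoff, the higher-powered test licenses a *weaker* inference to μ > μ1 — in this boundary case SEV(μ > μ1) = 1 − POW(μ1), so rejection with high power against μ1 is poor grounds for inferring a discrepancy as large as μ1.

```python
from statistics import NormalDist

# Hypothetical one-sided Normal test, H0: mu <= 0 vs H1: mu > 0,
# known sigma = 1, level alpha = .05; mu1 is the alternative of interest.
mu0, sigma, alpha, mu1 = 0.0, 1.0, 0.05, 0.2
z_alpha = NormalDist().inv_cdf(1 - alpha)

def power(n):
    """POW(mu1): probability the test rejects H0 when mu = mu1 is true."""
    se = sigma / n ** 0.5
    return NormalDist().cdf((mu1 - mu0) / se - z_alpha)

def severity_at_cutoff(n):
    """SEV(mu > mu1) for an observed mean just reaching the cutoff."""
    se = sigma / n ** 0.5
    xbar = mu0 + z_alpha * se   # observed mean exactly at the rejection cutoff
    return NormalDist().cdf((xbar - mu1) / se)

for n in (25, 400):
    print(f"n={n:3d}: POW(mu1) = {power(n):.3f}, "
          f"SEV(mu > mu1) at cutoff = {severity_at_cutoff(n):.3f}")
```

With these numbers, n = 25 gives power ≈ .26 against μ1 = 0.2 but severity ≈ .74 for μ > 0.2 at the cutoff, while n = 400 gives power ≈ .99 but severity ≈ .01: the higher-powered test’s bare rejection warrants the *smaller* inferred discrepancy, the opposite of the Howson and Urbach direction.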

You can find a link to the Severity Excel Program (from which the pictures came) in the left-hand column of this blog, and a link to basic instructions. This corresponds to EXAMPLE SET 1 (pdf) for Phil 6334.

Howson, C. and P. Urbach (2006). *Scientific Reasoning: The Bayesian Approach*. La Salle, IL: Open Court.

Mayo, D. G. and A. Spanos (2006). “Severe Testing as a Basic Concept in a Neyman-Pearson Philosophy of Induction.” *British Journal for the Philosophy of Science* 57: 323–357.

Morrison, D. and Henkel, R. (eds.) (1970). *The Significance Test Controversy*. Chicago: Aldine.

Ziliak, S. and McCloskey, D. (2008). *The Cult of Statistical Significance: How the Standard Error Costs Us Jobs, Justice and Lives*. University of Michigan Press.

Categories: Phil6334, Statistical power, Statistics
26 Comments

PHIL 6334 – “Probability/Statistics Lecture Notes 3 for 2/20/14: Estimation (Point and Interval)” (Prof. Spanos)*

*This is Day #5 on the Syllabus, as Day #4 had to be made up (Feb 24, 2014) due to snow. Slides for Day #4 will go up Feb. 26, 2014. (See the revised Syllabus Second Installment.)

Categories: Phil6334, Philosophy of Statistics, Spanos
5 Comments

No Seminar. Blizzard.

Categories: Announcement, Phil6334
Leave a comment

## Power taboos: Statue of Liberty, Senn, Neyman, Carnap, Severity

Is it taboo to use a test’s power to assess what may be learned from the data in front of us? (Is it limited to pre-data planning?) If not entirely taboo, some regard power as irrelevant post-data[i], and the reason I’ve heard is along the lines of an analogy Stephen Senn gave today (in a comment discussing his last post here)[ii].

My fire alarm analogy is here. My analogy presumes you are assessing the situation (about the fire) long distance. Continue reading →