Posts Tagged With: power paradox

Answer to the Homework & a New Exercise

Debunking the “power paradox” allegation from my previous post. The authors consider a one-tailed Z test of the hypothesis H0: μ ≤ 0 versus H1: μ > 0: our Test T+. The observed sample mean is M = 1.4; in the first case σx = 1, and in the second case σx = 2.

First case: The power against μ = 3.29 is high, .95 (i.e., P(Z > 1.645; μ = 3.29) = 1 − Φ(−1.645) = .95), and thus the DDS assessor would take the nonsignificant result as a good indication that μ < 3.29.

Second case: For σx = 2, the cut-off for rejection would be 0 + 1.645(2) = 3.29.

So, in the second case (σx = 2) the probability of erroneously accepting H0, even if μ were as high as 3.29, is .5! (i.e., P(Z ≤ 1.645; μ = 3.29) = Φ(1.645 − 3.29/2) = Φ(0) = .5.) Although p1 < p2,[i] the justifiable upper bound in the first test is smaller (closer to 0) than in the second! Hence, the DDS assessment is entirely in keeping with the appropriate use of error probabilities in interpreting tests. There is no conflict with p-value reasoning.
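The figures in both cases are easy to verify numerically. Here is a minimal sketch in Python; the use of scipy's normal CDF is my choice of tool, not anything in the post:

```python
# Reproduce the two cases of Test T+: H0: mu <= 0 vs H1: mu > 0,
# observed mean M = 1.4, with sigma_x = 1 (first case) and sigma_x = 2 (second).
from scipy.stats import norm

z_alpha = 1.645   # one-tailed cutoff for the Z statistic at alpha = .05
m_obs = 1.4       # observed sample mean in both cases
mu_alt = 3.29     # discrepancy under scrutiny

for sigma_x in (1, 2):
    z_obs = m_obs / sigma_x
    p_value = 1 - norm.cdf(z_obs)                       # P(Z > z_obs; mu = 0)
    cutoff = z_alpha * sigma_x                          # rejection cutoff on the mean scale
    power = 1 - norm.cdf((cutoff - mu_alt) / sigma_x)   # P(reject; mu = 3.29)
    accept_prob = 1 - power                             # P(accept H0; mu = 3.29)
    print(f"sigma_x={sigma_x}: z_obs={z_obs:.2f}, p={p_value:.3f}, "
          f"power={power:.2f}, P(accept; mu=3.29)={accept_prob:.2f}")
```

Running this gives p-values of about .081 and .242, power .95 versus .50 against μ = 3.29, and hence acceptance probabilities .05 versus .50, matching the numbers above.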


The DDS power analyst always takes the worst case of just missing the cut-off for rejection. Compare instead

SEV(μ < 3.29) for the first test, and SEV(μ < 3.29) for the second (using the actual outcomes as SEV requires).
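For readers who want to check their answer to this exercise: under the usual definition for test T+, SEV(μ < 3.29) with a nonsignificant result is P(M > m_obs; μ = 3.29), computed from the actual outcome rather than the worst case at the cut-off. A minimal sketch (scipy is my choice of tool here):

```python
# SEV(mu < 3.29) for Test T+ given the actual nonsignificant outcome M = 1.4:
# the probability of a result larger than the one observed, were mu = 3.29.
from scipy.stats import norm

m_obs, mu = 1.4, 3.29
for sigma_x in (1, 2):
    sev = 1 - norm.cdf((m_obs - mu) / sigma_x)
    print(f"sigma_x = {sigma_x}: SEV(mu < {mu}) = {sev:.3f}")
```

Unlike the worst-case DDS calculation, both severities are computed from the same observed mean, so the comparison between the two tests comes out differently.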

[i] p1= .081 and p2 = .242.


U-Phil: Is the Use of Power* Open to a Power Paradox?

* to assess Detectable Discrepancy Size (DDS)

In my last post, I argued that DDS type calculations (also called Neymanian power analysis) provide needful information to avoid fallacies of acceptance in the test T+; whereas, the corresponding confidence interval does not (at least not without special testing supplements).  But some have argued that DDS computations are “fundamentally flawed” leading to what is called the “power approach paradox”, e.g., Hoenig and Heisey (2001).

We are to consider two variations on the one-tailed test T+: H0: μ ≤ 0 versus H1: μ > 0 (p. 21).  Following their terminology and symbols:  The Z value in the first, Zp1, exceeds the Z value in the second, Zp2, although the same observed effect size occurs in both[i], and both have the same sample size, implying that σ1 < σ2.  For example, suppose σx1 = 1 and σx2 = 2.  Let observed sample mean M be 1.4 for both cases, so Zp1 = 1.4 and Zp2 = .7. They note that for any chosen power, the computable detectable discrepancy size will be smaller in the first experiment, and for any conjectured effect size, the computed power will always be higher in the first experiment.
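Their setup is easy to reproduce numerically. A minimal sketch (scipy's normal CDF is my choice of tool, not theirs) showing that for any conjectured effect size the first experiment has the higher power, even though it also has the smaller p-value:

```python
# Hoenig & Heisey's two variations on Test T+: same observed mean M = 1.4,
# same sample size, sigma_x1 = 1 and sigma_x2 = 2, so Zp1 = 1.4 > Zp2 = 0.7.
from scipy.stats import norm

m_obs = 1.4
z_alpha = 1.645
sigmas = (1, 2)

print("Z values:", [m_obs / s for s in sigmas])   # 1.4 and 0.7
for mu in (1.0, 2.0, 3.29):
    # Power = P(Z > z_alpha; mu) = 1 - Phi(z_alpha - mu/sigma_x)
    p1, p2 = (1 - norm.cdf(z_alpha - mu / s) for s in sigmas)
    print(f"mu = {mu}: power1 = {p1:.3f}, power2 = {p2:.3f}")
```

Every row has power1 > power2, which is the pattern their "paradox" trades on.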

“These results lead to the nonsensical conclusion that the first experiment provides the stronger evidence for the null hypothesis (because the apparent power is higher but significant results were not obtained), in direct contradiction to the standard interpretation of the experimental results (p-values).” (p. 21)

But rather than showing the DDS assessment to be “nonsensical”, or revealing any direct contradiction with interpreting p-values, this just demonstrates something nonsensical in their interpretation of the two p-value results from tests with different variances. Since it’s Sunday night and I’m nursing[ii] overexposure to rowing in the Queen’s Jubilee boats in the rain and wind, how about you find the howler in their treatment. (Also please inform us of articles pointing this out in the last decade, if you know of any.)


Hoenig, J. M. and D. M. Heisey (2001), “The Abuse of Power: The Pervasive Fallacy of Power Calculations in Data Analysis,” The American Statistician, 55: 19-24.


[i] The subscript indicates the p-value of the associated Z value.

[ii] With English tea and a cup of strong “Elbar grease”.

