EGEK weighs 1 pound
Before leaving base again, I have a rule to check for weight gain since the start of my last trip. I put this off until the last minute, especially when, like this time, I know I’ve overeaten while traveling. The most accurate of the 4 scales I generally use (one is at my doctor’s) is actually in Neyman’s Nursery upstairs. To my surprise, none of these scales showed any discernible increase over when I left. At least one of the 4 scales would surely have registered a weight gain of 1 pound or more, had I gained it, and yet none of them did; that is an indication I’ve not gained a pound or more. I check that each scale reliably indicates 1 pound, because I know that is the weight of the book EGEK (you can even see this on the scale shown), and they each show exactly one pound when EGEK is weighed. Having evidence I’ve gained less than 1 pound, there are even fewer grounds for supposing I’ve gained as much as 5 pounds, right?
This kind of measure of the capability of a method to detect a change or discrepancy is very much a Power-type notion (whether formal or informal). Analogously, if an experimental test very probably would have rejected the null hypothesis were the correct value of μ as large as μ’ (i.e., if the Power of the test against μ’ were very high), then a non-rejection is an indication that μ is not as large as μ’. This is a general type of Power Analytic reasoning.
Now if you invent a notion that is supposed to be akin to Power in appraising evidence, call it Shpower, and yet this reasoning does not hold, or comes out backwards, then you have not provided grounds against the Power analytic reasoning. Rather, you have provided grounds against supposing that your notion, Shpower, successfully captures the intended idea of Power.
I was pleased to discover all those stat blogs in the top 50 when my ragtag blog was included. And yet I quickly came across some confusions of some basic statistical notions within discussions of recommended tricks of the trade. One of them was Shpower!
I keep to the one-sided Normal test T+ (in the Nov 8 blog): Test T+: reject the null iff X > μ0 + 1.96(σ/√n), where X is the sample mean, corresponding to significance level .025.
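For readers who like to check such things numerically, here is a minimal sketch of the ordinary Power of T+ against an alternative μ’. The function name power_t_plus and the numbers μ0 = 0, σ = 1, n = 25 are illustrative assumptions only (they are not from the Nov 8 blog); the sketch relies solely on the sample mean being Normal with mean μ and standard deviation σ/√n.

```python
from math import sqrt
from statistics import NormalDist


def power_t_plus(mu_alt, mu0, sigma, n, z_cut=1.96):
    """Ordinary Power of test T+ against mu_alt: P(X > cutoff; mu = mu_alt)."""
    se = sigma / sqrt(n)          # standard deviation of the sample mean X
    cutoff = mu0 + z_cut * se     # T+ rejects the null iff X > cutoff
    return 1.0 - NormalDist(mu=mu_alt, sigma=se).cdf(cutoff)


if __name__ == "__main__":
    # Hypothetical numbers, chosen only for illustration.
    mu0, sigma, n = 0.0, 1.0, 25
    se = sigma / sqrt(n)
    # At mu = mu0 the rejection probability is the significance level, about .025.
    print(round(power_t_plus(mu0, mu0, sigma, n), 3))
    # Power grows as the alternative mu' moves further above the null value.
    for k in (1, 2, 3, 4):
        print(k, round(power_t_plus(mu0 + k * se, mu0, sigma, n), 3))
```

Note that the alternative μ’ is fixed in advance here. If the Power against that μ’ is high and the test nonetheless fails to reject, that is the indication that μ is not as large as μ’.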
Define the Shpower of test T+: it is the same as ordinary Power for test T+, but computed under the assumption that μ equals the observed sample mean, i.e., under the assumption that μ = X0.
The Shpower of test T+: P(X > μ0 + 1.96(σ/√n); μ = X0)
Now spoze the observed X just fails to reach the cut-off for rejecting the null; let
X0 = μ0 + 1.96(σ/√n).
The Shpower of the test T+ to reject the null hypothesis (μ ≤ μ0), calculated under the assumption that μ = X0, is .5!
The Shpower value would be even smaller than .5 for smaller observed sample means! But smaller sample means should be even more indicative of small discrepancies from the null. And yet, the less the sample mean differs from the null, the smaller the Shpower!
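The backwards behavior is easy to see numerically. The following is a rough sketch only: the function name shpower_t_plus and the values μ0 = 0, σ = 1, n = 25 are made up for illustration. It computes the Power formula for T+ with the observed mean X0 plugged in for μ.

```python
from math import sqrt
from statistics import NormalDist


def shpower_t_plus(x_obs, mu0, sigma, n, z_cut=1.96):
    """Shpower of T+: P(X > cutoff; mu = x_obs), i.e. Power computed at mu = X0."""
    se = sigma / sqrt(n)          # standard deviation of the sample mean X
    cutoff = mu0 + z_cut * se     # T+ rejects the null iff X > cutoff
    return 1.0 - NormalDist(mu=x_obs, sigma=se).cdf(cutoff)


if __name__ == "__main__":
    # Hypothetical numbers, chosen only for illustration.
    mu0, sigma, n = 0.0, 1.0, 25
    se = sigma / sqrt(n)
    cutoff = mu0 + 1.96 * se
    print(shpower_t_plus(cutoff, mu0, sigma, n))   # 0.5 when X0 sits at the cut-off
    # Observed means closer and closer to the null give smaller and smaller Shpower.
    for k in (1.5, 1.0, 0.5, 0.0):
        print(k, round(shpower_t_plus(mu0 + k * se, mu0, sigma, n), 3))
```

For any non-rejecting outcome (X0 at or below the cut-off) the value never exceeds .5, and it shrinks as X0 approaches μ0, which is just the pattern described above.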
From this, some conclude that Power analysis is paradoxical and inconsistent with p-value reasoning!
But they should really only conclude that Shpower analytic reasoning is paradoxical and flawed!
The only way to go is to start with sound principles of statistical reasoning and let them guide you! A poor way to proceed, I hope to convince you, is starting with an invented computation (which might bear some relation to a standard one) and getting all knotted up in paradox, concluding (erroneously) that the standard notion is paradoxical.
I have been promising to come back to this business of “observed power” or “estimated power” and now I have, for Shpower is none other than observed Power!
Arbuckle’s “Lies and Statistics”, October 31 2008