Monthly Archives: June 2026

‘Low power’ and an all too standard error (continuation of “don’t turn power on its head”)


“In my opinion, a great deal of confusion about statistics can be traced to the fact that the point estimate is seen as being the be all and end all, the expression of uncertainty being forgotten….to provide a point estimate without also providing a standard error is, indeed, an all too standard error.”

Stephen Senn: “Error point: the importance of knowing how much you don’t know”

 

In my previous blogpost (“How not to turn power on its head”), I argued, in relation to a one-sided test of a mean μ (e.g., H₀: μ ≤ 0 vs H₁: μ > 0 with known SE):

If POW(μ′) is high (e.g., over .5), then a just significant result is poor evidence that μ > μ′; while if POW(μ′) is low (e.g., less than .2), it is good evidence that μ > μ′, where μ′ is a value greater than 0 (provided assumptions for these claims hold approximately).

By a “just statistically significant result” I mean one that just makes it to the threshold for statistical significance; write it as M* (my last post used D*). The reasoning is essentially this: Because it’s very improbable to obtain as low a P-value as we did, were μ as small as μ′ (that is, because POW(μ′) is low), the result indicates we are in a world where μ is greater than μ′. This is exactly the reasoning that allows us to infer μ > μ₀ with a statistically significant result. Indeed, the power of the test against μ₀ is α.
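
To make the definition concrete, here is a minimal sketch of POW(μ′) for the one-sided test with known SE, assuming the sample mean M is Normally distributed. The function names and the numbers (μ₀ = 0, SE = 1, α = .025) are my own illustrative choices, not taken from any particular study.

```python
from statistics import NormalDist

Z = NormalDist()  # standard Normal distribution

def cutoff(mu0, se, alpha):
    """Just-significant sample mean M* for the one-sided test of mu <= mu0."""
    return mu0 + Z.inv_cdf(1 - alpha) * se

def power(mu_prime, mu0, se, alpha):
    """POW(mu') = P(M >= M*; mu = mu'), assuming M ~ Normal(mu', se)."""
    m_star = cutoff(mu0, se, alpha)
    return 1 - Z.cdf((m_star - mu_prime) / se)

# Hypothetical numbers, chosen only for illustration.
mu0, se, alpha = 0.0, 1.0, 0.025
m_star = cutoff(mu0, se, alpha)

print("M*        :", round(m_star, 3))
print("POW(mu0)  :", round(power(mu0, mu0, se, alpha), 3))     # equals alpha
print("POW(M*)   :", round(power(m_star, mu0, se, alpha), 3))  # equals 0.5
print("POW(0.5)  :", round(power(0.5, mu0, se, alpha), 3))     # low power
```

With these assumed numbers, the power against μ′ = 0.5 comes out well under .2, so a just significant M* would count as good evidence that μ > 0.5.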

Why then do we often hear that low power is associated with “exaggerated” or “inflated” effects? As we reasoned in the previous post, low power against μ′ strengthens the inference that μ exceeds μ′. Can the same feature, low power, also be associated with overestimation? The answer is yes, it can, but only one of the claims corresponds to a correct application of statistical significance tests.

More specifically, the overestimation charge stems from supposing the observed result M* is taken as a (point) estimate of the population mean (i.e., estimating μ = M*, without providing the SE), an unkosher (but not so uncommon) move, and then considering a value μ′ against which the test has low power. Since M* is the just-significant cutoff, clearly M* will exceed μ′ (at least in a good test). So if the true population mean takes a value against which the test has low power, and M* is taken as a point estimate of μ, the result will be to “overestimate” the population mean. While the true value is unknown, this if-then claim is correct. Of course, if the power to detect the true μ is high (over .5), the observed M* will underestimate μ, if M* is used as a point estimate.

To clarify these points, it helps to contrast two different questions that are often run together:

  1. Does the observed (just) statistically significant result M* warrant inferring μ > μ′ (when POW(μ′) is low)?
  2. Does the observed (just) statistically significant result M* exceed μ′ (when POW(μ′) is low)?

The answer to both questions is yes. The very fact invoked to show that M* exceeds μ′ (yielding a “yes” answer to #2), namely, that a result at least as large as M* would be improbable were μ = μ′, is precisely what warrants inferring that μ > μ′ (yielding a “yes” answer to #1).

However common it may be to identify the observed statistically significant result M* with the population mean, that is not the inference warranted from a significance test. For one thing, significance test inferences are inequalities, not point claims or point estimates. A statistically significant result warrants inferring μ > μ₀ and, more generally, warrants inferring μ > μ′ for values μ′ against which the test has sufficiently low power, although it is not typically put that way. It would more typically be put in terms of the p-value reached. What would the p-value be were we testing H₀: μ ≤ M* and observed our just significant result M*? Answer: .5.

There is, of course, a relation to estimation. Rejecting H₀ is equivalent to inferring that μ exceeds the corresponding lower confidence bound, for a reasonably high confidence level.  Obtaining this lower bound requires subtracting a number of SEs (e.g., 1.5, 1.65, 1.96, 2) from M*.
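
As a rough illustration of this duality, the sketch below computes a one-sided lower confidence bound by subtracting z standard errors from the observed mean, assuming a Normal sample mean with known SE. The helper name and the numbers (μ₀ = 0, SE = 1, α = .05) are hypothetical.

```python
from statistics import NormalDist

def lower_bound(m_obs, se, conf_level):
    """One-sided lower confidence bound: m_obs minus z standard errors."""
    z = NormalDist().inv_cdf(conf_level)
    return m_obs - z * se

# Hypothetical numbers, chosen only for illustration.
mu0, se, alpha = 0.0, 1.0, 0.05
m_star = mu0 + NormalDist().inv_cdf(1 - alpha) * se  # just-significant result

# At confidence level 1 - alpha, the bound computed from M* sits at mu0,
# so "reject H0" and "mu exceeds the lower bound" make the same claim.
print(abs(lower_bound(m_star, se, 1 - alpha) - mu0) < 1e-9)

# Higher confidence levels subtract more SEs and give lower bounds.
for level in (0.933, 0.95, 0.975):
    print(level, round(lower_bound(m_star, se, level), 3))
```

The three levels in the loop correspond roughly to subtracting 1.5, 1.65 and 1.96 SEs.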

Observe that POW(μ₀)=α and POW(M*)=.5. As μ′ moves farther below M*:

| As μ′ moves farther below M* | Consequence |
| --- | --- |
| M* − μ′ increases | Greater overestimation if the observed M* is used to estimate μ |
| POW(μ′) decreases | The probability of obtaining M ≥ M* under μ = μ′ decreases |
| P-value for μ = μ′ decreases | Stronger evidence that μ > μ′ |

Thus, as power against μ′ decreases, the amount by which M* exceeds μ′ increases, but so too does the evidence that μ exceeds μ′. The very circumstance that yields greater overestimation when M* is used to estimate μ yields stronger evidence that μ exceeds μ′.
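
The relationship summarized above can be tabulated numerically. This is only a sketch under the same assumptions (Normal sample mean, known SE), working in SE units so that no particular μ₀ or α needs to be assumed; the function name is my own.

```python
from statistics import NormalDist

Z = NormalDist()  # standard Normal distribution

def tradeoff(gap_in_se):
    """For mu' lying gap_in_se standard errors below M*, return
    (M* - mu' in SE units, POW(mu')).
    POW(mu') = P(M >= M*; mu = mu') is also the P-value of the
    observed M* when testing mu <= mu'."""
    return gap_in_se, 1 - Z.cdf(gap_in_se)

rows = [tradeoff(g) for g in (0.0, 0.5, 1.0, 1.5, 2.0, 2.5)]
for gap, pow_ in rows:
    print(f"M* - mu' = {gap:.1f} SE    POW(mu') = P-value = {pow_:.3f}")
```

Reading down the printed rows, the gap M* − μ′ grows while POW(μ′), and with it the P-value against μ′, shrinks.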

One final point. If a testing procedure selectively reports only statistically significant results, then the original error probabilities no longer apply, whether to the test or to the equivalent CI estimation.
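
One way to see this, offered as a sketch rather than a general result: suppose “selective reporting” means that only results with M ≥ M* are reported, and ask how often the one-sided (1 − α) lower bound M − z·SE is correct among the reported results. The setup (Normal sample mean, known SE, μ₀ = 0, SE = 1, α = .05) and the function name are my assumptions.

```python
from statistics import NormalDist

Z = NormalDist()  # standard Normal distribution

def coverage_given_significant(mu_true, mu0, se, alpha):
    """P(M - z*SE < mu_true | M >= M*): how often the nominal
    (1 - alpha) lower bound is correct among significant results only."""
    z = Z.inv_cdf(1 - alpha)
    m_star = mu0 + z * se
    # The bound is correct when M < mu_true + z*SE; selection needs M >= M*.
    upper = mu_true + z * se
    p_selected = 1 - Z.cdf((m_star - mu_true) / se)
    p_both = max(0.0, Z.cdf((upper - mu_true) / se)
                 - Z.cdf((m_star - mu_true) / se))
    return p_both / p_selected

# Hypothetical numbers, chosen only for illustration.
mu0, se, alpha = 0.0, 1.0, 0.05
for mu_true in (0.0, 1.0, 2.0, 5.0):
    cov = coverage_given_significant(mu_true, mu0, se, alpha)
    print(f"true mu = {mu_true:.1f}   coverage among reported = {cov:.3f}")
```

Under these assumptions the conditional coverage is zero when μ = μ₀ and climbs toward the nominal .95 only as the true μ moves well above the cutoff.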

Share your queries and thoughts in the comments to this post.

 

For a related post see “Do underpowered tests exaggerate population effects?”

See also the discussion on pp. 359-361 of Mayo (2018, CUP): Statistical Inference as Severe Testing: How to get beyond the statistics wars? (SIST). The relevant excerpt can be found here.

 

 

Categories: power, reforming the reformers
