Objectivity

Will the Real Junk Science Please Stand Up?

Junk Science (as first coined).* Have you ever noticed in wranglings over evidence-based policy that it’s always one side that’s politicizing the evidence—the side whose policy one doesn’t like? The evidence on the near side, or your side, however, is solid science. Let’s call those who first coined the term “junk science” Group 1. For Group 1, junk science is bad science that is used to defend pro-regulatory stances, whereas sound science would identify errors in reports of potential risk. (Yes, this was the first popular use of “junk science”, to my knowledge.) For the challengers—let’s call them Group 2—junk science is bad science that is used to defend the anti-regulatory stance, whereas sound science would identify potential risks, advocate precautionary stances, and recognize errors where risk is denied.

Both groups agree that politicizing science is very, very bad—but it’s only the other group that does it!

A given print exposé exploring the distortions of fact on one side or the other routinely showers wild praise on their side’s—their science’s and their policy’s—objectivity, their adherence to the facts, just the facts. How impressed might we be with the text or the group that admitted to its own biases? Continue reading

Categories: 4 years ago!, junk science, Objectivity, Statistics | 29 Comments

Objectivity in Statistics: “Arguments From Discretion and 3 Reactions”

We constantly hear that procedures of inference are inescapably subjective because of the latitude of human judgment as it bears on the collection, modeling, and interpretation of data. But this is seriously equivocal: Being the product of a human subject is hardly the same as being subjective, at least not in the sense we are speaking of—that is, as a threat to objective knowledge. Are all these arguments about the allegedly inevitable subjectivity of statistical methodology rooted in equivocations? I argue that they are! [This post combines this one and this one, as part of our monthly “3 years ago” memory lane.]

“Argument from Discretion” (dirty hands)

Insofar as humans conduct science and draw inferences, it is obvious that human judgments and human measurements are involved. True enough, but too trivial an observation to help us distinguish among the different ways judgments should enter, and how, nevertheless, to avoid introducing bias and unwarranted inferences. The issue is not that a human is doing the measuring, but whether we can reliably use the thing being measured to find out about the world.

Remember the dirty-hands argument? In the early days of this blog (e.g., October 13, 16), I deliberately took up this argument as it arises in evidence-based policy because it offered a certain clarity that I knew we would need to come back to in considering general “arguments from discretion”. To abbreviate:

  1. Numerous human judgments go into specifying experiments, tests, and models.
  2. Because there is latitude and discretion in these specifications, they are “subjective.”
  3. Whether data are taken as evidence for a statistical hypothesis or model depends on these subjective methodological choices.
  4. Therefore, statistical inference and modeling are invariably subjective, if only in part.

We can spot the fallacy in the argument much as we did in the dirty hands argument about evidence-based policy. It is true, for example, that a very insensitive test for detecting a positive discrepancy d’ from a 0 null has low probability of finding statistical significance even if a discrepancy as large as d’ exists. But that doesn’t prevent us from determining, objectively, that an insignificant difference from that test fails to warrant inferring evidence of a discrepancy less than d’.

Test specifications may well be a matter of personal interest and bias, but, given the choices made, whether or not an inference is warranted is not a matter of personal interest and bias. Setting up a test with low power against d’ might be a product of your desire not to find an effect for economic reasons, of insufficient funds to collect a larger sample, or of the inadvertent choice of a bureaucrat. Or ethical concerns may have entered. But none of this precludes our critical evaluation of what the resulting data do and do not indicate (about the question of interest). The critical task need not itself be a matter of economics, ethics, or what have you. Critical scrutiny of evidence reflects an interest all right: an interest in not being misled, an interest in finding out what the case is, and others of an epistemic nature.
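To make the point about low power concrete, here is a minimal sketch. It assumes a one-sided z-test of a 0 null with known standard deviation; the function name `power_one_sided_z` and all numerical values are hypothetical choices for illustration, not taken from any particular study. Whatever the motives behind the choice of sample size, the power against d’ is a fact about the test that anyone can compute.

```python
from math import sqrt
from statistics import NormalDist


def power_one_sided_z(d, sigma, n, alpha=0.05):
    """Probability that a one-sided z-test of H0: mu = 0 (against mu > 0)
    reaches statistical significance when the true discrepancy is d.

    Assumes (for illustration only) normal data with known sigma.
    """
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha)   # cutoff for significance
    shift = d * sqrt(n) / sigma              # mean of the z statistic under mu = d
    return 1 - std_normal.cdf(z_crit - shift)


# Hypothetical values: the discrepancy of interest d' and two sample sizes.
d_prime = 0.5
power_small_n = power_one_sided_z(d_prime, sigma=2.0, n=10)
power_large_n = power_one_sided_z(d_prime, sigma=2.0, n=400)

print(f"power against d' with n = 10:  {power_small_n:.3f}")
print(f"power against d' with n = 400: {power_large_n:.3f}")
```

With the small sample, the test would rarely reach significance even if a discrepancy as large as d’ existed, so a non-significant result from it gives poor grounds for inferring the discrepancy is less than d’. The same non-significant result from the large-sample test would be far more informative. That assessment does not depend on why the small sample was chosen.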

Categories: Objectivity, Statistics | 6 Comments

Objective/subjective, dirty hands and all that: Gelman/Wasserman blogolog (ii)

Andrew Gelman says that as a philosopher, I should appreciate his blog today in which he records his frustration: “Against aggressive definitions: No, I don’t think it helps to describe Bayes as ‘the analysis of subjective beliefs’…” Gelman writes:

I get frustrated with what might be called “aggressive definitions,” where people use a restrictive definition of something they don’t like. For example, Larry Wasserman writes (as reported by Deborah Mayo):

“I wish people were clearer about what Bayes is/is not and what frequentist inference is/is not. Bayes is the analysis of subjective beliefs but provides no frequency guarantees. Frequentist inference is about making procedures that have frequency guarantees but makes no pretense of representing anyone’s beliefs.”

I’ll accept Larry’s definition of frequentist inference. But as for his definition of Bayesian inference: No no no no no. The probabilities we use in our Bayesian inference are not subjective, or, they’re no more subjective than the logistic regressions and normal distributions and Poisson distributions and so forth that fill up all the textbooks on frequentist inference.

To quickly record some of my own frustrations:* First, I would disagree with Wasserman’s characterization of frequentist inference, but as is clear from Larry’s comments on my reaction to him, I think he concurs that he was just giving a broad contrast. Please see Note [1] for a remark from my post: Comments on Wasserman’s “what is Bayesian/frequentist inference?” Also relevant is a Gelman post on the Bayesian name: [2].

Second, Gelman’s “no more subjective than…” evokes remarks I’ve made before. For example, in “What should philosophers of science do…” I wrote:

Arguments given for some very popular slogans (mostly by non-philosophers), are too readily taken on faith as canon by others, and are repeated as gospel. Examples are easily found: all models are false, no models are falsifiable, everything is subjective, or equally subjective and objective, and the only properly epistemological use of probability is to supply posterior probabilities for quantifying actual or rational degrees of belief. Then there is the cluster of “howlers” allegedly committed by frequentist error statistical methods repeated verbatim (discussed on this blog).

I’ve written a lot about objectivity on this blog, e.g., here, here and here (and in real life), but what’s the point if people just rehearse the “everything is a mixture…” line, without making deeply important distinctions? I really think that, next to the “all models are false” slogan, the most confusion has been engendered by the “no methods are objective” slogan. However much we may aim at objective constraints, it is often urged, we can never have “clean hands” free of the influence of beliefs and interests, and we invariably sully methods of inquiry by the entry of background beliefs and personal judgments in their specification and interpretation. Continue reading

Categories: Bayesian/frequentist, Error Statistics, Gelman, Objectivity, Statistics | 41 Comments

Will the Real Junk Science Please Stand Up? (critical thinking)

Equivocations about “junk science” came up in today’s “critical thinking” class; if anything, the current situation is worse than 2 years ago when I posted this.

Have you ever noticed in wranglings over evidence-based policy that it’s always one side that’s politicizing the evidence—the side whose policy one doesn’t like? The evidence on the near side, or your side, however, is solid science. Let’s call those who first coined the term “junk science” Group 1. For Group 1, junk science is bad science that is used to defend pro-regulatory stances, whereas sound science would identify errors in reports of potential risk. For the challengers—let’s call them Group 2—junk science is bad science that is used to defend the anti-regulatory stance, whereas sound science would identify potential risks, advocate precautionary stances, and recognize errors where risk is denied. Both groups agree that politicizing science is very, very bad—but it’s only the other group that does it!

A given print exposé exploring the distortions of fact on one side or the other routinely showers wild praise on their side’s—their science’s and their policy’s—objectivity, their adherence to the facts, just the facts. How impressed might we be with the text or the group that admitted to its own biases?

Take, say, global warming, genetically modified crops, electric-power lines, medical diagnostic testing. Group 1 alleges that those who point up the risks (actual or potential) have a vested interest in construing the evidence that exists (and the gaps in the evidence) accordingly, which may bias the relevant science and pressure scientists to be politically correct. Group 2 alleges the reverse, pointing to industry biases in the analysis or reanalysis of data and pressures on scientists doing industry-funded work to go along to get along.

When the battle between the two groups is joined, issues of evidence—what counts as bad/good evidence for a given claim—and issues of regulation and policy—what are “acceptable” standards of risk/benefit—may become so entangled that no one recognizes how much of the disagreement stems from divergent assumptions about how models are produced and used, as well as from contrary stands on the foundations of uncertain knowledge and statistical inference. The core disagreement is mistakenly attributed to divergent policy values, at least for the most part. Continue reading

Categories: critical thinking, junk science, Objectivity | 16 Comments

Objectivity (#5): Three Reactions to the Challenge of Objectivity (in inference):

(1) If discretionary judgments are thought to introduce subjectivity in inference, a classic strategy thought to achieve objectivity is to extricate such choices, replacing them with purely formal a priori computations or agreed-upon conventions (see March 14).  If leeway for discretion introduces subjectivity, then cutting off discretion must yield objectivity!  Or so some argue. Such strategies may be found, to varying degrees, across the different approaches to statistical inference.

The inductive logics of the type developed by Carnap promised to be an objective guide for measuring degrees of confirmation in hypotheses, despite much-discussed problems, paradoxes, and conflicting choices of confirmation logics. In Carnapian inductive logics, initial assignments of probability are based on a choice of language and on intuitive, logical principles. The consequent logical probabilities can then be updated (given the statements of evidence) with Bayes’s Theorem. The fact that the resulting degrees of confirmation are at the same time analytical and a priori, giving them an air of objectivity, reveals the central weakness of such confirmation theories as “guides for life”, e.g., as guides for empirical frequencies or for finding things out in the real world. Something very similar happens with the varieties of “objective” Bayesian accounts, both in statistics and in formal Bayesian epistemology in philosophy (a topic to which I will return; if interested, see my RMM contribution).

A related way of trying to remove latitude for discretion might be to define objectivity in terms of the consensus of a specified group, perhaps of experts, or of agents with “diverse” backgrounds. Once again, such a convention may enable agreement yet fail to have the desired link-up with the real world.  It would be necessary to show why consensus reached by the particular choice of group (another area for discretion) achieves the learning goals of interest.

Categories: Objectivity, Statistics | Leave a comment

Objectivity (#4) and the “Argument From Discretion”

We constantly hear that procedures of inference are inescapably subjective because of the latitude of human judgment as it bears on the collection, modeling, and interpretation of data. But this is seriously equivocal: Being the product of a human subject is hardly the same as being subjective, at least not in the sense we are speaking of—that is, as a threat to objective knowledge. Are all these arguments about the allegedly inevitable subjectivity of statistical methodology rooted in equivocations? I argue that they are!

Insofar as humans conduct science and draw inferences, it is obvious that human judgments and human measurements are involved. True enough, but too trivial an observation to help us distinguish among the different ways judgments should enter, and how, nevertheless, to avoid introducing bias and unwarranted inferences. The issue is not that a human is doing the measuring, but whether we can reliably use the thing being measured to find out about the world.

Categories: Objectivity, Statistics | 29 Comments

Objectivity #3: Clean(er) Hands With Metastatistics

I claim that all but the first of the “dirty hands” argument’s five premises are flawed. Even the first premise too directly identifies a policy decision with a statistical report. But the key flaws begin with premise 2. Although risk policies may be based on a statistical report of evidence, it does not follow that the considerations suitable for judging risk policies are the ones suitable for judging the statistical report. They are not. The latter, of course, should not be reduced to some kind of unthinking accept/reject report. If responsible, it must clearly and completely report the nature and extent of (risk-related) effects that are and are not indicated by the data, making plain how the methodological choices made in the generation, modeling, and interpreting of data raise or lower the chances of finding evidence of specific risks. These choices may be called risk assessment policy (RAP) choices. Continue reading

Categories: Objectivity, Statistics | 10 Comments

Objectivity #2: The “Dirty Hands” Argument for Ethics in Evidence

Some argue that generating and interpreting data for purposes of risk assessment invariably introduces ethical (and other value) considerations that might not only go beyond, but might even conflict with, the “accepted canons of objective scientific reporting.” This thesis, which we may call the thesis of ethics in evidence and inference, is thought by some to show that an ethical interpretation of evidence may warrant violating canons of scientific objectivity, and even that a scientist must choose between norms of morality and objectivity.

The reasoning is that since the scientists’ hands must invariably get “dirty” with policy and other values, they should opt for interpreting evidence in a way that promotes ethically sound values, or maximizes public benefit (in some sense).

I call this the “dirty hands” argument, alluding to a term used by philosopher Carl Cranor (1994) (1).

I cannot say how far its proponents would endorse taking the argument (2). However, it seems that if this thesis is accepted, it may be possible to regard as “unethical” the objective reporting of scientific uncertainties in evidence. This consequence is worrisome: in fact, it would conflict with the generally accepted imperative for an ethical interpretation of scientific evidence.

Nevertheless, the “dirty hands” argument as advanced has apparently plausible premises, one or more of which would need to be denied to avoid the conclusion which otherwise follows deductively. It goes roughly as follows:

  1. Whether observed data are taken as evidence of a risk depends on a methodological decision as to when to reject the null hypothesis of no risk, H0 (and infer the data are evidence of a risk).
  2. Thus, in interpreting data to feed into policy decisions with potentially serious risks to the public, the scientist is actually engaged in matters of policy (what is generally framed as an issue of evidence and science is actually an issue of policy values, ethics, and politics).
  3. The public funds scientific research and the scientist should be responsible for promoting the public good, so scientists should interpret risk evidence so as to maximize public benefit.
  4. Therefore, a responsible (ethical) interpretation of scientific data on risks is one that maximizes public benefit, and one that does not do so is irresponsible or unethical.
  5. Public benefit is maximized by minimizing the chance of failing to find a risk. This leads to the conclusion in 6:
  6. CONCLUSION: In situations of risk assessment the ethical interpreter of evidence will maximize the chance of inferring there is a risk, even if this means inferring a risk when there is none with high probability (or at least a probability much higher than is normally countenanced).

The argument about ethics in evidence is often put in terms of balancing type I and type II errors.

Type I error: test T finds evidence of an increased risk (H0 is rejected), when in fact the risk is absent (false positive).

Type II error: test T does not find evidence of an increased risk (H0 is accepted), when in fact an increased risk δ is present (false negative).

The traditional balance of type I and type II error probabilities, wherein type I errors are minimized, some argue, is unethical. Rather than minimize type I errors, it might be claimed, an “ethical” tester should minimize type II errors.
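The trade-off at issue can be sketched numerically. The following is only an illustration under stated assumptions: a one-sided z-test of H0 (no increased risk) with known standard deviation and a fixed sample size; the function name `type_ii_error` and the values for δ, σ, and n are hypothetical. It shows that, with the data-generating setup held fixed, the type II error probability can be driven down only by tolerating a higher type I error probability.

```python
from math import sqrt
from statistics import NormalDist


def type_ii_error(alpha, delta, sigma, n):
    """Probability that a one-sided z-test of H0: mu = 0 fails to reject
    when an increased risk of size delta is actually present.

    Assumes (for illustration only) normal data with known sigma.
    """
    std_normal = NormalDist()
    cutoff = std_normal.inv_cdf(1 - alpha)          # rejection cutoff set by alpha
    return std_normal.cdf(cutoff - delta * sqrt(n) / sigma)


# Hypothetical setup: increased risk delta = 0.5, sigma = 2.0, n = 25.
alphas = [0.01, 0.05, 0.20, 0.50]
betas = [type_ii_error(a, delta=0.5, sigma=2.0, n=25) for a in alphas]

for a, b in zip(alphas, betas):
    print(f"type I error prob. = {a:.2f}  ->  type II error prob. = {b:.3f}")
```

In this sketch, pushing the type II error probability down to roughly 0.1 requires a type I error probability of 0.5, that is, inferring a risk half the time when none is present. This is the kind of consequence the conclusion of the “dirty hands” argument would have the ethical tester accept.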

I claim that at least 3 of the premises, while plausible-sounding, are false.  What do you think?
_____________________________________________________

(1) Cranor (to my knowledge) was among the first to articulate the argument in philosophy, in relation to statistical significance tests (it is echoed by more recent philosophers of evidence based policy):

Scientists should adopt more health protective evidentiary standards, even when they are not consistent with the most demanding inferential standards of the field.  That is, scientists may be forced to choose between the evidentiary ideals of their fields and the moral value of protecting the public from exposure to toxins, frequently they cannot realize both (Cranor 1994, pp. 169-70).

Kristin Shrader-Frechette has advanced analogous arguments in numerous risk research contexts.

(2) I should note that Cranor is aware that properly scrutinizing statistical tests can advance matters here.

Cranor, C. (1994), “Public Health Research and Uncertainty”, in K. Shrader-Frechette, Ethics of Scientific Research, Rowman and Littlefield, pp. 169-186.

Shrader-Frechette, K. (1994), Ethics of Scientific Research, Rowman and Littlefield.

Categories: Objectivity, Statistics | 17 Comments

Objectivity #1. Will the Real Junk Science Please Stand Up?

Have you ever noticed in wranglings over evidence-based policy that it’s always one side that’s politicizing the evidence—the side whose policy one doesn’t like? The evidence on the near side, or your side, however, is solid science. Let’s call those who first coined the term “junk science” Group 1. For Group 1, junk science is bad science that is used to defend pro-regulatory stances, whereas sound science would identify errors in reports of potential risk. For the challengers—let’s call them Group 2—junk science is bad science that is used to defend the anti-regulatory stance, whereas sound science would identify potential risks, advocate precautionary stances, and recognize errors where risk is denied.

Both groups agree that politicizing science is very, very bad—but it’s only the other group that does it!

A given print exposé exploring the distortions of fact on one side or the other routinely showers wild praise on their side’s—their science’s and their policy’s—objectivity, their adherence to the facts, just the facts. How impressed might we be with the text or the group that admitted to its own biases?

Take, say, global warming, genetically modified crops, electric-power lines, medical diagnostic testing. Group 1 alleges that those who point up the risks (actual or potential) have a vested interest in construing the evidence that exists (and the gaps in the evidence) accordingly, which may bias the relevant science and pressure scientists to be politically correct. Group 2 alleges the reverse, pointing to industry biases in the analysis or reanalysis of data and pressures on scientists doing industry-funded work to go along to get along.

When the battle between the two groups is joined, issues of evidence—what counts as bad/good evidence for a given claim—and issues of regulation and policy—what are “acceptable” standards of risk/benefit—may become so entangled that no one recognizes how much of the disagreement stems from divergent assumptions about how models are produced and used, as well as from contrary stands on the foundations of uncertain knowledge and statistical inference. The core disagreement is mistakenly attributed to divergent policy values, at least for the most part.

Over the years I have tried my hand in sorting out these debates (e.g., Mayo and Hollander 1991). My account of testing actually came into being to systematize reasoning from statistically insignificant results in evidence based risk policy: no evidence of risk is not evidence of no risk! (see October 5). Unlike the disputants who get the most attention, I have argued that the current polarization cries out for critical or meta-scientific scrutiny of the uncertainties, assumptions, and risks of error that are part and parcel of the gathering and interpreting of evidence on both sides. Unhappily, the disputants tend not to welcome this position—and are even hostile to it.  This used to shock me when I was starting out—why would those who were trying to promote greater risk accountability not want to avail themselves of ways to hold the agencies and companies responsible when they bury risks in fallacious interpretations of statistically insignificant results?  By now, I am used to it.

This isn’t to say that there’s no honest self-scrutiny going on, but only that all sides are so used to anticipating conspiracies of bias that my position is likely viewed as yet another politically motivated ruse. So what we are left with is scientific evidence having less and less a role in constraining or adjudicating disputes. Even to suggest an evidential adjudication risks being attacked as a paid insider.

I agree with David Michaels (2008, 61) that “the battle for the integrity of science is rooted in issues of methodology,” but winning the battle would demand something that both sides are increasingly unwilling to grant. It comes as no surprise that some of the best scientists stay as far away as possible from such controversial science.

Mayo, D. and Hollander, R. (eds.). 1991. Acceptable Evidence: Science and Values in Risk Management, Oxford.

Mayo, D. 1991. Sociological versus Metascientific Views of Risk Assessment, in D. Mayo and R. Hollander (eds.), Acceptable Evidence: 249-79.

Michaels, D. 2008. Doubt Is Their Product, Oxford.

Categories: Objectivity, Statistics | 3 Comments
