Guest Post: Andrea Saltelli: Analytic flexibility: a badly kept secret? (thoughts on “abandon statistical significance 5 years on”)

Professor Andrea Saltelli
UPF Barcelona School of Management, Barcelona, Spain; Centre for the Study of the Sciences and the Humanities, University of Bergen, Bergen, Norway

[An earlier post by A. Saltelli on this topic: Nov 22, 2019: A. Saltelli (Guest post): What can we learn from the debate on statistical significance?]

Analytic flexibility: a badly kept secret?

In a previous post on this blog I expressed concern about a loss of trust that could befall the activity of scientific quantification – as practiced in several disciplines – unless certain technical and normative elements of the crisis were managed. The piece warned that the phenomenon could lead to “a decline of public trust in the findings of science”. Five years and one pandemic later, we may wonder whether the danger has indeed materialized.

Looking at a broader time horizon, two decades have elapsed since the onset of the reproducibility crisis, from Ioannidis’s categorical 2005 “Why Most Published Research Findings Are False” all the way to the large reproducibility experiments of the present day, with “many researchers using the same data”.

My personal reading is that the last five years have witnessed an acceleration of the crisis, leading to a progressive recalibration of what science can and cannot do in a context of increased polarization. We now inhabit an increasingly sophisticated landscape where the epistemic authority of science is appropriated via fact signalling – “a practice where the stylistic tropes of logical thinking, scientific research, or data analysis is worn like a costume to bolster a sense of moral righteousness and certitude” – by practically all actors, from purported fact-checkers to corporate interests to governments, not forgetting scientists themselves.

Statisticians and econometricians have continued in recent times to shed light on the so-called analytic flexibility of statistical work, which has variously been described as a garden of forking paths or as a universe of uncertainty hiding in plain sight.

Modellers reached a similar conclusion four decades ago, when hydrologists provocatively wrote that with some ‘judicious fiddling’ any conclusion could be reached using a mathematical model. This became material for satire when Douglas Adams, the celebrated author of The Hitchhiker’s Guide to the Galaxy, created as a character in one of his novels a model developer who could link any desired conclusion to the available observations via a “plausible series of logical-sounding steps”.

A few practitioners have realized that, as an alternative or a complement to tens of different teams analysing the same data (seventy-three in Breznau et al.), one may try to anticipate what would result from such an experiment by propagating the uncertainty or ambiguity implicit in the garden of forking paths by simulation, i.e. by running the analysis iteratively with different combinations of assumptions about data and models. Some authors have given this, with some imagination, the catchy name of multiverse analysis; a minimal sketch of the idea follows below.
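
To make the idea concrete, here is a small Python sketch of a multiverse analysis. It is not taken from any of the studies cited: the data are simulated, and the three analytic “forks” (outlier handling, covariate adjustment, outcome transformation) are illustrative assumptions.

```python
# Minimal multiverse-analysis sketch: run the "same" analysis under every
# combination of plausible analytic choices and inspect the spread of results.
# Data and forks are toy assumptions, not taken from any cited study.
import itertools
import numpy as np

rng = np.random.default_rng(0)
n = 200
x = rng.normal(size=n)                      # exposure of interest
z = rng.normal(size=n)                      # optional covariate
y = 0.3 * x + 0.5 * z + rng.normal(size=n)  # outcome

# Three forks in the garden of forking paths.
outlier_cut = [None, 2.5, 3.0]        # drop outcomes beyond this many SDs, or keep all
include_z   = [True, False]           # adjust for the covariate or not
transform   = [lambda v: v, np.tanh]  # analyse the outcome raw or transformed

estimates = []
for cut, with_z, f in itertools.product(outlier_cut, include_z, transform):
    yy = f(y)
    keep = np.ones(n, bool) if cut is None else np.abs(yy - yy.mean()) < cut * yy.std()
    cols = [np.ones(keep.sum()), x[keep]] + ([z[keep]] if with_z else [])
    beta, *_ = np.linalg.lstsq(np.column_stack(cols), yy[keep], rcond=None)
    estimates.append(beta[1])         # coefficient on x in this "universe"

print(f"{len(estimates)} universes; estimated effect of x ranges "
      f"from {min(estimates):.3f} to {max(estimates):.3f}")
```

Twelve universes result from these three forks alone; real multiverses, with dozens of forks, grow combinatorially, which is precisely why the spread of conclusions can be so wide.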

With my collaborators, we call this by the admittedly less catchy name of ‘modelling of the modelling process’, or simply global sensitivity analysis. These approaches compare favourably with the alternative, I daresay; they rely on decades of experience in global sensitivity analysis, augmented by sensitivity auditing, an approach that aims to explore the entire model-generating process, inclusive of the motivations and biases of the developers, the expectations and purposes of the recipients of the analysis, and so on. Sensitivity auditing is recommended in guidelines of European institutions and academia, and examples of sensitivity auditing and of modelling of the modelling process can be found in a recent book on the politics of modelling published by Oxford University Press. On a personal note, sensitivity auditing is inspired by a philosophical orientation known as post-normal science, which has helped me in my own research. A small numerical illustration of global sensitivity analysis is given below.
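
For readers who want a numerical handle on global sensitivity analysis, here is a minimal sketch of variance-based (Sobol’) indices estimated with a pick-freeze scheme. The test function and sample size are illustrative assumptions, not drawn from the post; the estimators are the standard Saltelli (first-order) and Jansen (total-order) forms.

```python
# Sketch of variance-based global sensitivity analysis: first-order (S1) and
# total-order (ST) Sobol' indices via pick-freeze estimators, in pure NumPy.
import numpy as np

def ishigami(X, a=7.0, b=0.1):
    """Standard three-input test function from the sensitivity literature."""
    return (np.sin(X[:, 0]) + a * np.sin(X[:, 1]) ** 2
            + b * X[:, 2] ** 4 * np.sin(X[:, 0]))

rng = np.random.default_rng(1)
N, k = 100_000, 3
A = rng.uniform(-np.pi, np.pi, size=(N, k))  # two independent input samples
B = rng.uniform(-np.pi, np.pi, size=(N, k))
fA, fB = ishigami(A), ishigami(B)
var = np.var(np.concatenate([fA, fB]))       # total output variance

for i in range(k):
    ABi = A.copy()
    ABi[:, i] = B[:, i]                      # vary only the i-th input
    fABi = ishigami(ABi)
    S1 = np.mean(fB * (fABi - fA)) / var        # first-order (Saltelli) estimator
    ST = 0.5 * np.mean((fA - fABi) ** 2) / var  # total-order (Jansen) estimator
    print(f"x{i+1}: S1 = {S1:.2f}, ST = {ST:.2f}")
```

For this function the indices are known analytically (x1 and x2 dominate, while x3 acts only through interactions, so its S1 is near zero but its ST is not), which makes it a convenient check; in the spirit of ‘modelling of the modelling process’, the same machinery can treat analytic choices, and not just physical inputs, as the factors being varied.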

An important element of these approaches is that they do not eschew engaging with the sociology of quantification. As noted in my previous post, our crises are also epistemological and philosophical, and we need disciplines, more than ever, to work together, abandoning the tragic divide among the different families of science. The reproducibility crisis has important political and legal ramifications that feed into our discussions about the facticity of facts and about how science is mobilized to support policy decisions. A comparison among different communities on the merits of the different ways of looking at our quantifications would be enlightening. Statisticians now recognize that “it is easy to miss or downplay philosophical presuppositions, especially if one has a strong interest in endorsing the policy upshot” (Mayo 2021), and that quantifications, including statistical ones, need to be looked at with a double lens, technical as well as normative, as recommended by sociologists of quantification.

If we go back to the question posed above, about what is happening with public trust in science five years after the high point of the statistical wars on retiring significance, we must say that the rapid clock of policy has consigned us to a world where not just some, but possibly any, issue involving society, technology and policy has become intensely and increasingly polarized. As a result, even the interpretation of where we now stand with public trust in science is a matter of conflict – the successes and failures in addressing the pandemic being a case in point. When it comes to the successes and failures of science, for example, scientific journals whose business model depends on ‘science solving problems’ may be led to reject works that point in the opposite direction.

My vision, which cannot be neutral given all that we have discussed thus far, is that a new covenant between science and society is needed, an enterprise where statisticians can offer a special contribution due to the history of their discipline being rich in contributions from sociology and philosophy – more so, I daresay, than other communities producing numbers and algorithms, but this too is up for debate.


10 thoughts on “Guest Post: Andrea Saltelli: Analytic flexibility: a badly kept secret? (thoughts on “abandon statistical significance 5 years on”)”

  1. I’m very grateful to Andrea Saltelli for his thought-provoking and illuminating overview of the current status of disagreement and controversy in quantification and modeling. He observes: “what is happening with public trust in science five years after the high point of the statistical wars on retiring significance, we must say that the rapid clock of policy has consigned us to a world where not just some, but possibly any, issue involving society, technology and policy has become intensely and increasingly polarized” (Saltelli 2024). A question that I hope we can make progress on, with the help of these 5-year reflections, is: what should be done? He concludes “that a new covenant between science and society is needed, an enterprise where statisticians can offer a special contribution due to the history of their discipline being rich in contributions from sociology and philosophy”. Of course, he recognizes that this too is open for debate. One no longer sees very much of the rich history in statistics wherein statisticians, scientists, philosophers, historians of science and others participated in interdisciplinary forums on the foundations of statistical inference and modeling. It seems to me that the fascinating underlying disagreements about the nature of learning in the face of error, and the rival conceptions of the very aims and goals of science, are downplayed. Moreover, data science, AI and machine learning don’t have that kind of philosophical history: prediction rather than understanding is key, and statistical science is keen to keep up with them.

    I just read an article in the June 7 issue of Science recommending that we should be “transparently documenting unresolved disagreements” and normalizing rival positions in all fields. I agree. When the 2019 editorial by Wasserstein et al. in The American Statistician gave the impression that the call to “abandon statistical significance” enjoyed the same (limited) degree of consensus as did the 2016 ASA (policy) Statement on P-Values and Statistical Significance, the ASA itself wound up appointing a task force on Statistical Significance and Replicability in 2019 that was put in the odd position of trying to correct a misperception about an editorial written by its own Executive Director. (One can search this blog for details.) Few are aware of the Task Force report, and the entire episode seems to have been papered over. But disagreement about which statistical methods to use, and how to interpret them, remains.

    If papering over differences to give the appearance of consensus goes against the scientific goal of “organized skepticism” (as in the Science article), then shouldn’t it be considered another kind of QRP (questionable research practice)? We now know that the scientists who wrote the 2020 Nature Medicine article “The proximal origin of SARS-CoV-2” feigned consensus on the alleged overwhelming likelihood of a natural origin for Covid, in contrast to the views they privately shared in emails. (Of course, they could have changed their minds.) I don’t know if any of the authors have updated their views.

    I look forward to hearing readers’ comments and to learning more about Saltelli’s vision for a new covenant.

  2. Thanks for this very good post with some very interesting references. Is there any way to read the link on “fact signalling” without a paywall or the requirement to sign up?

  3. I appreciate Saltelli’s calling out “fact signalling” and reminding us that there are things science can and cannot do. Friends and colleagues often assume that if we could just agree on the facts, then the policy would be clear. But values are not facts, and the “live free or die” state might well opt to risk death when some of us would rather play it (what seems to us) safer.

  4. I want to learn more about this, and I recognize the problem. But I am skeptical of PNS.

    As an example, here is a technocratic view that I think is largely correct. “In order to reduce GHG and to undo patterns of urban segregation, we need to push people towards using more public transit and we need to increase housing density. In other words, the social norm of living in the suburbs and driving to work in your SUV needs to change. We will attempt to make this change through tax and transportation policy and through public education. In particular, we will produce a report which says ‘spending X on a new light rail system will have benefits Y’.”

    How does PNS address this? The fact is, a large proportion of suburban homeowners prefer the existing system. I don’t know if it’s a majority. But they want to drive to work instead of taking public transportation. They want a house with a big lawn. I think this would be true even if they recognized that higher density would be better for society overall. This conversation has been going on for a while. What does “listening to all stakeholders” mean in this context? Are we actually going to forgo these steps because some people don’t like them?

    What I’m afraid of is that we will have fake conversations. It is more honest, in my opinion, to say “we experts have decided on plan X, so we are going to do it, and your uninformed opinion is not appreciated”, instead of “listening” to someone complain and then going ahead with your plan anyway. The latter option just makes people mad, and it kind of makes a joke of democracy. On the other hand, if we let experts just make some decisions, at least we know who to hold accountable if it doesn’t work.

    So I will definitely approach PNS with an open mind, but I wanted to share my initial reservations.

    • hwyneken
      Thanks for your comment on Saltelli. Is PNS viewed as taking all stakeholders into account in policymaking – or as claiming to? I was going to ask Saltelli how he understood PNS, beyond the conditions he gives where it might be called for. What’s your view? If “normal science” is understood as Kuhnian normal science, then my own view is that science is essentially never normal.

  5. Thanks to Andrea Saltelli for another interesting post, with which I largely agree (I re-read the one of 2019 and had a similar impression back then)! Once more this resonates with a number of ideas that are very central for me. I’m reminded particularly of our proposed “scientific virtues” of “Awareness of Multiple Perspectives” and “Stability” in “Beyond subjective and objective in statistics” with Andrew Gelman: https://rss.onlinelibrary.wiley.com/doi/abs/10.1111/rssa.12276. Regarding statistical tests, they are routinely and correctly criticised for being overinterpreted by deriving headline-grabbing claims from a single p-value. This criticism would be justified just as well if p-values were replaced by any other single number. Anything in science that qualifies as trustworthy is based on several analyses by several groups of people from various points of view.

  6. Andrea Saltelli

    Thanks for these comments. I agree that “the new covenant” (see here and here for the concept) invites skepticism, and Deborah Mayo is right to point to the latest incredible story of the obstruction of the investigation into the origin of COVID as proof that we do not seem to be navigating toward any covenant. I wrote about the story myself, and Roger Pielke Jr. has done good work on it in his blog The Honest Broker. It is thus likely that things will get worse (but how much?) before they get better. As discussed, I do not believe that post-normal science as such offers all-embracing solutions – it rather suggests an epistemological stance, an invitation to circumspection in receiving the bounty of science and technology, while still caring about science itself. I find this a healthy perspective. An extended peer community is a desirable approach, but there is no guarantee of it working out smoothly or of it being conducive to the greater good, especially in situations of power asymmetry. Philip Mirowski has articulated a critique of the concept that is worth reading. Working with several colleagues, I try to debunk sloppy quantifications, especially in the field of mathematical modelling, exposing where possible the political nature of purportedly neutral numbers (book, article, a recent presentation). We likely have an extended peer community in mind as an audience when we do this, and the recent discovery of ‘analytic flexibility’ discussed in my piece offers some hope. Or I could be a pessimist and say that the idea of making digital twins of the planet in silico, so as to have “the future at our fingertips”, invites despair … I leave the reader to decide. As you see, I have doubts. Thanks to this blog for hosting them.

    • Andrea:

      Thanks for your reply to the comments. I wrote a blogpost on March 1, 2021 on “falsifying trust” regarding a series of mysteries that led me to realize that (at the very least) something extremely suspicious was going on with respect to the events surrounding Covid origins:

      https://errorstatistics.com/2021/03/01/falsifying-claims-of-trust-in-bat-coronavirus-research-mysteries-of-the-mine/

      Numerous sleuths were onto the issues even as the cover-ups were taking place*. Alina Chan’s article was finally published in the NYT last week. Many of these investigators were linked on Twitter, and they are the ones who get credit for revealing facts (yes, they still exist) such as the deliberate renaming, at the start of the pandemic, of the closest known virus to SARS-CoV-2 (to hide origins). Anyway, it’s a fascinating story, going back to someone’s MA thesis.

      *From one of the notes to my post: there was “an anonymous Twitter user known as ‘The Seeker’ and a group going by the name of DRASTIC” (Ridley and Chan (2021)). 
