I’ve been reading about the artificial intelligence/machine learning (AI/ML) wars revolving around the use of so-called “black-box” algorithms–too complex for humans, even their inventors, to understand. Such algorithms are increasingly used to make decisions that affect you, but if you can’t understand, or aren’t told, why a machine predicted your graduate-school readiness, or which drug a doctor should prescribe for you, etc, you’d likely be dissatisfied and want some kind of explanation. Being told the machine is highly accurate (in some predictive sense) wouldn’t suffice. A new AI field has grown up around the goal of developing (secondary) “white box” models to “explain” the workings of the (primary) black box model. Some call this explainable AI, or XAI. The black box is still used to reach predictions or decisions, but the explainable model is supposed to help explain why the output was reached. (The EU and DARPA in the U.S. have instituted broad requirements and programs for XAI.)
Surprisingly, at least to an outsider like me, there is enormous pushback against the movement to adopt or require explainable AI. “Beware Explanations From AI” declares one critic; “Stop explaining black boxes!” warns another. As is often the case in statistics wars, opponents of explainable AI disagree with each other, and the criticisms revolve around fundamental disagreements as to the nature and roles of statistical inference and modeling.
This is the first time I’m writing on this, so I’ll be grateful to hear from readers about mistakes.
Breiman: The parametric vs algorithmic models battle: I remember the early gauntlet thrown down by Breiman (2001), challenging statisticians to move away from their tendency to seek probabilistic data models to capture a hypothesized data generating mechanism underlying data (“Statistical Modeling: The Two Cultures”). Breiman describes “algorithmic modeling” this way:
“The approach is that nature produces data in a black box whose insides are complex, mysterious, and, at least, partly unknowable. What is observed is a set of x’s that go in and a subsequent set of y’s that come out. The problem is to find an algorithm f(x) such that for future x in a test set, f(x) will be a good predictor of y.
The theory in this field shifts focus from data models to the properties of algorithms. It characterizes their “strength” as predictors, convergence if they are iterative, and what gives them good predictive accuracy. The one assumption made in the theory is that the data is drawn iid from an unknown multivariate distribution.” (Breiman 2001, 205)
Breiman’s wars are sometimes put as a difference in aims–predictive accuracy vs understanding mechanisms. If the goal is prediction or classification, limited to cases similar to the data, then predictive accuracy might suffice, and clearly there have been impressive successes (e.g., speech, handwriting, facial recognition). But even in fields where the algorithmic vs data modeling war has largely been won by the former, we see a return of the demand to understand, if indeed that goal was ever abandoned.
The new field of “explainable AI” (XAI). Put to one side for now that philosophers, despite volumes devoted to the topic, have never come up with an adequate account of “explanation”. We can explain specifically what goes on (and what seems wanted) here without a general account. A major problem XAI critics have is that explaining black box ML models does not reveal the elements of the primary black box model, nor even the data used to build it. By means of interactions with the primary black box model, a post hoc, supposedly humanly understandable, explanation can arise. Actual decisions are still made using the black box model, generally regarded as more reliable than the explainable model. The latter is only to help various stakeholders understand, question and ideally trust the black box while mostly replicating its predictive behavior.
Use RCTs for high risk cases. What first perked up my ears was Babic et al.’s 2021 article in Science blaring out: “Beware explanations from AI in health care”:
“Explainable AI/ML (unlike interpretable AI/ML) offers post hoc algorithmically generated rationales of black-box predictions, which are not necessarily the actual reasons behind those predictions or related causally to them. Accordingly, the apparent advantage of explainability is a “fool’s gold” because post hoc rationalizations of a black box are unlikely to contribute to our understanding of its inner workings.” (Babic et al. 2021)
“Interpretable AI/ML” avoids being “fool’s gold,” they claim, and gives the actual reasons behind the predictions. The terms here are not always used in the same way, but the idea is that an “interpretable” model is where the original algorithmic model is already an “understandable” white box, not needing the work of XAI. Although examples are scarce, it’s usually a linear regression model with clear inputs, e.g., education, grades, GRE scores, and an output, like “graduate school ready” or not. By contrast, XAI doesn’t use the original (or what I’m calling “primary”) function that actually generated the prediction, but a white box approximation that mimics it to some degree.
The white box approximation might merely tell you which factors or features seemed to weigh most heavily in the output, or what changes in input would have changed the output. It might, for a hypothetical example, reveal that an AI model that deemed a college student “non-ready” for graduate school would have deemed her ready if only she had gotten a specified score on a standardized test like the GRE.
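To make the “what change in input would have changed the output” idea concrete, here is a minimal sketch of a counterfactual query. Everything in it is an assumption of mine for illustration: the function `black_box_ready`, its weights and cutoff, and the helper `minimal_gre_to_flip` are hypothetical and are not drawn from any real admissions model or XAI library. The point is only that the explainer calls the primary model repeatedly without looking inside it.

```python
def black_box_ready(gpa: float, gre: int, research_years: float) -> bool:
    """Hypothetical stand-in for an opaque model: we may call it, not inspect it."""
    score = 0.9 * gpa + 0.012 * gre + 0.4 * research_years
    if gpa < 3.0:  # a nonlinear kink the explainer knows nothing about
        score -= 0.5
    return score >= 7.2


def minimal_gre_to_flip(gpa, gre, research_years, gre_max=340):
    """Smallest GRE score above the current one at which the output becomes 'ready'.

    Returns None if no score up to gre_max changes the decision.
    """
    for candidate in range(gre + 1, gre_max + 1):
        if black_box_ready(gpa, candidate, research_years):
            return candidate
    return None


# A student deemed "non-ready", and the counterfactual offered as an explanation.
print(black_box_ready(3.4, 300, 1.0))      # False
print(minimal_gre_to_flip(3.4, 300, 1.0))  # 312
```

Notice that the answer (“you would have been deemed ready with a 312”) says nothing about why the model weighs the GRE as it does, nor whether it ought to. That is the sense in which critics call such explanations post hoc.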
Babic et al (2021) think that rather than explain the black box, it should just be used as is, at least when it is considered sufficiently reliable–and the stakes aren’t that high. (Examples are not given.) But in high stakes medical contexts, they aver, we should not try to explain black boxes—in the formal XAI sense—we should instead look to well-designed clinical trials on safety and effectiveness of practices and treatments.
“If explainability should not be a strict requirement for AI/ML in health care, what then? Regulators like the FDA should focus on those aspects of the AI/ML system that directly bear on its safety and effectiveness—in particular, how does it perform in the hands of its intended users? To accomplish this, regulators should place more emphasis on well-designed clinical trials, at least for some higher-risk devices, and less on whether the AI/ML system can be explained”. (Babic et al. 2021)
This, they claim, will help avoid the frequent medical reversals where treatments thought to be beneficial either don’t work or turn out to be harmful. Well-designed RCTs can get beyond the limitations of inferring from observational data, which is what AI is limited to. This takes us to design-based or model-based error statistical methods and models.
A similar call is made in a recent Lancet article:
“Instead of requiring local explanations from a complicated AI system, we should advocate for thorough and rigorous validation of these systems across as many diverse and distinct populations as possible,…
…Despite competing explanations for how acetaminophen works, we know that it is a safe and effective pain medication because it has been extensively validated in numerous randomised controlled trials (RCTs). RCTs have historically been the gold-standard way to evaluate medical interventions, and it should be no different for AI systems.” (Ghassemi et al. 2021)
With RCTs we can run statistical significance tests and compute error probabilities associated with estimates and inferences. The dangers of biasing selection effects and confounding are blocked or controlled.
Nevertheless, the choices shouldn’t be either algorithmic models or clinical trials–which are highly restricted in use—but should include the use of testable parametric statistical models, or possibly combinations of AI models with subsequent testing. AI algorithms from observational data might serve to discover brand new risk factors, to be followed up with studies to test these hypotheses.
Use intrinsically interpretable AI models for high risk cases. An earlier critic of XAI for high stakes decisions, Cynthia Rudin, has a different view. Rudin tells us to: “Stop Explaining Black Box Machine Learning Models for High Stakes Decisions and Use Interpretable Models Instead” (2018). Recall, an “interpretable” model is where the original algorithmic model is already an understandable white box, not needing the work of XAI.
“The lack of transparency and accountability of predictive models can have (and has already had) severe consequences; there have been cases of people incorrectly denied parole, poor bail decisions leading to the release of dangerous criminals, ML-based pollution models stating that highly polluted air was safe to breathe, and generally poor use of limited valuable resources in criminal justice, medicine, energy reliability, finance, and in other domains” (Rudin 2018)
These appear to be examples where black-box algorithms have gone wrong, not necessarily where the second-order attempt to elucidate those models leads to problems. Right? Moreover, the black-box nature in many of these cases, I take it, was not due to complexity but proprietary models. So how does her recommendation to use intrinsically interpretable AI help? Perhaps black box models should never be used in high stakes decisions, but what if that’s not possible?
It’s also not clear to me how intrinsically interpretable models necessarily engender trusting or testing black boxes. Some readers will remember the big Anil Potti controversy (discussed on this blog) some years ago, and the big resulting guidebook as to how to avoid dangers of high throughput predictive models. The example concerned predicting which chemotherapy to use on breast cancer patients at Duke University and, shockingly, it was already being applied before being validated. I don’t think the Potti model would be classified as black box, but it had failed utterly to be well-validated. No legitimate type 1 or 2 errors could be vouchsafed. (They conveniently left out data points that didn’t fit their prediction model, along with a series of howlers. Search this blog if you’re interested.) I would have thought the validation requirements in that great big guidebook would be routine, after horror stories like the Potti case.
To be clear: I do not take any side in this battle, at least not yet. I would concur with the call by Babic et al. (2021) and Ghassemi et al. (2021) for well-designed clinical trials where feasible. But it’s not obvious that intrinsically interpretable AI models would afford an error statistical validation.
Can we severely test AI models? My econometrics colleague, Aris Spanos, in a tour de force, detailed, comparative account of different approaches to modeling, argues that algorithmic modeling amounts to an elaborate curve-fitting project which assumes but does not test its key assumption of IID data. In algorithmic modeling, Spanos remarks, “likelihood-based inference procedures are replaced by loss-function based procedures driven by mathematical approximation theory and goodness of fit measures” (Spanos 2021, 25). So it’s not surprising that they don’t quantify error rates. Spanos’ work might be said to be on Breiman’s data modeling side.
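As a rough illustration of what it would mean to test, rather than assume, part of the IID assumption, here is a sketch of the classical Wald-Wolfowitz runs test applied to data in their recorded order (it could equally be applied to a model’s residuals). This is my own minimal example, not the misspecification testing Spanos carries out; the function name `runs_test` is hypothetical, and the p-value relies on the usual normal approximation, which is only reasonable for moderately large samples.

```python
import math
import statistics


def runs_test(values):
    """Wald-Wolfowitz runs test about the median (normal approximation).

    Returns (number_of_runs, z_statistic, two_sided_p_value).
    Too few runs suggests trends or positive dependence; too many
    suggests systematic alternation.
    """
    med = statistics.median(values)
    signs = [v > med for v in values if v != med]  # ties with the median are dropped
    n1 = sum(signs)            # observations above the median
    n2 = len(signs) - n1       # observations below the median
    n = n1 + n2
    if min(n1, n2) == 0 or n < 3:
        raise ValueError("need observations on both sides of the median")
    runs = 1 + sum(a != b for a, b in zip(signs, signs[1:]))
    mean = 2 * n1 * n2 / n + 1
    var = 2 * n1 * n2 * (2 * n1 * n2 - n) / (n ** 2 * (n - 1))
    z = (runs - mean) / math.sqrt(var)
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided tail area of the standard normal
    return runs, z, p


# A steadily trending series: only 2 runs where about 11 are expected under IID.
runs, z, p = runs_test(list(range(20)))
print(runs, round(z, 2), p < 0.001)
```

A small p-value here indicates that the ordering of the data is not what independence would lead us to expect, so a predictive accuracy figure computed as if the data were IID could be misleading. A large p-value is not, of course, evidence that all the model assumptions hold.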
David Watson, while he takes the XAI side of the divide, recognizes these shortcomings:
“[XAI] methods do not even bother to quantify expected error rates. This makes it impossible to subject algorithmic explanations to severe tests, as is required of any scientific hypothesis”. (Watson 2020)
Nevertheless, Watson holds out hope for remedying this. Or again in the Lancet article:
“[XAI] explanations have no performance guarantees. Indeed, the performance of explanations is rarely tested at all, and most tests that are done rely on heuristic measures rather than explicitly scoring the explanation from a human perspective.” (Ghassemi et al. 2021)
Presumably performance here is how good a job the XAI model does at mimicking the primary black box model. But even that won’t suffice either to trust the XAI model or to fix a faulty black box model.
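The gap between mimicking the black box and being right can be seen in a toy calculation. The data, both models, and the helper `agreement` below are hypothetical, made up only to separate two quantities: fidelity (how often the explanation agrees with the black box) and accuracy (how often the black box agrees with what actually happened).

```python
def agreement(a, b):
    """Fraction of positions at which two equal-length label lists agree."""
    if len(a) != len(b) or not a:
        raise ValueError("need two non-empty lists of equal length")
    return sum(x == y for x, y in zip(a, b)) / len(a)


# Hypothetical toy data: test scores and whether each applicant truly was ready.
scores = [150, 155, 160, 165, 170, 175, 180, 185]
truly_ready = [False, True, False, True, True, False, True, True]


def black_box(score):   # stand-in for the opaque primary model
    return score >= 172


def surrogate(score):   # the "white box" explanation: one readable threshold
    return score >= 170


bb_preds = [black_box(s) for s in scores]
xai_preds = [surrogate(s) for s in scores]

fidelity = agreement(xai_preds, bb_preds)     # explanation vs. black box
accuracy = agreement(bb_preds, truly_ready)   # black box vs. the truth
print(fidelity, accuracy)  # 0.875 0.5
```

In this made-up case the explanation tracks the black box on 7 of 8 applicants, yet the black box itself is right only half the time. A high-fidelity explanation of a poor model is still an explanation of a poor model.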
Rudin, who thinks the way to solve the problem is to use only intrinsically interpretable AI models, proposes:
“Let us consider a possible mandate that, for certain high-stakes decisions, no black box should be deployed when there exists an interpretable model with the same level of performance.”
She appeals to the supposed simplicity of nature to argue that reliable interpretable models are generally available. But in any event, if they are available, they should be favored, says Rudin.
“If such a mandate were deployed, organizations that produce and sell black box models could then be held accountable if an equally accurate transparent model exists. It could be considered a form of false advertising …”
This is an interesting proposal. How accuracy and reliability are to be shown needs to be spelled out. Why not also test the AI model as compared to a non-AI model, as with the well-designed clinical trials some advocate, or with validated parametric models? Has statistics moved too far to the algorithmic modeling side?
I’ve said little about the forms that XAI can take, and might come back to this another time. However, since there is ample latitude to what might be included, there’s no reason to preclude tools for testing primary AI models and contesting predictions or decisions based on them. Rather than seek a reliable mimic of the primary model, it might be better to seek techniques that enable severe probing of the black and white boxes: testing the assumptions of the algorithm and contesting decisions based on them. We might want to call this testable or probative AI or some such thing.
XAI models are used to troubleshoot and audit primary black box models, and this would seem relevant to individuals wishing to contest AI-driven decisions as well. For example, some self-critical XAI techniques can show that an XAI model had no chance of unearthing that the primary model was biased or unfair in some way. Thus any purported claim that it is unbiased fails to pass with severity. (Perhaps students from Fewready need higher test scores than those from Manyready to be deemed graduate school-ready, say.)
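Here is a sketch of the kind of probe that could let someone contest a decision in the Fewready/Manyready example. The model `black_box_ready`, its hidden cutoffs, the score range, and the helper `lowest_passing_score` are all hypothetical; I assume only that the auditor can query the black box with chosen inputs.

```python
def black_box_ready(school: str, score: int) -> bool:
    """Hypothetical opaque model with a hidden dependence on the applicant's school."""
    cutoff = 165 if school == "Fewready" else 158
    return score >= cutoff


def lowest_passing_score(school, scores=range(130, 171)):
    """Lowest score in the grid that the black box deems 'ready' for this school.

    Returns None if no score in the grid is deemed ready.
    """
    return next((s for s in scores if black_box_ready(school, s)), None)


for school in ("Fewready", "Manyready"):
    print(school, lowest_passing_score(school))
# Fewready 165
# Manyready 158
```

An audit that never varied the school while holding the score fixed would have had no chance of detecting this disparity, so its silence would not warrant a claim of unbiasedness.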
So it might be that what’s wanted is not an XAI model that passes with severity but one that lets us critically appraise, improve, and contest black box and XAI models. I would not rule out XAI as serving this role, at least in a qualitative manner or by reintroducing parametric (probabilistic) reasoning at the XAI level.
All this is by way of sticking my neck out—I’m too much of an outsider to the AI/ML wars to really weigh in on the battles. Your constructive remarks and insights in the comments are welcome.
In writing this post, I consulted with Aris Spanos, who has written, critically, on algorithmic modeling, and David Watson, who works in the field of XAI and has often discussed conceptual and philosophical problems. I acknowledge their assistance in helping me grasp what’s going on here, but I do not elaborate on their views.
 A word often seen is “intrinsic”. If the original or primary algorithmic model is intrinsically interpretable, it doesn’t require another algorithm to explain how it works.
 While not a black box in the sense used in AI, it might well have been. It took years to sort out the coding errors, and you might say the Bayesian priors on which the model rested are black boxy. The “metagenes” generated from signatures are also problematic. But I don’t think they use the term that way.
 The algorithmic modelers say things like, do you want the machine or the doctor to operate, if the former has 90% success, and the latter 80% success? It depends, for one thing, on where that “success rate” comes from. Maybe the patients with the most difficult conditions go to the human. Confronted with an anomalous case, the human can throw out the formula and avert disaster. These days we do well combining the two.
 For readers unfamiliar with this hypothetical example, I discuss a similar one in a number of articles. A quick sum-up is on 368-9, Excursion 5 Tour II (proofs) of my book Statistical Inference as Severe Testing: How to Get Beyond the Statistics Wars (2018, CUP). The context is not AI but the “diagnostic screening model” of statistical tests.
Babic et al. (2021), Beware Explanations From AI in Health Care.
Breiman, L. (2001), Statistical Modeling: The Two Cultures.
Ghassemi, M. et al. (2021), The False Hope of Current Approaches to Explainable Artificial Intelligence in Health Care.
Rudin, C. (2018), Stop Explaining Black Box Machine Learning Models for High Stakes Decisions and Use Interpretable Models Instead.
Watson, D. (2020), Conceptual Challenges for Interpretable Machine-Learning.