**The Theory of Search Is the Economics of Discovery:**

Some Thoughts Prompted by Sir David Hendry’s Essay *

in *Rationality, Markets and Morals* (RMM) Special Topic:

Statistical Science and Philosophy of Science

**Part 1 (of 2)**

*Professor Clark Glymour*

Alumni University Professor

*Department of Philosophy*[i]

*Carnegie Mellon University*

Professor Hendry* endorses a distinction between the “context of discovery” and the “context of evaluation,” which he attributes to Herschel and to Popper, and could as well have attributed to Reichenbach and to most contemporary methodological commentators in the social sciences. The “context” distinction codes two theses.

1. “Discovery” is a mysterious psychological process of generating hypotheses; “evaluation” is about the less mysterious process of warranting them.

2. Of the three possible relations with data that could conceivably warrant a hypothesis—how it was generated, its explanatory connections with the data used to generate it, and its predictions—only the last counts.

Einstein maintained the first but not the second. Popper maintained the first, but held that nothing warrants a hypothesis. Hendry seems to maintain neither: he has a *method* for discovery in econometrics, a search procedure briefly summarized in the second part of his essay, which is not evaluated by forecasts. Methods may be esoteric, but they are not mysterious. And yet Hendry endorses the distinction. Let’s consider it.

As a general principle rather than a series of anecdotes, the distinction between discovery and justification or evaluation has never been clear, and what has been said in favor of its implied theses has never made much sense. Let’s start with the father of one of Hendry’s authorities, William Herschel. William Herschel discovered Uranus, or something. Actually, the discovery of the planet Uranus was a collective effort which, subject to vicissitudes of error and individual opinion, followed a rational search strategy. On March 13, 1781, in the course of a sky survey for double stars, Herschel recorded in his journal the observation of a “nebulous star or perhaps a comet.” The object came to his notice because of how it appeared through the telescope, perhaps the appearance of a disc. Herschel changed the magnification of his telescope, and finding that the brightness of the object changed more than the brightness of fixed stars, concluded he had seen a comet or “nebulous star.” Observations on later nights showing that it had moved eliminated the “nebulous star” alternative, and Herschel concluded that he had seen a comet. Why not a planet? Because lots of comets had hitherto been observed—Edmund Halley computed orbits for half a dozen, including his eponymous comet—but no new planet ever had. A comet was much the more likely on frequency grounds. Further, Herschel had made a large error in his estimate of the distance of the body, based on parallax values from his micrometer. A planet could not be so close.

Herschel communicated his observations to the British observatories at Oxford and Greenwich, which took up the observations, as, soon, did astronomers on the continent. Maskelyne quickly concluded the object was a planet, not a comet, but multiple attempts were made on the continent to fit a parabolic orbit. Every further observation conflicted with whichever parabolic hypothesis had been fitted to the previous data. By early summer of 1781 Lexell had computed a circular orbit and a (very accurate) distance using the extreme observations, including Herschel’s original one, but the accumulating data showed an eccentricity. The elements of the elliptic orbit were given by Laplace early in 1783. In all, it took nearly two years for Herschel’s object to be certified a planet.

There was a logic to the process, but there is no natural place to chop it into context of discovery and context of justification, and no value in chopping it anywhere. Herschel had a criterion for noting an anomalous object—the appearance of a disc. An anomalous object could be one of three kinds: a comet, a planet, an anomalous star. Herschel had a quick test—changing magnification—that eliminated the stellar option. Based on the history of astronomical observations a comet was far more likely than a new planet, and that option was investigated first, by attempts to compute a parabolic orbit that would predict subsequent observations. When that failed, and the body failed to show distinctive signs of a comet—a tail for example—the planet hypothesis was resorted to, first by computing the simplest orbit, a circle, and when that failed, the elliptical orbit.

The example is a near-paradigm of how to recognize and determine salient properties of a rare object. First, a cheap criterion for recognizing a candidate; then an ordering, based on prior data, of the alternative hypotheses; then applying the test for the likeliest candidate—a parabolic orbit for a comet—and, on further observations, rejecting it. Then applying the test for the remaining candidate in its simplest form, rejecting it, applying it in a more complex form, and succeeding. There is a collective decision flow chart, which I leave to the reader.

What is striking is the economy of the procedure. Cheap tests (noting the visual features of the object, changing the magnification) were applied first, tests requiring more data and more calculation (computing orbits) only applied when the cheap tests were passed. The alternative explanations were examined in the order of their prior probability which was also in some respects the order of their data requirements (elliptic orbits required more data than parabolic orbits), so that if the most probable hypothesis succeeded—which it did not–the effort of testing the less probable would be avoided. A cheap (in data requirements and calculational effort) test (circular orbit) of the less probable hypothesis was applied before the more demanding, and ultimately successful test.
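The cost-ordered strategy just described can be sketched abstractly. What follows is a hypothetical illustration of the economics, not anything in the historical record: candidates are tried in order of prior probability and cheapness, and the search stops at the first hypothesis that survives its test, so the dearer tests are paid for only when the cheaper, likelier options fail.

```python
def sequential_search(candidates):
    """candidates: list of (name, cost, test_fn), ordered so a priori
    likelier and cheaper candidates come first. Returns the first
    accepted candidate's name and the total cost spent getting there."""
    total_cost = 0
    for name, cost, test in candidates:
        total_cost += cost          # pay for this test
        if test():                  # stop at the first survivor
            return name, total_cost
    return None, total_cost

# Toy reconstruction of the Uranus episode: the comet hypothesis is more
# probable a priori, so its parabolic-orbit test is tried first and fails;
# the planet hypothesis is then tried with the cheap circular orbit before
# the costly, ultimately successful ellipse. Costs are invented numbers.
result, cost = sequential_search([
    ("comet (parabolic orbit)",   3, lambda: False),
    ("planet (circular orbit)",   2, lambda: False),
    ("planet (elliptical orbit)", 5, lambda: True),
])
```

Had the parabolic orbit fit, the search would have stopped at cost 3; as it is, the full sequence costs 10, which is still cheaper in expectation than leading with the most demanding test.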

There is a lesson in the example. Testing is a cog in the machine of search, a part of the process of discovery, not a separate thing, “justification.” And a larger lesson: there are strategies of search, better and worse in their conditions of success and better and worse in the costs they risk. In a slogan, *the theory of search is the economics of discovery*. The slogan is only semi-original. Peirce described the process of abduction/deduction/test/abduction as the “economics of research.” As usual, he was inspired, but “abduction” never came to anything practical.

Could there be search procedures behind the discovery of things more abstract than planets? Cannizzaro’s discovery of the values of relative atomic weights might serve as an example. But what about real big juicy theories, the theory of relativity say? We don’t know precisely what went on in the brains of Einstein and Hilbert, the two discoverers of the theory, but we know something about what they knew or assumed that constrained their respective searches: they wanted field theories, which meant partial differential equations; they wanted the equations to be coordinate independent, or covariant; they wanted the equivalence principle—unforced motions would follow geodesics of a metric; they wanted a theory to account for the anomalous advance of Mercury’s perihelion. It is not at all implausible that an automated search with those constraints would turn up the field equations of general relativity as the most constrained explanation.

In sum, search methods are pretty common in science if not always explicit. They could be made a lot more common if the computer and the algorithms it allows were put to work in explicit search methods. Both philosophers and statisticians warn against search methods, even as statisticians use them—variable selection by regression is a search method, and so is principal components factor analysis. Herschel and Popper and Reichenbach seem not to have imagined the possibility of automated search, but Hempel did. Hempel claimed such searches could never discover anything novel because a computational procedure could not introduce “novel” properties. (Little did he know.) A thousand metaphors are apt: Search is the wandering Jew of methodology, useful but detested, a real enough bogeyman to scare graduate students. But even enlightened spirits who do not truck with such anti-search bigotry often imply that the fact that a hypothesis was found by a search method cannot itself be evidence for the hypothesis. And that is just wrong.

A search procedure can be thought of as a statistical estimator, and in many applications that is not just a way of thinking but the very thing. A space, possibly infinite, of possible hypotheses is assumed. There is a space of possible data samples, each of which may be ordered (as in time) or not. A search procedure is a partial function from samples to subsets of the hypothesis space. If the hypotheses are probabilistic, then all of the usual criteria for statistical estimators are well defined: consistency, bias, rate of convergence, efficiency, sufficiency and so on. Some of the usual theorems of estimation theory may not apply in some cases, because the search setup is not restricted to parametric models and the search need not be a point estimator—i.e., it may return a set of alternative models.
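As a toy illustration of a search procedure that is literally a set-valued estimator (the variable names, the correlation screen, and the threshold are my own assumptions, not anything from the essay): the procedure below maps a sample to the *subset* of candidate variables judged relevant to an outcome, and in repeated sampling it has ordinary estimator properties, converging on the truly relevant set as the sample grows.

```python
import random
import statistics

def correlation(xs, ys):
    """Sample Pearson correlation of two equal-length sequences."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    num = sum((a - mx) * (b - my) for a, b in zip(xs, ys))
    den = (sum((a - mx) ** 2 for a in xs) * sum((b - my) ** 2 for b in ys)) ** 0.5
    return num / den

def search(sample, threshold=0.3):
    """A set-valued search: from a sample (dict of name -> values, with
    outcome 'y'), return the SET of variables whose sample correlation
    with y clears the threshold. Not a point estimate of a single model."""
    y = sample["y"]
    return {v for v in sample
            if v != "y" and abs(correlation(sample[v], y)) > threshold}

# Simulated data: x1 genuinely influences y, x2 is irrelevant noise.
random.seed(0)
n = 2000
x1 = [random.gauss(0, 1) for _ in range(n)]
x2 = [random.gauss(0, 1) for _ in range(n)]
y = [a + random.gauss(0, 1) for a in x1]
found = search({"x1": x1, "x2": x2, "y": y})  # converges to {'x1'} as n grows
```

Consistency, bias, and rate of convergence for `search` are all well-defined questions about how `found` behaves as `n` grows, exactly as for a scalar estimator.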

Statistical estimators have epistemic relationships and trade-offs. Some estimators have convergence theorems that hold under weaker assumptions than other estimators. The trade-off is usually in reduced information in cases in which the stronger assumptions are actually true. Hence “robust statistics.” The same is true of search methods. Linear regression as a method of searching for causal relations is pointwise consistent under extraordinarily strong assumptions; FCI, by now an old standard in graphical model search, has much weaker sufficient conditions for pointwise consistency but provides much less information even in circumstances in which regression assumptions are correct. In other cases, there is something akin to dominance. The PC algorithm, for example, gives the same causal information as regression whenever regression does (given the information that the predictors are not effects of the outcome) but also in many cases when regression does not.

I think ordinary parameter estimation provides evidence for the estimated parameter values. The quality of the evidence depends of course on the properties of the estimator: Is it consistent? What is the variance of the estimate? What is the rationale for the hypothesis space—the parametric family of probability distributions for which properties of the estimation function have been proved? But issues of quality do not undermine the general principle that parameter estimates are evidence for and against parameter values. So it is with model search: there is better and worse.
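The point about estimator quality can be made concrete with a toy example (entirely my own, not from the essay): the sample mean is a consistent estimator of a coin's bias, and the quality of the evidence it supplies improves with sample size as its variance shrinks, without that ever making the smaller sample's estimate non-evidence.

```python
import random
random.seed(1)

def estimate_bias(n, p=0.3):
    """Sample mean of n Bernoulli(p) draws: a consistent,
    unbiased estimator of the coin's bias p."""
    return sum(random.random() < p for _ in range(n)) / n

small = estimate_bias(50)      # evidence about p, but high variance
large = estimate_bias(50_000)  # variance p(1-p)/n is 1000x smaller
# Both estimates are evidence for and against values of p; the larger
# sample's estimate lies, with high probability, much closer to 0.3.
```

The difference between `small` and `large` is one of evidential quality, not of kind: better and worse, as with model search.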

*Hendry, D. (2011) “Empirical Economic Model Discovery and Theory Evaluation”, in *Rationality, Markets and Morals*, Volume 2, Special Topic: *Statistical Science and Philosophy of Science*, edited by Deborah G. Mayo, Aris Spanos and Kent W. Staley: 115–145.

_______________________________

[i] Clark Glymour is also a Senior Research Scientist at IHMC (Florida Institute for Human and Machine Cognition).

He works on machine learning, especially on methods for automated causal inference, on the psychology of human causal judgement, and on topics in mathematical psychology.

His books include:

*Theory and Evidence* (Princeton, 1980);

*Examining Holistic Medicine* (with D. Stalker), Prometheus, 1985;

*Foundations of Space-Time Theories* (with J. Earman), University of Minnesota Press, 1986;

*Discovering Causal Structure* (with R. Scheines, P. Spirtes and K. Kelly), Academic Press, 1987;

*Causation, Prediction and Search* (with P. Spirtes and R. Scheines), Springer, 1993; 2nd Edition, MIT Press, 2001;

*Thinking Things Through*, MIT Press, 1994;

*Android Epistemology* (with K. Ford and P. Hayes) MIT/AAAI Press, 1996;

*Bayes Nets and Graphical Causal Models in Psychology*, MIT Press, 2001.

*Galileo in Pittsburgh*, Harvard University Press, 2010;

*Logic, Methodology and Philosophy of Science* (with Wang Wei and Dag Westerstahl, eds.), College Publications, 2010.

Clark: Thanks so much for this. The key difference between abduction and induction for Peirce is that only the latter turns on “predesignation” and random sampling. More generally, induction (for Peirce and for one who cares about error probabilities) requires taking into account features of the data and hypothesis generation and selection. This introduces a key factor that might be seen to alter the whole question of discovery vs justification. If features such as “how the hypothesis was selected for testing” are relevant for assessing tests, then it might be thought to collapse any distinction. But I think this is a mistake. More later…