I apologize for past gaps between posts in light of my stated intention to post roughly once per week. I won’t apologize for future gaps because I am hereby rescinding that statement. I’m finding that a regular posting schedule just doesn’t work with my dissertation writing schedule. I’ll continue to post as I am able.
Now, on to the show.
One of likelihoodism’s central theses is the Law of Likelihood, which says that the likelihood ratio Pr(x;H1)/Pr(x;H2) is a correct measure of the degree to which datum x supports hypothesis H1 over H2 (with a likelihood ratio of 1 indicating neutrality). Sober (2008, p. 35) describes the Law of Likelihood as a “proposal” (as opposed to, say, a theorem) to be judged according to how well it (1) renders precise and systematic our use of the informal concept of “evidential favoring” and (2) isolates an epistemologically interesting concept.
It might seem obvious that the informal concept of “evidential favoring” is epistemologically interesting. After all, it is a truism that, as Hume put it, “a wise man proportions his belief to the evidence.” Similarly, according to Quine and Ullian, “insofar as we are rational in our beliefs, the intensity of belief will tend to correspond to the firmness of the available evidence.” The notion of evidence is closely linked to the notion of rational belief, and the notion of rational belief is epistemologically interesting if anything is. Thus, it might seem that the only question for the Law of Likelihood is how well it captures the informal concept of evidential favoring.
The following is a prima facie strong argument that the Law of Likelihood captures the informal concept of evidential favoring very well indeed. Birnbaum purported to show that evidential meaning conforms to the Likelihood Principle. Even frequentists who reject his proof admit that Bayes’s theorem, and thus the Likelihood Principle, is correct when Bayes’s theorem applies, that is, when prior probabilities are available. Barnard gave the following persuasive argument that when the Likelihood Principle is correct, a measure of evidential favoring should depend only on the likelihood ratio. Suppose that in addition to the datum x one also observes some independent, irrelevant event E that has probability p on both hypotheses H1 and H2. For the purposes of assessing relative evidential support for those hypotheses, it should not matter whether one considers x alone or the conjunction of x and E. The factor of p that distinguishes the likelihoods on the conjunction of x and E from the likelihoods on x alone cancels out only if one takes ratios. Thus, a measure of relative evidential support that depends on the data only through the likelihood function should depend on that function only through the likelihood ratio.
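Barnard’s cancellation point is simple arithmetic and can be checked directly. The numbers below are illustrative placeholders, not drawn from any particular experiment; the point is that the factor of p contributed by the irrelevant event E cancels in the ratio but not in, say, a difference of likelihoods:

```python
# illustrative likelihoods of datum x under hypotheses H1 and H2
p_x_h1, p_x_h2 = 0.30, 0.06
p_e = 0.5  # probability of the independent, irrelevant event E on both hypotheses

ratio_x = p_x_h1 / p_x_h2
# by independence, the likelihoods on (x and E) just pick up a common factor p_e
ratio_x_and_e = (p_x_h1 * p_e) / (p_x_h2 * p_e)

assert abs(ratio_x - ratio_x_and_e) < 1e-12                 # the ratio is unaffected
assert (p_x_h1 * p_e) - (p_x_h2 * p_e) != p_x_h1 - p_x_h2   # a difference measure is not
```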
Moreover, such a measure should be monotonically increasing in the likelihood ratio. For suppose that we instead used a function f of the likelihood ratio such that f(L) > f(L’) for some L < L’. Then, in an experiment involving independent and identically distributed tosses of a coin, one could construct, for a string of heads as long as one likes, pairs of hypotheses H and H’ such that H posits a higher probability of heads than H’, yet the observation of such a string favors H’ over the hypothesis that the coin is fair more strongly than it favors H over that same hypothesis. I take it that a measure of evidential favoring need only be unique up to monotone transformations, and thus that an argument that such a measure should be monotonically increasing in the likelihood ratio suffices to establish the Law of Likelihood.
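The coin-toss argument can be illustrated numerically. This sketch is my own; the hypotheses and the non-monotone f are hypothetical choices made only to exhibit the reversal:

```python
def lr_heads(h, n):
    """Likelihood ratio for 'P(heads) = h' against 'the coin is fair'
    after observing n heads in a row: (h / 0.5) ** n."""
    return (h / 0.5) ** n

# the likelihood ratio grows with the posited heads probability and the run length
assert lr_heads(0.9, 5) > lr_heads(0.6, 5)
assert lr_heads(0.9, 10) > lr_heads(0.9, 5)

# a non-monotone f (a hypothetical bad choice) can rank the weaker hypothesis
# as better supported by the same string of heads
f = lambda L: -(L - 10.0) ** 2
L_weak, L_strong = lr_heads(0.6, 5), lr_heads(0.9, 5)
assert L_weak < L_strong and f(L_weak) > f(L_strong)
```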
A frequentist could maintain that this argument establishes only that the Law of Likelihood is correct when prior probabilities are available. But why should we find that claim plausible? Just as a BTU, defined as the amount of energy required to raise the temperature of one pound of water at 39.2 degrees F by one degree F, retains its meaning in the absence of such water, so too we should expect the “ban” (the logarithm of the likelihood ratio), defined as the amount of evidence required to raise the odds for a pair of hypotheses at even odds by a factor of 2.5, to retain its meaning in the absence of such hypotheses (Royall 1997, p. 13).
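On the Bayesian reading that underwrites the unit analogy, the log likelihood ratio behaves like an additive quantity of evidence: multiplying the odds by a likelihood ratio adds its logarithm to the log odds. A minimal sketch, with illustrative numbers:

```python
import math

prior_odds = 1.0         # a pair of hypotheses at even odds
likelihood_ratio = 10.0  # an illustrative value
posterior_odds = prior_odds * likelihood_ratio  # Bayes's theorem in odds form

# measured in logarithms, evidence adds rather than multiplies
assert math.isclose(math.log10(posterior_odds),
                    math.log10(prior_odds) + math.log10(likelihood_ratio))
```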
I will not belabor the details of this argument because I believe that it is misleading regardless of how persuasive each step may be. The problem with likelihoodism is not that the Law of Likelihood does a poor job explicating a notion of evidential favoring, but rather that the notion of evidential favoring it explicates is epistemologically interesting only within either a Bayesian or a frequentist methodology. As Hume and Quine and Ullian contend, the notion of evidence is closely tied to that of rational belief. But it does not follow that a conceptually adequate measure of relative evidential favoring will by itself provide useful norms for belief.
Within a Bayesian methodology, likelihood ratios are potentially interesting because they are ratios of posterior odds to prior odds. From a Bayesian perspective, it might make sense in some cases to report a likelihood ratio rather than a posterior probability distribution in order to allow one’s audience members to update their own prior odds. But the usefulness of likelihood ratios for this purpose is parasitic on the usefulness of Bayesian updating. It does not vindicate likelihoodism as a viable and distinctive methodology.
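The reporting practice described above amounts to publishing the multiplier and letting each reader supply the other factor. A sketch, with hypothetical numbers:

```python
def posterior_odds(prior_odds, likelihood_ratio):
    """Bayesian updating in odds form: each reader multiplies their own
    prior odds by the reported likelihood ratio."""
    return prior_odds * likelihood_ratio

reported_lr = 4.0  # a hypothetical published likelihood ratio
assert posterior_odds(1.0, reported_lr) == 4.0   # even prior odds become 4:1
assert posterior_odds(0.25, reported_lr) == 1.0  # a skeptical 1:4 prior becomes even odds
```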
Within a frequentist methodology, likelihood ratios are potentially interesting because they are often good test statistics. For instance, the Karlin-Rubin theorem states that when one’s hypothesis space has a monotone likelihood ratio, a test that rejects a point null against a one-sided composite alternative if and only if the likelihood ratio for the null against a pre-specified element of the alternative falls below a given cutoff value is uniformly most powerful among tests with the same Type I error rate. For the purpose of evaluating likelihoodism, it is important to understand that the Karlin-Rubin theorem does not say that a method of indiscriminately reporting likelihood ratios has any special properties. Uniformly most powerful tests require not only appropriate test statistics but also appropriate use of those test statistics, including frequentist practices such as predesignating the null and alternative hypotheses.
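To make the monotone-likelihood-ratio point concrete, here is a small binomial sketch of my own (the hypotheses, sample size, and level are all hypothetical): because the likelihood ratio increases with the head count, rejecting when the null-to-alternative ratio is small is the same as applying a simple count threshold calibrated to a Type I error rate:

```python
from math import comb

n, p0, p1, alpha = 20, 0.5, 0.7, 0.05  # hypothetical one-sided test setup

def lr(heads):
    """Likelihood ratio of the alternative p1 to the null p0 after `heads` heads."""
    return (p1**heads * (1 - p1)**(n - heads)) / (p0**heads * (1 - p0)**(n - heads))

# monotone likelihood ratio: lr increases with the head count, so rejecting
# when the null-to-alternative ratio is small means rejecting for large counts
assert all(lr(h + 1) > lr(h) for h in range(n))

def tail(k, p):
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))

# the smallest count cutoff whose Type I error rate is at most alpha; by
# Karlin-Rubin, the resulting threshold test is uniformly most powerful
# among tests at its attained level
k = min(k for k in range(n + 1) if tail(k, p0) <= alpha)
```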
Philosophers of science who seek a non-Bayesian theory of induction and confirmation are typically looking for a theory that tells one what to believe with reference only to the data and a set of truth-valued assumptions about the process that gave rise to the data. Thus, Mayo uses the phrase “evidence for H” more or less interchangeably with the phrase “good grounds for inferring H.” In contrast, data strongly favoring H over H’ in the likelihoodist’s sense may not provide good grounds for inferring H rather than H’. Likelihoodists readily admit this point. In fact, they emphasize it in response to purported counterexamples such as the following. Suppose you were willing to assume that you live in a world in which all decks of cards are either “standard” (fifty-two cards with the usual distribution of suits and ranks) or “anomalous” (fifty-two cards with all the same suit and rank). A deck of cards is presented to you. The deck is shuffled, and the top card is revealed to be a king of clubs. Given your assumptions, there are two hypotheses compatible with this observation: either the deck is standard, or it contains fifty-two kings of clubs. The Law of Likelihood says that a correct measure of the degree to which your observation of a king of clubs favors the hypothesis that the deck contains fifty-two kings of clubs over the hypothesis that the deck is standard is the likelihood ratio of the former to the latter for this observation, namely fifty-two. Likelihoodists sometimes classify likelihood ratios of at least eight as “strong evidence” and likelihood ratios of at least thirty-two as “very strong evidence.” By that convention, the observation of a king of clubs is very strong evidence favoring the hypothesis that the deck contains fifty-two kings of clubs over the hypothesis that the deck is standard. Yet it would be odd to leap to the conclusion that the deck in fact contains fifty-two kings of clubs. 
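The likelihood ratio in the deck example falls straight out of the two sampling probabilities; exact rational arithmetic makes the fifty-two explicit:

```python
from fractions import Fraction

p_kc_standard = Fraction(1, 52)  # one king of clubs among fifty-two distinct cards
p_kc_anomalous = Fraction(1)     # every card in the anomalous deck is the king of clubs

likelihood_ratio = p_kc_anomalous / p_kc_standard
assert likelihood_ratio == 52
assert likelihood_ratio >= 32  # "very strong evidence" by the conventional cutoff
```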
A policy that endorsed leaping to that conclusion would, on pain of making arbitrary distinctions among types of cards, have to endorse leaping to the conclusion that the deck contains fifty-two copies of whatever card appeared on the first draw. But such a policy would always reject the hypothesis that the deck is standard after a single draw, even though intuitively a single draw tells one nothing about how varied the cards in the deck are.
Likelihoodists respond to counterexamples like this one by saying that when Bayesian methods apply, they provide proper norms for belief; when they do not apply, proper epistemic modesty permits only talk of relative evidential support. Bayesian methods give reasonable answers in examples like the one above. (For instance, if one assigns strictly positive prior probabilities to all of the possible deck configurations, with the conditional prior probability over the anomalous configurations uniform, one finds that the probability that the deck is standard does not change with a single draw.) The examples give no reason to think that likelihoodist methods are unreasonable, as long as one keeps in mind that those methods speak only of evidential support and not of posterior belief.
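The parenthetical computation can be spelled out. Assuming a prior probability for the standard deck (nine tenths below, but any value in (0, 1) works) and the remaining mass spread uniformly over the fifty-two possible anomalous decks, the posterior probability that the deck is standard equals the prior:

```python
from fractions import Fraction

prior_standard = Fraction(9, 10)                  # illustrative; any prior in (0, 1) works
prior_each_anomalous = (1 - prior_standard) / 52  # uniform over the 52 anomalous decks

lik_standard = Fraction(1, 52)  # chance a standard deck shows the king of clubs
lik_kc_deck = Fraction(1)       # the all-king-of-clubs deck; the other 51 decks give 0

marginal = prior_standard * lik_standard + prior_each_anomalous * lik_kc_deck
posterior_standard = prior_standard * lik_standard / marginal
assert posterior_standard == prior_standard  # the draw leaves the probability unchanged
```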
The problem with this defense is that to call the likelihoodist’s talk of relative evidential support modest is an overstatement. Likelihood ratios by themselves tell one nothing about what to believe or do. Hume may be correct that a wise man proportions his belief to the evidence in some sense, but a wise person does not use likelihood ratios for inference or decision-making without either following a Bayesian methodology by incorporating prior probabilities or following a frequentist methodology by using likelihood ratios in such a way that the resulting procedure has good performance characteristics. The concept of evidence may be epistemologically interesting, but the likelihoodist’s measure of relative evidential favoring is not. Non-Bayesian philosophers of science looking for a theory of induction and confirmation should look elsewhere.
Even without precise, well-defined prior probability distributions or likelihood functions, it may sometimes be possible to argue that the likelihood ratio for one hypothesis against another on the total available evidence is so large that one would have to have an unreasonably strong prior inclination toward the second over the first in order to continue to believe it after seeing the evidence, on pain of violating Bayesian conditionalization. Sober’s defense of evolutionary theory against intelligent design (2008) could be interpreted along those lines. It is important to keep in mind that such a likelihoodist argument is at bottom a kind of robust Bayesian argument.
Setting a higher threshold for “very strong evidence” won’t help. One could create analogous examples with arbitrarily large likelihood ratios by replacing the standard and anomalous decks with larger decks that are analogous in that the cards of the first are all distinguishable from one another while the cards of the second are not.
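In an N-card version of the example, a single draw yields a likelihood ratio of N in favor of the anomalous deck, so no fixed cutoff can survive:

```python
from fractions import Fraction

def deck_likelihood_ratio(n):
    """Likelihood ratio favoring an all-identical n-card deck over an
    all-distinct one after a single draw: 1 / (1/n) = n."""
    return Fraction(1) / Fraction(1, n)

assert deck_likelihood_ratio(52) == 52
assert deck_likelihood_ratio(10**6) > 32  # exceeds any fixed cutoff as n grows
```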
Want to keep up with new posts without having to check for them manually? Use the sidebar on the left to sign up for updates via email or RSS feed!