I apologize for past gaps between posts in light of my stated intention to post roughly once per week. I won’t apologize for future gaps because I am hereby rescinding that statement. I’m finding that a regular posting schedule just doesn’t work with my dissertation writing schedule. I’ll continue to post as I am able.

Now, on to the show.

One of likelihoodism’s central theses is the Law of Likelihood, which says that the likelihood ratio Pr(*x*;*H*_{1})/Pr(*x*;*H*_{2}) is a correct measure of the degree to which datum *x* supports hypothesis *H*_{1} over *H*_{2} (with a likelihood ratio of 1 indicating neutrality). Sober (2008, p. 35) describes the Law of Likelihood as a “proposal” (as opposed to, say, a theorem) to be judged according to how well it (1) renders precise and systematic our use of the informal concept of “evidential favoring” and (2) isolates an epistemologically interesting concept.

It might seem obvious that the informal concept of “evidential favoring” is epistemologically interesting. After all, it is a truism that, as Hume put it, “a wise man proportions his belief to the evidence.” Similarly, according to Quine and Ullian, “insofar as we are rational in our beliefs, the intensity of belief will tend to correspond to the firmness of the available evidence.” The notion of evidence is closely linked to the notion of rational belief, and the notion of rational belief is epistemologically interesting if anything is. Thus, it might seem that the only question for the Law of Likelihood is how well it captures the informal concept of evidential favoring.

The following is a *prima facie* strong argument that the Law of Likelihood captures the informal concept of evidential favoring very well indeed. Birnbaum purported to show that evidential meaning conforms to the Likelihood Principle. Even frequentists who reject his proof admit that Bayes’s theorem, and thus the Likelihood Principle, are correct when Bayes’s theorem applies—that is, when prior probabilities are available. Barnard gave the following persuasive argument that when the Likelihood Principle is correct, a measure of evidential favoring should depend only on the likelihood ratio. Suppose that in addition to the datum *x* one also observes some independent irrelevant event *E* that has probability *p* on both hypotheses *H*_{1} and *H*_{2}. For the purposes of assessing relative evidential support for those hypotheses, it should not matter whether one considers *x* alone or the conjunction of *x* and *E*. The factor of *p* that distinguishes the likelihoods on the conjunction of *x* and *E* from the likelihoods on *x* alone will cancel out only if one takes ratios. Thus, a measure of relative evidential support that depends on the data only through its likelihood function should depend on it only through the likelihood ratio.
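Barnard’s cancellation step can be written out explicitly in the post’s notation. Because *E* is independent of *x* and has the same probability *p* on both hypotheses:

```latex
\frac{\Pr(x \wedge E; H_1)}{\Pr(x \wedge E; H_2)}
  = \frac{p \cdot \Pr(x; H_1)}{p \cdot \Pr(x; H_2)}
  = \frac{\Pr(x; H_1)}{\Pr(x; H_2)}
```

By contrast, a difference measure yields \(p \cdot \Pr(x; H_1) - p \cdot \Pr(x; H_2) = p \,[\Pr(x; H_1) - \Pr(x; H_2)]\), which retains the irrelevant factor *p*.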

Moreover, such a measure should be monotonically increasing in the likelihood ratio. For suppose that we used a function *f* of the likelihood ratio such that *f*(*L*)*>f*(*L’*) for some *L*<*L’*. Then in an experiment involving independent and identically distributed tosses of a coin, one could construct, for a string of heads as long as one likes, a pair of hypotheses *H* and *H’* such that *H* posits a higher probability of heads than *H’*, yet the observation of that string favors *H’* over the hypothesis that the coin is fair more strongly than it favors *H* over that hypothesis. I take it that a measure of evidential favoring need only be unique up to monotone transformations, and thus that an argument that such a measure should be monotonically increasing in the likelihood ratio is sufficient to establish the Law of Likelihood.
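A hypothetical numerical illustration of the pathology (the coin biases, the run length, and the non-monotone function *f* below are my own choices, not part of the argument above): take *H*: heads probability 0.9 and *H’*: heads probability 0.6, each compared against a fair coin after three heads in a row, with the non-monotone score *f*(*L*) = −(*L* − 2)².

```python
# Likelihood ratios against a fair coin after k consecutive heads
# (illustrative values: H posits P(heads) = 0.9, H' posits P(heads) = 0.6).
k = 3
lr_H = (0.9 / 0.5) ** k        # 1.8**3 = 5.832
lr_Hprime = (0.6 / 0.5) ** k   # 1.2**3 = 1.728

# A non-monotone "support" function of the likelihood ratio.
def f(L):
    return -(L - 2) ** 2

# H has the larger likelihood ratio against the fair coin...
assert lr_H > lr_Hprime
# ...yet the non-monotone f scores H' as more strongly favored.
assert f(lr_Hprime) > f(lr_H)
```

Any non-monotone *f* generates such a reversal for a suitable pair of hypotheses, which is the intuition the paragraph above relies on.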

A frequentist could maintain that this argument establishes only that the Law of Likelihood is correct when prior probabilities are available. But why should we find that claim plausible? Just as a BTU defined as the amount of energy required to raise the temperature of one pound of water at 39.2 degrees F by one degree F retains its meaning in the absence of such water, so too we should expect the “ban” (the logarithm of the likelihood ratio) defined as the amount of evidence required to raise the odds for a pair of hypotheses at even odds by a factor of 2.5 to retain that meaning in the absence of such hypotheses (Royall 1997, p. 13).

I will not belabor the details of this argument because I believe that it is misleading regardless of how persuasive each step may be. The problem with likelihoodism is not that the Law of Likelihood does a poor job explicating a notion of evidential favoring, but rather that the notion of evidential favoring it explicates is epistemologically interesting only within either a Bayesian or a frequentist methodology. As Hume and Quine and Ullian contend, the notion of evidence is closely tied to that of rational belief. But it does not follow that a conceptually adequate measure of relative evidential favoring will by itself provide useful norms for belief.

Within a Bayesian methodology, likelihood ratios are potentially interesting because they are ratios of posterior odds to prior odds. From a Bayesian perspective, it might make sense in some cases to report a likelihood ratio rather than a posterior probability distribution in order to allow one’s audience members to update their own prior odds. But the usefulness of likelihood ratios for this purpose is parasitic on the usefulness of Bayesian updating. It does not vindicate likelihoodism as a viable and distinctive methodology.
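The relation this paragraph appeals to follows directly from Bayes’s theorem applied to each hypothesis:

```latex
\frac{\Pr(H_1 \mid x)}{\Pr(H_2 \mid x)}
  = \frac{\Pr(x; H_1)}{\Pr(x; H_2)} \cdot \frac{\Pr(H_1)}{\Pr(H_2)},
\qquad\text{so}\qquad
\frac{\Pr(x; H_1)}{\Pr(x; H_2)}
  = \frac{\text{posterior odds}}{\text{prior odds}}.
```

Reporting the likelihood ratio thus lets each audience member multiply it into his or her own prior odds—but only if he or she has prior odds to multiply it into.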

Within a frequentist methodology, likelihood ratios are potentially interesting because they are often good test statistics. For instance, the Karlin-Rubin theorem states that when one’s hypothesis space has a monotone likelihood ratio, a test that rejects a point null against a one-sided composite alternative if and only if the likelihood ratio for the null against a pre-specified element of the alternative falls below a given cutoff value is uniformly most powerful among tests with the same Type I error rate. For the purpose of evaluating likelihoodism, it is important to understand that the Karlin-Rubin theorem does not say that a method of indiscriminately reporting likelihood ratios has any special properties. Uniformly most powerful tests require not only appropriate test statistics but also appropriate use of those test statistics, including frequentist practices such as predesignating the null and alternative hypotheses.
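A minimal sketch of the monotone-likelihood-ratio setting (the binomial model, sample size, and alternative value here are illustrative assumptions, not part of the theorem’s statement): for *n* coin tosses with a point null *p* = 0.5 against a one-sided alternative, the likelihood ratio of the null against any prespecified *p*₁ > 0.5 is strictly decreasing in the number of heads, so rejecting when the ratio falls below a cutoff is equivalent to rejecting when the count of heads exceeds a threshold.

```python
from math import comb

def binom_pmf(x, n, p):
    """Probability of x heads in n independent tosses with heads probability p."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)

n, p0, p1 = 10, 0.5, 0.7  # illustrative: point null p0 vs. a prespecified p1 > p0

# Likelihood ratio of the null against p1 as a function of the observed count x.
lr = [binom_pmf(x, n, p0) / binom_pmf(x, n, p1) for x in range(n + 1)]

# Monotone likelihood ratio: the ratio is strictly decreasing in x, so
# "reject when lr(x) < cutoff" is equivalent to "reject when x >= some c".
assert all(lr[x] > lr[x + 1] for x in range(n))
```

The test’s optimality attaches to this rejection rule with its predesignated error rate, not to the bare act of reporting `lr`.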

Philosophers of science who seek a non-Bayesian theory of induction and confirmation are typically looking for a theory that tells one what to believe with reference only to the data and a set of truth-valued assumptions about the process that gave rise to the data. Thus, Mayo uses the phrase “evidence for *H*” more or less interchangeably with the phrase “good grounds for inferring *H*.” In contrast, data strongly favoring *H* over *H’* in the likelihoodist’s sense may not provide good grounds for inferring *H* rather than *H’*. Likelihoodists readily admit this point. In fact, they emphasize it in response to purported counterexamples such as the following. Suppose you were willing to assume that you live in a world in which all decks of cards are either “standard” (fifty-two cards with the usual distribution of suits and ranks) or “anomalous” (fifty-two cards with all the same suit and rank). A deck of cards is presented to you. The deck is shuffled, and the top card is revealed to be a king of clubs. Given your assumptions, there are two hypotheses compatible with this observation: either the deck is standard, or it contains fifty-two kings of clubs. The Law of Likelihood says that a correct measure of the degree to which your observation of a king of clubs favors the hypothesis that the deck contains fifty-two kings of clubs over the hypothesis that the deck is standard is the likelihood ratio of the former to the latter for this observation, namely fifty-two. Likelihoodists sometimes classify likelihood ratios of at least eight as “strong evidence” and likelihood ratios of at least thirty-two as “very strong evidence.” By that convention, the observation of a king of clubs is very strong evidence favoring the hypothesis that the deck contains fifty-two kings of clubs over the hypothesis that the deck is standard. Yet it would be odd to leap to the conclusion that the deck in fact contains fifty-two kings of clubs. 
A policy that endorsed leaping to that conclusion would have to endorse leaping to the conclusion that the deck contains fifty-two copies of whatever card appeared in the first draw, on pain of making arbitrary distinctions among types of cards. But such a policy would always reject the hypothesis that the deck is standard given a single draw, even though intuitively a single draw tells one nothing about how varied the cards in the deck are.
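The example’s arithmetic can be checked directly (a sketch; the two likelihoods follow from the two stipulated deck types):

```python
# Probability of drawing a king of clubs from each stipulated deck type.
lik_standard = 1 / 52   # one king of clubs among fifty-two distinct cards
lik_anomalous = 1.0     # every card is a king of clubs

# Likelihood ratio favoring the anomalous deck over the standard deck.
lr = lik_anomalous / lik_standard
assert abs(lr - 52) < 1e-9
# By the conventional thresholds mentioned above (8 and 32), this counts
# as "very strong evidence" favoring the anomalous deck.
assert lr >= 32
```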

Likelihoodists respond to counterexamples like this one by saying that when Bayesian methods apply, they provide proper norms for belief. When Bayesian methods do not apply, proper epistemic modesty permits only talk of relative evidential support. Bayesian methods give reasonable answers in examples like the one above. (For instance, if one assigns strictly positive prior probabilities to all of the possible deck configurations, with the prior probability conditional on the deck’s being anomalous spread uniformly over the possible anomalous configurations, one finds that the probability that the deck is standard does not change with a single draw.) The examples give no reason to think that likelihoodist methods are unreasonable, as long as one keeps in mind the fact that those methods speak only of evidential support and not of posterior belief.
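That parenthetical claim is easy to verify numerically (the prior value 0.9 below is an arbitrary illustration; any strictly positive prior for the standard deck works, with the remaining mass spread uniformly over the fifty-two anomalous configurations):

```python
prior_standard = 0.9                              # arbitrary illustrative prior
prior_each_anomalous = (1 - prior_standard) / 52  # uniform over 52 anomalous decks

# Likelihood of drawing a king of clubs under each hypothesis:
# 1/52 for the standard deck; 1 for the all-kings-of-clubs deck;
# 0 for the other 51 anomalous decks.
joint_standard = prior_standard * (1 / 52)
joint_kings = prior_each_anomalous * 1.0
evidence = joint_standard + joint_kings           # the 51 zero terms drop out

posterior_standard = joint_standard / evidence
# The posterior equals the prior: a single draw is uninformative here.
assert abs(posterior_standard - prior_standard) < 1e-12
```

Intuitively, every possible first card shifts probability among the anomalous configurations without shifting any toward or away from the standard-deck hypothesis.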

The problem with this defense is that to call the likelihoodist’s talk of relative evidential support modest is an overstatement. Likelihood ratios by themselves tell one *nothing* about what to believe or do. Hume may be correct that a wise man proportions his belief to the evidence in some sense, but a wise person does not use likelihood ratios for inference or decision-making without either following a Bayesian methodology by incorporating prior probabilities or following a frequentist methodology by using likelihood ratios in such a way that the resulting procedure has good performance characteristics. The concept of evidence may be epistemologically interesting, but the likelihoodist’s measure of relative evidential favoring is not. Non-Bayesian philosophers of science looking for a theory of induction and confirmation should look elsewhere.

Even without precise, well-defined prior probability distributions or likelihood functions, it may sometimes be possible to argue that the likelihood ratio for one hypothesis against another on the total available evidence is so large that one would have to have an unreasonably strong prior inclination to believe the second over the first in order to continue to do so after seeing the evidence, on pain of violating Bayesian conditionalization. Sober’s defense of evolutionary theory against intelligent design (2008) could be interpreted along those lines. It is important to keep in mind that such a likelihoodist argument is at bottom a kind of robust Bayesian argument.
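This form of argument rests on a simple rearrangement of the odds version of Bayes’s theorem (writing *r* for the likelihood ratio favoring the first hypothesis, a symbol I introduce here for convenience):

```latex
\underbrace{\frac{\Pr(H_1 \mid x)}{\Pr(H_2 \mid x)}}_{\text{posterior odds}}
  = r \cdot \frac{\Pr(H_1)}{\Pr(H_2)} < 1
  \quad\Longleftrightarrow\quad
  \frac{\Pr(H_2)}{\Pr(H_1)} > r
```

So continuing to favor the second hypothesis after conditionalizing on evidence with a very large *r* requires prior odds in its favor of at least *r*—the “unreasonably strong prior inclination” in question.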

Setting a higher threshold for “very strong evidence” won’t help. One could create analogous examples with arbitrarily large likelihood ratios by replacing the standard and anomalous decks with larger decks that are analogous in that the cards of the first are all distinct from one another while the cards of the second are all identical.


Greg Gandenberger says

Michael Lew left the following comment on this post on my old blog site:

The card example is misleading: it’s an uncorrected mistake from the past. Likelihoods are not equal to the probability of the data given the hypothesis; they are only _proportional_ to that probability. Thus the likelihoods of two hypotheses can only be compared when the two hypotheses fit onto one likelihood function. In the cards example the likelihood function would presumably be over the number of kings of clubs. The data would support 52 kings of clubs over 1 king of clubs by 52 to 1, but 52 kings of clubs over 51 kings of clubs by a tiny margin of 52/51. Thus the data would not support reasonable inference about kings of clubs even while the likelihood function documents the evidence in the data.

Trivial evidence gives no reasonable inference. There is nothing in the likelihood principle or the law of likelihood that says that you have to make an inference before there is enough evidence to do so in a rational manner.

If you formulate hypotheses for the cards to allow one conclusion to be ‘normal deck’ then the likelihood function takes 52! dimensions, and the inadequacy of the single card as evidence becomes plain.

Greg Gandenberger says

It is of course true that neither the Likelihood Principle nor the Law of Likelihood requires drawing a conclusion in any circumstance. But that fact is a weakness as well as a strength.

As far as the card example specifically is concerned, I stipulated in setting up the example that either the deck is standard or it has fifty-two cards of the same rank and suit. Thus the number of kings of clubs is either one or fifty-two, and the likelihood ratio of fifty-two holds between the only two hypotheses that are compatible with the data given the background information that is built into the example.

Michael Lew says

Well, even with the stipulation that the deck is either standard or 52 cards the same, the likelihood function has 53 dimensions: one for each type of card plus one for the standard deck. The first card drawn changes the function from being flat on all dimensions to being zero on 51 dimensions and equal on the standard-deck and kings-of-clubs dimensions, because any card provides as much evidence for the standard deck, whereas no single card can provide evidence against the standard deck.

(Of course, the implication that one would decide anything on the basis of one card when a second card would provide proof is silly.)

Greg Gandenberger says

I agree with everything you say here. My point is just that a rule for deciding what to believe or do based on likelihood ratios alone isn’t going to work. This claim is not controversial, but it sets the stage for the more controversial claim that likelihoodism fails to provide a viable genuine alternative to Bayesian and frequentist methodologies.

Greg Gandenberger says

Yes, I agree. My point is simply that, as every likelihoodist knows, likelihood ratios are not sufficient for belief or decision making.