The Law of Likelihood states that the outcome E of an observational situation favors hypothesis H1 over hypothesis H2 if and only if Pr(E|H1)>Pr(E|H2), and the degree to which E favors H1 over H2 is given by the likelihood ratio Pr(E|H1)/Pr(E|H2). The Law of Likelihood is controversial. It has been championed by Hacking (1965), Edwards (1972), and Royall (1997). Bayesian methods are consistent with it, but many Bayesians have no interest in it because they believe that the posterior distribution is all one needs for inference and decision making. Frequentist methods can violate it in various senses; for instance, there are uniformly most powerful level α hypothesis tests of a null hypothesis H0 against an alternative Ha that reject H0 if and only if an outcome occurs that according to the Law of Likelihood favors H0 over Ha, and moreover does so more than any other outcome (see my previous post).
My view regarding principles such as the Law of Likelihood is that no matter how well they might accord with our intuitions, they need to earn their keep by somehow helping us achieve our epistemic and/or practical goals. In this post, I will show that there is a way in which following the Law of Likelihood can help us achieve such goals: a hypothesis test that violates it fails to maximize expected utility.
The choice of a rejection region for a point-against-point frequentist hypothesis test determines a conditional probability of Type I error α and a conditional probability of Type II error β. For a test with a finite sample space (i.e., every test that can actually be performed), the possible (α,β) pairs can be plotted. For concreteness, consider an experiment with the following sample space:
It suffices for our purposes to consider the three possible tests shown below:
Test 1 is in the upper left, Test 2 in the upper right, Test 3 in the lower right.
One can create a rule for choosing among possible hypothesis tests by specifying a family of “indifference curves” in the (α,β) plane, where two points a and b lie on the same indifference curve if and only if one does not strictly prefer a test with the error probabilities given by a over one with the error probabilities given by b. Every indifference curve lies either wholly above or wholly below every other indifference curve, and a test on an indifference curve I is strictly preferred to every test on an indifference curve that lies above I.
Frequentists typically allow indifference curves to take any shape as long as they do not cross and their tangent lines have negative slope everywhere. But it’s easy to show that maximizing expected utility with one’s choice of test requires that one’s indifference curves be parallel straight lines. With this point in mind, it’s easy to see that a test that violates the Law of Likelihood by having in its rejection region a sample point with a smaller likelihood ratio of Ha to H0 than a sample point outside its rejection region cannot maximize expected utility.
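The claim about parallel straight lines can be sketched as follows. The notation is an assumption introduced here for illustration and does not come from the post: π is a prior probability for H0, and u00, u01, u11, u10 are the utilities of correctly retaining H0, committing a Type I error, correctly rejecting H0, and committing a Type II error, respectively.

```latex
\begin{align*}
\mathrm{EU}(\alpha,\beta)
  &= \pi\,\bigl[(1-\alpha)\,u_{00} + \alpha\,u_{01}\bigr]
   + (1-\pi)\,\bigl[(1-\beta)\,u_{11} + \beta\,u_{10}\bigr] \\
  &= \underbrace{\pi\,u_{00} + (1-\pi)\,u_{11}}_{\text{constant}}
   - a\,\alpha - b\,\beta,
\qquad a = \pi\,(u_{00}-u_{01}),\quad b = (1-\pi)\,(u_{11}-u_{10}).
\end{align*}
```

Under these assumptions, two tests have equal expected utility exactly when aα + bβ takes the same value for both. If errors are worse than correct decisions, a and b are positive, so each indifference curve is a straight line with slope −a/b in the (α,β) plane, and all of them share that slope.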
Consider the example plotted above. Assuming that one is aiming to maximize expected utility, one’s indifference curves are parallel straight lines that have either smaller slope, larger slope, or the same slope as the line that connects Test 1 and Test 3 in the (α,β) plane. If they have the same slope, then Tests 1 and 3 are both preferred to Test 2. If they have larger (less negative) slope, then Test 3 is preferred to Test 2 (and Test 1). If they have smaller (more negative) slope, then Test 1 is preferred to Test 2 (and Test 3). Thus, violating the Law of Likelihood by preferring Test 2 to both Test 1 and Test 3 is incompatible with maximizing expected utility.
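A minimal numerical sketch of this case analysis follows. The sample space, its probabilities, and the names `P0`, `PA`, `TESTS`, and `best_tests` are hypothetical; they are not the values from the figure, only chosen so that Test 1 lands in the upper left, Test 2 in the upper right, and Test 3 in the lower right. Parallel straight indifference lines with slope −w are modeled by ranking tests on w·α + β, lower being better.

```python
from fractions import Fraction as F

# Hypothetical sample space (NOT the one from the post's figure).
P0 = {"a": F(5, 100), "b": F(5, 100), "c": F(4, 100), "d": F(86, 100)}    # Pr(x | H0)
PA = {"a": F(50, 100), "b": F(30, 100), "c": F(4, 100), "d": F(16, 100)}  # Pr(x | Ha)

def error_probs(region):
    """Return (alpha, beta) for a test that rejects H0 on `region`."""
    alpha = sum((P0[x] for x in region), F(0))
    beta = 1 - sum((PA[x] for x in region), F(0))
    return alpha, beta

# Test 2 rejects on "c" (likelihood ratio 1) but not on "b" (likelihood ratio 6).
TESTS = {"Test 1": {"a"}, "Test 2": {"a", "c"}, "Test 3": {"a", "b"}}

def best_tests(w):
    """Tests minimizing w*alpha + beta, i.e. lying on the lowest line of slope -w."""
    cost = {}
    for name, region in TESTS.items():
        alpha, beta = error_probs(region)
        cost[name] = w * alpha + beta
    lowest = min(cost.values())
    return sorted(name for name, c in cost.items() if c == lowest)

if __name__ == "__main__":
    for w in (F(1, 2), F(1), F(6), F(20), F(30)):
        print(f"slope -{w}: best = {best_tests(w)}")
```

With these illustrative numbers the line through Test 1 and Test 3 has slope −6, so w = 6 gives a tie between them, smaller w selects Test 3, and larger w selects Test 1. Test 2 is never among the best.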
The same argument can be made for any test that violates the Law of Likelihood. One can compare that test to a test that simply omits an “offending” point (one that is in the rejection region but has a smaller likelihood ratio of Ha to H0 than some point outside the rejection region) from its rejection region, and to a test that replaces that offending point with an “offended” point (one that is not in the rejection region but has a larger likelihood ratio of Ha to H0 than the offending point). With α on the horizontal axis, the slope of the straight line connecting those two tests in the (α,β) plane is the negative of the likelihood ratio of Ha to H0 at the offended point. The slope of the straight line connecting the offending test to the test that omits the offending point is the negative of the likelihood ratio at the offending point, which is greater (less negative), so we have a situation like the one above.
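The construction can be sketched in code, again under assumptions: the sample space is hypothetical, α is taken to be the horizontal axis, and the helper names (`lr`, `find_violation`, `slope`) are introduced here for illustration only.

```python
from fractions import Fraction as F

# Hypothetical sample space; the probabilities are illustrative assumptions.
P0 = {"a": F(5, 100), "b": F(5, 100), "c": F(4, 100), "d": F(86, 100)}    # Pr(x | H0)
PA = {"a": F(50, 100), "b": F(30, 100), "c": F(4, 100), "d": F(16, 100)}  # Pr(x | Ha)

def lr(x):
    """Likelihood ratio of Ha to H0 at sample point x."""
    return PA[x] / P0[x]

def error_probs(region):
    """Return (alpha, beta) for a test that rejects H0 on `region`."""
    alpha = sum((P0[x] for x in region), F(0))
    beta = 1 - sum((PA[x] for x in region), F(0))
    return alpha, beta

def slope(test_from, test_to):
    """Slope of the line joining two tests, with alpha on the horizontal axis."""
    (a1, b1), (a2, b2) = error_probs(test_from), error_probs(test_to)
    return (b2 - b1) / (a2 - a1)

def find_violation(region):
    """Return one (offending, offended) pair, or None if there is none."""
    for x in sorted(region):
        for y in sorted(set(P0) - set(region)):
            if lr(x) < lr(y):
                return x, y
    return None

region = {"a", "c"}               # the offending test
x, y = find_violation(region)     # x is offending, y is offended
omit = region - {x}               # test that drops the offending point
swap = omit | {y}                 # test that replaces it with the offended point

if __name__ == "__main__":
    print("offending:", x, "offended:", y)
    print("slope omit -> swap:", slope(omit, swap), "and -LR(offended):", -lr(y))
    print("slope region -> omit:", slope(region, omit), "and -LR(offending):", -lr(x))
```

This checks only the slope comparison stated above, for one hypothetical sample space; it does not by itself establish which of the three tests is preferred.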
This argument is not intended to be an objection to frequentism. At most, it’s an argument for adding to standard frequentist accounts of hypothesis testing the requirement that every point in the rejection region have a higher likelihood ratio of Ha to H0 than any point outside the rejection region. Even this claim is too strong: maximizing expected utility is not mandatory, and a frequentist will want to claim that the expected utility of a test is typically ill-defined. That being said, the result does show one way in which abiding by the Law of Likelihood can help us achieve epistemic and practical goals.