I argued in this post that likelihoodism fails to provide a viable alternative to Bayesian and frequentist methodologies because likelihoodists have not provided a way to use likelihood functions for purposes of inference or decision that has an attractive justification yet lies outside both Bayesian and frequentist frameworks.
A possible likelihoodist response to this argument is that likelihoodism provides a ceteris paribus norm of inference and decision: all else being equal, one should prefer H1 to H2 upon learning x if and only if the Law of Likelihood says that x favors H1 over H2 (i.e., Pr(x;H1)/Pr(x;H2) > 1).
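For concreteness, here is the Law of Likelihood's criterion applied to a toy binomial case (the hypotheses and numbers are illustrative choices of mine, not part of the likelihoodist literature):

```python
from math import comb

def likelihood_ratio(k, n, p1, p2):
    """Pr(x; H1) / Pr(x; H2) for datum x = k heads in n tosses,
    where H1 says the bias is p1 and H2 says it is p2.
    The binomial coefficient cancels in the ratio, but it is kept
    so that each factor is a genuine likelihood."""
    lik1 = comb(n, k) * p1**k * (1 - p1)**(n - k)
    lik2 = comb(n, k) * p2**k * (1 - p2)**(n - k)
    return lik1 / lik2

# Observing 7 heads in 10 tosses yields a ratio greater than 1,
# so the Law of Likelihood says x favors p = 0.7 over p = 0.5.
lr = likelihood_ratio(7, 10, 0.7, 0.5)
print(lr > 1)
```

The ceteris paribus norm would then direct one to prefer the p = 0.7 hypothesis upon learning x, all else being equal.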
An obvious objection to this claim is that it is extremely vague. Moreover, the obvious way to precisify it is to spell out “all else being equal” in terms of prior probabilities and perhaps utilities in a way that a Bayesian would accept, which would not vindicate likelihoodism as a genuine alternative to Bayesianism.
There is another objection to this claim that is less obvious but perhaps more conclusive: if having background knowledge and utilities that are symmetric with respect to H1 and H2 is sufficient to satisfy the ceteris paribus clause, then ceteris paribus likelihoodism implies that one should be indifferent between a pair of estimators one of which dominates the other. For there is a hypothetical scenario in which, for any value x of some random variable X whose probability distribution is parameterized by θ, one’s background knowledge and utilities are completely symmetric with respect to the hypotheses θ=θ*(x) and θ=θ’(x) and Pr(x;θ*(x))/Pr(x;θ’(x))=1, yet θ*(X) dominates θ’(X).
One could respond that being indifferent between the estimates θ*(x) and θ’(x) of θ for all values x of X is not the same as being indifferent between the estimators θ*(X) and θ’(X), but those two attitudes are behaviorally indistinguishable and have the same bad pragmatic consequences.
A hypothetical scenario in which this phenomenon arises is Stone’s (1976) “Flatland” example. The gist of the example is as follows (see this blog post by Larry Wasserman and Stone’s paper for more details). A sailor takes a number of steps along a two-dimensional grid, buries a treasure, takes one more step in a direction determined by the roll of a fair four-sided die, and then dies. He carries with him a string that he keeps taut. One’s datum x is the path of that string, and the parameter one wishes to estimate is the location of the treasure θ. Pr(x;θ)=1/4 for θ one step north, south, east, or west of the end of the string and 0 for all other values of θ. Thus, the Law of Likelihood implies that x is neutral between the hypothesis θ*(x) that θ is one step back along the path and the hypothesis θ’(x) that θ is one step forward along the path in the direction of the final step.

In addition, one’s background knowledge is symmetric with respect to θ*(X) and θ’(X) because the scenario says nothing about how θ is generated, and we can simply stipulate that one’s utilities are symmetric with respect to θ*(X) and θ’(X): say, payoff 1 or 0 according to whether one’s estimate is true or false. However, for any sequence of θs (random or nonrandom), the estimator θ*(X), which says that θ is one step back along the path, gets the right answer 3/4 of the time in the long run, while the estimator θ’(X), which says that θ is one step forward along the path in the direction of the final step, gets the right answer at most 1/4 of the time in the long run: Pr(θ*(X)=θ;θ)=3/4, while Pr(θ’(X)=θ;θ)≤1/4.
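The long-run claims can be checked by simulation. The sketch below fills in a modeling detail the example leaves open: I assume the sailor’s walk is a uniformly random non-backtracking walk of fixed length, so the taut string coincides with the walk until the final die-rolled step. Under that assumption θ*(X) is right exactly 3/4 of the time; θ’(X) turns out to be right even less often than 1/4 in this model (only when the final step retraces the string and the string’s last two steps happen to point the same way), which only strengthens the dominance of θ*(X):

```python
import random

DIRS = {'N': (0, 1), 'S': (0, -1), 'E': (1, 0), 'W': (-1, 0)}
OPP = {'N': 'S', 'S': 'N', 'E': 'W', 'W': 'E'}

def endpoint(string):
    """Grid coordinates of the end of a string (a list of direction letters)."""
    x = y = 0
    for d in string:
        dx, dy = DIRS[d]
        x, y = x + dx, y + dy
    return (x, y)

def trial(n_steps, rng):
    """One run of the Flatland scenario; returns (theta* correct?, theta' correct?)."""
    # Modeling assumption: non-backtracking walk, so no step undoes the
    # previous one and the taut string just is the walk.
    path = [rng.choice('NSEW')]
    for _ in range(n_steps - 1):
        path.append(rng.choice([d for d in 'NSEW' if d != OPP[path[-1]]]))
    theta = endpoint(path)                 # the treasure is buried here
    # Final die-rolled step; the taut string retracts if the step backtracks.
    step = rng.choice('NSEW')
    x = path[:-1] if step == OPP[path[-1]] else path + [step]
    # theta*: one step back along the observed string.
    back = endpoint(x[:-1])
    # theta': one step forward in the direction of the string's last step.
    ex, ey = endpoint(x)
    dx, dy = DIRS[x[-1]]
    fwd = (ex + dx, ey + dy)
    return back == theta, fwd == theta

rng = random.Random(0)
trials = 20000
hits_back = hits_fwd = 0
for _ in range(trials):
    b, f = trial(10, rng)
    hits_back += b
    hits_fwd += f
print(hits_back / trials)   # close to 3/4
print(hits_fwd / trials)    # well below 1/4
```

Whatever the exact success rate of θ’(X) under other walk models, the point the example needs is that θ*(X) does at least as well for every θ and strictly better for some, which is what the simulation illustrates.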
The argument just given does not work against orthodox (countably additive) Bayesianism because orthodox Bayesians must give a higher posterior probability to θ*(x) than to θ’(x) for some possible values x of X. Thus, it does not work against the Law of Likelihood understood simply as the claim that it is appropriate to use the phrase “the degree to which x favors H1 over H2” for the ratio of the posterior odds of H1 to H2 given x to the prior odds of H1 to H2. Of course, the acceptability of the Law of Likelihood understood in that way does nothing to vindicate likelihoodism as a genuine alternative to Bayesianism.
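The identity behind that Bayesian reading of the Law of Likelihood is just Bayes’ theorem: for simple hypotheses, the ratio of posterior odds to prior odds equals the likelihood ratio, whatever the prior. A minimal sketch (the likelihood values and prior are illustrative only):

```python
def posterior_h1(prior_h1, lik_h1, lik_h2):
    """Posterior probability of H1 against H2 by Bayes' theorem,
    where lik_hi = Pr(x; Hi) for the observed datum x."""
    prior_h2 = 1 - prior_h1
    return prior_h1 * lik_h1 / (prior_h1 * lik_h1 + prior_h2 * lik_h2)

prior = 0.2                      # any prior in (0, 1) yields the same ratio
post = posterior_h1(prior, 0.3, 0.1)
prior_odds = prior / (1 - prior)
post_odds = post / (1 - post)
print(post_odds / prior_odds)    # ~3.0, i.e. the likelihood ratio 0.3/0.1
```

On this reading, the likelihood ratio measures how the data shift one’s odds, but which hypothesis ends up more probable still depends on the prior, which is why this reading does not make likelihoodism independent of Bayesianism.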
The weakness of this argument is that examples like “Flatland” require infinite sample spaces. Real sample spaces (unlike our idealized models of them) are finite. Even if nature is continuous and/or unbounded, our measuring instruments have finite precision and range. Thus, a likelihoodist could reasonably claim that the Law of Likelihood is a ceteris paribus norm of inference for data from any experiment that we could actually perform. The fact that this claim encounters difficulties in idealized thought experiments is irrelevant to practical methodology.