### The Problem for Methodological Likelihoodism

I have argued that methodological likelihoodism is false by arguing that an adequate self-contained methodology for science provides good norms of commitment vis-à-vis hypotheses, by articulating minimal requirements for a norm of this kind, and by proving that no purely likelihood-based norm satisfies those requirements.

### The Solution? Appeal to Long-Run Operating Characteristics

One might attempt to rescue methodological likelihoodism by lowering one’s standards. **Perhaps no purely likelihood-based norms of commitment are among the canons of rationality, but such norms are nevertheless useful in practice when deployed judiciously.** This move may not appeal to most philosophers, but similar moves are common among statisticians (e.g. Chatfield 2001, Kass 2011, and Gelman 2011).

The idea that purely likelihood-based methods are useful when deployed judiciously is plausible only if those methods have some redeeming quality that at least partially compensates for the problems I discuss here. What could that redeeming quality be? Here are four candidates from Royall (2000, 760):

1. intuitive plausibility,
2. consistency with other axioms and principles,
3. objectivity, and
4. desirable operational implications.

I am willing to grant 1-3 for the sake of argument. Those virtues are not sufficient, even jointly, to vindicate methodological likelihoodism. It still needs to be shown that methods based on likelihood functions alone can provide useful guidance for our commitments vis-à-vis hypotheses.

4 is prima facie more promising. It refers to the purported fact that purely likelihood-based methods are guaranteed to perform well in certain senses in the indefinite long run if used over and over again with varying data. Appeals to guarantees about long-run performance are the hallmark of frequentism, but Bayesians cite such results as well, perhaps most often in the form of convergence theorems (e.g. Doob 1949). The exact significance of various facts about long-run operating characteristics is a matter of dispute, but there is no disputing the basic idea that we want techniques that we can reasonably expect to yield good results.

### Why Appealing to Long-Run Operating Characteristics Does Not Help Methodological Likelihoodism

Unfortunately for this line of response, likelihoodist appeals to 4 generate many problems (see this post). The most damning of these problems is that **the operating characteristics that likelihoodists appeal to are not operating characteristics of purely likelihood-based methods**. Instead, they are operating characteristics of methods that use likelihood functions in a frequentist way.

Let me explain. By definition, purely likelihood-based methods are not sensitive to differences between experimental outcomes that are not reflected in the likelihood function. One such difference concerns the distinction between *fixed* and *random* hypotheses. Fixed hypotheses are specified without reference to the data, while random hypotheses are specified in terms of the data. For instance, the hypothesis that the mean of the distribution that produced the data is *zero* is a fixed hypothesis, while the hypothesis that it is the *sample average* (the sum of the data values divided by the number of data values) is a random hypothesis, because the value of the sample average depends on the data while the value of the number zero does not.
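The point can be made concrete with a small sketch (the normal model with known $\sigma = 1$ and the made-up data values are illustrative assumptions of mine, not from the post): the likelihood computation takes only the data and the hypothesized means as inputs, so nothing in it records whether a hypothesized value was fixed in advance or read off the data.

```python
import math

def log_likelihood(data, mu):
    """Log-likelihood of a N(mu, 1) model for the data (constants dropped)."""
    return -0.5 * sum((x - mu) ** 2 for x in data)

data = [0.3, -0.1, 0.5, 0.2]
xbar = sum(data) / len(data)  # random hypothesis: mu equals the sample average

# Likelihood ratio favoring mu = xbar over the fixed hypothesis mu = 0.
log_lr = log_likelihood(data, xbar) - log_likelihood(data, 0.0)
lr = math.exp(log_lr)
# The computation uses only the data; nothing in it registers whether either
# hypothesized value was specified before or after the data were seen.
```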

Unlike purely likelihood-based methods, frequentist methods violate the likelihood principle by being sensitive to the distinction between fixed and random hypotheses. A frequentist may draw different conclusions about the hypothesis that the mean of a distribution is zero depending on whether he or she set out to test the hypothesis that the mean is zero or set out to test the hypothesis that the mean is the sample average, which turned out to be zero.

Whether sensitivity to the distinction between fixed and random hypotheses is a good feature for a method to have or not is a topic for another occasion. The key points for present purposes are that **sensitivity to the distinction between fixed and random hypotheses (1) is not present in purely likelihood-based methods and (2) is necessary for the long-run operating characteristics that likelihoodists erroneously cite in support of their methods**. I will illustrate these claims for the *universal bound*, which is the fact that the probability of a likelihood ratio of at least $k$ for any given fixed, false hypothesis against the true hypothesis is at most $1/k$ (i.e., $\Pr_{H_0}(\Pr(E|H_1)/\Pr(E|H_0)\geq k)\leq 1/k$) (Royall 2000, 762-3). The same point holds for other results concerning the performance characteristics of methods based on likelihood functions, including both the tighter bounds that Royall derives for specific distributions (2000) and likelihood ratio convergence theorems (Hawthorne 2012).
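To see the universal bound at work for a *fixed* alternative, here is a small Monte Carlo sketch (the normal model, the choice of $H_1: \mu = 1$, $k = 8$, $n = 10$, and the trial count are illustrative assumptions of mine, not from the post):

```python
import math
import random

random.seed(0)

def likelihood_ratio(data, mu1, mu0=0.0):
    """Likelihood ratio of N(mu1, 1) against N(mu0, 1) for the data."""
    log_lr = sum(-0.5 * (x - mu1) ** 2 + 0.5 * (x - mu0) ** 2 for x in data)
    return math.exp(log_lr)

k = 8.0
n, trials = 10, 20_000
exceed = 0
for _ in range(trials):
    data = [random.gauss(0.0, 1.0) for _ in range(n)]  # H0: mu = 0 is true
    # H1: mu = 1 is *fixed*: it was chosen without reference to the data.
    if likelihood_ratio(data, mu1=1.0) >= k:
        exceed += 1

estimate = exceed / trials  # empirical Pr_{H0}(LR >= k); the bound says <= 1/k
```

In runs like this the estimate comes out well below the bound $1/k = 0.125$; as the Armitage example below shows, the bound can fail only when the alternative is allowed to depend on the data.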

#### An Example Due to Armitage as a Counterexample to a Generalized Universal Bound

An example^{1} due to Armitage (see commentary on 1961) is a counterexample to a generalized version of the universal bound that applies to fixed as well as random hypotheses. I will simply describe the main features of the Armitage example here; see (Cox and Hinkley 1974, 50–1) for details. The example involves taking observations until the sample average $\bar{x}$ is at least a specified distance away from zero. That distance decreases as the number of observations increases. It does so at a rate that is fast enough that the experiment is guaranteed to end in finite time,^{2} but slow enough to ensure that according to the Law of Likelihood its final outcome strongly favors the hypothesis that the true mean equals $\bar{x}$ over the hypothesis that it equals zero. For any $k$, there is an experiment of this kind such that $\Pr_{H_0}(\Pr(E|H_1)/\Pr(E|H_0)\geq k)$ is 1—a maximally severe violation of the universal bound.
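The mechanics of the example can be sketched in simulation (a rough illustration under assumptions of my own: normal data with known $\sigma = 1$, $H_0: \mu = 0$ true, and the fact that the likelihood ratio for $H_1: \mu = \bar{x}$ against $H_0$ is then $\exp(n\bar{x}^2/2)$; the cap on the sample size is added only to keep the simulation finite, whereas the theoretical result needs no cap and holds for every $k$):

```python
import math
import random

random.seed(1)

def sample_until_lr_at_least(k, max_n=200_000):
    """Draw N(0, 1) data under H0 (mu = 0) until the likelihood ratio for
    H1: mu = xbar over H0: mu = 0 reaches k, i.e. until
    n * xbar**2 / 2 >= ln(k). Returns the stopping time, or None if the
    cap max_n is reached first."""
    log_k = math.log(k)
    total = 0.0
    for n in range(1, max_n + 1):
        total += random.gauss(0.0, 1.0)
        xbar = total / n
        if n * xbar ** 2 / 2 >= log_k:  # log LR >= log k: stop
            return n
    return None

# Each run that stops has produced a likelihood ratio of at least k against
# the true hypothesis; the theory says stopping has probability 1.
stopping_times = [sample_until_lr_at_least(4.0) for _ in range(30)]
```

A modest $k$ is used here only to keep the runs short; the stopping rule works the same way, just more slowly, for any $k$.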

I have argued elsewhere that this example should not be regarded as a counterexample to the Law of Likelihood itself. However, it is a counterexample to attempts to use the universal bound to support the use of purely likelihood-based methods.

### Conclusion

Likelihood functions do not distinguish between fixed and random hypotheses, so purely likelihood-based methods cannot distinguish between them either. Thus, results such as the universal bound that hold only for fixed hypotheses do not support the use of purely likelihood-based methods. Methodological likelihoodists who wish to claim that purely likelihood-based methods are useful when deployed judiciously need to find some other support for that view.

To share your thoughts about this post, comment below or send me an email.


lotharson says

Dear Greg, I have the following question.

As I told you, I am a pluralist, believing that there is not a SINGLE all-encompassing epistemology.

I think that the likelihood ratios could provide useful *priors* to open-minded Bayesians. As I explained, I utterly reject the principle of indifference, which aims at turning complete ignorance into specific knowledge in a way very akin to magical thinking.

I really think that scientists are well advised NOT to use it for calculating priors of theories.

Instead I’d propose the following steps:

1) waiting to have good experimental data

2) computing the likelihood ratios of ALL competing theories p(Data|theory)/p(Data)

3) making our ideal degrees of belief equal to the ratios, which are naturally comprised between 0 and 1

4) using Bayes’ theorem for updating the probability

Are you open to this suggestion?

I am convinced that it is a FAR better way to proceed than using the principle of indifference, because it allows us to directly start with *physically meaningful* values.

If I were a Bayesian, I would certainly go that way.

I doubt, however, that there is such a thing as rational degrees of beliefs in propositions being either true or false, unlike events one can bet upon.

Cheers.

Greg Gandenberger says

That way of proceeding is appealing *prima facie*. Unfortunately, it yields the same results as Bayesian updating on a flat prior and thus faces all the same difficulties as the principle of indifference.

Michael Lew says

Edwards describes the use of ‘prior likelihoods’ in his excellent book “Likelihood” (which is still in print: http://www.amazon.com/Likelihood-A-W-F-Edwards/dp/0801844436), but to me there seems little difference between prior likelihoods and prior probabilities other than the former having easily understood properties on the scale of the experimental results.

Greg Gandenberger says

On the other hand, prior probabilities, unlike prior likelihoods, have easily understood implications for one’s beliefs and behavior after seeing the data.

Michael Lew says

I agree entirely. The distinction between Edwards’s proposed use of prior likelihoods and prior probability distributions is too subtle for my mind. I was pointing to them only because I thought that lotharson might find it interesting.

lotharson says

Thanks!

lotharson says

The problem is that these implications can be utterly misleading if non-physical or non-informative priors are used.

Assuming the existence of rational degrees of belief about propositions, I fail to see why fixing prior probabilities using the principle of indifference would be better in terms of beliefs and actions than fixing them as the likelihoods after a certain amount of data has been obtained.

Actually, I am pretty confident that the second approach is far more promising in terms of results and convergence than the first one, since it starts with an informative prior.

But maybe I am misunderstanding you.

Greg Gandenberger says

Waiting for some data $E$ and then setting $\Pr(H_1|E)/\Pr(H_2|E)=\Pr(E|H_1)/\Pr(E|H_2)$ gives the same result as setting your initial prior $\Pr(H_1)/\Pr(H_2)=1$ and then conditioning on $E$:

$$\frac{\Pr(H_1|E)}{\Pr(H_2|E)}=\frac{\Pr(H_1)}{\Pr(H_2)}\frac{\Pr(E|H_1)}{\Pr(E|H_2)}$$
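Concretely, the identity can be checked with a toy computation (the numbers are illustrative assumptions of mine):

```python
# Toy check: equating the posterior ratio to the likelihood ratio is the
# same as conditioning on a flat prior. Numbers are made up for illustration.
pr_e_h1, pr_e_h2 = 0.6, 0.2     # Pr(E|H1) and Pr(E|H2) for observed data E
prior_h1, prior_h2 = 0.5, 0.5   # flat prior: Pr(H1)/Pr(H2) = 1

posterior_ratio = (prior_h1 * pr_e_h1) / (prior_h2 * pr_e_h2)
likelihood_ratio = pr_e_h1 / pr_e_h2
# The flat prior drops out, so the two ratios coincide.
```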