My conversations at the Munich Center for Mathematical Philosophy keep coming back to stopping rules, so I’ve decided to write a paper on the topic. Here is the general line that I plan to develop.
Abstract. One might think that a scientist should not be allowed to plan to end an experiment as soon as it produces data that favor his or her preferred hypothesis. At the very least, a scientist who proceeds in such a “biased” way is obliged to report this fact and to account for it in his or her data analysis. I argue that these seeming platitudes are warranted only by contingent facts about the non-ideal ways in which data are typically disseminated and used. If we were perfect Bayesian decision-makers and data went missing only at random, for instance, then “biased” stopping rules would be unproblematic. In a less utopian vein, attention to stopping rules becomes less important as institutional and technological advances allow us to approach those ideals in particular domains.
A key part of the argument is that “biased” stopping rules always involve tradeoffs. For instance, there are simple strategies for increasing the probability of producing a result that favors your preferred hypothesis over a specified alternative to a particular degree (according to the Law of Likelihood). However, those strategies also decrease the probability of getting a result that strongly favors your preferred hypothesis and increase the probability of getting a result that very strongly disfavors it. There are general results (usually presented in the context of gambling strategies) that guarantee that something like this will always be the case. As a result, it is misleading to speak of stopping rules as biased or unbiased: one stopping rule can be more biased than another in a particular respect, but it must then be less biased in other respects. The details matter, but it is at least plausible that the existence of such tradeoffs is sufficient to address the main frequentist objection to ignoring stopping rules, namely that doing so would allow unscrupulous researchers to produce systematically misleading results.
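The tradeoff is easy to see in a small simulation. The sketch below is my own illustration, not material from the planned paper; the coin example and all parameter values are invented. A researcher for whom $H_0: p = 0.5$ is in fact true keeps flipping a coin, hoping to favor $H_1: p = 0.6$ by a likelihood-ratio factor of $k = 4$ and stopping the moment that threshold is reached. Because the likelihood ratio is a nonnegative martingale under $H_0$, the probability of ever reaching $k$ is at most $1/k$; and on the runs where the rule fails, the evidence typically ends up massively against $H_1$.

```python
import math
import random

def run_biased_rule(rng, k=4.0, p0=0.5, p1=0.6, max_n=2000):
    """Flip a fair coin (so H0 is true), stopping as soon as
    LR = P(data | H1) / P(data | H0) >= k.  Returns (hit, final LR)."""
    log_lr, log_k = 0.0, math.log(k)
    for _ in range(max_n):
        heads = rng.random() < p0          # data generated under H0
        log_lr += math.log(p1 / p0) if heads else math.log((1 - p1) / (1 - p0))
        if log_lr >= log_k:
            return True, math.exp(log_lr)  # stop and "report success"
    return False, math.exp(log_lr)

rng = random.Random(0)
results = [run_biased_rule(rng) for _ in range(1000)]
hit_rate = sum(hit for hit, _ in results) / len(results)
failed_lrs = sorted(lr for hit, lr in results if not hit)

print(f"P(reach LR >= 4 under H0) ~ {hit_rate:.2f}  (provably <= 1/4)")
print(f"median final LR when the rule fails: {failed_lrs[len(failed_lrs) // 2]:.1e}")
```

The hit rate stays below $1/4$, and the runs that never reach the threshold end with likelihood ratios astronomically small: precisely the tradeoff described above.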
These tradeoffs are cold comfort in the presence of certain non-ideal practices concerning the use and dissemination of data. For instance, if scientists are able to suppress results that do not support their preferred conclusions, and decisions to accept or reject one hypothesis relative to another are made once and for all on the basis of whether or not a threshold for evidential favoring is reached, rather than in a dynamic way that attends to the precise degree of evidential favoring, then ignoring stopping rules can be disastrous. Unfortunately, such selective reporting and threshold reasoning (e.g., $p<.05$) are ubiquitous in many areas of science.
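To see how threshold reasoning combines badly with a “biased” stopping rule, here is another illustrative sketch of my own (not an analysis from the paper; the testing scheme and parameters are invented): a researcher for whom the null hypothesis is true computes a p-value after every ten observations and stops as soon as $p<.05$, declaring significance once and for all. The nominal 5% Type I error rate is badly inflated.

```python
import math
import random

def p_value(heads, n):
    # Two-sided z-test (normal approximation) for H0: the coin is fair.
    z = (heads - n / 2) / math.sqrt(n / 4)
    return 2 * (1 - 0.5 * (1 + math.erf(abs(z) / math.sqrt(2))))

def peek_until_significant(rng, max_n=1000, check_every=10):
    """Flip a fair coin (H0 true), testing after every `check_every`
    flips and stopping the first time p < .05.  Returns True if the
    researcher gets to declare a 'significant' result."""
    heads = 0
    for n in range(1, max_n + 1):
        heads += rng.random() < 0.5
        if n % check_every == 0 and p_value(heads, n) < 0.05:
            return True
    return False

rng = random.Random(0)
trials = 1000
fp_rate = sum(peek_until_significant(rng) for _ in range(trials)) / trials
print(f"False-positive rate with repeated peeking: {fp_rate:.2f} (nominal .05)")
```

The false-positive rate lands far above the nominal .05, which is exactly why the combination of optional stopping with once-and-for-all thresholds, rather than optional stopping by itself, is what can be disastrous.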
The problem of selective reporting could be addressed to a large extent through the use of pretrial registries. The problem of once-and-for-all decision thresholds may be more difficult to eliminate, particularly in domains such as medical research in which decision-making power is largely delegated to authorities that are accountable to the public and thus have reason to proceed in a relatively transparent, stable, and “objective” manner. Threshold reasoning may be less of a problem when the interests of the relevant parties are more closely aligned, as in a business’s use of its own internal data.
My main claims are that (1) as a foundational matter, the Likelihood Principle’s implication that stopping rules are irrelevant to the evidential import of data (provided that they are “noninformative” in a technical sense) is defensible, and (2) as a practical matter, attention to stopping rules may or may not be necessary in a given domain depending on how the data are disseminated and used.
To share your thoughts about this post, comment below or send me an email.