It’s a truism that more diverse evidence in favor of a hypothesis supports it better than less diverse evidence, all else being equal. But the precise meaning and justification of this claim are unclear. For that reason, it is a prime target for philosophers of science.
I once attempted to address this issue by (roughly) comparing the amount of information one could expect to get by consulting experts who were getting their information from a common intermediate source to the amount one could expect to get by consulting comparable experts who were getting their information through separate channels. (Think of reading multiple newspapers that are relying on the same wire service vs. reading multiple newspapers that each had their own reporter on the scene.)
The project sounded like a good idea, and it yielded results that seem fairly interesting:
- Getting your information from independent sources is always better (in expectation) in a case involving normal random variables, but not in a case involving binary random variables.
- There are strong limits on the amount of information one can get by consulting non-independent experts in the normal case. For instance, when the correlation between the testimony of each pair of non-independent experts is $.25$, even infinitely many of them provide only as much information (in expectation) as four independent experts.
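The four-expert limit can be illustrated with a simple model (assumed here for illustration; the paper's actual setup may differ): each expert reports the true quantity plus a unit-variance error, errors between any pair of non-independent experts are correlated at $\rho = .25$, the reports are pooled by averaging, and information is measured as the inverse variance of the pooled estimate. The variance of the average of $n$ equicorrelated reports is $(1 + (n-1)\rho)/n$, which tends to $\rho$ as $n \to \infty$, so the information tends to $1/\rho = 4$ times what a single independent expert provides.

```python
# A minimal numerical sketch of the four-expert limit, under the
# assumed model above (not necessarily the paper's exact model).
rho = 0.25  # pairwise correlation between non-independent experts

def var_of_mean(n, rho):
    """Variance of the average of n unit-variance, equicorrelated reports."""
    return (1 + (n - 1) * rho) / n

# n independent experts give information n (inverse variance n),
# so information from n correlated experts, 1 / var_of_mean(n, rho),
# approaches 1 / rho = 4 "independent-expert equivalents".
for n in [1, 10, 100, 10_000]:
    info = 1 / var_of_mean(n, rho)
    print(f"{n:>6} correlated experts ~ {info:.3f} independent experts")
```

Running this shows the equivalent number of independent experts climbing toward, but never reaching, four.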
It’s obvious from looking at the paper that I spent a ton of time on it. It includes nice figures; relatively exciting-sounding theorems; and long, algebraically intensive proofs.
In the end, though, it wasn’t publishable. Or at least, I don’t think it was publishable. I might have been able to get some journal to accept it eventually, but I never tried because it wasn’t a paper that I could stand behind.
The problem with the paper in my eyes is that I cannot see why anyone would actually care about the results, beyond the initial reaction that they sound kind of interesting. Philosophers of science want to clarify the nature of diversity and its significance for the import of evidence in hand. But the paper addresses instead a question of experimental design: to what extent should you expect to get more information from independent sources? Moreover, it’s not clear to me that this question is really of interest to statisticians and scientists, either. They are broadly interested in experimental design, of course, but they are rarely if ever in a situation in which they need to decide whether to consult a given number of independent experts or a given number of comparable non-independent experts.
I could have seen this problem coming. The problem was not that I asked a good question and just didn’t find anything to say about it. In fact, the results I found are about as interesting as I could have hoped for. The problem was that I was asking a bad question.
Validate your ideas before you sink a lot of time into them. Take a step back and ask, “How could this project possibly yield an interesting result?” And don’t stop there: get feedback early and often from smart people who will tell you the truth.