Let N be a very large integer. (Ideally, N would be infinity, but this is a parable.) There are two games which are offered to Fred, the P and Q games, and he must play one or the other. In each, a fair coin is tossed N times, whether or not Fred plays, and each time it comes up heads, Fred gets something of positive value +V, and each time it comes up tails, Fred gets something of negative value -V. In the P game, the coin tossed is a penny, and in the Q game, it’s a quarter. To an ordinary person, the rational choice of which game to play would be entirely arbitrary. Moreover, as it happens (based on priors and other evidence), the following three hypotheses are live, and have, let us suppose, equal probability 1/3:

(H1) Fred is perfectly self-interested, and knows exactly how each toss of each coin would go. When two outcomes are equally good, he chooses randomly. (Two sub-hypotheses that I won’t distinguish: (a) The coins are indeterministic, and Fred has middle knowledge; (b) The coins are deterministic, and Fred can predict perfectly.)

(H2) Fred is perfectly self-hating (i.e., tries to minimize his own utility), and knows exactly how each toss of each coin would go. When two outcomes are equally good, he chooses randomly.

(H3) Fred has no knowledge of how future coin tosses will go and chooses which game to play at random.

You now observe that Fred chooses the P game. You also observe the first toss of the P game, and see that it’s tails, while the first toss of the Q game is heads, and so Fred gets -V, but would have got +V had he played the Q game (suppose that the outcomes in the Q game aren’t affected by whether Fred plays or not). You don’t get to observe any further steps in the game.

**Question**: How should your observation affect your probabilities of the three hypotheses?

The qualitative intuitive answer is easy. Your observation does not affect the probability of H3 at all. It increases the probability of H2 by exactly the amount by which it decreases the probability of H1. If, however, H1 were initially more probable than H2, as typically it would be, then the probabilities of both H2 and H3 would be increased.

But the interesting question is as to the details: Just how much do the probabilities change?

Let’s do some calculations. Put the outcomes of the first tosses into our background knowledge, and let E be the evidence that Fred chose the P game. In the case where N=1 (I know I said N is big, but let’s start with the easy cases) it’s easy. In that case, our evidence rules out H1, because a perfectly self-interested being who knows how the toss would go would opt to play the Q game. So, the evidence E leads to P(H1|E)=0, P(H2|E)=2/3 and P(H3|E)=1/3. (Why does H3 not increase? Because P(E|H3)=1/2 and P(E|H1 or H2)=1/2 as well.) In this case, we learn that Fred is fairly likely a perfectly self-hating predictor.

What if N=2? Well, there are sixteen possibilities for the two tosses of the two games: HH/HH, HH/HT (first toss of P is H, second toss of P is H, first toss of Q is H, second toss of Q is T), HH/TH, etc. Now, let’s evaluate P(E|H1). If H1 is right, then the only way that Fred could pick the P game if the first P-toss is tails and the first Q-toss is heads would be if the second P-toss is heads and the second Q-toss is tails, so that both games lead to the same overall zero payoff. Given the first-toss results, the conditional probability of this arrangement is 1/4. But even if it happens, Fred’s only going to choose the P game half the time. So, P(E|H1)=1/8. Suppose now H2. Then, the only way that Fred wouldn’t pick the P game given the first tosses would be if we had the above situation, and so P(E|H2)=7/8. On the other hand, P(E|H3)=1/2. Then: P(E|H2 or H3)=11/16. Plugging into Bayes’ Theorem gives: P(H1|E)=1/12, P(H2|E)=7/12, and P(H3|E)=1/3. So, once again, we learn that fairly likely Fred is a perfectly self-hating predictor. But now note that our probabilities are lower, because of our ignorance of the second toss. In the N=1 case, the probability of H2 after the observation was 0.67; for N=2, it’s 0.58.
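The N=1 and N=2 figures can be checked by brute force. The following is a minimal sketch, not part of the original argument: it enumerates every continuation of the two games after the observed first tosses and applies Bayes’ Theorem with exact fractions. The function names `likelihoods` and `posteriors` are illustrative choices, and payoffs are counted in units of V.

```python
from fractions import Fraction
from itertools import product

def likelihoods(n):
    """Return P(E|H1), P(E|H2), P(E|H3) for an n-toss game, given that the
    first P-toss is tails and the first Q-toss is heads. E = Fred picks P."""
    half = Fraction(1, 2)
    p_h1 = p_h2 = Fraction(0)
    rest = list(product((+1, -1), repeat=n - 1))  # remaining tosses, +1 = heads
    weight = Fraction(1, len(rest) ** 2)          # each (P, Q) continuation is equally likely
    for p_rest in rest:
        for q_rest in rest:
            p_total = -1 + sum(p_rest)  # first P-toss was tails
            q_total = +1 + sum(q_rest)  # first Q-toss was heads
            if p_total > q_total:       # P strictly better: only H1-Fred picks P
                p_h1 += weight
            elif p_total < q_total:     # P strictly worse: only H2-Fred picks P
                p_h2 += weight
            else:                       # tie: either Fred picks P half the time
                p_h1 += weight * half
                p_h2 += weight * half
    return p_h1, p_h2, half             # H3-Fred always picks at random

def posteriors(likes, priors=(Fraction(1, 3),) * 3):
    """Bayes' Theorem over the three hypotheses."""
    joint = [l * p for l, p in zip(likes, priors)]
    total = sum(joint)
    return [j / total for j in joint]

for n in (1, 2):
    print(n, likelihoods(n), posteriors(likelihoods(n)))
```

The enumeration grows as 4^(N-1), so this is only practical for small N; it is meant as a check on the hand calculation, not as a way to reach the large-N regime.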

Now, we can go back to our general case. Here are some facts–just take my word for them (I may be off slightly, in ways that don’t affect the argument). Let M=N-1. Then:

1/2 > P(E|H1) > (1/2)(1 - 5(2M)!/((M!)^{2} 2^{2M}))

1/2 < P(E|H2) < (1/2)(1 + 5(2M)!/((M!)^{2} 2^{2M}))

P(E|H3) = 1/2.

(To get these formulae, I used the fact that the binomial distribution peaks in the middle.) Let’s plug some large numbers in. If N=100, then 0.500 > P(E|H1) > 0.358 and 0.500 < P(E|H2) < 0.642. As far as these inequalities go, we still have fairly good evidence in favor of H2 and against H1. But what if N=1000000? Then, 0.500 > P(E|H1) > 0.499 and 0.500 < P(E|H2) < 0.501. And what if N=10^{100}, our finite approximation to eternity? Then, 0.5 > P(E|H1) > 0.5 - 1.8 x 10^{-50} and 0.5 < P(E|H2) < 0.5 + 1.8 x 10^{-50}. By Stirling’s approximation, the difference between P(E|H1) and P(E|H2) is of the order of N^{-1/2}.
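The N=100 and N=1000000 figures can be reproduced numerically. The sketch below assumes that the factorial term in the bounds is the normalized central binomial probability (2M)!/((M!)^{2} 2^{2M}), which is the reading that matches the numbers quoted above. The names `central_term` and `bounds` are illustrative, and the computation is done in log space so that the factorials do not overflow.

```python
import math

def central_term(m):
    """(2m)! / ((m!)^2 * 2^(2m)), computed in log space to avoid overflow."""
    log_c = math.lgamma(2 * m + 1) - 2 * math.lgamma(m + 1) - 2 * m * math.log(2)
    return math.exp(log_c)

def bounds(n):
    """Lower bound on P(E|H1) and upper bound on P(E|H2) for an n-toss game."""
    c = central_term(n - 1)  # M = N - 1
    return 0.5 * (1 - 5 * c), 0.5 * (1 + 5 * c)

for n in (100, 10**6):
    lo, hi = bounds(n)
    print(f"N={n}: P(E|H1) > {lo:.3f}, P(E|H2) < {hi:.3f}")
```

Double-precision `lgamma` loses all accuracy long before N=10^{100}, so that case is better handled with Stirling’s approximation, under which the central term behaves like (pi M)^{-1/2}.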

Using Bayes, we find that if N=10^{100}, then learning E will raise our probability of H2 and lower that of H1, by a number of the order of 10^{-50}. This is such a tiny change that it would be outweighed by just about any non-trivial piece of evidence for H1 and against H2.

Now for something a bit more striking. Let’s say that Fred simultaneously plays ten billion games, in each case choosing whether to play the P-type game or the Q-type game. And let E now be the following very surprising fact. In the case of each pair of games, Fred chose the one that led to him getting -V on the first try. We might think this is very strong evidence for H2. Not at all. In fact, learning E will only raise the probability of H2 by a number of the order of 10^{-40} and lower the probability of H1 by a similar number.

Similar results hold if the priors are not 1/3, 1/3, 1/3. The changes are still tiny.
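To see the size of these shifts for arbitrary priors, here is a rough sketch. It assumes that in each game P(E|H1) = 1/2 - delta, P(E|H2) = 1/2 + delta and P(E|H3) = 1/2, that the games are independent, and that delta is about 10^{-50} (an order-of-magnitude stand-in, not an exact value). The function name `posterior_shifts` is hypothetical. Because the shifts are far below ordinary floating-point precision relative to 1/3, the code computes the change in each probability directly rather than subtracting two nearly equal posteriors.

```python
import math

def posterior_shifts(delta, n_games=1, priors=(1/3, 1/3, 1/3)):
    """Change in P(H1), P(H2), P(H3) after observing E in n_games independent games.

    Likelihoods are taken relative to H3 and handled via log1p/expm1 so that
    shifts far below float precision are not rounded away.
    """
    p1, p2, p3 = priors
    a = math.expm1(n_games * math.log1p(-2 * delta))  # L1/L3 - 1
    b = math.expm1(n_games * math.log1p(+2 * delta))  # L2/L3 - 1
    mean = p1 * a + p2 * b                             # prior-weighted mean of (a, b, 0)
    z = 1 + mean                                       # normalizer relative to L3
    # posterior_i - prior_i = prior_i * (x_i - mean) / z, with x = (a, b, 0)
    return (p1 * (a - mean) / z, p2 * (b - mean) / z, p3 * (0 - mean) / z)

print(posterior_shifts(1e-50))                          # one game, equal priors
print(posterior_shifts(1e-50, n_games=10**10))          # ten billion games
print(posterior_shifts(1e-50, priors=(0.6, 0.1, 0.3)))  # H1 initially favored
```

With equal priors the H3 shift is zero to first order, and with H1 initially favored both H2 and H3 gain slightly, matching the qualitative answer given earlier.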

There is a lesson here. When we get to observe only the first step of an extremely long game, and when there is no correlation between the step we observe and the other steps, from the fact that the first step doesn’t look beneficial to the player we get extremely little evidence about whether the player chose well. Moreover, even seeing the first step of ten billion games (or ten billion steps of a single game) will not provide much evidence that way. This is surprising, but it is simply that when the game goes on extremely long, the results of the first steps are swamped.

At the same time, note that if we only knew what the first step of the P- and Q-games was, we would have good reason to choose the one where the first step favored us (assuming they differed in the first step). This is because the choice would be between these two options:

(A) Get +V and then play many steps of a game with probability 1/2 at each step of -V and probability 1/2 of +V at each step.

(B) Get -V and then play many steps of a game with probability 1/2 at each step of -V and probability 1/2 of +V at each step.

And here, option (A) is clearly the right one to choose, since its expected payoff is higher by 2V, even though the probability that one does overall better by choosing (A) is only insignificantly larger than the probability that one does overall better by choosing (B).
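The contrast between expected payoff and probability of coming out ahead can be made concrete. The sketch below is an illustration under the stated setup, with a hypothetical function name `compare_options`: it computes exactly, for m remaining tosses in each option, the probability that (A) ends strictly ahead, the probability that (B) ends strictly ahead, and the difference in expected payoff in units of V.

```python
from fractions import Fraction
from math import comb

def compare_options(m):
    """Exact comparison of options (A) and (B) with m remaining tosses each.

    Returns (P(A ends ahead), P(B ends ahead), E[A] - E[B]), payoffs in units of V.
    """
    a_wins = b_wins = ev_gap = 0
    for heads_a in range(m + 1):
        for heads_b in range(m + 1):
            ways = comb(m, heads_a) * comb(m, heads_b)  # number of toss sequences
            total_a = +1 + (2 * heads_a - m)            # (A) starts with +V
            total_b = -1 + (2 * heads_b - m)            # (B) starts with -V
            ev_gap += ways * (total_a - total_b)
            if total_a > total_b:
                a_wins += ways
            elif total_b > total_a:
                b_wins += ways
    outcomes = 4 ** m  # equally likely (A, B) continuations
    return Fraction(a_wins, outcomes), Fraction(b_wins, outcomes), Fraction(ev_gap, outcomes)

for m in (1, 10, 100):
    pa, pb, gap = compare_options(m)
    print(m, float(pa), float(pb), gap)
```

As m grows, the two win probabilities both approach 1/2 and the gap between them shrinks, while the expected-payoff gap stays fixed at 2V.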

All this, I think, shows a way to defend sceptical theism against the objection that it undercuts the epistemic basis for our moral choices. For the sceptical theist does not need to say that evil doesn’t lower the probability of the existence of God. If it lowers that probability infinitesimally, or by 10^{-40}, for our epistemic purposes it doesn’t matter. It would be a weird coincidence if some belief’s evidence were so close to the cutoff between justification and lack of justification that changing the probability by 10^{-40} would make the difference.