Kotzen on the Improbability of Nothing
February 26, 2014 — 18:36

Author: Kenny Pearce  Category: Existence of God Prosblogion Reviews  Comments: 5

When someone asks ‘why p rather than q?’, it is sometimes a good answer to say, ‘p is far more probable than q.’ When someone asks, ‘why is p more probable than q?’, it is sometimes a good answer to say, ‘there are many more ways for p to be true than for q to be true.’ According to a well-known paper by Peter van Inwagen, the question ‘why is there something rather than nothing?’ can be answered in just this fashion: something is far more probable than nothing, because there are infinitely many ways for there to be something, but there is only one way for there to be nothing. In his contribution to The Puzzle of Existence, Matthew Kotzen argues that this sort of answer is only sometimes a good one, and that we cannot know a priori whether it is a good answer to the question of something rather than nothing.

Kotzen’s general line of response is a standard one: he argues that there are many possible measures, and not all of them assign probability 0 to the empty world. Van Inwagen is perfectly aware of this problem, but argues that a priori considerations allow us to select a natural measure. Kotzen’s strategy is to identify some everyday examples where this pattern of explanation looks good, and some where it looks bad, and show that van Inwagen’s a priori considerations don’t draw the line between good and bad in the right place. Furthermore, he argues (p. 228) that van Inwagen’s considerations may not actually be sufficient to assign unique probabilities in the relevant cases, since it is not always clear what space the measure should be assigned over.

I think Kotzen’s argument against van Inwagen is quite compelling. The best thing about Kotzen’s article, though, is that it does a great job explaining these complex issues at a moderate level of rigor and detail while assuming hardly any background. This would be a great article to assign to undergraduate students.

In the rest of this post, I’m going to do two things. First, I’m going to explain the issue about measures at a much lower level of rigor and detail than Kotzen does, just to make sure we are all up to speed. Second, I am going to raise the question of whether van Inwagen’s argument might have an even bigger problem: whether, instead of too many equally eligible measures, there might be none.

The simplest, most familiar cases where the probabilistic pattern of explanation with which we are concerned works are finite and discrete. This is the case, for instance, with dice rolls or coin flips. The coin comes up either heads or tails; each die shows one of its six faces. So then, as one learns in one’s very first introduction to probabilities, in the case of the dice roll, the probability of any particular proposition about that dice roll is the number of cases in which the proposition is true divided by the total number of possible cases (for two six-sided dice, 36). In real life, when we divide the outcomes into discrete cases like this, we care about certain factors (which face is up) and not about others (e.g., where on the table the dice land). This division into discrete cases is called a partition. The reason the probabilities are so simple in the dice case, with each case in the partition being equally likely, is that we chose a good partition. (Well, actually, it’s because a fair die is defined as one that makes each of those outcomes equally probable, but let’s ignore that for now and imagine that fair dice just occur in nature rather than being made by humans on purpose.) Suppose that, on one of our dice, the face with six dots is painted red rather than white and, for some reason, what we really care about is whether the red face is up. Well then we might partition the outcomes accordingly, into the red outcome and the non-red outcomes. But these two cases (red and non-red) are not equally probable.
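The counting rule can be made concrete with a small sketch. It assumes two fair six-sided dice and takes the red face to be the six on the first die; the names `OUTCOMES` and `probability` are illustrative only.

```python
from fractions import Fraction
from itertools import product

# All 36 outcomes for two six-sided dice, assumed fair (so equally likely).
OUTCOMES = list(product(range(1, 7), repeat=2))

def probability(event):
    """Favourable cases over total cases; valid only for an equiprobable partition."""
    favourable = sum(1 for outcome in OUTCOMES if event(outcome))
    return Fraction(favourable, len(OUTCOMES))

# A proposition about the roll: the two dice sum to seven.
p_seven = probability(lambda o: o[0] + o[1] == 7)

# Coarser partition: is the red face (the six on the first die) up or not?
p_red = probability(lambda o: o[0] == 6)
p_not_red = 1 - p_red

print(p_seven, p_red, p_not_red)
```

The two-cell partition into red and non-red is perfectly legitimate, but counting its cells would wrongly suggest 1/2 each. The actual values (1/6 and 5/6) come from the finer, equiprobable partition.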

Sometimes the thing we care about is not a discrete case like this, but a fundamentally continuous case like (in a standard example) where on a dartboard a perfectly thin dart lands. A measure is basically the equivalent, in this continuous case, of the partition in the discrete case. For the dart board, there is a natural measure, one that ‘just makes sense’, and this is provided by our ordinary spatial concepts. So if, for instance, the bullseye takes up 1/10 of the area of the dartboard then, if the dart is thrown randomly, it will have a 1/10 chance of landing there. (Again, this is really just what it means for the dart to be thrown randomly.) This isn’t the only possible measure, but it’s the one that, in some sense, ‘just makes sense.’ But the question is, is there a natural measure on the space of possible worlds? That is, is there some ‘correct’ or ‘sensible’ or ‘natural’ way of saying how ‘far apart’ two possible worlds are? This is far from clear. The Lewis-Stalnaker semantics for counterfactuals supposes that we can talk about some worlds being ‘closer together’ than others, but this is not enough to define a measure. Furthermore, Lewis, at least, thinks that the closeness of worlds might change based on contextual factors (which respects of similarity we most care about), so it seems like there’s a plurality of measures there. Perhaps one could claim that all of these reasonably natural measures agree in assigning nothing probability 0, but that’s not clear either. For instance, Leibniz seems to think that one reason why the existence of something cries out for explanation is that “a nothing is simpler and easier than a something” (“Principles of Nature and Grace,” tr. Woolhouse and Francks, sect. 7). So maybe we should adopt a measure in which worlds get lower probability the more complicated they are. (I think Swinburne might also have a view like this.) On this kind of view, the empty world (if there is such a world) will be the most probable world. 
So the plurality of measures seems like a problem.
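Even for the dartboard, the answer depends on which measure is chosen. The sketch below compares the ordinary area measure with a rival measure that is uniform in distance from the centre. The rival measure and the function names are assumptions introduced purely for illustration.

```python
import math

BOARD_RADIUS = 1.0
# Bullseye radius chosen so that the bullseye covers 1/10 of the board's area.
BULLSEYE_RADIUS = BOARD_RADIUS * math.sqrt(0.1)

def p_bullseye_area_measure(r, R):
    """Probability under the ordinary area measure: ratio of the two areas."""
    return (math.pi * r**2) / (math.pi * R**2)

def p_bullseye_radial_measure(r, R):
    """Probability under a rival measure that is uniform in distance from the centre."""
    return r / R

p_area = p_bullseye_area_measure(BULLSEYE_RADIUS, BOARD_RADIUS)
p_radial = p_bullseye_radial_measure(BULLSEYE_RADIUS, BOARD_RADIUS)

print(round(p_area, 3), round(p_radial, 3))
```

The same bullseye gets probability 1/10 on the first measure and roughly 0.316 on the second. Nothing in the mathematics says which is correct; our spatial concepts do that work for the dartboard, and it is unclear what would do it for possible worlds.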

It’s not the only problem, though. Kotzen notes that “the Lebesgue measure can be defined only in spaces that can be represented as Euclidean n-dimensional real-valued spaces” (222). (The Lebesgue measure is the standard measure used, for instance, in the dart board case: the more space a region takes up, the bigger its measure.) But the space of possible worlds is not like this! David Lewis has argued that the cardinality of the space of possible worlds must be greater than the cardinality of the continuum (Plurality of Worlds, 118). The reason is relatively simple: suppose that it is possible that there should be a two-dimensional Euclidean space in which every point is either occupied or unoccupied. The set of possible patterns of occupied and unoccupied points in such a space (each representing a distinct possibility) will be larger than the continuum. But if this is right, then there can be no Lebesgue measure on the possible worlds because there are too many worlds. Even if this exact class of worlds is not really possible (for reasons such as the considerations about space in modern physics I raised last time) it seems likely that there are too many worlds for the space of possible worlds to have a Lebesgue measure. Yet Kotzen attributes to van Inwagen the view “that we ought to associate a proposition’s probability with its Lebesgue measure in the relevant space” (227).
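The cardinality step in Lewis’s argument can be spelled out as follows, writing $\mathfrak{c}$ for the cardinality of the continuum (notation assumed here, not taken from Lewis or Kotzen). Each pattern of occupation is a function from the points of the plane to {occupied, unoccupied}, and the plane has exactly continuum-many points, so:

```latex
\[
  \left|\{0,1\}^{\mathbb{R}^2}\right|
  \;=\; 2^{|\mathbb{R}^2|}
  \;=\; 2^{\mathfrak{c}}
  \;>\; \mathfrak{c}
\]
```

The final inequality is Cantor’s theorem: no set is as large as its own power set.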

Maybe van Inwagen is not in quite this much trouble. He doesn’t actually seem to say anything about a Lebesgue measure in the paper, so I’m not sure exactly why Kotzen thinks van Inwagen is committed to this. In fact, in the paper Kotzen is discussing, van Inwagen cites his earlier discussion in Daniel Howard-Snyder’s collection, The Evidential Argument from Evil. In endnote 3 (pp. 239-240) of that article, van Inwagen says “the notion of the measure of a set of worlds gets most of such content as it has from the intuitive notion of the proportion of logical space that a set of worlds occupies.” I find it a little bit ironic that van Inwagen says this, because he’s always denying that he has intuitions about things! I don’t have intuitions about proportions of logical space. In any event, it seems to me that van Inwagen is here disavowing the project of giving a well-defined measure in the mathematician’s sense.

Suppose one did want to identify a natural measure that was well-defined in the mathematician’s sense. I’m not sure about all the technicalities of trying to do this for sets of larger-than-continuum cardinality, and whether it can be done at all. Even if it can, though, it’s going to be hard to say that one measure is more intuitive or natural than another in such an exotic realm. Things might be even worse: Pruss thinks (PSR, p. 100) that, for any cardinality k, it is possible that there be k many photons. If this is true, then there is a proper class of possible worlds, and one certainly can’t define a measure on a proper class. (This is another thing I don’t think I have intuitions about.)

All this to say: anyone who wants to assign a priori probabilities to all propositions (as van Inwagen does) is fighting an uphill battle, but if such probabilities cannot be assigned, then it does not seem that the probabilistic pattern of explanation can be used to tell us why there is something rather than nothing.

(Cross-posted at blog.kennypearce.net)

• FACT. Assume the Axiom of Choice. Then suppose S is an infinite set and P is a finitely additive probability measure such that:
(a) P is defined on some nontrivial algebra F of subsets of S that is invariant under permutations (if A is in F, then pi[A] is in F for every permutation pi; nontrivial here means that it contains some subset that is neither empty nor full), and
(b) P is invariant under permutations of S: if pi is a permutation of S, then P(pi[A])=P(A) for every subset A in F.
Then for every a in S, the singleton {a} is in F, and P({a}) = 0.
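A sketch of why this holds, for a real-valued P (the symbols $\tau$, $p$ and $a_1,\dots,a_n$ are introduced here for illustration). Since F is nontrivial, pick A in F with some a in A and some b not in A, and let $\tau$ be the permutation that swaps a and b. Then the singleton is obtained by set difference, and moving it around by permutations gives every other singleton the same probability:

```latex
\[
  \{a\} \;=\; A \setminus \tau[A],
  \qquad \tau = (a\;b),\quad a \in A,\quad b \notin A
\]
\[
  P(\{a_1\}) = \dots = P(\{a_n\}) = p
  \;\Longrightarrow\;
  n\,p \;=\; P(\{a_1,\dots,a_n\}) \;\le\; 1
\]
```

Because S is infinite, n distinct points can be found for every n, so p is at most 1/n for every n and must be 0.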

(And if we extend set theory to allow a sufficient upward hierarchy of superclasses, maybe this will work if S is a proper class. I don’t know about that sort of stuff myself.)

Lebesgue measure is generated by requiring invariance under Euclidean isometries. What van Inwagen may want to argue is that for world-counting, permutation invariance is the right analogue to isometric invariance. I doubt it myself, but that’s the best I can do for him at this point.

February 28, 2014 — 9:01
• But perhaps the most promising approach for PvI might be this. Getting a very helpful probability measure on the collection of all worlds is perhaps not the best approach. But maybe a weaker thing can be done. Namely, one might have a comparative probability. Thus, there is a transitive and reflexive relation ≤, read as “is at most as likely as”, that can hold between sets of worlds, subject to the obvious constraint that if A is a subset of B, then A≤B. (Further axioms are usual, but perhaps we shouldn’t require them.)

Now, say that a subset A is ≤-null provided that for every natural number n, there are n disjoint sets B1,…,Bn of worlds such that A≤Bi for each of the Bi.

There are several possibilities here. Either ≤ is a primitive probabilistic relation, or it is derived from a more fundamental unconditional (real-valued or even hyperreal-valued) probability via the rule A≤B iff P(A)≤P(B), or it is derived from a more fundamental conditional probability via the de Finetti rule A≤B iff P(A-B|(A-B)∪(B-A))≤P(B-A|(A-B)∪(B-A)). In the latter two cases, if A is ≤-null, then it has zero or infinitesimal unconditional probability. So being ≤-null should have similar kinds of explanatory consequences to having zero probability.
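For the second option, the step from ≤-null to zero probability is short, assuming for this sketch that P is real-valued and finitely additive. Given disjoint $B_1,\dots,B_n$ with $A \le B_i$ for each i:

```latex
\[
  n\,P(A)
  \;\le\; \sum_{i=1}^{n} P(B_i)
  \;=\; P\Big(\bigcup_{i=1}^{n} B_i\Big)
  \;\le\; 1
  \quad\Longrightarrow\quad
  P(A) \;\le\; \frac{1}{n}
\]
```

Since this holds for every natural number n, P(A) = 0. With a hyperreal-valued P the same bound leaves P(A) either zero or infinitesimal.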

OK, now what van Inwagen needs to argue is that if E is the empty world, then { E } is ≤-null.

To do that, it suffices for him to argue that for any n, one can find n disjoint sets B1,…,Bn of worlds, such that each of these sets is at least as likely as { E }. If all worlds are equally likely, i.e., { w1 } ≤ { w2 } (and conversely) for all w1 and w2, then this is immediate: just let w1,…,wn be distinct worlds and let Bi = { wi }.

Now even if all worlds aren’t equally likely, and even if the empty world is the most likely because it is the simplest, nonetheless it could be that { E } is ≤-null. For instance, suppose that there is an infinite set S of worlds such that every world in S is somewhere between a billion and a trillion times less likely than E. Then for any n, we can find disjoint subsets B1,…,Bn of S such that Bi contains a trillion worlds for each i, and then { E } ≤ Bi, and so { E } will be ≤-null.
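The arithmetic behind this example can be written out on the added assumption, not stated in the comment, that the comparison is read off from a finitely additive weight P on worlds. If every w in S satisfies $P(\{w\}) \ge P(\{E\})/10^{12}$ and $B_i \subseteq S$ contains $10^{12}$ worlds, then:

```latex
\[
  P(B_i)
  \;=\; \sum_{w \in B_i} P(\{w\})
  \;\ge\; 10^{12} \cdot \frac{P(\{E\})}{10^{12}}
  \;=\; P(\{E\})
\]
```

Because S is infinite, n pairwise disjoint blocks of this size can be chosen for any n.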

So the above makes van Inwagen’s task easier. Easier but far from trivial.

By the way, here’s an intuition on all this stuff. There are broadly three kinds of measures on worlds. There are Humean measures predicated on the intuitive idea of equal probability. There are Leibnizian measures predicated on value. And there are linguistic measures predicated on complexity of description. I think it’s intuitively clear that something like van Inwagen’s claim should hold on any Humean measure, but not on Leibnizian or linguistic measures.

So the deep question is: Which family of measures is the appropriate one, at least for explanatory (as opposed to evidential) purposes?

And I think the right answer is that it depends on what the correct metaphysics is. On a theistic, pantheistic or axiocentric metaphysics, a Leibnizian measure is appropriate. And on Intelligent Design metaphysics–where we suppose a designer but make no suppositions about the goodness of the designer–the linguistic measure is appropriate. And on many other metaphysical views, a Humean measure will be appropriate.

February 28, 2014 — 9:45
• Thanks, Alex. I was hoping you’d jump in here! I’m a little out of my depth on the math, but I think I caught most of what you said.

The last point is indeed interesting. I think Leibniz’s remark about nothing being simplest and easiest could be taken as a recognition that a different sort of measure, something like the linguistic measure, would be appropriate if there weren’t a traditional sort of God. But I wonder what the relevant alternative for him is, because usually he seems to be thinking that God provides something like a causal explanation (and that everything needs that sort of explanation) and that’s why the argument from contingency is supposed to be demonstrative, rather than merely providing a probabilistic explanation.

Anyway, like I said, I’m not sure that PvI wants to define a rigorous measure, so something like your partial ordering approach might be amenable to him, though, as you say, it’s not trivial to get the improbability of nothing out of that approach.

February 28, 2014 — 11:46
• Alex & Kenny:

I like Alex’s comparative probability approach. Since it’s a relation, it can (perhaps) be made to scale up through all the ranks of the iterative hierarchy. After all, the subset relation is an order relation on every (x, y) in V. Other approaches seem to require redefinition as one goes higher (perhaps at every large cardinal, perhaps at more finely grained breaks in V). But I’m not clear about the technicalities here.

I’d challenge Alex to work out the linguistic approach. Are these infinitary logical languages as in Karp? That is, languages with infinitely long quantifier blocks and symbol sequences and nestings? What would complexity of description be?

We do have something like formally rigorous complexity measures in the Leibnizian sense. These are measures of “depth”, as in logical depth or parallel depth (Bennett, Antunes, Machta). They can be generalized up through all transfinite levels (or at least I’ve read). I’d argue that any linguistic description complexity will reduce to some type of depth. And I’ll also note that depth is very similar to the informal measures of intrinsic value proposed by lots of authors.

Still, I confess that this is an area where I become very confused very quickly. I tend to think that probability isn’t at all the right approach to these issues.

– Eric

February 28, 2014 — 13:18
• Alexander Pruss

Eric:

I guess the linguistic complexity measure would be something like this. Fix a language L all of whose predicates and quantifiers are fundamental, perfectly natural, whatever. I don’t know how to specify this, and I don’t know that it can be specified, but we can pretend we’ve solved this. Call that a perfectly natural language. Add to L an “end of sentence” marker, a period. Now generate a random sequence of symbols from L, stopping when you add a period (but perhaps never stopping if a period is never selected). Let S be this random sequence. (It’s a random variable.)

Let E be this event: S is a meaningful sentence and S is true in one and only one world.

Then, for any world w, let P(w) = P(S is true in w | E).

This is something very close to Solomonoff probability.

As it stands, P(w) = 0 if w cannot be L-described in a finite sentence.

But P(empty world) > 0, since the empty world can be uniquely described very briefly in a perfectly natural language, e.g., as ~Ex(x=x).
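A toy version of this construction, to show how short descriptions come to dominate. Everything specific here is an assumption made for illustration: symbols are drawn uniformly, the alphabet has four symbols plus the period, and the three "worlds" and their description lengths are invented.

```python
from fractions import Fraction

ALPHABET_SIZE = 4  # hypothetical number of symbols in L, not counting the period

def sentence_probability(length, alphabet_size=ALPHABET_SIZE):
    """Chance that uniform random typing yields one given sentence of `length`
    symbols followed immediately by the period."""
    return Fraction(1, alphabet_size + 1) ** (length + 1)

# Toy assumption: lengths of the sentences that are true in exactly one world.
DESCRIPTION_LENGTHS = {
    "empty": [3],
    "one_particle": [6, 8],
    "two_particles": [10],
}

def world_probabilities(description_lengths):
    """P(w) = P(S is true in w | E), where E says S uniquely describes some world."""
    weights = {
        world: sum(sentence_probability(n) for n in lengths)
        for world, lengths in description_lengths.items()
    }
    total = sum(weights.values())  # this is P(E) in the toy model
    return {world: weight / total for world, weight in weights.items()}

probs = world_probabilities(DESCRIPTION_LENGTHS)
for world, p in probs.items():
    print(world, float(p))
```

Each extra symbol divides a sentence's weight by the number of available symbols, so the world with the shortest unique description takes almost all of the probability. A world with no finite description never appears in the table at all, which matches P(w) = 0 in that case.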

February 28, 2014 — 14:22