The aim of this post is to give a direct proof of the theorems of measurable projection and measurable section. These are generally regarded as rather difficult results, and proofs often use ideas from descriptive set theory such as analytic sets. I did previously post a proof along those lines on this blog. However, the results can be obtained in a more direct way, which is the purpose of this post. Here, I present relatively self-contained proofs which do not require knowledge of any advanced topics beyond basic probability theory.

The projection theorem states that if $(\Omega,\mathcal F,\mathbb P)$ is a complete probability space, then the projection of a measurable subset of $\mathbb R\times\Omega$ onto $\Omega$ is measurable. To be precise, the condition is that $S$ is in the product sigma-algebra $\mathcal B(\mathbb R)\otimes\mathcal F$, where $\mathcal B(\mathbb R)$ denotes the Borel sets in $\mathbb R$, and the projection map is denoted

$$\pi_\Omega\colon\mathbb R\times\Omega\to\Omega,\qquad\pi_\Omega(t,\omega)=\omega.$$

Then, measurable projection states that $\pi_\Omega(S)\in\mathcal F$. Although it looks like a very basic property of measurable sets, maybe even obvious, measurable projection is a surprisingly difficult result to prove. In fact, the requirement that the probability space is complete is necessary and, if it is dropped, then $\pi_\Omega(S)$ need not be measurable. Counterexamples exist for commonly used measurable spaces such as $\Omega=\mathbb R$ and $\mathcal F=\mathcal B(\mathbb R)$. This suggests that there is something deeper going on here than basic manipulations of measurable sets.

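By definition, if $S\subseteq\mathbb R\times\Omega$ then, for every $\omega\in\pi_\Omega(S)$, there exists a $t\in\mathbb R$ such that $(t,\omega)\in S$. The measurable section theorem (also known as *measurable selection*) says that this choice can be made in a measurable way. That is, if $S$ is in $\mathcal B(\mathbb R)\otimes\mathcal F$ then there is a measurable section,

$$\tau\colon\pi_\Omega(S)\to\mathbb R,\qquad(\tau(\omega),\omega)\in S.$$

It is convenient to extend $\tau$ to the whole of $\Omega$ by setting $\tau=\infty$ outside of $\pi_\Omega(S)$.

The *graph* of $\tau$ is

$$[\tau]=\{(\tau(\omega),\omega)\colon\tau(\omega)\in\mathbb R\}\subseteq\mathbb R\times\Omega.$$

The condition that $(\tau(\omega),\omega)\in S$ whenever $\tau(\omega)<\infty$ can alternatively be expressed by stating that $[\tau]\subseteq S$. This also ensures that $\{\tau<\infty\}$ is a subset of $\pi_\Omega(S)$, and $\tau$ is a section of $S$ on the whole of $\pi_\Omega(S)$ if and only if $\{\tau<\infty\}=\pi_\Omega(S)$.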

The results described here can also be used to prove the optional and predictable section theorems which, at first appearances, also seem to be quite basic statements. The section theorems are fundamental to the powerful and interesting theory of optional and predictable projection which is, consequently, generally considered to be a hard part of stochastic calculus. In fact, the projection and section theorems are really not that hard to prove.

Let us consider how one might try to approach a proof of the projection theorem. As with many statements regarding measurable sets, we could try to prove the result first for certain simple sets, and then generalise to measurable sets by use of the monotone class theorem or similar. For example, let $\mathcal S$ denote the collection of all $S\in\mathcal B(\mathbb R)\otimes\mathcal F$ for which $\pi_\Omega(S)\in\mathcal F$. It is straightforward to show that any finite union of sets of the form $A\times B$ for $A\in\mathcal B(\mathbb R)$ and $B\in\mathcal F$ is in $\mathcal S$. If it could be shown that $\mathcal S$ is closed under taking limits of increasing and decreasing sequences of sets then the result would follow from the monotone class theorem. Increasing sequences are easily handled: if $S_n$ is a sequence of subsets of $\mathbb R\times\Omega$ then, from the definition of the projection map,

$$\pi_\Omega\Big(\bigcup_nS_n\Big)=\bigcup_n\pi_\Omega(S_n).$$

If $S_n\in\mathcal S$ for each $n$, this shows that the union $\bigcup_nS_n$ is again in $\mathcal S$. Unfortunately, decreasing sequences are much more problematic. If $S_n\subseteq S_m$ for all $n\ge m$ then we would like to use something like

$$\pi_\Omega\Big(\bigcap_nS_n\Big)=\bigcap_n\pi_\Omega(S_n).\qquad(1)$$

However, this identity does not hold in general. For example, consider the decreasing sequence $S_n=(n,\infty)\times\Omega$. Then, $\pi_\Omega(S_n)=\Omega$ for all $n$, but $\bigcap_nS_n$ is empty, contradicting (1). There is some interesting history involved here. In a paper published in 1905, Henri Lebesgue claimed that the projection of a Borel subset of $\mathbb R^2$ onto $\mathbb R$ is itself Borel measurable. This was based upon mistakenly applying (1). The error was spotted in around 1917 by Mikhail Suslin, who realised that the projection need not be Borel, and this led him to develop the theory of analytic sets.

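Actually, there is at least one situation where (1) can be shown to hold. Suppose that for each $\omega\in\Omega$, the slices

$$S_n(\omega)\equiv\{t\in\mathbb R\colon(t,\omega)\in S_n\}\qquad(2)$$

are compact. For each $\omega\in\bigcap_n\pi_\Omega(S_n)$, the slices $S_n(\omega)$ give a decreasing sequence of nonempty compact sets, so have nonempty intersection. So, letting $S$ be the intersection $\bigcap_nS_n$, the slice $S(\omega)=\bigcap_nS_n(\omega)$ is nonempty. Hence, $\omega\in\pi_\Omega(S)$, and (1) follows.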

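The starting point for our proof of the projection and section theorems is to consider certain special subsets of $\mathbb R\times\Omega$ where the compactness argument, as just described, can be used. The notation $\mathcal A_\delta$ is used to represent the collection of countable intersections, $\bigcap_{n=1}^\infty A_n$, of sets $A_n$ in $\mathcal A$.

Lemma 1 Let $(\Omega,\mathcal F)$ be a measurable space, and $\mathcal A$ be the collection of subsets of $\mathbb R\times\Omega$ which are finite unions $\bigcup_kC_k\times E_k$ over compact intervals $C_k\subseteq\mathbb R$ and $E_k\in\mathcal F$. Then, for any $S\in\mathcal A_\delta$, we have $\pi_\Omega(S)\in\mathcal F$, and the debut

$$\tau\colon\Omega\to\mathbb R\cup\{\infty\},\qquad\tau(\omega)=\inf\{t\in\mathbb R\colon(t,\omega)\in S\}$$

is a measurable map with $[\tau]\subseteq S$ and $\{\tau<\infty\}=\pi_\Omega(S)$.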

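*Proof:* Noting that $\mathcal F$ and the collection of compact intervals in $\mathbb R$ are closed under pairwise intersection, the same is true for $\mathcal A$. Then, for $S\in\mathcal A_\delta$ there exists, by definition, $S_n\in\mathcal A$ such that $S=\bigcap_nS_n$. Replacing $S_n$ by $S_1\cap\cdots\cap S_n$ if necessary, we may suppose that $S_n$ is a decreasing sequence.

Now, the slices $S_n(\omega)$ defined by (2) are finite unions of compact intervals, so are compact. The compactness argument explained above implies that

$$\pi_\Omega(S)=\bigcap_n\pi_\Omega(S_n).\qquad(3)$$

As each $S_n$ is a finite union $\bigcup_kC_k\times E_k$ for $E_k\in\mathcal F$ and nonempty $C_k$, the projection $\pi_\Omega(S_n)=\bigcup_kE_k$ is in $\mathcal F$. Then, (3) shows that $\pi_\Omega(S)$ is also in $\mathcal F$.

If $\tau$ is the debut of $S$, then $\tau(\omega)=\inf S(\omega)$. This immediately implies $\{\tau<\infty\}=\pi_\Omega(S)$ and, as nonempty compact sets contain their infimum, $[\tau]\subseteq S$. For every $t\in\mathbb R$, the set $S^t=S\cap((-\infty,t]\times\Omega)$ is in $\mathcal A_\delta$ (as $S\subseteq S_1$ is bounded in the time variable, the half-line can be replaced by a compact interval) and,

$$\{\tau\le t\}=\pi_\Omega(S^t)\in\mathcal F,$$

showing that $\tau$ is measurable. ⬜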
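To make lemma 1 concrete, here is a minimal sketch for the simplest case: a finite sample space and a set in $\mathcal A$ itself (a finite union of compact intervals times events), rather than a general element of $\mathcal A_\delta$. The representation as `(a, b, E)` triples and all function names are assumptions made purely for illustration.

```python
from math import inf

# A set in the collection A of lemma 1, over a finite sample space:
# a finite union of products [a, b] x E, stored as (a, b, E) triples
# with a <= b and E a frozenset of sample points (illustrative encoding).

def slice_at(S, omega):
    """Compact intervals making up the slice S(omega) = {t : (t, omega) in S}."""
    return [(a, b) for (a, b, E) in S if omega in E]

def projection(S):
    """pi_Omega(S): the sample points whose slice is nonempty."""
    out = set()
    for (_a, _b, E) in S:
        out |= E
    return out

def debut(S, omega):
    """tau(omega) = inf S(omega), with the infimum of the empty set being +infinity."""
    return min((a for (a, _b) in slice_at(S, omega)), default=inf)

def in_set(S, t, omega):
    """Whether the point (t, omega) lies in S."""
    return any(a <= t <= b for (a, b) in slice_at(S, omega))

OMEGA = {"w1", "w2", "w3"}
S = [(0.0, 1.0, frozenset({"w1"})), (2.0, 3.0, frozenset({"w1", "w2"}))]

# The debut is finite exactly on the projection, and its graph lies in S.
finite_set = {w for w in OMEGA if debut(S, w) < inf}
graph_in_S = all(in_set(S, debut(S, w), w) for w in finite_set)
```

Because every slice is a finite union of compact intervals, the infimum is attained at a left endpoint, which is why the graph of the debut stays inside the set.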

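When dealing with more general subsets of $\mathbb R\times\Omega$, it will not necessarily be the case that the projection onto $\Omega$ is measurable. For that reason, we extend the probability measure to more general subsets of $\Omega$. For a probability space $(\Omega,\mathcal F,\mathbb P)$, define an *outer measure* on the power set $\mathcal P(\Omega)$ by approximating $A\subseteq\Omega$ from above by measurable sets,

$$\mathbb P^*(A)=\inf\{\mathbb P(B)\colon B\in\mathcal F,\ A\subseteq B\}.\qquad(4)$$

The outer measure has the following basic properties.

Lemma 2 For a probability space $(\Omega,\mathcal F,\mathbb P)$, the outer measure $\mathbb P^*$ is increasing and continuous along increasing sequences. That is, $\mathbb P^*(A)\le\mathbb P^*(B)$ for $A\subseteq B$, and $\mathbb P^*(A_n)\to\mathbb P^*(A)$ for sequences $A_n\subseteq\Omega$ increasing to a limit $A$.

Furthermore, for any $A\subseteq\Omega$, there exists $B\supseteq A$ in $\mathcal F$ with $\mathbb P(B)=\mathbb P^*(A)$.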

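*Proof:* The fact that $\mathbb P^*$ is increasing is immediate from the definition. Now, let $A_n\subseteq\Omega$ be increasing to the limit $A$. By the definition of $\mathbb P^*$, there exists $B_n\supseteq A_n$ in $\mathcal F$ with

$$\mathbb P(B_n)\le\mathbb P^*(A_n)+1/n.$$

Replacing $B_n$ by $\bigcap_{m\ge n}B_m$ if necessary, we may suppose that $B_n$ is an increasing sequence. Then, $B=\bigcup_nB_n\supseteq A$ is in $\mathcal F$ and, by monotone convergence,

$$\mathbb P^*(A)\le\mathbb P(B)=\lim_n\mathbb P(B_n)\le\lim_n\mathbb P^*(A_n)\le\mathbb P^*(A).$$

So, $\mathbb P^*(A_n)\to\mathbb P^*(A)$ as required. Incidentally, taking $A_n=A$ for all $n$, this also shows that there is a $B\supseteq A$ in $\mathcal F$ with $\mathbb P(B)=\mathbb P^*(A)$. ⬜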
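As a sanity check on definition (4), the outer measure can be computed by brute force on a finite probability space whose sigma-algebra is generated by a partition. This is only a sketch under that finiteness assumption; the function names and the toy space below are hypothetical and chosen for illustration.

```python
from fractions import Fraction
from itertools import combinations

def sigma_algebra_from_partition(blocks):
    """All unions of partition blocks: the sigma-algebra the partition generates."""
    blocks = [frozenset(b) for b in blocks]
    events = []
    for r in range(len(blocks) + 1):
        for combo in combinations(blocks, r):
            events.append(frozenset().union(*combo))
    return events

def outer_measure(A, events, prob):
    """P*(A): the least P(B) over measurable B containing A (finite case of (4))."""
    A = frozenset(A)
    return min(sum(prob[w] for w in B) for B in events if A <= B)

# Toy space: four equally likely points, but the points 1 and 2 cannot be
# separated by any measurable set.
prob = {w: Fraction(1, 4) for w in (1, 2, 3, 4)}
events = sigma_algebra_from_partition([{1, 2}, {3}, {4}])

p_single = outer_measure({1}, events, prob)      # smallest cover is {1, 2}
p_pair = outer_measure({1, 3}, events, prob)     # smallest cover is {1, 2, 3}
```

In this finite setting the infimum in (4) is a minimum, so the measurable cover of lemma 2 is simply whichever event attains it.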

I now move on to the main component of the proof of the projection and section theorems. This will allow us to approximate measurable subsets of $\mathbb R\times\Omega$ from below by sets in $\mathcal A_\delta$, as defined in lemma 1 above. While the statement of theorem 3 is simple enough, the proof can get a bit tricky. The method used here is elementary and, although the argument is intricate, no advanced mathematics is required. The definition of $\mathcal M(\mathcal A)$ means that it is the minimal collection of subsets of $X$ which contains $\mathcal A$ and is closed under taking limits of increasing and decreasing sequences. I refer to the result as the ‘capacitability theorem’ as it is a version of Choquet’s capacitability theorem although, here, we do not involve the concept of analytic sets. A set $A\subseteq X$ can be called capacitable if, for each $K<I(A)$, there exists a decreasing sequence $A_n\in\mathcal A$ with $\bigcap_nA_n\subseteq A$ and $I(A_n)>K$. So, theorem 3 is saying that all sets in $\mathcal M(\mathcal A)$ are capacitable.

Theorem 3 (Capacitability Theorem) Let $X$ be a set, $\mathcal A\subseteq\mathcal P(X)$ be closed under pairwise intersections, and $I\colon\mathcal P(X)\to\mathbb R$ be increasing and continuous along increasing sequences. Denote the closure of $\mathcal A$ under limits of increasing and of decreasing sequences by $\mathcal M(\mathcal A)$.

Then, for any $A\in\mathcal M(\mathcal A)$ and $K\in\mathbb R$ with $I(A)>K$, there exists a decreasing sequence $A_n\in\mathcal A$ with $\bigcap_nA_n\subseteq A$ and $I(A_n)>K$ for all $n$.

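*Proof:* Fixing $K\in\mathbb R$, let $\mathcal I$ denote the collection of all $A\subseteq X$ with $I(A)>K$. The assumptions on $I$ mean that for any $A\in\mathcal I$ then every $B\supseteq A$ is in $\mathcal I$ and, for any sequence $A_n\subseteq X$ increasing to $A$, then $A_n\in\mathcal I$ for large $n$.

The proof of the theorem amounts to finding a collection $\mathcal E\subseteq\mathcal P(X)$ containing $\mathcal A$ and closed under taking limits of increasing and decreasing sequences, such that, for every $A\in\mathcal E\cap\mathcal I$, we can construct a decreasing sequence $A_n\in\mathcal A\cap\mathcal I$ with $\bigcap_nA_n\subseteq A$. In that case, every $A\in\mathcal M(\mathcal A)\cap\mathcal I$ will also be in $\mathcal E$, and the claimed result will follow.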

The main difficulty in the proof is to describe a collection $\mathcal E$ with the required properties. One way of doing this is as follows, and can be described in terms of a game. For $A\subseteq X$, consider the following infinite game played between two players, who take turns choosing sets from $\mathcal I$. Starting with $B_0=A$, at rounds $n=1,2,\ldots$, the players make the following moves.

- Player 1 chooses an $A_n\subseteq B_{n-1}$ in $\mathcal I$.
- Player 2 chooses a $B_n\subseteq A_n$ in $\mathcal I$.

At each round, both players can, at least, make a valid move (provided that $A$ itself is in $\mathcal I$). For example, player 1 can set $A_n=B_{n-1}$ and player 2 can set $B_n=A_n$. We say that player 2 wins the game if, once completed, she is able to find a sequence $C_n\supseteq B_n$ in $\mathcal A$ with $\bigcap_nC_n\subseteq A$.

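For any $A\subseteq X$, denote the game described above by $\mathcal G(A)$. A strategy (for player 2) is just a sequence of functions $f_n\colon\mathcal I^n\to\mathcal I$ satisfying

$$f_n(A_1,\ldots,A_n)\subseteq A_n.$$

The idea is that $f_n(A_1,\ldots,A_n)$ represents player 2’s choice for $B_n$ at round $n$, given that player 1 has chosen $A_1,\ldots,A_n$ so far. It is a *winning strategy* if, for any sequence $A_n\in\mathcal I$ satisfying

$$A_1\subseteq A,\qquad A_{n+1}\subseteq f_n(A_1,\ldots,A_n)\qquad(5)$$

for each $n\ge1$, then there exists a sequence $C_n\in\mathcal A$ with

$$f_n(A_1,\ldots,A_n)\subseteq C_n,\qquad\bigcap_nC_n\subseteq A.\qquad(6)$$

Now, let $\mathcal E$ be the collection of $A\subseteq X$ for which the game $\mathcal G(A)$ has a winning strategy. The case with $A\in\mathcal A$ is easy. Any strategy is a winning strategy simply by taking $C_n=A$ in (6). For $f_n$ we may as well take $f_n(A_1,\ldots,A_n)=A_n$, which is a valid strategy.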

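Now, consider a sequence $A^r\in\mathcal E$ and let $f^r$ be winning strategies for $\mathcal G(A^r)$. Construct a winning strategy $f$ for $\mathcal G(A)$, with $A=\bigcap_rA^r$, as follows. Choose a bijection $\theta\colon\mathbb N^2\to\mathbb N$ such that $\theta(r,s)$ is increasing in $s$. For example, take $\theta(r,s)=2^{r-1}(2s-1)$. Then for $n=\theta(r,s)$ and $A_1,\ldots,A_n\in\mathcal I$, write

$$f_n(A_1,\ldots,A_n)=f^r_s\big(A_{\theta(r,1)},A_{\theta(r,2)},\ldots,A_{\theta(r,s)}\big).$$

It can be seen that this is a winning strategy. If (5) is satisfied, then the sequence $A_n$ is decreasing, so (5) will also hold for the sequence $A_{\theta(r,s)}$ over $s=1,2,\ldots$ (for the game $\mathcal G(A^r)$). As $f^r$ is a winning strategy for $\mathcal G(A^r)$, there exists $C^r_s$ in $\mathcal A$ satisfying $f^r_s(A_{\theta(r,1)},\ldots,A_{\theta(r,s)})\subseteq C^r_s$ and $\bigcap_sC^r_s\subseteq A^r$. In particular, writing $C_{\theta(r,s)}=C^r_s$ gives

$$\bigcap_nC_n=\bigcap_r\bigcap_sC^r_s\subseteq\bigcap_rA^r=A,$$

so (6) is satisfied, and $A\in\mathcal E$.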
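The interleaving step only needs some bijection from pairs of positive integers to positive integers that is increasing in its second argument. A minimal sketch, assuming the standard pairing $\theta(r,s)=2^{r-1}(2s-1)$ (the function names are illustrative), checks both properties numerically on a finite range.

```python
def theta(r, s):
    """Map a pair (r, s) of positive integers to 2^(r-1) * (2s - 1)."""
    return 2 ** (r - 1) * (2 * s - 1)

def theta_inverse(n):
    """Recover (r, s) from n by stripping factors of two from n."""
    r = 1
    while n % 2 == 0:
        n //= 2
        r += 1
    return r, (n + 1) // 2

# Every positive integer factors uniquely as a power of two times an odd
# number, which is why theta is a bijection.
round_trip_ok = all(theta(*theta_inverse(n)) == n for n in range(1, 2000))

# For fixed r, theta(r, s) is strictly increasing in s, so sub-game r sees
# its own rounds in the correct order inside the interleaved game.
increasing_ok = all(
    theta(r, s) < theta(r, s + 1) for r in range(1, 10) for s in range(1, 50)
)
```

With this choice, sub-game $r=1$ is played on the odd rounds, $r=2$ on rounds $2,6,10,\ldots$, and so on, so every sub-game gets infinitely many rounds.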

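If $A^r\in\mathcal E$ is increasing with limit $A=\bigcup_rA^r$, and $f^r$ are winning strategies for $\mathcal G(A^r)$, construct a winning strategy $f$ for $\mathcal G(A)$ as follows. For any $A_1\in\mathcal I$ with $A_1\subseteq A$, the sequence $A_1\cap A^r$ increases to $A_1$. Hence, there is a minimum $r$ such that $A_1\cap A^r\in\mathcal I$. Set,

$$f_n(A_1,A_2,\ldots,A_n)=f^r_n(A_1\cap A^r,A_2,\ldots,A_n).$$

For $A_1\not\subseteq A$ then we do not really care, so can just take $f_n(A_1,\ldots,A_n)=A_n$. This clearly gives a valid strategy. To see that it is a winning strategy, suppose that (5) is satisfied. Setting $A^\prime_1=A_1\cap A^r$ and $A^\prime_n=A_n$ for $n>1$, we see that (5) is also satisfied with $A^\prime_n$ in place of $A_n$, $f^r$ in place of $f$, and $A^r$ in place of $A$. So, as $f^r$ is a winning strategy for the game $\mathcal G(A^r)$, there exists a sequence $C_n\in\mathcal A$ with

$$f_n(A_1,\ldots,A_n)=f^r_n(A^\prime_1,\ldots,A^\prime_n)\subseteq C_n,\qquad\bigcap_nC_n\subseteq A^r\subseteq A.$$

So, $f$ is a winning strategy for $\mathcal G(A)$ and, hence, $A\in\mathcal E$.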

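We have shown that $\mathcal E$ contains $\mathcal A$ and is closed under taking limits of increasing and decreasing sequences and, so, contains $\mathcal M(\mathcal A)$. Finally, for any $A\in\mathcal M(\mathcal A)\cap\mathcal I$, let $f$ be a winning strategy for $\mathcal G(A)$ and define a sequence $A_n\in\mathcal I$ by $A_1=A$ and

$$A_{n+1}=f_n(A_1,\ldots,A_n)$$

for all $n\ge1$. As $f$ is a winning strategy, there exists a sequence $C_n\in\mathcal A$ satisfying (6). Replacing $C_n$ by $C_1\cap\cdots\cap C_n$ if required, we can suppose that the sequence $C_n$ is decreasing. Finally, as $C_n\supseteq f_n(A_1,\ldots,A_n)\in\mathcal I$, we have $C_n\in\mathcal I$ as required. ⬜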

The argument above is along similar lines to the ‘rabotages de Sierpinski’ used by Dellacherie, Ensembles aléatoires II (1969). Although the description of the collection $\mathcal E$ in terms of winning strategies of the games $\mathcal G(A)$ may not seem like an obvious approach, it is really quite natural. As a first attempt to prove the result, we could try defining $\mathcal E$ to be the collection of sets for which the conclusion of the theorem holds. That is, the sets $A$ for which there is a decreasing sequence $A_n\in\mathcal A\cap\mathcal I$ with $\bigcap_nA_n\subseteq A$. We would then have to show that $\mathcal E$ is closed under taking limits of increasing and decreasing sequences. While increasing sequences are easy to deal with, decreasing ones are problematic. Suppose that $A_n\in\mathcal E$ decreases to $A$ and that, for each $n$, there is a decreasing sequence $A_{n,m}\in\mathcal A\cap\mathcal I$ with $\bigcap_mA_{n,m}\subseteq A_n$. To construct a sequence of sets $C_k\in\mathcal A\cap\mathcal I$ we could try to do the following. Reorder the doubly-indexed sequence $A_{n,m}$ into a singly-indexed one, and set $C_{\theta(n,m)}=A_{n,m}$ for a bijection $\theta\colon\mathbb N^2\to\mathbb N$. Then, it is clear that $C_k\in\mathcal A\cap\mathcal I$ and $\bigcap_kC_k\subseteq A$. However, $C_k$ is not decreasing. We could try to ensure that it is decreasing by setting

$$C^\prime_k=C_1\cap C_2\cap\cdots\cap C_k.$$

Unfortunately, it is no longer necessarily true that $C^\prime_k$ is in $\mathcal I$. When we take intersections we need no longer be in $\mathcal I$. The easiest way around this, it seems, is to allow the choice of $A_{n,m}$ to depend on the previous choices of $C_k$. That is, the choice of $C_k$ should depend on $C_1,\ldots,C_{k-1}$ so as to enforce the condition that $C_1\cap\cdots\cap C_k$ is in $\mathcal I$. This leads, essentially, to the requirement of winning strategies for the games $\mathcal G(A)$ as described in the proof of theorem 3.

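We use theorem 3 to show that measurable subsets of $\mathbb R\times\Omega$ can be approximated from below by sets in $\mathcal A_\delta$.

Corollary 4 Let $(\Omega,\mathcal F,\mathbb P)$ be a probability space and $\mathcal A$ be the collection of subsets of $\mathbb R\times\Omega$ given in lemma 1. Then, for any $S\in\mathcal B(\mathbb R)\otimes\mathcal F$ and $\epsilon>0$, there exists $A\subseteq S$ in $\mathcal A_\delta$ satisfying

$$\mathbb P(\pi_\Omega(A))\ge\mathbb P^*(\pi_\Omega(S))-\epsilon.$$

*Proof:* Setting $X=\mathbb R\times\Omega$, define

$$I\colon\mathcal P(X)\to\mathbb R,\qquad I(A)=\mathbb P^*(\pi_\Omega(A)).$$

This is clearly increasing. Also, if $A_n\subseteq X$ is increasing to a limit $A$ then $\pi_\Omega(A_n)$ increases to $\pi_\Omega(A)$. Lemma 2 implies that $I(A_n)\to I(A)$, and $I$ is continuous along increasing sequences.

As the complement of a compact interval in $\mathbb R$ is a countable union of compact intervals, the complement of any $A\in\mathcal A$ is a countable union of sets in $\mathcal A$. The monotone class theorem then says that the closure $\mathcal M(\mathcal A)$ of $\mathcal A$ under limits of increasing and decreasing sequences is the entire sigma-algebra generated by $\mathcal A$. Hence,

$$\mathcal M(\mathcal A)=\sigma(\mathcal A)=\mathcal B(\mathbb R)\otimes\mathcal F.$$

We apply theorem 3. For $S\in\mathcal B(\mathbb R)\otimes\mathcal F$ and $\epsilon>0$, setting $K=I(S)-\epsilon$, there exists a decreasing sequence $A_n\in\mathcal A$ with $\bigcap_nA_n\subseteq S$ and $I(A_n)>K$. Take $A=\bigcap_nA_n$ which is in $\mathcal A_\delta$. As in the proof of lemma 1, $\pi_\Omega(A_n)$ decreases to $\pi_\Omega(A)$. By monotone convergence,

$$\mathbb P(\pi_\Omega(A))=\lim_n\mathbb P(\pi_\Omega(A_n))\ge K=\mathbb P^*(\pi_\Omega(S))-\epsilon$$

as required. ⬜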

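Combining this result with the statement, in lemma 1, of measurable projection for sets in $\mathcal A_\delta$ gives the measurable projection theorem.

Theorem 5 (Measurable Projection) Let $(\Omega,\mathcal F,\mathbb P)$ be a complete probability space, and $S\in\mathcal B(\mathbb R)\otimes\mathcal F$. Then, $\pi_\Omega(S)\in\mathcal F$.

*Proof:* By corollary 4, for each positive integer $n$, there is an $S_n\subseteq S$ in $\mathcal A_\delta$ with

$$\mathbb P(\pi_\Omega(S_n))\ge\mathbb P^*(\pi_\Omega(S))-1/n.\qquad(7)$$

We know from lemma 1 that $\pi_\Omega(S_n)$ are measurable, so $A=\bigcup_n\pi_\Omega(S_n)$ is in $\mathcal F$, is contained in $\pi_\Omega(S)$, and satisfies $\mathbb P(A)\ge\mathbb P^*(\pi_\Omega(S))$. Lemma 2 states that there is a $B\supseteq\pi_\Omega(S)$ in $\mathcal F$ and satisfying $\mathbb P(B)=\mathbb P^*(\pi_\Omega(S))$.

We have constructed sets $A\subseteq\pi_\Omega(S)\subseteq B$ in $\mathcal F$ and satisfying $\mathbb P(B\setminus A)=0$. By definition, this means that $\pi_\Omega(S)$ is in the completion of $\mathcal F$ and, if the probability space is complete, it is in $\mathcal F$. ⬜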

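In a similar way, corollary 4 combined with the statement of measurable section for sets in $\mathcal A_\delta$, given by lemma 1, gives the measurable section theorem.

Theorem 6 (Measurable Section) Let $(\Omega,\mathcal F,\mathbb P)$ be a probability space and $S\in\mathcal B(\mathbb R)\otimes\mathcal F$. Then, there exists a measurable $\tau\colon\Omega\to\mathbb R\cup\{\infty\}$, such that $[\tau]\subseteq S$ and $\pi_\Omega(S)\setminus\{\tau<\infty\}$ is $\mathbb P$-null.

*Proof:* As in the proof of theorem 5, there is a sequence $S_n\subseteq S$ in $\mathcal A_\delta$ satisfying (7). Replacing $S_n$ by $S_1\cup\cdots\cup S_n$ if necessary, we suppose that the sequence $S_n$ is increasing. Let $\tau_n$ be the debut of $S_n$. Lemma 1 states that this is measurable and $[\tau_n]\subseteq S_n$. Define a random time $\tau$ by,

$$\tau(\omega)=\begin{cases}\tau_n(\omega),&\text{for }\omega\in\pi_\Omega(S_n)\setminus\pi_\Omega(S_{n-1}),\\ \infty,&\text{for }\omega\notin\bigcup_n\pi_\Omega(S_n).\end{cases}$$

(I am using $S_0=\emptyset$). This is measurable with graph contained in $S$ and,

$$\mathbb P(\tau<\infty)=\lim_n\mathbb P(\pi_\Omega(S_n))\ge\mathbb P^*(\pi_\Omega(S)).$$

By lemma 2, there exists $B\in\mathcal F$ containing $\pi_\Omega(S)$ with $\mathbb P(B)=\mathbb P^*(\pi_\Omega(S))$. So, $B\setminus\{\tau<\infty\}$ has zero probability and contains $\pi_\Omega(S)\setminus\{\tau<\infty\}$, which is $\mathbb P$-null as required. ⬜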

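Finally, we state the theorem for complete probability spaces, in which case the section is defined on all of $\pi_\Omega(S)$, and not just up to a $\mathbb P$-null set.

Theorem 7 (Measurable Section) Let $(\Omega,\mathcal F,\mathbb P)$ be a complete probability space and $S\in\mathcal B(\mathbb R)\otimes\mathcal F$. Then, there exists a measurable $\tau\colon\Omega\to\mathbb R\cup\{\infty\}$, such that $[\tau]\subseteq S$ and $\{\tau<\infty\}=\pi_\Omega(S)$.

*Proof:* By theorem 6 there exists a measurable map $\sigma\colon\Omega\to\mathbb R\cup\{\infty\}$ such that $[\sigma]\subseteq S$ and $\pi_\Omega(S)\setminus\{\sigma<\infty\}$ is $\mathbb P$-null. Define $\tau\colon\Omega\to\mathbb R\cup\{\infty\}$ by

$$\tau(\omega)=\begin{cases}\sigma(\omega),&\text{if }\sigma(\omega)<\infty,\\ \infty,&\text{if }\omega\notin\pi_\Omega(S),\\ \text{some }t\in S(\omega),&\text{otherwise}.\end{cases}$$

Here, $S(\omega)$ represents the slice of $S$ defined as in (2). We do not care about which $t$ is chosen in the third case but, as $S(\omega)$ is nonempty on $\pi_\Omega(S)$, a choice does exist. By construction, $[\tau]\subseteq S$, $\{\tau<\infty\}=\pi_\Omega(S)$, and $\tau=\sigma$ almost surely. As $\sigma$ is measurable, completeness of the probability space implies that $\tau$ is also measurable. ⬜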

Hi the formalization of is a little confusing to me. Do you mean something like the following ?

For we set :

, such that

In which case the “measurability” of is not completely clear to me.

Moreover the graph seems also a little ambiguous as what arrows point to are and not (which are parts of unless mistaken).

Comment by TheBridge — 16 January 19 @ 12:58 PM |

I don’t follow what you are saying. In the second paragraph, is a map from (which is a subset of ) to . So, and . You seem to be suggesting that is an element of , rather than a subset, and that is a subset of instead of an element.

Comment by George Lowther — 16 January 19 @ 11:34 PM |

On reflection, maybe you are not suggesting that is an element of and it is just a typo in your comment. However, it still looks like you are suggesting that is a subset of , which is not the case.

Comment by George Lowther — 16 January 19 @ 11:38 PM |

As you spotted, it’s a typo; I meant instead of , sorry about that. Coming back to my point, maybe I was confused about this quote:

“…hat is, if is in then there is a measurable section, ”

So shouldn’t you write instead (as I understand your answer to my comment) ?

My second point in this regard is pointless and you can delete my other post. I also note that you make clear a “language abuse” shortly after all this when you write :

“For brevity, the statement above will also be expressed by writing .” But I think it’s a bit early at this stage in your post to use this convention.

Last, let me correct you on one thing. It’s definitely not you who is happy to see me back (but I feel honored about that nevertheless, so thanks); it is indeed me who is happy to see more posts from you on this amazing blog… I have seen guys on the MO forum who do not dare to quote and refer to this blog in their papers, as it is not “OK” to refer to a blog, but who can’t find in the literature equivalent theorems claimed and proved in such a clear and self-contained manner… that says it all.

Comment by TheBridge — 17 January 19 @ 1:21 PM

Ah, I fixed the typo which caused your confusion, but it probably occurs elsewhere, so will fix properly later. I’ll also reread through and consider your suggestion regarding the notation when I have some time to properly edit this. Thanks!

Comment by George Lowther — 17 January 19 @ 1:52 PM

Last comment: maybe it would be simpler to switch the axes in your graph illustrating , as it’s a function of and not the other way around. Regards

Comment by TheBridge — 18 January 19 @ 8:13 AM

Rather than changing the graph, maybe it would be better to change the order of the Cartesian products throughout. instead of . That would be consistent with earlier stochastic calculus posts.

Comment by George Lowther — 18 January 19 @ 9:56 AM

Another point is the ambiguity on the notation of sets in the counterexample after equation (2) to illustrate the missing property needed for application of MCT. In some cases it’s a set in ( and shortly after it is in unless mistaken. Regards

Comment by TheBridge — 18 January 19 @ 10:36 AM

I don’t think there’s ambiguity. is a subset of , whereas is a subset of .

Comment by George Lowther — 18 January 19 @ 10:56 AM

I changed the order of all cartesian products. Let me know what you think – if it is better, I’ll update the other posts.

Comment by George Lowther — 18 January 19 @ 12:04 PM

Sorry a few more remarks (I am reading your post very slowly as you can notice;-) ):

-In the end I think that it would be nice to formalize the notion of “section”, by a fully fledged “definition 1” .

-Using the in your definition (2) of is a bit hard to follow for me as it is easy to forget that it’s only a compact of when is used and a part of when is dropped.

-You say that a decreasing sequence of compact that’s unless mistaken a theorem from Cantor, could be worth mentioning to be self contained :https://en.wikipedia.org/wiki/Cantor%27s_intersection_theorem

-The end of the argument for the “compact” example could be detailed a little bit more, I think. I quote this part:

“For each $\omega\in\bigcap_n\pi_\Omega(S_n)$, the slices $S_n(\omega)$ give a decreasing sequence of nonempty compact sets, so has nonempty intersection. So, letting $S$ be the intersection $\bigcap_nS_n$, the slice $S(\omega)=\bigcap_nS_n(\omega)$ is nonempty. Hence, $\omega\in\pi_\Omega(S)$, and (1) follows.”

So you proved then the slice of the intersection , namely is nonempty (part of ) in the first part and this is OK for me. But then the fact that this proves that still need a little more clarification even if it might seem trivial to you. So for your last claim to be true, I think you need to prove the following property :

For all nonempty and , we have :

.

Proof :

, by definition of a slice if it’s not empty then for then so that .

Let’s take a look at the “contrapositive” (i.e. non ): if then there is no such that and , and we are done. End of proof. Does that seem OK to you?

Comment by TheBridge — 22 January 19 @ 1:48 PM

And, nice to see you again, TheBridge!

Comment by George Lowther — 16 January 19 @ 11:40 PM |