The Burkholder-Davis-Gundy inequality is a remarkable result relating the maximum of a local martingale with its quadratic variation. Recall that [X] denotes the quadratic variation of a process X, and $X^*_t\equiv\sup_{s\le t}|X_s|$ is its maximum process.
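Theorem 1 (Burkholder-Davis-Gundy) For any $1\le p<\infty$ there exist positive constants $c_p, C_p$ such that, for all local martingales X with $X_0=0$ and stopping times $\tau$, the following inequality holds.
$$c_p\,\mathbb{E}\left[[X]_\tau^{p/2}\right]\le\mathbb{E}\left[(X^*_\tau)^p\right]\le C_p\,\mathbb{E}\left[[X]_\tau^{p/2}\right]. \qquad (1)$$
Furthermore, for continuous local martingales, this statement holds for all $0<p<\infty$.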
A proof of this result is given below. For $p\ge1$, the theorem can also be stated as follows. The set of all cadlag martingales X starting from zero for which $\mathbb{E}[(X^*_\infty)^p]$ is finite is a vector space, and the BDG inequality states that the norms $X\mapsto\Vert X^*_\infty\Vert_p$ and $X\mapsto\Vert[X]^{1/2}_\infty\Vert_p$ are equivalent.
The special case p=2 is the easiest to handle, and we have previously seen that the BDG inequality does indeed hold in this case with constants $c_2=1$, $C_2=4$. The significance of Theorem 1, then, is that this extends to all $p\ge1$.
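The p=2 case can be checked by brute force on a small discrete-time martingale. The sketch below is only an illustration, not part of the proof: it assumes a hypothetical coin-flip martingale whose step size depends on the past, enumerates every path, and compares $\mathbb{E}[[X]_n]$ with $\mathbb{E}[(X^*_n)^2]$. The bounds with constants 1 and 4 follow from $\mathbb{E}[X_n^2]=\mathbb{E}[[X]_n]$ and Doob's $L^2$ inequality.

```python
from itertools import product

def bdg_moments(n_steps, p):
    """Return (E[[X]_n^(p/2)], E[(X*_n)^p]) by enumerating all 2^n coin-flip paths."""
    qv_moment = 0.0
    max_moment = 0.0
    weight = 0.5 ** n_steps  # every path is equally likely
    for signs in product((-1, 1), repeat=n_steps):
        x = 0.0            # current value X_k, with X_0 = 0
        qv = 0.0           # quadratic variation [X]_k
        running_max = 0.0  # maximum process X*_k
        ups = 0
        for eps in signs:
            # Hypothetical rule: the step size depends only on the past,
            # so it is predictable and X remains a martingale.
            size = 1.0 + (ups % 2)
            x += eps * size
            qv += size ** 2
            running_max = max(running_max, abs(x))
            if eps == 1:
                ups += 1
        qv_moment += weight * qv ** (p / 2)
        max_moment += weight * running_max ** p
    return qv_moment, max_moment

qv2, max2 = bdg_moments(10, 2)
print(qv2, max2, max2 / qv2)  # the ratio should lie between 1 and 4
```

Calling `bdg_moments(10, p)` for other values of p gives a feel for the ratio of the two moments, although the code makes no claim about the best constants.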
One reason why the BDG inequality is useful in the theory of stochastic integration is as follows. Whereas the behaviour of the maximum of a stochastic integral is difficult to describe, the quadratic variation satisfies the simple identity $\left[\int\xi\,dX\right]=\int\xi^2\,d[X]$. Recall, also, that stochastic integration preserves the local martingale property but does not, in general, preserve the martingale property: integration with respect to a martingale only results in a local martingale, even for bounded integrands. In many cases, however, stochastic integrals are indeed proper martingales. The Ito isometry shows that this is true for square integrable martingales, and the BDG inequality allows us to extend the result to all $L^p$-integrable martingales, for $p>1$.
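Theorem 2 Let X be a cadlag $L^p$-integrable martingale for some $1<p<\infty$, so that $\mathbb{E}[|X_t|^p]<\infty$ for each t. Then, for any bounded predictable process $\xi$, $Y\equiv\int\xi\,dX$ is also an $L^p$-integrable martingale.
Proof: Without loss of generality, suppose that $X_0=0$. If $|\xi|\le K$ for a constant K, then $[Y]=\int\xi^2\,d[X]\le K^2[X]$. Applying (1) gives
$$\mathbb{E}\left[(Y^*_t)^p\right]\le C_p\,\mathbb{E}\left[[Y]_t^{p/2}\right]\le K^pC_p\,\mathbb{E}\left[[X]_t^{p/2}\right]\le K^pC_pc_p^{-1}\,\mathbb{E}\left[(X^*_t)^p\right]. \qquad (2)$$
Also, by Doob’s inequality, $\mathbb{E}[(X^*_t)^p]\le(p/(p-1))^p\,\mathbb{E}[|X_t|^p]<\infty$. So, $Y^*_t$ is $L^p$-integrable. In particular, Y is a local martingale of class (DL), so is a proper martingale.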
The result does not hold for p=1, which would include all martingales. Instead, it is necessary to impose the condition that the maximum process is integrable.
Theorem 3 Let X be a cadlag martingale such that $X^*_t$ is integrable for each t. Then, for any bounded predictable $\xi$, $Y\equiv\int\xi\,dX$ is a martingale and $Y^*_t$ is integrable.
Proof: Without loss of generality, suppose that $X_0=0$. Then, applying (2) with p=1 shows that $Y^*_t$ is integrable. So, Y is a local martingale of class (DL) and is a proper martingale.
It was previously shown that local integrability of $\int\xi^2\,d[X]$ is a sufficient condition for a predictable process $\xi$ to be X-integrable, for a local martingale X. The BDG inequality enables us to reduce this to the weaker condition of local integrability of $\left(\int\xi^2\,d[X]\right)^{1/2}$.
Lemma 4 Let X be a local martingale. Then, for any predictable process $\xi$, the following are equivalent.
1. $\xi$ is X-integrable and $\int\xi\,dX$ is a local martingale.
2. $\left(\int\xi^2\,d[X]\right)^{1/2}$ is locally integrable.
Proof: If $\xi$ is X-integrable and $Y=\int\xi\,dX$ then $[Y]=\int\xi^2\,d[X]$. The local martingale condition for Y is equivalent to local integrability of Y. As it has jumps $\Delta Y=\xi\Delta X$, this is equivalent to local integrability of $\sup_{s\le t}|\Delta Y_s|$, which is the same as local integrability of $\sup_{s\le t}(\Delta[Y]_s)^{1/2}$ or, equivalently, local integrability of $[Y]^{1/2}$. So, if $\xi$ is X-integrable then the local martingale property for Y is equivalent to local integrability of $\left(\int\xi^2\,d[X]\right)^{1/2}$. This shows that the first property implies the second.
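For the converse, suppose that 2 holds. By localization, we may suppose that $\left(\int_0^\infty\xi^2\,d[X]\right)^{1/2}$ is integrable. To prove that $\xi$ is X-integrable, it needs to be shown that if $|\alpha^n|\le|\xi|$ is a sequence of bounded predictable processes tending to zero then $\int_0^t\alpha^n\,dX$ tends to zero in probability as n goes to infinity. Dominated convergence for Lebesgue-Stieltjes integration implies that $\int_0^t(\alpha^n)^2\,d[X]$ tends to zero almost surely. As $\left(\int_0^t(\alpha^n)^2\,d[X]\right)^{1/2}\le\left(\int_0^t\xi^2\,d[X]\right)^{1/2}$, dominated convergence gives $\mathbb{E}\left[\left(\int_0^t(\alpha^n)^2\,d[X]\right)^{1/2}\right]\to0$. So, by (1) with p=1,
$$\mathbb{E}\left[\left|\int_0^t\alpha^n\,dX\right|\right]\le C_1\,\mathbb{E}\left[\left(\int_0^t(\alpha^n)^2\,d[X]\right)^{1/2}\right]\to0$$
and $\int_0^t\alpha^n\,dX$ tends to zero in $L^1$ and, in particular, in probability.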
The remainder of this post is dedicated to proving Theorem 1. We only prove (1) for $\tau=\infty$, and the case for general stopping times follows from applying it to the stopped process $X^\tau$ and substituting in $(X^\tau)^*_\infty=X^*_\tau$, $[X^\tau]_\infty=[X]_\tau$. This is one of the longer proofs in these notes, and we do not attempt to find optimal values of the constants $c_p, C_p$. The basic idea is not too difficult and, for continuous local martingales, follows quite quickly. The main problem is in handling the jumps for general local martingales. We will make use of a so-called good lambda inequality, (3) below. The proof of Theorem 1 will depend on showing that $X^*_\infty$ and $[X]^{1/2}_\infty$ satisfy a good lambda inequality. For the continuous local martingale case, this is possible, which results in the BDG inequalities for all values of $p>0$. However, in the non-continuous case, it will be necessary to break the process up into a term whose jumps do not become too large, which satisfies the good lambda inequality, and a separate term only involving a small number of large jumps.
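Lemma 5 Let $\beta>1$ be a constant and $\phi\colon(0,\infty)\to[0,\infty)$ satisfy $\phi(\delta)\to0$ as $\delta\to0$. Then, there are positive constants $c_p$ such that any pair of nonnegative random variables (X,Y) satisfying
$$\mathbb{P}(X>\beta\lambda,\ Y<\delta\lambda)\le\phi(\delta)\,\mathbb{P}(X\ge\lambda) \qquad (3)$$
(for all $\delta,\lambda>0$) also satisfy the inequality
$$\mathbb{E}[X^p]\le c_p\,\mathbb{E}[Y^p]$$
for all $0<p<\infty$.
Note that $c_p$ only depends on $\beta,\phi$ and p, and is independent of the choice of random variables X, Y.
Proof: The following identity applying to all nonnegative random variables X will be used
$$\mathbb{E}[X^p]=\int_0^\infty p\lambda^{p-1}\,\mathbb{P}(X\ge\lambda)\,d\lambda. \qquad (4)$$
Inequality (3) gives
$$\mathbb{P}(X>\beta\lambda)\le\mathbb{P}(X>\beta\lambda,\ Y<\delta\lambda)+\mathbb{P}(Y\ge\delta\lambda)\le\phi(\delta)\,\mathbb{P}(X\ge\lambda)+\mathbb{P}(Y\ge\delta\lambda).$$
Then, multiplying by $p\lambda^{p-1}$, integrating, and applying (4),
$$\beta^{-p}\,\mathbb{E}[X^p]\le\phi(\delta)\,\mathbb{E}[X^p]+\delta^{-p}\,\mathbb{E}[Y^p].$$
Rearranging gives
$$\left(\beta^{-p}-\phi(\delta)\right)\mathbb{E}[X^p]\le\delta^{-p}\,\mathbb{E}[Y^p].$$
Here, $\mathbb{E}[X^p]$ should be finite for the rearrangement to be valid. If it is not, first apply the argument to $X\wedge n$, which also satisfies (3), and then let n go to infinity. Choosing $\delta$ small enough that $\phi(\delta)<\beta^{-p}$ and setting $c_p=\delta^{-p}/(\beta^{-p}-\phi(\delta))$ gives the result.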
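As a numerical sanity check of the tail-integral identity used in this proof, $\mathbb{E}[X^p]=\int_0^\infty p\lambda^{p-1}\mathbb{P}(X\ge\lambda)\,d\lambda$, the following sketch evaluates both sides exactly for a finitely supported nonnegative random variable. The helper `good_lambda_constant` then computes the constant obtained by rearranging the resulting inequality, $\delta^{-p}/(\beta^{-p}-\phi(\delta))$. The function names and the example values are hypothetical, chosen only for illustration.

```python
def moment_direct(values, probs, p):
    """E[X^p] computed directly from the distribution of X."""
    return sum(w * v ** p for v, w in zip(values, probs))

def moment_layer_cake(values, probs, p):
    """Integral of p*lam^(p-1) * P(X >= lam) over lam > 0, for finitely supported X >= 0."""
    total = 0.0
    prev = 0.0
    for v in sorted(set(values)):
        # P(X >= lam) is constant for lam in (prev, v].
        tail = sum(w for u, w in zip(values, probs) if u >= v)
        # The integral of p*lam^(p-1) over (prev, v] is v^p - prev^p.
        total += (v ** p - prev ** p) * tail
        prev = v
    return total

def good_lambda_constant(beta, phi, p, delta):
    """Constant delta^(-p) / (beta^(-p) - phi(delta)) from rearranging the inequality."""
    gap = beta ** (-p) - phi(delta)
    if gap <= 0:
        raise ValueError("delta is too large: need phi(delta) < beta**(-p)")
    return delta ** (-p) / gap

values, probs = [0.0, 1.0, 2.5, 4.0], [0.1, 0.4, 0.3, 0.2]
print(moment_direct(values, probs, 1.5), moment_layer_cake(values, probs, 1.5))
print(good_lambda_constant(2.0, lambda d: d ** 2, 1, 0.5))
```

The constant depends on the choice of $\delta$, and no attempt is made here to optimize it.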
Continuous Local Martingales
Let us first prove the result for continuous local martingales, for which (1) can be shown to hold for all $p>0$. This will also help motivate the proof for more general local martingales. We make use of the elementary result that, for any positive numbers a, b and continuous local martingale M with $M_0=0$, the probability that M hits a before -b is bounded by b/(a+b). To see this, let $\tau$ be the first time at which $M=a$ or $M=-b$, so $M^\tau$ is a uniformly bounded martingale. By continuity, $M^\tau_\infty=a$ on the event that M hits a before -b, and $M^\tau_\infty\ge-b$ otherwise. So, letting $\pi$ be the probability of hitting a before -b, the martingale property gives
$$0=\mathbb{E}[M^\tau_\infty]\ge\pi a-(1-\pi)b.$$
So, $\pi\le b/(a+b)$. For a local martingale X with $X_0=0$, the difference between the nonnegative processes $X^2$ and [X] is also a local martingale, which will be applied to the following.
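Lemma 6 Suppose that X, Y are nonnegative continuous processes such that X-Y is a local martingale. Suppose, furthermore, that $\tau$ is a stopping time with $X_t=Y_t=0$ for all $t\le\tau$. Then,
$$\mathbb{P}\left(\sup_tX_t>a,\ \sup_tY_t<b\right)\le\frac{b}{a}\,\mathbb{P}(\tau<\infty) \qquad (5)$$
for all $a>b>0$.
Proof: The sigma algebras $\mathcal{G}_t\equiv\mathcal{F}_{\tau+t}$ define a filtration. Then, by optional sampling, conditioning on $\{\tau<\infty\}$, $M_t\equiv X_{\tau+t}-Y_{\tau+t}$ is a continuous local martingale with respect to $\mathcal{G}$, and $M_0=0$. On the event $\{\sup_tX_t>a,\ \sup_tY_t<b\}$, which is contained in $\{\tau<\infty\}$, M hits $a-b$ at some time and never hits $-b$. So, $\mathbb{P}(\sup_tX_t>a,\ \sup_tY_t<b\mid\tau<\infty)\le b/a$ and multiplying by $\mathbb{P}(\tau<\infty)$ gives (5).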
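The hitting bound b/(a+b) used for continuous local martingales can be illustrated with a simple symmetric random walk, which moves in steps of ±1 and so cannot jump over an integer level. The sketch below is an illustration under that assumption, not part of the argument. It propagates the walk's distribution until almost all of the mass is absorbed at a or -b, and in this case the bound is attained with equality.

```python
def hit_probability(a, b, n_steps=5000):
    """Probability that a simple symmetric random walk from 0 reaches a before -b.

    a and b are positive integers; n_steps is chosen large enough that the
    unabsorbed mass is negligible for small a and b.
    """
    # mass[x]: probability of sitting at interior level x, not yet absorbed.
    mass = {x: 0.0 for x in range(-b + 1, a)}
    mass[0] = 1.0
    hit_a = 0.0
    for _ in range(n_steps):
        new_mass = {x: 0.0 for x in mass}
        for x, m in mass.items():
            for y in (x - 1, x + 1):
                if y == a:
                    hit_a += 0.5 * m      # absorbed at the upper level
                elif y != -b:
                    new_mass[y] += 0.5 * m  # still strictly between -b and a
        mass = new_mass
    return hit_a

print(hit_probability(3, 2), 2 / (3 + 2))
```

For processes with jumps the walk can overshoot a level, which is exactly the difficulty addressed in the discrete-time section.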
Applying this result in Lemma 7 below gives the good lambda inequality and, by Lemma 5, proves the BDG inequality (1) for continuous local martingales for all $p>0$. From now on, whenever I state that a good lambda inequality is satisfied, it should be taken to mean that (3) holds for some fixed (universal) $\beta,\phi$.
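Lemma 7 Let M be a continuous local martingale with $M_0=0$. Then, $(M^*_\infty,[M]^{1/2}_\infty)$ and $([M]^{1/2}_\infty,M^*_\infty)$ satisfy a good lambda inequality.
Proof: Letting $\tau$ be the first time at which $|M|\ge\lambda$, then $(M-M^\tau)^2-([M]-[M]^\tau)$ is a local martingale. Applying (5) with $X=(M-M^\tau)^2$ and $Y=[M]-[M]^\tau$ gives
$$\mathbb{P}\left(\sup_t(M_t-M^\tau_t)^2>a,\ [M]_\infty-[M]^\tau_\infty<b\right)\le\frac{b}{a}\,\mathbb{P}(M^*_\infty\ge\lambda).$$
For any $\beta>1$ and $0<\delta<\beta-1$, consider the event $\{M^*_\infty>\beta\lambda,\ [M]^{1/2}_\infty<\delta\lambda\}$. In this case, we have $\sup_t|M_t-M^\tau_t|>(\beta-1)\lambda$ and $[M]_\infty-[M]^\tau_\infty<\delta^2\lambda^2$. So,
$$\mathbb{P}\left(M^*_\infty>\beta\lambda,\ [M]^{1/2}_\infty<\delta\lambda\right)\le\frac{\delta^2}{(\beta-1)^2}\,\mathbb{P}(M^*_\infty\ge\lambda)$$
which is the good lambda inequality for $(M^*_\infty,[M]^{1/2}_\infty)$ with $\phi(\delta)=\delta^2/(\beta-1)^2$. For $\delta\ge\beta-1$ the inequality holds trivially.
Applying a similar argument, now let $\tau$ be the first time at which $[M]\ge\lambda^2$. As above, $([M]-[M]^\tau)-(M-M^\tau)^2$ is a local martingale and applying (5) with $X=[M]-[M]^\tau$ and $Y=(M-M^\tau)^2$ gives
$$\mathbb{P}\left([M]_\infty-[M]^\tau_\infty>a,\ \sup_t(M_t-M^\tau_t)^2<b\right)\le\frac{b}{a}\,\mathbb{P}([M]^{1/2}_\infty\ge\lambda).$$
For any $\beta>1$ and $\delta>0$ with $4\delta^2<\beta^2-1$, consider the event $\{[M]^{1/2}_\infty>\beta\lambda,\ M^*_\infty<\delta\lambda\}$. Then, $[M]_\infty-[M]^\tau_\infty>(\beta^2-1)\lambda^2$ and $\sup_t(M_t-M^\tau_t)^2<4\delta^2\lambda^2$. So,
$$\mathbb{P}\left([M]^{1/2}_\infty>\beta\lambda,\ M^*_\infty<\delta\lambda\right)\le\frac{4\delta^2}{\beta^2-1}\,\mathbb{P}([M]^{1/2}_\infty\ge\lambda)$$
giving the good lambda inequality for $([M]^{1/2}_\infty,M^*_\infty)$ with $\phi(\delta)=4\delta^2/(\beta^2-1)$.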
Discrete-time Martingales
A similar idea as used above for continuous local martingales can also be used to prove the BDG inequality for more general local martingales. There are, however, additional complications. The main problem is that a non-continuous process can jump past a level without hitting it, so the bound given for a local martingale to hit a level a before -b no longer applies. To get around this problem, the process can be decomposed into the sum of a process whose jumps are never too large and one with a small number of large jumps. To avoid tricky decompositions involving rather advanced results of stochastic calculus, we do this in discrete-time. That is, suppose that the filtration $\{\mathcal{F}_t\}$ is defined for times t in the nonnegative integers. Similarly, processes are assumed to only be defined at the nonnegative integers and stopping times only take integer (or infinite) values. The continuous-time scenario will then follow from a straightforward limiting argument.
A discrete-time process X is said to be adapted if $X_n$ is $\mathcal{F}_n$-measurable for each n and predictable if it is $\mathcal{F}_{n-1}$-measurable for each $n\ge1$. Given a discrete-time process X, its increments are denoted by $\Delta X_n\equiv X_n-X_{n-1}$ for $n\ge1$, and the quadratic variation is $[X]_n=\sum_{k=1}^n(\Delta X_k)^2$.
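To fix notation, the discrete-time quantities can be written out for a single path. The helper names below are hypothetical, and the quadratic variation is taken to start from zero, which assumes $X_0=0$ as in Theorem 1.

```python
def increments(path):
    """Increments dX_n = X_n - X_{n-1} for n >= 1."""
    return [path[n] - path[n - 1] for n in range(1, len(path))]

def quadratic_variation(path):
    """[X]_n = sum of squared increments up to time n, with [X]_0 = 0."""
    qv = [0]
    for dx in increments(path):
        qv.append(qv[-1] + dx ** 2)
    return qv

def maximum_process(path):
    """X*_n = max over k <= n of |X_k|."""
    out, best = [], 0
    for x in path:
        best = max(best, abs(x))
        out.append(best)
    return out

path = [0, 1, -1, 2]
print(increments(path), quadratic_variation(path), maximum_process(path))
```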
To handle the jump terms, the following inequality will be used. This only applies for $p\ge1$, which is the reason for the BDG inequality only holding for $p\ge1$ in the general case. Recall that $\Vert U\Vert_p\equiv\mathbb{E}[|U|^p]^{1/p}$ is the $L^p$-norm of a random variable U.
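Lemma 8 Let Z be a nonnegative process and set $X_n=\sum_{k=1}^nZ_k$, $Y_n=\sum_{k=1}^n\mathbb{E}[Z_k\mid\mathcal{F}_{k-1}]$. Then, for any $p\ge1$,
$$\Vert Y_\infty\Vert_p\le p\,\Vert X_\infty\Vert_p.$$
Proof: The function $y\mapsto y^p$ is convex on the nonnegative reals. So, for any $0\le x\le y$,
$$y^p-x^p\le py^{p-1}(y-x).$$
Applying this to the increasing predictable process Y gives
$$\mathbb{E}\left[Y_k^p-Y_{k-1}^p\right]\le p\,\mathbb{E}\left[Y_k^{p-1}\,\mathbb{E}[Z_k\mid\mathcal{F}_{k-1}]\right]=p\,\mathbb{E}\left[Y_k^{p-1}Z_k\right]\le p\,\mathbb{E}\left[Y_\infty^{p-1}Z_k\right]$$
for $k\ge1$. Then, summing over k,
$$\mathbb{E}\left[Y_\infty^p\right]\le p\,\mathbb{E}\left[Y_\infty^{p-1}X_\infty\right].$$
Setting q=p/(p-1) and applying Hölder’s inequality,
$$\Vert Y_\infty\Vert_p^p\le p\,\Vert Y_\infty^{p-1}\Vert_q\Vert X_\infty\Vert_p=p\,\Vert Y_\infty\Vert_p^{p-1}\Vert X_\infty\Vert_p.$$
Canceling $\Vert Y_\infty\Vert_p^{p-1}$ from both sides gives the result.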
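The inequality of Lemma 8 compares a sum of nonnegative terms with the sum of their one-step conditional expectations, in the form $\Vert Y_\infty\Vert_p\le p\Vert X_\infty\Vert_p$ with $Y_n=\sum_{k\le n}\mathbb{E}[Z_k\mid\mathcal{F}_{k-1}]$ and $X_n=\sum_{k\le n}Z_k$. As a rough check, the sketch below enumerates all paths of a fair coin-flip filtration. The payoff rule `example_z` is a hypothetical choice made only for illustration, and any nonnegative function of the flips so far could be substituted.

```python
from itertools import product

def lemma8_norms(n_steps, p, z):
    """Return (||Y_n||_p, ||X_n||_p) for fair coin flips, where z(prefix) >= 0 gives Z_k."""
    weight = 0.5 ** n_steps
    x_moment = y_moment = 0.0
    for flips in product((0, 1), repeat=n_steps):
        x = y = 0.0
        for k in range(1, n_steps + 1):
            past = flips[:k - 1]
            x += z(past + (flips[k - 1],))
            # E[Z_k | F_{k-1}]: average Z_k over the two equally likely outcomes of flip k.
            y += 0.5 * (z(past + (0,)) + z(past + (1,)))
        x_moment += weight * x ** p
        y_moment += weight * y ** p
    return y_moment ** (1 / p), x_moment ** (1 / p)

def example_z(prefix):
    # Hypothetical payoff: 1 + (number of earlier heads) when the latest flip is a head.
    return float(prefix[-1] * (1 + sum(prefix[:-1])))

for p in (1, 1.5, 2, 4):
    y_norm, x_norm = lemma8_norms(8, p, example_z)
    print(p, y_norm, p * x_norm)
```

For p=1 the two norms coincide, since Y and X have the same expectation.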
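Now, let us prove a discrete-time version of Lemma 6. The additional complication is that it is necessary to bound the value of X-Y from below by a predictable process, which allows it to be stopped just before it drops below any given level.
Lemma 9 Suppose that X, Y are nonnegative processes such that X-Y is a local martingale and Z is a predictable process with $Z\ge Y$. Suppose, furthermore, that $\tau$ is a stopping time such that $X_t=Y_t=0$ for all $t\le\tau$. Then,
$$\mathbb{P}\left(\sup_tX_t>a,\ \sup_tZ_t<b\right)\le\frac{b}{a}\,\mathbb{P}(\tau<\infty) \qquad (6)$$
for all $a,b>0$.
Proof: By optional sampling, conditioning on the event $\{\tau<\infty\}$, $M_t\equiv X_{\tau+t}-Y_{\tau+t}$ is a local martingale with respect to the filtration $\mathcal{G}_t\equiv\mathcal{F}_{\tau+t}$. Also, $U_t\equiv Z_{\tau+t}$ is predictable and satisfies $M\ge-U$.
Now, define the stopping time $\sigma=\inf\{t\ge0\colon U_{t+1}\ge b\}$. The stopped process $M^\sigma+b$ is a nonnegative local martingale and, hence, is a supermartingale. Furthermore, on the event $\{\sup_tX_t>a,\ \sup_tZ_t<b\}$ we have $\sigma=\infty$ and $\sup_t(M_t+b)>a$. So,
$$\mathbb{P}\left(\sup_tX_t>a,\ \sup_tZ_t<b\;\Big\vert\;\tau<\infty\right)\le\mathbb{P}\left(\sup_t(M^\sigma_t+b)>a\;\Big\vert\;\tau<\infty\right)\le\frac{b}{a}.$$
Multiplying by $\mathbb{P}(\tau<\infty)$ gives (6) as required.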
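We now move on to the discrete-time version of Lemma 7. The difference now is that it is necessary to restrict to martingales whose increments are bounded by a predictable process.
Lemma 10 Suppose that M is a martingale satisfying $M_0=0$ and $|\Delta M|\le L$ for some predictable process L. Then, $(M^*_\infty,[M]^{1/2}_\infty\vee L^*_\infty)$ and $([M]^{1/2}_\infty,M^*_\infty\vee L^*_\infty)$ satisfy a good lambda inequality.
Proof: The predictable process $Z_n\equiv[M]_{n-1}+L_n^2$ satisfies $Z\ge[M]$. Define the stopping time $\tau=\inf\{n\colon|M_n|\ge\lambda\}$ so that $|M_\tau|\le\lambda+L_\tau$ whenever $\tau<\infty$. Then $(M-M^\tau)^2-([M]-[M]^\tau)$ is a local martingale. Applying (6) with $X=(M-M^\tau)^2$ and $Y=[M]-[M]^\tau$ gives
$$\mathbb{P}\left(\sup_n(M_n-M^\tau_n)^2>a,\ \sup_nZ_n<b\right)\le\frac{b}{a}\,\mathbb{P}(M^*_\infty\ge\lambda).$$
Now consider the event $\{M^*_\infty>\beta\lambda,\ [M]^{1/2}_\infty\vee L^*_\infty<\delta\lambda\}$ for $0<\delta<\beta-1$. In this case
$$\sup_n|M_n-M^\tau_n|>(\beta-1-\delta)\lambda,\qquad\sup_nZ_n<2\delta^2\lambda^2.$$
So,
$$\mathbb{P}\left(M^*_\infty>\beta\lambda,\ [M]^{1/2}_\infty\vee L^*_\infty<\delta\lambda\right)\le\frac{2\delta^2}{(\beta-1-\delta)^2}\,\mathbb{P}(M^*_\infty\ge\lambda).$$
And $(M^*_\infty,[M]^{1/2}_\infty\vee L^*_\infty)$ satisfies the good lambda inequality with $\phi(\delta)=2\delta^2/(\beta-1-\delta)^2$ for $\delta<\beta-1$ and $\phi(\delta)=1$ otherwise.
The argument for $([M]^{1/2}_\infty,M^*_\infty\vee L^*_\infty)$ follows in a similar fashion. Define the stopping time $\tau=\inf\{n\colon[M]_n\ge\lambda^2\}$, so $[M]_\tau\le\lambda^2+L_\tau^2$. As above, $([M]-[M]^\tau)-(M-M^\tau)^2$ is a local martingale and (6) can be applied to $X=[M]-[M]^\tau$, $Y=(M-M^\tau)^2$ and $Z_n=(2M^*_{n-1}+L_n)^2$,
$$\mathbb{P}\left([M]_\infty-[M]^\tau_\infty>a,\ \sup_nZ_n<b\right)\le\frac{b}{a}\,\mathbb{P}([M]^{1/2}_\infty\ge\lambda).$$
Now consider the event $\{[M]^{1/2}_\infty>\beta\lambda,\ M^*_\infty\vee L^*_\infty<\delta\lambda\}$ for $\delta^2<\beta^2-1$. Then,
$$[M]_\infty-[M]^\tau_\infty>(\beta^2-1-\delta^2)\lambda^2,\qquad\sup_nZ_n<9\delta^2\lambda^2.$$
So,
$$\mathbb{P}\left([M]^{1/2}_\infty>\beta\lambda,\ M^*_\infty\vee L^*_\infty<\delta\lambda\right)\le\frac{9\delta^2}{\beta^2-1-\delta^2}\,\mathbb{P}([M]^{1/2}_\infty\ge\lambda)$$
and $([M]^{1/2}_\infty,M^*_\infty\vee L^*_\infty)$ satisfies the good lambda inequality with $\phi(\delta)=9\delta^2/(\beta^2-1-\delta^2)$ for $\delta^2<\beta^2-1$ and $\phi(\delta)=1$ otherwise.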
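Finally, the proof of the discrete-time BDG inequality will involve subtracting out the large jumps of the martingale X. Write $(\Delta X)^*_n\equiv\max_{k\le n}|\Delta X_k|$, with $(\Delta X)^*_0=0$. Define a process V by $V_0=0$ and $\Delta V_n=1_{\{|\Delta X_n|>2(\Delta X)^*_{n-1}\}}\Delta X_n$ for $n\ge1$. If $n_1<n_2<\cdots$ are times at which $\Delta V\not=0$ then
$$|\Delta V_{n_k}|=|\Delta X_{n_k}|>2(\Delta X)^*_{n_k-1}\ge2|\Delta V_{n_{k-1}}|.$$
So, the variation of V is bounded by
$$\sum_n|\Delta V_n|\le\sup_k\sum_{j\le k}2^{j-k}|\Delta V_{n_k}|\le2(\Delta X)^*_\infty.$$
As V will not, in general, be a martingale, a Doob decomposition will be used. Define A by $A_0=0$ and $\Delta A_n=\mathbb{E}[\Delta V_n\mid\mathcal{F}_{n-1}]$ for $n\ge1$. Then, N=V-A satisfies $N_0=0$ and is a martingale. Lemma 8 is now used to bound the variation of A in the $L^p$ norm, for any $p\ge1$,
$$\Big\Vert\sum_n|\Delta A_n|\Big\Vert_p\le\Big\Vert\sum_n\mathbb{E}\left[|\Delta V_n|\mid\mathcal{F}_{n-1}\right]\Big\Vert_p\le p\,\Big\Vert\sum_n|\Delta V_n|\Big\Vert_p\le2p\,\Vert(\Delta X)^*_\infty\Vert_p.$$
Then, the variation of N satisfies the following bound
$$\Big\Vert\sum_n|\Delta N_n|\Big\Vert_p\le\Big\Vert\sum_n|\Delta V_n|\Big\Vert_p+\Big\Vert\sum_n|\Delta A_n|\Big\Vert_p\le2(p+1)\Vert(\Delta X)^*_\infty\Vert_p.$$
In particular, the supremum and quadratic variation of N satisfy the same bound,
$$\Vert N^*_\infty\Vert_p\le2(p+1)\Vert(\Delta X)^*_\infty\Vert_p, \qquad (7)$$
$$\Vert[N]^{1/2}_\infty\Vert_p\le2(p+1)\Vert(\Delta X)^*_\infty\Vert_p. \qquad (8)$$
Next, from the definition of V, X-V has increments bounded in absolute value by $2(\Delta X)^*_{n-1}$ and, therefore, $\Delta A_n=-\mathbb{E}[\Delta(X-V)_n\mid\mathcal{F}_{n-1}]$ satisfies the same bound. So, the martingale M=X-N=X-V+A satisfies $|\Delta M_n|\le4(\Delta X)^*_{n-1}$. Lemma 10 can now be applied to obtain the BDG inequality for discrete-time martingales.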
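Theorem 11 There exist positive constants $c_p, C_p$ for each $1\le p<\infty$ such that, for any discrete-time local martingale X,
$$c_p\,\Vert[X]^{1/2}_\infty\Vert_p\le\Vert X^*_\infty\Vert_p\le C_p\,\Vert[X]^{1/2}_\infty\Vert_p. \qquad (9)$$
Proof: As $|\Delta M_n|\le L_n\equiv4(\Delta X)^*_{n-1}$, Lemma 10 says that $(M^*_\infty,[M]^{1/2}_\infty\vee L^*_\infty)$ and $([M]^{1/2}_\infty,M^*_\infty\vee L^*_\infty)$ satisfy a good lambda inequality. So, by Lemma 5, for each $p\ge1$ there are positive constants $\tilde c_p$ (independent of the choice of X) such that
$$\Vert M^*_\infty\Vert_p\le\tilde c_p\left(\Vert[M]^{1/2}_\infty\Vert_p+4\Vert(\Delta X)^*_\infty\Vert_p\right),\qquad\Vert[M]^{1/2}_\infty\Vert_p\le\tilde c_p\left(\Vert M^*_\infty\Vert_p+4\Vert(\Delta X)^*_\infty\Vert_p\right).$$
Now, as X=M+N, inequality (7) gives,
$$\Vert X^*_\infty\Vert_p\le\Vert M^*_\infty\Vert_p+\Vert N^*_\infty\Vert_p\le\tilde c_p\Vert[M]^{1/2}_\infty\Vert_p+\left(4\tilde c_p+2(p+1)\right)\Vert(\Delta X)^*_\infty\Vert_p.$$
Also, M=X-N so the triangle inequality together with (8) gives
$$\Vert[M]^{1/2}_\infty\Vert_p\le\Vert[X]^{1/2}_\infty\Vert_p+\Vert[N]^{1/2}_\infty\Vert_p\le\Vert[X]^{1/2}_\infty\Vert_p+2(p+1)\Vert(\Delta X)^*_\infty\Vert_p.$$
Hence,
$$\Vert X^*_\infty\Vert_p\le\tilde c_p\Vert[X]^{1/2}_\infty\Vert_p+\left(4\tilde c_p+2(p+1)(\tilde c_p+1)\right)\Vert(\Delta X)^*_\infty\Vert_p$$
and, as [X] is an increasing process with increments $(\Delta X)^2$, we have $(\Delta X)^*_\infty\le[X]^{1/2}_\infty$, giving the right hand inequality of (9) with $C_p=5\tilde c_p+2(p+1)(\tilde c_p+1)$.
A similar argument applies for the left hand side of (9). The triangle inequality together with (8) gives,
$$\Vert[X]^{1/2}_\infty\Vert_p\le\Vert[M]^{1/2}_\infty\Vert_p+\Vert[N]^{1/2}_\infty\Vert_p\le\tilde c_p\Vert M^*_\infty\Vert_p+\left(4\tilde c_p+2(p+1)\right)\Vert(\Delta X)^*_\infty\Vert_p.$$
Then, as M=X-N, (7) gives,
$$\Vert[X]^{1/2}_\infty\Vert_p\le\tilde c_p\Vert X^*_\infty\Vert_p+\left(4\tilde c_p+2(p+1)(\tilde c_p+1)\right)\Vert(\Delta X)^*_\infty\Vert_p.$$
Finally, $(\Delta X)^*_\infty\le2X^*_\infty$, so the left hand inequality of (9) is satisfied for $c_p=\left(9\tilde c_p+4(p+1)(\tilde c_p+1)\right)^{-1}$.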
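The decomposition of X into a part M with predictably bounded increments and a part N carrying the large jumps can be made concrete on a small tree. The sketch below assumes a hypothetical two-point, mean-zero increment law (`children`) and uses exact rational arithmetic. It builds the large-jump process V (jumps of X exceeding twice the largest earlier jump), its compensator A, and M = X - V + A, then checks two pathwise bounds along every branch: the total variation of V is at most twice the largest jump of X, and each increment of M is at most four times the largest earlier jump of X.

```python
from fractions import Fraction

def children(history):
    """Hypothetical mean-zero increment law given the past: list of (probability, increment)."""
    scale = 1 + sum(1 for dx in history if dx > 0)
    return [(Fraction(1, 4), Fraction(3 * scale)), (Fraction(3, 4), Fraction(-scale))]

def check_davis_decomposition(n_steps):
    """Walk the whole tree and test both pathwise bounds; True if they always hold."""
    def walk(history, variation_v):
        # Largest absolute jump of X seen so far (0 before the first step).
        jump_max_prev = max((abs(dx) for dx in history), default=Fraction(0))
        if len(history) == n_steps:
            # Total variation of V is at most twice the largest jump of X.
            return variation_v <= 2 * jump_max_prev
        kids = children(history)

        def large(dx):
            # Increment of V: keep the jump only if it exceeds twice all earlier jumps.
            return dx if abs(dx) > 2 * jump_max_prev else Fraction(0)

        # Compensator increment: conditional expectation of the large-jump part.
        da = sum(prob * large(dx) for prob, dx in kids)
        ok = True
        for prob, dx in kids:
            dv = large(dx)
            dm = dx - dv + da  # increment of M = X - V + A
            ok = ok and abs(dm) <= 4 * jump_max_prev
            ok = ok and walk(history + (dx,), variation_v + abs(dv))
        return ok

    return walk((), Fraction(0))

print(check_davis_decomposition(8))
```

Both bounds hold for any mean-zero increment law, so replacing `children` with another rule of the same shape should leave the result unchanged.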
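Continuous-time Local Martingales
We finally prove the BDG inequality for $p\ge1$ and an arbitrary continuous-time local martingale X, which will follow from applying a limiting argument to the discrete-time version stated in Theorem 11.
First, note that if $\tau_n$ is a sequence of stopping times increasing to infinity then $(X^{\tau_n})^*_\infty,[X^{\tau_n}]_\infty$ are monotonically increasing to $X^*_\infty,[X]_\infty$. Then, by monotone convergence, it is enough to show that the BDG inequality is satisfied for the stopped processes $X^{\tau_n}$.
As the quadratic variation has jumps $\Delta[X]=(\Delta X)^2$, it follows that local $L^p$-integrability of X is equivalent to local $L^{p/2}$-integrability of [X] and, therefore, to local $L^p$-integrability of $[X]^{1/2}$. If neither of $X^*$ and $[X]^{1/2}$ are locally $L^p$-integrable then each term in (1) is infinite, so the inequality is trivially satisfied. We, therefore, suppose that one and, hence, both of $X^*,[X]^{1/2}$ are locally $L^p$-integrable. By localization, we can suppose that $X^*_\infty$ and $[X]^{1/2}_\infty$ are both $L^p$-integrable, in which case X is a proper martingale. Now, for each n, choose a sequence of times $0=t^n_0<t^n_1<t^n_2<\cdots\uparrow\infty$ such that $\sup_k(t^n_k-t^n_{k-1})\to0$ as n goes to infinity. For example, $t^n_k=k/n$. Then,
$$[X]^n_t\equiv\sum_{k=1}^\infty\left(X_{t^n_k\wedge t}-X_{t^n_{k-1}\wedge t}\right)^2$$
converges ucp to [X]. Passing to a subsequence, if necessary, we suppose that $[X]^n\to[X]$ uniformly on compacts with probability one. So, $\sup_n[X]^n$ will be a cadlag process with jumps
$$\Big|\Delta\sup_n[X]^n_t\Big|\le\sup_n\left|\Delta[X]^n_t\right|\le4(X^*_t)^2$$
which are bounded by the $L^{p/2}$-integrable random variable $4(X^*_\infty)^2$. So, by localization, we suppose that $\mathbb{E}\left[\sup_{n,t}([X]^n_t)^{p/2}\right]$ is finite. Fixing a time t and applying Theorem 11 to the discrete-time martingale $k\mapsto X_{t^n_k\wedge t}$ gives
$$c_p\,\Vert([X]^n_t)^{1/2}\Vert_p\le\Big\Vert\sup_k|X_{t^n_k\wedge t}|\Big\Vert_p\le C_p\,\Vert([X]^n_t)^{1/2}\Vert_p.$$
Then, take the limit $n\to\infty$ followed by $t\to\infty$ and apply dominated convergence to $\sup_k|X_{t^n_k\wedge t}|^p$ and $([X]^n_t)^{p/2}$ to get
$$c_p\,\Vert[X]^{1/2}_\infty\Vert_p\le\Vert X^*_\infty\Vert_p\le C_p\,\Vert[X]^{1/2}_\infty\Vert_p.$$
Raising to the p'th power and replacing $c_p,C_p$ by $c_p^p,C_p^p$ gives the result.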
Notes
Historically, the last case of the BDG inequalities to be proven was for p=1 in the paper ‘On the integrability of the martingale square function’ by Burgess Davis. It was here that the decomposition of the discrete-time martingale used in the proof of Theorem 11 above was introduced. The proof given here follows, roughly, that given by Burkholder, Davis and Gundy in ‘Integral inequalities for convex functions of operators on martingales’. However, that paper considers a rather more general inequality, which I briefly mention now. A function $F\colon\mathbb{R}_+\to\mathbb{R}_+$ is said to be moderate if it is continuous, increasing and there are constants $\alpha>1$ and c such that $F(\alpha x)\le cF(x)$. For example, if $F(x)=x^p$ then we can take $c=\alpha^p$. Lemma 5 is easily generalized to show the existence of a constant C such that $\mathbb{E}[F(X)]\le C\,\mathbb{E}[F(Y)]$ for any pair of nonnegative random variables (X,Y) satisfying the good lambda inequality (3). Then, the proof for continuous local martingales above also shows that there are positive constants $c_F,C_F$ such that
$$c_F\,\mathbb{E}\left[F\left([X]^{1/2}_\tau\right)\right]\le\mathbb{E}\left[F(X^*_\tau)\right]\le C_F\,\mathbb{E}\left[F\left([X]^{1/2}_\tau\right)\right]$$
for any continuous local martingale X and stopping time $\tau$. See here for a quick proof along the same lines. For arbitrary local martingales, it is necessary to impose the additional condition that F is convex, so that the required generalization of Lemma 8 holds. If $F(x)=x^p$, this corresponds to $p\ge1$.
this is quite useful for me!
Comment by soul — 12 October 10 @ 1:40 AM 
I know that BDG inequalities hold also for local submartingales, which can be very useful sometimes.
Most likely the proof you gave would still work (with some equalities replaced by inequalities).
Could you please apply the few required changes and have the statement in this additional generality?
This would be useful as a reference, especially if you create a pdf version of the notes and post it on the Arxive…
P.S. Thank you for these wonderful notes!
Comment by pietro siorpaes — 20 March 12 @ 4:07 PM 
Was this result published as a paper? just for reference
Comment by Dimbi — 30 January 14 @ 10:53 PM 