A random variable $X=(X^1,\dots,X^n)$ has the standard $n$-dimensional normal distribution if its components are independent normal with zero mean and unit variance. A well known fact of such distributions is that they are invariant under rotations, which has the following consequence. For any $x\in{\mathbb R}^n$, the distribution of $Z=\Vert x+X\Vert^2$ is invariant under rotations of $x$ and, hence, is fully determined by the values of $n$ and $\lambda\equiv\Vert x\Vert^2$. This is known as the noncentral chi-square distribution with $n$ degrees of freedom and noncentrality parameter $\lambda$, and denoted by $\chi^2_n(\lambda)$. The moment generating function can be computed,

$\displaystyle {\mathbb E}\left[e^{aZ}\right]=(1-2a)^{-n/2}\exp\left(\frac{\lambda a}{1-2a}\right)$ (1)

which holds for all complex $a$ with real part bounded above by 1/2.
A consequence of this is that the norm of an $n$-dimensional Brownian motion $B$ is Markov. More precisely, letting $\{\mathcal F_t\}$ be its natural filtration, then $X_t\equiv\Vert B_t\Vert^2$ has the following property. For times $s<t$, conditional on $\mathcal F_s$, $X_t/(t-s)$ is distributed as $\chi^2_n(X_s/(t-s))$. This is known as the `$n$-dimensional' squared Bessel process, and denoted by $BES^2_n$.
Alternatively, the process X can be described by a stochastic differential equation (SDE). Applying integration by parts to $(B^i_t)^2$,

$\displaystyle X_t=X_0+2\sum_{i=1}^n\int_0^tB^i_s\,dB^i_s+\sum_{i=1}^n[B^i]_t.$ (2)

As the standard Brownian motions $B^i$ have quadratic variation $[B^i]_t=t$, the final term on the right-hand side is equal to $nt$. Also, the covariations $[B^i,B^j]$ are zero for $i\ne j$, from which it can be seen that

$\displaystyle W_t=\sum_{i=1}^n\int_0^t1_{\{X_s\ne0\}}X_s^{-1/2}B^i_s\,dB^i_s$

is a continuous local martingale with $[W]_t=t$. By Lévy's characterization, W is a Brownian motion and, substituting this back into (2), the squared Bessel process X solves the SDE

$\displaystyle dX_t=2\sqrt{X_t}\,dW_t+n\,dt.$ (3)
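To get a feel for the SDE (3), here is a minimal Euler-Maruyama sketch. This is not from the post; the function name and parameters are my own, and clamping at zero is a crude fix whose bias is discussed in the comments at the end of the post.

```python
import numpy as np

def besq_euler(n, x0, T, steps, seed=0):
    """Euler-Maruyama approximation of dX = 2 sqrt(X) dW + n dt."""
    rng = np.random.default_rng(seed)
    dt = T / steps
    x = x0
    path = [x]
    for _ in range(steps):
        dw = rng.normal(0.0, np.sqrt(dt))
        x = x + n * dt + 2.0 * np.sqrt(max(x, 0.0)) * dw
        x = max(x, 0.0)  # crude clamp: sqrt needs x >= 0, at the cost of some bias near 0
        path.append(x)
    return path
```

Since ${\mathbb E}[X_t]=X_0+nt$, averaging the terminal values of many simulated paths gives a quick sanity check of the scheme.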
The standard existence and uniqueness results for stochastic differential equations do not apply here, since $x\mapsto2\sqrt{x}$ is not Lipschitz continuous. It is known that (3) does in fact have a unique solution, by the Yamada-Watanabe uniqueness theorem for 1-dimensional SDEs. However, I do not need and will not make use of this fact here. Actually, uniqueness in law follows from the explicit computation of the moment generating function in Theorem 4 below.
Although it is nonsensical to talk of an $n$-dimensional Brownian motion for noninteger n, Bessel processes can be extended to any real $n\ge0$. This can be done either by specifying their distributions in terms of chi-square distributions or by the SDE (3). In this post I take the first approach, and then show that the two are equivalent. Such processes appear in many situations in the theory of stochastic processes, and not just as the norm of Brownian motion. They also provide one of the relatively few interesting examples of stochastic differential equations whose distributions can be explicitly computed.
The $\chi^2_n(\lambda)$ distribution generalizes to all real $n\ge0$ and $\lambda\ge0$, and can be defined as the unique distribution on ${\mathbb R}_+$ with moment generating function given by equation (1). If $Z_1\sim\chi^2_{n_1}(\lambda_1)$ and $Z_2\sim\chi^2_{n_2}(\lambda_2)$ are independent, then $Z_1+Z_2$ has moment generating function $(1-2a)^{-(n_1+n_2)/2}\exp\left((\lambda_1+\lambda_2)a/(1-2a)\right)$ and, therefore, has the $\chi^2_{n_1+n_2}(\lambda_1+\lambda_2)$ distribution. That such distributions do indeed exist can be seen by constructing them. The $\chi^2_n(0)$ distribution is a special case of the Gamma distribution and has probability density proportional to $x^{n/2-1}e^{-x/2}$. If $Z_1,Z_2,\ldots$ is a sequence of independent random variables with the standard normal distribution and T independently has the Poisson distribution of rate $\lambda/2$, then $\sum_{i=1}^{2T}Z_i^2$ has the $\chi^2_0(\lambda)$ distribution, which can be seen by computing its moment generating function. Adding an independent $\chi^2_n(0)$ random variable Y to this produces the $\chi^2_n(\lambda)$ variable.
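The construction just described can be turned into a sampler directly. A sketch using NumPy (the function name and parameter choices are mine), drawing the $\chi^2_n(0)$ part from a Gamma distribution and the $\chi^2_0(\lambda)$ part from the Poisson mixture:

```python
import numpy as np

def sample_chi2n_lam(n, lam, size, seed=0):
    """Sample chi^2_n(lam) for real n >= 0, lam >= 0, as the sum of an
    independent chi^2_n(0) = Gamma(n/2, scale=2) variable and a
    chi^2_0(lam) variable built from the Poisson mixture in the text."""
    rng = np.random.default_rng(seed)
    y = rng.gamma(n / 2.0, 2.0, size) if n > 0 else np.zeros(size)
    t = rng.poisson(lam / 2.0, size)
    # sum of 2T squared standard normals is chi^2_{2T}(0) = Gamma(T, scale=2);
    # it is exactly zero on the event T = 0
    z = np.where(t > 0, rng.gamma(np.maximum(t, 1), 2.0), 0.0)
    return y + z
```

As a quick check, the mean and variance of $\chi^2_n(\lambda)$ are $n+\lambda$ and $2n+4\lambda$, which can be compared against the sample statistics.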
The definition of squared Bessel processes of any real dimension $n\ge0$ is as follows. We work with respect to a filtered probability space $(\Omega,\mathcal F,\{\mathcal F_t\}_{t\ge0},{\mathbb P})$.
Definition 1 A process X is a squared Bessel process of dimension $n\ge0$ if it is continuous, adapted and, for any $s<t$, conditional on $\mathcal F_s$, $X_t/(t-s)$ has the $\chi^2_n(X_s/(t-s))$ distribution.
Substituting in expression (1) for the moment generating function, this definition is equivalent to X being a continuous adapted process such that, for all times $s<t$,

$\displaystyle {\mathbb E}\left[e^{aX_t}\;\middle|\;\mathcal F_s\right]=(1-2a(t-s))^{-n/2}\exp\left(\frac{aX_s}{1-2a(t-s)}\right).$ (4)

This holds for all complex $a$ with nonpositive real part. Also, if the filtration is not specified, then a process X is a $BES^2_n$ process if it satisfies Definition 1 with respect to its natural filtration $\mathcal F_t=\sigma(X_s\colon s\le t)$.
Note that we have not yet shown that Bessel processes for arbitrary noninteger $n\ge0$ are well-defined. Definition 1 specifies the properties that such processes must satisfy, but this does not guarantee their existence. It is not difficult to show that (4) determines a Markov transition function, so that the Chapman-Kolmogorov identity is satisfied. In fact, as I show below, it is Feller. See Lemma 8 below, where the existence of continuous modifications is also proven.
For now, let us determine some of the properties of Bessel processes. There are some properties which can be stated directly from the definition. The fact that the sum of independent $\chi^2_n(\lambda)$ and $\chi^2_m(\mu)$ distributed random variables has the $\chi^2_{n+m}(\lambda+\mu)$ distribution gives the following result for sums of Bessel processes.
Lemma 2 Suppose that X and Y are independent $BES^2_n$ and $BES^2_m$ processes respectively. Then, X+Y is a $BES^2_{n+m}$ process.
Next, Definition 1 only refers to the ratios between the values of the process X and the lengths of the time increments. So, scaling the time axis and the process values by the same factor leaves the defining property unchanged.
Lemma 3 Let X be a $BES^2_n$ process and $c>0$ be constant. Then, $c^{-1}X_{ct}$ is also a $BES^2_n$ process.
Taking the limit $a\to-\infty$ in expression (4) gives us the probability that $X_t$ is equal to 0.

$\displaystyle {\mathbb P}\left(X_t=0\mid\mathcal F_s\right)=\begin{cases}\exp\left(-\frac{X_s}{2(t-s)}\right),&\textrm{if }n=0,\\ 0,&\textrm{if }n>0.\end{cases}$ (5)

So, a $BES^2_0$ process has a positive probability of hitting 0 in any nontrivial time interval. Furthermore, since (5) gives ${\mathbb P}(X_t=0\mid X_s=0)=1$, once it hits zero it remains there. That is, 0 is an absorbing boundary.
The case for $n>0$ is different. Equation (5) says that $X_t$ has zero probability of being equal to 0 at any given time. This does not mean that X cannot hit zero but, rather, that the total Lebesgue measure of the time it spends there is zero, $\int_0^\infty1_{\{X_t=0\}}\,dt=0$ almost surely.
In fact, as we will see, the process does hit zero for all values of n less than 2 so, for $0<n<2$, 0 is a reflecting boundary.
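For n=0, the hitting probability can be sanity-checked numerically: in the Poisson-mixture construction of the $\chi^2_0(\lambda)$ distribution, the value is zero exactly when the Poisson count is zero, which happens with probability $e^{-\lambda/2}$. A small Monte Carlo sketch (the function name is mine):

```python
import math
import numpy as np

def besq0_zero_prob_mc(x0, t, samples=100_000, seed=0):
    """Estimate P(X_t = 0) for a BES^2_0 process started at x0.
    X_t/t is chi^2_0(x0/t), which is zero exactly when the Poisson
    count of rate x0/(2t) in its mixture representation is zero."""
    rng = np.random.default_rng(seed)
    counts = rng.poisson(x0 / (2.0 * t), samples)
    return float(np.mean(counts == 0))

# compare with the exact value exp(-x0/(2t)) from (5)
```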
We now show the equivalence of the definition of squared Bessel processes in terms of the noncentral chi-square distribution given in Definition 1 and in terms of the SDE (3). In particular, this demonstrates that (3) satisfies uniqueness in law, which we show by using Itô's lemma to derive a partial differential equation for the moment generating function.
Theorem 4 For any nonnegative process X and real $n\ge0$, the following are equivalent.
1. X is a $BES^2_n$ process.
2. $X_t-nt$ is a local martingale and $[X]_t=4\int_0^tX_s\,ds$.
3. X satisfies the SDE

$\displaystyle dX_t=2\sqrt{X_t}\,dW_t+n\,dt$ (6)

for a Brownian motion W (in the case n=0, it is necessary to assume the existence of at least one Brownian motion on the underlying filtration).
Proof:
(1) implies (2): The moments of a $\chi^2_n(\lambda)$ random variable Z can be computed by expanding ${\mathbb E}[e^{aZ}]$ and (1) in powers of $a$ and comparing coefficients. From this, we see that Z has mean $n+\lambda$ and variance $2n+4\lambda$. So, for the squared Bessel process, $X_t$ has mean $X_s+n(t-s)$ and variance $2n(t-s)^2+4X_s(t-s)$ conditional on $\mathcal F_s$ ($s<t$). Therefore, $M_t\equiv X_t-nt$ is a local martingale. Note that it can fail to be a proper martingale, since $X_0$ is not required to be integrable. Also,

$\displaystyle {\mathbb E}\left[M_t^2-M_s^2\;\middle|\;\mathcal F_s\right]={\rm Var}\left(X_t\mid\mathcal F_s\right)=2n(t-s)^2+4X_s(t-s).$

Comparing this with the following expression

$\displaystyle {\mathbb E}\left[4\int_s^tX_u\,du\;\middle|\;\mathcal F_s\right]=4\int_s^t\left(X_s+n(u-s)\right)du=2n(t-s)^2+4X_s(t-s)$

shows that $M_t^2-4\int_0^tX_u\,du$ is a local martingale. By properties of quadratic variations of continuous local martingales, $[M]_t=4\int_0^tX_u\,du$. As the finite variation term nt does not contribute to the quadratic variation, we have $[X]=[M]$ as required.
(1) implies (3): By the argument above, $X_t=X_0+nt+M_t$ for a continuous local martingale M with quadratic variation $[M]_t=4\int_0^tX_s\,ds$. One of the consequences of Lévy's characterization is that, assuming that there is at least one Brownian motion defined on the underlying filtration, $M_t=2\int_0^t\sqrt{X_s}\,dW_s$ for a Brownian motion W. It just needs to be shown that, if n is greater than zero, there does exist such a Brownian motion. Set

$\displaystyle W_t=\frac12\int_0^t1_{\{X_s\ne0\}}X_s^{-1/2}\,dM_s,$

which is a local martingale. Its quadratic variation is

$\displaystyle [W]_t=\int_0^t1_{\{X_s\ne0\}}\,ds.$

As we have already shown that X is nonzero (Lebesgue) almost everywhere, this gives $[W]_t=t$ and, again by Lévy's characterization, W is a Brownian motion.
(3) implies (2): First, $X_t-nt=X_0+2\int_0^t\sqrt{X_s}\,dW_s$ is a local martingale. Then, using $[W]_t=t$ gives $[X]_t=4\int_0^tX_s\,ds$ as required.
(2) implies (1): The idea is to derive a partial differential equation for the moment generating function of $X_t$, and show that the solution is given by (4). Using the fact that X has quadratic variation $[X]_t=4\int_0^tX_s\,ds$, Itô's lemma gives

$\displaystyle e^{aX_t}=e^{aX_s}+\int_s^t\left(an+2a^2X_u\right)e^{aX_u}\,du+\int_s^tae^{aX_u}\,dM_u$

for constant $a$ with negative real part and times $s\le t$, where $M_u=X_u-nu$. The final term is a local martingale and is bounded on finite time intervals (as all the other terms are). So, it is a proper martingale. Multiplying by a bounded $\mathcal F_s$-measurable random variable Z and taking expectations,

$\displaystyle {\mathbb E}\left[Ze^{aX_t}\right]={\mathbb E}\left[Ze^{aX_s}\right]+\int_s^t{\mathbb E}\left[Z\left(an+2a^2X_u\right)e^{aX_u}\right]du.$

We now introduce the function $f(t,a)={\mathbb E}[Ze^{aX_t}]$ and, noting that this has partial derivative $\partial f/\partial a={\mathbb E}[ZX_te^{aX_t}]$,

$\displaystyle f(t,a)=f(s,a)+\int_s^t\left(anf(u,a)+2a^2\frac{\partial f(u,a)}{\partial a}\right)du.$

So, $f$ is continuously differentiable in t and, by differentiating the above equation, it satisfies the following partial differential equation,

$\displaystyle \frac{\partial f}{\partial t}=anf+2a^2\frac{\partial f}{\partial a}.$

This is a transport equation and can be simplified by replacing $a$ with a time dependent function $a_t$ satisfying $\dot a_t=-2a_t^2$,

$\displaystyle \frac{d}{dt}f(t,a_t)=a_tnf(t,a_t).$

This is just an ordinary differential equation with the unique solution,

$\displaystyle f(t,a_t)=f(s,a_s)\exp\left(n\int_s^ta_u\,du\right).$ (7)

The ODE for $a_t$ is also easily solved, with the unique solution $a_t=a_s/(1+2a_s(t-s))$. Using this, the following expressions can be calculated,

$\displaystyle \int_s^ta_u\,du=\frac12\log\left(1+2a_s(t-s)\right),\qquad a_s=\frac{a_t}{1-2a_t(t-s)},\qquad 1+2a_s(t-s)=\frac1{1-2a_t(t-s)}.$

Substituting back into (7) and using the fact that Z is an arbitrary bounded $\mathcal F_s$-measurable random variable gives equality (4) as required.
It is natural to ask about the properties of Bessel process paths such as: does the process ever hit zero? Also, what happens to $X_t$ in the limit as t goes to infinity? We show below, in Theorem 6, that for $n<2$ the process hits zero at arbitrarily large times and, for $n\ge2$, it never hits zero at positive times. Furthermore, for $n>2$ it tends to infinity. The idea is to transform the process into a local martingale $Y=f(X)$, so that standard convergence results for continuous local martingales can be applied. The function f is called the scale function of X.
Lemma 5 Let X be a process satisfying the SDE (6) with $n\ge0$ and $X_0>0$, and make the following substitution.
- If $n<2$, let $\tau$ be the first time at which $X=0$, so that $X^\tau$ is the process stopped when it first hits zero, and set $Y=(aX^\tau)^b$ with

$\displaystyle a=(2-n)^{-2},\qquad b=\frac{2-n}2,\qquad c=\frac{1-n}{2-n}.$ (8)

Then, Y satisfies the SDE

$\displaystyle dY=1_{\{Y\ne0\}}Y^c\,dW.$ (9)

- if $n=2$ then $Y=\log X$ satisfies

$\displaystyle dY=2e^{-Y/2}\,dW.$ (10)

- if $n>2$ then $Y=(aX)^b$ with a, b, c as in (8) satisfies

$\displaystyle dY=Y^c\,dW.$ (11)

Although the substitutions above for $n\ge2$ are not defined when X=0, X remains strictly positive in this case.
Proof: For any $m\ge1$, let $\tau_m$ be the first time at which $X\le1/m$, so $\tau_m\le\tau$ and the stopped process $X^{\tau_m}$ is strictly positive. As $f(x)=(ax)^b$ is a smooth function on $(0,\infty)$, Itô's lemma can be applied to $Y=f(X)$ on the interval $[0,\tau_m]$,

$\displaystyle dY=a^bbX^{b-1}\,dX+\frac12a^bb(b-1)X^{b-2}\,d[X]=2a^bbX^{b-\frac12}\,dW+a^bb\left(n+2(b-1)\right)X^{b-1}\,dt.$

Here, the expression $d[X]=4X\,dt$ has been used. If $n\ne2$ then substituting expression (8) for b makes the last term on the right hand side equal to zero. So,

$\displaystyle dY=2a^bbX^{b-\frac12}\,dW=2a^{\frac12}bY^c\,dW.$

Using $a^{1/2}=1/|2b|$ gives $2a^{1/2}b=\pm1$ according to whether b is positive ($n<2$) or negative ($n>2$). So,

$\displaystyle dY=\pm Y^c\,dW$

on the interval $[0,\tau_m]$ and, replacing W by the Brownian motion $\mp W$ if necessary, $dY=Y^c\,dW$ there. Letting m increase to infinity, we see that equations (9) and (11) are satisfied on the interval $[0,\tau)$. If $n<2$ then $Y_\tau=0$, so $1_{\{Y\ne0\}}Y^c$ is equal to zero over the interval $[\tau,\infty)$, and (9) is satisfied. On the other hand, if $n>2$ then $b<0$, so Y explodes to infinity at time $\tau$. However, Y is a nonnegative local martingale and cannot explode in a finite time. This is a consequence of Fatou's lemma,

$\displaystyle {\mathbb E}\left[\lim_{m\to\infty}Y_{t\wedge\tau_m}\right]\le\liminf_{m\to\infty}{\mathbb E}\left[Y_{t\wedge\tau_m}\right]\le Y_0<\infty.$

So, Y is almost surely finite at each time and $\tau=\infty$, i.e., X never hits zero.
Now, consider the case with $n=2$ and $Y=\log X$. As $\log x$ is smooth on $(0,\infty)$, Itô's lemma can be applied on the interval $[0,\tau_m]$, in a similar way as above,

$\displaystyle dY=X^{-1}\,dX-\frac12X^{-2}\,d[X]=2X^{-\frac12}\,dW=2e^{-Y/2}\,dW.$

As above, this holds over the interval $[0,\tau)$ and it needs to be shown that $\tau=\infty$ almost surely. Letting $\sigma_K$ be the first time at which $Y\ge K$ for some constant K, Y is a local martingale bounded above over the interval $[0,\tau_m\wedge\sigma_K]$. Applying Fatou's lemma to −Y,

$\displaystyle {\mathbb E}\left[-\lim_{m\to\infty}Y_{t\wedge\tau_m\wedge\sigma_K}\right]\le\liminf_{m\to\infty}{\mathbb E}\left[-Y_{t\wedge\tau_m\wedge\sigma_K}\right]=-Y_0<\infty.$

However, Y diverges to $-\infty$ at time $\tau$, so the limit on the left hand side is infinite on the event $\{\tau\le t\wedge\sigma_K\}$, which must therefore have zero probability (almost surely). Letting K increase to infinity, $\sigma_K$ goes to infinity, showing that $\tau$ is almost surely infinite, as required.
Using the transformation given by Lemma 5 together with the convergence of continuous local martingales, it is possible to say whether a $BES^2_n$ process hits zero and to determine $\liminf_{t\to\infty}X_t$ and $\limsup_{t\to\infty}X_t$, for each value of n.
Theorem 6 Let X be a $BES^2_n$ process. With probability one,
- if $n=0$ then X hits 0 at some time, and remains there.
- if $0<n<2$ then X hits zero at arbitrarily large times and $\limsup_{t\to\infty}X_t=\infty$.
- if $n=2$ then X is strictly positive at all positive times and $\liminf_{t\to\infty}X_t=0$, $\limsup_{t\to\infty}X_t=\infty$.
- if $n>2$ then X is strictly positive at all positive times and $X_t\to\infty$ as $t\to\infty$.
Recall that the convergence theorem for continuous local martingales states that the event on which a continuous local martingale Y converges to a finite limit, the event on which $\sup_tY_t<\infty$, and the event on which $\inf_tY_t>-\infty$, are all identical (up to zero probability sets). This fact is used several times in the following proof.
Proof: The case with n=0 is easy. We have already shown that X hits zero with probability $e^{-X_0/(2t)}$ by time t conditional on $X_0$ and that, once it hits zero, it stays there. Letting t increase to infinity this converges to 1, so it hits zero almost surely at some time.
Let us show that $\limsup_{t\to\infty}X_t=\infty$ for all $n>0$. From the definition, $X_t/t$ has the $\chi^2_n(X_0/t)$ distribution conditional on $X_0$, which can be written as the sum of independent random variables $Y\sim\chi^2_n(0)$ and $Z\sim\chi^2_0(X_0/t)$. So, for any constant $K>0$,

$\displaystyle {\mathbb P}\left(X_t>K\right)\ge{\mathbb P}\left(Y>K/t\right)\rightarrow1$

as $t\to\infty$, so ${\mathbb P}(\limsup_{t\to\infty}X_t>K)=1$ for each K, giving $\limsup_{t\to\infty}X_t=\infty$.
For $n\ge2$ it was shown in Lemma 5 that X never hits 0, conditional on $X_0>0$. As X has zero probability of being equal to zero at any fixed time $t>0$, this shows that it never hits zero on $[t,\infty)$ and, letting t decrease to zero, it never hits zero at any positive time.
Now, consider $0<n<2$. As shown in Lemma 5, $Y=f(X^\tau)$ is a nonnegative local martingale for a continuous and strictly increasing function $f\colon{\mathbb R}_+\to{\mathbb R}_+$ satisfying $f(0)=0$, where $X^\tau$ is the process stopped when it first hits zero. As this trivially satisfies $\inf_tY_t\ge0>-\infty$, martingale convergence implies that $\lim_{t\to\infty}Y_t$ exists and is finite almost surely. However, as we have just shown, $\limsup_tX_t$ is infinite, so $X^\tau_t\ne X_t$ at large enough times. This can only happen if the process hits zero and, by the Markov property, it must hit zero at arbitrarily large times.
Alternatively, if $n=2$, then $Y_t=\log X_t$ is a local martingale over $t>0$. As shown above, $\limsup_tX_t$ is infinite so, by martingale convergence, Y does not converge to a finite limit and $\inf_tY_t=-\infty$. Therefore, $\liminf_{t\to\infty}X_t=0$.
Finally, it only remains to show that $X_t\to\infty$ for $n>2$. Again, using Lemma 5, $Y=f(X)$ is a nonnegative local martingale for a strictly positive and continuous function f satisfying $f(x)\to0$ as $x\to\infty$. As above, martingale convergence implies that $Y_\infty=\lim_{t\to\infty}Y_t$ exists and is finite almost surely. However, since we have already shown that $\limsup_tX_t=\infty$, it follows that $\liminf_tY_t=0$. So, $Y_\infty=0$ and, as f is strictly positive and only approaches zero at infinity, $X_t\to\infty$ as $t\to\infty$, as required.
An immediate consequence of the preceding result applies to n-dimensional standard Brownian motion. For positive integers n, the squared Bessel process was introduced above as the squared norm of an n-dimensional Brownian motion. In one and two dimensions, we see that Brownian motion is recurrent. That is, with probability one, it enters every nonempty open set at arbitrarily large times. As ${\mathbb R}^n$ has a countable base for its topology, it is enough to prove this for a countable collection of open sets and, by countable additivity of probability measures, it is equivalent to saying that B almost surely enters the open set U at arbitrarily large times, for each nonempty open $U\subseteq{\mathbb R}^n$ individually.
In three or more dimensions, Brownian motion is not recurrent. In fact, it diverges to infinity with probability one.
Theorem 7 Let B be an n-dimensional Brownian motion. With probability one,
- if $n\le2$ then B is recurrent.
- if $n\ge3$ then $\Vert B_t\Vert\to\infty$ as $t\to\infty$.
Proof: For $n\le2$, let U be a nonempty open set and choose $x\in U$ and $\epsilon>0$ such that the ball of radius $\epsilon$ about x is contained in U. Then, $X_t=\Vert B_t-x\Vert^2$ is a $BES^2_n$ process. By Theorem 6, $\liminf_{t\to\infty}X_t=0$, so $X_t<\epsilon^2$, and hence $B_t\in U$, at arbitrarily large times.
If $n\ge3$ then $X_t=\Vert B_t\Vert^2$ is a $BES^2_n$ process so, by Theorem 6, $X_t\to\infty$ as $t\to\infty$.
Finally for this post, I show that Bessel processes are well defined Markov processes and that continuous modifications do indeed exist.
Lemma 8 Fix any real $n\ge0$ and, for each $t>0$, let $P_t(x,\cdot)$ be the transition probability on ${\mathbb R}_+$ such that, for each $x\in{\mathbb R}_+$, $P_t(x,\cdot)$ has moment generating function given by

$\displaystyle \int e^{ay}\,P_t(x,dy)=(1-2at)^{-n/2}\exp\left(\frac{ax}{1-2at}\right)$ (12)

for $a$ with nonpositive real part. Then, $\{P_t\}$ is a Feller transition function. Furthermore, any Markov process X with this transition function has a continuous modification, which is then a $BES^2_n$ process.
Proof: To show that $\{P_t\}$ is a transition function, it is only necessary to prove the Chapman-Kolmogorov equation $P_{s+t}=P_sP_t$. This can be verified by directly computing the moment generating functions. Setting $b=a/(1-2at)$,

$\displaystyle \int\!\!\int e^{ay}\,P_t(z,dy)P_s(x,dz)=(1-2at)^{-n/2}\int e^{bz}\,P_s(x,dz)=(1-2at)^{-n/2}(1-2bs)^{-n/2}\exp\left(\frac{bx}{1-2bs}\right),$

so, using the identities $(1-2at)(1-2bs)=1-2a(s+t)$ and $b/(1-2bs)=a/(1-2a(s+t))$, $P_sP_t=P_{s+t}$ as required.
We now move on to the proof that this is a Feller transition function. It needs to be shown that, for any f in the set $C_0({\mathbb R}_+)$ of continuous functions vanishing at infinity, $P_tf$ is in $C_0({\mathbb R}_+)$ and $P_tf\to f$ as $t\to0$. First, if $f(x)=e^{-\lambda x}$ for a constant $\lambda>0$ then, taking $a=-\lambda$ in (12),

$\displaystyle P_tf(x)=(1+2\lambda t)^{-n/2}\exp\left(\frac{-\lambda x}{1+2\lambda t}\right),$

which satisfies the required properties. This extends to all $f\in C_0({\mathbb R}_+)$ by uniformly approximating f by linear combinations of such functions (see Lemma 9 below).
It only remains to show that a Markov process X with the transition function $\{P_t\}$ has a continuous modification. Any such process automatically satisfies (4) and, if it is continuous, is a squared Bessel process by definition. By the existence of cadlag modifications for Feller processes, we may assume that X is cadlag. It just needs to be shown that the jumps $\Delta X_u\equiv X_u-X_{u-}$ are almost surely equal to zero. If $s=t_0\le t_1\le\cdots\le t_m=t$ are the times $t_k=s+k(t-s)/m$, then any jump of X in the interval $(s,t]$ is a limit of increments of X across the partition intervals. From this, the following inequality is obtained, bounding the maximum jump of X over the interval $(s,t]$.

$\displaystyle \sup_{s<u\le t}(\Delta X_u)^4\le\liminf_{m\to\infty}\sum_{k=1}^m\left(X_{t_k}-X_{t_{k-1}}\right)^4$ (13)

We show that the right hand side converges to zero in probability. The moment generating function of $X_t-X_s$ can be computed from (4),

$\displaystyle {\mathbb E}\left[e^{a(X_t-X_s)}\;\middle|\;\mathcal F_s\right]=(1-2a(t-s))^{-n/2}e^{\theta X_s},$

where $\theta=2a^2(t-s)/(1-2a(t-s))$. The expected value of $(X_t-X_s)^4$ conditional on $\mathcal F_s$ can be calculated by expanding this out as a power series and looking at the coefficient of $a^4$. This is a bit messy but, noting that the expression above expands in powers of $a(t-s)$ and $\theta X_s$, we see that the coefficient of $a^4$ will be equal to $(t-s)^2$ multiplied by a polynomial in t−s, $X_s$ and n. Therefore,

$\displaystyle {\mathbb E}\left[(X_t-X_s)^4\;\middle|\;\mathcal F_s\right]\le C(1+X_s^2)(t-s)^2$

for a constant C, over any bounded range for s and t. Taking the expected value conditional on $X_0$, it follows that the right hand side of (13) goes to zero at rate $1/m$. So, $\sup_{s<u\le t}|\Delta X_u|$ is almost surely zero, as required.
The following result for uniformly approximating continuous functions on ${\mathbb R}_+$ was used in the proof that $\{P_t\}$ is a Feller transition function.
Lemma 9 The set of linear combinations of functions of the form $x\mapsto e^{-\lambda x}$ for $\lambda>0$ is dense in $C_0({\mathbb R}_+)$ (in the uniform norm).
Proof: Any $f\in C_0({\mathbb R}_+)$ can be written as $f(x)=g(e^{-x})$ where $g\colon[0,1]\to{\mathbb R}$ is continuous with $g(0)=0$. By the Stone-Weierstrass approximation theorem there are polynomials $p_m$ converging uniformly to g on the unit interval. Replacing $p_m$ by $p_m-p_m(0)$ if necessary, we can suppose that $p_m(0)=0$. Then, $f_m(x)\equiv p_m(e^{-x})$ are linear combinations of functions of the form $e^{-\lambda x}$ for $\lambda=1,2,\ldots$ and $f_m$ converges uniformly to f.
Dear George,
how would you (efficiently) simulate a BES^2_n process in such a manner that the process does stay positive (for non-integer dimension n, indeed)?
thanks again for these great posts!
Best
PS: tiny typo in the definition of Y in equation (8)
Comment by Alekk — 28 July 10 @ 2:56 PM 
Hi. I can give a quick answer now, but I’m not going to have much time to go into details for a few days.
1) You can sample a Bessel process exactly at a fixed sequence of times, by sampling from the skew chisquare distributions. As I explained near the top of the post, a distribution can be written as where, Y is , Z_{i} are standard normal and N is Poisson of rate . Equivalently, conditional on N. See the paper Exact Simulation of Bessel Diffusions which includes this and other related methods (actually, I found that paper while searching for another one which I remember from several years ago but forget the title. I’ll come back with the link if I remember…)
2) You can numerically simulate the SDE (3), which only gives an approximate solution, but will likely be much faster if you want to sample it at a dense set of times. It can be done by an Euler scheme. However, if the process is close to zero then the simulation could jump below zero. Then you would have to take the maximum with zero, which would bias the distribution, causing the drift to increase as the process gets close to zero. This would cause the approximation to converge very slowly in the number of time steps used when n is small. A better way would be to modify the Euler scheme so that you are sampling positive numbers from the start while matching all the moments of dX up to O(dt^{k}), for some appropriate k. For example, a trinomial distribution could be used to match all moments up to O(dt^{3}), leading to an O(dt^{2}) error term overall.
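The exact sampling in point 1) can be sketched as follows, stepping through a fixed grid of times via the transition $X_t/(t-s)\sim\chi^2_n(X_s/(t-s))$ conditional on $X_s$. The function name is my own and, for n > 0, every draw is strictly positive, which addresses the original question.

```python
import numpy as np

def besq_exact_path(n, x0, times, seed=0):
    """Sample a BES^2_n process exactly at the given increasing times.
    Conditional on X_s, X_t/(t-s) is chi^2_n(X_s/(t-s)); conditional on
    a Poisson count T of rate X_s/(2(t-s)), this is chi^2_{n+2T}(0),
    i.e. Gamma(n/2 + T, scale=2)."""
    rng = np.random.default_rng(seed)
    x, s, out = x0, 0.0, [x0]
    for t in times:
        h = t - s
        count = rng.poisson(x / (2.0 * h))
        shape = n / 2.0 + count
        x = h * rng.gamma(shape, 2.0) if shape > 0 else 0.0
        out.append(x)
        s = t
    return out
```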
Comment by George Lowther — 29 July 10 @ 2:17 AM 
and, what’s the typo?
Comment by George Lowther — 29 July 10 @ 2:32 AM 
thank you for these very interesting answer, and the reference !
I remember a discussion where someone advised me to take the scale function s of a diffusion X and to try to simulate s(X) instead: this would often eliminate the problem of staying positive (boundary effect), say, and would often be much more precise: I do not remember really well the motivation for that, though. From a practical point of view, why would simulating X directly be worse than taking the scale function in this problem, for example? In one case the process is obviously a martingale, but I do not understand why this is better. Could you shed some light on this issue?
Thank you for this fantastic blog!
PS: tiny typo in the definition of above eq 8, no ?
Comment by Alekk — 31 July 10 @ 4:12 AM 
I’m not sure why applying the scale function should eliminate any boundary effect. If you have a squared Bessel process of dimension n < 2, then the scale function is a positive power of X (Lemma 5 above). So it still has a boundary at zero. Also, if n=2, then the scale function is log(x). This does remove the boundary, but the SDE for Y=log(X) is dY = 2e^{−Y/2} dW, which has exponentially growing coefficients as Y goes negative. That does not look very promising from the point of view of obtaining accurate simulations.
In general, whether or not transforming by the scale function improves the simulations must depend on what particular SDE you start with and what simulation method you are using.
Another point: if f is the scale function then Y=f(X) does not have to be a martingale. In fact, for squared Bessel processes of dimension n > 0, this is never the case. For n < 2 it has a reflecting boundary, so only becomes a martingale if you stop the process at the boundary. For n>=2 the boundary is never attained but, still, Y is not a martingale. It’s just a local martingale.
[Apols. for not responding sooner. Not had chance to log on to my machine the last week].
Comment by George Lowther — 7 August 10 @ 3:42 AM 
Also, thanks for mentioning that the transformation I use in Lemma 5 is called the scale function. I updated the text to mention this.
Still don’t see the typo. I have Y = (aX^{τ})^{b}, which is what was intended. Maybe you are thinking that a should be outside the parentheses? But, that’s not true with my definition of a.
Comment by George Lowther — 7 August 10 @ 3:48 AM 
Thank you for all these very interesting comments (and apologies for this very late answer): seems like I have been a little bit optimistic with this “scale function” transformation!
Btw, numerical simulations of SDEs seem to be a very broad area: do you plan on writing on this ?
Comment by Alekk — 14 August 10 @ 6:48 PM 
I don’t have any immediate plans to post about numerical simulations. Maybe, at some point. I’ll bear your suggestion in mind though.
Comment by George Lowther — 15 August 10 @ 11:49 PM 
Hi George,
Great blog btw, really helpful and concise in clarifying many important concepts of stochastic processes. Just wanted to point out a small typo in Theorem 4, (3) implies (2): I think you want X_t − nt = 2 \int_0^t \sqrt{X_s} dW_s.
Comment by g — 15 May 12 @ 10:35 PM 