The definition of Markov processes, as given in the previous post, is much too general for many applications. However, many of the processes which we study also satisfy the much stronger Feller property. This includes Brownian motion, Poisson processes, Lévy processes and Bessel processes, all of which are considered in these notes. Once it is known that a process is Feller, many useful properties follow, such as the existence of cadlag modifications, the strong Markov property, quasi-left-continuity and right-continuity of the filtration. In this post I give the definition of Feller processes and prove the existence of cadlag modifications, leaving the further properties until the next post.

The definition of Feller processes involves putting continuity constraints on the transition function, for which it is necessary to restrict attention to processes lying in a topological space *E*. It will be assumed that *E* is locally compact, Hausdorff, and has a countable base (lccb, for short). Such spaces always possess a countable collection of nonvanishing continuous functions which separate the points of *E* and which, by Lemma 6 below, help us construct cadlag modifications. Lccb spaces include many of the topological spaces which we may want to consider, such as $\mathbb{R}^n$, topological manifolds and, indeed, any open or closed subset of another lccb space. Such spaces are always Polish spaces, although the converse does not hold (a Polish space need not be locally compact).

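Given a topological space *E*, $C_0(E)$ denotes the continuous real-valued functions vanishing at infinity. That is, $f\colon E\to\mathbb{R}$ is in $C_0(E)$ if it is continuous and, for any $\epsilon>0$, the set $\{x\in E\colon|f(x)|\ge\epsilon\}$ is compact. Equivalently, its extension to the one-point compactification $E^*=E\cup\{\infty\}$ of *E* given by $f(\infty)=0$ is continuous. The set $C_0(E)$ is a Banach space under the uniform norm,

$$\|f\|\equiv\sup_{x\in E}|f(x)|.$$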

We can now state the general definition of Feller transition functions and processes. A topological space *E* is also regarded as a measurable space by equipping it with its Borel sigma algebra $\mathcal{B}(E)$, so it makes sense to talk of transition probabilities and functions on *E*.

Definition 1 Let *E* be an lccb space. Then, a transition function $\{P_t\}_{t\ge0}$ is *Feller* if, for all $f\in C_0(E)$,

- $P_tf\in C_0(E)$.
- $t\mapsto P_tf$ is continuous with respect to the norm topology on $C_0(E)$.
- $P_0f=f$.

A Markov process *X* whose transition function is Feller is a *Feller process*.

Note: Feller processes, as defined here, are sometimes referred to as *Feller-Dynkin* processes (and similarly for Feller transition functions). The term Feller process is sometimes used to refer to the more general class of processes obtained by replacing $C_0(E)$ by the space $C_b(E)$ of continuous bounded functions in the definition above. I am following the terminology used by Revuz and Yor (Continuous Martingales and Brownian Motion).

The first condition says that, if *X* is a Feller process and $s<t$ then, the conditional distribution of $X_t$ depends on $X_s$ in a continuous sense. Also, as $P_{t-s}f$ vanishes at infinity, if *S* is a compact subset of *E*, then the probability that $X_t\in S$ will vanish conditional on $X_s$ being far away. That is, we can find compact sets $S'\subseteq E$ such that $\mathbb{P}(X_t\in S\mid X_s=x)$ is as small as we like for all $x\notin S'$.

Examples of Feller processes include,

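- standard Brownian motion, with the transition function $P_tf(x)=\frac{1}{\sqrt{2\pi t}}\int_{-\infty}^\infty f(y)e^{-\frac{(y-x)^2}{2t}}\,dy$ for $t>0$,
- Poisson processes of rate $\lambda$, with the transition function $P_tf(x)=\sum_{n=0}^\infty\frac{(\lambda t)^n}{n!}e^{-\lambda t}f(x+n)$.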
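
As a quick numerical sanity check, here is a minimal Python sketch for the Poisson case. It uses the standard fact that the number of jumps of a rate-$\lambda$ Poisson process over a time *t* is Poisson distributed with mean $\lambda t$, so that $P_tf(x)=\sum_n e^{-\lambda t}\frac{(\lambda t)^n}{n!}f(x+n)$. The function name `poisson_semigroup`, the test function and the truncation of the sum are illustrative assumptions, not part of the theory.

```python
import math

def poisson_semigroup(f, t, rate=1.0, terms=200):
    """Return x -> P_t f(x) for a Poisson process of the given rate.

    The infinite sum is truncated after `terms` terms, which is a
    numerical convenience and not part of the definition.
    """
    def Ptf(x):
        total = 0.0
        weight = math.exp(-rate * t)      # probability of n = 0 jumps
        for n in range(terms):
            total += weight * f(x + n)
            weight *= rate * t / (n + 1)  # probability of n + 1 jumps
        return total
    return Ptf

def f(x):
    return math.exp(-abs(x))  # continuous and vanishing at infinity

# Semigroup property: P_s P_t f = P_{s+t} f, here with s = 0.5 and t = 0.7.
lhs = poisson_semigroup(poisson_semigroup(f, 0.7), 0.5)(0.0)
rhs = poisson_semigroup(f, 1.2)(0.0)

# Continuity at time zero: P_t f(x) is close to f(x) for small t.
gap = abs(poisson_semigroup(f, 1e-6)(0.0) - f(0.0))
```

For this particular *f* and starting point 0, the sum can be evaluated by hand as $\exp(-\lambda t(1-e^{-1}))$, which gives an independent check on `rhs`.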

More generally, as we will see in a later post, all $\mathbb{R}^n$-valued processes with stationary independent increments (i.e., Lévy processes) are Feller. In fact, all processes $X_t$ which are continuous in probability and have independent increments, even if they are not stationary, have a Feller space-time process $(t,X_t)$.

Another situation in which Feller processes arise is from stochastic differential equations. Consider the SDE,

$$dX^i_t=\sum_{j=1}^ma_{ij}(X_t)\,dB^j_t+b_i(X_t)\,dt\qquad(i=1,\ldots,n)$$

for an *n*-dimensional process *X*, an *m*-dimensional Brownian motion *B*, and Lipschitz-continuous functions $a_{ij},b_i\colon\mathbb{R}^n\to\mathbb{R}$. As previously shown, such SDEs have a unique solution for any initial value $X_0=x\in\mathbb{R}^n$. We can then define $P_t(x,\cdot)$ to be the distribution of $X_t$, in which case *X* is Markov with transition function $P_t$. By continuity of solutions with respect to the initial value *x*, it can be seen that $P_tf$ will be continuous for $f\in C_0(\mathbb{R}^n)$. Next, as the coefficients are Lipschitz, they cannot grow any faster than linearly as $\|x\|\to\infty$. From this, it can be shown that, by making $\|x\|$ large, the probability of $X_t$ being in some fixed bounded region can be made as small as we like (this can be proven using a similar method as showing that SDEs with linearly bounded coefficients cannot explode). So, $P_tf\in C_0(\mathbb{R}^n)$. Furthermore, continuity of such processes implies that $P_tf(x)$ will be a continuous function of *t* which, as we will show, implies that $P_t$ is a Feller transition function.
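
The following is a rough Monte Carlo sketch of how $P_tf(x)=\mathbb{E}[f(X_t)\mid X_0=x]$ might be estimated for a one-dimensional SDE with Lipschitz coefficients. The Euler-Maruyama discretisation is my choice for illustration and is not used anywhere in these notes. The coefficients $b(x)=-x$ and $a(x)=1$, the function names, the path counts and the seed are all assumptions made for the example.

```python
import math
import random

def drift(x):
    return -x          # Lipschitz drift b(x), an illustrative choice

def diffusion(x):
    return 1.0         # constant (hence Lipschitz) coefficient a(x)

def euler_maruyama(x0, t, steps, rng):
    """Approximate one sample of X_t for dX = a(X) dB + b(X) dt with X_0 = x0."""
    dt = t / steps
    x = x0
    for _ in range(steps):
        x += drift(x) * dt + diffusion(x) * rng.gauss(0.0, math.sqrt(dt))
    return x

def estimate_Ptf(f, x0, t, paths=4000, steps=100, seed=0):
    """Monte Carlo estimate of P_t f(x0) = E[f(X_t)] for X_0 = x0."""
    rng = random.Random(seed)
    return sum(f(euler_maruyama(x0, t, steps, rng)) for _ in range(paths)) / paths

def bump(x):
    return math.exp(-x * x)   # an element of C_0(R)

mean_from_2 = estimate_Ptf(lambda x: x, 2.0, 1.0)  # exact mean is 2 * exp(-1)
near = estimate_Ptf(bump, 0.0, 1.0)                 # start inside the bump
far = estimate_Ptf(bump, 50.0, 1.0)                 # start far away
```

With these coefficients the mean of $X_t$ is known in closed form, $xe^{-t}$, so `mean_from_2` gives a check on the simulation. Comparing `near` and `far` illustrates, for this one example only, how $P_tf(x)$ becomes small when the starting point is far from the region where *f* is large.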

The second condition of Definition 1, that $t\mapsto P_tf$ is continuous under the norm topology, can sometimes be quite tricky to prove. Fortunately, a direct proof is not necessary, as the condition turns out to be equivalent to a seemingly much weaker one. That is, as stated in Theorem 7 below, it is enough to show that $P_tf(x)\to f(x)$ as $t\to0$, for each $x\in E$. In particular, in the examples of Feller processes mentioned above, this property is implied by continuity in probability.

Definition 1 generalizes to sub-Markovian transition functions, where it is not assumed that $P_t(x,\cdot)$ are probability measures. Instead, the inequality $P_t(x,E)\le1$ is imposed, so the probabilities could sum to less than one. As with general sub-Markovian transition functions, it is possible to extend the state space by adding a cemetery or coffin state $\Delta$, and setting $E_\Delta=E\cup\{\Delta\}$. The open sets defining the topology on this space consist of the subsets $U\subseteq E_\Delta$ such that $U\cap E$ is open in *E*. Equivalently, $\Delta$ is an isolated point and the subspace topology on *E* agrees with the original topology. Then, $E_\Delta$ is also an lccb space and extending the transition function to $E_\Delta$ as before,

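$$\tilde P_tf(x)=\begin{cases}\int_Ef(y)\,P_t(x,dy)+\left(1-P_t(x,E)\right)f(\Delta),&x\in E,\\ f(\Delta),&x=\Delta,\end{cases}\qquad(1)$$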

gives a Feller transition function on $E_\Delta$ describing a process which can jump to the state $\Delta$, and remain there.
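
The idea of adding a cemetery state is easiest to see on a finite state space, where a sub-Markovian kernel is just a matrix whose rows sum to at most one. The sketch below is a discrete-time, finite-state analogue only. The helper names `add_cemetery` and `matmul` and the matrix entries are made up for illustration.

```python
def add_cemetery(P):
    """Extend a sub-stochastic matrix P by one absorbing cemetery state.

    Row i receives the missing mass 1 - sum(P[i]) in a new last column,
    and the cemetery state itself never leaves.
    """
    n = len(P)
    extended = [row[:] + [1.0 - sum(row)] for row in P]
    extended.append([0.0] * n + [1.0])
    return extended

def matmul(A, B):
    """Plain matrix product of two lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

P = [[0.5, 0.3],
     [0.2, 0.6]]   # rows sum to 0.8, so mass 0.2 is lost at each step
P_ext = add_cemetery(P)

# Extending and then composing agrees with composing and then extending,
# so the extension preserves the semigroup (Chapman-Kolmogorov) property.
lhs = matmul(P_ext, P_ext)
rhs = add_cemetery(matmul(P, P))
```

Every row of `P_ext` sums to one, and the lost mass is simply redirected to the extra state, which is the finite-state picture of a process jumping to the cemetery and staying there.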

The definition of Feller transition functions can be considered in the more general context of continuous linear semigroups on the Banach space $C_0(E)$. If $\mu$ is a probability measure on *E* then its integral, $\mu(f)\equiv\int f\,d\mu$, defines a linear map from $C_0(E)$ to $\mathbb{R}$. The converse is given by the Riesz representation theorem, which I state here without proof.

Theorem 2 (Riesz-Markov) Let *E* be a locally compact Hausdorff space and $L\colon C_0(E)\to\mathbb{R}$ be a continuous linear functional. Then, there is a unique regular (finite signed) measure $\mu$ on *E* such that $L(f)=\mu(f)$ for all $f\in C_0(E)$. Furthermore, $\|L\|=|\mu|(E)$.

The condition that $\mu$ is regular is not important here, as all finite signed measures on the Borel sigma-algebra are regular in the case of lccb spaces. Then $\mu$ is a probability measure if and only if *L* is positive and $\|L\|=1$. Similarly, a positive linear map $L\colon C_0(E)\to C_0(E)$ uniquely defines a transition kernel *N* such that $Lf=Nf$. The property that $N1=1$, so that *N* is a transition probability, is a bit trickier to state in terms of *L*. However, the inequality $N1\le1$ is equivalent to $\|L\|\le1$, allowing us to define sub-Markovian transition functions. A sub-Markovian Feller transition function $\{P_t\}_{t\ge0}$ uniquely defines a strongly continuous semigroup $\{L_t\}_{t\ge0}$ of positive linear operators on $C_0(E)$ such that $\|L_t\|\le1$, simply by setting $L_tf=P_tf$. Conversely, applying the Riesz representation theorem, every such semigroup arises in this way from a unique sub-Markovian Feller transition function.

The main property of Feller processes, which we concentrate on here, is that they always have cadlag modifications.

Theorem 3 Every Feller process has a cadlag modification.

The proof of this is given below. A consequence is that any Feller transition function can be realized on the space of cadlag functions from $\mathbb{R}_+$ to *E*. This is a big improvement over Theorem 5 of the previous post, which applied to arbitrary transition functions but did not impose any properties on the paths of the process.

Corollary 4 Let *E* be an lccb space and $\Omega$ be the space of cadlag functions from $\mathbb{R}_+$ to *E*. Denote its coordinate process by $X_t(\omega)\equiv\omega(t)$, let $\mathcal{F}$ be the sigma-algebra on $\Omega$ generated by $\{X_t\}_{t\ge0}$ and, for each $t\ge0$, let $\mathcal{F}_t$ be the sigma-algebra generated by $\{X_s\}_{s\le t}$.

Then, for any Feller transition function $\{P_t\}_{t\ge0}$ and probability measure $\mu$ on *E*, there is a unique probability measure $\mathbb{P}$ on $(\Omega,\mathcal{F})$ with respect to which *X* is a Feller process with the given transition function and with initial distribution $\mu$.

*Proof:* By Lemma 3 of the previous post the measure $\mathbb{P}$, if it exists, is unique. We just need to construct one such measure.

Let $\Omega'=E^{\mathbb{R}_+}$ with coordinate process $X'$ generating the sigma algebra $\mathcal{F}'$ so that, by Theorem 5 of the previous post, there is a probability measure $\mathbb{P}'$ on $(\Omega',\mathcal{F}')$ with respect to which $X'$ is Markov with the given transition function and initial distribution. Then, by Theorem 3, $X'$ has a cadlag modification *Y*, say. As *Y* is cadlag, it defines a map $\Omega'\to\Omega$. Let $\mathbb{P}$ be the measure induced on $(\Omega,\mathcal{F})$ by this map. So *X* has the same distribution under $\mathbb{P}$ as *Y* does under $\mathbb{P}'$, and hence is Markov with the given transition function and initial distribution.

We now move on to the proof that Feller processes have cadlag versions. This will be split up into a couple of lemmas, the idea being to show that certain functions of Feller processes are supermartingales and, hence, existence of cadlag modifications for supermartingales can be applied.

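For each $\lambda>0$, the resolvent of a transition function $\{P_t\}_{t\ge0}$ on a measurable space $(E,\mathcal{E})$ is the kernel $U^\lambda$ defined by

$$U^\lambda f(x)=\int_0^\infty e^{-\lambda t}P_tf(x)\,dt\qquad(2)$$

for each bounded measurable $f\colon E\to\mathbb{R}$. For this to be well-defined, it is only necessary that $t\mapsto P_tf(x)$ is measurable, which is true for Feller processes (since it is continuous for $f\in C_0(E)$). If $f\in C_0(E)$ and the transition function is Feller then, using dominated convergence inside the integral in (2), $U^\lambda f(x_n)\to U^\lambda f(x)$ for any sequence $x_n\to x$ and $U^\lambda f(x_n)\to0$ for $x_n$ tending to the point at infinity. So, $U^\lambda f\in C_0(E)$.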

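The resolvent identity

$$U^\lambda-U^\mu=(\mu-\lambda)U^\lambda U^\mu$$

for $\lambda,\mu>0$ follows directly from applying the definition (2) and applying a change of variables. Similarly, applying the substitution $t\mapsto t/\lambda$ to (2) gives

$$\lambda U^\lambda f(x)=\int_0^\infty e^{-t}P_{t/\lambda}f(x)\,dt.\qquad(3)$$

In particular, for a Feller transition function, dominated convergence shows that $\lambda U^\lambda f\to f$ as $\lambda\to\infty$ for any $f\in C_0(E)$. Also, equation (3) implies that $\lambda U^\lambda$ is a transition probability.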
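
For a concrete feel for the resolvent, here is a small numerical sketch on a two-state continuous-time Markov chain, whose transition matrices are known in closed form. It approximates $U^\lambda=\int_0^\infty e^{-\lambda t}P_t\,dt$ by the trapezoidal rule, then checks the identity $U^\lambda-U^\mu=(\mu-\lambda)U^\lambda U^\mu$ and that $\lambda U^\lambda$ has rows summing to one. The chain, its rates and the quadrature settings are assumptions chosen purely for illustration.

```python
import math

A, B = 1.0, 2.0   # jump rates 0 -> 1 and 1 -> 0 (illustrative values)

def P(t):
    """Transition matrix at time t of the two-state chain with rates A and B."""
    c = A + B
    e = math.exp(-c * t)
    return [[(B + A * e) / c, (A - A * e) / c],
            [(B - B * e) / c, (A + B * e) / c]]

def resolvent(lam, steps=20000):
    """Approximate U^lam = int_0^inf exp(-lam t) P_t dt by the trapezoidal rule."""
    T = 40.0 / lam   # truncation point; the neglected tail is at most exp(-40) / lam
    h = T / steps
    U = [[0.0, 0.0], [0.0, 0.0]]
    for k in range(steps + 1):
        t = k * h
        w = (0.5 if k in (0, steps) else 1.0) * h * math.exp(-lam * t)
        Pt = P(t)
        for i in range(2):
            for j in range(2):
                U[i][j] += w * Pt[i][j]
    return U

def matmul(X, Y):
    """Product of two 2x2 matrices given as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

lam, mu = 1.0, 3.0
U_lam, U_mu = resolvent(lam), resolvent(mu)
prod = matmul(U_lam, U_mu)

# Largest entry of U^lam - U^mu - (mu - lam) U^lam U^mu, which should be near zero.
identity_error = max(abs(U_lam[i][j] - U_mu[i][j] - (mu - lam) * prod[i][j])
                     for i in range(2) for j in range(2))

# Rows of lam * U^lam should sum to one, as for a transition probability.
row_sums = [lam * sum(row) for row in U_lam]
```

For this chain the integral can also be done by hand, giving for example $U^1_{00}=\frac{2}{3}+\frac{1}{3}\cdot\frac{1}{4}=\frac{3}{4}$, which the quadrature should reproduce to several decimal places.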

The resolvent helps us find functions of the process which are supermartingales.

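Lemma 5 Let *X* be a Markov process taking values in *E* and with transition function $\{P_t\}_{t\ge0}$. Then, for any nonnegative, bounded and measurable $f\colon E\to\mathbb{R}$ and $\lambda>0$,

$$M_t\equiv e^{-\lambda t}U^\lambda f(X_t)$$

is a supermartingale.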
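
The supermartingale property of $e^{-\lambda t}U^\lambda f(X_t)$ comes down to the inequality $e^{-\lambda t}P_tU^\lambda f\le U^\lambda f$ for nonnegative *f*. As an informal check, the sketch below evaluates both sides on a two-state Markov chain, where $P_t$ and $U^\lambda$ can be written in closed form. The chain, its rates, the choice of *f* and the helper names are assumptions made only for this example.

```python
import math

A, B = 1.0, 2.0       # hypothetical jump rates 0 -> 1 and 1 -> 0
C = A + B
PI = [B / C, A / C]   # stationary distribution of the chain

def P_t(f, t):
    """Apply the time-t transition matrix to f = [f(0), f(1)]."""
    mean = PI[0] * f[0] + PI[1] * f[1]
    e = math.exp(-C * t)
    return [mean + e * (f[i] - mean) for i in range(2)]

def U(f, lam):
    """Resolvent U^lam f, from integrating exp(-lam t) P_t f in closed form."""
    mean = PI[0] * f[0] + PI[1] * f[1]
    return [mean / lam + (f[i] - mean) / (lam + C) for i in range(2)]

lam = 1.0
f = [1.0, 0.0]        # nonnegative and bounded
g = U(f, lam)

# Largest value of exp(-lam t) P_t U^lam f - U^lam f over a few times and
# both states; it should not exceed zero, up to rounding error.
worst = max(math.exp(-lam * t) * P_t(g, t)[i] - g[i]
            for t in (0.0, 0.1, 0.5, 1.0, 5.0) for i in range(2))
```

Equality holds at $t=0$, so `worst` is zero up to floating-point rounding rather than strictly negative.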

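*Proof:* Expression (2) for the resolvent gives

$$P_tU^\lambda f=\int_0^\infty e^{-\lambda s}P_{t+s}f\,ds=e^{\lambda t}\int_t^\infty e^{-\lambda s}P_sf\,ds.\qquad(4)$$

So, if *X* is Markov with the given transition function and *f* is nonnegative then, for times $s\le t$,

$$\mathbb{E}\left[U^\lambda f(X_t)\mid\mathcal{F}_s\right]=P_{t-s}U^\lambda f(X_s)\le e^{\lambda(t-s)}U^\lambda f(X_s).$$

Applying this to the conditional expectation of *M*,

$$\mathbb{E}\left[M_t\mid\mathcal{F}_s\right]=e^{-\lambda t}\,\mathbb{E}\left[U^\lambda f(X_t)\mid\mathcal{F}_s\right]\le e^{-\lambda s}U^\lambda f(X_s)=M_s.$$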

Next, it can be shown that if $f(X_t)$ has a cadlag modification for enough functions $f$, then *X* itself must have a cadlag modification. This does, however, require the state space to be compact. Although Feller processes are defined above for arbitrary lccb spaces, it is always possible to reduce to the case where *E* is compact. This can be done by taking the one-point compactification $E^*=E\cup\{\infty\}$ of *E*. Then,

$$P^*_tf(x)=\begin{cases}\int_Ef(y)\,P_t(x,dy),&x\in E,\\ f(\infty),&x=\infty,\end{cases}$$

(for bounded measurable $f\colon E^*\to\mathbb{R}$) defines a Feller transition function on $E^*$ which reduces to $P_t$ on *E*. This describes a process which is either in *E* behaving according to the transition function $P_t$, or is fixed at the point at infinity.

Lemma 6 Let *E* be a compact topological space and $S=\{f_1,f_2,\ldots\}\subseteq C(E)$ be a countable sequence of continuous functions separating points in *E*.

Then, a stochastic process *X* taking values in *E* has a cadlag modification if $f_k(X)$ has a cadlag modification for all $k$.

*Proof:* We first show that a sequence $x_n$ in *E* is convergent if and only if $f(x_n)$ converges for every $f\in S$, and that $x=\lim_nx_n$ is the unique element of *E* such that $f(x)=\lim_nf(x_n)$ for all $f\in S$. By continuity, this is a necessary condition for convergence and for *x* to be the limit, so we only need to show that it is sufficient. By compactness, every sequence has at least one limit point, and a sequence converges if and only if the limit point is unique. So, suppose that $f(x_n)$ converges for all $f\in S$ and let $x,y$ be limit points of $x_n$. Then, $f(x)=\lim_nf(x_n)=f(y)$ and, as *S* separates points, $x=y$ as required. Next, if *x* is any point satisfying $f(x)=\lim_nf(x_n)$ for $f\in S$ then, $f(x)=f(\lim_nx_n)$. Since *S* separates points, we have $x=\lim_nx_n$ as promised.

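Now suppose that $f_k(X)$ has a cadlag modification $Y^k$, say, for each *k*. Then, restricting to a set of probability one, $Y^k_t=f_k(X_t)$ for all $t\in\mathbb{Q}_+$ and $k$. It follows that $t\mapsto f_k(X_t)$, restricted to $\mathbb{Q}_+$, is right-continuous and has left and right limits at all nonnegative reals. By the condition above for convergence in *E*, $t\mapsto X_t$, restricted to $\mathbb{Q}_+$, is also right-continuous with left and right limits everywhere in $\mathbb{R}_+$. So, we can define the cadlag modification

$$\tilde X_t\equiv\lim_{s\downarrow t,\ s\in\mathbb{Q}_+,\ s>t}X_s.$$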

Using Lemma 5 to find functions $f$ such that $f(X_t)$ has a cadlag modification and applying Lemma 6 to find a cadlag version of *X* gives us the proof of Theorem 3.

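*Proof of Theorem 3:* We first prove this in the case where *E* is compact. Then, for each nonnegative $f\in C(E)$ and $\lambda>0$, the process $e^{-\lambda t}U^\lambda f(X_t)$ is a supermartingale and, if it is right-continuous in probability, it will have a cadlag modification. In fact, $g(X_t)$ is right-continuous in probability for any $g\in C(E)$. This follows from the criterion that a sequence $Y_n$ of real-valued random variables converges to a limit *Y* in probability if and only if $\mathbb{E}[h(Y_n,Y)]\to\mathbb{E}[h(Y,Y)]$ for each continuous and bounded $h\colon\mathbb{R}^2\to\mathbb{R}$. If $t_n\downarrow t$ then, applying the Feller property to the functions $y\mapsto h(g(y),g(x))$ for each fixed *x*, together with bounded convergence,

$$\mathbb{E}\left[h(g(X_{t_n}),g(X_t))\right]=\mathbb{E}\left[\int h(g(y),g(X_t))\,P_{t_n-t}(X_t,dy)\right]\to\mathbb{E}\left[h(g(X_t),g(X_t))\right]$$

and $g(X_{t_n})\to g(X_t)$ in probability.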

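As *E* is an lccb space, there exists a countable set $S\subseteq C(E)$ of nonnegative functions separating the points in *E*. Then, $U^\lambda f(X_t)$ has a cadlag modification for each $f\in S$ and $\lambda>0$ and, since $\lambda U^\lambda f\to f$ as $\lambda\to\infty$, the countable set $\{U^nf\colon f\in S,\ n=1,2,\ldots\}$ also separates the points of *E*. By Lemma 6, *X* has a cadlag modification.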

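Finally, let us consider the case where *E* is not compact. Then, letting $E^*$ be its one-point compactification, *X* can also be considered as a Feller process with state space $E^*$. So, we may pass to a cadlag modification taking values in $E^*$, and it only needs to be shown that $X_t\in E$ and $X_{t-}\in E$ for all *t*, with probability one. To prove this, choose a strictly positive $f\in C_0(E)$ and $\lambda>0$, and consider the supermartingale $M_t=e^{-\lambda t}U^\lambda f(X_t)$. The function $U^\lambda f$ will be strictly positive on *E*, and can be extended to all of $E^*$ by setting $U^\lambda f(\infty)=0$. Letting $\tau_n$ be the first time at which $M_t<1/n$, we can use optional sampling to get

$$\mathbb{E}\left[1_{\{\tau_n\le t\}}M_T\right]\le\mathbb{E}\left[1_{\{\tau_n\le t\}}M_{\tau_n}\right]\le\frac1n$$

for each $t\le T$ and $n\ge1$. Letting *n* increase to infinity, and writing $\tau=\lim_n\tau_n$,

$$\mathbb{E}\left[1_{\{\tau\le t\}}M_T\right]=0.$$

As $X_T\in E$ with probability one, $M_T$ is strictly positive and, therefore, $\mathbb{P}(\tau\le t)=0$. As this holds for each *t*, $\tau=\infty$ almost surely, meaning that $\inf_{t\le T}M_t$ is almost surely positive for all *T* and, hence, $X_t\in E$ and $X_{t-}\in E$ for all *t*.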

We now show that, in the definition of Feller processes, it is not necessary to impose the condition that $t\mapsto P_tf$ is continuous under the norm topology, for $f\in C_0(E)$. In fact, the much weaker condition that $P_tf(x)\to f(x)$ as $t\to0$ is enough.

Theorem 7 A transition function $\{P_t\}_{t\ge0}$ on an lccb space *E* is Feller if and only if, for all $f\in C_0(E)$,

- $P_tf\in C_0(E)$ for all $t\ge0$.
- $P_tf(x)\to f(x)$ as $t\to0$, for each $x\in E$.

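*Proof:* That these properties are necessary is trivial, so we just prove that they are also sufficient. To show that $t\mapsto P_tf$ is norm-continuous, it suffices to prove continuity at 0. In that case, if $s<t$ then,

$$\|P_tf-P_sf\|=\|P_s(P_{t-s}f-f)\|\le\|P_{t-s}f-f\|\to0$$

as $t-s\to0$, as required. Now, letting *A* be the set of all $f\in C_0(E)$ such that $\|P_tf-f\|\to0$ as $t\to0$, the idea is to show that every element of $C_0(E)$ is in *A*. In fact, *A* is closed under the norm topology. If $f_n\in A$ converge uniformly to a limit *f* then,

$$\|P_tf-f\|\le\|P_t(f-f_n)\|+\|P_tf_n-f_n\|+\|f_n-f\|\le\|P_tf_n-f_n\|+2\|f_n-f\|.$$

As $t\to0$, the right hand side tends to $2\|f_n-f\|$ which, by choosing *n* large, can be made as small as we like. So, $\|P_tf-f\|$ tends to 0, and $f\in A$.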

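To find a sequence in *A* converging to any given $f\in C_0(E)$, we make use of the resolvent. If $g=U^\lambda f$ then, by (4),

$$P_tg-g=\left(e^{\lambda t}-1\right)U^\lambda f-e^{\lambda t}\int_0^te^{-\lambda s}P_sf\,ds.$$

As $P_s$ and $\lambda U^\lambda$ are transition probabilities, this gives $\|P_sf\|\le\|f\|$ and $\|U^\lambda f\|\le\|f\|/\lambda$. Hence,

$$\|P_tg-g\|\le\left(e^{\lambda t}-1\right)\|f\|/\lambda+te^{\lambda t}\|f\|\to0$$

as $t\to0$. So, $g\in A$. It only remains to show that $f$ can be expressed as the uniform limit of such elements of *A*.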

Choosing a sequence $\lambda_n\to\infty$, consider $f_n\equiv\lambda_nU^{\lambda_n}f$. By assumption, $P_tf(x)\to f(x)$ as $t\to0$, for all $x\in E$. Then, applying dominated convergence in the integral in (3) gives $f_n(x)\to f(x)$. This is not quite enough, as we need to find a sequence converging uniformly to $f$. However, Lemma 8 below states that it is possible to improve pointwise convergence to norm convergence just by passing to convex combinations. So, as *A* is a linear space, there is a sequence in *A* converging uniformly to $f$, which must therefore also be in *A*, as required.

Finally, the proof of Theorem 7 required the following lemma, which allows us to strengthen the convergence of a sequence in $C_0(E)$ from pointwise to uniform convergence simply by passing to convex combinations. The proof of this is rather non-constructive, relying on the Hahn-Banach theorem for locally convex spaces and the Riesz representation theorem (Theorem 2) to give a proof by contradiction. The notation $\mathrm{conv}(S)$ denotes the set of finite convex combinations of elements of a set *S*.

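Lemma 8 Let *E* be a locally compact space and $f_1,f_2,\ldots$ be a uniformly bounded sequence of functions in $C_0(E)$ converging pointwise to the limit $f\in C_0(E)$. That is, $f_n(x)\to f(x)$ for each $x\in E$.

Then, there is a sequence $g_n\in\mathrm{conv}(\{f_1,f_2,\ldots\})$ converging uniformly to $f$.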
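
To see what the lemma says in a simple case, consider tent functions on the real line which slide off to infinity. They converge pointwise to zero and all have uniform norm one, so they do not converge uniformly, but their averages do. The sketch below checks this numerically. The tent functions, the grid used to approximate the uniform norm and the function names are my own illustrative choices.

```python
def tent(n):
    """Tent function of height 1 centred at n, supported on [n - 1/2, n + 1/2]."""
    return lambda x: max(0.0, 1.0 - 2.0 * abs(x - n))

def average(N):
    """Convex combination g_N = (f_1 + ... + f_N) / N of the first N tents."""
    fs = [tent(n) for n in range(1, N + 1)]
    return lambda x: sum(f(x) for f in fs) / N

def sup_norm(g, lo=0.0, hi=60.0, points=2401):
    """Approximate the uniform norm by the maximum over a grid on [lo, hi]."""
    step = (hi - lo) / (points - 1)
    return max(abs(g(lo + k * step)) for k in range(points))

norms_f = [sup_norm(tent(n)) for n in (1, 10, 50)]      # stay close to 1
norms_g = [sup_norm(average(N)) for N in (1, 10, 50)]   # decay like 1/N
```

Because the tents have disjoint supports apart from endpoints, the average of the first *N* of them has uniform norm $1/N$, so the convex combinations converge uniformly to the pointwise limit even though the original sequence does not.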

*Proof:* The result is equivalent to the statement that $f$ is in the norm closure *S*, say, of $\mathrm{conv}(\{f_1,f_2,\ldots\})$. Suppose that this was not the case. Then, by the Hahn-Banach theorem, there exists a linear and norm-continuous map $L\colon C_0(E)\to\mathbb{R}$ such that $L(f)>K\ge L(g)$ for some constant *K* and all $g\in S$. Then, the Riesz representation theorem gives a finite signed measure $\mu$ satisfying $L(g)=\mu(g)$ for $g\in C_0(E)$. However, using bounded convergence, this leads to the following contradiction,

$$K<L(f)=\mu(f)=\lim_{n\to\infty}\mu(f_n)=\lim_{n\to\infty}L(f_n)\le K.$$

Mik – I moved your comment to the About page, and responded there, as it was not relevant to this post.

Comment by George Lowther — 16 July 10 @ 11:38 PM |

Dear George, thank you for this great post!

Do you think that it might be possible to include examples of processes that are almost Feller – and see how the properties that you prove in this series of posts break for these almost-feller processes ?

Best

Alekk

ps: any reference except Revuz-Yor and Rogers-Williams ?

Comment by Alekk — 24 July 10 @ 4:15 PM |

Good idea! I’ll give some thought towards writing a post on “almost” Feller to show how the properties proven here can fail (i.e., existence of cadlag versions, the strong Markov property, right-continuity of the filtration and quasi-left-continuity).

The next post is going to be on Bessel processes, which *are* Feller. After that, I am planning on a post demonstrating a simple SDE whose solutions are local martingales but fail to be proper martingales. Coincidentally, they also just fail to be Feller processes but, as they satisfy all the properties I have proven for Feller processes, it doesn’t really give the kind of counterexample you are asking for.

Almost-Feller processes could fit into any of the following categories.

1) $P_tf(x)$ is jointly continuous in t and x but not in $C_0(E)$ for $f\in C_0(E)$.

2) The definition of Feller process is satisfied with $C_b(E)$ in place of $C_0(E)$. That is, $P_tf$ is continuous for bounded continuous f, and $t\mapsto P_tf$ is continuous under the uniform norm.

3) $t\mapsto P_tf$ is uniformly continuous for $f\in C_0(E)$, but $P_tf(x)$ is not continuous in x.

4) E is not lccb. Either because it is not locally compact or because it does not have a countable base.

Examples satisfying (1) and not having cadlag modifications are easy to construct. Just take something standard like Brownian motion or reflecting Brownian motion and remove the point {0} from the state space. Seems like a bit of a cheat, but I think all examples are like this. They are Feller processes in a larger state space with some points removed.

I think processes in class (2) satisfy all the properties of Feller processes anyway. You can pass to a larger space on which it is Feller (using the Gelfand representation) and then show that it doesn’t hit the additional points in the state space anyway (I think…).

Processes in class (3) are easily constructed which fail the strong Markov property. E.g., consider a real-valued process which either stays at zero or is a Brownian motion. This has transition function

$$P_tf(x)=\begin{cases}f(0),&x=0,\\ \frac{1}{\sqrt{2\pi t}}\int f(y)e^{-\frac{(y-x)^2}{2t}}\,dy,&x\ne0,\end{cases}$$

but fails the strong Markov property at times T with $X_T=0$. Again, I think all examples are similar to this in that they can be considered as processes on a larger space, but with some points identified. You can also construct examples in a similar way in which the completed filtration is not right-continuous.

I need to think about (4).

Any nice examples you or anyone else know of would be gratefully accepted!

Comment by George Lowther — 25 July 10 @ 10:13 PM |

For references, I’m not sure what is best. I am mainly working through the subject from the perspective of semimartingales in these notes, rather than getting deeply into Markov process theory. All the results I cover here are included in Revuz-Yor. Checking Kallenberg, Foundations of Modern Probability, I see that it has a chapter on Feller Processes and Semigroups.

Comment by George Lowther — 25 July 10 @ 10:51 PM |

Hi George,

I was curious about what necessary conditions can ensure that the transform of a Markov (or Feller) process by a function $F$ stays Markov (or Feller).

In particular, for a Markov (or Feller) process defined by an SDE (with coefficients s.t. we have existence and uniqueness), are there necessary conditions that would lead to an explicit calculation (using the coefficients of the SDE or the infinitesimal generator of the process itself and of its transform by $F$)?

The question comes from the fact that I know a few sufficient conditions, but when those are not fulfilled, I feel it is always a case by case study that determines whether or not the transformed process is Markovian (or Feller), and no general methodology is applicable.

As asked, the question is not really completely “well posed”, but an answer with additional assumptions on the process itself would still be interesting.

Best regards

Comment by TheBridge — 9 February 12 @ 12:01 PM |

Sorry for not responding quicker to this question. I did see it, but didn’t have a good answer immediately. I don’t think that there is a good answer to this though. You can easily construct necessary conditions and sufficient conditions on F, but I don’t think there are any useful conditions which are both necessary and sufficient. So, it does depend on your particular application and I think you are right that it requires a case by case study.

Comment by George Lowther — 22 February 12 @ 2:09 AM |

Hi George,

Thanks for your answer, indeed the problem seems quite difficult to tackle. But I wonder if it is not linked in some way to the Lie group classification of SDEs (you can take a look at Kozlov's article and the references therein: “The Group Classification of a Scalar Stochastic Differential Equation”, J. Phys. A: Math. Theor. 43 (2010) 055202 (13pp)).

In particular, if the group of transformations preserves the Markov property (I’m not sure about this) and if it can be shown that only those transforms can do so (even less sure about this), then there might be some hope.

Best regards

Comment by TheBridge — 24 February 12 @ 9:29 AM |

Hi there, I really like your blog!

I am an econometrician myself, and I appreciate your blog for the intuition that is not provided by many text books. I am actually writing a paper about optimal stopping in higher dimensions, where I actually use Feller processes. But I’m not an expert. I was hoping perhaps you could help me find a reference for a question, that I believe is possibly relatively simple, but haven’t been able to prove it / find a reference. Any suggestions are greatly appreciated.

If we have a Feller process with an infinitesimal generator and resolvent , then we have uniformly as for any . We also have the ‘inverse property’ of the resolvent for any . Multiplying by and re-writing this equality, we obtain . This shows that for some as if and only if and . This property clearly holds for any if is .

But now suppose is only locally in the neighbourhoud of the point , where we evaluate the limit , but elsewhere it is only . Specifically, in my case, it has a kink on a hyper surface of measure zero. But we evaluate away from this hyper surface, i.e. we evaluate the limit where and . Does the result remain true? I believe it does, perhaps by Dynkin’s formula, since is in any ball around . Any ideas to the truth of the statement and/or possible references?

Many thanks & best wishes!

Comment by Rutger-Jan Lange — 23 September 16 @ 1:51 AM |

PS Perhaps the last sentence of my second paragraph is a bit unclear. I meant to say if and only if as well as are true at the specific location (and in its infinitesimal neighbourhood for the derivatives to make sense) where we evaluate the limit. Does this statement remain true if is only locally , i.e. in a small ball around the location , where is identically zero in this ball, such that and ? (I believe so, but can only find proofs if is globally and vanishes at infinity, i.e. if is , which doesn’t hold for me.)

Comment by Rutger-Jan Lange — 23 September 16 @ 5:39 AM |

Hi. I’m not sure where you would find a precise statement of what you are asking. However, I do have a couple of comments. First, you are using where I think you need to be using the domain of A. So, I suppose that you are only considering generators with in their domain. Such generators can be expressed as a continuous diffusion term plus a jump term (Revuz & Yor state this in their book, Continuous Martingales and Brownian motion). Your later comment suggests that you are only considering the continuous case (i.e., where you say that Af(x)=0 if f is 0 in a neighbourhood of x). Either way, if f is only in a neighbourhood of x, then Af is strictly speaking not well defined as f is not in the domain of A. However, you can define Af in the region as the non- points will only contribute to the jump component, which does not require smoothness properties. As you say, it then reduces to the case where f is 0 in the neighbourhood of x. I think you just need to show that, in that case, . This certainly looks like it should be true. Or, prove directly that which, I think, follows from .

Btw, I moved your comment to the relevant post.

Comment by George Lowther — 28 September 16 @ 9:25 PM |

If and is well defined, you can use.

Comment by George Lowther — 28 September 16 @ 9:32 PM |

Many thanks! I am indeed considering Feller diffusions processes, i.e. no jump component, for which the domain of includes (at least) the space of functions (indeed as in Revuz and Yor). I should have said for some function in the domain of , so I was sloppy. Yes I agree that looks like it should be true if is locally identically zero, i.e. in some ball around of strictly positive radius. To me, it seems irrelevant what happens outside of the ball as long as remains continuous and bounded everywhere (we could still say the resolvent maps to itself, right?). Thanks for the derivation as well (last line should probably have rather than ?) I will let you know when our paper is ready and mention you in the acknowledgements. If you would be interested in reading the draft, then let me know. It’s about a new (we believe) class of monotone and uniformly converging algorithms for calculating the value function for optimal stopping problems in higher dimensions, which are considered hard because they feature free boundaries, using (global) resolvent/integral methods rather than (local) PDE/finite difference methods. Best wishes, Rutger-Jan

Comment by Rutger-Jan Lange — 30 September 16 @ 11:47 AM

I was convinced but now I’m having second thoughts… Again take and assume for all inside some ball of radius centred at some location . Clearly, for any . The question remains whether this implies . While this seems plausible, it is not obvious that the derivatives converge to zero… At least, it is not *generally* true that the derivative converges to zero if the value converges to zero.

Comment by Rutger-Jan Lange — 30 September 16 @ 4:16 PM |

It is true in the context you ask for. I can provide more details later.

Comment by George Lowther — 30 September 16 @ 4:25 PM |

That is all I need, so that would be great!! Curious what your solution is. I intuitively believe it to be true, but that’s not good enough, so many thanks for your help.

Comment by Anonymous — 30 September 16 @ 5:05 PM |

Consider the following statements.

i) If exists for some And then as .

ii) If is zero in a neighbourhood of then .

Combining these gives what you want. For the first, use

For the second, choose with And in a neighbourhood of . Then, so,

Comment by George Lowther — 30 September 16 @ 6:32 PM

So I was on travels, but that’s great, thanks! In (ii) you wrote and , was that intentional? Can we take both in ? To make things easier, suppose is actually non-negative, so we can take , with and and both and equal to zero in the neighbourhood of . Then using your argument we can write , since in the vicinity of , and we can use all the nice resolvent properties on , right?

Comment by Rutger-Jan Lange — 8 October 16 @ 7:41 PM |

I think I may have got a bit mixed up with the indices there — was intended to be bounded, measurable, and vanishing at infinity. is twice continuously differentiable and vanishing at infinity. The argument should generalise to arbitrary bounded measurable though, with a bit more work.

Comment by George Lowther — 9 October 16 @ 11:59 PM |

Dear George, is a Hawkes process a Feller process?

Comment by Hamilton Bellman — 31 December 17 @ 12:41 AM |

In Theorem 2 (and Lemma 8), I think you need to assume additionally that the space is $\sigma$-compact to be able to conclude that the measure is finite. The measure is in general only locally finite.

Thank you for your amazing blog.

Comment by Matti Kiiski — 6 May 18 @ 5:18 PM |

Nevermind, the finiteness is indeed guaranteed by the continuity.

Comment by Matti Kiiski — 6 May 18 @ 10:48 PM |