About

Almost Sure – (adj) true on a set of probability one.

Welcome to my maths blog, almost sure, which will contain posts on a variety of mathematical subjects. Although my background is mainly in stochastic processes, the aim of this blog is to discuss any interesting mathematical topics that arise. This could involve approaches to tackling unsolved problems, new techniques applied to old problems, or any other maths which seems worth posting. Hopefully, others will find at least some of what I post here to be interesting and worthy of discussion. Feedback and constructive criticism are very welcome. For help on formatting equations, see LaTeX in Comments.

George Lowther

61 thoughts on “About”

  1. Hi.
    I have found an answer you gave on mathoverflow.
    http://mathoverflow.net/questions/1294?sort=votes#sort-top
    I have a similar problem that relates to that question.
    I do genomic research. Modern techniques allow the genome to be sequenced at relatively low cost. They produce short reads that are randomly (and, let’s say, uniformly) distributed along the molecule. What I need is to know the distribution of distances between two consecutive reads, knowing the total length of the DNA molecule and the total number of reads.
    If we generalize the problem, we could ask:
    What is the distribution of distances between two consecutive points randomly distributed over a linear space, knowing that the average distance is k?

    I know it is somehow related to the Poisson distribution, but I do not want to know how many times an event occurs; rather, I want the distance between two events. My knowledge of probability is very poor; maybe you can give me a hint.

    Going back to the original post, what is N in the formula?
    I guess:
    n = number of points
    d = required minimal distance
    L = total length
    N = ?

    Thank you

    1. Hi. I assume that when you talk about points “randomly distributed over linear space”, you mean some fixed number of points uniformly distributed on a line segment? Then, the sample average distance is k = length / (number of points + 1). Or do you mean something different? Either way, at least in the limit as number of points→infinity, the distances between points should be approximately exponentially distributed (like with the Poisson process).

      Also, to answer “what is N in the formula?”: N = n. It was just a typo. :)

    2. In fact, your question, if I understood it properly, does have a very simple answer. Cut a line segment of length L at N uniformly chosen random points, and pick one of the N+1 pieces at random. By a symmetry argument, you may as well pick the first piece. Then, letting X be the length of this piece, the probability that X>x is the same as the probability that none of the random points are in the initial interval [0,x]. By independence, this has probability P(X>x)=(1-x/L)^N. We can normalize X by the average length k=L/(N+1) to get

      \displaystyle P(X/k > x) = \left(1-\frac{x}{N+1}\right)^N.

      As I hinted in my other reply, for large N this is approximately exponential,

      \displaystyle P(X/k > x) \approx e^{-x}.
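
      As a quick numerical check, here is a minimal Python sketch (the values of L, N and x below are arbitrary choices of mine, not from the original question) comparing the empirical tail of the first piece with the exact formula and the exponential limit:

      import numpy as np

      L, N, samples = 1.0, 50, 100_000
      k = L / (N + 1)                      # average piece length

      # Cut [0, L] at N uniform points and take the length of the first piece.
      cuts = np.sort(np.random.uniform(0.0, L, size=(samples, N)), axis=1)
      first_piece = cuts[:, 0]

      x = 1.0
      print("empirical P(X/k > x):     ", np.mean(first_piece / k > x))
      print("exact (1 - x/(N+1))^N:    ", (1.0 - x / (N + 1)) ** N)
      print("exponential limit exp(-x):", np.exp(-x))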

  2. I googled for the integration of correlated processes in the hope of getting some ideas on how to proceed next with my problem, and that is how I discovered your blog. Great description of stochastic calculus. I think I am getting a better understanding of many difficult spots in stochastic calculus because you chose to explain the topics from a different angle. I will be reading your blog regularly…

    If you can provide some clues I would greatly appreciate it.

    So here is a problem. Suppose we have two correlated processes Y and X. We also have, in general, deterministic time-dependent functions a(t) and rho(t). And we have Brownian motions W1 and W2. Here are the equations in differential form:

    \setlength\arraycolsep{2pt} \begin{array}{rl} \displaystyle dY(t) &\displaystyle= X(t) Y(t)\,dW_1\\ \displaystyle dX(t) &\displaystyle= a(t) X(t)\,dW_2\\ \displaystyle dW_1\,dW_2 &\displaystyle= \rho(t)\,dt \end{array}

    The task is to get an integral for Y(t) = … If there were no correlation (rho=0) then we would end up with a double integral, I guess. But how to proceed if we have a correlation, any idea?

    Thank you!

    1. Welcome Mik.

      There are three points which come to mind:

      • Most SDEs don’t have an explicit solution. It is only very special cases which do, such as geometric Brownian motion (which your process X follows), Ornstein-Uhlenbeck processes, etc. You can usually prove useful properties, find approximations, etc. But actually constructing the solution requires Monte-Carlo simulation or numerically solving a PDE. I suspect that this is the case here.
      • Actually, here, I do recognize this SDE. Your process Y is the same as that followed by an asset price (stock price, FX rate, etc) under a stochastic volatility model with a lognormal vol process X. Such models are sometimes used in practice, normally with additional drift terms in the SDE. This further suggests that there is no explicit solution because, if there was, it would be well known.
      • We can make some steps towards putting the SDE in a more easily understandable form by rewriting it in terms of independent driving BMs. This can always be done, by a form of Gram-Schmidt orthonormalization. As it is easy to solve for X here, which is just the lognormal process

        \displaystyle X=X(0)\exp\left(\int a(s)\,dW_2(s)-\frac12\int a(s)^2\,ds\right),

        we really just need to rearrange it to get a simpler or easier to understand form for Y.

      I’ll give this third point a try. The SDE for Y would just give a lognormal process for X, if X were deterministic. So, let’s start by taking its logarithm Z=\log(Y). By Ito’s lemma,

      \displaystyle dZ(t) = X(t)\,dW_1(t) - \frac12 X(t)^2\,dt.

      Next, introduce the BM

      \displaystyle dW_3 =\frac{1}{\sqrt{1-\rho^2}}\left(dW_1-\rho\,dW_2\right)

      which is independent of W2 (this is essentially Gram-Schmidt orthonormalization).
      Substituting back into the expression for Z,

      \displaystyle dZ(t) = X(t)\sqrt{1-\rho^2}\,dW_3 + \rho(t) a(t)^{-1} dX(t) - \frac12X(t)^2\,dt.

      This can be integrated out,

      \displaystyle Z(t)=Z(0)+\int_0^t X(s)\sqrt{1-\rho(s)^2}\,dW_3 + \int_0^t \rho(s)a(s)^{-1}\,dX(s) - \frac12\int_0^tX(s)^2\,ds

      So, conditional on X, Z(t)=log(Y(t)) will be normal with mean and variance

      \setlength\arraycolsep{2pt} \begin{array}{rl} &\displaystyle\mu = \int_0^t \rho(s)a(s)^{-1}\,dX(s) - \frac12\int_0^t X(s)^2\,ds,\\ &\displaystyle\sigma^2 = \int_0^t X(s)^2(1-\rho(s)^2)\,ds. \end{array}

      Being lognormal, X is easy to simulate and this expression then allows you to simulate samples for Y(t). I don’t think I can go much further than this though.
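
      To make that simulation recipe concrete, here is a minimal Python sketch of it (the time grid and the choices of a, rho, X(0), Y(0) are mine, purely for illustration):

      import numpy as np

      T, n = 1.0, 1000
      dt = T / n
      t = np.linspace(0.0, T, n + 1)
      a = 0.3 + 0.1 * t                    # a(t), an arbitrary choice
      rho = 0.5 * np.ones(n + 1)           # rho(t), an arbitrary choice
      X0, Y0 = 1.0, 1.0

      def sample_Y():
          dW2 = np.sqrt(dt) * np.random.randn(n)
          # X is lognormal, so use the exact update on each step.
          X = np.empty(n + 1)
          X[0] = X0
          for i in range(n):
              X[i + 1] = X[i] * np.exp(a[i] * dW2[i] - 0.5 * a[i] ** 2 * dt)
          dX = np.diff(X)
          # Conditional on the path of X, Z(T) = log Y(T) is normal with this mean and variance.
          mu = np.sum(rho[:-1] / a[:-1] * dX) - 0.5 * np.sum(X[:-1] ** 2) * dt
          var = np.sum(X[:-1] ** 2 * (1.0 - rho[:-1] ** 2)) * dt
          Z = np.log(Y0) + mu + np.sqrt(var) * np.random.randn()
          return np.exp(Z)

      samples = np.array([sample_Y() for _ in range(5000)])
      print("Monte-Carlo estimate of E[Y(T)]:", samples.mean())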

      (Btw, I latexed the formula in your post.)

      1. In financial econometrics the nonzero correlation is called the leverage effect. The possibility of leverage greatly increases the complexity of proofs, as in Jacod and Protter (2012).

  3. Thank you George. Of course, it is finance and the original problem is quite a bit messier. But as an honest scholar I did not want to ask someone to solve the problem for me. I wanted to see if there is a systematic way of getting correlation into the integral. I thought for a sec about Gram-Schmidt too (but I was not persistent enough). You showed clearly how to proceed with it to get the expression for Z(t). But this technique will probably get much messier when we have 3 processes X(t), Y(t), Z(t). At some point I wondered if there is a way to exploit the identity(?)
    d(X(t) Y(t)) = X(t) dY(t) + Y(t) dX(t) + dX(t) dY(t)
    But then the last term is going to yield the correlation explicitly. The trick, of course, is to have a closed form Y(t) = f( X(t), rho, … )
    But as you mentioned above, not all SDEs have a closed-form solution, which is pretty much true for ordinary differential equations too.

    In ODEs, however, there is a nice apparatus of approximations (via asymptotic expansions, etc) to an original nasty equation. People use it extensively in fluids problems, where the full Navier-Stokes equations are pretty much intractable analytically. The approximations help yield closed-form expressions around some limiting cases. Do you know if there is something similar in the SDE world? Are there any prominent publications out there in which asymptotics are applied to SDEs directly? One can in principle transform an SDE to some PDE and then apply approximations there. But I was wondering if approximations can be applied to the SDE directly…

    Thank you!

      1. @Mik: This is not really Gram-Schmidt, but rather a Cholesky decomposition which is performed here.
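
        For concreteness, this is all that is meant (a minimal Python sketch; the correlation value and step sizes are arbitrary choices of mine): independent Brownian increments are turned into correlated ones by applying the Cholesky factor of the correlation matrix.

        import numpy as np

        # Correlated Brownian increments from independent ones via the Cholesky factor.
        corr = np.array([[1.0, 0.5],
                         [0.5, 1.0]])           # correlation matrix, rho = 0.5 here
        chol = np.linalg.cholesky(corr)         # lower-triangular factor

        dt, n = 0.001, 100_000
        dZ = np.sqrt(dt) * np.random.randn(n, 2)   # independent increments
        dW = dZ @ chol.T                           # correlated increments dW1, dW2

        print("empirical correlation:", np.corrcoef(dW[:, 0], dW[:, 1])[0, 1])   # close to 0.5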

        Also, for approximations of Ito SDE solutions, you have all the tools coming from diffusion PDEs: perturbation methods, Heat Kernel Expansion,… By the way, Lewis contains outdated material (financially speaking, but also mathematically). You can grab a copy of Pierre Henry-Labordère’s “Analysis, Geometry, and Modeling in Finance” for applications of the HKE in option pricing. Also, in your case, carrying out an asymptotic expansion of the moments of Y with respect to the vol-of-vol process a should be straightforward with physicists’ perturbation theory.

        There are a great many quant. finance papers about various approximations to SDE solutions. SSRN and arXiv are good starting places.

        Also, let me take the opportunity to thank you, Mr Lowther, for your blog. Stochastic processes are a complex subject, and you manage to break them down into manageable pieces in a very delightful way.

        Regards

  4. Sorry to intrude; is there a way to contact you? I sent you a “planetmath” email, but I’m not sure you’re reading those. Many thanks

  5. Hi, I found my way to your blog through your answer to the following problem, which by the way helped me improve a paper I was writing by replacing a bound by the exact probability. Thank you.

    http://mathoverflow.net/questions/1294/mean-minimum-distance-for-n-random-points-on-a-one-dimensional-line/1308#1308

    I am thinking about the same problem in a two-dimensional region (let’s say a unit square). Do we know the exact distribution for the minimum distance between n points inside the unit square chosen uniformly at random? And how about other types of distance, like the l_infinity distance (so we avoid the problem of packing circles inside a square)?

    One thing I have noticed when researching these questions is that, because there are so many open problems related to these kinds of problems, it becomes very confusing to separate what is known (or even trivial) from what is not. One example is the 1-d problem you answered. Honestly, until I found your answer, I was lost among a lot of papers talking about all kinds of fancy stuff about the spanning tree formed by the points, the distance to the k’th neighbor, etc. A reference which says “okay, n points inside the unit square (or whatever it is), here’s what we know, here’s what’s probably really hard to calculate” would really work for me.

    Again thanks a lot for all your contribution.

    1. Welcome Job,

      I think this is the kind of problem which will get very difficult very quickly as you add complications, such as increasing the dimension, changing the shape of the region, using different metrics, etc. So, I’m not surprised that you encountered lots of papers using fancy methods.

      As you increase the dimension, the simple argument I used does not work anymore. You could obtain bounds using results on sphere packing. For exact formulas, even the case of just two points can be difficult in higher dimensions, according to Wadim Zudilin’s comment to this mathoverflow question. I’m not sure about a list of what is known and what isn’t, other than by searching through available papers on the subject.
      I think that some other related problems such as the minimum distance of a randomly chosen point to the other points can be a bit easier.
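
      Where exact formulas are out of reach, Monte-Carlo simulation at least gives estimates. A minimal Python sketch (the choices of n, the number of trials and the threshold d are arbitrary, just for illustration):

      import numpy as np
      from scipy.spatial.distance import pdist

      # Estimate the distribution of the minimum pairwise distance of n uniform
      # points in the unit square.
      n, trials, d = 10, 20_000, 0.1
      min_dists = np.empty(trials)
      for i in range(trials):
          pts = np.random.uniform(size=(n, 2))
          min_dists[i] = pdist(pts).min()    # Euclidean; pdist(pts, 'chebyshev') gives l_infinity

      print("estimated P(min distance > d):", np.mean(min_dists > d))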

  6. Hi, this is an off-topic question, but I wasn’t sure which post to put it under, so I put it on the most recent one.
    Can a local martingale have decreasing paths? Thank you very much.

    1. Hi. It’s not completely clear to me what you mean. Are you asking if a local martingale can almost surely have decreasing paths? In that case, the answer would be no. This is because a martingale cannot have almost surely decreasing paths, as its expectation should be constant. The same statement holds for local martingales, by localizing this argument. Maybe you mean, can a local martingale have decreasing paths with nonzero probability? In that case, the answer is yes. Consider the process which starts at zero and then, at some positive time, jumps to ±1 each with 50% probability. This has decreasing paths with 50% probability and is a martingale. For continuous local martingales the answer would be no, it is not possible to have decreasing paths with positive probability, because a continuous local martingale must have infinite variation whenever it is non-constant. However, it is possible to find continuous local martingales which have decreasing paths when restricted to a discrete set of times. E.g., let W be a Brownian motion started from zero and τ be the first time at which it hits −1. Then,

      \displaystyle X_t = \begin{cases} W_{\min(t/(1-t),\tau)},&\textrm{if }t < 1,\\ -1,&\textrm{if }t\ge1 \end{cases}

      is a local martingale with X1 = −1 < 0 = X0.
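
      As a sanity check, the construction is easy to simulate. A minimal Python sketch (the time step and block size are arbitrary choices):

      import numpy as np

      # Simulate X_t = W_{min(t/(1-t), tau)}, where tau is the first time W hits -1.
      # tau is almost surely finite, so the loop below terminates.
      ds = 0.01
      W = np.array([0.0])
      while W.min() > -1.0:
          block = W[-1] + np.cumsum(np.sqrt(ds) * np.random.randn(10_000))
          W = np.concatenate([W, block])
      i_tau = np.argmax(W <= -1.0)           # index of the (discretized) hitting time
      W = W[: i_tau + 1]
      tau = i_tau * ds

      def X(t):
          s = min(t / (1.0 - t), tau) if t < 1.0 else tau
          return W[int(round(s / ds))]

      print("X(0) =", X(0.0), "  X(1) =", round(X(1.0), 3))   # 0 and approximately -1
      print("max of X over [0,1] =", round(W.max(), 3))       # typically > 0, so the path is not monotone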

      btw, I moved your post to the About page because, as you say, it was off-topic.

  7. Thank you very much for your detailed response. I was referring to local martingales that can have decreasing paths with positive probability.
    But I didn’t understand your last example: how can you discuss the value of X at 1, since then t/(1−t) doesn’t make sense (1/0)?

  8. Hi,

    Could someone please help me with the following problem (which actually could be very simple): If X and Y are two random variables such that
    X ≥ 0 and E[Y|F] ≥ 0, prove that E[XY|F] ≥ 0. (X and Y are not independent, nor are X and Y independent of F, or measurable with respect to F.)
    Thank you very much.

    1. Soumik,

      Hi. It’s not really the intention of this blog to answer arbitrary maths questions. I’m not sure where the best place for this kind of question is; possibly http://math.stackexchange.com would be suitable. But, as it is quite quick, I can tell you that the result you state is not true. Consider a uniformly distributed variable Y on the interval [−1,1], take X = 1{Y≤0} and let F be the trivial sigma-algebra. Then, X ≥ 0, E[Y|F] = E[Y] = 0 but E[XY|F] = E[XY] = −1/4.
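
      A quick numerical check of this counterexample (a minimal Python sketch):

      import numpy as np

      # Y uniform on [-1,1], X = 1{Y <= 0}, F the trivial sigma-algebra.
      Y = np.random.uniform(-1.0, 1.0, size=1_000_000)
      X = (Y <= 0).astype(float)
      print("E[Y]  =", Y.mean())            # approximately 0
      print("E[XY] =", (X * Y).mean())      # approximately -1/4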

  9. Dear Prof George Lowther,
    you may find the following “almost sure convergence/approximation” results of mine in stochastic calculus interesting:

    Rajeeva Karandikar (rlk@cmi.ac.in)

    Pathwise solution of stochastic differential equations. Sankhya A, 43, 1981, pp. 121-132.

    On quadratic variation process of a continuous martingale. Illinois Journal of Mathematics, 27, 1983, pp. 178-181.

    A.s. approximation results for multiplicative stochastic integrals. Séminaire de Probabilités XVI, Lecture Notes in Mathematics 920, Springer-Verlag, Berlin, 1982, pp. 384-391.

    On Metivier-Pellaumail inequality, Emery topology and pathwise formulae in stochastic calculus. Sankhya A, 51, 1989, pp. 121-143.

    On a.s. convergence of modified Euler-Peano approximations to the solution of a stochastic differential equation. Séminaire de Probabilités XXV, Lecture Notes in Mathematics 1485, Springer-Verlag, Berlin, 1991, pp. 113-120.

    On pathwise stochastic integration. Stochastic Processes and their Applications, 57, 1995, pp. 11-18.

      1. It has been a long time; how are things? What are you up to? I’ve been living in the Dallas area for the last 10+ years. Switch to email if you like. If it isn’t visible in WordPress, try winwaed.com and use the contact form.

  10. Dear George Lowther,

    Thanks for the great blog, it’s much easier to study from than wading through textbooks.

    One question: with reference to this StackExchange question http://math.stackexchange.com/questions/168566/stopped-filtration-filtration-generated-by-stopped-process/, I’m wondering if you (or anyone else) have any knowledge regarding the equivalence of the two sigma-algebras described there. As I mentioned in my reply, the sigma-algebra at the stopping time (wrt the natural filtration) and the sigma-algebra generated by the stopped process were shown to be equivalent in 1970 by Haezendonck and Delbaen. However I can’t find the proof anywhere online, despite it being quite useful in practice. Maybe it’s too deep to be included here, but it seems like an important result to be included somewhere…

  11. Dear George,

    Can you please tell me what your main literature source was for the post about compensators?

    Kind regards

  12. Dear George,

    first of all, thanks a lot for all this material you post for free on the web…
    I am wondering if there is a pdf version available, just to print it out and read it more easily…
    I saw that the same question has already been asked, but that was a long time ago…

    thanks a lot

    Tom

  13. Hi George,
    You posted a solution to a question of mine on stackexchange at the address below. I would like to first inquire as to whether it has been published anywhere. If not, I have developed a fundamentally different proof which I am interested in publishing (I will be happy to explain in detail if you are interested), and I would like to include your proof as well, so I would like to discuss this with you.
    Thanks,
    Trevor Richards

    http://math.stackexchange.com/questions/437598/conjecture-every-analytic-function-on-the-closed-disk-is-conformally-a-polynomi

  14. Dear George,
    I had a follow-up question to a comment of yours to the following question at
    http://mathoverflow.net/questions/76625/there-are-d-random-variables-given-all-k-d-joint-probability-distributions-with

    You mention that
    “For k=2 then you need the covariance matrix to be positive semidefinite. This is not guaranteed just by having the one dimensional distributions being consistent. – George Lowther Sep 29 ’11 at 6:46”

    I can see that as a necessary condition but I do not see how to prove this is sufficient as well. Is there somewhere I can look this up?
    Best regards,
    Jibran

  15. Hi George

    Thanks for your notes. They are very nice and thoughtful. On the page, you mention notes. Is there a chance you will post a pdf version of your notes? I find it much easier to read a printed version, especially if it’s long and requires annotations etc.

    Thanks

    1. Hi. It is a standard result that random walks in the plane are recurrent. See, for example, Kallenberg, Foundations of Modern Probability (2nd Ed), Theorem 9.2.

  16. I’ve just found this fantastic blog, which solves a lot of my questions. Thank you very much!

  17. Sorry, I have been away from this blog for a while now. I plan to resurrect it soon, continuing with the stochastic calculus posts and, maybe, branch out with some other stuff too.

  18. Hi, George:

    Your blog is wonderful. I have learnt much from it.

    I have a question regarding your answer on math.stackexchange.com https://math.stackexchange.com/a/2260325/64809. I would very much appreciate it if you could answer it.

    For the sake of clarity, I quote my question here:
    Could you please write out the detailed proof of the existence of a continuous function f \ni C(f)=\inf\{C(f)\}? You claim that the existence of such a continuous function follows from the uniform convexity of L^p. It is not clear to me. In addition, given that such continuous f exists, it is not clear to me why your statement regarding Equation (2) should hold, if f\ne0. Essentially you seem to claim \|f_n\|\rightarrow \|f\|\Longrightarrow \|f_n-f\|\rightarrow 0 which is in general wrong.

    Hans

  19. Hi George, your blog is excellent. I’m a PhD student doing research in mathematical finance. I’d really like to read your PhD thesis; may I ask if you could share a digital copy of it with me?
    Thanks a lot. Kind regards.

    1. Sorry, I don’t have a digital copy. I see that it has now been made available by Cambridge University if it is for research purposes and you email them. It looks like I can ask them to make it available in the online repository, although that might be for a fee.

      The main mathematical results of my thesis are included in the papers I published on arXiv (plus the Annals of Probability, for a couple of them). See Fitting Martingales to given Marginals. These papers are better, imo (I wrote them for publication some years later, and tidied up the argument quite a bit). One thing that was in my thesis but that I never published separately (I didn’t really think it was worth it, as it seemed like a very technical extension of previous results) was the equivalence of the existence of martingale measures and the absence of arbitrage, along the lines of results by Delbaen & Schachermayer, but with weakened hypotheses; I applied it to show that the existence of all discount factors (i.e., zero coupon bonds of all maturities) implies the existence of instantaneous spot rates.

      1. Thanks a lot George! I’ll look into your papers! Merely as an observation, there exists a database of theses offered by ProQuest (like a Netflix of PhD theses, with around 5 million of them).
        If you ever want to make yours available to future researchers, perhaps with your approval Cambridge can sell it to ProQuest. Kind regards!

  20. Hi George,
    Nice to connect. I have a question related to your mathoverflow response (https://mathoverflow.net/questions/59739/gaussian-processes-sample-paths-and-associated-hilbert-space).

    Let X be a generic topological space and k : X x X -> R be a reproducing kernel indexed on X. Let H_k be the RKHS of functions f : X -> R associated to k. Let ⟨·,·⟩ be the inner product of H_k. Consider a Gaussian process GP(0,k) supported on X.

    In the (machine learning) GP community people often refer to Driscoll’s theorem, which states that if the RKHS H_k is infinite dimensional then any sample f ~ GP does not belong to H_k with probability 1. Suppose that H_k has a countable orthonormal basis e = {e_1, e_2, …}.

    Is it possible for you to expand your argument (based on cylindrical measures) to justify that for any sample f ~ GP and any basis element e_n, the quantity ⟨f, e_n⟩ is well defined, despite the fact that f is not in H_k a.s.?

    In various papers in the GP community people use arguments that are specific to the particular choice of kernel k. It would be very nice to have a more general justification that doesn’t depend on the particular choice of kernel k.

    Many thanks,

    Cris

  21. Hello George,

    First of all thank you for providing us with this awesome blog full of knowledge. It has been extremely helpful to me throughout the years.

    I have a question about integrating a pure jump process with respect to another pure jump process, and how to obtain the characteristic function of this integral. I have been stuck on this for more than a week now, so I was hoping you could give some insight here.
    Let’s call our two independent pure jump processes $X(t)$ and $Y(t)$, and say we know their characteristic functions $\phi_{X}(u)$ and $\phi_{Y}(u)$. The integral in question would be
    \displaystyle Z(t) = Z(0) + \int_{0}^t Y(s_{-}) dX(s)
    I have been trying to find the characteristic function
    \displaystyle \mathbb{E}\left[ \exp\left( \int_{0}^t Y(s_{-}) dX(s) \right) \right]
    But so far I have been unsuccessful. I also have written a related post on mathoverflow: https://mathoverflow.net/questions/450064/characteristic-function-of-stochastic-integral-of-a-pure-jump-l%c3%a9vy-process-with
    In that post, I documented my attempt so far.

    Thank you,
    Tom

    1. Oops, I made a typo there, and I don’t know how to edit my post, so I’ll post the correction here:
      The characteristic function I’m looking for is obviously
      \displaystyle \mathbb{E}\left[ \exp\left( iu\int_{0}^t Y(s_{-})\, dX(s) \right) \right]

      Also, just to render the missing LaTeX objects: the independent pure jump processes are X(t), Y(t), and their associated characteristic functions are \phi_{X}(u) and \phi_{Y}(u) respectively.

  22. Hello George,

    Fantastic blog. Your concise and clearly worded posts have certainly made the topic of Stochastic Calculus more accessible to me.

    I am a physicist by background so please forgive me if I am unable to frame my question in full mathematical rigour. Further, apologies if I have overlooked a direct answer to this question in one of your posts.

    I am exploring Brownian drawdown processes, D_t, as one of my own personal pursuits. My understanding is that a drawdown time series is essentially a series of non-negative Brownian excursions, with zero periods in between, and with the durations of each stochastically distributed.

    Given that a drawdown process has entered a Brownian excursion, i.e. D_t > 0, what is the probability that the drawdown will exceed the value D_t = a before returning to zero? I am not concerned with what values the time series may achieve over all times (i.e. multiple excursions and zero events for some period 0<t<T), but just with that single drawdown event.

    In your Brownian excursion and Brownian bridge posts I have spotted references to maxima attained over a fixed interval, but presumably the interval for one specific excursion is variable, and this should be accounted for in the calculation.

    Finally, if we arrive at a result for my problem, it would help confirm that I have understood everything correctly if you could re-frame it in the context of scaled Brownian motion with standard deviation sigma.

    Thank you kindly,

    Matthew

  23. Hello George,
    Thank you for Almost Sure!
    I found “at least some of what you post here to be interesting”.
    Sincerely,
    Vasyl

  24. Hello George,

    The blog articles are quite fantastic.

    Could you please add a reference list for further learning (e.g. for proofs)? I think that could be very helpful for many readers.
