Almost Sure

6 October 2010

Asymptotic Expansions of a Recurrence Relation

In today’s post, I investigate a simple recurrence relation and show how it is possible to describe its behaviour asymptotically at large times. The relation describing how the series evolves at a time n will depend both on its value at the earlier time n/2 and on whether n is even or odd, which, as we will see, introduces a `period doubling’ behaviour in the asymptotics. The proof will involve defining a Dirichlet generating function for the series, showing that it satisfies a functional equation and has a meromorphic extension to the whole complex plane, and then inverting the generating function with Perron’s formula. Cauchy’s residue theorem relates the terms of the asymptotic expansion to the poles of the meromorphic extension of the Dirichlet series. Such an approach is well-known in analytic number theory, originally used by Riemann to give the explicit formula for the prime counting function in terms of the zeros of the Riemann zeta function. However, Dirichlet generating functions can be applied in many other situations to generate asymptotic expansions (see the references for examples). Although I will concentrate on a specific difference equation here, the techniques described are quite general and also apply to many other kinds of series.

This post grew out of an answer I gave to a question by Byron Schmuland at the math.stackexchange website. To motivate the difference equation, let us start by considering the following Markov chain whose state space is the positive integers, as described in the original question by Byron. The chain begins at state 1, and from state n the chain next jumps to a state uniformly selected from {\{n+1,n+2,\ldots,2n\}}. As time goes on, this chain goes to infinity, with occasional large jumps. In any case, it is quite unlikely to hit any particular large n.

If we define p(n) to be the probability that the chain visits state n, then p(n) goes to zero like c/n for some constant c. In fact,

\displaystyle  np(n)\rightarrow c=\frac{1}{2\log 2-1}=2.588699... \ \ \ \ \ (1)


The question asked on math.stackexchange is to give an analytic proof of this fact. It is possible to prove this using a (tricky) probabilistic method and, as explained by Byron and in the other answers to his question, there are quick analytic proofs of conditional convergence. That is, if np(n) does converge to a limit, then it must be the value given in (1). In this post, I use Dirichlet series not only to answer the original question, but also to give the entire asymptotic expansion. I do not know whether the results given here are already known although, as noted, they are very similar to asymptotic expansions used in analytic number theory and to results of Flajolet and Golin on divide-and-conquer recurrences. The layout of this post is as follows. First, I briefly discuss the recurrence relation satisfied by p and some of the approaches which can be used, as suggested in the original math.stackexchange question. I will then look at the numerical evidence for the behaviour of p, plotting graphs of the first few asymptotic terms. The result of the numerical simulation is used to propose a full asymptotic expansion. Next, the Dirichlet generating function is introduced and a few of its properties, including a functional equation, are proved. The proof of the asymptotic expansion is then given by applying Perron’s formula to invert the generating function. Finally, I end with a few comments about the proof given here.

First, the description of the Markov process above leads to the following recurrence relation for p(n).

\displaystyle  \setlength\arraycolsep{2pt} \begin{array}{rl} \displaystyle p(1) &\displaystyle=1,\smallskip\\ \displaystyle p(n) &\displaystyle=\sum_{m=\lceil n/2\rceil}^{n-1}\frac{p(m)}{m},\textrm{\ \ for }n>1. \end{array} \ \ \ \ \ (2)


Alternatively, if we take the difference of p(n+1) and p(n) then most of the terms in the summation above cancel, leading to the difference equation

\displaystyle  p(n+1)=(1+1/n)p(n)-1_{\{n{\rm\ even}\}}\frac{2}{n}p(n/2)-1_{\{n=1\}}. \ \ \ \ \ (3)


This is just a linear recurrence relation for p, albeit with two complicating factors. First, there is the periodic term {1_{\{n{\rm\ even}\}}} which, as we will see, causes a period doubling effect in the asymptotics. Then, there is the dependence on p at the earlier time n/2. If we were to replace p(n/2) by p(n) on the right hand side of (3), this would give a simple first order difference equation, and the entire series would be determined by the value of p at any single time. So the space of solutions to the difference equation for large {n\ge N} would form a one-dimensional vector space. However, for (3), knowing the value of p(N) is not enough to find p(n) for {n\ge N}. We would also need to be given the values of p at times N-1, N-2,…,N/2. This means that the space of solutions to (3) at arbitrarily large times gives an infinite dimensional space and, consequently, we should not be surprised to find that the asymptotic expansion involves infinitely many independent terms.
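
As a quick sanity check, the two forms of the recurrence can be compared numerically. The following Python sketch (the helper names are mine) computes p(n) both from the summation form (2), using a running sum of p(m)/m, and from the difference equation (3), and compares np(n) with the constant c from (1).

```python
import math

def p_from_sum(N):
    # p(1) = 1 and p(n) = sum_{m=ceil(n/2)}^{n-1} p(m)/m, as in (2).
    # S[n] holds sum_{m<=n} p(m)/m, so each p(n) is a difference of two S values.
    p = [0.0] * (N + 1)
    S = [0.0] * (N + 1)
    p[1], S[1] = 1.0, 1.0
    for n in range(2, N + 1):
        p[n] = S[n - 1] - S[(n + 1) // 2 - 1]
        S[n] = S[n - 1] + p[n] / n
    return p

def p_from_diff(N):
    # p(n+1) = (1 + 1/n) p(n) - 1{n even} (2/n) p(n/2) - 1{n=1}, as in (3).
    p = [0.0] * (N + 1)
    p[1] = 1.0
    for n in range(1, N):
        p[n + 1] = (1 + 1 / n) * p[n] - (n == 1)
        if n % 2 == 0:
            p[n + 1] -= (2 / n) * p[n // 2]
    return p

N = 4000
pa, pb = p_from_sum(N), p_from_diff(N)
err = max(abs(x - y) for x, y in zip(pa[1:], pb[1:]))
c = 1 / (2 * math.log(2) - 1)  # the conjectured limit of n p(n)
```

Both constructions agree to rounding error, and np(n) is already within a fraction of a percent of c by n = 4000, more slowly at odd n.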

A probabilistic argument can be given for the limit (1), as given in the answer by `T..’ to the math.stackexchange question. At each step, the increase in the logarithm of the position {X_n} of the Markov chain is approximately distributed as the logarithm of a uniform random variable on [1,2], taking values in the interval {[0,\log 2]}. This distribution has mean {\mu=2\log 2-1} and variance {\sigma^2=1-2(\log 2)^2}. By the Central Limit Theorem, {\log X_n} will be approximately Gaussian with mean {n\mu+O(\log n)} and variance {n\sigma^2+O(\log n)}. From this, the number of steps taken for X to escape an interval [1,n] will be approximately {\mu^{-1}\log n} with fluctuations of order {\sqrt{\log n}}. As the average time spent in the interval [1,n] is {p(1)+\cdots+p(n)}, this leads to the asymptotic expression

\displaystyle  p(1)+p(2)+\cdots+p(n) = c\log n+o(\log n) \ \ \ \ \ (4)


where {c=\mu^{-1}} is the constant appearing in (1). As it stands, (4) is not quite strong enough to prove the required limit (1). It is strong enough to prove convergence conditionally, however. If np(n) converges to some limit then it will follow that {(p(1)+\cdots+p(n))/\log n} tends to the same limit which, by (4), must be {c=1/(2\log 2-1)}. With a bit of work, it is possible to give an unconditional probabilistic proof of (1), which Byron posted on his home page.
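
The Cesàro-type statement (4) can be checked directly from the recurrence, with no simulation of the chain: if {p(1)+\cdots+p(n)} grows like {c\log n} with {p(n)\approx c/n}, then the block sum {p(n_0+1)+\cdots+p(2n_0)} should approach {c\log 2}. A short sketch (the block size is an arbitrary choice of mine):

```python
import math

# Build p(n) from the difference equation (3), then test that the
# block sum p(n0+1) + ... + p(2*n0) is close to c log 2, consistent
# with the growth p(1) + ... + p(n) ~ c log n in (4).
N = 60000
p = [0.0] * (N + 1)
p[1] = 1.0
for n in range(1, N):
    p[n + 1] = (1 + 1 / n) * p[n] - (n == 1)
    if n % 2 == 0:
        p[n + 1] -= (2 / n) * p[n // 2]

n0 = N // 2
block = sum(p[n0 + 1:2 * n0 + 1])
c = 1 / (2 * math.log(2) - 1)
```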

Another conditional proof of (1) involves the generating function of p(n)/n,

\displaystyle  Q(t)=\sum_{n=1}^\infty \frac{p(n)}{n} t^n. \ \ \ \ \ (5)


As the coefficients are bounded by 1/n, the power series converges for all complex t with {\vert t\vert < 1}. Differentiating and plugging in the recurrence relation (2) or (3) gives a differential equation for Q,

\displaystyle  Q^\prime(t)=1+\frac{Q(t)-Q(t^2)}{1-t}. \ \ \ \ \ (6)


This idea was mentioned by Qiaochu Yuan in his response to the original question, and expanded into a conditional proof of (1) by Byron in the original question. Differentiating again and multiplying by 1-t,

\displaystyle  (1-t)Q^{\prime\prime}(t)=2\left(Q^\prime(t)-tQ^\prime(t^2)\right)-1. \ \ \ \ \ (7)


Assuming that np(n) converges to a limit c, we can evaluate both sides of (7) to leading order as t goes to 1,

\displaystyle  \setlength\arraycolsep{2pt} \begin{array}{rl} &\displaystyle Q^{\prime\prime}(t)=\sum_{n=1}^\infty np(n+1)t^{n-1}\sim\sum_{n=0}^\infty c t^n=\frac{c}{1-t}\smallskip\\ & \begin{array}{rl} \displaystyle Q^\prime(t)-tQ^\prime(t^2)&\displaystyle=\sum_{n=1}^\infty p(n)(t^{n-1}-t^{2n-1})\sim c\sum_{n=1}^\infty\frac{t^n-t^{2n}}{n}\smallskip\\ &\displaystyle=c\log(1+t)\rightarrow c\log 2. \end{array} \end{array}

Plugging these limits back into (7) gives {c=1/(2\log 2-1)} as required. As I show below, to get an unconditional proof, we should replace the generating function by the Dirichlet generating function.

A Numerical Investigation

With the help of a computer, it is easy to simulate p(n) using the recurrence relation (3).


Figure 1: Plots of np(n) at even and odd times

Figure 1 shows two plots of np(n). In the first one, the points are coloured blue for n even and red for n odd (similar to David Speyer’s plot in the math.stackexchange thread linked above). It can be seen that convergence is much faster at even times than at odd times. To see how the series behaves at even times, the second plot only includes those points where n is even, with n a multiple of 4 coloured blue, and n even but not a multiple of 4 coloured red. The simulation suggests that,

  • for n odd, np(n) converges to c from below at rate O(1/n).
  • for n a multiple of 4, np(n) converges to c from above at rate {O(1/n^2)}.
  • for n a multiple of 2, but not 4, np(n) converges to c from below at rate {O(1/n^2)}.
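
These three observations can be tested directly. The sketch below computes p via (3) and inspects the sign and size of {np(n)-c} at consecutive points near n = 4096; the tolerances are ad hoc.

```python
import math

# n p(n) - c at odd points, a multiple of 4, and a point 2 mod 4.
N = 4100
p = [0.0] * (N + 1)
p[1] = 1.0
for n in range(1, N):
    p[n + 1] = (1 + 1 / n) * p[n] - (n == 1)
    if n % 2 == 0:
        p[n + 1] -= (2 / n) * p[n // 2]

c = 1 / (2 * math.log(2) - 1)
dev = {n: n * p[n] - c for n in (4093, 4094, 4095, 4096)}
# odd n: dev < 0 and of order 1/n; n = 0 mod 4: dev > 0 of order 1/n^2;
# n = 2 mod 4: dev < 0 of order 1/n^2.
```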

This suggests the following asymptotic form for p(n)

\displaystyle  p(n)=\frac{c}{n}\left(1+\frac{a_1(n)}{n}+\frac{a_2(n)}{n^2}\right)+o(n^{-3}),

where {a_1(n)} has period 2, with {a_1(0)=0} and {a_1(1)<0}, and {a_2(n)} has period 4 with {a_2(0)>0} and {a_2(2)<0}. We can go further, and tentatively suggest the following expansion to arbitrary order {\alpha},

\displaystyle  p(n)=c\sum_{k=0}^\alpha \frac{a_k(n)}{n^{k+1}}+o(n^{-\alpha-1}) \ \ \ \ \ (8)


where {a_k(n)} is a periodic function of n, with period {2^k}. Here, {a_0} has been normalized to 1. In fact, as we will see below there are additional terms in the asymptotic expansion of p, so (8) does not quite hold. However, for now, we can try substituting (8) back into (3) to solve for the terms {a_k(n)}. Taking {\alpha\rightarrow\infty}, (8) can be considered as a formal power series in 1/n giving,

\displaystyle  \sum_k \frac{a_k(n+1)}{(n+1)^{k+1}}=\sum_k\frac{a_k(n)}{n^{k+1}}(1+1/n)-\sum_k2^{k+2}1_{\{n{\rm\ even}\}}\frac{a_k(n/2)}{n^{k+2}}.

Expanding the left hand side as a Taylor series in 1/n

\displaystyle  (n+1)^{-k-1}=\sum_{j=0}^\infty\binom{-k-1}{j}n^{-k-j-1}

and comparing coefficients of {n^{-k-1}} gives the following recurrence relation for {a_k},

\displaystyle  a_k(n+1)-a_k(n)=a_{k-1}(n)-\sum_{j=0}^{k-1}\binom{-j-1}{k-j}a_j(n+1)-1_{\{n{\rm\ even}\}}2^{k+1}a_{k-1}\left(\frac{n}{2}\right). \ \ \ \ \ (9)


This defines {a_k(n+1)-a_{k}(n)} in terms of {a_j} for {j<k}, and the final term on the right hand side will cause the period to double at each step. Knowing {a_k(n+1)-a_k(n)}, however, tells us nothing about the average value of {a_k(n)} as n varies, which I will denote by {\bar a_k}. To obtain this, take the average of (9) over n and, noting that the left hand side goes to zero,

\displaystyle  0=(1+k-2^k)\bar a_{k-1}-\sum_{j=0}^{k-2}\binom{-j-1}{k-j}\bar a_j

This defines {\bar a_k} inductively and, in fact, is solved by {\bar a_k=(-1)^k}. Together with (9), this allows us to inductively calculate {a_k(n)} in terms of {a_j(n)} for {j<k}. Doing this and writing {a_k} for the vector {(a_k(0),\ldots,a_k(2^k-1))},

\displaystyle  \setlength\arraycolsep{2pt} \begin{array}{rl} &\displaystyle a_0=(1),\smallskip\\ &\displaystyle a_1=(0,-2),\smallskip\\ &\displaystyle a_2=(7,-3,-9,13)/2,\smallskip\\ &\displaystyle a_3=(0,-50, -64, -18, 0, 78, 64, -18). \end{array} \ \ \ \ \ (10)
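
These vectors can be reproduced mechanically. The sketch below iterates the recurrence (9) in exact rational arithmetic, fixing the mean of each {a_k} at {(-1)^k} as derived above; the function names are mine.

```python
from fractions import Fraction

def binom(x, m):
    # generalized binomial coefficient binom(x, m) for integer m >= 0
    out = Fraction(1)
    for i in range(m):
        out *= Fraction(x - i, i + 1)
    return out

def coeffs(kmax):
    # a[k] is the vector (a_k(0), ..., a_k(2^k - 1)), starting from a_0 = (1)
    a = {0: [Fraction(1)]}
    for k in range(1, kmax + 1):
        P = 2 ** k
        # differences d(n) = a_k(n+1) - a_k(n) from the recurrence (9)
        d = []
        for n in range(P):
            s = a[k - 1][n % 2 ** (k - 1)]
            for j in range(k):
                s -= binom(-j - 1, k - j) * a[j][(n + 1) % 2 ** j]
            if n % 2 == 0:
                s -= 2 ** (k + 1) * a[k - 1][(n // 2) % 2 ** (k - 1)]
            d.append(s)
        # integrate the differences and shift so that the mean is (-1)^k
        vals = [Fraction(0)]
        for n in range(P - 1):
            vals.append(vals[-1] + d[n])
        shift = Fraction((-1) ** k) - sum(vals) / P
        a[k] = [v + shift for v in vals]
    return a

a = coeffs(3)
```

This reproduces {a_1=(0,-2)}, {a_2=(7,-3,-9,13)/2} and {a_3} exactly.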


This expansion can be checked by plotting p(n) while successively subtracting out the leading order terms. As can be seen in Figure 2, it converges beautifully to the levels {a_0,a_1,a_2}.


Figure 2: Asymptotics of p(n)
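
The same check can be done without plotting. Assuming the values in (10), the following sketch rescales p(n) after subtracting successive terms and compares with the levels {a_1} and {a_2}; the tolerances are ad hoc.

```python
import math

N = 20000
p = [0.0] * (N + 1)
p[1] = 1.0
for n in range(1, N):
    p[n + 1] = (1 + 1 / n) * p[n] - (n == 1)
    if n % 2 == 0:
        p[n + 1] -= (2 / n) * p[n // 2]

c = 1 / (2 * math.log(2) - 1)
a1 = [0.0, -2.0]             # period 2, from (10)
a2 = [3.5, -1.5, -4.5, 6.5]  # period 4, i.e. (7,-3,-9,13)/2

# n(n p(n)/c - 1) should level off at a1(n mod 2), and
# n^2(n p(n)/c - 1 - a1(n)/n) at a2(n mod 4).
r1 = {n: (n * p[n] / c - 1) * n for n in range(N - 4, N)}
r2 = {n: (n * p[n] / c - 1 - a1[n % 2] / n) * n * n for n in range(N - 4, N)}
```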

However, after {a_2 n^{-3}} there is an oscillating term! This is something that I was not expecting when first plotting this graph. Fortunately, the form of this oscillation can be seen clearly from the plot. It appears to be very close to {An^{-u}\cos(v\log n+\theta)} for constants {u,v,A,\theta}. This is more conveniently expressed as the real part of a complex power of n, {\Re[an^{-r}]}, where {r=u+iv} and {a=Ae^{-i\theta}}. It is possible to solve for r by requiring that {\tilde p(n)=n^{-r}} satisfies the recurrence relation (3) to leading order. Replacing the {1_{\{n{\rm\ even}\}}} term by its average value of 1/2, the recurrence relation can be approximated by a differential equation

\displaystyle  p^\prime(n)=\frac{p(n)}{n}-\frac{p(n/2)}{n}.

Putting {p(n)=n^{-r}} into this gives the relation {r+1=2^r}. This only has two real solutions r=0,1. To find the complex solutions, equating real and imaginary parts gives a pair of simultaneous equations for {r=u+iv},

\displaystyle  \setlength\arraycolsep{2pt} \begin{array}{rl} &\displaystyle u=\log_2\left(\frac{v}{\sin(v\log2)}\right),\smallskip\\ &\displaystyle u=-1+\frac{v\cos(v\log 2)}{\sin(v\log 2)}. \end{array} \ \ \ \ \ (11)


Looking for solutions with v positive, the first of these equations says that {\sin(v\log 2)} is positive and, then, the second says that {\cos(v\log 2)} is positive. So, {v\log2} must lie in an interval {(2k\pi,2k\pi+\pi/2)} for some nonnegative integer k. Across such an interval, the difference of the two expressions in (11) increases from minus infinity (or, from a positive value if k=0) to {1+\log_2(2k\pi+\pi/2)>0}. So, (11) has precisely one solution in {2k\pi<v\log2<2k\pi+\pi/2} for each positive integer k. Together with the complex conjugates {r=u-iv} and the real solutions r=0,1, this lists all solutions to {r+1=2^r}. See, also, Figure 4 below. The complex solution with smallest real part lies in the range {2\pi<v\log 2<5\pi/2} and numerically solving (11) gives {r\approx3.54536\pm10.75398i}. This agrees with the bottom-right plot of Figure 2, showing that the oscillating term decays at rate approximately {n^{-3.55}}. To test this, we can try subtracting {\Re[an^{-r}]} from p(n) to see if this eliminates the oscillating term. This is done in Figure 3, and subtracting it out does indeed remove the oscillation (the coefficient a was calculated by a least-squares fit). In fact, the next asymptotic term, {a_3(n)/n^4}, is visible, where {a_3} agrees with the value calculated in (10). Subtracting this out too shows that there is a further oscillating term decaying approximately as {n^{-4.37}}. This is about as far as we can go with this particular simulation (I used 128 bit floating point arithmetic), as numerical inaccuracies are starting to appear in the plots.


Figure 3: Further asymptotics of p(n)
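
The complex exponents themselves can be found to high precision with Newton's method applied to {f(r)=2^r-r-1}, seeding the iteration inside the strip {2\pi<v\log 2<5\pi/2} identified above. A sketch:

```python
import cmath

def solve_root(r0, iters=60):
    # Newton's method for f(r) = 2^r - r - 1, with f'(r) = 2^r log 2 - 1.
    log2 = cmath.log(2)
    r = r0
    for _ in range(iters):
        f = cmath.exp(r * log2) - r - 1
        df = cmath.exp(r * log2) * log2 - 1
        r -= f / df
    return r

# seed near the first complex solution, 2*pi < v log 2 < 5*pi/2
root = solve_root(3.5 + 10.7j)
```

This converges to {r\approx3.54536+10.75398i}, matching the decay rate of the first oscillating term.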

The Asymptotic Expansion

The numerical investigation above suggests the following asymptotic expansion for p(n) up to order {n^{-\alpha}}, for any real {\alpha},

\displaystyle  p(n)=\sum_{\Re[r]+k<\alpha}a_{r,k}(n)n^{-r-k}+O(n^{-\alpha}). \ \ \ \ \ (12)


This sum is taken over complex r satisfying {r+1=2^r} and with real part at least one, and over nonnegative integers k. The term {a_{r,k}(n)} is a periodic function of n, with period {2^k}. The set of {r=u+iv} with real part at least one and satisfying {r+1=2^r} is plotted in Figure 4. These lie on the intersections of the curve {(u+1)^2+v^2=4^u} with the real axis and with the set of almost horizontal lines given by the second equation of (11). The points r+k for positive integers k are also plotted. So, each point in Figure 4 corresponds to a term in the asymptotic expansion.


Figure 4: Complex solutions to r + 1 = 2^r

The terms {a_{r,k}(n)} for {k>0} can be computed as multiples of {a_{r,0}} in a similar way as for equation (9) above, by substituting the asymptotic expression into (3) and equating powers of n. Doing this gives

\displaystyle  \setlength\arraycolsep{2pt} \begin{array}{rl} \displaystyle a_{r,k}(n+1)-a_{r,k}(n)=&\displaystyle a_{r,k-1}(n)-\sum_{j=0}^{k-1}\binom{-r-j}{k-j}a_{r,j}(n+1)\smallskip\\ &\displaystyle-1_{\{n{\rm\ even}\}}2^{r+k}a_{r,k-1}\left(\frac{n}{2}\right) \end{array} \ \ \ \ \ (13)


This allows us to calculate {a_{r,k}(n+1)-a_{r,k}(n)} inductively in k, and it can be seen that the period does indeed double at each step. It does not directly give us the average value {\bar a_{r,k}} of {a_{r,k}(n)} as n varies. This can be obtained by averaging (13) over n and noting that the left hand side goes to zero.

\displaystyle  (r+k+1-2^{r+k})\bar a_{r,k}=\sum_{j=0}^{k-1}\binom{-r-j}{k+1-j}\bar a_{r,j}.

It still remains to calculate the values of {a_{r,0}}, which will be done below. Although we have not yet proven that the asymptotic expansion (12) holds, we can define a series {p_\alpha} to satisfy this expansion up to order {n^{-\alpha}},

\displaystyle  p_\alpha(n) \equiv \sum_{\Re[r]+k<\alpha}a_{r,k}(n)n^{-r-k}. \ \ \ \ \ (14)


The fact that the coefficients have been chosen to satisfy (13) means that, if we substitute {p_\alpha} back into (3), then all powers of 1/n cancel up to {O(n^{-\alpha-1})},

\displaystyle  p_\alpha(n+1)=(1+1/n)p_\alpha(n)-1_{\{n{\rm\ even}\}}\frac{2}{n}p_\alpha(n/2)+O(n^{-\alpha-1}). \ \ \ \ \ (15)


The Dirichlet Generating Function

The Dirichlet generating function of the series p(n)/n is

\displaystyle  L(s)=\sum_{n=1}^\infty p(n)n^{-1-s},

which converges uniformly to an analytic function for the real part of s large. As we know that p is bounded by 1, it will in fact converge for all s with positive real part. It can also be seen that {L(s)-1\sim 2^{-1-s}} as the real part of s goes to infinity. I am looking at the Dirichlet series for p(n)/n here, rather than just p(n), because it will turn out that terms like {n^{-r}} in the asymptotic expansion will directly relate to poles in the meromorphic extension of L at -r.

Whereas the standard generating function Q above satisfied a differential equation, the Dirichlet generating function satisfies a functional equation. Multiplying (3) by {n^{-s}}, summing over n, and expanding {n^{-s}} in terms of 1/(n+1) on the left hand side

\displaystyle  n^{-s}=(n+1)^{-s}\left(1-(n+1)^{-1}\right)^{-s}=\sum_{k=0}^\infty\binom{-s}{k}(-1)^k(n+1)^{-s-k}

leads to the following functional equation for L,

\displaystyle  (s-1+2^{-s})L(s)=s+\sum_{k=1}^\infty\binom{-s}{k+1}(-1)^k\left(L(s+k)-1\right). \ \ \ \ \ (16)


This expresses L(s) in terms of L(s+k) for positive integers k. As L(s) is defined whenever s has positive real part, (16) extends L to all s with real part greater than -1. Inductively applying (16), then, extends L to a meromorphic function on the whole complex plane. The division by {s-1+2^{-s}} introduces simple poles at the points where this term vanishes, which are the points -r with {r+1=2^r}. Then, (16) will introduce further simple poles at -r-k for positive integers k. Note that there is no pole at s=0, as the right hand side of (16) also vanishes there. So, the poles of the meromorphic extension of L are at the points -r-k as plotted in Figure 4.
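
The functional equation (16) is easy to test numerically at a real point s in the region where the Dirichlet series converges, truncating both the series and the sum over k (the truncation sizes below are ad hoc):

```python
import math

N, K = 20000, 40
p = [0.0] * (N + 1)
p[1] = 1.0
for n in range(1, N):
    p[n + 1] = (1 + 1 / n) * p[n] - (n == 1)
    if n % 2 == 0:
        p[n + 1] -= (2 / n) * p[n // 2]

def L(s):
    # truncated Dirichlet series L(s) = sum p(n) n^(-1-s)
    return sum(p[n] * n ** (-1 - s) for n in range(1, N + 1))

def binom(x, m):
    # generalized binomial coefficient binom(x, m)
    out = 1.0
    for i in range(m):
        out *= (x - i) / (i + 1)
    return out

s = 2.5
lhs = (s - 1 + 2 ** (-s)) * L(s)
rhs = s + sum(binom(-s, k + 1) * (-1) ** k * (L(s + k) - 1)
              for k in range(1, K + 1))
```

The two sides agree to many decimal places; the terms {L(s+k)-1} decay like {2^{-1-s-k}}, so the sum over k converges quickly.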

By a standard result on Dirichlet series with nonnegative coefficients (Landau’s theorem), the fact that L has no poles in the half plane {\Re[s]>-1} means that the Dirichlet series will also converge absolutely on this region. That is, {\sum_np(n)n^{-1-s}<\infty} for all {s>-1}. It is interesting to note that this is implied by basic properties of the Dirichlet generating function but, as the asymptotic expansion will imply much more than this anyway, I don’t make use of this fact here.

Noting that {s-1+2^{-s}} has derivative {1-2^{-s}\log 2}, the residue of L at -r where {r+1=2^r} can be computed from (16),

\displaystyle  {\rm Res}(L,-r)=\frac{1}{1-2^{r}\log 2}\left(-r+\sum_{k=1}^\infty\binom{r}{k+1}(-1)^k(L(k-r)-1)\right). \ \ \ \ \ (17)


At r=1, all the binomial coefficients on the right hand side are zero, giving the residue

\displaystyle  {\rm Res}(L,-1)=\frac{1}{2\log2-1}. \ \ \ \ \ (18)


There does not seem to be a simple expression for the residues at the other poles, although (17) can be used to compute them to whatever accuracy is desired.
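
For instance, (16) can be applied recursively to evaluate the meromorphic extension at real points s < 1, and the residue at s = -1 can then be estimated as (s+1)L(s) for s close to -1. This sketch (with ad hoc truncations, handling real s only) recovers the value in (18) to within a few percent, the error being dominated by the distance from the pole:

```python
import math

# Sketch: evaluate the meromorphic extension of L at real s < 1 by
# applying the functional equation (16) recursively, then estimate the
# residue at s = -1 as (s+1)L(s) for s close to -1.
N, K = 100000, 30
p = [0.0] * (N + 1)
p[1] = 1.0
for n in range(1, N):
    p[n + 1] = (1 + 1 / n) * p[n] - (n == 1)
    if n % 2 == 0:
        p[n + 1] -= (2 / n) * p[n // 2]

def binom(x, m):
    # generalized binomial coefficient binom(x, m)
    out = 1.0
    for i in range(m):
        out *= (x - i) / (i + 1)
    return out

cache = {}
def L(s):
    if s not in cache:
        if s >= 1:  # direct summation of the Dirichlet series
            cache[s] = sum(p[n] * n ** (-1 - s) for n in range(1, N + 1))
        else:       # meromorphic extension via the functional equation (16)
            rhs = s + sum(binom(-s, k + 1) * (-1) ** k * (L(s + k) - 1)
                          for k in range(1, K + 1))
            cache[s] = rhs / (s - 1 + 2 ** (-s))
    return cache[s]

eps = 0.01
residue = eps * L(-1 + eps)  # approximates Res(L, -1) = 1/(2 log 2 - 1)
```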

We can also write out the Dirichlet generating function for the truncated asymptotic expansion {p_\alpha(n)/n} defined in (14),

\displaystyle  L_\alpha(s)=\sum_{n=1}^\infty p_\alpha(n)n^{-1-s}.

This also has a meromorphic extension to the entire complex plane, and the locations and residues of the poles can be determined by the following fact.

  • If {a\colon{\mathbb Z}\rightarrow{\mathbb C}} is periodic with average value {\bar a} then {f(s)=\sum_{n=1}^\infty a(n)n^{-s}} extends to a meromorphic function on the complex plane whose only pole is a simple one at s=1 with residue {\bar a}.

To see this, write a as a linear combination of the exponentials {n\mapsto e^{2\pi imn/q}}, where q is the period. The m=0 term contributes {\bar a\zeta(s)}, which has the simple pole at s=1 with residue {\bar a}, while the series with {m\not=0} have bounded partial sums and extend to entire functions. In particular, the Dirichlet generating function {L_\alpha} extends to a meromorphic function on the whole complex plane with simple poles at the points -r-k with the real part of r at least 1, {r+1=2^r}, {\Re[r]+k<\alpha} and residue {{\rm Res}(L_\alpha,-r-k)=\bar a_{r,k}}. The values of {a_{r,0}} are chosen to match the residues of L at the points -r,

\displaystyle  a_{r,0}={\rm Res}(L,-r).

In particular, from (18), {a_{1,0}=1/(2\log 2-1)}. Choosing the coefficients in this way means that the poles of {L-L_\alpha} cancel out at the points –r with {r+1=2^r} and {\Re[r]<\alpha}.

It is also possible to calculate an approximate functional equation for {L_\alpha}. Multiplying (15) by {n^{-s}} and summing over n, we note that the {O(n^{-\alpha-s-1})} term converges to an analytic function bounded over {\Re[s]\ge\beta} for any {\beta>-\alpha}. As above, we expand {n^{-s}} on the left hand side in terms of 1/(n+1) but, now, use a finite Taylor expansion,

\displaystyle  n^{-s}=\sum_{k=0}^{m}\binom{-s}{k}(-1)^k(n+1)^{-s-k}+O((1+|s|^{m+1})n^{-s-m-1}).

Summing the remainder term over n converges to an analytic function over {\Re[s]\ge\beta}, any {\beta>-m}, which is bounded by a multiple of {1+|s|^{m+1}}. Choosing {m=\lceil\alpha\rceil} gives the functional equation

\displaystyle  (s-1+2^{-s})L_\alpha(s)=s+\sum_{k=1}^{\lceil\alpha\rceil-1}\binom{-s}{k+1}(-1)^kL_\alpha(s+k)+R(s). \ \ \ \ \ (19)


The remainder term R(s) is an analytic function over {\Re[s]>-\alpha} and {R(s)/(1+|s|^m)} is bounded over {\Re[s]\ge\beta}, any {\beta>-\alpha}. Applying (19) iteratively, then, shows that {L_\alpha} is bounded by a polynomial in |s| over {\Re[s]\ge\beta} and for the imaginary part of s large enough that {s-1+2^{-s}} doesn’t vanish.

The above argument for {L_\alpha} will also apply to L, since p also satisfies the approximate difference equation (15). This shows that L(s) satisfies a polynomial bound in s on each half-plane (for the imaginary part of s large enough). Also, applying (19) to the difference {L-L_\alpha} and using the fact that {a_{r,0}} have been chosen so that the poles of {L-L_\alpha} cancel at those points s with {s-1+2^{-s}=0}, we see that {L(s)-L_\alpha(s)} is an analytic function on {\Re[s]>-\alpha}.

Inverting the Dirichlet Series and Proof of the Expansion

The proof of the asymptotic expansion (12) will rely on Perron’s formula to invert the Dirichlet generating function

\displaystyle  {\sum_{n\le x}}^{\star}p(n)=\frac{1}{2\pi i}\int_{\beta_0-i\infty}^{\beta_0+i\infty}\frac{x^{s+1}}{s+1}L(s)\,ds,

which holds for any {\beta_0>0}. I find it convenient to differentiate this expression, in the sense of distributions. Doing this, the left hand side becomes a sum of Dirac delta functions centered about each positive integer n with weights p(n), {\phi(x)\equiv\sum_{n\ge1}p(n)\delta(x-n)},

\displaystyle  \phi(x)=\frac{1}{2\pi i}\int_{\beta_0-i\infty}^{\beta_0+i\infty}x^sL(s)\,ds. \ \ \ \ \ (20)


As stated above, this expression is intended in the sense of distributions. That is, if {\theta\colon(0,\infty)\rightarrow{\mathbb R}} is infinitely differentiable with compact support, (20) states that

\displaystyle  \int\phi(x)\theta(x)\,dx\equiv\sum_{n=1}^\infty p(n)\theta(n)=\frac{1}{2\pi i}\int_{\beta_0-i\infty}^{\beta_0+i\infty}\int x^s\theta(x)\,dx L(s)\,ds. \ \ \ \ \ (21)


The term {\int x^s\theta(x)\,dx} inside the integral vanishes faster than polynomially in 1/s in any vertical strip {\beta_0\le\Re[s]\le\beta_1}, which can be seen using integration by parts

\displaystyle  \setlength\arraycolsep{2pt} \begin{array}{rl} \displaystyle\int x^s\theta(x)\,dx &\displaystyle= (-1)^n\int \frac{x^{s+n}}{(s+1)(s+2)\cdots(s+n)}\theta^{(n)}(x)\,dx\smallskip\\ &\displaystyle=O(s^{-n}). \end{array}

As the Dirichlet generating function L is polynomially bounded on each half-plane for large s, it follows that the integrand on the right hand side of (21) goes to zero faster than polynomially in 1/s, and the integral in (20) is well defined in the sense of distributions on any line {\Re[s]=\beta_0} not passing through a pole of L.

The idea is to move the line of integration in (20) to the left, using Cauchy’s residue theorem to pick up terms from the poles of L,

\displaystyle  \phi(x)=\sum_{\Re[r]+k+\beta<0}{\rm Res}(L,-r-k)x^{-r-k}+\frac{1}{2\pi i}\int_{\beta-i\infty}^{\beta+i\infty}x^sL(s)\,ds \ \ \ \ \ (22)


for any real {\beta} such that {\Re[s]=\beta} does not contain a pole of L. For a more precise argument, choose {\beta<\beta_0} and positive constant u, and use Cauchy’s residue theorem to replace the contour integral in (20) by an integral along the lines joining {\beta_0-i\infty}, {\beta_0-iu}, {\beta -iu}, {\beta+iu}, {\beta_0+iu}, {\beta_0+i\infty} plus the sum of the residues of {x^sL(s)} in the rectangle {[\beta,\beta_0]\times[-u,u]}. We know that, after integrating against a smooth function {\theta\colon(0,\infty)\rightarrow{\mathbb R}} of compact support, {x^sL(s)} goes to zero faster than polynomially in 1/s. So, in the limit as u goes to infinity, the integral along each of the line segments other than the one joining {\beta-iu,\beta+iu} goes to zero. Then, we obtain (22).

Equation (22) holds in the sense of distributions. We can convolve both sides by an infinitely differentiable function {\theta\colon(0,\infty)\rightarrow{\mathbb R}} with compact support. That is, evaluate (22) at a point xy and integrate against {\theta(y)\,dy}.

\displaystyle  \setlength\arraycolsep{2pt} \begin{array}{rl} \displaystyle\int\phi(xy)\theta(y)\,dy=&\displaystyle\sum_{n=1}^\infty p(n)\theta(n/x)/x\smallskip\\ =&\displaystyle\sum_{\Re[r]+k+\beta<0}{\rm Res}(L,-r-k)x^{-r-k}\int y^{-r-k}\theta(y)\,dy\smallskip\\ &\displaystyle+\frac{1}{2\pi i}\int_{\beta-i\infty}^{\beta+i\infty} x^s\int y^s\theta(y)\,dy L(s)\,ds \end{array} \ \ \ \ \ (23)

The left hand side of this expression is a weighted sum of p(n) for n in a region near x. The integrand of the contour integral on the right hand side consists of a term going to zero faster than polynomially in 1/s multiplied by a term of size {\vert x^s\vert=x^\beta}. Therefore,

\displaystyle  \int\phi(xy)\theta(y)\,dy = \sum_{\Re[r]+k+\beta<0}{\rm Res}(L,-r-k)x^{-r-k}\int y^{-r-k}\theta(y)\,dy + O(x^\beta).


This expression is very close to the asymptotic expansion (12). It differs by replacing p(n) on the left hand side by a `smoothed’ version of p. Also, although the right hand side does have a term corresponding to each of the terms {a_{r,k}(n)n^{-r-k}} on the right hand side, it does not have any periodic dependence on n. This is because convolving with the smooth function {\theta} effectively smoothes out this periodic behaviour, replacing {a_{r,k}(n)} by {\bar a_{r,k}}.

There does not seem to be any way that periodic behaviour of the coefficients can come out of a direct application of Cauchy’s residue theorem in this way. So, instead, we can look at the difference {q_\alpha(n)=p(n)-p_\alpha(n)}, where {p_\alpha} is the truncated asymptotic expression (14), and try to show that this vanishes at rate {O(n^{-\alpha})}. Let us also set {\phi_\alpha(x)=\sum_{n\ge1}q_\alpha(n)\delta(x-n)}. The same argument used to derive (23) can be applied to {q_\alpha}. However, the Dirichlet generating function {L(s)-L_\alpha(s)} of {q_\alpha(n)/n} has no poles with real part greater than {-\alpha}, so the sum over the residues in (23) drops out,

\displaystyle  \int\phi_\alpha(xy)\theta(y)\,dy = O(x^\beta). \ \ \ \ \ (24)


This holds for any {\beta>-\alpha}. We can use this to show that {q_\alpha(n)=O(n^\beta)}. As {q_\alpha} satisfies the approximate difference equation (15), we can multiply through by {(n+1)^{-\beta}} to get

\displaystyle  q_\alpha(n+1)(n+1)^{-\beta}=q_\alpha(n)n^{-\beta} + O(h(n)/n). \ \ \ \ \ (25)


Here, I am using {h(n)} to denote the maximum value of {q_\alpha(m)m^{-\beta}} over {m\le n}. Now, if {q_\alpha(n)} were not bounded above by a multiple of {n^\beta} then, for any {K>0}, {q_\alpha(n)n^{-\beta}} will eventually exceed K. Letting N be the first time at which this happens, h(N) will equal {q_\alpha(N)N^{-\beta}}. Equation (25) shows that {q_\alpha(n)n^{-\beta}} can only take steps of size {O(K/N)}, so {q_\alpha(N+n)(N+n)^{-\beta}} cannot become small (less than K/2, say) in fewer than {\epsilon N} steps, for some fixed {\epsilon>0}. It follows that if the smooth function {\theta\colon(0,\infty)\rightarrow{\mathbb R}} is chosen to be nonnegative with support in {[1,1+\epsilon]} then,

\displaystyle  \setlength\arraycolsep{2pt} \begin{array}{rl} \displaystyle\int\phi_\alpha(Ny)\theta(y)\,dy&\displaystyle=\sum_{n=1}^\infty q_\alpha(n)\theta(n/N)/N\ge\frac{K}{2}\sum_n n^\beta\theta(n/N)/N\smallskip\\ &\displaystyle=\frac{KN^\beta}{2}\left(\int x^\beta\theta(x)\,dx+O(1/N)\right). \end{array}

However, K can be chosen arbitrarily large, so this contradicts the bound (24). It follows that {q_\alpha(n)} is bounded above by a multiple of {n^\beta} and, by the same argument applied to {-q_\alpha}, it is also bounded below by a multiple of {-n^\beta}. We have shown that

\displaystyle  p(n) = p_\alpha(n)+q_\alpha(n)=p_\alpha(n)+O(n^\beta)

for any {\beta>-\alpha}. This is almost the asymptotic approximation (12), although we do not quite have as strong a bound on the remainder term as required. This is easily fixed, by replacing {\alpha} with {\alpha+1}, so that we can take {\beta=-\alpha} and, noting that {p_{\alpha+1}(n)-p_\alpha(n)} is bounded by a multiple of {n^{-\alpha}},

\displaystyle  p(n)=p_{\alpha+1}(n)+O(n^{-\alpha})=p_\alpha(n)+O(n^{-\alpha}).

This is exactly the asymptotic expansion proposed in (12).


It is useful to note why the Dirichlet generating function works in this proof, whereas the standard power series generating function defined in (5) does not. There is a similar inversion formula to (20) for Q, using Cauchy’s integral formula.

\displaystyle  p(n) = \frac{1}{(n-1)!}Q^{(n)}(0)=\frac{n}{2\pi i}\oint_C\frac{Q(t)}{t^{n+1}}\,dt,

where C is any closed curve in the unit disc winding once anticlockwise about the origin. If Q extended to a meromorphic function on some neighbourhood of the unit disc, then we could move the curve C outwards so that it lies on a circle of radius {r>1} giving an asymptotic expansion up to an exponentially decaying remainder term

\displaystyle  p(n)=-\sum_{\vert\rho\vert<r}\frac{n}{\rho^{n+1}}{\rm Res}(Q,\rho) + O(r^{-n}).

However, we know that the series p(n) is not like this. It has asymptotics decaying at rate {n^{-r}}, which is much slower than exponential decay. It follows that the generating function Q does not have a meromorphic extension to any neighbourhood of the closed unit disc, and the kind of argument used above cannot work with Q. We should, therefore, use the correct kind of generating function whose inversion formula can lead to asymptotic terms appropriate to the series in question.

From a slightly different point of view, the generating function Q did not satisfy a functional equation which could directly give us a meromorphic extension. Instead, it satisfied a differential equation (6). The reason for this is that the 1/n terms on the right hand side of (3) lead to a differential term in the expression for the generating function. On the other hand, multiplying a series by 1/n only causes its Dirichlet generating function to be translated by 1, so we obtain a functional equation (16) capable of generating a meromorphic extension to the whole complex plane.

It was noted above that, because of the p(n/2) term in the recurrence relation (3), it should be no surprise that the asymptotic expansion includes infinitely many independent terms. We can also see how this infinite set of terms arises, by looking at the functional equation (16) for L. The p(n/2) term leads to the {2^{-s}} in the factor on the left hand side of (16). As {s-1+2^{-s}} has infinitely many zeros, the Dirichlet generating function picks up an infinite sequence of independent poles. If, on the other hand, the recurrence relation only referred to p(n-j) for a finite set of fixed numbers j then the term multiplying L on the left of (16) would have been a polynomial in s, with finitely many zeros. Then, we would have only obtained a finite sequence of independent terms in the asymptotic expansion.

The proof given above is largely standard, and very similar to other well-known proofs of asymptotic expansions. It involves the following steps:

  • Define a (Dirichlet) generating function.
  • Establish a functional equation.
  • Use the functional equation to prove the existence of a meromorphic extension.
  • Use Perron’s formula to express the original series as a contour integral.
  • Move the line of integration to the left, using Cauchy’s residue theorem to pick up asymptotic terms from the poles of the generating function.

There are, not surprisingly, further technicalities. It is necessary to bound the growth of the generating function in order to control the contour integrals. Above, I made use of the fact that L(s) is polynomially bounded on each half-plane.
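The last two steps can be illustrated on the simplest Dirichlet series, {\zeta(s)=\sum_n n^{-s}}, rather than on the function L from this post, whose values are not available in closed form. A smoothed version of Perron's formula, with kernel {1/(s(s+1)(s+2))} so that the integrand decays like {t^{-3}} along vertical lines, states that {\frac{1}{2\pi i}\int_{(c)}\zeta(s)\frac{x^s\,ds}{s(s+1)(s+2)}=\frac12\sum_{n\le x}(1-n/x)^2} for c greater than 1. The sketch below checks this numerically, evaluating ζ by a truncated Euler–Maclaurin sum:

```python
# Numerical check of a smoothed Perron formula for zeta(s) on Re(s) = 2:
#   (1/(2*pi*i)) Int zeta(s) x^s / (s(s+1)(s+2)) ds
#       = (1/2) sum_{n <= x} (1 - n/x)^2.
# The kernel makes the integrand O(t^{-3}), so truncating at |t| = T is cheap.
import cmath

def zeta(s, N=50):
    # truncated Euler-Maclaurin approximation, adequate on the line Re(s) = 2
    tail = N ** (1 - s) / (s - 1) + 0.5 * N ** (-s) + (s / 12) * N ** (-s - 1)
    return sum(n ** (-s) for n in range(1, N)) + tail

def smoothed_perron(x, c=2.0, T=200.0, h=0.02):
    steps = int(round(2 * T / h))
    total = 0j
    for j in range(steps + 1):
        s = complex(c, -T + j * h)
        total += zeta(s) * x ** s / (s * (s + 1) * (s + 2))
    return (total * h / (2 * cmath.pi)).real

x = 10.5
lhs = smoothed_perron(x)
rhs = 0.5 * sum((1 - n / x) ** 2 for n in range(1, int(x) + 1))
print(lhs, rhs)   # the two sides agree to roughly two decimal places
```

Moving the line of integration further left would pick up the residues at the poles of the integrand, which is exactly how the asymptotic terms arise in the proof.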

However, there was one further complicating issue in the proof above, requiring a bit of back-and-forth between the Dirichlet series and the approximate recurrence relation (15). This is because, as was noted, Cauchy's residue theorem cannot directly give the correct periodic coefficients in the expansion. Instead, we obtained a formula for a smoothed version of the series, which averaged out the periodic terms. Applying this to the remainder term {p-p_\alpha} in the asymptotic expansion, it was still necessary to go back to the recurrence relation to show that the bound on the smoothed series also implied a similar bound on the original series. It might be possible to avoid this final step by using a different kind of generating function. One possibility is a `double’ generating function involving both a Fourier transform and a Dirichlet series, such as,

\displaystyle  L(\omega,s)=\sum_{n=1}^\infty p(n)\omega^n n^{-1-s}

for complex roots of unity {\omega} and complex s with positive real part. Alternatively, a generating function involving Dirichlet characters {\chi} could be another way forward,

\displaystyle  L(\chi,s)=\sum_{n=1}^\infty p(n)\chi(n)n^{-1-s}.


  1. Michael Drmota and Wojciech Szpankowski. A General Discrete Divide and Conquer Recurrence and Its Applications. Preprint, 2010. Available from Michael Drmota’s homepage.
  2. Philippe Flajolet and Mordecai Golin. Exact asymptotics of divide-and-conquer recurrences. Lecture Notes in Computer Science, 1993, Volume 700, 137-149.
  3. Philippe Flajolet and Robert Sedgewick. Analytic Combinatorics. Cambridge University Press, 2009.
  4. Hsien-Kuei Hwang. Asymptotic expansions of the mergesort recurrences. Acta Informatica, 1998, Volume 35, Number 11, 911-919.



  1. George,

    Let me be the first to thank you for the incredible amount of work that you put into this problem. I have been thinking about this problem for a long time, and am very grateful to see the “correct” approach to such recurrences. I look forward to reading your explanation in detail and delving into the references.

    Comment by Byron Schmuland — 6 October 10 @ 8:56 PM | Reply

    • No problem. It was a very interesting question, and was fun to work out. This kind of problem, and the solution, are new to me, and I don’t know if it has been studied before. The recurrences studied in the references have some similarity to this one, and Dirichlet generating functions are used. The asymptotics are rather different to the recurrence studied here though, and don’t have the periodic coefficients. If I come across any better references, I’ll add them to the list.

      Comment by George Lowther — 14 October 10 @ 6:00 PM | Reply
