Friday, March 03, 2023

an average number of 1s and 2s

The terms of the Virahanka–Fibonacci sequence \((V(n))_{n\ge0}\) (OEIS A000045, shifted to start with \(V(0)=1\) instead of 0) count the number of ways each natural number \(n\) can be expressed as a sum of 1s and 2s, in which the order in the sum matters. For example, \(V(5) = 8\) because \[ \begin{array}{rl} 5 &= 1 + 1 + 1 + 1 + 1 \\ &= 1 + 1 + 1 + 2 = 1 + 1 + 2 + 1 = 1 + 2 + 1 + 1 = 2 + 1 + 1 + 1 \\ &= 1 + 2 + 2 = 2 + 1 + 2 = 2 + 2 + 1 \end{array} \]

Here is one way to find an exact expression for \(V(n)\). The coefficient of \(x^n\) in the polynomial \((x+x^2)^k\) expresses the number of ways to write \(n\) as a sum of exactly \(k\) 1s and 2s. (Try it!) We can add these polynomials together as a geometric series to obtain the generating function \[ f(x) = \sum_{n=0}^\infty V(n)\, x^n = \sum_{k=0}^\infty \left(x+x^2\right)^k = \frac{1}{1 - x - x^2}. \] The denominator \(1 - x - x^2\) can be factored as \((1 - \varphi x)(1 + \varphi^{-1} x)\), where \(\varphi - \varphi^{-1} = 1\). The positive solution to this latter equation is the golden ratio \[ \varphi = \frac{1 + \sqrt5}{2}. \] Using partial fractions, we can write \[ \begin{array}{rl} f(x) &= \dfrac{\varphi}{\varphi+\varphi^{-1}}\cdot\dfrac{1}{1 - \varphi x} + \dfrac{\varphi^{-1}}{\varphi+\varphi^{-1}}\cdot\dfrac{1}{1 + \varphi^{-1} x} \\ &= \displaystyle \frac{\varphi}{\varphi+\varphi^{-1}}\sum_{n=0}^\infty (\varphi x)^n + \frac{\varphi^{-1}}{\varphi+\varphi^{-1}}\sum_{n=0}^\infty (-\varphi^{-1}x)^n \\ &= \displaystyle\sum_{n=0}^\infty \frac{\varphi^{n+1} - (-\varphi)^{-(n+1)}}{\varphi+\varphi^{-1}} x^n \\ \end{array} \] and by equating coefficients we get an explicit formula for \(V(n)\) in terms of powers of \(\varphi\): \[ V(n) = \frac{\varphi^{n+1} - (-\varphi)^{-(n+1)}}{\varphi+\varphi^{-1}}. \] This is known as Binet’s formula. (The denominator in this expression is simply \(\sqrt5\). But writing it as \(\varphi-(-\varphi^{-1})\) instead better shows how it relates to the numerator.)

Now, what if we want to know not only how many ways \(n\) can be expressed as a sum of 1s and 2s, but how many 1s and 2s are needed, on average, to create a sum that equals \(n\)? For example, among the eight ways to write 5 as a sum of 1s and 2s, there are a total of twenty 1s and ten 2s, with thirty terms in all. So, on average, a sum that equals 5 has 2.5 1s and 1.25 2s, for an average of 3.75 terms.

We can adapt the method of generating functions by introducing new variables that allow us to keep track of how many times 1 and 2 appear in a sum. We’ll use \(t\) to represent a 1 in a sum and \(u\) to represent a 2. Then the coefficient of \(x^n\) in \((tx+ux^2)^k\) will be a polynomial in \(t\) and \(u\) such that the coefficient of \(t^p u^q\) counts the number of ways to express \(n\) as a sum of \(p\) 1s and \(q\) 2s, with \(p+q = k\). As before, we can add these polynomials together in a series to get the multivariate generating function \[ F(x,t,u) = \sum_{k=0}^\infty \left(tx+ux^2\right)^k = \frac{1}{1 - tx - ux^2}. \] Note that \(F(x,1,1) = f(x)\). When \(F(x,t,u)\) is expanded as a power series, the coefficient of \(x^n\) is a polynomial \(P_n(t,u)\) such that the coefficient of \(t^p u^q\) counts the total number of ways to express \(n\) as a sum of \(p\) 1s and \(q\) 2s.

We can determine the coefficients of \(P_n(t,u)\) explicitly using binomial coefficients. If a sum has \(p+q\) terms of which \(p\) are 1s and \(q\) are 2s, then there are \(\binom{p+q}{p}=\binom{p+q}{q}\) ways to arrange the terms (among the \(p+q\) terms, select the positions for the 1s or for the 2s, respectively). If the sum is equal to \(n\), then \(p+2q=n\), and so \[ \begin{array}{rl} P_n(t,u) &= \displaystyle\sum_{p+2q=n} \binom{p+q}{p} t^p u^q = \displaystyle\sum_{p+2q=n} \binom{p+q}{q} t^p u^q \\ &= \displaystyle\sum_{q=0}^{\lfloor n/2 \rfloor} \binom{n-q}{q} t^{n-2q}u^q. \end{array} \] Because \(P_n(1,1)=V(n)\), we thereby also obtain an expression for \(V(n)\) as a sum of binomial coefficients: \[ V(n) = \sum_{q=0}^{\lfloor n/2 \rfloor} \binom{n-q}{q}. \] Here the parameter \(q\) counts the number of terms equal to 2. The total number of 2s in all compositions of \(n\) into 1s and 2s is thus \[ \sum_{q=0}^{\lfloor n/2 \rfloor} q\binom{n-q}{q} \] This sequence begins 0, 0, 1, 2, 5, 10, 20, 38, 71, … (OEIS A001629). If we divide these values by the corresponding terms of the sequence \(V(n)\), we obtain the average number of 2s in all compositions of \(n\) into 1s and 2s for particular values of \(n\):

in a sum of 1s and 2s equal to  1    2     3    4    5      6     7     8 
the average number of 2s is  0  1/2 2/3  1  5/4 20/13 38/21 71/34

But how can we study how this average behaves asymptotically?

Let’s write \(T_1(n)\) for the total number of 1s in all sums of 1s and 2s equal to \(n\), \(T_2(n)\) for the total number of 2s, and \(T(n)=T_1(n)+T_2(n)\) for the total number of all terms. We already have an expression for \(T_2(n)\) above; let’s consider its structure. It takes each coefficient of \(t^p u^q\) in \(P_n(t,u)\) and multiplies that coefficient by \(q\), then adds together the results. The same quantity can be obtained by differentiating \(P_n(t,u)\) with respect to \(u\) and evaluating the result at \((t,u)=(1,1)\). That is, \[ T_2(n) = \sum_{q=0}^{\lfloor n/2 \rfloor} q\binom{n-q}{q} = \frac{\partial}{\partial u}P_n(t,u)\bigg\vert_{(t,u)=(1,1)}. \] Similarly, \[ T_1(n) = \frac{\partial}{\partial t}P_n(t,u)\bigg\vert_{(t,u)=(1,1)} \] and \[ T(n) = \frac{d}{dt}P_n(t,t)\bigg\vert_{t=1} = \frac{\partial}{\partial t}P_n(t,u)\bigg\vert_{(t,u)=(1,1)} + \frac{\partial}{\partial u}P_n(t,u)\bigg\vert_{(t,u)=(1,1)}. \] To carry out this computation for all degrees simultaneously, we can use the partial derivatives of \(F(x,t,u)\) with respect to \(t\) and \(u\): \[ \frac{\partial}{\partial t} F(x,t,u) = \frac{\partial}{\partial t} \frac{1}{1 - tx - ux^2} = \frac{x}{(1 - tx - ux^2)^2} \] \[ \frac{\partial}{\partial u} F(x,t,u) = \frac{\partial}{\partial u} \frac{1}{1 - tx - ux^2} = \frac{x^2}{(1 - tx - ux^2)^2} \] Evaluating these at \((t,u)=(1,1)\) gives us the generating functions for \(T_1(n)\) and \(T_2(n)\): \[ \sum_{n=0}^\infty T_1(n)\,x^n = \frac{x}{(1 - x - x^2)^2} \] \[ \sum_{n=0}^\infty T_2(n)\,x^n = \frac{x^2}{(1 - x - x^2)^2} \] Note that these functions are respectively \(x(f(x))^2\) and \(x^2(f(x))^2\). So it is enough to determine the coefficients of \((f(x))^2\), and then shift them as appropriate: \[ (f(x))^2 = \displaystyle\left(\sum_{n=0}^\infty V(n)\, x^n\right)^2 = \displaystyle\sum_{n=0}^\infty \left(\sum_{k=0}^n V(k)V(n-k)\right) x^n \] From the formula for \(V(n)\) in terms of \(\varphi\), the coefficient of \(x^n\) becomes \[ \begin{array}{rl} &\displaystyle\sum_{k=0}^n \left(\frac{\varphi^{k+1}-(-\varphi)^{-(k+1)}}{\varphi+\varphi^{-1}}\right) \left(\frac{\varphi^{n-k+1}-(-\varphi)^{-(n-k+1)}}{\varphi+\varphi^{-1}}\right) \\ =&\displaystyle\frac{1}{(\varphi+\varphi^{-1})^2}\sum_{k=0}^n\big(\varphi^{n+2} + \varphi^k(-\varphi)^{k-n} + \varphi^{n-k}(-\varphi)^{-k} + (-\varphi)^{-(n+2)}\big) \\ =&\displaystyle\frac{1}{5}\left(\sum_{k=0}^n\big(\varphi^{n+2} + (-\varphi)^{-(n+2)}\big) + \frac{1}{(-\varphi)^n}\sum_{k=0}^n(-\varphi^2)^k + \varphi^n\sum_{k=0}^n (-\varphi^{-2})^k\right) \\ =&\displaystyle\frac{1}{5}\left((n+1)\big(\varphi^{n+2} + (-\varphi)^{-(n+2)}\big) + 2\cdot\frac{\varphi^{n+1}-(-\varphi)^{-(n+1)}}{\varphi+\varphi^{-1}}\right) \\ =&\displaystyle\frac{1}{5}\big((n+1)V(n+2) + (n+3)V(n)\big) \end{array} \] where we have used the formula for a sum of a partial geometric series, along with the fact that \[ \big(\varphi^{n+2}+(-\varphi)^{-(n+2)}\big)\frac{\varphi+\varphi^{-1}}{\varphi+\varphi^{-1}} = \frac{\varphi^{n+3}-(-\varphi)^{-(n+3)}}{\varphi+\varphi^{-1}}+\frac{\varphi^{n+1}-(-\varphi)^{-(n+1)}}{\varphi+\varphi^{-1}} \] (the reader is encouraged to fill in other details of the calculations). Therefore \[ \frac{T_1(n)}{V(n)} = \frac{1}{5}\cdot\frac{nV(n+1)+(n+2)V(n-1)}{V(n)} \] \[ \frac{T_2(n)}{V(n)} = \frac{1}{5}\cdot\frac{(n-1)V(n)+(n+1)V(n-2)}{V(n)} \] and thus, knowing that \(\displaystyle\lim_{n\to\infty} V(n+1)/V(n) = \varphi\), we have \[ \lim_{n\to\infty}\frac1n\cdot\frac{T_1(n)}{V(n)} = \frac{1}{5}\left(\varphi+\frac{1}{\varphi}\right) = \frac{1}{\sqrt5} \] \[ \lim_{n\to\infty}\frac1n\cdot\frac{T_2(n)}{V(n)} = \frac{1}{5}\left(1+\frac{1}{\varphi^2}\right) = \frac{2}{5+\sqrt5} \] \[ \lim_{n\to\infty}\frac1n\cdot\frac{T(n)}{V(n)} = \frac{1}{5}\big(\varphi+2\big) = \frac{5+\sqrt5}{10} \] That is, if \(n\) is large, on average a sum of 1s and 2s equal to \(n\) will have about \(n\cdot\dfrac{5+\sqrt5}{10}\approx 0.7236n\) terms, and the ratio of 1s to 2s will on average be about \(\varphi\), or about 1.618. It took some work to get here, but once we have the answer in hand, it isn’t too surprising: the prevalence of \(\varphi\)’s appearances on this topic makes it entirely likely that the 1s and 2s would appear in that ratio, and once we know that \(p+2q=n\) and \(p\approx\varphi q\), we can solve to find \(p\approx n/\sqrt5\) and \(q\approx n/(\varphi\sqrt5)\).


P.S. The polynomials \(P_n(t) = P_n(t,1)\) that appear as coefficients of \(x^n\) in the expansion of \(1/(1-tx-x^2)\) are sometimes called Fibonacci polynomials. They satisfy a recurrence relation similar to that of the Fibonacci numbers themselves: \[ P_0(t) = 1, \qquad P_1(t) = t, \qquad P_n(t) = tP_{n-1}(t) + P_{n-2}(t). \]

P.P.S. In the book Analytic Combinatorics by P. Flajolet and R. Sedgewick, an example in chapter 3 (Proposition III.4) shows that, if one allows compositions with terms of any size, not just 1s and 2s, then for any \(n\geq1\) the average number of terms in a composition of \(n\) is always \((n+1)/2\), and as \(n\to\infty\), the number of times an integer \(m\geq1\) appears in a composition of \(n\) tends to \(n/2^{m+1}\). That is, for large \(n\), about half of the terms are 1s, about a quarter of the terms are 2s, and so on. My guess is that for compositions with terms no larger than \(N\), the appearances of 1, 2, …, \(N\) will distribute according to powers of the positive solution to \(x^N + x^{N-1} + \cdots + x = 1\). When \(N=2\), this solution is \(1/\varphi\), and as \(N\to\infty\) it approaches \(1/2\).

No comments: