Thales’ triangles

Tuesday, June 24, 2025

a broad context for Möbius functions

This post is about the Möbius function, which is not something I have had to think about very often. It has come up in some combinatorial problems I’ve been working on recently, and so I’ve been reading about it bit by bit. One particular presentation I have found especially helpful, and that’s what I’m going to share here.

The “classical” Möbius function $\mu$ is defined on the positive integers by \[ \mu(n) = \begin{cases} 1 & \text{if}\ n = 1 \\ (-1)^r & \text{if}\ n=p_1\cdots p_r,\ \text{a product of $r$ distinct primes} \\ 0 & \text{otherwise.} \end{cases} \] I have never particularly understood this function. Its definition is simple, but to me opaque. Why would this function be of interest? (I’m not interested in answers such as “because it works.”) Maybe it’s just because I’m not a number theorist at heart.

For independent reasons, I recently read the monograph Commutation and Rearrangements by Pierre Cartier and Dominique Foata, and therein they demonstrate the existence of a Möbius function for any monoid $M$ with finite factorization: that is, given any $x\in M$, there are only finitely many sequences $(x_1,\dots,x_n)$ of non-identity elements $x_i$ such that $x=x_1\cdots x_n$. If $1_M$ is the identity element of $M$, then it is assumed to be the product of the empty sequence.

As two examples, think of the positive integers with multiplication as the operation, or strings over a given alphabet with concatenation as the operation. In a sense, these are two “extreme” examples, because the positive integers form a commutative monoid with an infinite set of generators (the primes), while the only elements that commute in the monoid of strings are powers of a single generator (the elements of the alphabet being the generators).

One consequence of the assumption that every element of $M$ has finitely many factorizations is that $1_M$ has only the empty factorization, because otherwise if $1_M=x_1\cdots x_n$ for some nonempty sequence $(x_1,\dots,x_n)$, then this could be used to produce infinitely many factorizations of every element of $M$.

Here is the definition that Cartier and Foata give for the Möbius function $\mu_M$ of M: given $x\in M$,

$d_+(x)$ is the number of ways $x$ can be written as a product of an even number of factors;
$d_-(x)$ is the number of ways $x$ can be written as a product of an odd number of factors;
$\mu_M(x) = d_+(x)-d_-(x)$.

This is a definition I can get behind! Now the Möbius function is revealed as the difference between two natural counting functions. Note that for any $M$, because the identity $1_M$ has only the empty factorization, with zero factors, $d_+(1_M)=1$ and $d_-(1_M)=0$, which means $\mu_M(1_M)=1$ always.

Let’s see how the rest of the values of $\mu_M$ work out for the two examples we have in mind.

First suppose that $M$ is the monoid of strings over an alphabet $A$. (This will turn out to be the easier of the two cases.) If $x\in A$ is a string with only one letter, then it can only be written as $x$, so $\mu_M(x) = -1$. If $x\in M$ has $n>1$ letters, then it can be factored in $2^{n-1}$ ways, by introducing “breaks” between adjacent letters. Half of these factorizations have an even number of factors, and half have an odd number of factors, so $d_+(x)=d_-(x)=2^{n-2}$, which means $\mu_M(x) = 0$. For example, we can factor the string $acbb$ in eight ways:

$(acbb)$	$(a)(c)(bb)$	$(a)(cb)(b)$	$(ac)(b)(b)$
$(a)(cbb)$	$(ac)(bb)$	$(acb)(b)$	$(a)(c)(b)(b)$

The top row shows the factorizations with an odd number of factors, and the bottom row shows the factorizations with an even number of factors.

Now suppose $M = \mathbb{Z}_+$ is the monoid of positive integers. If $p$ is a prime, then it can only be written as the product $p$, so $d_+(p)=0$ and $d_-(p)=1$, which means $\mu_M(p) = -1$. If $p$ and $q$ are distinct primes, then we have $pq = (pq) = (p)(q) = (q)(p)$, so $d_+(pq) = 2$ and $d_-(pq) = 1$, which means $\mu_M(pq) = 2-1 = 1$. On the other hand, $p^2 = (p^2) = (p)(p)$, so $d_+(p^2)=d_-(p^2)=1$, which means $\mu_M(p^2)=0$. More generally, if $x=p_1\cdots p_n$ with $p_1,\dots,p_n$ all distinct primes, then any factorization of $x$ is given by choosing $k_1$ prime factors, then $k_2$ prime factors, and so on, until $x$ is expressed as a product of $m$ factors as \[x=(p_1\cdots p_{k_1})(p_{k_1+1}\cdots p_{k_1+k_2})\cdots(p_{k_1+\cdots+k_{m-1}+1}\cdots p_{k_1+\cdots+k_m})\] where $k_1+\cdots+k_m=n$. Within each parenthetical factor, the order of the primes doesn’t matter, but the order of the parenthetical factors does matter. This is an ordered partition of the prime factors of $x$.

It is possible to compute $d_+(x)$ and $d_-(x)$ directly, based on the ordered partitions of the prime factors of $x$. (See OEIS A000670 for the total number of ordered partitions of $n$ elements, A089677 for the number of ordered partitions with an odd number of blocks, and A052841 for the number of ordered partitions with an even number of blocks.) However, here we will instead use induction to show the simpler result we need, that if $p_1,\dots,p_n$ are $n$ distinct primes, then \[ d_+(p_1\dots p_n) - d_-(p_1\dots p_n) = (-1)^n \] which recovers the classical Möbius function on numbers of this form. We have already shown the cases $n=1$ and $n=2$ in the previous paragraph, so suppose the result is true for some arbitrary value of $n$, and let $q$ be a prime distinct from $p_1,\dots,p_n$. Set $x=p_1\cdots p_n$. A factorization of $xq$ can be obtained from a factorization of $x$ in three ways:

by inserting $q$ into one of the existing factors;
by introducing $q$ as a new factor immediately before one of the existing factors;
by introducing $q$ as a new factor at the end.

The first operation preserves the parity of the number of factors, while the second operation switches the parity of the number of factors. Because there are the same number of these two types of operations, in computing the difference $d_+(xq)-d_-(xq)$, they cancel out. The third operation also switches the parity of the number of factors, and it is only applied once to each of the factorizations of $x$. Thus \[ d_+(xq)-d_-(xq) = d_-(x)-d_+(x) = -(-1)^n = (-1)^{n+1} \] and the result follows.

On the other hand, if $x\in\mathbb{Z}_+$ is divisible by a prime power $p^k$ with $k>1$, then we can first examine the ways to split up this power. By treating $p^k$ as a string over the alphabet $\{p\}$, we see that there are the same number of factorizations having an even number of factors as an odd number of factors, and so $d_+(p^k)-d_-(p^k)=0$. Introducing the rest of the prime factors of $x$ one at a time, the same argument as in the previous paragraph shows that $d_+(x)-d_-(x)=d_+(p^k)-d_-(p^k)=0$. (The case where more than one prime factor of $x$ appears with multiplicity higher than $1$ is left as an exercise to the reader.)

The purpose of the Möbius function of $M$ is mainly to enable Möbius inversion, a way of transforming between related complex-valued functions on $M$. I won’t go into that process now, because this post has taken long enough already, but I encourage anyone interested to read Cartier and Foata’s monograph (which has a load of other great ideas!).

I will observe that in the original paper where Möbius introduced his eponymous function ( “Über eine besondere Art von Umkehrung der Reihen”, 1832), he derived the expression \[ x = \frac{x}{1-x} - \frac{x^2}{1-x^2} - \frac{x^3}{1-x^3} - \frac{x^5}{1-x^5} + \frac{x^6}{1-x^6} - \frac{x^7}{1-x^7} + \frac{x^{10}}{1-x^{10}} \cdots \] which can be written more succinctly as \[ x = \sum_{n=1}^\infty \frac{\mu(n)x^n}{1-x^n} \] (the expression on the right is a Lambert series). In this context the definition of $\mu(n)$ also makes sense to me, as a collection of coefficients, as it can be derived by an application of the inclusion–exclusion principle. Recall that the summation formula for the geometric series yields \[ \frac{x^n}{1-x^n} = \sum_{k=1}^\infty x^{nk}\text. \] So if you start with the series $x+x^2+x^3+\cdots = x/(1-x)$ and you want to reduce it to $x$, you can remove all the powers that are multiples of $2$, then remove all powers that are multiples of $3$, then add back in all powers that are multiples of $6$ because they got removed twice, then remove all powers that are multiples of $5$, then add back powers that are multiples of $10$ or $15$, then remove powers that are multiples of $30$ because they got removed three times and added back in three times, and so on…

Friday, March 03, 2023

an average number of 1s and 2s

The terms of the Virahanka–Fibonacci sequence $(V(n))_{n\ge0}$ (OEIS A000045, shifted to start with $V(0)=1$ instead of 0) count the number of ways each natural number $n$ can be expressed as a sum of 1s and 2s, in which the order in the sum matters. For example, $V(5) = 8$ because \[ \begin{array}{rl} 5 &= 1 + 1 + 1 + 1 + 1 \\ &= 1 + 1 + 1 + 2 = 1 + 1 + 2 + 1 = 1 + 2 + 1 + 1 = 2 + 1 + 1 + 1 \\ &= 1 + 2 + 2 = 2 + 1 + 2 = 2 + 2 + 1 \end{array} \]

Here is one way to find an exact expression for $V(n)$. The coefficient of $x^n$ in the polynomial $(x+x^2)^k$ expresses the number of ways to write $n$ as a sum of exactly $k$ 1s and 2s. (Try it!) We can add these polynomials together as a geometric series to obtain the generating function \[ f(x) = \sum_{n=0}^\infty V(n)\, x^n = \sum_{k=0}^\infty \left(x+x^2\right)^k = \frac{1}{1 - x - x^2}. \] The denominator $1 - x - x^2$ can be factored as $(1 - \varphi x)(1 + \varphi^{-1} x)$, where $\varphi - \varphi^{-1} = 1$. The positive solution to this latter equation is the golden ratio \[ \varphi = \frac{1 + \sqrt5}{2}. \] Using partial fractions, we can write \[ \begin{array}{rl} f(x) &= \dfrac{\varphi}{\varphi+\varphi^{-1}}\cdot\dfrac{1}{1 - \varphi x} + \dfrac{\varphi^{-1}}{\varphi+\varphi^{-1}}\cdot\dfrac{1}{1 + \varphi^{-1} x} \\ &= \displaystyle \frac{\varphi}{\varphi+\varphi^{-1}}\sum_{n=0}^\infty (\varphi x)^n + \frac{\varphi^{-1}}{\varphi+\varphi^{-1}}\sum_{n=0}^\infty (-\varphi^{-1}x)^n \\ &= \displaystyle\sum_{n=0}^\infty \frac{\varphi^{n+1} - (-\varphi)^{-(n+1)}}{\varphi+\varphi^{-1}} x^n \\ \end{array} \] and by equating coefficients we get an explicit formula for $V(n)$ in terms of powers of $\varphi$: \[ V(n) = \frac{\varphi^{n+1} - (-\varphi)^{-(n+1)}}{\varphi+\varphi^{-1}}. \] This is known as Binet’s formula. (The denominator in this expression is simply $\sqrt5$. But writing it as $\varphi-(-\varphi^{-1})$ instead better shows how it relates to the numerator.)

Now, what if we want to know not only how many ways $n$ can be expressed as a sum of 1s and 2s, but how many 1s and 2s are needed, on average, to create a sum that equals $n$? For example, among the eight ways to write 5 as a sum of 1s and 2s, there are a total of twenty 1s and ten 2s, with thirty terms in all. So, on average, a sum that equals 5 has 2.5 1s and 1.25 2s, for an average of 3.75 terms.

We can adapt the method of generating functions by introducing new variables that allow us to keep track of how many times 1 and 2 appear in a sum. We’ll use $t$ to represent a 1 in a sum and $u$ to represent a 2. Then the coefficient of $x^n$ in $(tx+ux^2)^k$ will be a polynomial in $t$ and $u$ such that the coefficient of $t^p u^q$ counts the number of ways to express $n$ as a sum of $p$ 1s and $q$ 2s, with $p+q = k$. As before, we can add these polynomials together in a series to get the multivariate generating function \[ F(x,t,u) = \sum_{k=0}^\infty \left(tx+ux^2\right)^k = \frac{1}{1 - tx - ux^2}. \] Note that $F(x,1,1) = f(x)$. When $F(x,t,u)$ is expanded as a power series, the coefficient of $x^n$ is a polynomial $P_n(t,u)$ such that the coefficient of $t^p u^q$ counts the total number of ways to express $n$ as a sum of $p$ 1s and $q$ 2s.

We can determine the coefficients of $P_n(t,u)$ explicitly using binomial coefficients. If a sum has $p+q$ terms of which $p$ are 1s and $q$ are 2s, then there are $\binom{p+q}{p}=\binom{p+q}{q}$ ways to arrange the terms (among the $p+q$ terms, select the positions for the 1s or for the 2s, respectively). If the sum is equal to $n$, then $p+2q=n$, and so \[ \begin{array}{rl} P_n(t,u) &= \displaystyle\sum_{p+2q=n} \binom{p+q}{p} t^p u^q = \displaystyle\sum_{p+2q=n} \binom{p+q}{q} t^p u^q \\ &= \displaystyle\sum_{q=0}^{\lfloor n/2 \rfloor} \binom{n-q}{q} t^{n-2q}u^q. \end{array} \] Because $P_n(1,1)=V(n)$, we thereby also obtain an expression for $V(n)$ as a sum of binomial coefficients: \[ V(n) = \sum_{q=0}^{\lfloor n/2 \rfloor} \binom{n-q}{q}. \] Here the parameter $q$ counts the number of terms equal to 2. The total number of 2s in all compositions of $n$ into 1s and 2s is thus \[ \sum_{q=0}^{\lfloor n/2 \rfloor} q\binom{n-q}{q} \] This sequence begins 0, 0, 1, 2, 5, 10, 20, 38, 71, … (OEIS A001629). If we divide these values by the corresponding terms of the sequence $V(n)$, we obtain the average number of 2s in all compositions of $n$ into 1s and 2s for particular values of $n$:

in a sum of 1s and 2s equal to	1	2	3	4	5	6	7	8
the average number of 2s is	0	1/2	2/3	1	5/4	20/13	38/21	71/34

But how can we study how this average behaves asymptotically?

Let’s write $T_1(n)$ for the total number of 1s in all sums of 1s and 2s equal to $n$, $T_2(n)$ for the total number of 2s, and $T(n)=T_1(n)+T_2(n)$ for the total number of all terms. We already have an expression for $T_2(n)$ above; let’s consider its structure. It takes each coefficient of $t^p u^q$ in $P_n(t,u)$ and multiplies that coefficient by $q$, then adds together the results. The same quantity can be obtained by differentiating $P_n(t,u)$ with respect to $u$ and evaluating the result at $(t,u)=(1,1)$. That is, \[ T_2(n) = \sum_{q=0}^{\lfloor n/2 \rfloor} q\binom{n-q}{q} = \frac{\partial}{\partial u}P_n(t,u)\bigg\vert_{(t,u)=(1,1)}. \] Similarly, \[ T_1(n) = \frac{\partial}{\partial t}P_n(t,u)\bigg\vert_{(t,u)=(1,1)} \] and \[ T(n) = \frac{d}{dt}P_n(t,t)\bigg\vert_{t=1} = \frac{\partial}{\partial t}P_n(t,u)\bigg\vert_{(t,u)=(1,1)} + \frac{\partial}{\partial u}P_n(t,u)\bigg\vert_{(t,u)=(1,1)}. \] To carry out this computation for all degrees simultaneously, we can use the partial derivatives of $F(x,t,u)$ with respect to $t$ and $u$: \[ \frac{\partial}{\partial t} F(x,t,u) = \frac{\partial}{\partial t} \frac{1}{1 - tx - ux^2} = \frac{x}{(1 - tx - ux^2)^2} \] \[ \frac{\partial}{\partial u} F(x,t,u) = \frac{\partial}{\partial u} \frac{1}{1 - tx - ux^2} = \frac{x^2}{(1 - tx - ux^2)^2} \] Evaluating these at $(t,u)=(1,1)$ gives us the generating functions for $T_1(n)$ and $T_2(n)$: \[ \sum_{n=0}^\infty T_1(n)\,x^n = \frac{x}{(1 - x - x^2)^2} \] \[ \sum_{n=0}^\infty T_2(n)\,x^n = \frac{x^2}{(1 - x - x^2)^2} \] Note that these functions are respectively $x(f(x))^2$ and $x^2(f(x))^2$. So it is enough to determine the coefficients of $(f(x))^2$, and then shift them as appropriate: \[ (f(x))^2 = \displaystyle\left(\sum_{n=0}^\infty V(n)\, x^n\right)^2 = \displaystyle\sum_{n=0}^\infty \left(\sum_{k=0}^n V(k)V(n-k)\right) x^n \] From the formula for $V(n)$ in terms of $\varphi$, the coefficient of $x^n$ becomes \[ \begin{array}{rl} &\displaystyle\sum_{k=0}^n \left(\frac{\varphi^{k+1}-(-\varphi)^{-(k+1)}}{\varphi+\varphi^{-1}}\right) \left(\frac{\varphi^{n-k+1}-(-\varphi)^{-(n-k+1)}}{\varphi+\varphi^{-1}}\right) \\ =&\displaystyle\frac{1}{(\varphi+\varphi^{-1})^2}\sum_{k=0}^n\big(\varphi^{n+2} + \varphi^k(-\varphi)^{k-n} + \varphi^{n-k}(-\varphi)^{-k} + (-\varphi)^{-(n+2)}\big) \\ =&\displaystyle\frac{1}{5}\left(\sum_{k=0}^n\big(\varphi^{n+2} + (-\varphi)^{-(n+2)}\big) + \frac{1}{(-\varphi)^n}\sum_{k=0}^n(-\varphi^2)^k + \varphi^n\sum_{k=0}^n (-\varphi^{-2})^k\right) \\ =&\displaystyle\frac{1}{5}\left((n+1)\big(\varphi^{n+2} + (-\varphi)^{-(n+2)}\big) + 2\cdot\frac{\varphi^{n+1}-(-\varphi)^{-(n+1)}}{\varphi+\varphi^{-1}}\right) \\ =&\displaystyle\frac{1}{5}\big((n+1)V(n+2) + (n+3)V(n)\big) \end{array} \] where we have used the formula for a sum of a partial geometric series, along with the fact that \[ \big(\varphi^{n+2}+(-\varphi)^{-(n+2)}\big)\frac{\varphi+\varphi^{-1}}{\varphi+\varphi^{-1}} = \frac{\varphi^{n+3}-(-\varphi)^{-(n+3)}}{\varphi+\varphi^{-1}}+\frac{\varphi^{n+1}-(-\varphi)^{-(n+1)}}{\varphi+\varphi^{-1}} \] (the reader is encouraged to fill in other details of the calculations). Therefore \[ \frac{T_1(n)}{V(n)} = \frac{1}{5}\cdot\frac{nV(n+1)+(n+2)V(n-1)}{V(n)} \] \[ \frac{T_2(n)}{V(n)} = \frac{1}{5}\cdot\frac{(n-1)V(n)+(n+1)V(n-2)}{V(n)} \] and thus, knowing that $\displaystyle\lim_{n\to\infty} V(n+1)/V(n) = \varphi$, we have \[ \lim_{n\to\infty}\frac1n\cdot\frac{T_1(n)}{V(n)} = \frac{1}{5}\left(\varphi+\frac{1}{\varphi}\right) = \frac{1}{\sqrt5} \] \[ \lim_{n\to\infty}\frac1n\cdot\frac{T_2(n)}{V(n)} = \frac{1}{5}\left(1+\frac{1}{\varphi^2}\right) = \frac{2}{5+\sqrt5} \] \[ \lim_{n\to\infty}\frac1n\cdot\frac{T(n)}{V(n)} = \frac{1}{5}\big(\varphi+2\big) = \frac{5+\sqrt5}{10} \] That is, if $n$ is large, on average a sum of 1s and 2s equal to $n$ will have about $n\cdot\dfrac{5+\sqrt5}{10}\approx 0.7236n$ terms, and the ratio of 1s to 2s will on average be about $\varphi$, or about 1.618. It took some work to get here, but once we have the answer in hand, it isn’t too surprising: the prevalence of $\varphi$’s appearances on this topic makes it entirely likely that the 1s and 2s would appear in that ratio, and once we know that $p+2q=n$ and $p\approx\varphi q$, we can solve to find $p\approx n/\sqrt5$ and $q\approx n/(\varphi\sqrt5)$.

P.S. The polynomials $P_n(t) = P_n(t,1)$ that appear as coefficients of $x^n$ in the expansion of $1/(1-tx-x^2)$ are sometimes called Fibonacci polynomials. They satisfy a recurrence relation similar to that of the Fibonacci numbers themselves: \[ P_0(t) = 1, \qquad P_1(t) = t, \qquad P_n(t) = tP_{n-1}(t) + P_{n-2}(t). \]

P.P.S. In the book Analytic Combinatorics by P. Flajolet and R. Sedgewick, an example in chapter 3 (Proposition III.4) shows that, if one allows compositions with terms of any size, not just 1s and 2s, then for any $n\geq1$ the average number of terms in a composition of $n$ is always $(n+1)/2$, and as $n\to\infty$, the number of times an integer $m\geq1$ appears in a composition of $n$ tends to $n/2^{m+1}$. That is, for large $n$, about half of the terms are 1s, about a quarter of the terms are 2s, and so on. My guess is that for compositions with terms no larger than $N$, the appearances of 1, 2, …, $N$ will distribute according to powers of the positive solution to $x^N + x^{N-1} + \cdots + x = 1$. When $N=2$, this solution is $1/\varphi$, and as $N\to\infty$ it approaches $1/2$.

Wednesday, December 21, 2022

converting between sums

The number $5$ can be written in eight ways as a sum of $1$s and $2$s (in which order matters): \[ 1+1+1+1+1,\; 1+1+1+2,\; 1+1+2+1,\; 1+2+1+1, \] \[ 2+1+1+1,\; 1+2+2,\; 2+1+2,\; 2+2+1. \] The number $6$ can also be written in eight ways as a sum of odd numbers: \[ 1+1+1+1+1+1,\; 1+1+1+3,\; 1+1+3+1,\; 1+3+1+1, \] \[ 3+1+1+1,\; 1+5,\; 3+3,\; 5+1. \] Coincidence? No, and if you look carefully at the two lists of sums above, you might be able to guess how the rest of the post is going to go.

Let $a(n)$ be the number of ways to write $n$ as a sum of $1$s and $2$, and let $b(n)$ be the number of ways to write $n$ as a sum of (positive) odd numbers. The overall goal of this essay is to show that $b(n) = a(n-1)$, in two different ways.

As it turns out, $a(n)$ and $b(n)$ are both closely related to (or, depending on your perspective, the same as) the Fibonacci sequence $f(n)$, defined recursively by $f(1) = f(2) = 1$, and for $n\geq3$ \[ f(n) = f(n-1) + f(n-2). \] The sequence $f(n)$ begins, starting at $n=1$, with the terms $1, 1, 2, 3, 5, 8, 13, \dots$. The recursive formula for this sequence famously arises from the rabbit problem posed by Leonardo of Pisa: each month, Fibonacci’s Immortal Rabbits™ produce a new pair from each pair more than a month old. The number of pairs of rabbits in month $n$ is therefore the sum of the number of pairs the previous month (all of them having stuck around), and the number of pairs from two months prior (all of these pairs having produced a new pair). Start with $f(1)=1$ in the first month, and the Fibonacci sequence counts the number of pairs of rabbits you have in the $n$th month.

We can see that $a(1) = 1$ and $a(2) = 2$, because $1$ can be written just as $1$, and $2$ can be written as $2$ or as $1+1$. For larger values of $n$, the ways of writing $n$ as a sum of $1$s and $2$s can be partitioned into the sums that start with $1$ and those that start with $2$. Therefore, $a(n)$ can be expressed as $a(n-1) + a(n-2)$ for $n \ge 3$. The sequence $a(n)$ begins $1,2,3,5,8,13,21,\dots$. This is the Fibonacci sequence again, but shifted by one position: $a(n) = f(n+1)$. It is perhaps more appropriately called the Virahanka sequence. What happened to the initial $1$ in the Fibonacci sequence, though? It shifted to the 0th position: the empty sum, having no terms, is equal to 0 by definition, so $a(0)=1$. Zero can be written as a sum of $1$s and $2$ in one way, as the empty sum. (I have begun calling both $f(n)$ and $a(n)$ the Fibonacci–Virahanka sequence.)

Now let’s try to do something similiar for the sequence $b(n)$. We can check by hand that $b(1) = b(2) = 1$, because only the sum $1+1$ is allowed for $2$. Any sum of odd numbers could start with 1, or with 3, or with 5, and so on, which implies \[ b(n) = b(n-1) + b(n-3) + b(n-5) + b(n-7) + \cdots. \] While this looks potentially like an infinite sum, notice that for any particular $n$, the number of nonzero terms is finite, because $b(n) = 0$ when $n < 0$. But notice what happens when we peel off the first term of this infinite sum: \[ \begin{array}{rl} b(n) &= b(n-1) + \big(b(n-3) + b(n-5) + b(n-7) + \cdots\big) \\ &= b(n-1) + \big(b((n-2)-1) + b((n-2)-3) + b((n-2)-5) + \cdots\big) \\ &= b(n-1) + b(n-2) \end{array} \] So $b(n)$ satisfies the same recursive formula as $f(n)$ and $a(n)$! Thus this sequence begins $1,1,2,3,5,8,13,\dots$, which looks the same as the start of the sequence $f(n)$. Well, almost. It’s not quite the same sequence as $f(n)$, because $b(0)$, like $a(0)$, equals 1. So, starting at $n=0$, the sequence $b(n)$ begins $1,1,1,2,3,5,8,\dots$. The “empty sum” convention is necessary for the recursive formula for $b(n)$ to work in the case of both even and odd values of $n$. However, in Fibonacci’s rabbit problem, the rabbit keeper has zero pairs of rabbits in month 0, so $f(0)=0\ne b(0)$.

The conclusion of the preceding is that $a(n)=b(n+1)$ when $n\ge0$. Yet a second proof was promised. So consider a sum of $1$s and $2$s that is equal to $n$. For example, if $n=9$, we could write \[ 9 = 2 + 1 + 1 + 2 + 2 + 1. \] Now append a $1$ to the beginning of the sum, then wherever a $2$ appears, group it with the preceding terms, until a $1$ is reached, to produce an odd number. For example, using the sum above that equals $9$, we get \[ 10 = (1 + 2) + 1 + (1 + 2 + 2) + 1 = 3 + 1 + 5 + 1. \] This process can be reversed. Given a sum of odd numbers, convert each term to a sum of $1$ followed by a sequence of $2$s, then remove the initial $1$. For example, the sum \[16 = 1 + 3 + 5 + 7\] is converted to \[1 + (1 + 2) + (1 + 2 + 2) + (1 + 2 + 2 + 2),\] which corresponds to \[15 = 1 + 2 + 1 + 2 + 2 + 1 + 2 + 2 + 2.\] Thus we have found an explicit bijection between sums of $1$s and $2$ equal to $n$ and sums of odd numbers equal to $n+1$. These sums are also called compositions. A comment on the OEIS entry for the Fibonacci sequence observes that, more generally, “the numbers of compositions [of $n$] using parts $1$ and $k$ is equivalent to the numbers of compositions of $(n+1)$ using parts $1$ mod $k$.” An exercise for the reader! Can you generalize further?

Saturday, June 04, 2022

in triplicate

Academic grades, as conventionally understood and used, have three distinct audiences, and they serve a different function for each audience. Grades are problematic largely due to the ways these functions are at odds with each other.

The three audiences are:

The student. In the context of a single course, assessment and feedback are necessary parts of the educational process. They let the student know how they’re doing as they go along. Traditional letter grades are a crude form of feedback, and where they are used students rarely have a chance to meaningfully respond, other than to try to adapt for the next assessment. Alternative grading systems often treat the entire course duration as a traning period, in which the student can respond to feedback through revision, reassessment, or other forms of demonstrated proficiency following an initial evaluation.
The institution. Let’s be as generous as possible and assume that the primary goal of the college or university in which a student is enrolled is to educate that student in such a way that they reach their fullest potential. This requires communication among the numerous instructors that student will encounter throughout their studies, and also coordination with the other offices and institutional bodies that are present to support the student. A grade can be a succinct summary to the next instructor in a sequence about how developed are the student’s prerequisite skills. It can also be a signal to the institution about how well the student is progressing in their overall academic career. (This is the reason for “mid-semester” grades that can be used to alert the school when a student needs assistance or intervention.)
The outside world. Outside of the institution (and inside as well, to an extent), grades become currency that the student can trade for prestige or opportunities. This currency may be in the form of a GPA (to which all of the individual course grades contribute, despite being largely incommensurable with each other) or, for a more granular view, a transcript (which provides the opportunity to craft a narrative about the grades). Of course, not all persons or organizations outside academia value this currency in the same way. But “good grades” provide an inexhaustible supply of recommendations, and “bad grades” are a perpetual obstacle to be minimized or navigated around.

Thus grades are expected to operate at three different social scales, and also at three different time scales. The feedback to a student within a course is short-term, the communcation with the institution is medium-term, and the message to the outside world is long-term (or as they say, permanent). Much of the “objectivity theater” surrounding the assignment of grades is based on the pretence that these three purposes can be fulfilled by a single summative object.

The fact that the third use of grades has both the largest audience and the longest-lasting effects means that it becomes their dominant purpose, their telos. Student anxiety about grades, for the most part, is caused not by an intrinsic dislike of getting feedback about how they can improve their understanding and performance in a subject, but by the belief that in the end what has the greatest practical impact is the final letter or number they can show to the outside world when the course is done. Institutional concerns about “rigor” are based not on the needs of the student, but the needs of the school to present the final scores to the outside world as meaningful indicators of their students’ quality.

Those of us engaged in the practice of developing and implementing alternative grading systems are primarily focused on the first and smallest-scale purpose of grades, providing a useful feedback process to the student. Yet our systems must also interface with the institutional and outside world audiences. At those interfaces lie, in my view, the most difficult ethical questions of grading: how do we support students beyond our time as their instructor? how do we provide an honest evaluation that meets the needs of all three audiences? to whom are we primarily responsible? is the merging of the three functions into a single metric flawed in such a way that it needs to be overthrown?

Honestly, I think (at the time of this writing) that grades are most useful at the institutional level. If it were not for the outward-facing use of grades, they could serve as a quick, qualitative (not quantitative) shorthand in communicating among the internal parts of a college or university what are the needs or successes of an individual student. (To be supplemented by more personalized detail as necessary.) Within a course, as we’ve seen, any number of assessment/feedback systems can work, as long as they’re built on clear communication and building trust in the student-faculty relationship. As for the presentation of grades to the outside world, well, that’s where the dirty work happens.

Monday, May 23, 2022

more on poetry and partitions (sort of)

In my previous post I defined the ordered Virahanka numbers $V_{m,n}$ and the unordered Virahanka numbers $U_{m,n}$ recursively by \[ \begin{array}{rl} V_{m,n} = U_{m,n} = 0 &\qquad \text{for $n < 0$ or $m < 0$} \\ V_{m,0} = U_{m,0} = 1 &\qquad \text{for $m \ge 0$} \\ V_{m,n} = \displaystyle\sum_{k=1}^m V_{m,n-k} &\qquad \text{for $n > 0$} \\ U_{m,n} = \displaystyle\sum_{k=1}^m U_{k,n-k} &\qquad \text{for $n > 0$} \\ \end{array} \] The value of $V_{m,n}$ counts the number of compositions of $n$ with no term greater than $m$, and the value of $U_{m,n}$ counts the number of partitions of $n$ with no part greater than $m$. (Following a standard convention, I consider the “empty sum” having no terms to be equal to $0$, which is why $V_{m,0} = U_{m,0} = 1$; moreover, $V_{0,n} = U_{0,n} = 0$ for $n > 0$, because the recursive formulas above become empty sums.) Some special cases:

$V_{2,n}$ is the sequence of Virahanka–Fibonacci numbers $1,1,2,3,5,8,\dots$.
$V_{n,n}$ is the total number of compositions of $n$ (which equals $2^{n-1}$ for $n \ge 1$).
$U_{n,n}$ is the total number of partitions of $n$ (which has no known closed-form expression).

For later reference, here are partial tables of values for $V_{m,n}$ and $U_{m,n}$. \[ V_{m,n}: \begin{array}{c|c|c|c|c|c|c|c|c|c} m\backslash n & 0 & 1 & 2 & 3 & 4 & 5 & 6 & 7 & 8 & 9 & 10 \\ \hline 0 & 1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\ \hline 1 & 1 & 1 & 1 & 1 & 1 & 1 & 1 & 1 & 1 & 1 & 1 \\ \hline 2 & 1 & 1 & 2 & 3 & 5 & 8 & 13 & 21 & 34 & 55 & 89 \\ \hline 3 & 1 & 1 & 2 & 4 & 7 & 13 & 24 & 44 & 81 & 149 & 274 \\ \hline 4 & 1 & 1 & 2 & 4 & 8 & 15 & 29 & 56 & 108 & 208 & 401 \\ \hline 5 & 1 & 1 & 2 & 4 & 8 & 16 & 31 & 61 & 120 & 236 & 464 \\ \hline 6 & 1 & 1 & 2 & 4 & 8 & 16 & 32 & 63 & 125 & 248 & 492 \\ \hline 7 & 1 & 1 & 2 & 4 & 8 & 16 & 32 & 64 & 127 & 253 & 504 \\ \hline 8 & 1 & 1 & 2 & 4 & 8 & 16 & 32 & 64 & 128 & 255 & 509 \\ \hline 9 & 1 & 1 & 2 & 4 & 8 & 16 & 32 & 64 & 128 & 256 & 511 \\ \hline 10 & 1 & 1 & 2 & 4 & 8 & 16 & 32 & 64 & 128 & 256 & 512 \end{array} \] \[ U_{m,n}: \begin{array}{c|c|c|c|c|c|c|c|c|c} m\backslash n & 0 & 1 & 2 & 3 & 4 & 5 & 6 & 7 & 8 & 9 & 10 \\ \hline 0 & 1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\ \hline 1 & 1 & 1 & 1 & 1 & 1 & 1 & 1 & 1 & 1 & 1 & 1 \\ \hline 2 & 1 & 1 & 2 & 2 & 3 & 3 & 4 & 4 & 5 & 5 & 6 \\ \hline 3 & 1 & 1 & 2 & 3 & 4 & 5 & 7 & 8 & 10 & 12 & 14 \\ \hline 4 & 1 & 1 & 2 & 3 & 5 & 6 & 9 & 11 & 15 & 18 & 23 \\ \hline 5 & 1 & 1 & 2 & 3 & 5 & 7 & 10 & 13 & 18 & 23 & 30 \\ \hline 6 & 1 & 1 & 2 & 3 & 5 & 7 & 11 & 14 & 20 & 26 & 35 \\ \hline 7 & 1 & 1 & 2 & 3 & 5 & 7 & 11 & 15 & 21 & 28 & 38 \\ \hline 8 & 1 & 1 & 2 & 3 & 5 & 7 & 11 & 15 & 22 & 29 & 40 \\ \hline 9 & 1 & 1 & 2 & 3 & 5 & 7 & 11 & 15 & 22 & 30 & 41 \\ \hline 10 & 1 & 1 & 2 & 3 & 5 & 7 & 11 & 15 & 22 & 30 & 42 \end{array} \]

There are so many fun ways to play with these collections of numbers, it’s hard to know when to stop! To limit the scope of this post, I will only consider two related notions, and those only in part: first differences, and generating functions.

First differences

Given a collection of numbers $S_{m,n}$ indexed by integers $m$ and $n$, and a pair $(p,q) \in \mathbb{Z}^2$, let \[ S_{m,n}^{(p,q)} = S_{m,n} - S_{m-p,n-q}\text. \]

From the definitions, we have for $n > 1$ \[ \begin{array}{rl} V_{m,n}^{(0,1)} &= V_{m,n} - V_{m,n-1} \\ &= \sum_{k=1}^m V_{m,n-k} - \sum_{k=1}^m V_{m,(n-1)-k} \\ &= \sum_{k=1}^m \big(V_{m,n-k} - V_{m,(n-k)-1}\big) \\ &= \sum_{k=1}^m V_{m,n-k}^{(0,1)} \end{array} \] and so the first differences $V_{m,n}^{(0,1)}$ satisfy the same recurrence relation as the original numbers $V_{m,n}$ when $n\ge 2$. Here is the table of values for $V_{m,n}^{(0,1)}$. \[ V_{m,n}^{(0,1)}: \begin{array}{c|c|c|c|c|c|c|c|c|c} m\backslash n & 0 & 1 & 2 & 3 & 4 & 5 & 6 & 7 & 8 & 9 & 10 \\ \hline 0 & 1 & -1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\ \hline 1 & 1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\ \hline 2 & 1 & 0 & 1 & 1 & 2 & 3 & 5 & 8 & 13 & 21 & 34 \\ \hline 3 & 1 & 0 & 1 & 2 & 3 & 6 & 11 & 20 & 37 & 68 & 125 \\ \hline 4 & 1 & 0 & 1 & 2 & 4 & 7 & 14 & 27 & 52 & 100 & 193 \\ \hline 5 & 1 & 0 & 1 & 2 & 4 & 8 & 15 & 30 & 59 & 116 & 228 \\ \hline 6 & 1 & 0 & 1 & 2 & 4 & 8 & 16 & 31 & 62 & 123 & 244 \\ \hline 7 & 1 & 0 & 1 & 2 & 4 & 8 & 16 & 32 & 63 & 126 & 251 \\ \hline 8 & 1 & 0 & 1 & 2 & 4 & 8 & 16 & 32 & 64 & 127 & 254 \\ \hline 9 & 1 & 0 & 1 & 2 & 4 & 8 & 16 & 32 & 64 & 128 & 255 \\ \hline 10 & 1 & 0 & 1 & 2 & 4 & 8 & 16 & 32 & 64 & 128 & 256 \end{array} \] We see that the “initial conditions” of the recurrence $V_{m,n}^{(0,1)} = \sum_{k=1}^m V_{m,n-k}^{(0,1)}$ have shifted, so that before beginning to apply the recurrence relation, we insert one term equal to $0$. But what are these numbers counting? Well, $V_{m,n}$ is the number of compositions of $n$ with no term greater than $m$, and $V_{m,n-1}$ is the number of compositions of $n-1$ with no term greater than $m$. Any composition of $n-1$ can be converted to a composition of $n$ by adding a $1$ at the end. So $V_{m,n}^{(0,1)}$ equals the number of compositions of $n$ that do not end with $1$. Perhaps not the most interesting quantity to count, although having the recurrence formula is nice. (The same principle will lead to other interesting quantities, so don’t go away yet!)

On the other hand, $V_{m,n}^{(1,0)} = V_{m,n} - V_{m-1,n}$ does have an interesting interpretation. Because $V_{m-1,n}$ counts all compositions of $n$ whose largest term is at most $m-1$, $V_{m,n}^{(1,0)}$ equals the number of compositions of $n$ whose largest term is exactly $m$. Here’s a partial table of values. \[ V_{m,n}^{(1,0)}: \begin{array}{c|c|c|c|c|c|c|c|c|c} m\backslash n & 0 & 1 & 2 & 3 & 4 & 5 & 6 & 7 & 8 & 9 & 10 \\ \hline 0 & 1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\ \hline 1 & 0 & 1 & 1 & 1 & 1 & 1 & 1 & 1 & 1 & 1 & 1 \\ \hline 2 & 0 & 0 & 1 & 2 & 4 & 7 & 12 & 20 & 33 & 54 & 88 \\ \hline 3 & 0 & 0 & 0 & 1 & 2 & 5 & 11 & 23 & 47 & 94 & 185 \\ \hline 4 & 0 & 0 & 0 & 0 & 1 & 2 & 5 & 12 & 27 & 59 & 127 \\ \hline 5 & 0 & 0 & 0 & 0 & 0 & 1 & 2 & 5 & 12 & 28 & 63 \\ \hline 6 & 0 & 0 & 0 & 0 & 0 & 0 & 1 & 2 & 5 & 12 & 28 \\ \hline 7 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 1 & 2 & 5 & 12 \\ \hline 8 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 1 & 2 & 5 \\ \hline 9 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 1 & 2 \\ \hline 10 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 1 \end{array} \] One interesting feature of this table is that the entries stabilize below the line $n = 2m$: when $n < 2m$, we see that $V_{m+1,n+1}^{(1,0)} = V_{m,n}^{(1,0)}$. The sequence $V_{m,2m-1}^{(0,1)}$ begins (starting with $m = 1$) \[ 1,\;2,\;5,\;12,\;28,\;64,\;144,\;320,\;704,\;\dots \] The OEIS says that this sequence comes from counting all the $1$s that appear in all compositions of $m$. Why should that sequence arise here? I guess that’s a mystery that will have to wait for later…

Turning to first differences of the unordered Virahanka numbers, let’s start with $U_{m,n}^{(1,0)}$. \[ U_{m,n}^{(1,0)}: \begin{array}{c|c|c|c|c|c|c|c|c|c} m\backslash n & 0 & 1 & 2 & 3 & 4 & 5 & 6 & 7 & 8 & 9 & 10 \\ \hline 0 & 1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\ \hline 1 & 0 & 1 & 1 & 1 & 1 & 1 & 1 & 1 & 1 & 1 & 1 \\ \hline 2 & 0 & 0 & 1 & 1 & 2 & 2 & 3 & 3 & 4 & 4 & 5 \\ \hline 3 & 0 & 0 & 0 & 1 & 1 & 2 & 3 & 4 & 5 & 7 & 8 \\ \hline 4 & 0 & 0 & 0 & 0 & 1 & 1 & 2 & 3 & 5 & 6 & 9 \\ \hline 5 & 0 & 0 & 0 & 0 & 0 & 1 & 1 & 2 & 3 & 5 & 7 \\ \hline 6 & 0 & 0 & 0 & 0 & 0 & 0 & 1 & 1 & 2 & 3 & 5 \\ \hline 7 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 1 & 1 & 2 & 3 \\ \hline 8 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 1 & 1 & 2 \\ \hline 9 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 1 & 1 \\ \hline 10 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 1 \end{array} \] It looks like no new numbers have appeared in this table; each row has simply been shifted to right, with the $m$th row having $m$ new $0$s at the start. Indeed, using the recurrence formula for $U_{m,n}$, we calculate that \[ \begin{array}{rl} U_{m,n}^{(1,0)} &= U_{m,n} - U_{m-1,n} \\ &= \sum_{k=1}^m U_{k,n-k} - \sum_{k=1}^{m-1} U_{k,n-k} \\ &= U_{m,n-m} \end{array} \] Why should this be true, from a counting perspective? Reasoning as we did with $V_{m,n}^{(1,0)}$, we see that $U_{m,n}^{(0,1)}$ counts the number of partitions of $m$ whose largest part is exactly $m$. But if the largest part is exactly $m$, then it only remains to partition $n-m$ using parts no greater than $m$, which is exactly what $U_{m,n-m}$ counts! So $U_{m,n}^{(0,1)} = U_{m,n-m}$.

Ok, now for taking first differences of $U_{m,n}$ within rows. \[ U_{m,n}^{(0,1)}: \begin{array}{c|c|c|c|c|c|c|c|c|c} m\backslash n & 0 & 1 & 2 & 3 & 4 & 5 & 6 & 7 & 8 & 9 & 10 \\ \hline 0 & 1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\ \hline 1 & 1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\ \hline 2 & 1 & 0 & 1 & 0 & 1 & 0 & 1 & 0 & 1 & 0 & 1 \\ \hline 3 & 1 & 0 & 1 & 1 & 1 & 1 & 2 & 1 & 2 & 2 & 2 \\ \hline 4 & 1 & 0 & 1 & 1 & 2 & 1 & 3 & 2 & 4 & 3 & 5 \\ \hline 5 & 1 & 0 & 1 & 1 & 2 & 2 & 3 & 3 & 5 & 5 & 7 \\ \hline 6 & 1 & 0 & 1 & 1 & 2 & 2 & 4 & 3 & 6 & 6 & 9 \\ \hline 7 & 1 & 0 & 1 & 1 & 2 & 2 & 4 & 4 & 6 & 7 & 10 \\ \hline 8 & 1 & 0 & 1 & 1 & 2 & 2 & 4 & 4 & 7 & 7 & 11 \\ \hline 9 & 1 & 0 & 1 & 1 & 2 & 2 & 4 & 4 & 7 & 8 & 11 \\ \hline 10 & 1 & 0 & 1 & 1 & 2 & 2 & 4 & 4 & 7 & 8 & 12 \end{array} \] Hmm, that’s curious: this is the first time we’ve seen any cases (outside of the $0$th row or the $1$st column) where the entries occasionally decrease moving left to right. What exactly are we seeing? Well, any partition of $n-1$ can be converted to a partition of $n$ by adjoining a $1$, so the difference between $U_{m,n}$ and $U_{m,n-1}$ is the number of partitions of $n$ that do not include $1$ as a part and whose largest part is at most $m$. So,

In row $2$, we are only allowing partitions using $2$, so every even number has one partition, and every odd number has zero.
In row $3$, we are only allowing partitions using $2$ and $3$. By hand, we find that $2$, $3$, $4$, and $5$ have one partition each, but $6$ has two: $6 = 2 + 2 + 2 = 3 + 3$. However, $7$ only has one: $7 = 3 + 2 + 2$. From this point, each time we add $6$, we increase the number of possible partitions by $1$, so $U_{3,n+6} = U_{3,n}+1$.

As the rows go on, the jumps down become harder to predict, in my view. But we could plug them into the OEIS and see what comes up! Here’s what row 4 yields.

Evidently, there is more fun to be had, looking at first differences $V_{(m,n)}^{(p,q)}$ and $U_{m,n}^{(p,q)}$ for other values of $(p,q)$, second differences, and so on. But for now let’s turn to a new game.

Generating functions

Given a sequence of numbers $\{a_n\}$, the generating function of the sequence is the formal power series \[ g(x) = \sum a_n x^n\text. \] I have not included a starting or ending index in order to allow for whatever values of $n$ make sense. In cases where the power series converges to an analytic function on some neighborhood of $0$, this function is also called the generating function of the sequence, and the terms of the sequence are the Taylor coefficients of the function.

The Virahanka–Fibonacci numbers $V_{2,n}$ produce the generating function \[ g_2(x) = \sum_{n=0}^\infty V_{2,n}x^n\text. \] Because $V_{2,0} = 1$, $V_{2,n} = 0$ for $n < 0$, and $V_{2,n} = V_{2,n-1} + V_{2,n-2}$ for $n \ge 1$, we can rewrite the power series as \[ \begin{array}{rl} g_2(x) &= \displaystyle 1 + \sum_{n=1}^\infty (V_{2,n-1} + V_{2,n-2}) x^n \\ &= \displaystyle 1 + \sum_{n=1}^\infty V_{2,n-1} x^n + \sum_{n=1}^\infty V_{2,n-2} x^n \\ &= \displaystyle 1 + x\sum_{n=1}^\infty V_{2,n-1} x^{n-1} + x^2\sum_{n=1}^\infty V_{2,n-2} x^{n-2} \\ &= \displaystyle 1 + x\sum_{n=0}^\infty V_{2,n} x^n + x^2\sum_{n=0}^\infty V_{2,n} x^n \\ &= 1 + xg_2(x) + x^2g_2(x) \end{array} \] and therefore $g_2(x) = 1 + xg_2(x) + x^2g_2(x)$. Solving this equation, we find \[ g_2(x) = \frac{1}{1 - x - x^2}\text. \] By a similar process, if we set \[ g_m(x) = \sum_{n=0}^\infty V_{m,n}x^n \] then using $V_{m,0} = 1$ and $V_{m,n} = V_{m,n-1} + \cdots + V_{m,n-m}$ for $n > 0$, we get \[ g_m(x) = \frac{1}{1 - x - \cdots - x^m}\text. \] As $m \to \infty$, the series $x + x^2 + \dots + x^m$ that is subtracted in the denominator converges to $x/(1-x)$, so the generating functions converge to \[ g_{\infty}(x) = \frac{1}{1 - x/(1 - x)} = \frac{1 - x}{1 - 2x}\text. \] This is precisely equivalent to \[ g_{\infty}(x) = 1 + \sum_{n=1}^\infty 2^{n-1} x^n \] which is the generating function of the sequence $V_{n,n}$ to which the terms of $V_{m,n}$ stabilize as $m$ grows.

Now let’s find the generating functions of the $U_{m,n}$ sequences. For each $m \ge 0$, set \[ f_m(x) = \sum_{n=0}^\infty U_{m,n}x^n \] By inspection, we have \[ f_0(x) = 1 \qquad\text{and}\qquad f_1(x) = 1 + x + x^2 + \cdots = \frac{1}{1 - x}\text. \] (These are the same as $g_0(x)$ and $g_1(x)$.) We know that $U_{2,n} = U_{1,n-1} + U_{2,n-2}$ for $n \ge 1$, and so \[ \begin{array}{rl} f_2(x) &= \displaystyle 1 + \sum_{n=1}^\infty (U_{1,n-1} + U_{2,n-2}) x^n \\ &= \displaystyle 1 + x\sum_{n=1}^\infty U_{1,n-1}x^{n-1} + x^2\sum_{n=1}^\infty U_{2,n-2} x^{n-2} \\ &= \displaystyle 1 + x\sum_{n=0}^\infty U_{1,n}x^n + x^2\sum_{n=0}^\infty U_{2,n} x^n \\ &= 1 + xf_1(x) + x^2f_2(x) \end{array} \] This implies $(1 - x^2) f_2(x) = 1 + xf_1(x)$, or \[ f_2(x) = \left(1 + \frac{x}{1 - x} \right) \frac{1}{1 - x^2} = \frac{1 - x + x}{1 - x} \frac{1}{1 - x^2} = \frac{1}{(1 - x)(1 - x^2)}\text. \] Proceeding inductively, a similar process shows that \[ f_m(x) = 1 + xf_1(x) + x^2f_2(x) + \cdots + x^mf_m(x) \] and therefore \[ f_m(x) = \frac{1}{(1-x)(1-x^2)\cdots(1-x^m)} \] for all $m \ge 2$. On compact subsets of $(-1,1)$, $x^m$ converges uniformly to $0$, and so $f_m$ converges uniformly on compact subsets of $(-1,1)$ to \[ f_\infty(x) = \prod_{k=1}^\infty \frac{1}{1 - x^k} \] which is the generating function of $U_{n,n}$, the sequence that counts all partitions of $n$.

Now that we have the generating functions $g_m(x)$ and $f_m(x)$ in hand, we can have more fun! Note that \[ x\sum a_n x^n = \sum a_n x^{n+1} = \sum a_{n-1} x^n \] and so multiplying a generating function by $x$ shifts all the terms of the sequence to the right by one position. (We have already used this fact above; it’s just worth making the general principle explicit.) This allows us to find the generating functions of the first differences calculated in the previous section. We’ll consider these in the reverse order from what we did previously.

For $U_{m,n}^{(1,0)}$, the generating function is \[ \begin{array}{rl} f_m(x) - f_{m-1}(x) &= \dfrac{1}{(1 - x)\cdots(1 - x^m)} - \dfrac{1}{(1 - x)\cdots(1 - x^{m-1})} \\ &= \dfrac{x^m}{(1 - x)\cdots(1 - x^m)} = x^mf_m(x) \end{array} \] Hey, that’s great! This matches what we just saw: the coefficients of $f_m(x) - f_{m-1}(x)$ are the same as those of $f_m(x)$, but shifted $m$ places to the right.

For $U_{m,n}^{(0,1)}$, the generating function is \[ \begin{array}{rl} f_m(x) - xf_m(x) &= \dfrac{1}{(1 - x)\cdots(1 - x^m)} - \dfrac{x}{(1 - x)\cdots(1 - x^m)} \\ &= \dfrac{1 - x}{(1 - x)\cdots(1 - x^m)} = \dfrac{1}{(1 - x^2)\cdots(1 - x^m)} \end{array} \] Oh, neat—we wanted a function that counts the number of partitions that don’t contain $1$ as a part, and the factor $1-x$ vanished! This suggests that if we want to count partitions that only allow $a_1,\dots,a_N$ then perhaps we should use the generating function $[(1-x^{a_1})\cdots(1-x^{a_N})]^{-1}$? Something to explore further…

Now for $V_{m,n}^{(1,0)}$, the generating function is \[ \begin{array}{rl} g_m(x) - g_{m-1}(x) &= \dfrac{1}{1 - x - \cdots - x^m} - \dfrac{1}{1 - x - \cdots - x^{m-1}} \\ &= \dfrac{x^m}{(1 - x - \cdots - x^m)(1 - x - \cdots - x^{m-1})} \\ &= x^m g_m(x) g_{m-1}(x)\text. \end{array} \] Hmm, this definitely tells us something interesting. Multiplying out the power series for $g_m$ and $g_{m-1}$, and equating coefficients, we find \[ V_{m,n}^{(1,0)} = \sum_{k=0}^{n-m} V_{m,k} V_{m-1,n-m-k}\text. \] Recall from our interpretation of $V_{m,n}^{(1,0)}$ that it counts the number of compositions of $n$ whose largest term is exactly equal to $m$. Indeed, since we know the largest term is $m$, we can sort the compositions by the last term that equals $m$. The sum of the terms before this will be equal to some $k \le n - m$, for which there are $V_{m,k}$ compositions with no terms greater than $m$. After the last term equal to $m$, the remaining terms will sum to $n - k - m$, for which there are $V_{m-1,n-k-m}$ compositions whose terms are all less than $m$. We thus could have found this equality earlier in our study, but the generating function suggested it immediately! (I had hoped that looking at the generating functions would also help explain why $V^{(1,0)}_{m,2m-1}$ equals the number of $1$s in all compositions of $m$, as observed above, but I haven’t quite got there yet… possible update to come!)

For $V_{m,n}^{(0,1)}$, the generating function is \[ \begin{array}{rl} g_m(x) - xg_m(x) &= \dfrac{1}{1 - x - \cdots - x^m} - \dfrac{x}{1 - x - \cdots - x^m} \\ &= \dfrac{1 - x}{1 - x - \cdots - x^m} \end{array} \] Oh, so changing the initial conditions of the sequence just corresponds to changing the numerator of the generating function. That’s neat! (Indeed, the Lucas numbers $L_n$ famously satisfy the same recursive formula as the Virahanka–Fibonacci numbers, $L_n = L_{n-1} + L_{n-2}$, and their generating function is $(2-x)/(1-x-x^2) = g_2(x) + (g_2(x)-xg_2(x))$.)

I can’t help doing one more thing with these generating functions, especially since we haven’t seen much interaction between the numbers $V_{m,n}$ and $U_{m,n}$. Let’s try dividing their generating functions. We know that $V_{m,n}$ is at least as large as $U_{m,n}$, so we’ll divide $g_m$ by $f_m$: \[ \frac{g_m}{f_m} = \frac{(1-x)\cdots(1-x^m)}{1-x-\cdots-x^m}\text. \] What could the coefficients of this generating function represent? Loosely speaking, they carry the information about the number of compositions (with no part greater than $m$) that is not contained in the number of partitions. That’s fairly vague, so let’s consider the case $m = 2$. We calculate \[ \frac{g_2}{f_2} = \frac{(1-x)(1-x^2)}{1-x-x^2} = 1 + \frac{x^3}{1-x-x^2} = 1 + x^3 g_2 \] which gives us the factorization $g_2 = f_2\big(1 + x^3 g_2\big)$, or $g_2 = f_2 + x^3 f_2 g_2$. Equating coefficients of the power series yields the relation \[ V_{2,n} = U_{2,n} + \sum_{k=0}^{n-3} U_{2,k} V_{2,n-3-k}\text. \] This equation has a sensible meaning. Recall that a partition of $n$ can be thought of as a composition (a sum of positive integers summing to $n$) in which the terms are non-increasing. (For example, $2 + 2 + 1 + 1 + 1$ is a partition of $7$, but $2 + 1 + 1 + 2 + 1$ is not.) The right-hand side of the last equation splits the number of compositions into the number of partitions $U_{2,n}$ and the number of non-partitions. If we are only allowing terms equal to $1$ or $2$, then a composition that is not a partition must have a $1$ followed by a $2$ somewhere in its expression. In poetic terms, we can think of $1$ and $2$ as short and long syllables, and $n$ as the number of beats in a line. We sort compositions by where the first appearance of $1 + 2$ occurs. Everything before that in the sum is non-increasing; say this takes $k$ beats (this is the $U_{2,k}$ factor in the sum on the right). Then we have $1 + 2$, which accounts for $3$ more beats, and finally the remaining $n-3-k$ beats can be subdivided in any manner using shrot and long syllables (which is what the $V_{2,n-3-k}$ factor counts).

The sequence of coefficients for $g_4(x)/f_4(x)$ begins \[ 1,\;0,\;0,\;1,\;2,\;5,\;8,\;16,\;30,\;58,\;113,\;217,\;418,\;\dots\text. \] As of this writing (May 2022) this sequence does not appear in the OEIS (should I submit it?). However, the sequence of coefficients for \[ \frac{g_\infty}{f_\infty} = 1 + x^3 + 2x^4 + 5x^5 + 9x^6 + 19x^7 + 37x^8 + 74x^9 + 148x^{10} + \cdots \] does appear, as A178841, which counts “the number of pure inverting compositions of $n$.” These seem to be related to a basis for the algebra of quasisymmetric functions as a module over the symmetric functions. So there’s something deep and important going on with these quotients!

In case it isn’t obvious from this post, I’m still learning about generating functions, and I’m kind of in awe of them. It’s entirely possible that everything in this post appears among the exercises of some book on combinatorics. (Certainly most of them are mild generalizations of facts that can be found in the OEIS.) One of my summer activities may be to finally get through Herbert Wilf’s book generatingfunctionology (the second edition is available for free download at the link).

Wednesday, May 18, 2022

on poetry and partitions

Last fall, I learned the names of Pingala and Hemachandra from a blog post by James Propp. Both have been described as poets and mathematicians, and both produced works related to what are commonly known as the Fibonacci sequence \[ 1,\;1,\;2,\;3,\;5,\;8,\;13,\;21,\;34,\;55,\;89,\;144,\;\dots\text. \] Pingala lived around the 3rd century BCE, and Hemachandra lived in the 12th century CE. (The book Liber abaci—which contains the rabbit problem that associated Fibonacci’s name with the above sequence, along with other, more useful ideas—was produced later, in 1202.) In between them came Virahanka, sometime between the 6th and 8th centuries CE; following Manjul Bhargava, it is Virahanka’s name I will emphasize in this post.

In Sanskrit poetry, each syllable is either short or long, lasting one or two beats. Some forms of poetic meter have a fixed number of syllables in each line, and some have a fixed number of beats in each line. In the latter case, a natural question (asked at least as early as Pingala) is: how many ways can a line of $n$ beats be metrically divided into short and long syllables? For example, a line with three beats could be arranged “short, short, short” or “long, short” or “short, long”. This corresponds to the equalities $3 = 1+1+1 = 2+1 = 1+2$.

Virahanka seems to have been the first to provide a complete answer, stating that each line of $n$ beats must end with either a short syllable or a long syllable; therefore, the number of possibilities equals the sum of the numbers of possibilities for a line with $n-1$ beats and a line with $n-2$ beats. Because a line with one beat can be divided in only one way, and a line with two beats can be divided in two ways (“short, short” or “long”), the resulting sequence is the same as Fibonacci’s rabbit-counting sequence.¹ (The additional initial $1$ in the sequence above will be accounted for momentarily.)

Not only does poetic meter provide a more natural setting for this counting problem than rabbits, it is more reasonably generalized. Suppose, for instance, that instead of just two lengths of syllables, we had a language with three lengths of syllables: one beat, two beats, and three beats. Now how many metrical ways are there to compose a poetic line of $n$ beats? As before, we can work out the first few cases by hand: $1 = 1$, $2 = 1 + 1 = 2$, $3 = 1 + 1 + 1 = 2 + 1 = 1 + 2 = 3$. For larger numbers of beats, as before we can sort by the length of the final syllable ($1$, $2$, or $3$), and then count how many ways there are to complete the line, using our knowledge of previous line lengths. For example, a line with four beats can be divided in seven ways, four that end with a $1$-beat syllable, two that end with a $2$-beat syllable, and one that ends with a $3$-beat syllable: \[ \begin{array}{rl} 4 &= 1 + 1 + 1 + \boxed{1} = 2 + 1 + \boxed{1} = 1 + 2 + \boxed{1} = 3 + \boxed{1} \\ &= 1 + 1 + \boxed{2} = 2 + \boxed{2} \\ &= 1 + \boxed{3}\text. \end{array} \] In this sequence, each term is therefore the sum of the previous three terms: \[ 1,\;2,\;4,\;7,\;13,\;24,\;44,\;81,\;149,\;274,\;504,\;\dots\text. \] (Has anyone seen a biological application of this sequence, even an implausible one?)²

Generalizing further, suppose we have syllables of lengths $1$, $2$, …, $m$, and we wish to count the number of rhythmic ways to compose a line of $n$ beats. Or put another way, how many sums with terms between $1$ and $m$ add up to $n$?³ A sum of positive integers equalling $n$ is called a composition of $n$. Let $V_{m,n}$ be the number of compositions of $n$ whose largest term is no greater than $m$. I will call $V_{m,1},V_{m,2},V_{m,3},\dots$ the $m$th Virahanka sequence, so that $V_{2,n}$ is the usual $n$th Fibonacci number. By the same reasoning as in the case of $V_{2,n}$, we have \[ V_{m,n} = V_{m,n-1} + V_{m,n-2} + \cdots + V_{m,n-m}, \qquad n > m. \]

We can naturally extend $V_{m,n}$ to be defined when $m$ or $n$ equals $0$, or both. Using the convention that the “empty sum” having no terms is equal to $0$, we get $V_{m,0} = 1$ for all $m$, and $V_{0,n} = 0$ for $n \ge 1$. We can even set $V_{m,n} = 0$ when $n < 0$, which allows us to drop the $n > m$ restriction entirely in the recursive formula above.

Interesting patterns⁴ emerge when we arrange the numbers $V_{m,n}$ in a single table:

\[ \begin{array}{c|c|c|c|c|c|c|c|c|c} m\backslash n & 0 & 1 & 2 & 3 & 4 & 5 & 6 & 7 & 8 & 9 & 10 \\ \hline 0 & 1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\ \hline 1 & 1 & 1 & 1 & 1 & 1 & 1 & 1 & 1 & 1 & 1 & 1 \\ \hline 2 & 1 & 1 & 2 & 3 & 5 & 8 & 13 & 21 & 34 & 55 & 89 \\ \hline 3 & 1 & 1 & 2 & 4 & 7 & 13 & 24 & 44 & 81 & 149 & 274 \\ \hline 4 & 1 & 1 & 2 & 4 & 8 & 15 & 29 & 56 & 108 & 208 & 401 \\ \hline 5 & 1 & 1 & 2 & 4 & 8 & 16 & 31 & 61 & 120 & 236 & 464 \\ \hline 6 & 1 & 1 & 2 & 4 & 8 & 16 & 32 & 63 & 125 & 248 & 492 \\ \hline 7 & 1 & 1 & 2 & 4 & 8 & 16 & 32 & 64 & 127 & 253 & 504 \\ \hline 8 & 1 & 1 & 2 & 4 & 8 & 16 & 32 & 64 & 128 & 255 & 509 \\ \hline 9 & 1 & 1 & 2 & 4 & 8 & 16 & 32 & 64 & 128 & 256 & 511 \\ \hline 10 & 1 & 1 & 2 & 4 & 8 & 16 & 32 & 64 & 128 & 256 & 512 \end{array} \]

Looking along and below the diagonal, for instance, we see that $V_{m,n} = 2^{n-1}$ if $1 \le n \le m$, and also $V_{n-1,n} = 2^{n-1} - 1$. These values make sense if we imagine $n$ beats in a poetic line, which may be grouped up into syllables of lengths $1$, $2$, …, $m$. If $m \ge n$, then any grouping is allowed, and so we just need to decide at which of the $n-1$ spaces between beats to break the syllables, yielding $2^{n-1}$ possibilities. Likewise, if $m = n-1$, then the only grouping that is not allowed is to put all the beats into a single syllable. (These values also follow from the recursive equation for $V_{m,n}$, using the formula for summing geometric series: $1 + 2 + \cdots + 2^{n-1} = 2^n - 1$.)

What happens if we disregard the order of terms in the sum? Or said another way, what if each syllable in a line of poetry must be no longer than the previous one (that is, since order doesn’t matter, we always sort the syllables from largest to smallest)? These correspond to sums in which the terms are non-increasing. Let us call the number of such sums equalling $n$ with no terms larger than $m$ the unordered Virahanka number $U_{m,n}$. For example, \[ \begin{array}{ll} 6 &= 1 + 1 + 1 + 1 + 1 + 1 \\ &= 2 + 2 + 2 = 2 + 2 + 1 + 1 = 2 + 1 + 1 + 1 + 1 \\ &= 3 + 3 = 3 + 2 + 1 = 3 + 1 + 1 + 1 \end{array} \] and so $U_{3,6} = 7$. Here is the table of unordered Virahanka numbers:

\[ \begin{array}{c|c|c|c|c|c|c|c|c|c} m\backslash n & 0 & 1 & 2 & 3 & 4 & 5 & 6 & 7 & 8 & 9 & 10 \\ \hline 0 & 1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\ \hline 1 & 1 & 1 & 1 & 1 & 1 & 1 & 1 & 1 & 1 & 1 & 1 \\ \hline 2 & 1 & 1 & 2 & 2 & 3 & 3 & 4 & 4 & 5 & 5 & 6 \\ \hline 3 & 1 & 1 & 2 & 3 & 4 & 5 & 7 & 8 & 10 & 12 & 14 \\ \hline 4 & 1 & 1 & 2 & 3 & 5 & 6 & 9 & 11 & 15 & 18 & 23 \\ \hline 5 & 1 & 1 & 2 & 3 & 5 & 7 & 10 & 13 & 18 & 23 & 30 \\ \hline 6 & 1 & 1 & 2 & 3 & 5 & 7 & 11 & 14 & 20 & 26 & 35 \\ \hline 7 & 1 & 1 & 2 & 3 & 5 & 7 & 11 & 15 & 21 & 28 & 38 \\ \hline 8 & 1 & 1 & 2 & 3 & 5 & 7 & 11 & 15 & 22 & 29 & 40 \\ \hline 9 & 1 & 1 & 2 & 3 & 5 & 7 & 11 & 15 & 22 & 30 & 41 \\ \hline 10 & 1 & 1 & 2 & 3 & 5 & 7 & 11 & 15 & 22 & 30 & 42 \end{array} \]

Again the entries stabilize below the diagonal. The entries $U_{n,n}$ are generally called the number of partitions of $n$. Correspondingly, the remaining values of $U_{m,n}$ count the number of partitions of $n$ with no part greater than $m$, which is why $U_{m,n} = U_{n,n}$ if $m \ge n$. By considering partitions whose largest part is $k$, with $k = 1, \dots, m$, we get a recursive formula for these numbers, as well: \[ U_{m,n} = U_{1,n-1} + U_{2,n-2} + \cdots + U_{m,n-m}. \] Unlike the case of $V_{m,n}$, which can be calculated using only earlier terms in the $m$th (ordered) Virahanka sequence, determining $U_{m,n}$ recursively requires knowing entries from all previous rows.

There is a nice visualization that expresses the relationship between the unordered Virahanka numbers $U_{m,n}$ and the ordered Virahanka numbers $V_{m,n}$. A partition is determined by the number of times each integer appears. So we can think of $U_{m,n}$ as counting the number of integer points $(i_1,\dots,i_m)$ such that $i_k \ge 0$ for all $1\le k\le m$ and $\sum\nolimits_{k=1}^m ki_k = n$. This last condition defines a hyperplane orthogonal to $(1,2,\dots,m)$.

If $m = 2$, for each $n$ we count the number of integer points on the line $x + 2y = n$ in the first quadrant (including the origin and the positive $x$- and $y$- axes).

From this image, you can see the sequence $1, 1, 2, 2, 3, 3, \dots$, or $1+\lfloor n/2 \rfloor$, emerge.

In the case $m = 3$, $U_{3,n}$ counts the number of integer points in the positive octant such that $x + 2y + 3z = n$. The image below shows the 10 points corresponding to partitions of $n = 8$.

Now we need to convert partitions to compositions. A partition in which $k$ appears $i_k$ times corresponds to the $m$-tuple $(i_1,\dots,i_m)$. A composition with the same number of appearances of each $k$ can be obtained by placing the terms of the partition in any order; the number of distinguishable ways of arranging these terms is given by the multinomial coefficient \[ \binom{i_1+\cdots+i_m}{i_1,\dots,i_m} = \frac{(i_1+\cdots+i_m)!}{i_1!\cdots i_m!}\text. \] For example, the partition $8 = 3 + 2 + 2 + 1$ can be arranged into $\binom{1+2+1}{1,2,1} = \frac{4!}{1!2!1!} = 12$ different compositions, first by choosing the position of the $1$, then the positions of the two $2$s, and then placing $3$ in the remaining spot: \[ \begin{array}{rl} 8 &= 3 + 2 + 2 + 1 = 2 + 3 + 2 + 1 = 2 + 2 + 3 + 1 \\ &= 3 + 2 + 1 + 2 = 2 + 3 + 1 + 2 = 2 + 2 + 1 + 3 \\ &= 3 + 1 + 2 + 2 = 2 + 1 + 3 + 2 = 2 + 1 + 2 + 3 \\ &= 1 + 3 + 2 + 2 = 1 + 2 + 3 + 2 = 1 + 2 + 2 + 3 \end{array} \] Thus if we think of each point $(i_1,\dots,i_m)$ as being weighted by this corresponding multinomial coefficient, we can write $V_{m,n}$ as a sum having $U_{m,n}$ terms, as follows: \[ V_{m,n} = \sum_{(i_1,\dots,i_m):\sum ki_k = n} \binom{i_1+\cdots+i_m}{i_1,\dots,i_m}\text. \] When $m = 2$, we have $i_1 + 2i_2 = n$, so $i_1 + i_2 = n - i_2$ and $i_1 = n - 2i_2$, and by setting $j = i_2$ we get the following expression of $V_{2,n}$ (remember, these are the usual Fibonacci numbers) as a sum of binomial coefficients: \[ V_{2,n} = \sum_{j=0}^{\lfloor n/2 \rfloor} \binom{n - j}{n - 2j} = \sum_{j=0}^{\lfloor n/2 \rfloor} \binom{n - j}{j}\text. \] The binomial coefficients are, of course, the entries in what is commonly known as Pascal’s triangle. But this also was known to Indian scholars, possibly even Pingala: a later commentator, Halayudha (10th century CE), explicitly arranged the same numbers into a triangle, under the name of Meru Prastaara⁵, and clarified Pingala’s references to them in relation to poetic meter.

When $m = 3$, the formula is a little more elaborate, so let's first look at an example. We have seen that there are seven partitions of $6$ that have no part greater than $3$, corresponding to the triples $(i_1,i_2,i_3) = (6,0,0)$, $(4,1,0)$, $(2,2,0)$, $(0,3,0)$, $(3,0,1)$, $(1,1,1)$, and $(0,0,2)$. (Remember that these are the triples of positive integers such that $i_1 + 2i_2 + 3i_3 = 6$.) The resulting sum in terms of multinomial coefficients is \[ \begin{array}{rl} V_{3,6} &= \displaystyle\binom{6}{6,0,0} + \binom{5}{4,1,0} + \binom{4}{2,2,0} + \binom{3}{0,3,0} + \binom{4}{3,0,1} + \binom{3}{1,1,1} + \binom{2}{0,0,2} \\ &= 1 + 5 + 6 + 1 + 4 + 6 + 1 = 24 \end{array} \] Now for the general sum⁶: setting $j = i_2$ and $k = i_3$, we have \[ V_{3,n} = \sum_{k=0}^{\lfloor n/3 \rfloor} \sum_{j=0}^{\lfloor (n-3k)/2 \rfloor} \binom{(n-2j-3k)+j+k}{n-2j-3k,j,k} = \sum_{k=0}^{\lfloor n/3 \rfloor} \sum_{j=0}^{\lfloor (n-3k)/2 \rfloor} \binom{n-j-2k}{n-2j-3k,j,k}\text. \]

I’m sure there is much more depth to be explored regarding these numbers, probably all of it known to combinatorialists, but I have to pause here! I hope to post a part 2 soon.

Footnotes:

^1. I first learned about a similar way of obtaining the Virahanka–Fibonacci sequence from the book Proofs that Really Count by Arthur Benjamin and Jennifer Quinn. In that text, the problem is to count the number of ways to tile an $n\times1$ grid using $1\times1$ and $2\times1$ pieces, called squares and dominoes. Then they use this visual presentation to prove several common identities involving terms of the sequence.↩
^2. Some sources, including the OEIS, call numbers in this sequence “Tribonacci numbers.” Cute, but nonsensical, given that the “fi-” prefix refers to the “son” of Bonacci, not to any counting procedure. However, the OEIS does attribute the coining of “tribonacci” to Mark Feinberg in 1963, when he was a 9th grader, and that is honestly a high pedigree—how often does a 14-year-old get to invent a mathematical term that endures for over half a century?↩
^3. Exactly this kind of generalization seems to have been considered by the 14th century mathematician Narayana Pandita. It might therefore be reasonable to name the resulting sequences “Virahanka–Narayana” sequences. See “The So-called Fibonacci Numbers in Ancient and Medieval India” by Parmanand Singh and “Nārāyana’s Generalisation of Mātrā-vrtta-prastāra and the Generalised Virahānka-Fibonacci Representation of Numbers” by Raja Sridharan, R Sridharan and M D Srinivas.↩
^4. Hopefully I’ll write more about these patterns soon in a followup post.↩
^5. According to P. Singh in “The So-called Fibonacci Numbers in Ancient and Medieval India”, “meru is the name of the imaginary mountain that is supposed to stand at the center of the earth.”↩
^6. At least a version of this sum was known to Narayana Pandita. He derived the multinomial coefficients through something called a “needle sequence” or “arrow-head” sequence, which I have not yet fully understood. See the references in Footnote 3.↩

Friday, May 28, 2021

a geometric interpretation for the sum of a geometric series

Start with the geometric series: \[ \sum_{n=0}^\infty z^n = 1 + z + z^2 + z^3 + \cdots\text. \] If $z$ is not a real number, the partial sums of this series form a spiral pattern, which appears to be self-similar. Below is the pattern for $z = (1+i)/2$. (Click for interactive graph.)

A similarity of the plane can be expressed as a (complex) affine function $f(w) = aw + b$. To determine $a$ and $b$ for our spiral, we note that $f$ should satisfy \[ f(0) = 1 \qquad\text{and}\qquad f(1) = 1+z\text. \] The first equation implies $b = 1$. From the second equation, we then have $a + 1 = 1 + z$, so $a = z$. Our function is therefore $f(w) = zw + 1$, with $z$ treated as a constant.

Now we check whether this function $f(w)$ does in fact show that the spiral is self-similar. Indeed, $f(1+z) = z(1+z) + 1 = 1 + z + z^2$, and more generally, the sequence $1$, $f(1)$, $f(f(1))$, $f(f(f(1)))$, … coincides with the sequence of partial sums of the geometric series.

If $z \ne 1$, then the function $f(w)$ has a fixed point. Solve $f(w) = w$, or $zw + 1 = w$, to find that the fixed point is $w = 1/(1 - z)$. When $|z| < 1$, this fixed point is the sum of the series. But in general it has a nice geometric connection to the series, even when the series diverges: it is the center of similarity for the sequence of partial sums. For example, when $z = i+1/\sqrt3$, the partial sums and center look like this:

In particular, when $z$ is a root of unity, the partial sums of the series lie at the vertices of a regular polygon, and $1/(1-z)$ is the center of this polygon. (It is also the Cesàro sum of the series in that case.) Here are the polygons and centers for $z = e^{i\pi/4}$ and $z = e^{i\,3\pi/5}$:

The centers of these polygons all have real part $1/2$, which is a special case of the observation that the function $1/(1-z)$ sends the unit circle to the line $\mathrm{Re}\,z = 1/2$.

When $z = -1$, the polygon collapes to a line segment, and of course the center of this segment is $1/2$, the usual value assigned to the series $1 - 1 + 1 - 1 + \cdots$.

\((acbb)\)	\((a)(c)(bb)\)	\((a)(cb)(b)\)	\((ac)(b)(b)\)
\((a)(cbb)\)	\((ac)(bb)\)	\((acb)(b)\)	\((a)(c)(b)(b)\)