Tuesday, June 24, 2025

a broad context for Möbius functions

This post is about the Möbius function, which is not something I have had to think about very often. It has come up in some combinatorial problems I’ve been working on recently, and so I’ve been reading about it bit by bit. One particular presentation I have found especially helpful, and that’s what I’m going to share here.

The “classical” Möbius function \(\mu\) is defined on the positive integers by \[ \mu(n) = \begin{cases} 1 & \text{if}\ n = 1 \\ (-1)^r & \text{if}\ n=p_1\cdots p_r,\ \text{a product of \(r\) distinct primes} \\ 0 & \text{otherwise.} \end{cases} \] I have never particularly understood this function. Its definition is simple, but to me opaque. Why would this function be of interest? (I’m not interested in answers such as “because it works.”) Maybe it’s just because I’m not a number theorist at heart.

For independent reasons, I recently read the monograph Commutation and Rearrangements by Pierre Cartier and Dominique Foata, and therein they demonstrate the existence of a Möbius function for any monoid \(M\) with finite factorization: that is, given any \(x\in M\), there are only finitely many sequences \((x_1,\dots,x_n)\) of non-identity elements \(x_i\) such that \(x=x_1\cdots x_n\). If \(1_M\) is the identity element of \(M\), then it is assumed to be the product of the empty sequence.

As two examples, think of the positive integers with multiplication as the operation, or strings over a given alphabet with concatenation as the operation. In a sense, these are two “extreme” examples, because the positive integers form a commutative monoid with an infinite set of generators (the primes), while the only elements that commute in the monoid of strings are powers of a single generator (the elements of the alphabet being the generators).

One consequence of the assumption that every element of \(M\) has finitely many factorizations is that \(1_M\) has only the empty factorization, because otherwise if \(1_M=x_1\cdots x_n\) for some nonempty sequence \((x_1,\dots,x_n)\), then this could be used to produce infinitely many factorizations of every element of \(M\).

Here is the definition that Cartier and Foata give for the Möbius function \(\mu_M\) of M: given \(x\in M\),

  • \(d_+(x)\) is the number of ways \(x\) can be written as a product of an even number of factors;
  • \(d_-(x)\) is the number of ways \(x\) can be written as a product of an odd number of factors;
  • \(\mu_M(x) = d_+(x)-d_-(x)\).

This is a definition I can get behind! Now the Möbius function is revealed as the difference between two two natural counting functions. Note that for any \(M\), because the identity \(1_M\) has only the empty factorization, with zero factors, \(d_+(1_M)=1\) and \(d_-(1_M)=0\), which means \(\mu_M(1_M)=1\) always.

Let’s see how the rest of the values of \(\mu_M\) work out for the two examples we have in mind.

First suppose that \(M\) is the monoid of strings over an alphabet \(A\). (This will turn out to be the easier of the two cases.) If \(x\in A\) is a string with only one letter, then it can only be written as \(x\), so \(\mu_M(x) = -1\). If \(x\in M\) has \(n>1\) letters, then it can be factored in \(2^{n-1}\) ways, by introducing “breaks” between adjacent letters. Half of these factorizations have an even number of factors, and half have an odd number of factors, so \(d_+(x)=d_-(x)=2^{n-2}\), which means \(\mu_M(x) = 0\). For example, we can factor the string \(acbb\) in eight ways:

\((acbb)\) \((a)(c)(bb)\) \((a)(cb)(b)\) \((ac)(b)(b)\)
\((a)(cbb)\) \((ac)(bb)\) \((acb)(b)\) \((a)(c)(b)(b)\)
The top row shows the factorizations with an odd number of factors, and the bottom row shows the factorizations with an even number of factors.

Now suppose \(M = \mathbb{Z}_+\) is the monoid of positive integers. If \(p\) is a prime, then it can only be written as the product \(p\), so \(d_+(p)=0\) and \(d_-(p)=1\), which means \(\mu_M(p) = -1\). If \(p\) and \(q\) are distinct primes, then we have \(pq = (pq) = (p)(q) = (q)(p)\), so \(d_+(pq) = 2\) and \(d_-(pq) = 1\), which means \(\mu_M(pq) = 2-1 = 1\). On the other hand, \(p^2 = (p^2) = (p)(p)\), so \(d_+(p^2)=d_-(p^2)=1\), which means \(\mu_M(p^2)=0\). More generally, if \(x=p_1\cdots p_n\) with \(p_1,\dots,p_n\) all distinct primes, then any factorization of \(x\) is given by choosing \(k_1\) prime factors, then \(k_2\) prime factors, and so on, until \(x\) is expressed as a product of \(m\) factors as \[x=(p_1\cdots p_{k_1})(p_{k_1+1}\cdots p_{k_1+k_2})\cdots(p_{k_1+\cdots+k_{m-1}+1}\cdots p_{k_1+\cdots+k_m})\] where \(k_1+\cdots+k_m=n\). Within each parenthetical factor, the order of the primes doesn’t matter, but the order of the parenthetical factors does matter. This is an ordered partition of the prime factors of \(x\).

It is possible to compute \(d_+(x)\) and \(d_-(x)\) directly, based on the ordered partitions of the prime factors of \(x\). (See OEIS A000670 for the total number of ordered partitions of \(n\) elements, A089677 for the number of ordered partitions with an odd number of blocks, and A052841 for the number of ordered partitions with an even number of blocks.) However, here we will instead use induction to show the simpler result we need, that if \(p_1,\dots,p_n\) are \(n\) distinct primes, then \[ d_+(p_1\dots p_n) - d_-(p_1\dots p_n) = (-1)^n \] which recovers the classical Möbius function on numbers of this form. We have already shown the cases \(n=1\) and \(n=2\) in the previous paragraph, so suppose the result is true for some arbitrary value of \(n\), and let \(q\) be a prime distinct from \(p_1,\dots,p_n\). Set \(x=p_1\cdots p_n\). A factorization of \(xq\) can be obtained from a factorization of \(x\) in three ways:

  1. by inserting \(q\) into one of the existing factors;
  2. by introducing \(q\) as a new factor immediately before one of the existing factors;
  3. by introducing \(q\) as a new factor at the end.

The first operation preserves the parity of the number of factors, while the second operation switches the parity of the number of factors. Because there are the same number of these two types of operations, in computing the difference \(d_+(xq)-d_-(xq)\), they cancel out. The third operation also switches the parity of the number of factors, and it is only applied once to each of the factorizations of \(x\). Thus \[ d_+(xq)-d_-(xq) = d_-(x)-d_+(x) = -(-1)^n = (-1)^{n+1} \] and the result follows.

On the other hand, if \(x\in\mathbb{Z}_+\) is divisible by a prime power \(p^k\) with \(k>1\), then we can first examine the ways to split up this power. By treating \(p^k\) as a string over the alphabet \(\{p\}\), we see that there are the same number of factorizations having an even number of factors as an odd number of factors, and so \(d_+(p^k)-d_-(p^k)=0\). Introducing the rest of the prime factors of \(x\) one at a time, the same argument as in the previous paragraph shows that \(d_+(x)-d_-(x)=d_+(p^k)-d_-(p^k)=0\). (The case where more than one prime factor of \(x\) appears with multiplicity higher than \(1\) is left as an exercise to the reader.)

The purpose of the Möbius function of \(M\) is mainly to enable Möbius inversion, a way of transforming between related complex-valued functions on \(M\). I won’t go into that process now, because this post has taken long enough already, but I encourage anyone interested to read Cartier and Foata’s monograph (which has a load of other great ideas!).

I will observe that in the original paper where Möbius introduced his eponymous function ( “Über eine besondere Art von Umkehrung der Reihen”, 1832), he derived the expression \[ x = \frac{x}{1-x} - \frac{x^2}{1-x^2} - \frac{x^3}{1-x^3} - \frac{x^5}{1-x^5} + \frac{x^6}{1-x^6} - \frac{x^7}{1-x^7} + \frac{x^{10}}{1-x^{10}} \cdots \] which can be written more succinctly as \[ x = \sum_{n=1}^\infty \frac{\mu(n)x^n}{1-x^n} \] (the expression on the right is a Lambert series). In this context the definition of \(\mu(n)\) also makes sense to me, as a collection of coefficients, as it can be derived by an application of the inclusion–exclusion principle. Recall that the summation formula for the geometric series yields \[ \frac{x^n}{1-x^n} = \sum_{k=1}^\infty x^{nk}\text. \] So if you start with the series \(x+x^2+x^3+\cdots = x/(1-x)\) and you want to reduce it to \(x\), you can remove all the powers that are multiples of \(2\), then remove all powers that are multiples of \(3\), then add back in all powers that are multiples of \(6\) because they got removed twice, then remove all powers that are multiples of \(5\), then add back powers that are multiples of \(10\) or \(15\), then remove powers that are multiples of \(30\) because they got removed three times and added back in three times, and so on…

No comments: