Friday, August 19, 2016

specifications in analysis

Earlier this week, I wrote about expectations for my analysis class this fall (which also apply broadly to upper-level math classes) and some things I learned about specs grading this summer. In this post, I’ll share the specifications I have created for analysis. (I have taught real analysis before, and last time I tried a standards-based approach. Frankly, that basically turned into a point system, albeit a simplified one, which is why I’m trying something completely different this time.)

The rest of the post is taken verbatim from (the current draft of) my syllabus.


Effective learning requires effective methods of assessment. The assessments should relate as directly as possible to the expectations of the class, and they should provide both feedback on how to improve and opportunities to demonstrate improvement as the semester progresses. In my experience, “traditional” grading schemes based on assigning points or percentages to individual tasks do not serve these functions well. Therefore, this course adopts specifications grading*, in which grades are tied to specific outcomes. This is likely to be different from grading policies in other classes you have taken, so feel free to ask me questions or let me know if you have concerns. I hope that this system will make clear the connections between the expectations stated in the previous section and the ways you will be assessed.

Overall grading. At the end of the semester, I am required to submit to the university a letter grade reflecting your achievement in this class. That grade will be determined on the basis of a set of specifications in four areas: (1) class participation, (2) written proofs, (3) exams, and (4) synthesizing activities. Each of these areas will receive a simple grade of A, B, C, D, or F. The following sections describe how these grades will be determined. Your final grade will depend on your performance in all four areas, according to the following table.

Final grade based on individual grades of
A all As, or 3 As and 1 B
A– two As and two Bs
B+ one A and three Bs
B all Bs, or 3 Bs and 1 C
B– two Bs and two Cs
C+ one B and three Cs
C all Cs, or 3 Cs and 1 D
D– two Cs and two Ds
I will use my discretion to assign a final letter grade to other combinations of individual letter grades.

Class participation. Attendance at every class meeting is required. Most weeks, we will alternate days between discussing reading assignments and presenting solutions to exercises. The end of this syllabus has a schedule of what we will be doing in class each day (with allowance for adjustments, as needed).

Reading. In order to participate effectively on discussion days, you will need to read the textbook before coming to class. Each reading assignment is about 10 pages. The textbook attempts to be very accessible, but that does not mean it is easy. We will be working with ideas that stretch reason and imagination. You should be prepared to spend at least 1–2 hours on each reading assignment; rereading pages, paragraphs, or sentences; working out examples; and writing questions or comments in the margins or on separate paper. You should be especially mindful of definitions. These are not always set apart from the text, so pay attention when new vocabulary is introduced. Start working on a list of definitions and theorems from the start of the semester. The chapter summaries can be an aid in this process.

Collaborating. On days with a reading assignment, you will work in small groups to discuss the material. I will assign these groups at the start of each week. You should bring your own questions and thoughts to these discussions. If there is extra time, you can also discuss the current set of exercises.

Presenting. On the remaining days, you will take turns presenting solutions to exercises distributed previously. The solution you present does not necessarily need to be entirely correct, but it should show evidence of a serious effort. You should also be prepared to answer questions from me or other students. To maintain balance, no one will be allowed to present more than once every two weeks, unless every student in the class has already presented during that time period. In exceptional cases, some of these verbal presentations may be made to me outside of class (no more than one per student).

To earn a you must do the following
D attend at least 75% of class meetings
present at least one proof in class
C attend at least 85% of class meetings and contribute to discussions
present at least three proofs in class
B attend at least 90% of class meetings and contribute to discussions
present at least four proofs in class
A attend all class meetings (2 unexcused absences allowed) and contribute to discussions
present at least five proofs in class

Written proofs. Over the semester, you will develop a portfolio of work that you have submitted for formal assessment. Most of your contributions will be proofs. Each week I will indicate one or more exercises whose solutions could be submitted to your portfolio. You may discuss your work with other students in the class, to have them check whether it meets the standards of the class and give you feedback. A proof for the portfolio is due the Monday after it is assigned. These proofs must be typed using LaTeX, Google docs, Microsoft Word, or another system.

When you submit a written proof for your portfolio, I will judge whether it is Successful, Quasi-successful, or Unsuccessful (see the earlier section on “Proofs” under “Expectations” for details about these ratings), and mark it correspondingly with one of S/Q/U. Proofs marked Q or U will not be counted towards your grade. However, proofs can be resubmitted at the cost of one or two of your allotted tokens; see section on “Tokens” below.

To earn a your portfolio must contain
D at least four successful proofs
C at least six successful proofs
B at least eight successful proofs
A at least ten successful proofs

Exams. There will be two midterm exams and a final exam. Each one will have a take-home portion and an in-class portion. [Dates and times, listed in syllabus, omitted here.]

The take-home portions will consist of two or three proofs that you are to complete on your own, without consulting other students. (You may discuss your work with me before turning in the exam, although I might not answer questions directly.) These will be judged as successful, partially successful, or unsuccessful, like the proofs in your portfolio. They cannot be resubmitted after grading, however.

The in-class portions will test your mastery of definitions and the statements of theorems. You will need to be able to state both definitions and theorems properly. You will also be asked to recognize and provide examples of situations or objects where a definition or theorem does or does not apply.

To earn a you must do the following
D correctly answer 60% of in-class test questions
write at least two successful proofs on take-home exams
C correctly answer 75% of in-class test questions
write at least three successful proofs and one quasi-successful proof on take-home exams
B correctly answer 85% of in-class test questions
write at least four successful proofs and two quasi-successful proofs on take-home exams
A correctly answer 95% of in-class test questions, write six successful proofs on take-home exams

Synthesis. To master the ideas of the class, you must spend time synthesizing the material for yourself. The items in this graded section will be added to your portfolio, to complement the proofs. All materials in this section must be typed using LaTeX, Google docs, Microsoft Word, or another system.

List of definitions and theorems. It should be clear at this point that being able to produce accurate statements of definitions and theorems is essential to success in this class. To encourage you to practice these, I am requiring you to create a list of these statements for the entire course. Your list should be organized in some way that makes sense to you—e.g., alphabetically or chronologically.

The textbook can be used as a reference, as can the internet, but how do you quickly recall what definitions we’ve used and how they're related? How do you find the phrasing of a theorem that’s become most familiar? This list should help you in these situations. More importantly, creating it will help you review and organize the material in your own mind.

I will verify your progress on these lists at each in-class exam.

Papers. Twice during the semester, once in the first half and once in the second half, I will provide a list of topics that we have been discussing, from which you can choose to base a paper on. These will be due approximately two weeks after the midterm exams.

There is a third paper that can be completed at any point in the semester on a topic of your choosing, but you must get the topic approved by me before Thanksgiving.

These papers will for the most part be expository, meaning they will present previously known mathematical results (not original research). Here are the requirements for a paper to be acceptable:

  • It should have 1500–4500 words.
  • It should use correct grammar, spelling, notation, and vocabulary.
  • It should be organized into paragraphs and, if you wish, sections.
  • It should cover the topic clearly and reasonably thoroughly, with an intended audience of other math students (who may be assumed to have studied as much analysis as you).
  • It should contain a proof of at least one major result.
  • The writing should be original to you. Of course, small pieces like definitions may be taken directly from another source, but apart from these the paper should be your own work.
  • Citations are generally not necessary in expository mathematical writing, except for the following: a statement of theorem that you are not proving, a peculiar formulation of a concept/definition, or a creative idea (e.g., an uncommon metaphor or illustration) from another source.
  • You may choose to follow the style of our textbook, or a more formally structured math textbook, or something more journalistic or creative, as long as the previous criteria are met.
Papers that do not meet these criteria will be considered unsatisfactory and will not count towards your grade. An unsatisfactory paper can be revised and resubmitted at the cost of three tokens.

To earn a you must do the following
D create a list of definitions and theorems to include in your portfolio
C create a list of definitions and theorems to include in your portfolio
write a paper on one of the topics provided
B create a list of definitions and theorems to include in your portfolio
write two papers on the topics provided, one during each half of the semester
A create a list of definitions and theorems to include in your portfolio
write two papers on the topics provided, one during each half of the semester
write a third paper on a topic of your own choosing related to the class

Tokens. You start out the semester with seven (7) virtual “tokens,” which can be used in various ways:

  • One token allows you to resubmit a written proof initially judged to be quasi-successful (must be used within one week of initial grading).
  • Two tokens allow you to resubmit a written proof initially judged to be unsuccessful (must be used within one week of initial grading).
  • Three tokens allow you to resubmit an unsatisfactory paper (must be used within one week of receiving paper back).
  • One token gives you a 48 hour extension past the due date for a paper.
Unused tokens may be exchanged for a prize at the end of the semester. [maybe?!?]

*Based on Linda Nilson’s book Specifications Grading: Restoring Rigor, Motivating Students, and Saving Faculty Time.

Thursday, August 18, 2016

taking specs seriously

I’ve been an advocate of standards-based grading since I started using it over three years ago. It has addressed many of the concerns I had about the dominant point-based grading system and encouraged students to move forward in their understanding rather than feeling trapped by past performance.

I’m not solely an SBG proponent when it comes to grading, however. For one thing, I find it hard to adapt SBG to upper-level math courses. For another, the time seems ripe for experimentation in grading practices as more of us realize the shortcomings of what we have inherited from decades past. Not that we should constantly reinvent the grading process, but we should be open to various thoughtful ways of providing authentic assessment.

So I was certainly interested a couple of years ago when several fellow instructors began talking about specifications grading, a method espoused by Linda Nilson in her book Specifications Grading: Restoring Rigor, Motivating Students, and Saving Faculty Time. I adopted some of the ideas I heard and appreciated the increased flexibility it offered.

However, it was not until this summer that I read through Nilson’s book. It was useful because it seems Nilson and I think differently in ways I can’t quite put my finger on, and so the book has lots of ideas I would not have intuited on my own. Here are a few of the things I garnered from reading the book that I hadn’t picked up from online discussions (not that these things weren’t said, but this time they stuck):

  • Sometimes it’s OK to use percentages. I’ve been highly points- and percentages-averse since starting SBG. Percentages, my argument went, were essentially meaningless, because they’re constantly being curved (so they don’t really represent a “percentage” of anything) and the difference between 80% and 81% is essentially a coin toss (so they aren’t as linearly ordered as people like to think). But that argument isn’t uniformly true. In a course where precision is important, it is possible to measure, for instance, how many definitions a student can correctly state. For my upcoming analysis class, I expect “A” students to get definitions right 95% of the time, “B” students 85% of the time, “C” students 75% of the time. This really is quantifiable, and a definition is either correct (with respect to the established conventions of the subject) or not, so each one can be graded yes/no. As long as not everything is forced into a percentages model, this can be an effective way to give feedback.
  • Make students work for an A, but give them some choice in how to get there. As instructors, we want an A to represent mastery, an indication that the student can think nimbly and complexly about the subject. Ideally, students who earn an A will be the ones most invested in the subject. To demonstrate all this, students should have ownership of their work. They should make meaningful choices that reflect their interests and their skills as well as the subject at hand.
  • Not everyone has to do everything. This is closely tied to the previous point. Nilson uses the metaphor of “hurdles”: grade levels can be differentiated by having students clear either higher hurdles (more complex, better quality work) or more hurdles (more extensive work), or a mix of the two. I’m not generally a fan of having students earn higher grades by just proving they can do more—that takes more of my time, and more of theirs. But true mastery requires a measure of initiative. Having a small number of optional assignments that give students opportunities to distinguish themselves makes sense as part of a larger grading scheme.
  • There are good reasons to limit reassessments. Of course, one of these reasons is the subtitular “saving faculty time.” In past upper-level classes where I’ve allowed essentially unlimited resubmission, I’ve been swamped/behind at several points in the semester as students frantically tried to get something accepted. But that’s not even the best reason. By limiting reassessments and grading work pass/fail (or pass/progressing/fail or some other variant), students are encouraged to submit their best work each time, and to spend extra time making sure they check its quality before asking me to do so. The onus is on me to establish clear expectations, and on students to meet them. We’re not negotiating what’s acceptable through repeated revision and grading.
I also found the chapter on cognitive models (Chapter 3, “Linking Grades to Outcomes”) helpful in considering what it means to have a higher level of mastery; previously I wasn’t really familiar with anything beyond Bloom’s Taxonomy.

If this post was of interest to you, I hope you’ll consider joining the Google+ Community on “Standards-Based and Specifications Grading” (SBSG), where teachers of diverse disciplines are meeting to discuss how to implement these two particular alternative forms of grading.

Tomorrow I’ll share my full set of specifications for real analysis.

Tuesday, August 16, 2016

expectations in analysis

I’m working on the syllabus for my (junior and senior level) analysis class this fall, and I’d like to share some parts of it, hopefully thereby eliciting feedback. The main thing I’m concerned about is the type of specifications grading I’m adopting for the class—I’ll share that later this week. This post is about establishing the expectations of the course, on which the specifications will be based. None of these are particular to analysis; they establish what I believe any student in an upper-level mathematics course should achieve.

The rest of the post is taken verbatim from (the current draft of) my syllabus.


To learn mathematics, it is essential to engage actively with the material. This is especially true at this stage in your mathematical careers, as the focus of study shifts from developing computational tools to examining underlying concepts and practicing abstract reasoning. This shift may be described as a move from pre-rigorous thinking, which is informal and intuitive, to rigorous thinking, which is formal and precise. (This terminology has been suggested by mathematician Terence Tao; he also includes a post-rigorous stage, in which professional mathematicians work, where one is able to make intuitive arguments that are grounded by formal training.)

The content of this course resides in definitions, theorems, and proofs. You will be expected to state both definitions and theorems accurately and to illustrate them through examples. Mathematics is not merely a collection of disconnected facts, however, and so you will also develop your logical skills by proving mathematical truths, linking definitions to their profound consequences captured by theorems. All of this will happen in the context of a community—two really, our class and the larger mathematical community.

Definitions. In mathematics, as in other sciences, it is necessary to quantify what is being studied and to be able to identify what is of interest at each moment. This is done by carefully establishing and internalizing definitions. This is not to say that definitions do not involve creativity; as a subject develops, often definitions evolve to encompass more or fewer cases, to be more precise, or to reorganize ideas.

By the end of the course, you should be able to:

  • state definitions accurately and explain any notation or previously-defined terms they contain;
  • identify whether or not an object meets the conditions of a given definition;
  • give examples that satisfy a given definition as well as examples that do not satisfy it;
  • test an unfamiliar definition using examples;
  • create new definitions when needed.

Theorems. A theorem has two parts: the antecedent (its assumptions) and the consequent (its conclusions). To take a familiar example, the equation \(a^2 + b^2 = c^2\) by itself is not a theorem; rather, the Pythagorean Theorem states that “If \(c\) is the length of the hypotenuse of a right triangle, and \(a\) and \(b\) are the lengths of its other two sides, then \(a^2 + b^2 = c^2\).” A theorem may not always include the words “if” and “then,” but you should always be able to determine what are the antecedent and the consequent. Sometimes rephrasing the theorem’s statement can help. For example, “Every differentiable function is continuous” can be rephrased as “If a function is differentiable, then it is continuous.” In most cases, the consequent does not imply the antecedent (e.g., not every continuous function is differentiable). A theorem that says one set of conditions holds “if and only if” another set of conditions holds is logically making two statements (the antecedent and consequent can be reversed), and both must be proved.

By the end of the course, you should be able to:

  • state theorems accurately and identify what are their assumptions and their conclusions;
  • determine whether the conditions of a theorem do or do not hold in a given situation, explain why, and determine what the theorem does or does not imply in that situation;
  • recognize logically equivalent forms of a theorem;
  • formulate and test conjectures.

Proofs. Proofs are how we as individuals and as a community determine the truth of mathematical statements, i.e., theorems. Here is one definition of a proof, due to David Henderson: A proof is “a convincing communication that answers -- Why?” The extent to which a proof succeeds, therefore, depends on how well it embodies these three properties: it should be logical (does it convince?), it should be comprehensible (does it communicate?), and it should be intentional (does it answer why?). Evidently, each of these properties depends somewhat on the others. It is thus reasonable to classify proofs into an S/Q/U system:

  • (S) A successful proof makes an argument for the truth of a mathematical statement that is fully convincing to an informed reader or listener. It employs appropriate vocabulary and carefully chosen notation. It avoids sloppy reasoning. It makes clear use of the theorem’s assumptions and, when necessary, previously known results. The best examples provide motivation for the methods chosen. Minor revisions may be advisable, but they do not hinder the overall effectiveness.
  • (Q) A quasi-successful proof contains most of the ideas necessary to make a complete argument. It may have slips in logic or notation, or it may neglect a special case, or it may be hard to read. It contains sufficient evidence, however, that the argument can be “salvaged” by filling in gaps or clarifying language. Serious revision is necessary. [Not in syllabus: thanks to Dan for suggesting “quasi-”.]
  • (U) An unsuccessful proof does not convince an informed person of the truth of the purported theorem, for one or more of the following reasons: – It makes logical leaps or omits key ideas. – It demonstrates incomplete understanding of definitions or notation. – It fails to reference previous results when appropriate. Complete revision is generally necessary.
In other words, a successful proof is of sufficient quality that it could reasonably be accepted as part of a paper in a professional journal. A quasi-successful proof has some merit, but it requires revision, after which it might or might not be acceptable at a professional level. An unsuccessful proof is sufficiently flawed that it would not be acceptable as part of a professional publication.

By the end of the course, you should be able to:

  • evaluate, on the basis of professional standards, whether a given proof is successful or not;
  • write original, successful proofs.

Community. Our class time will be structured primarily around discussion rather than lecture. The idea is to have a space that promotes sharing ideas, making guesses, taking risks, and sharpening our reasoning abilities. I will guide and facilitate these conversations, but everyone is responsible for contributing to discussions, both in small groups and with the entire class. That is, in this course mathematical authority resides not just with me as the instructor, but with every class member. I will give short lectures (20 minutes) when the entire class agrees it would be beneficial, but not more often than once a week.

By the end of the course, you should be able to:

  • engage in discussions about mathematics by sharing questions, proposals, and insights;
  • evaluate others' contributions critically and respond constructively;
  • present your own work in front of an audience and address their comments and questions.

Wednesday, June 29, 2016

why I do math

I spent the past week on retreat with fellow faculty members. During part of this time, we each shared our “vocational journeys,” or the stories of how we have been led to this job and these fields of scholarship. I thought that part of my essay might have broader appeal, so I’m posting it here.

Why do I study and teach mathematics? My research is in the field of pure mathematics, for which it may seem harder to justify an investment of time than for its adjacent field of applied mathematics. Applied math at least tries to tie itself directly to the needs and concerns of our immediate physical world. Pure math is happy to oblige in improving how well we understand the world, but its primary concern is math for math’s sake. (The boundary between these two types of math is highly permeable, and even pure math almost always starts with inspiration from experience.) I’d like to address the question by comparing math with two other areas represented in academia: music and science.

First, math is like music. The aesthetic element in mathematics is essential, not peripheral. I’m not sure, but I think that in the minds of many people mathematics is reduced to a collection of more-or-less arbitrary facts, like the fact that the area of a circle equals pi times the square of its radius. Each of these facts, however, is like the final cadence of a symphony. It may be thrilling by itself, but it’s missing the indispensable context of “where did we start?” and “how did we get here?”

This is why mathematicians insist on proving things: the proof is a whole symphony, not a single chord. Mathematicians are lauded not for stating facts, but for demonstrating their necessity, the way composers and musicians are praised for the whole course of a piece or a performance, not just its ending. When executed well, a proof has rhythm. It has themes that are developed and interwoven. It has counterpoint. It sets up expectations that are satisfied or subverted. Economy of material is valued, but not exclusively; an argument that wanders into neighboring territory, like a modulation to a neighboring key, can provide fuller appreciation of the main theme.

Proofs have a variety of forms, some as common as sonatas and minuets: direct proof, proof by contradiction, proof by induction, proof by picture, proof by exhaustion. We have computer-generated musical compositions and computer-generated mathematical proofs, and in both communities there is healthy debate about whether these artificial creations are beautiful or desirable in such quintessentially human activities. We return over and over to the same pieces and theorems that have inspired us, whether they be simple or grand, and each performer gives her or his own interpretation and inflection to the presentation.

Second, math is like science. Often mathematics is categorized as a science, and that’s not entirely wrong. Science is built on careful observation, winnowing data from the chaff of noise. Science seeks explanation which can be turned into prediction. It invents new tools for collecting information and improves upon those that already exist. It creates models and theories that encompass and relate as many pieces of knowledge as possible.

Where science and math differ is that science deals with the world in which we live, while the world of math is imagined. Imagine that there are such things as points with no volume and perfectly straight lines that connect them. Imagine that numbers have enough solidity that we can move them around en masse by means of undetermined variables, the x, the y, the z. Imagine that once we start counting upwards 1, 2, 3, 4, 5, a thousand, a million, a trillion, a googol,…, we could never reach an end, not in any number of lifetimes in any number of universes. Or imagine that the filigree of a fractal truly exists at every scale, that we can examine it closer and closer and see the ever-increasing detail, that there is no quantum barrier to our exploration, beyond which sight and measurement cease to be meaningful.

When we imagine these things, we create the worlds in which we make our observations. The rules of these worlds are not completely arbitrary, at least not if we want to be able to know anything about them, but they are ours to choose. Each time we choose anew, we enter an undiscovered country. Once in this country, we must return to scientific methods of study. We look for patterns, try to explain them, and check that our explanations make accurate predictions. We must know when to trust the instruments we have—our minds, computer programs, results proved by other mathematicians—and when not to trust them. Like scientists, we have to winnow out the noise.

Mathematical truth persists across ages and cultures, and so it may seem timeless, but our experience of it certainly isn’t. The channels of logic through which a proof flows may be carved out once and for all in eternity or in the human mind (depending on your view of where mathematical truth lies), but like notes on a page they remain inert until they are brought to life by individual or communal study. Like the tree of life in biology or the standard model in physics, mathematical theories are crystallized around our experience and our perception of the world. As Bill Thurston wrote, “mathematics only exists in a living community of mathematicians that spreads understanding and breathes life into ideas both old and new. The real satisfaction from mathematics is in learning from others and sharing with others.”

Tuesday, May 31, 2016

Homology modulo 2

Last week, I was chirping on Twitter about “homology modulo 2”: how closely it matches my geometric intuition of what homology should measure, despite my never having thought seriously about it before, and how its computational simplicity makes it seem like an ideal way to introduce homology to undergraduates, even those who haven’t studied linear algebra. For a very complete graduate-level introduction to homology (and cohomology) modulo 2, check out Jean-Claude Hausmann’s book. I will instead try to demonstrate how this topic can be introduced at nearly any level, with an appropriate amount of care. For the sake of brevity, I will assume familiarity with linear algebra in this post; however, the necessarily elements (image, kernel, rank, row reduction) can easily be learned in the context of homology, particularly when working modulo 2.

Note: This post got long enough in the writing that I didn’t make any pictures to go with it, so you should draw your own! The idea is to discover how algebra can be used to extract geometric/topological information in a way that is really striking when you see it happen.

The space

For simplicity of exposition, I will only consider spaces \(X\) that are created from finitely many convex polytopes (often simplices or hypercubes) by making some identifications (“gluings”) between their faces. The faces are not necessarily joined in pairs, however; more than two faces of the same dimension may be identified, or some faces might not be joined at all. A more careful definition is possible, but to provide one would get away from the fairly non-technical introduction I’m aiming for. Just assume no funny stuff happens, OK? The polytopes that make up \(X\) are called the cells of \(X\); the collection of cells includes all the faces of all the polytopes we started with (some of which, as noted above, have been identified with each other in pairs or larger groupings). Each cell, being a polytope, has a dimension, and if we wish to specify the dimension of a cell as \(k\), we call it a \(k\)-cell.

For example, \(X\) could just be a single convex polytope. Or it could be a convex polytope with the interior removed (keeping in mind that the boundary of a convex polytope is a union of convex polytopes of one dimension lower). The outside of a cube, for instance, is made up of six 2-cells (the faces), twelve 1-cells (the edges), and eight 0-cells (the vertices). A torus, when made from a rectangle by identifying opposite sides, is also such a space, with one 2-cell (the interior of the rectangle), two 1-cells (the result of identifying the edges in pairs), and one 0-cell (because all corners of the square are identified to the same point).

The data

The homology of \(X\) measures the difference between objects in \(X\) that have no boundary (these are called cycles) and objects that are the boundaries of other objects (called, quite sensibly, boundaries). A \(k\)-dimensional cycle that is not a boundary is supposed to “enclose” a \(k\)-dimensional “hole” in \(X\). The formal definitions are intended to quantify what is meant by “boundary;” the intuitive notion of “hole” floats along, generally defying proper definition (and often even intuition).

By “object” in the previous paragraph, we mean something made up from the cells of \(X\). We restrict ourselves to putting together cells of the same dimension, producing objects called chains. That is, a \(k\)-chain is just a collection of \(k\)-cells in \(X\). We can add together \(k\)-chains, but—and this is the beautifully simple part—we add modulo 2. If a particular cell appears twice, then this pair of appearances cancel each other out. The idea is that, since we’re trying to study “holes” in our space \(X\), if one cell appears twice, the pair of copies can be joined up along their common boundary and safely removed. Formally, a \(k\)-chain is a linear combination of \(k\)-cells, with coefficients in the field with two elements, if you find such a formal description helpful.

We now proceed to the key combinatorial data of our space \(X\) and see how it can be used to extract topological information. Because \(X\) is made up of finitely many cells, for each \(k = 1, \dots, n\), we can construct a boundary matrix \(\partial_k\). (Normally \(\partial_k\) would be defined as a linear map between certain vector spaces; we are fully exploiting the equivalence between linear maps and matrices.) The columns of \(\partial_k\) are labelled by the \(k\)-cells of \(X\), and the rows are labelled by the \((k-1)\)-cells. In each column, we put a 1 in each position where the corresponding \((k-1)\)-cell lies in the boundary of the given \(k\)-cell, and a 0 otherwise. Exception. Sometimes the faces of a single \(k\)-cell may be joined to each other, meaning the resulting \((k-1)\)-cell appears with multiplicity on the boundary of that \(k\)-cell. This multiplicity, modulo 2, is taken into account in the boundary matrix. See the boundary matrices of the torus, near the end, for examples of this phenomenon.

A concrete example: the tetrahedron

The boundary matrix, like most computational objects, is best understood through examples. Let’s start with the empty tetrahedron. Label the vertices \(v_1\), \(v_2\), \(v_3\), \(v_4\), and let \(f_i\) be the triangular face opposite \(v_i\). Let \(e_{ij}\) be the edge joining \(v_i\) to \(v_j\), with \(i < j\). Then we have two boundary matrices,

\( \partial_1 = \begin{bmatrix} 1 & 1 & 1 & 0 & 0 & 0 \\ 1 & 0 & 0 & 1 & 1 & 0 \\ 0 & 1 & 0 & 1 & 0 & 1 \\ 0 & 0 & 1 & 0 & 1 & 1 \end{bmatrix} \)      and      \(\partial_2 = \begin{bmatrix} 0 & 0 & 1 & 1 \\ 0 & 1 & 0 & 1 \\ 0 & 1 & 1 & 0 \\ 1 & 0 & 0 & 1 \\ 1 & 0 & 1 & 0 \\ 1 & 1 & 0 & 0 \end{bmatrix}\).
In \(\partial_1\), the columns are labelled by the edges and the rows are labelled by the vertices. In \(\partial_2\), the rows are labelled by the edges and the columns are labelled by the faces. In both matrices, the edges are listed in the order \(e_{12}\), \(e_{13}\), \(e_{14}\), \(e_{23}\), \(e_{24}\), \(e_{34}\). Notice that each column of \(\partial_1\) has two 1s, because each edge has two endpoints, and each column of \(\partial_2\) has three 1s, because each face is bounded by three edges.

Once we have these matrices, we can use them to find boundaries of more general chains. For instance, when joined together, the edges \(e_{12}\) and \(e_{23}\) form a path from \(v_1\) to \(v_3\), so we expect the boundary to be these two points. Indeed, adding together (modulo 2!) the corresponding entries from the first and fourth columns of \(\partial_1\), we see that the 1s in the second entry cancel (which corresponds to the edges being joined at \(v_2\)), and we are left with 1s in the first and third entries. We can write this relation as \(\partial_1(e_{12}+e_{23}) = v_1 + v_3\). Similarly, if we add together the first three columns of \(\partial_2\), which correspond to \(f_1\), \(f_2\), and \(f_3\), the result is a vector with 1s in the first, second, and fourth entries, which correspond to \(e_{12}\), \(e_{13}\), and \(e_{23}\), producing the equation \(\partial_2(f_1 + f_2 + f_3) = e_{12} + e_{13} + e_{23}\). This demonstrates that the union of three of the faces has the same boundary as the fourth face. The sum of all four columns of \(\partial_2\) has all 0s for its entries, showing that the four faces of the tetrahedron, taken together, have no boundary.

How to extract information from the boundary matrix

Having illustrated some computations with boundary matrices in the above example, let’s codify some definitions. A collection of \(k\)-cells is called a \(k\)-cycle (or closed) if the sum of the corresponding columns of \(\partial_k\) is the zero vector. (This is a formal way of saying “has no boundary.”) A collection of \(k\)-cells is called a \(k\)-boundary (or exact) if it can be obtained as a sum of columns of \(\partial_{k+1}\). In linear algebra terms, a \(k\)-cycle is an element of the kernel of \(\partial_k\), and a \(k\)-boundary is an element of the image of \(\partial_{k+1}\). Again, the benefit of working modulo 2 is that these conditions can be easily checked. The set of \(k\)-boundaries is denoted \(B_k\), and the set of \(k\)-cycles is denoted \(Z_k\) (the notation \(C_k\) generally being reserved for \(k\)-chains).

A fundamental property is that \(\partial_k \partial_{k+1} = 0\), which has the satisfying geometric interpretation that “every \(k\)-boundary is a \(k\)-cycle,” or \(B_k \subseteq Z_k\). This property can be checked directly in the above example of the tetrahedron. In general, it applies because, in a \(k\)-dimensional polytope, each \((k-2)\)-dimensional face appears in two \((k-1)\)-dimensional faces (provided \(k \ge 2\); if \(k=1\), then there are no \((k-2)\)-dimensional faces, so \(\partial_0 = 0\), and the property \(\partial_0 \partial_1 = 0\) holds trivially). From the perspective of homology, this means boundaries aren’t “interesting” cycles. They’re the boundaries of something, after all, so they certainly don’t enclose a “hole.”

What we really want to measure, then, is how many cycles are not boundaries. To determine this, we first need to find out how many cycles and how many boundaries there are. Except we can add cycles together to get new cycles (in linear algebra terms, the kernel of a matrix is a subspace of the domain), and we can add boundaries to get new boundaries (the image of a matrix is also a subspace), so what we really want is to know how many independent cycles there are: that is, we want the dimension or rank of the set of cycles and the set of boundaries. I’ll use rank here, even though we’re working with vector spaces, because that terminology transfers to the case of integral homology.

The rank of the \(k\)-boundaries is the rank of \(\partial_{k+1}\), because by definition this describes the maximal number of independent boundaries of \((k+1)\)-chains. On the other hand, the rank of the \(k\)-cycles is the nullity of \(\partial_k\), because this measures the maximal number of independent \(k\)-chains with no boundary. From linear algebra, we know that the rank of a matrix can be determined by row reducing to echelon form and counting the number of rows (equivalently, columns) that have leading ones.

Homology gets its name from the notion of homologous cycles (“homologous” meaning, etymologically, “having the same position or structure”). Two \(k\)-cycles are homologous if their difference is a \(k\)-boundary. Modulo 2, the difference of two objects is the same as their sum, so this just means that two cycles are homologous if, when we put them together, they form the boundary of an object of one higher dimension. Boundaries are “homologically trivial” because, by definition, they are homologous to the chain consisting of no cells, \(0\). The \(k\)th homology of \(X\) is the quotient (group, vector space, module, etc.) of the cycles and the boundaries: \[ H_k = Z_k/B_k. \] The associated numeric invariant is the \(k\)th Betti number \(\beta_k\) of \(X\), which is the rank of the \(k\)th homology. It can thus be computed as the difference between the rank of the \(k\)-cycles and that of the \(k\)-boundaries: \[ \beta_k = \mathrm{rank}\,Z_k - \mathrm{rank}\,B_k. \] This is the number that “counts” the “\(k\)-dimensional holes” in our space \(X\). Note that this is an ordinary natural number, not an integer modulo 2. However, when working modulo 2, the Betti numbers entirely determine the homology, up to isomorphism. (In ordinary, integral homology, this is not the case: homology may have “torsion” elements, while the Betti numbers only count the “free” part of homology. The integral homology determines the mod 2 homology, but the reverse is not true, so homology modulo 2 is undoubtably “weaker,” and there are certainly times one would want the full theory. However, I hope this post is illustrating the benefits of using homology modulo 2 as a shortcut for introducing the key concepts.)

Examples of homology

Let’s return to the example of the tetrahedron. Using \(\sim\) for row equivalence, we have

\( \partial_1 \sim \begin{bmatrix} 1 & 0 & 0 & 1 & 1 & 0 \\ 0 & 1 & 0 & 1 & 0 & 1 \\ 0 & 0 & 1 & 0 & 1 & 1 \\ 0 & 0 & 0 & 0 & 0 & 0 \end{bmatrix} \)      and      \(\partial_2 \sim \begin{bmatrix} 1 & 0 & 0 & 1 \\ 0 & 1 & 0 & 1 \\ 0 & 0 & 1 & 1 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \end{bmatrix}\).
The rank of both matrices is \(3\). The nullity of the first matrix is \(6 - 3 = 3\), and the nullity of the second matrix is \(4 - 3 = 1\). Thus we have \[ \mathrm{rank}\,Z_1 = 3, \qquad \mathrm{rank}\,B_1 = 3, \qquad \mathrm{rank}\,Z_2 = 1, \qquad \mathrm{rank}\,B_2 = 0, \qquad \] the last quantity following from the fact that there are no 3-cells in the empty tetrahedron. We also know that \[ \mathrm{rank}\,Z_0 = 4, \qquad \mathrm{rank}\,B_0 = 3, \qquad \] with the first of this pair of equations coming from the fact that a point has no boundary. The Betti numbers of tetrahedron are \[ \beta_0 = 4 - 3 = 1, \qquad \beta_1 = 3 - 3 = 0, \qquad \beta_2 = 1 - 0 = 1. \] Here is a geometric interpretation of these numbers, in reverse order.
  • The equation \(\beta_2 = 1\) means that there is one independent 2-cycle which is not a boundary. The reduced form of \(\partial_2\) shows that this cycle is \(f_1 + f_2 + f_3 + f_4\), i.e., the sum of all the faces of the tetrahedron. Thus, when we take all the faces together, the result is a closed cycle, and no other combination of faces has an empty boundary. Roughly speaking, the entire tetrahedron encloses a “hole.”
  • The equation \(\beta_1 = 0\) can be read as “every 1-cycle is a 1-boundary.” A stronger form of this statement is that the tetrahedron is simply connected—every loop can be contracted to a point, or every closed loop on the tetrahedron is the boundary of something 2-dimensional. Roughly speaking, there are no holes on the surface of the tetrahedron.
  • The “holes” measured by the 0th homology are of a somewhat different type. Generally speaking, the Betti number \(\beta_0\) measures the number of connected components. Because any point has no boundary on its own (hence is a 0-cycle), two vertices are are boundary if and only if they can be joined by a path of edges. Thus the equation \(\beta_0 = 1\) simply means that the tetrahedron is connected.

Now let’s turn to the example of the torus, formed from a rectangle by identifying opposite sides. This space has one 2-cell \(f\) (the interior of the torus), two 1-cells \(e_1\) and \(e_2\) (the edges of the rectangle, after being identified in pairs), and one 0-cell \(v\) (all four vertices of the rectangle become a single point on the torus). Each edge \(e_i\) appears twice on the boundary of \(f\), and the vertex \(v\) appears at both ends of each edge, so the boundary matrices are \[ \partial_1 = \begin{bmatrix} 0 & 0 \end{bmatrix}, \qquad\qquad \partial_2 = \begin{bmatrix} 0 \\ 0 \end{bmatrix}. \] Thus every \(k\)-chain has an empty boundary for \(k = 0, 1, 2\), and the rank of the \(k\)-cycles equals the number of \(k\)-cells. The interpretations of \(\beta_0 = 1\) and \(\beta_2 = 1\) are the same as in the case of the tetrahedron. In this case, the equation \(\beta_1 = 2\) tells us there are two different, independent 1-cycles, which can be represented by a latitude circle and a longitude circle on the torus.

Footnote on topological spaces

A few words justifying the restriction to polytopal complexes: When I was in Hatcher’s algebraic topology class, he chose to introduce cellular homology first so that we could get to computations quickly; later he introduced singular homology mainly to prove that the homology groups only depend on the underlying topological space. It thus seems entirely reasonable to me, for purposes of introduction, to work directly with CW complexes. The appendix to Hatcher’s book is a standard reference for learning about CW complexes, but in practice a CW complex usually means a topological space that is assembled from convex polytopes, attached along their faces.

Another introductory source on homology for undergraduates

I recently came across Peter Giblin’s book Graphs, Surfaces and Homology, which provides a very thorough introduction to its eponymous topics with only the prerequisite of linear algebra. However, like most treatments of homology, it first deals with integral homology, then comes around to homology modulo 2 late in the book, in Chapter 8, specifically to deal with non-orientable (or at least unoriented) surfaces and simplicial complexes. Gibson describes the theory of homology modulo 2 as “satisfactory” but “weaker than the theory with integral coefficients,” which is absolutely true.

However, if one’s goal is either to learn about homology quickly or to study new spaces (rather than, say, to prove the classification of surfaces), then I think homology modulo 2 is perfectly sufficient, particularly since the contemporary field of persistent homology, applied to study data sets in large dimensions, often works with homology modulo 2. (See this survey, or the remark on page 7 of this overview, for instance.)

Saturday, May 21, 2016

Snell and Escher

A few weeks ago, Grant Sanderson posted a video on the brachistochrone, with guest Steven Strogatz.


The video explains Johann Bernoulli’s solution to the problem of finding the brachistochrone, which is a clever application of Snell’s Law. I immediately wondered if a similar application could be used to explain the behavior of geodesics in the hyperbolic plane, which it turns out is true. I’m not the first to think of this, but it doesn’t seem to be well-known, so that’s what I’ll try to explain in this post. This may become my standard way of introducing hyperbolic geometry in informal settings, i.e., when formulas aren’t needed. (As an example of another exposition that describes hyperbolic geodesics this way, see the lecture notes for this geometry course.)

Snell’s Law, as represented in the above diagram (image source), applies to light traveling from one medium to another, where the interface between the two is horizontal. If light travels at speed \(v_1\) in the first medium and \(v_2\) in the second medium, and its trajectory meets the interface at an angle of \(\theta_1\) and leaves at an angle of \(\theta_2\) (both angles measured with respect to the vertical), then \[ \frac{\sin\theta_1}{v_1} = \frac{\sin\theta_2}{v_2}. \] This is the case of two distinct media. Snell’s Law has a continuous version (derived from the discrete one by a limiting process, as suggested in the video). Suppose light is traveling through a medium with the property that the speed of light at each point depends on the vertical position of the point. That is, the speed of light in this medium at a point \((x,y)\) is a function \(v(y)\), which may vary continuously. At each point of a trajectory of light in this medium, let \(\theta\) be the angle formed by the direction of the trajectory (i.e., the tangent line) and the vertical. Then the quantity \[ \frac{\sin\theta}{v(y)} \] is constant along the trajectory.

So suppose we are looking at a medium that covers the half-plane \(y > 0\), in which light travels at a speed proportional to the distance from the \(x\)-axis: \(v(y) = cy\). (The constant \(c\) may be thought of as the usual speed of light in a vacuum, so that along the line \(y = 1\) light moves at the speed we expect. As we shall see, this is analogous to the fact that distances along the line \(y = 1\) in the hyperbolic metric match Euclidean distances. Of course, it also means that light moves faster than \(c\) above this line, which is physically impossible, but we’re doing a thought experiment, so we’ll allow it.) If we imagine someone living inside this medium trying to look at an object, what direction should they face?

From our outside perspective, it seems that the observer should look “directly at” the object, in a straight (Euclidean) line. However, in this medium light does not travel along Euclidean line segments, but instead along curved arcs, as illustrated below.

Click on the graph to go to an interactive version.

It’s not too surprising that light follows a path something like this if it’s trying to minimize the time it takes to travel from the object to the observer: the light travels faster at higher vertical positions, so it’s worth going up at least slightly to take advantage of this property, and it’s also worth descending somewhat sharply so as to spend as little time as possible in the lower, slower regions.

What may come as a surprise is that the path of least time is precisely a circular arc. With Snell’s Law, however, this fact can be derived quickly. We have that \(v(y) = cy\), and so along a light trajectory \[ \frac{\sin\theta}{cy} = \text{constant}. \] Multiplying both sides by \(c\), we find that \(\frac{\sin\theta}{y}\) is also a constant. If this constant is zero, then \(\theta = 0\) constantly, so the path is a vertical segment. Otherwise, call this constant \(\frac{1}{R}\). Then \(y = R \sin\theta\). Now set \(x = a + R \cos \theta\). The curve \[ (x,y) = (a + R \cos\theta, R \sin\theta) \] parametrizes a circle centered at \((a,0)\) by the angle between the \(x\)-axis and the diameter. It remains to see that this angle \(\theta\) is the same as the angle between the vertical direction and the tangent line at the corresponding point of the circle. This equality can be shown in any number of ways from the diagram below.

Click on the graph to go to an interactive version.

This is not to say that this parametrization describes the speed at which light moves along the path. As previously observed, light slows as it approaches the horizontal boundary, that is, the \(x\)-axis.

But perhaps we’ve been prejudiced in assuming our perspective is the right one. We’ve been looking with our Euclidean vision and supposing light moves at different speeds depending on where it is in this half-plane. Thus it seems to us that light covers Euclidean distances more quickly the further it gets from the \(x\)-axis. But relativity teaches us that distance isn’t absolute: instead, the speed of light is what’s absolute. So perhaps we could gain greater insight by measuring the distance between points according to how long it takes light to travel between them. That is, we assume that the paths determined above are the geodesics of the half-plane, and by doing so we learn to “see” hyperbolically. Then we are not troubled by looking at an image like

(image source) and being told that all of the pentagonal shapes are the same size, because we’ve learned to look at things with our hyperbolic geometry glasses on.

M. C. Escher illustrated (or, more accurately, approximated) the hyperbolic geometry of the upper half-plane with his print Regular Division of the Plane VI (1958), shown below (image source).

This design was created during a time Escher was attempting to visually depict infinity. It was shortly before he had encountered the Poincaré disk in a paper by Coxeter, which discovery led to the Circle Limit series. In this print, the geometry of each lizard is Euclidean, structured around an isosceles right triangle. Each horizontal “layer” has two sizes of triangles, one scaled down from the other by a factor of \(\sqrt{2}\). The side lengths of the triangles in one layer are one-half of those in the layer above, so the heights of layers converge geometrically to the horizontal boundary at the bottom. Some of the triangles are outlined in the next image.

Some questions I have about Escher’s print:

  • How different would this image look if it were drawn according to proper hyperbolic rules, with each lizard having reflectional symmetry, and each meeting of “elbows” having true threefold symmetry? (This would give the tessellation with Schläfli symbol {3,8}, an order-8 triangular tiling.)
  • If we suppose that the right triangles act as prisms, with light moving at a constant speed inside each one, but this speed being proportional to the square root of the triangle’s area, then what will the trajectories of light look like as it moves through the plane? Will they approximately follow circles?
  • How many lizards are in the picture?

Coda: Jos Leys has taken some of Escher’s Euclidean tessellations and converted them to hyperbolic ones, in both the disk and the half-plane model.

Saturday, April 02, 2016

using calculus to understand the world

In my last post, I wrote about how I returned to teaching related rates in my calculus class and ranted a bit about the inanity of most related rates problems. There I mainly discussed the difficulty in reading the statement of such problems and how to make the questions they raise seem more natural. I’d like to expand on this theme with some more examples.

One feature of mathematics that doesn’t get emphasized enough, IMHO, is that it is a science, and as such is based in observation. Often, either we lead students through abstract reasoning to a previously unanticipated result, or we prove things that are so self-evident that the notion they need proof is itself baffling. Now, in the world of professional mathematics, it is true that even apparently obvious facts need proving (remember that “to prove” just means “to test”), and we often do get excited when we are led to something unexpected and beautiful. That is because we have learned how to use and trust our logical skills to examine the truth of something, and we delight in the uncovering of new truth by means of those skills. But even when a result is surprising to an audience, and even if it was at first surprising to the speaker, it is no longer so. A mathematical researcher plays with ideas until she notices something interesting, and then she tries to understand why it is so. That’s the exciting part of math, and that is what I believe we can share with our students through the process of modeling.

My goal in teaching related rates has become to ground as many questions as possible in direct observation. When I ask about the sliding ladder, as I described in my last post, before setting up the math but after asking students what they think will happen, I demonstrate by leaning a ruler against a book and slowly pulling the bottom end away. What one notices in this experiment is that the top end of the ruler moves very slowly at first, and very quickly just before reaching the ground. The speed in the final moment is so great that one is tempted to think the person pulling the bottom end lost control, and gravity took over. (This is even more credible when using the much larger, heavier ladder in a demonstration.) But the math shows that even if the person keeps complete control and moves at precisely the same speed, the same effect will occur. Let’s see why.

The exact length of the ladder doesn’t matter, of course, so call it $L$. If $x$ measures the distance from the wall to the bottom end of the ladder and $y$ measures the distance from the floor to the top of the ladder, then we have $x^2 + y^2 = L^2$. Then we differentiate both sides with respect to time and get $2x\frac{dx}{dt} + 2y\frac{dy}{dt} = 0$, or \[ \frac{dy}{dt} = -\frac{x}{y} \frac{dx}{dt}. \] At this point most related rates problems would ask you about the size of $dy/dt$ for some particular values of $x$, $y$, and $dx/dt$, but look at how much we can determine just from this related rates equation: when $y$ is larger than $x$, the top end is moving more slowly than the bottom end, and conversely when $x$ is greater than $y$, the top end is moving more quickly than the bottom end. There is just one moment when the two ends are moving at the same speed, which is when $y = x$, or in other words, when the ladder is at a 45 degree angle. And as the distance between the top end and the floor approaches zero, the speed of the top end approaches infinity. (Not physically possible, of course, but it explains why there’s such a quick movement at the end of the process.) There; now I feel like I’ve learned something!

I have five more examples to illustrate how much more interesting I think related rates are when tied to direct observation. This will probably belabor the point, but unfortunately these examples are also stripped of any interest by focusing too much on a single moment in time, which is what every standard textbook does with them.

The next example involves inflating a balloon. This, again, is easy to demonstrate. I can’t take a deep enough breath to fill the whole balloon at once, but even with two puffs, exhaled at a near-constant rate, it’s obvious to students that the size (i.e., diameter, or radius) of the balloon grows more quickly at first, then more slowly. Anyone who’s worked with an air or helium tank has surely experienced this phenomenon. Why is this happening? And how much more slowly is the diameter increasing as time goes on? Here there’s very little modeling involved; essentially the entire model is provided by assuming the balloon is a sphere and using the formula for the volume of a sphere in terms of its radius, $V = \frac{4}{3}\pi r^3$. Differentiating with respect to time gives the relation $\frac{dV}{dt} = 4\pi r^2 \frac{dr}{dt}$, so \[ \frac{dr}{dt} = \frac{1}{4\pi r^2}\frac{dV}{dt}. \] Some books reach this equation, but as with the ladder they again jump to plugging in values for a specific time, rather than noting the following: if $dV/dt$ is constant, then $dr/dt$ is inversely proportional to the square of the radius! And even more, $4\pi r^2$ is the surface area of a sphere with radius $r$, so this relationship between rates is directly related to the fact that the derivative of the volume of a sphere with respect to its radius is the surface area! That is, the size of the surface of the balloon is what determines, together with the rate the volume is increasing, how quickly the radius is increasing.

A similar phenomenon happens with the standard filling-an-inverted-cone problem. The demonstration here involves a martini glass and some colored water. (As I promise my students when doing this experiment, it’s just water.) My martini glass is about 12 cm across on top, and about 8 cm deep, giving it a volume of 300 milliliters. (That’s about 10 ounces, the size of two regular martinis—you don’t want to fill this glass with gin and drink it too quickly.) Having a nice big glass is useful for this demonstration: if I pour at a constant rate, the water level rises much more slowly near the top than near the bottom. The math shows just how much more slowly. The volume of a cone with height $h$ and base radius $r$ is $V = \frac{1}{3}\pi r^2 h$. From the geometry of this situation (using similar triangles, for instance), for my glass the radius of the surface of the water is always three-quarters of the water’s depth (here interpreted as height). We could use the relation $r = \frac{3}{4}h$ and substitute into the volume formula to get rid of the variable $r$, but there’s also no harm (as my students taught me) in differentiating first, using the product rule: \[ \frac{dV}{dt} = \frac{\pi}{3} \left( 2rh\frac{dr}{dt} + r^2\frac{dh}{dt}\right). \] Notice that this formula is valid for all cones varying in height, radius, and volume, whether or not the height and radius are linearly related at all times. The most obvious quantity of interest (assuming constant $dV/dt$) is $dh/dt$. From $r = \frac{3}{4}h$ we get $\frac{dr}{dt} = \frac{3}{4}\frac{dh}{dt}$, and also $h = \frac{4}{3}r$. The reason to solve for both of these quantities is that, by keeping both $dh/dt$ and $r$ in the equation and substituting out $h$ and $dr/dt$, we get $\frac{dV}{dt} = \frac{\pi}{3}\left(2r^2\frac{dh}{dt}+r^2\frac{dh}{dt}\right)$, or, after solving for $dh/dt$, \[ \frac{dh}{dt} = \frac{1}{\pi r^2} \frac{dV}{dt}. \] First of all, the ratio between the height and the radius has disappeared, so this formula now works for any inverted cone, not just my martini glass. And second of all, just as with the balloon, the rate at which the height increases depends on the “surface area” that is expanding, which in this case is just the base of the cone! Thus, again, the reason the water level rises more slowly near the top of the glass has a clear geometric interpretation. (Here’s a real-world application: I argue this works to the benefit of bartenders, who can pour into a martini glass fairly quickly without risk of overflowing, because the beverage level rises slowly near the top of the glass.)

I took the next two examples from Cornell’s Good Questions Project; come to think of it, it may be these questions that first planted in my head the idea of looking at related rates problems over time, without numbers. The situations are again standard for related rates problems, but the conclusions are much more interesting than a single rate at a single moment.

Consider an actor (say, Benedict) on a stage, illuminated by a light at the foot of the stage. Benedict casts a shadow on the back wall; how does the length of his shadow vary if he walks towards the light at a constant speed? The demonstration of this situation is particularly exciting, because you get to turn off the classroom lights, pull out a flashlight and a doll or figurine, and watch what happens to the shadow of the doll/figurine/actor on the wall as it moves towards the flashlight. Students observe that at first the shadow grows slowly (when the figure is close to the wall), then more quickly as he approaches the light. Modeling this situation generally provides the first major geometric hurdle for my students, because it involves the imagined line that emanates from the light, passes by Benedict’s head, and finally reaches the back wall, thereby determining the height of the shadow. (I wonder if many of them have never thought about the geometry of how shadows relate to the objects that cast them.) I’ll let the reader work out the fact that, if Benedict’s height is $h$, the distance from the light to the back wall is $D$, the distance from Benedict to the light is $x$, and the height of the shadow is $s$, then $\frac{s}{D} = \frac{h}{x}$. (Hint: use similar triangles.) Here the only variables are $x$ and $s$, so the related rates equation is \[ \frac{ds}{dt} = -\frac{hD}{x^2}\frac{dx}{dt}. \] Students are at first perplexed by the negative sign: shouldn’t the shadow be increasing? If so, why does its derivative appear to be negative? Then they realize: ah, if Benedict is walking towards the light, then $dx/dt$ is negative, so $ds/dt$ is in fact positive! And so it becomes clear that the height of the shadow increases much more rapidly when Benedict is near the light than when he is near the wall. (I generally give specific values for the height of the actor and the distance from the wall to the light in this question, so that it’s more obvious which values are constant.)

I don’t have a standard demonstration for this next problem, because I use it as a quiz question (although maybe not anymore, now that I’ve written about it here), but it’s easy enough to devise an experiment. This situation is similar enough to the previous one that its result is a bit surprising. Suppose a streetlight at height $L$ is the only source of illumination nearby, and a woman (say, Agatha) of height $h$ walks at a constant speed away from the light. As she gets farther away from the light, does her shadow grow more quickly, more slowly, or does it grow at a constant rate? If $x$ again denotes the distance to the light (well, really from Agatha’s feet to the base of the lamp, which is not the same as her distance to the source of illumination), and $s$ is the length of Agatha’s shadow, then similar triangles produce the relation $\frac{s}{h} = \frac{s + x}{L}$. We can rearrange this into a simple proportion between $s$ and $x$: $s = \frac{h}{L-h} x$. (Here’s an interesting feature of this equation already: it only makes sense if $h < L$, that is, if Agatha is shorter than the lamppost!) Now we differentiate to get \[ \frac{ds}{dt} = \frac{h}{L - h} \frac{dx}{dt}. \] So if Agatha’s speed is constant, then her shadow’s length is also increasing at a constant rate. This example shows especially well why it’s dumb to look at related rates at a single moment in time. Most book exercises of this sort ask how quickly the shadow is growing when Agatha is at a particular distance from the lamp. But it doesn’t matter how far away she is, and the math proves that it doesn’t matter.

There’s a risk in related rates exercises to always resort to problems that only involve differentiating polynomials, so here’s an example that uses trigonometric functions. The demonstration I use: I walk back in forth in front of the class and tell the students to be mindful of what their heads do as they follow my movement. After a couple of times, several of them observe that their heads must turn more quickly when I’m closer to them. I point out that this is something anyone who’s had to run a video camera at a race must be aware of. (It’s also apparent to someone riding in the passenger seat of a car, keeping their gaze fixed on a single tree or other immobile object: for a long time, your head turns little, but when you’re close to the object, you have to turn quickly to keep it in view.) I generally set up the problem on the board as though it is taking place at a racetrack. Suppose a runner is moving along a track (let’s assume it’s straight for simplicity) at $v$ feet per second. You’re watching from a position $D$ feet away from the track. How quickly does your head need to turn to keep following the runner? The answer depends on how far away the runner is. One has to introduce a reasonable coordinate system and some useful variables: good choices are the position $x$ of the runner relative to the point of the track closest to you, and the angle $\theta$ by which your head is turned from looking at this closest point. Then we get the relation $\tan\theta = \frac{x}{D}$, and differentiating with respect to time results in the equation $\sec^2\theta \frac{d\theta}{dt} = \frac{1}{D} \frac{dx}{dt}$, or \[ \frac{d\theta}{dt} = \frac{v}{D} \cos^2\theta \] (using the assumption that $dx/dt = v$). When $\theta = 0$, so that the runner is closest to you, the rate at which your head turns is $v/D$, which depends only on how fast the runner is going and how far away from the track you are. (Notice that the units work out: the radian measure of an angle is technically dimensionless, and so we expect its rate of change not to have any dimension other than 1/time. Since $v$ has dimension of distance/time and $D$ has the dimension of distance, $v/D$ has the dimension 1/time.) As $\theta$ increases (in this scenario, $\theta$ is never greater than a right angle), the change in the angle of your head to follow the runner happens more slowly, because $\cos^2\theta$ is closer to zero.

These are just a few examples of standard situations involving related rates that become much more interesting when the myopic attention to a single moment in time is removed. I’m sure most readers of this post can do the calculations I’ve shown on their own, but the tendency to hone in on a single rate at a single point in time is so entrenched that I wanted to show how much more interesting related rates become when that element is removed. I don’t know that my students are better at solving related rates problems than other students, but I have noticed that they’re much less likely to insert specific quantities into a relation before it’s necessary than when I taught the subject years ago. I haven’t had time to strip all such problems of the detritus that comes with wanting a numeric answer, but I believe our understanding (and our calculus students’ understanding) of the world will be much improved by making the effort to transform these problems into meaningful questions.

Here are two other examples that I won’t work out in detail. One scenario has a boat being pulled into a dock by a rope attached to a pulley elevated some distance above the boat. If the rope is pulled at a constant rate, the boat in fact speeds up as it approaches the dock! (I tried demonstrating this once with a string tied to a stuffed animal pulled across a desk, with moderate success.) Another common type of problem considers two boats moving in perpendicular directions (or cars moving along perpendicular roads), and asks at a certain point in time whether the distance between them is increasing or decreasing. That’s silly. Why not establish the relation between them, and ask at what times the distance is increasing, and at what times the distance is decreasing? If there’s a time when the rate of change in distance is zero, then the boats (or cars) are at their closest (or farthest) positions, which connects to the study of optimization, which has its own set of issues…


P.S. I should have known better than to look at Khan Academy’s treatment of related rates. His videos show all the marks of what is classically wrong with these problems: the irrelevant information of what variables equal at a single moment in time is presented up front along with everything that’s constant in the situation, and in the end the answer is a single, uninformative number. Even when an interesting equation is present on the screen, Khan rushes past it to get to the final number. How can we get our students to ask and answer more interesting questions than these, about the same situations?