Monday, May 27, 2013

SBG: what could have gone better

A couple of weeks ago, I posted about many great things that came out of my experiment with standards-based grading in multivariable calculus. I think the experiment was a success in that there was a lot more good than bad, and the bad things weren’t so bad. Nonetheless, things could have gone better, even in ways I don’t think the students realized, and I would be remiss not to mention them. Whence this post.

Arrangement of standards: There is a tough balance to strike here, and I didn’t quite manage it. The twenty-four standards comprised

  • 7 “common standards” that could relate to any college-level math class,
  • 3 standards related to vectors and their geometry,
  • 3 standards related to graphing and parametrization,
  • 5 standards related to differentiation and its applications,
  • 4 standards related to integration and its applications,
  • 2 standards related to the classical theorems of vector calculus.
I wanted each standard to present a unified concept to be mastered. However, I didn’t want the number of standards to proliferate. I think the total number of standards is about right (anywhere in the range 20–30 would have worked), but the distribution of topics was a little off.

First, seven common standards is too many; four or five would be more appropriate. Algebra and presentation deserve separate treatment. The others, while distinct in my mind before the class began, grew muddled over the course of the semester, and some even overlapped a fair amount with the content-specific standards. The first six or seven standards scored on each homework assignment (often only six, because one was essentially “use of technology”, which rarely figured directly into the homework) became a blurry wash, occasionally used to indicate that some general, but ill-defined, skill needed attention.

Second, I was too clever in collecting related topics, to the point that certain standards were only partially covered for several weeks at a time. For instance, “line integrals” included both arc length computation (covered in the first chapter) and integrals of vector fields (covered in chapter 4). This was perhaps the most egregious example. The topics within each standard should have been grouped not only by common theme, but also by when they appear in the course.

Third, the relative importance of the standards, or at least the relative emphasis each received during class, was not as balanced as I would have liked. Setting up and evaluating double and triple integrals (a single standard on the syllabus) takes over a week of class time, although part of that time goes to changing coordinates, which was a separate standard. Visualizing vector fields (the only standard tested just once, on the final exam) was dealt with sporadically in class. Is interpreting the visual information carried by a graphical vector field as important as finding integrals of several variables? Arguably, hence the separate and equal standards. Was that equality reflected in the amount of attention each received during the semester? Again, not as much as I would have liked. This is not a challenge of standards per se so much as of course design; having the standards just highlights the inequity.

Finally, on this topic, even some of the content-specific standards overlapped more than I had intended. I mostly managed to avoid the obvious pitfalls: for instance, computing integrals and finding parametrizations were handled separately, so if a question asked students to find the surface area of a figure, say, and someone set up the wrong integral but computed it correctly from that point, they could get credit for integration but not for parametrization. But what exactly are the skills that go into setting up an integral? There were standards for describing objects in 2 or 3 dimensions, as well as an “analysis” standard that, in part, required finding the domain of a function. When a double or triple integral is needed, one has to draw on one or more of these skills to find appropriate limits of integration. When a problem could have been solved by several different approaches, does a failure to find any solution reflect a lack of mastery of all those skills? Hard call. No one’s scores suffered seriously from this ambiguity, but occasionally I found myself judging a surprising number of standards on the basis of one or two exercises.

Grading scale: Having a four-point scale for each standard worked well, on the whole. It was sufficiently refined to target both areas of success and areas needing work. However, the overall quality of work was so good that I had trouble distinguishing among the highest levels of performance. What should I do with a solution that reflects a clear understanding of the skills involved, but has one or two minor errors? Do those reflect some genuine misunderstanding, or simply a slip? How can I judge between a score of 3 (“generally good accuracy”) and 4 (“complete mastery”) in that case? I think I may have to move to a five-point scale, as described here, where 4 and 5 both indicate mastery, but 4 allows for small mistakes.
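For concreteness, here is roughly how that five-point scale might read, sketched as a small Python mapping. Only the descriptors for 0, 3, and the old top score come from my syllabus; the intermediate labels are left blank, and the 4/5 split is the proposed change, not settled wording.

```python
# Sketch of a proposed five-point scale. Descriptors for 0, 3, and the old
# top score come from my syllabus; levels 1 and 2 are left unspecified here.
five_point_scale = {
    0: "complete unfamiliarity",
    1: "…",  # intermediate descriptors would carry over from the 4-point scale
    2: "…",
    3: "generally good accuracy",
    4: "mastery, allowing for one or two small mistakes",  # new level
    5: "complete mastery",
}
```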

On the other hand, I think I could have been more exigent about the level of mastery required across the spectrum. On the syllabus, I stated that attaining a 4 on 80% of the standards, with no scores lower than 3, was sufficient for an A, and I believe I could have raised that percentage to 90% to better reflect complete mastery of the course material. The result is that a few final grades were perhaps higher than they might otherwise have been. But seriously: these students worked extremely hard, I am extremely proud of them and have full confidence in their calculus skills, and they deserve some recognition for working with me on this grading experiment and making it a success.
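To see what that tightening amounts to, here is a minimal Python sketch with hypothetical names: the A criterion is just a threshold check, and the proposal raises one parameter.

```python
def earns_a(scores, frac_of_4s=0.80, floor=3):
    """A criterion: at least frac_of_4s of the standards at a 4,
    with no standard scored below floor."""
    mastered = sum(1 for s in scores if s >= 4)
    return mastered / len(scores) >= frac_of_4s and min(scores) >= floor

scores = [4] * 20 + [3] * 4               # hypothetical: 20 of 24 standards at a 4
print(earns_a(scores))                    # True: 20/24 ≈ 83% meets the 80% rule
print(earns_a(scores, frac_of_4s=0.90))   # False under the stricter 90% rule
```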

Assessments: I was exceedingly grateful to have a grader with whom I had worked before, and whom I trusted to help me implement this SBG system as effectively as possible. I could not have made this first attempt work without her aid. Each week, she would mark the homework, making particular note of places that raised concern or showed exceptional mastery, and then we would meet to assign scores. I don’t think this method is sustainable across terms. I need to shift to a model that depends less on explaining the grading system to a new assistant each semester, but that also will not vastly increase the amount of time I have to spend grading. (I don’t think having the only graded assignments be two midterm exams and the final would be sufficient for me to trust those assessments, nor would it communicate with the students in the way I always hope SBG will.) This is perhaps the area I have to think hardest about revising as I move forward with SBG in future classes.

Re-assessments: About a fourth of the class took advantage of the opportunity to reassess standards. Not such a bad number, especially considering how well they were doing on the whole. But I feel more could have benefitted from this feature of SBG had the process of reassessment been clearer, and had some of the above obstacles been removed. I do need to find a way to cut down on the time reassessment requires: I always claimed it would take 10–15 minutes, but it was often much longer than that. No student ever complained about the length, which grew both because I gave students multiple chances to explain themselves and because of the relative complexity of the material. Nonetheless, I think retesting will have to be made more efficient for it to work in other classes.

Compiling scores: The students received regular updates on their scores in the form of score sheets attached to their homework and exams, but there was no established system by which they could see their current scores on all the standards at once. (Having had some trouble using our LMS in a much simpler grade book setting, I’m averse to the idea of using that or any other online reporting system.) Fortunately, I think this problem is easily solved. Most likely, I’ll handle it in the future by passing out sheets on which students can record their own scores, so that they don’t have to consult with me to find out their current standing; a sketch of what I mean appears below. (This is a suggestion I got from Bret Benesh. I suspect some students were already doing this on their own.)
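Purely for illustration, here is a tiny Python sketch of the kind of blank, self-recorded score sheet I have in mind; the standard names are placeholders, not my actual list.

```python
# Hypothetical standard names; the real course had 24 standards.
standards = ["Algebra", "Presentation", "Vector geometry", "Line integrals"]

def print_blank_sheet(standards, slots=8):
    """Print one row per standard, with blank cells for the student to fill
    in as each new score comes back on homework, exams, or reassessments."""
    width = max(len(s) for s in standards)
    for s in standards:
        print(f"{s:<{width}} |" + " ___ |" * slots)

print_blank_sheet(standards)
```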


That’s probably not everything that needs improvement, but it’s what came to the forefront of my attention. I have some ideas, some listed above, on how to make my system better next time around. As I’m working on future syllabi, I’ll jot these ideas down and post them here.

Friday, May 10, 2013

SBG: what worked well

I’ve been lax all spring in blogging (other things, too, but that’s beside the point here), and now that the end of the semester has arrived, it’s time I settled down with a cup of coffee to share some of my thoughts on how standards-based grading went. I keep reading blog posts by other teachers and acquiring cool ideas thereby (this blog brims with excitement about the possibilities for improvement SBG brings to both instruction and assessment; hat tip to Dan Meyer, who recently linked to it), but few teachers at the college/university level are writing about SBG in that context, so hopefully this will be a productive exercise. It’s a good thing qualifications aren’t a prerequisite for blogging, ’cause I ain’t got ’em. Which means this is at least as much about benefitting from the community as trying to contribute to it.

To make this a little more manageable, I’m going to deal with three topics in three (or more) separate posts: what worked well, what worked not so well, and how I plan to move forward.

The Backstory: When last we saw our intrepid blogger, he was heading off into the spring semester to teach multivariable calculus at a small liberal arts college in New England. Having spent several weeks thinking about how he might structure SBG in this class, he had settled on a 4-point system (where 0 represents “complete unfamiliarity” and 4 represents “complete mastery”) with 24 standards: 7 common standards and 17 content-specific standards (an early version of this list was posted here).

What came next: When I met with my class, I explained the system and why I was using it. Assessment should be about giving students the chance to demonstrate what they’ve learned, I said, and providing sufficient opportunity for them to show they’ve mastered the material during the course, even if that doesn’t happen in time for the first test on it. A point-based system confounds this process. How many students really know what they got each “point” for? (How many of us teachers do?) And a point, once lost, cannot be regained, unless some system of “extra credit” is established, which just creates more work for everyone. I explained that the homework and tests would both create opportunities for them to demonstrate their understanding, and that there was no “weighting” of grades, just regularly updated scores for the standards. A few expressed surprise, but overall they accepted that this was how things would work. I explained that the process required honesty from all involved. For my part, I would give scores that I believed accurately reflected each student’s prowess with the various skills they were to learn. For theirs, since I was going to be assessing homework using the same system as the tests, they needed to present their own work each week. (From what I saw, this worked. Students worked together to tackle the problems, but they did not turn in assignments copied from each other. Had I not been at a private liberal arts college with a stringent honor code, I would definitely have had to find another way to handle this. Fortunately, my academic setting allowed me to try SBG this way without worrying about cheating.)

During the semester: We had weekly homework sets and two mid-semester exams. The students have just taken the final exam, and I’ll grade it over the weekend. The homework exercises were primarily taken from the textbook, Michael Corral’s Vector Calculus—available for free download here—and I also wrote some additional exercises to cover other material. (Side note, tangentially related: I chose this textbook because it seemed ridiculous to me to pay $150 for a book that covers material that is available for free almost everywhere. This book basically has the outline I wanted to use, and it has the additional benefit that the exercises are on the whole quite straightforward. I’m realizing that lots of books, and lots of instructors, like “clever” exercises that seem to students only distantly related to the material they’re learning. I’m often tempted that way myself. But if I’m going to assess standards rather than cleverness, a collection of direct applications is invaluable. More on this another time.) The tests were open-book and open-notes. While memorizing definitions, formulas, and theorems is an important step towards forming a coherent picture of the subject, I wanted to emphasize that in the Information Age one can use myriad tools to recall these facts, so that what’s really important is using them intelligently. (Tip: students are afraid of open-book tests, because they assume they’ll be harder. Does “more conceptual” equal “harder”? Possibly in their minds. They did well on the tests, however.)

In addition to the seven “common” standards, each homework covered between three and six other standards, so many were assessed multiple times. None of the content-specific standards appeared on every assignment; most showed up 2–5 times, though some only once, and some only on the exams. Once a standard had been tested (not merely assessed on homework), students could schedule appointments with me to reassess specific standards, up to two per week. To emphasize the importance of mastery, I told the students that I would guarantee an A for anyone who reached (and maintained) 4s on 80% of the standards, with no scores below 3; a B for anyone who reached 3s on 80% of the standards, with no scores below 2; and so on. Scores could be revised up or down, but to alleviate concerns that one fluke of a bad performance late in the semester would ruin their scores, each standard’s final score was the average of the highest score earned and the latest.
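To make the end-of-semester arithmetic concrete, here is a minimal Python sketch of those two rules, with hypothetical names. The C rung extends the “and so on” in the obvious way; this is an illustration, not the actual bookkeeping I used.

```python
def final_standard_score(history):
    """Average the highest score ever earned on a standard with the most
    recent one, so a late fluke pulls the score down only halfway."""
    return (max(history) + history[-1]) / 2

def guaranteed_grade(final_scores):
    """The guarantee ladder: 4s on 80% of standards with nothing below 3
    earns an A; 3s on 80% with nothing below 2 earns a B; and so on."""
    for target, floor, grade in [(4, 3, "A"), (3, 2, "B"), (2, 1, "C")]:
        at_target = sum(1 for s in final_scores if s >= target)
        if at_target / len(final_scores) >= 0.80 and min(final_scores) >= floor:
            return grade
    return "no guarantee"

# Example: a standard scored 2, then 4, then 3 ends at (4 + 3) / 2 = 3.5.
print(final_standard_score([2, 4, 3]))
```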

Student response: When elicited, this was generally positive, which is the most important measure from my perspective. Several students said SBG reduced the stress of test-taking. Others liked how it affirmed their understanding in certain areas while pointing to areas that needed work. A handful took it as a personal challenge to reach all 4s by the end, even though having a couple of 3s wouldn’t change their grade. In the middle of the semester I used an online poll to get anonymous feedback. A couple complained that they didn’t know how their performance compared with the rest of the class; I view this as part of the point of SBG (albeit a minor part): the striving is against oneself, not against classmates. One said she worked harder to master the material, but appreciated not having to worry about a single bad performance wrecking her grade. More than half of the students who responded agreed that SBG reflected their progress and communicated my expectations very well; the other responses were at worst neutral. (I wish I had a comparison poll from my non-SBG classes to see whether my expectations were being clearly communicated there. But if I had done that, I probably would have been using standards anyway.) Even at this level (third-semester calculus), when one might think students’ feelings towards mathematics are firmly set, several students told me that they had thought they were bad at math or didn’t like it, and that they have now changed their minds.

My impressions: Mostly I have the sense that standards-based grading was freeing for the students. Far fewer worried about their grades than seems typical (though a few still did), knowing that the way to improve their final grade was the only sensible way: improving their understanding. I was glad to be able to target my feedback, which was the main reason I started considering SBG in the first place. For example, most of the students were adept with algebra, but not all. Some had trouble moving between formulas and visual representations of graphs or objects. Some couldn’t quite grasp how to come up with parametrizations. No student, however, could come away saying “I’m not good at calculus.” They almost always knew which areas they struggled with, and by separating out the different skills, this method of assessment provided confirmation and encouragement at the same time. Each student could look at her scores and say, “Hey, I’m pretty good at a lot of this. I see an area where I’m having trouble, so I guess I’ll work on that.”

In the end, I have tried to be guided by the principle that it is not what I do, but what the students do, that contributes the most to their learning. (I picked this up from somewhere, probably several places, and I’ll try at some point to elaborate on how else I applied it.) From that perspective, I would call SBG a success in this class. Participation and performance were more uniform across all topics than I have ever seen before. By which I mean, each student knew she was responsible for a certain collection of skills, not just for an accumulation of points or a certain average letter grade, and so they all stepped up to learn all the skills. (Of course, this work ethic is characteristic of students at my school.)

Those are the upsides. In my next post (probably next week, after I’m done grading), I’ll discuss what didn’t go quite so well and why.