## Monday, May 27, 2013

### SBG: what could have gone better

A couple of weeks ago, I posted about many great things that came out of my experiment with standards-based grading in multivariable calculus. I think the experiment was a success in that there was a lot more good than bad, and the bad things weren’t so bad. Nonetheless, things could have gone better, even in ways I don’t think the students realized, and I would be remiss not to mention them. Whence this post.

Arrangement of standards: There is a tough balance to achieve here, and I didn’t quite make it. The twenty-four standards included

• 7 “common standards” that could relate to any college-level math class,
• 3 standards related to vectors and their geometry,
• 3 standards related to graphing and parametrization,
• 5 standards related to differentiation and its applications,
• 4 standards related to integration and its applications,
• 2 standards related to the classical theorems of vector calculus.

I wanted each standard to present a unified concept to be mastered. However, I didn’t want the number of standards to proliferate. I think the total number of standards is about right (anywhere in the range 20–30 would have worked), but the distribution of topics was a little off.

First, seven common standards is too many. Four or five would be more appropriate. Algebra and presentation are essential to treat separately. The others, while distinct in my mind before the class began, became somewhat muddled during the semester, and some even overlapped a fair amount with the content-specific standards. The first six or seven standards scored on each homework assignment (often only six, because one was essentially “use of technology”, which rarely figured directly into the homework) became a blurry wash, occasionally used to signal that some general but ill-defined skill needed attention.

Second, I was too clever in collecting related topics, to the point that certain standards were only partially covered for several weeks at a time. For instance, “line integrals” included both arc length computation (covered in the first chapter) and integrals of vector fields (covered in chapter 4). This was perhaps the most egregious example. The topics covered by each standard should have been grouped not only by commonality, but also chronologically.

Third, the relative importance of the standards, or at least the relative emphasis that was given to each during class, was not as balanced as I would have liked. Setting up and evaluating double and triple integrals, a single standard on the syllabus, took over a week of class time (although part of that time included changing coordinates, which was a separate standard). Visualizing vector fields (the only standard that was tested only once, on the final exam) was dealt with sporadically in class. Is it as important to be able to interpret the visual information carried by a graphical vector field as it is to find integrals of several variables? Arguably yes, hence the separate and equal standards. Was that equality reflected in the amount of attention each was given during the semester? Again, not as much as I would have liked. I’m not sure this is a challenge of standards per se so much as one of course design. Having the standards just highlights the inequity.

Finally, on this topic, even some of the content-specific standards overlapped more than I had intended. I mostly managed to avoid the obvious pitfalls: for instance, computing integrals and finding parametrizations were handled separately, so if a question asked students to find the surface area of a figure, say, and someone set up the wrong integral but computed it correctly from that point, they could get credit for integration but not for parametrization. But what exactly are the skills that go into setting up an integral? There were standards for describing objects in 2 or 3 dimensions, as well as an “analysis” standard that, in part, required finding the domain of a function. When a double or triple integral is needed, one has to draw on one or more of these skills to find appropriate limits of integration. When a problem could have been solved by several different approaches, does a failure to find any solution reflect a lack of mastery of all those skills? Hard call. No one’s scores suffered seriously from this ambiguity, but occasionally I found myself judging a surprising number of standards on the basis of one or two exercises.

Grading scale: Having a four-point scale for each standard worked well, on the whole. It was sufficiently refined to target both areas of success and areas needing work. However, the overall quality of work was so good that I had trouble distinguishing among the highest levels of performance. What should I do with a solution that reflects a clear understanding of the skills involved, but has one or two minor errors? Do those reflect some genuine misunderstanding, or simply a slip? How can I judge between a score of 3 (“generally good accuracy”) and 4 (“complete mastery”) in that case? I think I may have to move to a five-point scale, as described here, where 4 and 5 both indicate mastery, but 4 allows for small mistakes.

On the other hand, I think I could have been more exigent in what level of mastery should be reached across the spectrum. On the syllabus, I stated that attaining 4 on 80% of the standards with no scores lower than 3 was sufficient for an A, and I believe I could have raised that percentage to 90% to better reflect complete mastery of the course material. The end result of this is that perhaps a few final grades were more elevated than they might otherwise have been. But seriously—these students worked extremely hard, I am extremely proud of them and have full confidence in their calculus skills, and they deserve some recognition for working with me on this grading experiment and making it a success.
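The threshold described above can be written as a small check. This is only a sketch in Python, not something used in the course: the function name `meets_a_threshold` and its parameters are hypothetical, and it assumes that “80% of the standards” means at least that share of them. Under that assumption, with twenty-four standards the 80% rule requires 20 scores of 4, and a 90% rule would require 22.

```python
def meets_a_threshold(scores, mastery_percent=80, mastery_score=4, floor_score=3):
    """Check the sufficient condition for an A described in the text.

    scores: one integer score per standard on the four-point scale.
    mastery_percent: share of standards (in percent) that must reach
        mastery_score; 80 is the syllabus value, 90 the stricter alternative.
    floor_score: no standard may be scored below this value.
    All names here are illustrative assumptions, not part of the syllabus.
    """
    if not scores:
        return False
    # Any score below the floor rules out the condition.
    if min(scores) < floor_score:
        return False
    mastered = sum(1 for s in scores if s >= mastery_score)
    # Integer comparison avoids floating-point rounding at the boundary.
    return mastered * 100 >= mastery_percent * len(scores)


# Example: 24 standards, 20 at mastery and 4 scored as 3.
example_scores = [4] * 20 + [3] * 4
print(meets_a_threshold(example_scores))                      # 80% rule
print(meets_a_threshold(example_scores, mastery_percent=90))  # 90% rule
```

The same set of scores passes the 80% rule and fails the 90% rule, which is the kind of case where a final grade may have come out higher than it otherwise would have.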

Assessments: I was exceedingly grateful to have a grader with whom I had worked before, and whom I trusted to help me implement this SBG system as effectively as possible. I could not have made this first attempt work without her aid. Each week, she would mark the homework, making particular note of places that raised concern or showed exceptional mastery, and then we would meet together to assign scores. I don’t think this method is sustainable across terms. I need to shift to a model that depends less on explaining the grading system to a new assistant each semester, but that also will not vastly increase the amount of time I have to spend grading. (I don’t think having the only graded assignments be two midterm exams and the final is sufficient for me to trust those assessments, nor does it communicate with the students in the way I always hope SBG will.) This is perhaps the area I have to think most about revising as I move forward with SBG in future classes.

Re-assessments: About a fourth of the class took advantage of the opportunity to re-assess any standards. Not such a bad number, especially considering how well they were doing on the whole. But I feel more could have benefitted from this feature of SBG, had the process of reassessment been clearer, and had some of the above obstacles been removed. I do need to find a way to cut down on the time required for reassessment, however: I always tried to claim it would take 10–15 minutes, but often it was much longer than that. No student ever complained about the length of time, which resulted both from my giving students multiple chances to explain themselves and from the relative complexity of the material. Nonetheless, I think retesting will have to be made more efficient for it to work in other classes.

Compiling scores: The students received regular updates on their scores in the form of score sheets attached to their homework and exams, but there was no established system by which they could see what their current scores on all the standards were. (Having had some troubles using our LMS in a much simpler grade book setting, I’m averse to the idea of using that or any other online reporting system.) Fortunately, I think this problem is easily solved. Most likely, I’ll handle it in the future by passing out sheets on which students can record their own scores, so that they don’t have to consult with me to find out their current standing. (This is a suggestion I got from Bret Benesh. I suspect some students were already doing this on their own.)

That’s probably not everything that needs improvement, but it’s what came to the forefront of my attention. I have some ideas, some listed above, on how to make my system better next time around. As I’m working on future syllabi, I’ll jot these ideas down and post them here.

Bret Benesh said...

Hi Joshua,

I am happy to hear that SBG is worth doing again. I definitely think that it is an improvement over what I did before, although I am not convinced that it is the last grading scheme I will ever use.

I have been wondering how to use graders for SBG, too. The flavor of SBG I use in calculus, stats, etc. works really well with graders, but the version I used with the elementary education majors does not work well with a grader (yet).

Do you have ideas now on how you could eventually use a grader effectively with SBG? I am going to have this problem in the spring.
Bret

Joshua Bowman said...

Bret,

My initial idea is to have smaller, more frequent assessments (read: weekly quizzes), outside of class (like a language lab, which I remember well from my language classes), which I could grade quickly, taking no more time than I did this semester meeting with my grader. Those exercises would be highly targeted. Homework would then not figure into the grading scheme, but would truly be practice. I would give the grader a small set of focused standards to score on each homework, and that feedback would help the students and also help me monitor the class. I would not quiz anything until it had been covered in a homework set, so that students could get the grader's feedback, but I wouldn't have to monitor that feedback too closely. The quizzes that I grade would give an indication of how exams will be scored.

The problems I see with this plan: I still have to grade something every week; homework could potentially become (or be perceived as) busywork for either the students or the grader; the whole system could just become confused. So I'm not really happy with this idea yet.