Not All Curves Are the Same: Left-of-Center Grading and Student Motivation
Author(s) - Joanna Wolfe, Beth Powell
Publication year - 2015
Language(s) - English
Resource type - Conference proceedings
DOI - 10.18260/p.24527
Subject(s) - grading (engineering) , norm (philosophy) , psychology , mathematics education , pedagogy , medical education , medicine , engineering , epistemology , philosophy , civil engineering
Despite a substantial body of research criticizing norm-referenced (i.e., "curved") grading for fostering a competitive climate, the practice remains a staple in STEM education and is unlikely to change. One reason educational critiques of the practice may not have hit home is that not all norm-referenced grading is the same. There is likely a big difference between what we refer to as left-of-center grading, where exam means are in the 20s or 30s and a score of 40% can translate into an A, and exams where means are near 60% and a score of 80% translates into an A. This study tests the hypothesis that students distinguish between these different types of norm-referenced grading practices. One hundred seventy-seven engineering students at a private, Research I university completed surveys about their perceptions of norm-referenced exams with means in the 20s versus those with means in the 60s. The results overwhelmingly show that students found exams with means in the 20s, but not those with means in the 60s, discouraging and indicative of bad, uncaring teaching. Students receiving an "A" for exam scores in the 30s were unlikely to feel proud of their accomplishment and were highly unlikely to feel that they had learned what the instructor expected. These same students, however, did feel proud when an "A" was based on an exam score in the 80s. Students were also more likely to consider cheating and less motivated to study when the median score was in the 20s. Over 90% of students indicated that a primary purpose of exams should be to measure mastery of concepts, and nearly 80% indicated that measuring what a student had learned should also be a primary purpose. By contrast, only 12% of students indicated that "distinguishing exceptional students from others" should be a primary purpose.
These results are at odds with the assumptions of left-of-center grading, which prioritizes distinguishing among different groups of students and only indirectly measures a student's mastery of course content or learning.

Introduction

In the course of interviewing students for a project on gender and interpersonal communication in engineering, we began to observe a trend of negative reactions to a common educational practice that we have come to call left-of-center (LOC) grading: exams with class means below 50 percent. Curious about this trend, we modified our interview protocol to systematically ask students to comment on the pros and cons of this practice. Over 60% of the women and 15% of the men we interviewed emphatically saw the negatives as outweighing the positives. This trend was particularly common among minority women, over three-quarters of whom described the practice as highly discouraging. The quotations below reflect some of their viewpoints.

We'll have like a 30 percent average [on exams]....When you take the exam, it makes you feel horrible. You come out of there like, "I answered a fifth of that right, at most." It's sort of like, "Well, gee, what did I learn?" (Hispanic Female; elite private university)

You don't feel like you learned it. I mean you get 50 percent on a test, and you get an A, I mean that's horrible because there were so many others that you didn't get right. (African American Female; public research university)

To me if I made a 30, even if 30 is the highest grade in the class, I still failed. I think it's very demoralizing. (White Female; public, non-research university)

It's not necessarily about grades, because you can get a 40 percent and still get an A, so it's not really about the grade, but.... you feel like you're failing even if you get an A on the exam.
(Native American Female; elite private university)

I ended up with over a 3.6 GPA, so I obviously didn't do that bad, but there were a lot of tests that I would end up like leaving in tears, frustrated...It makes you feel like, "Why am I in engineering school? I don't understand what I'm doing. I'm not learning anything." (White Female; public, non-research university)

As the quotes above indicate, tests in STEM classes frequently have means as low as 20 or 30 percent, where a grade of 40 or 50 percent becomes an A. We term this practice left-of-center grading to distinguish it from exams in which the majority of students are able to complete the majority of problems. The practice is quite common: of the 83 engineering undergraduates and alumni we interviewed, all but three had experienced it. 1 And, as we indicated above, our research has also found that female students are particularly troubled by left-of-center grading, suggesting that the practice may have major implications for the retention of diverse populations.

1 LOC grading is a subset of norm-referenced grading. Norm-referenced grading, popularly known as grading on a "curve," involves grading students on the basis of their rankings within a particular cohort. It is typically contrasted with criterion-referenced grading, which involves comparing students' achievements against clearly stated criteria for learning outcomes and clearly stated standards for particular levels of performance. Although most grading mixes norm-referenced and criterion-referenced components, there is a strong consensus among educational researchers that criterion-referenced grading is an ideal to be aspired toward because it provides students meaningful feedback on their mastery of objective core competencies. 2-8

Criterion-referenced grading has been found to increase students' trust in the grading process, 9 increase use of effective learning strategies, 10,11 encourage intrinsic interest in what is being studied, 4,7,10,12 and discourage the counterproductive competition 4,11 that turns many students away from STEM fields. 13 In fact, ABET now specifies that programs should evaluate students based upon explicitly stated criteria. 14

One reason we propose that students find LOC grading so frustrating is that it violates so many of the feedback principles of criterion-referenced grading. Because the range of scores tends to be narrow when means are very low, students do not receive meaningful feedback about which competencies they have and have not mastered. LOC grading measures students but does not provide information about learning. Since women are more likely than men to have a mastery orientation (vs. a performance orientation) toward learning, 15 it makes sense that they find LOC grading particularly discouraging.

If the reason that students object to LOC exams is that they expect exams to provide feedback on what they have mastered, then we might hypothesize that students would be more positive about exams whose means are in the 60s. A student who achieves a score of 60 might infer that he or she has at least partially mastered most of the core competencies the exam was designed to test, but still has substantial room to improve. At the same time, calibrating exams so that class means fall in the 60s would still provide plenty of opportunity for the strongest students in a class to distinguish themselves. (Previous research found that the need to distinguish and challenge the brightest students was the primary benefit instructors and students gave in support of LOC grading.)
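The contrast between norm-referenced and criterion-referenced grading discussed above can be made concrete with a small sketch. The rank cutoffs, letter-grade scheme, and score distribution below are hypothetical and are not drawn from the study; they serve only to illustrate how the same raw score fares under each approach:

```python
# Illustrative sketch (hypothetical cutoffs and scores): how
# norm-referenced ("curved") grading can turn a low raw score into an A,
# while criterion-referenced grading compares it against fixed standards.

def curve_grades(scores):
    """Assign letter grades by class rank (norm-referenced).

    Hypothetical scheme: top 20% of the class get an A, next 30% a B,
    next 30% a C, and the bottom 20% a D.
    """
    ranked = sorted(scores, reverse=True)
    n = len(ranked)

    def grade(score):
        pct = ranked.index(score) / n  # fraction of class scoring higher
        if pct < 0.20:
            return "A"
        elif pct < 0.50:
            return "B"
        elif pct < 0.80:
            return "C"
        return "D"

    return {s: grade(s) for s in set(scores)}

def criterion_grade(score, cutoffs=((90, "A"), (80, "B"), (70, "C"), (60, "D"))):
    """Assign a letter grade against fixed standards (criterion-referenced)."""
    for cutoff, letter in cutoffs:
        if score >= cutoff:
            return letter
    return "F"

# A left-of-center exam: class mean of 27%, so a raw 42% tops the class.
loc_exam = [42, 35, 31, 30, 28, 27, 25, 22, 18, 12]
print(curve_grades(loc_exam)[42])  # curved: the 42% earns an A
print(criterion_grade(42))         # fixed criteria: the same 42% is an F
```

The sketch makes the paper's point visible: under the curved scheme the top score "succeeds" regardless of how little of the exam was answered correctly, whereas the criterion-referenced scheme reports mastery directly.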
Instructors could arrange for approximately half of the items to test student mastery of core concepts covered in class (i.e., half of the items would be criterion-based), while the other half could test students' abilities to apply information in new ways.

In the course of studying LOC grading, we also became aware of other instructional strategies that seem to violate the assumption that exam grades should provide students with information about their progress in the course. For instance, students told us that they often did not have enough information about instructors' curving practices, or about the means and ranges of exam scores, to know what their grade in the course was. In fact, early in their academic careers, students often assumed they were failing a course when, in fact, their grades were above average. We also wondered, given the large number of items students miss on these exams, whether instructors review those items in class so that the exam can inform instruction rather than merely measure and classify students.

To find out more about LOC grading and related exam practices and their effects on students, we designed a survey to address the following questions:

- Assuming that their final grades remain the same, to what extent does the raw score of an exam affect students' motivation, self-efficacy, learning strategies, or perception of the instructor?
- To what extent do students believe that exams should be criterion-based? What other assumptions do students hold about the purpose of the exam?
- How common is it for instructors to provide students with information about exam scores and grading methods?
- How common is it for instructors to treat exams as a learning instrument by reviewing problems that the majority of the class missed?
- To what extent do student perceptions of LOC grading vary by gender, ethnicity, or achievement level (i.e., GPA)?