Student Grades And Course Evaluations In Engineering: What Makes A Difference
Author(s) - Kara M. Kockelman
Publication year - 2020
Language(s) - English
Resource type - Conference proceedings
DOI - 10.18260/1-2--9809
This research investigates the impact of different instructor, course, and student attributes on student grades and course evaluations. The data come from undergraduate courses given at the University of Texas at Austin during the 1992 through 1998 calendar years. Instructor experience, standing, and gender; course department and credit hours; and student classification, test scores, gender, and other variables are used to explain variation in both grades and evaluation scores. The results of multivariate weighted-least-squares regressions of average grades given across a sample of over 2,500 courses suggest that male instructors assign lower grades, on average, than female instructors, while lecturers and teaching assistants assign higher grades than full, associate, assistant, and adjunct faculty. Instructors teaching chemical, mechanical, and petroleum-and-gas engineering courses assign higher grades, on average, than those teaching aerospace, architectural, civil, and electrical engineering, and engineering mechanics. The results also indicate that non-Asian and non-foreign males taking lower-division courses for more credit hours receive lower grades, after controlling for student entrance-test scores and year in school. Weighted-least-squares analyses of average evaluation scores given to instructors were conducted over five different qualities: course organization, instructor communication, instructor teaching skill, the instructor overall, and the course overall. Evaluations from over 2,500 courses comprised the data set. In general, female students and African Americans rated their courses and instructors higher, and male faculty rated somewhat lower than female faculty. There are interesting gender interaction effects – between students and their instructors – evident as well.
Instructors who had received their PhDs relatively long ago (a measure expected to be highly correlated with instructor age and teaching experience) rated lower, except in the area of course organization. Senior lecturers consistently rated higher than full faculty, while assistant and adjunct faculty rated lower (on two and four of the five questions, respectively). Students in engineering mechanics and in aerospace, architectural, civil, mechanical, and petroleum engineering rated their courses and instructors higher, on average, than did electrical and chemical engineering students. Also of interest to educators are the consistently positive and statistically significant associations between student ratings of a course and student GPAs – and the lack of any statistically significant relation between evaluations and grading bias (as captured by an average-grade-minus-average-GPA variable). These results are apparent only after controlling for other factors, including instructor, course, and student attributes, and they suggest that educators need not be very concerned about the biasing effects of “easy grading” on instructor evaluations.

INTRODUCTION

Student grades and course evaluations are important descriptors of student and faculty performance. Student grades represent instructor evaluation of students and have been used pervasively for probably as long as there have been universities. In contrast, the acquisition and dissemination of student evaluations of their instructors and courses arose relatively recently, from student-based efforts in the 1960s. Many universities now incorporate evaluation results in faculty salary and promotion decisions, and nearly all major U.S. universities regularly collect such data (Ory 1990, Seldin 1993). Students are probably the best resource universities have to assess instructor performance; they experience all aspects of many courses and thus can compare and contrast such experiences.
Moreover, aggregation of their responses provides a large data set, helping to minimize variation in the estimation of average response. However, their use in merit, promotion, and other decisions engenders some controversy. Student ratings of courses are not perfectly reflective of student learning. For example, laboratory studies have suggested that while instructor enthusiasm significantly impacts student ratings, it does not much affect student learning; in contrast, lecture content appears to have a much greater effect on student learning than on ratings (Abrami, Leventhal, and Perry 1982). And correlations between average ratings and average learning (based on standardized test results across multiple course sections) generally fall well below 0.5. For example, Cohen’s meta-analysis (1981) deduced that the highest correlations relate test performance to overall-course and overall-instructor ratings; these were estimated to be 0.47 and 0.43, respectively. In contrast, his correlation of performance and instructor-student interaction ratings was just 0.22. Therefore, in terms of a linear goodness of fit (R²), course and instructor ratings explain just 22% and 18% of the variance associated with average student achievement, and instructor-student interaction explains less than 5%. After controlling for other variables (for example, student intelligence and course time of day), it is very possible that the explanatory contributions of such ratings will fall to even lower levels. Clearly, much more information on students, their instructors, and their courses is needed – in a single model. While the relationships between student learning and ratings are not very strong, they are significant enough that many (e.g., Cohen 1990, Franklin and Theall 1990, Cuseo 2000, Wankat and Oreovicz 1993) place great value on their collection and use.
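The squared-correlation arithmetic above is easy to verify; the following minimal sketch uses the correlation figures quoted from Cohen’s (1981) meta-analysis:

```python
# Correlations between average ratings and average learning,
# as quoted from Cohen's (1981) meta-analysis.
correlations = {
    "overall course rating": 0.47,
    "overall instructor rating": 0.43,
    "instructor-student interaction": 0.22,
}

# Squaring each correlation gives the share of variance in average
# student achievement that the rating would explain in a simple
# linear fit (R^2).
for name, r in correlations.items():
    print(f"{name}: r = {r:.2f}, R^2 = {r * r:.1%}")
```

The squared values come out to roughly 22%, 18%, and under 5%, matching the percentages stated in the text.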
Cohen’s “research-based refutations” of myths concerning student ratings include the following: student ratings are reliable, stable, and “not unduly influenced by the grades” received; and “students are qualified to rate certain dimensions of teaching” (1990, p. 124). However, most researchers agree that ratings are just one component of a comprehensive assessment of faculty teaching (see, e.g., McKnight 1990). The mechanisms linking student grades and instructor evaluations to student, instructor, and course characteristics are complex; but, thanks to a large set of detailed data, the investigation described here illuminates many such relationships. The work analyzes University of Texas at Austin grades given to undergraduate students and course evaluations received by their instructors in the College of Engineering. It relies on weighted-least-squares (WLS) regression models for estimates of the effects of different instructor, course, and student attributes on student grades and course/instructor evaluations. What follows is a discussion of the data sets, the models, and the analytic results.

DATA SETS

Instructor experience, standing, and gender; course department and credit hours; and student classification, test scores, gender, and other variables are used here to explain variation in grades and evaluation scores. A description of all variables used is provided in Table 1. The primary data represent grades, evaluations, and student information collected and maintained by U.T. Austin’s College of Engineering from the spring semester of 1992 through the fall semester of 1998 (21 semesters total). The instructor-attribute data set was produced only recently, based on personnel databases and publicly available lists maintained by the University. Tables 1 and 2 present the definitions and some basic descriptive statistics of the variables used.

Course and Instructor Evaluations

The initial course data set contained 8,458 records.
However, instructor gender could not be reliably matched for approximately 2,100 of these cases, and the year of an instructor’s PhD (or forthcoming PhD, in the case of teaching assistants and some others) could not be computed for another 1,300 cases. The remaining course records were merged with student-attribute data and information on grades given, and another set of records was removed (due to a lack of grades data and other student attributes). In general, the removed course records were for very small courses and/or courses taught by temporary instructors whose information did not enter the University database. The resulting complete data set had close to 2,700 observations, and these were used here for analysis. The survey instrument and its administration are very important in producing reliable results. The course and instructor ratings data analyzed here permit only five responses, which, according to Cashin (1990), may be ideal for analytical distinction of student satisfaction and dissatisfaction. Moreover, a category for a response of “don’t know” or “not sure” is not provided, helping avoid other issues (see, e.g., Arreola 1983). The five questions, as shown in Table 3, do not suffer from ambiguity, though one may argue that there is some ambiguity in the description of permitted responses, shown in Table 4. There is an opportunity for more open-ended responses on the right side of the Scantron survey sheets, where two questions are clearly posed: “What did you like most about this course?” and “How might this course be improved?” While helpful for overall assessment (Cuseo 2000), these two textual responses are non-numeric and are not analyzed here. In terms of survey administration, instructors are told to have students acquire the survey forms from departmental administrative offices, distribute these to classmates, and provide standardized instructions to all classmates present for survey completion.
These students are then to collect all responses and return them to the administrative offices. The survey’s administration takes approximately ten to fifteen minutes, and instructors are required to leave the classroom during this time. In general, such administrative methods comply with the core standards promoted in the student-evaluation literature (see, e.g., Cuseo 2000).

Student Grades

Instructor reporting of student grades is quite standardized
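Though the paper’s full regression specifications are not reproduced in this excerpt, the weighted-least-squares approach it relies on can be illustrated with a minimal single-regressor sketch. The data below are invented purely for illustration; the weights stand in for course enrollment, on the idea that averages from larger courses are more reliable and so should count more heavily in the fit.

```python
# Minimal weighted-least-squares (WLS) sketch in pure Python.
# Hypothetical setup: average course grade regressed on one
# instructor attribute, weighted by course enrollment.

def wls_simple(x, y, w):
    """Fit y = b0 + b1*x minimizing sum(w_i * (y_i - b0 - b1*x_i)^2)."""
    sw = sum(w)
    mx = sum(wi * xi for wi, xi in zip(w, x)) / sw   # weighted mean of x
    my = sum(wi * yi for wi, yi in zip(w, y)) / sw   # weighted mean of y
    sxx = sum(wi * (xi - mx) ** 2 for wi, xi in zip(w, x))
    sxy = sum(wi * (xi - mx) * (yi - my) for wi, xi, yi in zip(w, x, y))
    b1 = sxy / sxx                                    # weighted slope
    b0 = my - b1 * mx                                 # weighted intercept
    return b0, b1

# Toy data (illustrative only, not from the paper):
x = [0, 0, 1, 1]          # e.g. 0/1 indicator: lecturer vs. tenure-track
y = [2.9, 3.0, 3.3, 3.4]  # average grade given in each course
w = [120, 80, 30, 20]     # enrollment, used as the regression weight
b0, b1 = wls_simple(x, y, w)
```

In the paper’s multivariate models the same idea extends to many regressors (instructor, course, and student attributes) via the matrix normal equations; this sketch only shows how the weighting enters the fit.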
