User Testing with Assessors to Develop Universal Rubric Rows for Assessing Engineering Design
Author(s) -
Nikita Dawe,
Lisa Romkey,
Susan McCahan,
Gayle Lesmond
Publication year - 2016
Language(s) - English
Resource type - Conference proceedings
DOI - 10.18260/p.27118
Subject(s) - rubric, usability, modular design, engineering design process, engineering education, assessment
This paper describes the process of testing and refining modular rubric rows developed for the assessment of engineering design activities. This is one component of a larger project to develop universal analytic rubrics for valid and reliable competency assessment across different academic disciplines and years of study. The project is being undertaken by researchers based in the Faculty of Applied Science and Engineering at the University of Toronto. From January 2014 to June 2015, we defined and validated indicators (criteria) for engineering design, communication, and teamwork learning outcomes, then created descriptors for each indicator to discriminate between four levels of performance: Fails, Below, Meets, and Exceeds graduate expectations. From this rubric bank, applicable rows can be selected and compiled to produce a rubric tailored to a particular assessment activity. Here we discuss these rubrics within the larger context of learning outcomes assessment tools for engineering design. We tested draft rubrics in focus group sessions with assessors (teaching assistants and course instructors who assess student work in engineering design). We followed the testing with structured discussions to elicit feedback on the quality and usability of these rubrics, and to investigate how the assessors interpreted the language used in the indicators and descriptors. We asked participants to identify indicators they believed were irrelevant, redundant, or missing from the rubric. We also asked them to identify and discuss indicators and descriptors that were confusing. Finally, we asked them what changes they would recommend and what training materials they would find useful when using rubrics of this design. We analysed the consistency of assessor ratings to identify problematic rows and used qualitative feedback from follow-up discussions to better understand the issues that assessors had with the rubrics. 
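The consistency analysis described above — using the spread of assessor ratings to flag problematic rubric rows — can be sketched in a few lines. This is an illustrative approximation only: the `modal_agreement` and `flag_inconsistent_rows` helpers, the sample ratings, and the 0.6 agreement threshold are assumptions for demonstration, not the paper's actual analysis.

```python
from collections import Counter

# Ordinal scale used in the paper: 0=Fails, 1=Below, 2=Meets, 3=Exceeds
LEVELS = ["Fails", "Below", "Meets", "Exceeds"]

def modal_agreement(ratings):
    """Fraction of assessors who chose the most common level for a row."""
    counts = Counter(ratings)
    return counts.most_common(1)[0][1] / len(ratings)

def flag_inconsistent_rows(ratings_by_row, threshold=0.6):
    """Return rubric rows whose modal agreement falls below the threshold."""
    return [row for row, ratings in ratings_by_row.items()
            if modal_agreement(ratings) < threshold]

# Hypothetical ratings from five assessors on one sample deliverable
ratings_by_row = {
    "D1B": [2, 2, 2, 3, 2],   # strong agreement on "Meets"
    "D2A": [1, 2, 3, 0, 2],   # scattered ratings -> flagged
}
print(flag_inconsistent_rows(ratings_by_row))  # -> ['D2A']
```

A fuller analysis would use a chance-corrected agreement statistic rather than raw modal agreement, but the principle of flagging rows where assessors disperse across levels is the same.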
Three of the six engineering design rubric items we tested showed evidence of rater inconsistency, uncertainty, and indecision. While some rubric rows received similar criticism from most participants, we also identified differences in assessors' rubric design preferences and in how they apply rubrics to evaluate student work. It also emerged that assessors have different conceptions of engineering design and the design process, and are confused when presented with unfamiliar terminology. Based on our interpretations, we identified changes we should make to improve the rubric structure and content. Our aim is to inform other educators who may be developing tools for the assessment of engineering design by sharing our methodology and discussing the feedback received. Engineering educators could adopt or adapt our user testing methodology to improve the usability of similar assessment tools. Our discoveries about rubric structure improvements could be explored further to define best practices in the design of universal rubrics. Our next steps include applying what we have learned to refine the rubrics and develop accompanying training materials. The refined rubric rows will be evaluated for inter-rater reliability, trialed in focus groups with undergraduate students, and deployed in academic courses.

Background: Learning Outcomes Assessment and the DARCA Project

There is a need for valid and reliable tools for assessing learning outcomes in engineering education. In the United States, the Accreditation Board for Engineering and Technology (ABET) defines curricular requirements for academic programs. Similarly, in Canada, the Canadian Engineering Accreditation Board (CEAB) requires that a set of graduate attributes be assessed and reported on for undergraduate engineering program accreditation. Learning outcomes assessment also generates useful data for continuous improvement initiatives within institutions.
The University of Toronto is a member of the Higher Education Quality Council of Ontario (HEQCO) Learning Outcomes Assessment Consortium. We are working on the Development of Analytic Rubrics for Competency Assessment (DARCA), a project attempting to produce universal analytic rubrics that authentically assess learning outcomes in engineering design and four other competency areas. The rubrics must be "universal" in the sense that they can be applied to multiple assessment activities to measure performance across different academic disciplines and years of study. In an analytic rubric (as opposed to a holistic rubric), a set of descriptors is provided for each indicator (learning outcome criterion) to define discrete levels of mastery. Students, administrative staff, accreditation agencies, course instructors, and assessors all play roles in learning outcomes assessment; however, their expectations for the rubrics do not entirely overlap. The scope of this paper is limited to feedback from the assessor stakeholder group. Assessors include teaching assistants and course instructors who provide feedback and assign grades to students based on deliverable artifacts as well as performance in the classroom. Assessors may or may not be involved in the design of the assessment activities themselves; some assessors are provided with a marking scheme or a rubric and little further guidance when assessing student work. This study draws from many areas of literature on learning outcomes assessment and assessment tools as well as engineering design education. The primary focus is on the use of rubrics for assessing engineering design.

Draft Rubric Development

Between September 2014 and June 2015, the DARCA team completed the process of developing draft rubrics. Based on the literature and existing collections of learning outcomes (e.g. ), we created an extensive hierarchical list of learning outcomes and their more specific, measurable indicators.
This list of outcomes and indicators was validated through expert consultation with engineering design instructors at the university. To produce analytic rubric rows, we then formulated descriptions of quality for each indicator to discriminate between four levels of performance: Fails, Below, Meets, and Exceeds graduate expectations. For consistency in description, the levels were defined to mean the following:

● Fails: Indicator is not demonstrated (Not Demonstrated) OR complete lack of quality and/or demonstration of a fundamental misunderstanding of the concept (Misconception).
● Below Expectations: Lacks quality; work must be revised significantly for it to be acceptable.
● Meets Expectations: Definition of quality; work is acceptable and demonstrates some degree of mastery.
● Exceeds Expectations: Student goes over and above the standard expectations to produce superior work.

From the extensive list of measurable indicators, specific ones can be selected and compiled to produce a rubric tailored to a particular assessment activity (see Appendix A for an example). The content of the draft Design Rubric descriptors was produced based on a review of the literature on engineering design competency definitions and assessment, with additional input from engineering design instructors. To determine whether the rubric rows are suitable for use by assessors, we developed a focus group research methodology for user testing.

Focus Group Testing Methodology

To explore how assessors will use the compiled rubrics and interpret the indicators and descriptors, we conducted focus groups in which participants with experience as teaching assistants and communication instructors for engineering design courses used pre-compiled rubrics to assess sample student deliverables. Between July and October 2015, we conducted three focus groups in which a total of 11 participants assessed interim design report samples from a first-year engineering design course.
For these sessions we selected 6 Design indicators and 9 Communication indicators to produce an assessment-specific rubric. This paper focuses on the following Design indicators:

● Outcome D1: Find and state an engineering design problem
  o Indicator D1B: Accurately state the engineering design problem and summarize key details (interpret a problem statement if provided)
● Outcome D2: Gather information to understand an engineering design problem
  o Indicator D2A: Identify stakeholders with interest or influence and accurately describe stakeholder profiles (e.g. characteristics, perspectives, needs)
  o Indicator D2B: Identify and describe engineering design priorities (i.e. Design for X) and/or social and professional concerns relevant to the problem
  o Indicator D2C: Extract and integrate information from stakeholders and other appropriate (reliable, diverse, credible) sources to enhance understanding of the problem
● Outcome D3: Frame a problem in engineering design terms
  o Indicator D3B: Document appropriate engineering design requirements using a suitable model (e.g. goals-functions-constraints or objectives-metrics-criteria-constraints)
  o Indicator D3D: Describe the intended engineering design process and provide a plan/timeline that anticipates the tasks and resources required

Indicators D2B, D2C, and D3B were also tested in another session with four participants who assessed sample design proposal assignments for a second-year electrical and computer engineering (ECE) course. One of the purposes of this repetition of indicators with a different course assessment piece was to investigate whether the rubric items are generic enough to apply universally across the curriculum.

Focus group sessions began with an overview of the project and an explanation of the agenda and expectations for the session. Participants completed demographic surveys and then were given time to use the rubrics to assess the sample assignments at their own pace.
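The modular "rubric bank" scheme — selecting applicable rows and compiling them into an assessment-specific rubric, as with the Design indicators above — can be sketched as a simple data structure. The `RubricRow` class, the abbreviated bank entries, and the `compile_rubric` helper below are illustrative assumptions; the real DARCA rows carry full descriptor text for each of the four levels.

```python
from dataclasses import dataclass

@dataclass
class RubricRow:
    indicator_id: str   # e.g. "D2A"
    indicator: str      # the measurable criterion
    descriptors: dict   # level name -> description of quality

# A tiny illustrative bank with placeholder descriptor text
RUBRIC_BANK = {
    "D1B": RubricRow("D1B", "Accurately state the engineering design problem",
                     {"Fails": "not demonstrated or misconception",
                      "Below": "lacks quality; needs significant revision",
                      "Meets": "acceptable; some degree of mastery",
                      "Exceeds": "superior work beyond the standard"}),
    "D2A": RubricRow("D2A", "Identify and describe stakeholder profiles",
                     {"Fails": "...", "Below": "...", "Meets": "...", "Exceeds": "..."}),
    "D3B": RubricRow("D3B", "Document design requirements using a suitable model",
                     {"Fails": "...", "Below": "...", "Meets": "...", "Exceeds": "..."}),
}

def compile_rubric(indicator_ids):
    """Select applicable rows from the bank to build an activity-specific rubric."""
    return [RUBRIC_BANK[i] for i in indicator_ids]

# e.g. a rubric for an assignment that exercises only two indicators
interim_report_rubric = compile_rubric(["D1B", "D2A"])
```

The point of the structure is that each row is self-contained, so the same row (e.g. D2B or D3B) can be reused verbatim across a first-year design course and a second-year ECE course, which is what the repeated testing above probes.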
In the first focus group, participants were provided with five samples; in subsequent sessions we limited this to two samples to allow more time for discussion. After assessing the assignments, participants completed individ