Open Access
Developing Content Knowledge for Teaching Assessments for the Measures of Effective Teaching Study
Author(s) - Geoffrey Phelps, Barbara Weren, Andrew Croft, Drew Gitomer
Publication year - 2014
Publication title - ETS Research Report Series
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.235
H-Index - 5
ISSN - 2330-8516
DOI - 10.1002/ets2.12031
Subject(s) - mathematics education, psychology, medical education
This report documents the development of assessments of content knowledge for teaching (CKT) as part of the Measures of Effective Teaching (MET) study, funded by the Bill & Melinda Gates Foundation. The MET study was designed to develop a set of measures that together serve as an accurate indicator of teaching effectiveness. The study was implemented during the 2009–2010 and 2010–2011 school years with more than 3,000 teachers in 6 predominantly urban school districts. A total of 5 assessments of CKT were developed, piloted, and then administered as part of the MET study. The CKT assessments focused on the content knowledge used in recognizing, understanding, and responding to the content problems that teachers encounter as they teach a subject. In English language arts (ELA), 2 assessments were developed: 1 for teachers of Grades 4–6 and 1 for Grades 7–9. In mathematics, 3 assessments were developed: 1 for teachers of Grades 4–5, 1 for Grades 6–8, and 1 for Algebra I. A total of 2,080 final assessments were administered to 1,718 teachers in the 6 participating MET study districts. Results for 194 assessments were excluded because of evidence either that assessments were completed jointly by 2 or more participants or that too little time was devoted to represent a good-faith effort at answering the assessment questions. The final sample included 1,886 assessments. The assessments included both selected-response and constructed-response (CR) questions. We used item-level statistics, including percent correct and biserial correlations, to systematically remove poorly performing items and improve assessment reliability. Item-level statistics are presented for each assessment. Descriptive statistics and histograms indicate that participants are well distributed over the range of possible scores. The assessments had moderate to strong reliability, ranging from 0.69 to 0.83.
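The item-screening step described in the abstract can be illustrated with a short sketch. The following Python code is a minimal, hypothetical reconstruction, not the procedure ETS actually used: it computes each item's percent correct and a corrected item-total (point-biserial) correlation as a stand-in for the biserial correlations cited in the report, iteratively drops the weakest flagged item, and reports Cronbach's alpha for the surviving item set. The flagging thresholds (minimum correlation 0.2, difficulty band 0.10–0.90) and all function names are assumptions chosen for illustration.

```python
"""Minimal item-analysis sketch (illustrative only, not the ETS procedure).

Assumes `scores` is a 0/1-scored NumPy matrix: one row per test taker,
one column per item. All thresholds and names here are hypothetical.
"""
import numpy as np


def cronbach_alpha(scores: np.ndarray) -> float:
    """Cronbach's alpha reliability for a persons-by-items score matrix."""
    k = scores.shape[1]
    item_vars = scores.var(axis=0, ddof=1).sum()
    total_var = scores.sum(axis=1).var(ddof=1)
    return k / (k - 1) * (1 - item_vars / total_var)


def item_stats(scores: np.ndarray) -> tuple[np.ndarray, np.ndarray]:
    """Percent correct and corrected item-total (point-biserial) correlations."""
    pct_correct = scores.mean(axis=0)
    r_it = np.empty(scores.shape[1])
    for j in range(scores.shape[1]):
        rest = scores.sum(axis=1) - scores[:, j]  # total score excluding item j
        r_it[j] = np.corrcoef(scores[:, j], rest)[0, 1]
    return pct_correct, r_it


def prune_items(scores: np.ndarray, min_r: float = 0.2,
                p_band: tuple[float, float] = (0.10, 0.90)):
    """Iteratively drop the weakest flagged item until all items pass.

    Items are flagged if their item-total correlation is low or their
    percent correct falls outside the difficulty band (the band also
    catches constant items, whose correlation is undefined).
    """
    keep = np.arange(scores.shape[1])
    while True:
        p, r = item_stats(scores[:, keep])
        flagged = (r < min_r) | (p < p_band[0]) | (p > p_band[1])
        if not flagged.any() or keep.size <= 2:
            return keep, cronbach_alpha(scores[:, keep])
        # Drop only the single worst flagged item per pass, because each
        # removal shifts the remaining items' total-score correlations.
        keep = np.delete(keep, np.argmin(np.where(flagged, r, np.inf)))


if __name__ == "__main__":
    # Fake Rasch-like 0/1 responses: 500 test takers, 40 items.
    rng = np.random.default_rng(0)
    ability = rng.normal(size=(500, 1))
    difficulty = rng.normal(size=(1, 40))
    prob_correct = 1 / (1 + np.exp(difficulty - ability))
    responses = (rng.random((500, 40)) < prob_correct).astype(float)
    kept, alpha = prune_items(responses)
    print(f"{kept.size} items retained, alpha = {alpha:.2f}")
```

Removing one item per pass, rather than all flagged items at once, reflects the iterative character of the screening the abstract describes: each deletion changes every remaining item's relationship to the total score, so the flags must be recomputed before the next removal.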