Why It’s a Bad Idea to Release Teacher Ratings
by Norm Fruchter
The United Federation of Teachers has gone to court to block the NYC Department of Education’s (DOE) release of teacher performance ratings based on value-added measures of their students’ test scores. The DOE agreed to hold off on publishing the ratings until a court hearing on November 24th.
The DOE has a contract with the Wisconsin Center for Education Research to produce reports on value-added results based on the student test scores of more than 12,000 NYC teachers who teach ELA and Math in the fourth through eighth grades. The union, a partner in this research, had negotiated an agreement with the DOE not to make the results public. Instead, the results were to be shared with school principals and used in teacher evaluations and tenure decisions. When the DOE violated that agreement by deciding to release the results to the news media, the union went to court to block the release. (When the Los Angeles Times published the value-added ratings of thousands of teachers in the Los Angeles Unified School District, one L.A. teacher committed suicide.)
Should such value-added teacher ratings be publicly released? I’d argue absolutely not, for several reasons. First, in NYC, these measures are based on the results of state standardized testing. As the New York State Education Department has recently demonstrated, that testing is so unreliable that the state had to completely rescale the exams to correct deficiencies. Second, many teachers do not receive any value-added score because they do not teach students within the grade ranges and subject areas tested. Why single out only some teachers for public praise or condemnation?
Third, current value-added measures have an enormous range of uncertainty. Applying the NYC model, NYU’s Sean Corcoran found that, with several years of available data, a teacher’s performance estimate could range from the 46th to the 80th percentile. With only one year of data, a teacher’s average performance range extends from the 30th to the 91st percentile. Variation this large is clearly of little use in rating teachers and, more importantly, in helping them improve their teaching.
Fourth, value-added measures have a very high degree of year-to-year variability – this year’s highly rated teacher may plummet in next year’s ratings. That variability suggests the fifth problem – students’ testing outcomes are responsive to many more factors than a teacher’s instructional capacity alone. To take one current example, sudden family homelessness, an increasingly common result of our national recession, can cause students to change schools, miss days of schooling, or come to school suffering from severe psychological stress. All these factors can easily contribute to severe reductions in student test performance.
Finally, even if all these significant problems with current value-added testing could be resolved (and they may be in the future), should we publish individual teachers’ ratings? I think not. Why transform what might someday become a useful tool for improving teachers’ practice into a ritual of public shaming?
Norm Fruchter is a senior policy analyst at the Annenberg Institute for School Reform.