Are Principals’ Evaluation Scores of Teachers Trustworthy?

September 9, 2022

Teacher evaluation is ubiquitous in US public schools. Typically, it involves a principal observing a teacher several times over the course of a school year. In an effort to standardize ratings, principals follow a scoring rubric; however, the ratings are ultimately subjective, and the items on the rubric are open to interpretation. One of the primary functions of teacher evaluation is to provide accurate feedback to teachers and encourage improvement when needed. A persistent question regarding teacher evaluation is whether evaluation scores are inflated. Some research suggests this is the case; however, little is known about the reasons for inflating the scores. A recent study by Jones, Bergin, and Murphy (2022) attempted to determine whether principals inflated scores and, if so, what motivated them to do so. Using a mixed-methods approach that combined focus groups with a survey of a large group of principals, the study found that principals pursued several goals in addition to providing accurate ratings. Those additional goals were to (1) keep teachers open to growth-promoting feedback, (2) support teachers’ morale and foster positive relationships, (3) avoid difficult conversations, (4) maintain self-efficacy as an instructional leader, and (5) manage limited time wisely. Principals offered these additional goals as reasons to inflate scores, even if only by small amounts. For the most part, these are worthy goals, and they suggest that teacher evaluation is more complicated than simply applying a scoring rubric while observing a teacher.

In general, principals are more likely to inflate ratings when the ratings are linked to high-stakes outcomes, such as requiring an improvement plan for the teacher or making retention decisions. Principals are reluctant to give lower ratings if doing so commits them to activities that require more time, such as additional meetings to develop improvement plans or careful documentation of the reasons for recommending against retention. Inflating ratings also lets principals avoid difficult conversations with a teacher.

The principals’ worry was that if they gave a lower rating, teachers would become defensive and less open to feedback and growth. They also feared that low ratings would lower staff morale and harm positive relationships. These concerns are not without merit. On a rating scale that ranges from 1 to 7, a teacher considers a rating of 4 to be low but a rating of 5 to be acceptable. The principal considers that one-point difference to be small. Since there is room for judgment in the scoring rubric, giving the more positive rating will do no harm from the principal’s perspective.

Based on the research by Jones, Bergin, and Murphy (2022), these situations are relatively common. What the principals’ perspective overlooks is the impact these decisions have on students, which receives little consideration. The effect of these decisions on student outcomes is unknown. A complete understanding of teacher evaluation requires understanding all of the effects of those evaluations.

