Teacher Evaluation Overview
Cleaver, S., Detrich, R. & States, J. (2018). Overview of Teacher Evaluation. Oakland, CA: The Wing Institute. https://www.winginstitute.org/assessment-summative.
As students progress through school, many elements (home experiences, classroom instruction, and internal factors) influence their eventual outcomes. In the school environment, a teacher’s skills, strengths, and abilities have as much of an influence on student learning as student background (Wenglinsky, 2002). Put another way, teachers matter; teachers who are effective contribute to positive student outcomes and achievement (Johnson & Zwick, 1990; Nye, Konstantopoulos, & Hedges, 2004; Sanders, Wright, & Horn, 1997), so it is important to understand what effective teachers do that influences student outcomes. Equally important is providing teachers with information and feedback they can use to become better practitioners. That’s where teacher evaluation comes in.
Teacher evaluation is conducted to ensure teacher quality and to promote professional learning with the goal of improving future performance (Danielson, 2010). A basic definition of teacher evaluation is the formal process used to review teacher performance and effectiveness in the classroom (Sawchuk, 2015). However, this definition is an oversimplification. In practice, teacher evaluation involves understanding and agreeing on the inputs (e.g., the practices that define quality teaching), outputs (e.g., student achievement measures), and methods of evaluation (e.g., student assessment data, teacher observation rubrics). The elements of evaluation are rarely agreed on (Goe, Bell, & Little, 2008). This overview provides information about teacher evaluation as it relates to collecting information about teacher practice and using it to improve student outcomes.
Teacher Evaluation for Improvement and Accountability
Teacher evaluation serves two purposes: improvement and accountability. Evaluation provides teachers with information that can improve their practice and serve as a starting point for professional development; for example, using information from teacher evaluations to set a plan of study for professional learning community (PLC) meetings. Evaluation provides accountability when information gained from the evaluation is used to guide decisions regarding bonuses, firing, and other human resource decisions (Santiago & Benavides, 2009).
There is an inherent tension between these two purposes. On one hand, when teachers feel they are focused on improvement, accountability can feel incongruent and teachers may not want to provide accurate information because of the risk of revealing weaknesses. On the other hand, when the focus is on accountability, teachers may feel insecure about their work (Santiago & Benavides, 2009). Goals around improvement may hinder the ability to use evaluation for accountability decisions, while goals around accountability may prevent or obfuscate improvement efforts. If the teacher evaluation process becomes too cumbersome or aversive for either the teacher or evaluator, the process will be in jeopardy.
Summative and Formative Evaluation
Teacher evaluation can serve a summative or formative purpose. Summative evaluation provides conclusive evaluation of a teacher’s performance to determine how well that individual has done his or her work (Marzano, 2012). In this type of evaluation, a supervisor evaluates a teacher using a combination of measures that may include student test scores, lesson plans and artifacts, and rating scales or rubrics. Teachers are not involved and the results are used for accountability decisions such as pay awards or dismissal (Marzano, 2012).
Formative evaluation provides ongoing information about teacher practice with the goal of providing feedback that helps teachers improve. Teachers are often involved in the process through self-reflection or self-assessment. The results of the evaluation may be used to give teachers feedback, and to make decisions regarding the professional development or coaching support that teachers receive (Sayavedra, 2014).
History and Current State of Teacher Evaluation
In the early 20th century, the framework of scientific management, or the idea that every task can be broken down into its best and most efficient method, was applied to education (Marzano, Frontier, & Livingston, 2011). This started a focus on examining teacher behavior, providing suggestions for feedback, and evaluating effectiveness in the classroom (Marzano et al., 2011). Since World War II, the role of evaluation has evolved. Clinical supervision, popular in the 1960s and 1970s, was the first major trend. It involved a pre-observation conference, teacher observation, reflection, and analysis with a focus on classroom behaviors that directly impacted learning. In the 1980s, the Hunter lesson design, also called mastery teaching, was incorporated into observation and evaluation so that administrators observed a specific lesson sequence: anticipatory set, objective and purpose, input, model, checking for understanding, guided practice, and independent practice (Hunter, 1984).
In the mid-1980s, alternatives to clinical supervision and mastery teaching were proposed. In these alternatives, the teacher became a core element in evaluation and principals were expected to differentiate observation and evaluation depending on teachers’ needs and experience (Marzano et al., 2011). Throughout the 1980s and 1990s, there was a shift away from structured observation, along with a move toward formal teacher evaluation (Marzano et al., 2011).
One of these shifts was prompted by a RAND group study of 32 districts across the United States (Wise, Darling-Hammond, McLaughlin, & Bernstein, 1984). The RAND study identified four primary concerns regarding then-current evaluation: (a) principals were not committed or able to provide accurate evaluations, (b) teachers were not open to receiving feedback, (c) evaluation practices were not uniform, and (d) evaluators were not trained (Wise et al., 1984). The RAND study also outlined the following recommendations for evaluation:
- Evaluation systems should align with goals without being overly prescriptive.
- Principals need time, training, and oversight to implement evaluations effectively.
- An evaluation system should align with the overarching purpose (and a district may need multiple evaluations to align with multiple goals).
- Resources need to be provided and allocated effectively.
- Teachers need to be involved in the design, monitoring, and implementation of evaluation systems.
Throughout the 20th century, teacher evaluation was a district-level initiative, more focused on teacher behavior and administrative supervision. In the 21st century, teacher evaluation has become a focus of national policy, and the emphasis has shifted to evaluation of teacher quality and student achievement (Marzano et al., 2011).
In the late 2000s, two reports critiqued the teacher evaluation system and set the stage for the current conversation. First, Toch and Rothman’s report Rush to Judgment critiqued teacher evaluation as “superficial and capricious” (2008, p. 1) and concluded that it did not measure student learning. Despite No Child Left Behind requirements, Toch and Rothman found only 14 states that required annual teacher evaluations. Similarly, Weisberg, Sexton, Mulhern, and Keeling (2009), in The Widget Effect, found that fewer than 1% of 15,000 teachers in 12 districts and four states were rated “unsatisfactory” and that little action was taken based on results from teacher evaluations. The authors argued that districts were treating teachers as widgets, or interchangeable parts in a system, not as individual professionals with the potential to have an important impact on instructional effectiveness and student outcomes.
This increased concern about how teacher evaluations were being conducted and used, along with legislation around teacher quality, focused state legislature attention on teacher evaluation (Goe, Holdheide, & Miller, 2011). The current conversation still focuses on how teacher evaluations are conducted; the impact of teacher evaluation on teacher effectiveness and student outcomes; and how results are used, for example, in professional development (Sawchuk, 2015).
Relevant Issues in Teacher Evaluation
Current issues in teacher evaluation revolve around core questions on how to design and implement an evaluation, including what framework to use, what to measure, and how to collect data.
A framework outlines the guiding principles for a teacher evaluation. It provides credibility in the system, and assurance that evaluators can confidently ascertain the quality of teachers (Danielson, 2010). That framework should include:
- A clear definition of good teaching that is agreed on by everyone involved (Danielson, 2010).
- An understanding of the purpose of the evaluation, which may be information gathering, accountability, or improvement, or any combination of the three (Goe et al., 2008).
- A clear purpose that provides information about whether the evaluation is formative or summative, and how the results will be used (Goe et al., 2008).
- An understanding of who is involved and how, the tools that will be used, and the stakeholders involved (Santiago & Benavides, 2009).
Teacher quality is measured both quantitatively (e.g., student test scores) and qualitatively (e.g., notes on teacher professionalism). An analysis of 120 studies (Goe et al., 2008) identified qualitative elements of effective teachers:
- Positive contribution to academic, attitudinal, and social outcomes for students
- Comprehensive lesson planning, progress monitoring, and capacity to adapt and evaluate instruction
- Diversity and civic-mindedness
- Collaboration with stakeholders (e.g., parents, administrators), particularly for students who are at risk (e.g., those with individualized education programs, or IEPs)
Once the elements that will be measured are clear, how to measure each aspect must be considered. While summative evaluations should include a comprehensive variety of measures that can provide a full picture of a teacher’s effectiveness, formative evaluations may include any range of measures used to collect enough information to serve the purpose of the evaluation. The measures used in formative evaluation may also be more teacher focused, including self-assessment, observation, peer mentoring, and coaching. When coaching and peer mentoring are used, it is important to consider training evaluators in how to deliver feedback that leads to improved teacher performance.
Another consideration for measurement is the reliability and validity of tools. Reliability of a tool is how well it produces consistent and stable results. Tools that are used to measure teacher practices must be reliable and valid; they must provide information that is consistent across multiple evaluators and that measures teacher practice without measuring any other factors at the same time. Also, tools used to gauge student outcomes must be valid, meaning that the scores must accurately measure the outcome without measuring anything else (Goe et al., 2008).
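One way to make "consistent across multiple evaluators" concrete is to compare the rubric levels that two evaluators assign to the same lessons. The sketch below is illustrative only: it computes simple percent agreement and Cohen's kappa (a standard chance-corrected agreement statistic that the sources cited here do not specifically prescribe). The ratings and the names `principal` and `peer` are hypothetical.

```python
from collections import Counter


def percent_agreement(rater_a, rater_b):
    """Share of observations on which both raters assigned the same level."""
    matches = sum(a == b for a, b in zip(rater_a, rater_b))
    return matches / len(rater_a)


def cohens_kappa(rater_a, rater_b):
    """Agreement between two raters, corrected for agreement expected by chance."""
    n = len(rater_a)
    observed = percent_agreement(rater_a, rater_b)
    counts_a = Counter(rater_a)
    counts_b = Counter(rater_b)
    # Chance agreement: probability that both raters independently pick the same level.
    expected = sum(counts_a[level] * counts_b[level] for level in counts_a) / (n * n)
    # Note: undefined (division by zero) if both raters always use one identical level.
    return (observed - expected) / (1 - expected)


# Hypothetical rubric levels for eight lessons
# (1 = unsatisfactory, 2 = basic, 3 = proficient, 4 = distinguished)
principal = [3, 2, 4, 3, 1, 3, 2, 4]
peer = [3, 2, 3, 3, 1, 2, 2, 4]

print(f"percent agreement: {percent_agreement(principal, peer):.2f}")
print(f"Cohen's kappa:     {cohens_kappa(principal, peer):.2f}")
```

With these made-up ratings, the two evaluators agree on 6 of 8 lessons (0.75), and kappa is about 0.65 once chance agreement is removed. What counts as "acceptable" agreement is a judgment call that a district would need to set for itself.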
Blanton et al. (2003) outlined additional criteria that inform the usefulness of a measurement tool:
- The ability to capture all aspects of a teacher’s effectiveness
- The ability to capture the range of activities in a teacher’s work
- Usefulness of the scores for a specific purpose
- Feasibility, including the cost, training required, and other considerations
- Credibility or the trust that the stakeholders have in the measure
Charlotte Danielson Framework for Teaching.
A common measure used for teacher evaluation is the Charlotte Danielson Framework for Teaching (Danielson, 1996, 2007), which includes an extensive rubric over four domains: planning and preparation, classroom environment, instruction, and professional responsibilities. Across these four domains, the rubric incorporates 76 elements of teaching broken into four levels of performance (unsatisfactory, basic, proficient, and distinguished). Over time and two iterations (1996 and 2007), the Danielson framework has become the primary tool for capturing teaching and learning (Marzano et al., 2011). The Danielson Framework for Teaching (Danielson, 1996) was intended to do three things:
- Acknowledge the difficulty and complexity of teaching as a profession.
- Create a language for professional engagement.
- Provide a structure for teacher assessment and reflection.
Research conducted on the Danielson framework indicates acceptable reliability and validity (Lash, Tran, & Huang, 2016). When there is score variance, it is attributable to the teacher, not other variables (Kane & Staiger, 2012; Kane, Taylor, Tyler, & Wooten, 2011). This means that when a score differs from one evaluation to the next, such as when a teacher advances in the area of planning and preparation from fall to winter, the difference between the two scores occurs because the teacher changed his or her practice, not because the tool was unclear. The reliability of achievement growth scores varies (Kane & Staiger, 2012; Lash et al., 2016). One study that used evaluations from 156 teachers across 18 high-poverty charter schools in the mid-Atlantic concluded that using multiple measures across a school year (in this case, three separate observations using the Danielson framework) provided a reliable measure (Kettler & Reddy, 2017).
Value-added measures are a way to take into account the various conditions and factors that contribute to student achievement, across multiple years of teaching, and in comparison with other teachers. This way of calculating a teacher’s effectiveness was developed in the 2000s using statistical models that could determine how much one teacher contributed to student learning (Goe et al., 2008).
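To give a feel for the idea, the following is a deliberately simplified sketch, not the model used in any of the cited studies. It assumes a single predictor (each student's prior-year score), fits an ordinary least-squares line predicting the current score, and treats a teacher's average residual as that teacher's "value added." Operational models typically control for many more factors and span multiple years. The function name `value_added` and all data are hypothetical.

```python
from collections import defaultdict
from statistics import mean


def value_added(records):
    """Estimate a toy value-added score per teacher.

    records: list of (teacher, prior_score, current_score) tuples.
    Returns each teacher's mean residual from a one-predictor
    least-squares fit of current score on prior score.
    """
    priors = [p for _, p, _ in records]
    currents = [c for _, _, c in records]
    p_bar, c_bar = mean(priors), mean(currents)
    # Ordinary least-squares slope and intercept across all students
    slope = (sum((p - p_bar) * (c - c_bar) for p, c in zip(priors, currents))
             / sum((p - p_bar) ** 2 for p in priors))
    intercept = c_bar - slope * p_bar
    residuals = defaultdict(list)
    for teacher, prior, current in records:
        predicted = intercept + slope * prior
        # Positive residual: the student scored above the prediction.
        residuals[teacher].append(current - predicted)
    return {teacher: mean(r) for teacher, r in residuals.items()}


# Hypothetical data: (teacher, prior-year score, current-year score)
records = [
    ("A", 50, 60), ("A", 70, 80),
    ("B", 50, 50), ("B", 70, 70),
]
print(value_added(records))
```

In this toy data set, students of teacher A score 5 points above what their prior scores predict and students of teacher B score 5 points below, so the estimates are +5 and -5. The sketch also shows why the estimate is relative: each teacher is compared with the average pattern across all teachers in the data.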
Because they are removed from the immediate classroom experience and seem disconnected from what happens in classrooms, value-added measures are controversial (Goe et al., 2008). However, these measures do have reliability. A study by the Bill and Melinda Gates Foundation (2010) found that teachers whose students showed gains in one assessment were likely to show gains in related assessments that measured conceptual understanding. For example, a math teacher whose students scored high on the state math assessment was likely to have students who also demonstrated a deep knowledge of the core principles of math. The correlation between teacher value-added measures on state tests and deeper understanding was higher for math (0.54) than for reading (0.37). However, it is important to consider that teachers who produce strong value-added scores on state tests may also develop students’ overarching skills and depth of knowledge about the subject.
As a summative measure, value-added measures provide an overarching look at a teacher’s impact over time. Yet, as a formative tool, value-added measures do not provide information about what high-performing teachers do that makes a difference in student learning (Goe et al., 2008). While value-added models are useful for identifying trends that can be used to make system improvements, multiple reports have recommended against using them for individual personnel decisions (American Statistical Association, 2014; Darling-Hammond et al., 2012; Polikoff & Porter, 2014). Specifically, the American Statistical Association cautioned against using value-added measures because, among other reasons, they are based on only one measure (standardized test scores), and the models may not capture all the factors that contribute to the effect a teacher may have on student outcomes.
Continuum of Research and Impact on Student Outcomes
Teacher evaluation is an established practice directed by state and federal law. However, we do not know the exact or full impact of teacher evaluation practices on student outcomes (e.g., Stecher et al., 2018). Some research has attempted to connect the practice of teacher evaluation with changes in student outcomes. In three notable large-scale studies, teacher evaluation was the practice of assessing teachers using a valid and reliable tool and providing feedback. These studies produced mixed results on student or school-level outcomes.
A quasi-experimental study of mid-career elementary and middle school teachers in the Cincinnati Public Schools Teacher Evaluation System (TES) examined teachers before, during, and after a year-long evaluation. The 105 teachers involved in the study taught fourth- through eighth-grade math. Evaluations, which consisted of multiple structured classroom observations by trained peers and administrators, were conducted between the 2003–2004 and 2009–2010 school years. The observations used a rubric based on the Danielson Framework for Teaching (Danielson, 1996, 2007). Student achievement was compared before, during, and after the teacher’s evaluation year. Teachers were more effective in advancing student achievement in math the year they were evaluated and the years afterward. Specifically, a student who was taught by a teacher who had been through TES scored 11% of a standard deviation (4.5 percentile points for a median student) higher in math compared with a student taught by the same teacher before the evaluation. The study did not identify what about teacher practice accounted for the difference in student achievement. This study supports the use of teacher evaluation to encourage continued growth in mid-career teachers’ performance and a connection to student achievement. Also, performance improvement was greatest for teachers who were weakest at the start of the evaluation (those who received low initial scores or who were ineffective in improving student test scores the year prior to evaluation). Teacher evaluation was a way for the teachers who needed the most support, those who scored the lowest on initial evaluations and likely received the most critical feedback, to receive development (Taylor & Tyler, 2012a, 2012b).
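The link between an effect of 11% of a standard deviation and a gain of roughly 4.5 percentile points for a median student can be checked with a short calculation. The sketch below assumes test scores are normally distributed, an assumption made here for illustration and not necessarily how the study's authors did the conversion; the variable names are hypothetical.

```python
from statistics import NormalDist

effect_size_sd = 0.11  # gain reported in the text, in standard deviation units
baseline = 0.50        # a median student sits at the 50th percentile

# Assumption: scores follow a standard normal distribution.
standard_normal = NormalDist()
baseline_z = standard_normal.inv_cdf(baseline)
new_percentile = standard_normal.cdf(baseline_z + effect_size_sd)

gain_points = (new_percentile - baseline) * 100
print(f"{gain_points:.1f} percentile points")
```

Under this assumption the gain comes out at about 4.4 percentile points, in line with the roughly 4.5 points reported; the small difference may reflect rounding or the actual shape of the score distribution. The same shift would move a student who starts far from the median by fewer percentile points, which is why the figure is stated for a median student.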
In another large-scale study, the Chicago Public Schools’ Excellence in Teaching Project was a teacher evaluation program focused on increasing student learning through principal-teacher conversation. A pilot study included 44 elementary schools in 2008–2009 and an additional 48 schools in 2009–2010. Principals in the first cohort received a total of 50 hours of support across the school year, with training and development in the Danielson framework, best practices in teacher observation and evidence collection, coaching, and implementation. Principals who joined the project in the second year received significantly less support. This difference in support across the two cohorts may have impacted the results. Short-term positive effects on reading performance were found in high-achieving, low-poverty schools, and schools that were in the first cohort performed higher in reading and math than schools in the second cohort. This study suggests that teacher evaluation systems produce different effects at different schools, and that teacher observation can have an impact on school performance (Steinberg & Sartain, 2015).
The Gates Foundation has been extensively involved in teacher evaluation as it relates to student achievement outcomes (Barnum, 2018). In 2018, the Gates Foundation released a cumulative study that reflected its work in three districts (Stecher et al., 2018). The Intensive Partnerships for Effective Teaching initiative was focused on increasing student performance by improving teaching effectiveness. The project started in 2009–2010 in three school districts (Hillsborough County Public Schools in Florida, Memphis City Schools, and Pittsburgh Public Schools) and four charter management organizations. Across multiple years, teaching effectiveness measures collected using a rubric were used to improve staffing, identify areas of development, strengthen professional development, and structure teacher advancement and compensation. The researchers hypothesized that with a strong teaching effectiveness evaluation system in place, teaching quality would increase and lead to greater academic outcomes for students in low-income, minority schools. The final report (Stecher et al., 2018) noted that school sites had implemented the teacher effectiveness practices (evaluation using an observation rubric and subsequent decision-making), but the advancement in student achievement or graduation rates was not realized, particularly for low-income minority students. At the end of the project (2014–2015), student achievement, access to effective teaching, and graduation rates in sites that had participated in the initiative did not differ from those in sites that had not participated. The reason why there was no difference was unclear, although the researchers hypothesized that a focus exclusively on teacher effectiveness may not be enough to improve student outcomes and that other factors may need to be addressed to produce dramatic improvements in student outcomes.
Teacher evaluation is a best practice that can be used to inform decisions when implemented with transparent processes and strong measures. The process of teacher evaluation produces some change in teacher practice that can impact student outcomes during and after the evaluation period (Taylor & Tyler, 2012a, 2012b). However, teacher evaluation may have different impacts on schools with varying demographics and baseline achievement levels (Steinberg & Sartain, 2015). Finally, formative evaluation can provide clear, objective feedback and a structure for collecting and using data to show teachers how they are changing performance, and, in that way, serve as professional development to support low-performing teachers (Taylor & Tyler, 2012a, 2012b).
Cost-Benefit of Teacher Evaluation.
The cost-benefit of teacher evaluation encompasses many considerations including student learning outcomes, information gathered, and the ability to make decisions with the information (Peterson, 2000). It is likely that the benefits and costs will be specific to a school or district.
For example, one study of the cost to start a teacher evaluation system across three districts found that it ranged from $8 to $115 per student, which equated to between 0.4% and 0.5% of total district spending, and between 1% and 1.3% of teacher compensation (Chambers, Brodziak de los Reyes, & O’Neil, 2013). The researchers concluded that their figures did not reflect all potential costs and that the cost of actual implementation might be higher.
Currently, teacher evaluation is understood as a form of professional development. The goal is to establish a rigorous and fair system that can be used to make decisions related to hiring, firing, and promotion, and that can improve teacher practice and student learning (Bill and Melinda Gates Foundation, 2012). This is no easy task as evidenced by the mixed results for large-scale studies that have examined the impact of teacher evaluation on student achievement (Stecher et al., 2018; Steinberg & Sartain, 2015; Taylor & Tyler, 2012a, 2012b).
As a practice, teacher evaluation is an established way to gather information about how teachers are performing in the classroom and is already incorporated into the expectations and day-to-day work of school administrators. With current measures (e.g., the Danielson Framework for Teaching), it is possible to collect reliable and valid data related to teacher performance and use that data to design professional development targeted at teacher needs. With rigorous measures and quality implementation, teacher evaluation, especially formative evaluation, is a tool that, ideally, can be used to improve teacher quality over time.
American Statistical Association. (2014, April 8). ASA statement on using value-added models for educational assessment. Retrieved from https://www.scribd.com/document/217916454/ASA-VAM-Statement-1
Barnum, M. (2018, June 21). The Gates Foundation bet big on teacher evaluation. The report it commissioned explains how those efforts fell short. Chalkbeat. Retrieved from https://www.chalkbeat.org/posts/us/2018/06/21/the-gates-foundation-bet-big-on-teacher-evaluation-the-report-it-commissioned-explains-how-those-efforts-fell-short/
Bill and Melinda Gates Foundation. (2010). Learning about teaching: Initial findings from the measures of effective teaching project. Retrieved from https://docs.gatesfoundation.org/documents/preliminary-findings-research-paper.pdf
Bill and Melinda Gates Foundation. (2012). Gathering feedback on teaching: Combining high-quality observation with student surveys and achievement gains. Retrieved from http://k12education.gatesfoundation.org/resource/gathering-feedback-on-teaching-combining-high-quality-observations-with-student-surveys-and-achievement-gains-2/
Blanton, L. P., Sindelar, P. T., Correa, V., Harman, M., McDonnell, J., & Kuhel, K. (2003). Conceptions of beginning teacher quality: Models for conducting research (COPSSE Doc. No. RS-6). Gainesville, FL: Center on Personnel Studies in Special Education (COPSSE), University of Florida. Retrieved from http://copsse.education.ufl.edu//docs/RS-6/1/RS-6.pdf
Chambers, J., Brodziak de los Reyes, I., & O’Neil, C. (2013). How much are districts spending to implement teacher evaluation systems? Case studies of Hillsborough County Public Schools, Memphis City Schools, and Pittsburgh Public Schools. Santa Monica, CA: RAND Corporation. Retrieved from https://www.rand.org/content/dam/rand/pubs/working_papers/WR900/WR989/RAND_WR989.pdf
Danielson, C. (1996, 2007). Enhancing professional practice: A framework for teaching (1st and 2nd eds.). Alexandria, VA: ASCD.
Danielson, C. (2010). Evaluations that help teachers learn. Educational Leadership, 68(4), 35–39. Retrieved from http://www.ascd.org/publications/educational-leadership/dec10/vol68/num04/Evaluations-That-Help-Teachers-Learn.aspx
Darling-Hammond, L., Amrein-Beardsley, A., Haertel, E., & Rothstein, J. (2012). Evaluating teacher evaluation: Popular modes of evaluating teachers are fraught with inaccuracies and inconsistencies, but the field has identified better approaches. Phi Delta Kappan, 93(6), 8–15. Retrieved from https://www.edweek.org/ew/articles/2012/03/01/kappan_hammond.html
Goe, L., Bell, C., & Little, O. (2008). Approaches to evaluating teacher effectiveness: A research synthesis. Washington, DC: National Comprehensive Center for Teacher Quality. Retrieved from https://eric.ed.gov/?id=ED521228
Goe, L., Holdheide, L., & Miller, T. (2011). A practical guide to designing comprehensive teacher evaluation systems: A tool to assist in the development of teacher evaluation systems. Washington, DC: National Comprehensive Center for Teacher Quality. Retrieved from https://files.eric.ed.gov/fulltext/ED520828.pdf
Hunter, M. (1984). Knowing, teaching, and supervising. In P. Hosford (Ed.), Using what we know about teaching (pp. 169–192). Alexandria, VA: ASCD.
Johnson, E. G., & Zwick, R. (1990). Focusing the new design: The NAEP 1988 technical report. Journal of Educational and Behavioral Studies, 17, 95–109.
Kane, T. J., & Staiger, D. O. (2012). Gathering feedback for teaching: Combining high-quality observations with student surveys and achievement gains. Seattle, WA: Bill and Melinda Gates Foundation.
Kane, T. J., Taylor, E. S., Tyler, J. H., & Wooten, A. L. (2011). Identifying effective classroom practices using achievement data. Journal of Human Resources, 46(3), 587–613.
Kettler, R. J., & Reddy, L. A. (2017). Using observational assessment to inform professional development decisions: Alternative scoring for the Danielson Framework for Teaching. Assessment for Effective Intervention, 1–12.
Lash, A., Tran, L., & Huang, M. (2016). Examining the validity of ratings from a classroom observation instrument for use in a district’s teacher evaluation system (REL 2016-135). Washington, DC: U.S. Department of Education, Institute of Education Sciences, National Center for Education Evaluation and Regional Assistance, Regional Educational Laboratory West.
Marzano, R. J. (2012). Teacher Evaluation: What’s fair? What’s effective? The two purposes of teacher evaluation. Educational Leadership, 70(3), 14–19. Alexandria, VA: ASCD. Retrieved from http://www.ascd.org/publications/educational-leadership/nov12/vol70/num03/The-Two-Purposes-of-Teacher-Evaluation.aspx
Marzano, R., Frontier, T., & Livingston, D. (2011). Effective supervision: Supporting the art and science of teaching. Alexandria, VA: ASCD.
Nye, B., Konstantopoulos, S., & Hedges, L. V. (2004). How large are teacher effects? Educational Evaluation and Policy Analysis, 26(3), 237–257.
Peterson, K. D. (2000). Teacher evaluation: A comprehensive guide to new directions and practices (2nd ed.). Thousand Oaks, CA: Corwin Press.
Polikoff, M. S., & Porter, A. C. (2014). Instructional alignment as a measure of teacher quality. Educational Evaluation and Policy Analysis, 64(3), 212–225. Retrieved from http://www.aera.net/Newsroom/Recent-AERA-Research/Instructional-Alignment-as-a-Measure-of-Teaching-Quality
Sanders, W. L., Wright, S. P., & Horn, S. P. (1997). Teacher and classroom context effects on student achievement: Implications for teacher evaluation. Journal of Personnel Evaluation and Education, 11(1), 57–67.
Santiago, P., & Benavides, F. (2009). Teacher evaluation: A conceptual framework and examples of country practices. Organisation for Economic Cooperation and Development (OECD). Retrieved from http://www.oecd.org/education/school/44568106.pdf
Sawchuk, S. (2015, September 3). Teacher Evaluation: An issue overview. Education Week. Retrieved from www.edweek.org/ew/section/multimedia/teacher-performance-evaluation-issue-overview.html
Sayavedra, M. (2014). Teacher evaluation. ORTESOL Journal, 31, 1–9.
Stecher, B. M., Holtzman, D. J., Garet, M. S., Hamilton, L. S., Engberg, J., Steiner, E. D., … Chambers, J. (2018). Improving teaching effectiveness: Final report: The intensive partnerships for effective teaching through 2015–2016. Santa Monica, CA: RAND Corporation. Retrieved from https://www.rand.org/pubs/research_reports/RR2242.html
Steinberg, M. P., & Sartain, L. (2015). Does teacher evaluation improve school performance? Experimental evidence from Chicago’s Excellence in Teaching project. Education Finance and Policy, 10(4), 535–572.
Taylor, E. S., & Tyler, J. H. (2012a). Can teacher evaluation improve teaching? Evidence of systematic growth in the effectiveness of midcareer teachers. Education Next, 12(4). Retrieved from http://educationnext.org/can-teacher-evaluation-improve-teaching/
Taylor, E. S., & Tyler, J. H. (2012b). The effect of evaluation on teacher performance. American Economic Review, 102(7), 3628–3651.
Toch, T., & Rothman, R. (2008). Rush to judgment: Teacher evaluation in public education. Washington, DC: Education Sector. Retrieved from https://eric.ed.gov/?id=ED502120
Weisberg, D., Sexton, S., Mulhern, J., & Keeling, D. (2009). The widget effect: Our national failure to acknowledge and act on differences in teacher effectiveness. New York, NY: The New Teacher Project. Retrieved from https://tntp.org/publications/view/the-widget-effect-failure-to-act-on-differences-in-teacher-effectiveness
Wenglinsky, H. (2002). The link between teacher classroom practices and student academic performance. Education Policy Analysis Archives, 10(12).
Wise, A. E., Darling-Hammond, L., Tyson-Bernstein, H., & McLaughlin, M. W. (1984). Teacher evaluation: A study of effective practices. Santa Monica, CA: RAND Corporation. Retrieved from https://www.rand.org/pubs/reports/R3139.html