Formal Evaluation

Performance appraisals must begin by identifying the purpose of the evaluation. Two different goals are frequently ascribed to performance appraisals: to measure teacher competence and to help improve performance. These two goals are often in conflict. An appraisal that effectively promotes professional development requires trust between the evaluator and the teacher being mentored. Trust is very challenging to achieve when an appraisal is used as a high-stakes measure that can determine future employment, compensation, or advancement. Research suggests that schools are better served by severing the link between these conflicting but necessary objectives. Separating the two goals increases the likelihood that both can be achieved. Annual performance appraisals are summative; they are snapshots that inform both principal and teacher of satisfactory performance and inform the teacher of any need to improve. For continuous improvement, however, formative assessment is required. Research suggests that effective formative assessment should include systematic classroom observation, student achievement gains, and student and peer performance surveys.

Overview: Formal Teacher Evaluation

Cleaver, S., Detrich, R. & States, J. (2018). Overview of Teacher Formal Evaluation. Oakland, CA: The Wing Institute. https://www.winginstitute.org/teacher-evaluation-formal.

Teacher evaluation has been a regular part of the teacher and principal experience since the early 20th century (Shinkfield & Stufflebeam, 1995). In recent years, teacher evaluation has gained focus as a way to improve teacher practice and student outcomes (Marzano, Frontier, & Livingston, 2011).

In general, teacher evaluation is the process of reviewing teacher performance in the classroom (Sawchuk, 2015). It is a combination of inputs such as teacher behaviors, outputs such as student test data, and methods of evaluation such as teacher observation rubrics (Goe, Bell, & Little, 2008).

Teacher evaluation can be formative or summative. Formative evaluation involves collecting information that can be used to shape instruction; it is often thought of as evaluation to be used for something, in this case improving teaching. Summative evaluation is the evaluation of teaching, often conducted at the end of a school year or other specified period. The information from summative evaluations is used to make decisions about teacher promotion or retention and other modifications.

Formal teacher evaluation, a summative measure, has two goals: (a) to assess teacher competency across a time period (often a school year), and (b) to provide feedback on teacher practice. This feedback may be in the form of communication with the teacher, for example, in an end-of-year review, or it may be in the form of a pay benefit. Either way, the feedback is unlikely to be used in real time as it would be in formative evaluation.

The purpose of this overview is to provide information about the role of formal teacher evaluation, the research that examines the practice, and its impact on student outcomes.

Why Is Formal Teacher Evaluation Important?

A high-quality teacher evaluation system can provide important information about education, work to ensure teacher quality, create a common language around quality instruction, and provide a structure for accountability (Danielson, 2010). First and foremost, it is important because what teachers and principals do each day has a direct impact on students.

Teachers and Principals Matter

When it comes to student achievement, teachers matter (Chetty, Friedman, & Rockoff, 2011) and so do principals (Marzano, Waters, & McNulty, 2005). Examining the impact of factors on student outcomes, Marzano et al. (2005) calculated that 33% of student achievement can be attributed to teachers, and 25% to principals.

Evaluation Drives Good Teaching and Decision Making

For teacher evaluation to be effective, the methods, or how the evaluation is conducted, and purpose must be explicit (Darling-Hammond, Wise, & Pease, 1983). The process of achieving effective evaluation forces conversations and conclusions about what defines good teaching and expected student outcomes (Danielson, 2010). These conversations and the resulting alignment can help orient a teacher evaluation system around best practices and an understanding of quality education.

Once the teacher evaluation system is in place, the results from teacher evaluations drive decisions that range from retention and bonuses to professional development (McDougald, Griffith, Pennington, & Mead, 2016). Originally, advocates of teacher evaluation hoped that a strong teacher evaluation system would force the removal of ineffective teachers, but this has not been the case (Griffith & McDougald, 2016). However, evaluation continues to shape the broader conversation about what is important in education and how it impacts students (McDougald et al., 2016).

Background: Formal Teacher Evaluation

Starting in the 20th century, evaluating teachers became an increasingly important function of the principal’s role (Shinkfield & Stufflebeam, 1995). Throughout the 20th century, teacher evaluation was seen either as a way to decisively judge teachers as effective or ineffective, or as a way to support and shape teacher practice (Hazi & Arredondo Rucinski, 2009). For example, in the 1960s and 1970s, the emphasis shifted from judging teachers to supporting and improving their practice through clinical supervision models (Hazi & Arredondo Rucinski, 2009).

In the 21st century, collective bargaining agreements, legislation, and national reports (e.g., The Nation’s Report Card) have influenced the development of teacher evaluation (Shinkfield & Stufflebeam, 1995). The No Child Left Behind (NCLB) Act focused on highly qualified teachers and evaluation as a way to improve instruction. In response to NCLB, Hazi and Arredondo Rucinski (2009) summarized how states were implementing the teacher evaluation aspect of the law. In general, state education leaders took steps to define teacher quality and created indicators for teacher quality. The level of control that states placed on evaluations varied, but the four general types of teacher evaluation activity included:

Adopting National Governors Association strategies (e.g., training evaluators)
Engaging in increased oversight and involvement in local evaluation practices
Decreasing the frequency of evaluation of veteran teachers
Increasing data used in evaluation

In the late 2000s, teacher evaluation came under scrutiny when researchers such as Toch and Rothman (2008) indicated that the majority of teachers were rated “above average” and suggested that teacher evaluations did not correlate with student outcomes, meaning that teachers rated “above average” did not necessarily produce strong student outcomes. In addition, feedback provided by principals may not be helpful or even accurate, as many principals are not effective at identifying quality instruction (Fink & Markholt, 2011). These findings prompted increased scrutiny of how teachers are evaluated and how the information is used (Goe, Holdheide, & Miller, 2011).

Although teacher evaluation was embedded into state law and school district practice during NCLB, states adjusted benchmarks to align with the Common Core State Standards established in 2009 (Aragon, 2018). Then policy was adjusted again with the Every Student Succeeds Act (ESSA), passed in December 2015. Specifically, ESSA rolled back incentives for states and districts to change evaluation policies and altered funding streams (Pennington & Mead, 2016). Under ESSA, states are required to define teacher ineffectiveness but are not required to implement teacher evaluation systems. In response, in 2017, a total of 16 states enacted bills related to the purpose, design, authority over, or progress of teacher evaluation. These bills related to funding, the core focus of evaluation in the state, and the types of data used (Aragon, 2018). In addition, there has been a shift in the type of data incorporated into evaluations; the number of states requiring student growth measures in teacher evaluation decreased to 39 (Aragon, 2018).

In the current policy climate, districts and states continue to prioritize teacher evaluation, and relevant issues remain around the collection and use of teacher data.

Relevant Issues in Formal Teacher Evaluation

The core issues driving formal teacher evaluation are:

What should be measured?
How do we measure it?
What should principals and districts do with formal evaluation information once it has been collected?

What Should Be Measured?

The work that teachers do varies from instruction to engaging with parents, so an important consideration is the type of data that best reflects a teacher’s practice.

Student Assessment Data. One type of data incorporated into teacher evaluation is student test scores as a measure of achievement or mastery (RAND, 2012). Student test data can provide an objective measure linked to student achievement, in contrast to supervisor judgments, which can be subjective (Steele, Hamilton, & Stecher, 2010) or inaccurate (Fink & Markholt, 2011). When used, student test scores must (a) support valid and reliable inferences about how teachers contribute to student achievement and (b) attempt to include teachers who do not teach courses that are directly assessed (e.g., music, gym, and grade levels not included in state testing; Steele et al., 2010).

Alternative Measures. Alternatives to student test data, such as teacher evaluation and student learning objectives, can be used to evaluate teacher performance. These measures have benefits such as increased collaboration and perceived fairness, as well as drawbacks such as cost and implementation challenges (McCullough, English, Angus, & Gill, 2015). Alternative measures also require attention to validity (whether the measure is an accurate tool for capturing what it claims to capture) and reliability (how consistent the measure is when used multiple times by a variety of assessors) (McCullough et al., 2015). Still, evaluation systems that included alternative measures, particularly those that were able to identify student growth, demonstrated a wider range of teacher performance than systems without alternative measures (McCullough et al., 2015).
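The reliability concern above (consistency across assessors) is often checked by comparing the ratings two observers give to the same lessons. The following is a minimal sketch of one common statistic, Cohen's kappa, which adjusts raw agreement for the agreement expected by chance. The rating labels and the sample data are hypothetical and are not drawn from any study cited here.

```python
from collections import Counter


def percent_agreement(rater_a, rater_b):
    """Share of lessons on which the two raters gave the same rating."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("rating lists must be non-empty and equal in length")
    matches = sum(a == b for a, b in zip(rater_a, rater_b))
    return matches / len(rater_a)


def cohens_kappa(rater_a, rater_b):
    """Agreement between two raters, corrected for chance agreement."""
    observed = percent_agreement(rater_a, rater_b)
    n = len(rater_a)
    counts_a = Counter(rater_a)
    counts_b = Counter(rater_b)
    # Chance agreement: probability both raters pick the same label
    # if each rated independently at their own observed rates.
    labels = set(counts_a) | set(counts_b)
    expected = sum(counts_a[label] * counts_b[label] for label in labels) / (n * n)
    if expected == 1.0:
        # Both raters used a single identical label for every lesson.
        return 1.0
    return (observed - expected) / (1 - expected)


if __name__ == "__main__":
    # Hypothetical ratings of six lessons by two observers.
    observer_1 = ["basic", "proficient", "proficient",
                  "distinguished", "basic", "unsatisfactory"]
    observer_2 = ["basic", "proficient", "basic",
                  "distinguished", "basic", "proficient"]
    print(percent_agreement(observer_1, observer_2))
    print(cohens_kappa(observer_1, observer_2))
```

Kappa near 1 indicates that raters agree far more than chance would predict; values near 0 suggest the rubric or rater training needs attention before scores are used for decisions.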

How Do We Measure?

Tools for collecting and using data for teacher evaluations range from value-added statistical models to observation rubrics.

Value-Added Measures. These measures evaluate teachers based on their impact on student test scores, or the value the teacher adds to student achievement (Hanushek, 1971; Rockoff, 2004). Value-added models estimate a teacher’s impact on student test performance using statistical techniques. Recent research found that value-added measures are a good way to demonstrate how teachers can raise student test scores and are significantly correlated with some long-term effects such as the probability of attending college and higher earnings by age 28 (Chetty et al., 2011). Important questions related to value-added measures include:

Do the differences across teachers capture the impact of teachers or differences among students? That is, do value-added measures capture the right information?
What are the lasting impacts of being taught by a teacher with a high value-added score on student outcomes?
How valid are the measures used in value-added analysis? If invalid measures are used, then the analysis is untrustworthy.

Supporters of value-added measures argue that when decisions are made based on value-added measures, students benefit (Gordon, Kane, & Staiger, 2006; Hanushek, 2009). On the other hand, critics argue that value-added measures do not capture teacher quality (Baker et al., 2010; Corcoran, 2010) and that bias may limit the usefulness of value-added measures (Kane & Staiger, 2008; Rothstein, 2010). Finally, there are concerns about the stability and trustworthiness of value-added measures (David, 2010; Goldhaber & Hansen, 2008).
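To make the idea of a value-added estimate concrete, here is a deliberately simplified sketch: predict each student's current score from the prior-year score with a single pooled regression line, then average the prediction errors (residuals) for each teacher's students. This is an illustration only. The models used in the research cited above control for many more student and classroom variables, and all names and data here are hypothetical.

```python
from collections import defaultdict


def fit_line(x, y):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("prior scores must not all be identical")
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def value_added(records):
    """Mean residual per teacher.

    records: iterable of (teacher_id, prior_score, current_score).
    A positive value means the teacher's students scored higher than
    their prior scores predicted, on average.
    """
    records = list(records)
    prior = [r[1] for r in records]
    current = [r[2] for r in records]
    intercept, slope = fit_line(prior, current)
    residuals = defaultdict(list)
    for teacher, p, c in records:
        residuals[teacher].append(c - (intercept + slope * p))
    return {t: sum(v) / len(v) for t, v in residuals.items()}


if __name__ == "__main__":
    # Hypothetical data: two teachers, three students each.
    data = [
        ("A", 40, 55), ("A", 50, 65), ("A", 60, 75),
        ("B", 40, 45), ("B", 50, 55), ("B", 60, 65),
    ]
    print(value_added(data))
```

The first question in the list above shows up directly in this sketch: if students are not comparable across teachers in ways the prior score fails to capture, the residuals reflect those student differences as well as the teacher's contribution.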

Teacher Observations. Classroom observation provides a measure of what is happening during instruction and aligns individual classroom practice with broader quality teaching practices (Danielson, 2010; RAND, 2012). To be effective, observation frameworks must be subject-specific, must be created in collaboration with content experts, and must provide accurate and useful information (Hill & Grossman, 2013). The Danielson Framework for Teaching (FFT) is a commonly used framework for teacher evaluation (Danielson, 1996, 2007).

Danielson Framework for Teaching. FFT includes an extensive rubric over four domains: planning and preparation, classroom environment, instruction, and professional responsibilities. Across these four domains are 76 elements of teaching broken into four levels: unsatisfactory, basic, proficient, and distinguished. Over time and two iterations (1996 and 2007), FFT has become a widely used tool to capture teaching and learning (Marzano et al., 2011).

Research indicates that FFT has acceptable reliability and validity (Lash, Tran, & Huang, 2016). Specifically, there is a positive correlation between teacher evaluation scores with FFT and value-added measures at the classroom level; the range for average validity across three years is -0.6 to 0.35 (Milanowski, 2011). When there is variability in scores, it is attributable to the teacher and not other variables (Kane & Staiger, 2012; Kane, Taylor, Tyler, & Wooten, 2011). In short, FFT has proved to be a tool with established reliability and validity as a way to capture teacher practice.
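A validity coefficient of the kind reported above is typically a correlation between two sets of scores for the same teachers, for example observation ratings and value-added estimates. The sketch below computes a Pearson correlation from scratch. The function name and the sample scores are hypothetical and are not taken from the cited studies.

```python
import math


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two lists of equal length with at least 2 items")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    if sxx == 0 or syy == 0:
        raise ValueError("each list must contain some variation")
    return sxy / math.sqrt(sxx * syy)


if __name__ == "__main__":
    # Hypothetical scores for five teachers.
    observation_scores = [2.1, 2.8, 3.0, 3.4, 3.9]
    value_added_estimates = [-0.20, 0.05, -0.10, 0.15, 0.30]
    print(pearson_r(observation_scores, value_added_estimates))
```

The coefficient always falls between -1 and 1; a positive value means teachers rated higher by observers also tend to have higher value-added estimates.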

What Should Principals and Districts Do With Formal Evaluation Results?

Districts have attempted to tie teacher evaluation results to job security and bonuses. For example, incentive programs have been implemented in districts ranging from Houston to Memphis with mixed results (Atkinson et al., 2008; Blumenthal, 2016; Springer et al., 2010). School leaders should be mindful of how teacher evaluation results are used. Studies have found that merit pay was not connected to improvements in student outcomes or instruction (Fryer, 2011). In addition, creating an atmosphere of competition by connecting evaluation results to sanctions and punishments had a negative effect on workers (Pink, 2011). Instead, school leaders should use results from formal teacher evaluations to bolster student learning by focusing on how teacher actions impact student results within each building or district, and incorporating teacher evaluation findings into a culture of collaboration (DuFour & Mattos, 2013).

Continuum of Research: Does Formal Teacher Evaluation Have a Positive Impact on Student Outcomes?

Currently, much of the research on teacher evaluation focuses on helping us understand what goes into and results from teacher evaluation; less research directly ties formal teacher evaluation to student outcomes. Studies that have addressed the specific impact of teacher evaluation systems on student outcomes have found mixed results.

For example, a study of the mid-career elementary and middle school teachers in the Cincinnati Public Schools Teacher Evaluation System (TES), which used FFT across seven consecutive years, found that teachers were more effective in advancing math achievement during the year in which they were evaluated. The study did not draw conclusions about what in teacher practice influenced the differences in student achievement (Taylor & Tyler, 2012a, 2012b).

The Gates Foundation studied the effects of teacher evaluation systems across three districts (Hillsborough County Public Schools in Florida, Memphis City Schools, and Pittsburgh Public Schools) and four charter management organizations. The information collected from teacher evaluation systems was used to make decisions about staffing, areas of development, and teacher advancement and compensation. The researchers hypothesized that when the right teacher evaluation system was in place, teaching quality would improve and lead to an increase in student achievement. The final report showed no impact on student achievement or graduation rates, particularly for low-income minority students. One possible explanation for this was teacher buy-in; across the sites, 50% of teachers agreed that the evaluation system would benefit students, a percentage that declined over the years of the initiative. Impacts on student achievement were mixed across the schools, perhaps because it takes longer than the time frame of the study to see a clear impact on student outcomes. Also, there were external changes (e.g., changes in state-level policy) that impacted the implementation (Stecher et al., 2018).

Implications of Research

Teacher evaluation is established as a function of education (Shinkfield & Stufflebeam, 1995). Recent studies have identified problems and offered recommendations for best practice in teacher evaluation.

The Widget Effect (Weisburg, Sexton, Mulhern, & Keeling, 2009) reported that teacher evaluation has been:

Infrequent, with teachers going for years without meaningful feedback
Not focused on classroom behaviors or practices that are directly tied to student learning
Limited in scope, with teachers identified as “unsatisfactory” or “satisfactory”
Unhelpful in the type of information provided to teachers
Insignificant, providing information that is not used to shape teachers’ work experience or opportunities

In response, the New Teacher Project (2010) proposed that teacher evaluations should:

Occur annually, to provide feedback over the course of a teacher’s career
Be conducted using clear, rigorous performance expectations based on student learning
Include multiple measures (e.g., value-added models, classroom observation data) and ratings to ensure that the range of a teacher’s work is represented
Provide a range of achievement levels, such as the four summative ratings of the Danielson Framework for Teaching
Provide information that can be incorporated into ongoing conversation and development throughout the year
Produce information relevant to teachers and with implications, both positive and negative, for the overall development of the system and individual classrooms

Best practices for using student achievement data in teacher evaluation are also being established. Steele, Hamilton, and Stecher (2010) determined that evaluation systems should:

Incorporate multiple measures of teacher effectiveness to increase validity, reduce measurement error, and capture the range of teaching roles beyond the ones regularly tested
Ensure that assessment data is reliable and valid, particularly in high-stakes contexts
Ensure consistency by providing clear parameters for the selection of measures and using the same measures across classrooms
Use multiple years of student data for value-added estimates to increase accuracy and precision of the estimates
Find ways to incorporate all students, including those who are not with the teacher for the full year, and teachers who are not easily incorporated into value-added models.
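Two of the recommendations above, combining multiple measures and using multiple years of value-added data, can be sketched in a few lines. The example assumes every measure has already been placed on a common scale (for instance, standardized scores). The measure names and weights are hypothetical choices for illustration, not values recommended by the cited authors.

```python
import math


def multi_year_average(yearly_estimates):
    """Average a teacher's value-added estimates across years.

    Averaging over several years dampens year-to-year noise in any
    single estimate.
    """
    if not yearly_estimates:
        raise ValueError("at least one yearly estimate is required")
    return sum(yearly_estimates) / len(yearly_estimates)


def composite_score(measures, weights):
    """Weighted combination of several measures on a common scale."""
    if set(measures) != set(weights):
        raise ValueError("measures and weights must use the same names")
    if not math.isclose(sum(weights.values()), 1.0):
        raise ValueError("weights must sum to 1")
    return sum(measures[name] * weights[name] for name in weights)


if __name__ == "__main__":
    # Hypothetical standardized scores for one teacher.
    value_added = multi_year_average([0.3, -0.1, 0.1])
    measures = {
        "observation": 0.5,
        "value_added": value_added,
        "student_survey": 0.2,
    }
    # Hypothetical weights; a district would set its own.
    weights = {"observation": 0.5, "value_added": 0.3, "student_survey": 0.2}
    print(composite_score(measures, weights))
```

Applying the same measure names and weights to every classroom is one simple way to honor the consistency recommendation in the list above.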

Cost-Benefit

The cost-benefit of formal teacher evaluation will vary from district to district, depending on such considerations as type of information gathered, tools used, and staffing involved (Peterson, 2000). One study found that the cost to start a teacher evaluation system across three districts ranged from $8 to $115 per student, which amounted to 0.4% to 0.5% of total district spending (Chambers, Brodziak de los Reyes, & O’Neill, 2013).
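The two figures quoted above, cost per student and share of total district spending, are simple ratios. A district weighing its own start-up costs could compute them as in the sketch below. The enrollment and budget numbers are hypothetical and are not the data from the cited case studies.

```python
def cost_per_student(evaluation_cost, enrollment):
    """Start-up cost of the evaluation system divided by enrollment."""
    if enrollment <= 0:
        raise ValueError("enrollment must be positive")
    return evaluation_cost / enrollment


def share_of_spending(evaluation_cost, total_spending):
    """Evaluation cost as a fraction of total district spending."""
    if total_spending <= 0:
        raise ValueError("total spending must be positive")
    return evaluation_cost / total_spending


if __name__ == "__main__":
    # Hypothetical district: 100,000 students, $200 million budget,
    # $800,000 spent to start the evaluation system.
    cost = 800_000
    print(cost_per_student(cost, 100_000))
    print(f"{share_of_spending(cost, 200_000_000):.1%}")
```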

Conclusion

Formal teacher evaluation is integrated into many state and district policies, and, even with shifts in federal focus under ESSA, is likely to remain common practice. The goal of formal teacher evaluation is to collect data that accurately represents teacher practice and the connection to student achievement in a valid and reliable way, and use that information to improve the system for teaching and learning. Although conclusions about the impact of teacher evaluation on student achievement are mixed (Stecher et al., 2018; Taylor & Tyler, 2012a, 2012b), ideally collecting and using information about teacher practice can advance the conversation about quality instruction and teaching potential.

Citations

Aragon, S. (2018). Teacher evaluations: What is the issue and why does it matter? Policy snapshot. Denver, CO: Education Commission of the States. Retrieved from https://www.ecs.org/wp-content/uploads/Teacher_Evaluations.pdf

Atkinson, A., Burgess, S., Croxon, B., Gregg, P., Propper, C., Slater, H., & Wilson, D. (2008). Evaluating the impact of performance-related pay for teachers in England. Labour Economics, 16(3), 251–261. doi.org/10.1016/j.labeco.2008.10.003

Baker, E. L., Barton, P. E., Darling-Hammond, L., Haertel, E., Ladd, H. F., Linn, R. L., Ravitch, D., … Shepard, L. A. (2010). Problems with the use of student test scores to evaluate teachers (Briefing Paper 278). Washington, DC: Economic Policy Institute.

Blumenthal, R. (2016, January 13). Houston ties teachers’ pay to test scores. The New York Times. Retrieved from https://www.nytimes.com/2006/01/13/us/houston-ties-teachers-pay-to-test-scores.html

Chambers, J., Brodziak de los Reyes, I., & O’Neil, C. (2013). How much are districts spending to implement teacher evaluation systems: Case studies of Hillsborough County Public Schools, Memphis City Schools, and Pittsburgh Public Schools. Santa Monica, CA: RAND Corporation. Retrieved from www.rand.org/content/dam/rand/pubs/working_papers/WR900/WR989/RAND_WR989.pdf

Chetty, R., Friedman, J. N., & Rockoff, J. E. (2011). The long-term impacts of teachers: Teacher value-added and student outcomes in adulthood (Working Paper 17699). Cambridge, MA: National Bureau of Economic Research. Retrieved from https://standardizedtests.procon.org/sourcefiles/the-long-term-impacts-of-teachers-teacher-value-added-and-student-outcomes-in-adulthood.pdf

Corcoran, S. P. (2010). Can teachers be evaluated by their students’ test scores? Should they be? The use of value-added measures for teacher effectiveness in policy and practice. Providence, RI: Annenberg Institute for School Reform at Brown University.

Danielson, C. (1996, 2007). Enhancing professional practice: A framework for teaching (1st and 2nd eds.). Alexandria, VA: ASCD.

Danielson, C. (2010). Evaluations that help teachers learn. Educational Leadership, 68(4), 35–39.

Darling-Hammond, L., Wise, A. E., & Pease, S. R. (1983). Teacher evaluation in the organizational context: A review of the literature. Review of Educational Research, 53(3), 285–328. doi: 10.3102/00346543053003285

David, J. L. (2010). What research says about using value-added measures to evaluate teachers. Educational Leadership, 67(8), 81–82. Retrieved from http://www.ascd.org/publications/educational_leadership/may10/vol67/num08/Using_Value-Added_Measures_to_Evaluate_Teachers.aspx

DuFour, R., & Mattos, M. (2013). How do principals really improve schools? Educational Leadership, 70(7), 34–40.

Fink, S., & Markholt, A. (2011). Leading for instructional improvement: How successful leaders develop teaching and learning expertise. Hoboken, NJ: John Wiley & Sons.

Fryer, R. G. (2011). Teacher incentives and student achievement: Evidence from New York City schools (Working Paper 16850). Cambridge, MA: National Bureau of Economic Research. Retrieved from http://www.nber.org/papers/w16850.pdf

Goe, L., Bell, C., & Little, O. (2008). Approaches to evaluating teacher effectiveness: A research synthesis. Washington, DC: National Comprehensive Center for Teacher Quality. Retrieved from https://eric.ed.gov/?id=ED521228

Goe, L., Holdheide, L., & Miller, T. (2011). A practical guide to designing comprehensive teacher evaluation systems: A tool to assist in the development of teacher evaluation systems. Washington, DC: National Comprehensive Center for Teacher Quality. Retrieved from https://files.eric.ed.gov/fulltext/ED520828.pdf

Goldhaber, D., & Hansen, M. (2008). Is this just a bad class? Assessing the stability of measured teacher performance (Working Paper 2008-5). Seattle, WA: Center on Reinventing Public Education, University of Washington.

Gordon, R., Kane, T. J., & Staiger, D. O. (2006). Identifying effective teachers using performance on the job (Hamilton Project Discussion Paper). Washington, DC: The Brookings Institution.

Griffith, D., & McDougald, V. (2016). Undue process: Why bad teachers in twenty-five diverse districts rarely get fired. Washington, DC: Thomas B. Fordham Institute. Retrieved from https://edexcellence.net/publications/undue-process

Hanushek, E. A. (1971). Teacher characteristics and gains in student achievement: Estimation using micro-data. American Economic Review, 61(2), 280–288.

Hanushek, E. A. (2009). Teacher deselection. In D. Goldhaber & J. Hannaway (Eds.), Creating a new teacher profession (pp. 165–180). Washington, DC: Urban Institute Press.

Hazi, H. M., & Arredondo Rucinski, D. (2009). Teacher evaluation as a policy target for improved student learning: A fifty-state review of statute and regulatory action since NCLB. Education Policy Analysis Archives, 17(5).

Hill, H., & Grossman, P. (2013). Learning from teacher observations: Challenges and opportunities posed by new teacher evaluation systems. Harvard Educational Review, 83(2), 371–384. doi.org/10.17763/haer.83.2.d11511403715u376

Kane, T. J., & Staiger, D. O. (2008). Estimating teacher impacts on student achievement: An experimental evaluation (Working Paper 14607). Cambridge, MA: National Bureau of Economic Research.

Kane, T. J., & Staiger, D. O. (2012). Gathering feedback for teaching: Combining high-quality observations with student surveys and achievement gains. Seattle, WA: Bill and Melinda Gates Foundation.

Kane, T. J., Taylor, E. S., Tyler, J. H., & Wooten, A. L. (2011). Identifying effective classroom practices using achievement data. Journal of Human Resources, 46(3), 587–613.

Lash, A., Tran, L., & Huang, M. (2016). Examining the validity of ratings from a classroom observation instrument for use in a district’s teacher evaluation system (REL 2016-135). Washington, DC: U.S. Department of Education, Institute of Education Sciences, National Center for Education Evaluation and Regional Assistance, Regional Educational Laboratory West.

Marzano, R. J., Frontier, T., & Livingston, D. (2011). Effective supervision: Supporting the art and science of teaching. Alexandria, VA: ASCD.

Marzano, R. J., Waters, T., & McNulty, B. A. (2005). School leadership that works: From research to results. Alexandria, VA: ASCD.

McCullough, M., English, B., Angus, M. H., & Gill, B. (2015). Alternative student growth measures for teacher evaluation: Implementation experiences of early-adopting districts (REL 2015-093). Washington, DC: U.S. Department of Education, Institute of Education Sciences, National Center for Education Evaluation and Regional Assistance, Regional Educational Laboratory Mid-Atlantic.

McDougald, V., Griffith, D., Pennington, K., & Mead, S. (2016). What is the purpose of teacher evaluation today? A conversation between Bellwether and Fordham. Retrieved from https://edexcellence.net/articles/what-is-the-purpose-of-teacher-evaluation-today-a-conversation-between-bellwether-and

Milanowski, A. T. (2011, April). Validity research on teacher evaluation systems based on the framework for teaching. Paper presented at the annual meeting of the American Educational Research Association, New Orleans, LA. Retrieved from https://files.eric.ed.gov/fulltext/ED520519.pdf

The New Teacher Project. (2010). Teacher Evaluation 2.0. New York, NY: Author. Retrieved from https://tntp.org/assets/documents/Teacher-Evaluation-Oct10F.pdf

Pennington, K., & Mead, S. (2016). For good measure? Teacher evaluation policy in the ESSA era. Washington, DC: Bellwether Education Partners. Retrieved from https://bellwethereducation.org/publication/good-measure-teacher-evaluation-policy-essa-era

Peterson, K. D. (2000). Teacher evaluation: A comprehensive guide to new directions and practices (2nd ed.). Thousand Oaks, CA: Corwin Press.

Pink, D. H. (2011). Drive: The surprising truth about what motivates us. New York, NY: Riverhead Books.

RAND Education. (2012). Teachers matter: Understanding teachers’ impact on student achievement. Santa Monica, CA: Author. Retrieved from https://www.rand.org/pubs/corporate_pubs/CP693z1-2012-09.html

Rockoff, J. E. (2004). The impact of individual teachers on student achievement: Evidence from panel data. American Economic Review, 94(2), 247–252.

Rothstein, J. (2010). Teacher quality in educational production: Tracking, decay, and student achievement. Quarterly Journal of Economics, 125(1), 175–214.

Sawchuk, S. (2015, September 3). Teacher evaluation: An issue overview. Education Week. Retrieved from www.edweek.org/ew/section/multimedia/teacher-performance-evaluation-issue-overview.html

Shinkfield, A. J., & Stufflebeam, D. L. (1995). Teacher evaluation: Guide to professional practice. New York, NY: Springer.

Springer, M. G., Ballou, D., Hamilton, L., Le, V., Lockwood, J. R., McCaffrey, D. F., … Stecher, B. M. (2010). Teacher pay for performance: Experimental evidence from the project on incentives in teaching (POINT). Nashville, TN: National Center on Performance Incentives at Vanderbilt University.

Stecher, B. M., Holtzman, D. J., Garet, M. S., Hamilton, L. S., Engberg, J., Steiner, E. D., … Chambers, J. (2018). Improving teaching effectiveness: Final report: The intensive partnerships for effective teaching through 2015–2016. Santa Monica, CA: RAND Corporation.

Steele, J. L., Hamilton, L. S., & Stecher, B. M. (2010). Incorporating student performance measures into teacher evaluation systems. Santa Monica, CA: RAND Corporation. Retrieved from https://www.rand.org/pubs/technical_reports/TR917.html

Taylor, E. S., & Tyler, J. H. (2012a). Can teacher evaluation improve teaching? Evidence of systematic growth in the effectiveness of mid-career teachers. Education Next, 12(4), 79–84. Retrieved from http://educationnext.org/can-teacher-evaluation-improve-teaching/

Taylor, E. S., & Tyler, J. H. (2012b). The effect of evaluation on teacher performance. American Economic Review, 102(7), 3628–3651.

Toch, T., & Rothman, R. (2008). Rush to judgment: Teacher evaluation in public education. Washington, DC: Education Sector. Retrieved from https://eric.ed.gov/?id=ED502120

Weisburg, D., Sexton, S., Mulhern, J., & Keeling, D. (2009). The widget effect: Our national failure to acknowledge and act on the difference in teacher effectiveness. New York, NY: The New Teacher Project. Retrieved from https://tntp.org/assets/documents/TheWidgetEffect_execsummary_2nd_ed.pdf

Publications

Title: Seeking the Magic Metric: Using Evidence to Identify and Track School System Progress

Synopsis: This paper discusses the search for a “magic metric” in education: an index or number that would be generally accepted as the most efficient descriptor of a school’s performance in a district.

Citation: Celio, M. B. (2013). Seeking the Magic Metric: Using Evidence to Identify and Track School System Quality. In Performance Feedback: Using Data to Improve Educator Performance (Vol. 3, pp. 97-118). Oakland, CA: The Wing Institute.