Educators are always looking for instructional methods that are effective and efficient. Effective interventions can vary with respect to how rapidly content is learned; efficient methods result in rapid learning of content. Part of determining learning is establishing a mastery criterion (e.g., 90% correct over a specified number of days). The most common method for determining mastery is to establish a mastery criterion for a set of instructional content (e.g., sight words, math facts). Mastery is assumed when the percent correct score for the set is at or above the mastery level (e.g., 90% correct). This approach may obscure the fact that some items in the set have not been mastered even though the aggregate score is at mastery. Another way to determine mastery is to calculate it at the level of the individual item (e.g., individual sight words). Once an item is mastered, it is removed from the list, and a new item is added. The question is which approach results in greater learning. A recent study by Wong and Fienup (2022) was designed to answer this question, at least for sight words. Their results suggest that the individual item approach resulted in greater acquisition and required less time to achieve mastery of an item. An additional analysis in this small study compared the retention of items four weeks following the end of teaching. There were very small differences between the two approaches to instruction. For one participant, maintenance was 100% for both approaches. For a second participant, the individual item approach resulted in better maintenance scores. For the third participant, the set approach produced a slightly higher maintenance score.
The results of this study are important in that they suggest that the commonly used set approach is less efficient at producing initial acquisition and has no advantage with respect to the maintenance of mastered items. Implementing the individual item approach could be relatively simple. The only real change would be to analyze responding at the level of the individual item rather than aggregating data at the set level. Under the set approach, as the student progresses through additional set lists and more difficult items are added, the student’s failure to have mastered all of the earlier content may lead to more errors and failure experiences. If we can accelerate learning by making mastery decisions at the individual item level, consider how much more can be learned over the course of a school year. These simple changes may result in great benefit to students.
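The difference between the two units of analysis can be sketched in a few lines of Python. This is only an illustration, not the procedure used by Wong and Fienup: the word list, the trial data, and the function names are hypothetical, and the sketch ignores the "over a specified number of days" part of a mastery criterion by scoring a single block of trials.

```python
from typing import Dict, List

MASTERY_LEVEL = 0.90  # the 90% example criterion mentioned in the text


def percent_correct(trials: List[bool]) -> float:
    """Proportion of correct responses in a list of trials."""
    return sum(trials) / len(trials)


def set_level_mastery(responses: Dict[str, List[bool]],
                      level: float = MASTERY_LEVEL) -> bool:
    """Set approach: pool every trial for every item and score the aggregate."""
    all_trials = [t for trials in responses.values() for t in trials]
    return percent_correct(all_trials) >= level


def item_level_mastery(responses: Dict[str, List[bool]],
                       level: float = MASTERY_LEVEL) -> Dict[str, bool]:
    """Individual item approach: score each item against the criterion."""
    return {item: percent_correct(trials) >= level
            for item, trials in responses.items()}


# Hypothetical data: five sight words, ten trials each.
responses = {
    "the": [True] * 10,
    "and": [True] * 10,
    "said": [True] * 10,
    "was": [True] * 10,
    "where": [True] * 5 + [False] * 5,  # only 50% correct
}

set_mastered = set_level_mastery(responses)      # 45 of 50 correct = 90%
item_mastered = item_level_mastery(responses)
not_mastered = [w for w, ok in item_mastered.items() if not ok]

print("Set at mastery:", set_mastered)
print("Items not yet mastered:", not_mastered)
```

With these made-up data the set reaches the 90% criterion even though one word is read correctly only half the time, which is the kind of masking the set approach can produce.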
Wong, K. K., & Fienup, D. M. (2022). Units of analysis in acquisition‐performance criteria for “mastery”: A systematic replication. Journal of Applied Behavior Analysis.
Teacher evaluation is ubiquitous in US public schools. Typically, it involves a principal observing a teacher several times over the course of a school year. In an effort to standardize ratings, a scoring rubric is followed; however, the ratings are ultimately subjective, and the items on the rubric are subject to interpretation. One of the primary functions of teacher evaluations is to provide accurate feedback to teachers and encourage improvement when needed. A persistent question regarding teacher evaluation is whether evaluation scores are inflated. There is some research suggesting this is the case; however, little is known about the reasons for inflating the scores. A recent study by Jones, Bergin, and Murphy (2022) attempted to determine if principals inflated scores and, if so, their motivation for doing so. In this mixed methods study, which used both focus groups and a survey of a large group of principals, principals identified several goals in addition to providing accurate ratings. Those additional goals were to (1) keep teachers open to growth-promoting feedback, (2) support teachers’ morale and foster positive relationships, (3) avoid difficult conversations, (4) maintain self-efficacy as an instructional leader, and (5) manage limited time wisely. These additional goals were offered as reasons to inflate scores, even if by small amounts. For the most part, these are worthy goals and suggest that teacher evaluation is more complicated than simply applying a scoring rubric while observing a teacher.
In general, principals are more likely to inflate ratings if they are linked to high-stakes outcomes such as requiring an improvement plan for the teacher or making retention decisions. Principals are reluctant to give lower ratings if it results in them having to engage in activities that require more time, such as additional meetings to develop improvement plans or to carefully document the reasons for recommending against retention. Also, by inflating ratings, principals avoid having difficult conversations with a teacher.
The principals’ worry was that if they gave a lower rating, teachers would become defensive and less open to feedback and growth. They also feared that low ratings would lower staff morale and harm positive relationships. These concerns are not without merit. On a rating scale that ranges from 1-7, a rating of 4 is considered a low rating by the teacher, but a 5 is considered acceptable. The difference of one point is considered small by the principal. Since there is room for judgment in the scoring rubric, giving a more positive rating will do no harm from the principal’s perspective.
Based on the research by Jones, Bergin, and Murphy (2022), these situations are relatively common. Overlooked in the principals’ perspective is the impact these decisions have on students. It is unknown what effect these decisions are having on student outcomes. For a complete understanding of teacher evaluation, it is important to understand all of its effects, including those on students.
Citation for article:
Jones, E., Bergin, C., & Murphy, B. (2022). Principals may inflate teacher evaluation scores to achieve important goals. Educational Assessment, Evaluation, and Accountability, 34(1), 57-88.
Teachers report that behavior management is one of the greatest challenges in the profession, and they feel unprepared to deal with difficult behavior. One of the questions to be answered is where teachers get information about behavior management. Recently, Beahm, Yan, and Cook (2021) conducted a mixed methods study to answer this question. It is important that teachers rely on practices that have a good empirical base. Practices without such a base may have no effect or make the problem worse. If we understand the resources teachers rely on and why, then more systematic, informed approaches can be taken to assure they are relying on credible information. This may help us close the research-to-practice gap. Beahm et al. surveyed 238 teachers to learn about the resources they relied on for behavior management information. They also conducted focus groups with 10 of the teachers to gain insight into why they preferred some resources more than others. Teachers preferred getting information from colleagues by a large margin (91%) relative to any other source, including research articles, the internet, administrators, and professional development. Ninety-two percent reported the information from colleagues was understandable. Teachers had a positive perception of all attributes of the information from colleagues (trustworthiness, usability, accessibility, and understandability). Participants in the focus group reported that colleagues were understandable because they used familiar language and avoided jargon. In addition, colleagues were perceived to provide exact details on implementing the recommended practice.
Participants in the focus group indicated colleagues were more trustworthy because they were going to only describe practices they had used successfully. The participants also thought that colleagues had knowledge of their classrooms and students.
Finally, colleagues were perceived as providing information that was usable because they likely had developed easy-to-use forms and data collection systems. In other words, the colleagues were an efficient source of information, saving the classroom teacher from the extra work of developing forms and data sheets for themselves.
These data are consistent with the recommendations of Rogers (2003), who reported that practices were more likely to be adopted if they were recommended by a credible source. Colleagues use language that is already familiar and have in-depth knowledge of the circumstances that the teacher is concerned with.
Researchers will be well served to attend to these data if they want to close the research-to-practice gap. They should develop materials that rely on the language teachers already use, create step-by-step user guides, and provide video samples of the practice in actual application. Finally, researchers should recruit teachers to be champions for a research-based practice rather than relying on researchers to disseminate practices. This would represent a change in the way researchers go about doing business. It will be worth the effort because the research-to-practice gap has been persistent for decades. It is time we try new ways to disseminate effective practices.
Beahm, L. A., Yan, X., & Cook, B. G. (2021). Where Do Teachers Go for Behavior Management Strategies? Education and Treatment of Children, 44(3), 201-213.
References: Rogers, E. M. (2003). Diffusion of Innovations (5th ed.). New York: Free Press
The measurement of treatment integrity is important any time an intervention is implemented. The measurement of treatment integrity is complex when assessing it at the level of universal intervention for an entire school. Should we measure integrity at the level of the school or at the level of the individual classroom? When assessed at the level of the whole school, we know how the school, in general, is performing; however, this may obscure how well an individual classroom is implementing the universal intervention. Assessment at the level of the classroom is important for making decisions regarding an individual student’s need for more intensive interventions (Tier 2 or Tier 3). If the universal intervention has not been implemented well, then it is difficult to know if the student’s failure to perform is a function of poor implementation or if the student requires more intensive support.
Despite the importance of decision-making in multi-tiered systems of support, little is known about how integrity is measured. In a recent study by Buckman et al. (2021), treatment integrity of universal interventions was mapped in terms of frequency of measuring, the method used to assess, and unit of analysis (whole school or individual classroom). A systematic review of the literature published since 1996 yielded 42 articles for inclusion. Over 86% of the articles reported procedures for monitoring integrity, and 76% reported quantifiable data for Tier 1 treatment integrity. These are encouraging data. The most common method for assessing treatment integrity was self-report (90%). Self-report measures are efficient, but there is the risk of the reports being inflated over what actually occurred. It is easy to understand why self-report is utilized so commonly, given the resource demands associated with measuring integrity across an entire system; however, much more research needs to be done to establish the conditions under which self-reports are valid measures. Direct observation was used least often to assess treatment integrity (18.75%). The resource demands make it very difficult to use even though it is likely to yield the most valid data. Procedures to balance the efficiency and effectiveness of different methods for assessing integrity have yet to be fully developed.
Monitoring of treatment integrity occurred 81% of the time at the school level. Forty percent of the studies assessed treatment integrity at the level of the individual classroom. These measures are not mutually exclusive; in some instances, integrity was measured at both levels. Of the studies reviewed, 57% measured integrity one time per year. This raises questions about the representativeness of the data, especially when the data were most often collected at the level of the entire school. School-wide measurement obscures implementation at the classroom level, and measuring only one time per year may further obscure variables that influence the obtained data point. There is no established standard for the frequency of measuring integrity at the universal level of intervention. It could be argued that these measures should be employed at the same frequency at which decisions are made regarding students’ need for additional services. For example, if school-wide data are reviewed three times per year, then integrity measures should occur three times per year. This would allow decision-makers to track changes in integrity across time and determine if student performance reflects changes in integrity. All of this is done to increase the validity of decisions regarding the level of support required for individual students.
There are challenges to assessing integrity at the universal level. Considerable resources are required to assess across an entire school, especially when measuring at the level of the individual classroom. Efficient and effective systems that can be employed by existing school resources are necessary and have yet to be developed. The importance of these systems cannot be overstated. High-stakes decisions about students’ futures are being made based on their performance at the universal level of instruction. It is essential that the decisions are based on valid data, including treatment integrity data.
Citation: Buckman, M. M., Lane, K. L., Common, E. A., Royer, D. J., Oakes, W. P., Allen, G. E., … & Brunsting, N. C. (2021). Treatment integrity of primary (tier 1) prevention efforts in tiered systems: Mapping the literature. Education and Treatment of Children, 44(3), 145-168.
Education decision-makers have to consider many variables when adopting an intervention. In addition to evidence of effectiveness, they must consider local context, the capacity of the school to implement the program, resource availability, and stakeholder values. The task is so complex that, without a decision-making framework, some decision-makers will probably rely on processes that are influenced by personal biases rather than a systematic approach. There are several decision-making frameworks available to guide the process, but many have not been empirically evaluated. Hollands and colleagues (2019) evaluated a cost-utility framework as a tool to guide decisions. This approach relies on multiple sources of evidence to identify the values of the decision-makers, the “experiential evidence” of stakeholders who have implemented similar interventions, the problem the alternative solutions are to solve, and the criteria for evaluating each dimension of a decision.
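To make the idea of a cost-utility comparison concrete, here is a minimal sketch of one common way such a comparison can be computed: stakeholders weight a set of criteria, rate each alternative on those criteria, and the cost of each alternative is divided by its weighted utility. The summary above does not give the framework's formulas, so the weighted-sum utility, the criteria names, the programs, and every number below are assumptions made purely for illustration.

```python
from typing import Dict


def weighted_utility(weights: Dict[str, float],
                     ratings: Dict[str, float]) -> float:
    """Weighted average of criterion ratings (weights need not sum to 1)."""
    total_weight = sum(weights.values())
    return sum(weights[c] * ratings[c] for c in weights) / total_weight


def cost_utility_ratio(cost: float, utility: float) -> float:
    """Cost per unit of utility; lower is better."""
    if utility <= 0:
        raise ValueError("utility must be positive")
    return cost / utility


# Hypothetical importance weights agreed on by the decision-making group.
weights = {"effectiveness": 5, "feasibility": 3, "stakeholder_fit": 2}

# Hypothetical alternatives: annual cost in dollars, ratings on a 0-10 scale.
options = {
    "Program A": {"cost": 12000.0,
                  "ratings": {"effectiveness": 8, "feasibility": 5,
                              "stakeholder_fit": 6}},
    "Program B": {"cost": 9000.0,
                  "ratings": {"effectiveness": 6, "feasibility": 7,
                              "stakeholder_fit": 7}},
}

ratios = {}
for name, option in options.items():
    utility = weighted_utility(weights, option["ratings"])
    ratios[name] = cost_utility_ratio(option["cost"], utility)
    print(f"{name}: utility={utility:.2f}, cost per utility point={ratios[name]:.0f}")

best = min(ratios, key=ratios.get)
print("Lowest cost per unit of utility:", best)
```

In this made-up case Program A has the higher utility, but Program B costs less per unit of utility. The weights carry the stakeholders' values, so the same ratings can produce a different ranking for a different group.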
In this project, the users evaluated the framework in three phases. In the first phase, principals, assistant principals, teacher leaders, and teachers enrolled in a principal preparation program were assigned to small groups to implement the first six steps of the decision-making framework. Although performance on each of the steps ranged considerably, approximately one-third of the groups completed each of the steps in the decision-making framework within the available time. The authors suggested that factors such as the complexity of the decision, the alignment of the group members’ vision, and the emergence of a leader to keep the process moving forward influenced performance on each of the six steps in the framework.
The second phase of the project was to survey participants about the usefulness of the cost-utility decision-making framework. A large majority of the participants had a positive view of the process and thought it would be valuable to apply in their day-to-day work. A few participants identified that the process was time consuming and may limit the application of the framework.
The final phase of the study selected three assistant principals to apply the cost-utility framework in their work in their schools. Two of the three participants reported that although it was time-consuming, it helped clarify the decision options and the stakeholders to be involved in the decision. The third participant was not able to reach a decision problem within the available time. This participant also reported that some decisions were imposed by district administration, subverting the cost-utility decision-making process.
It seems that this framework has potential value for guiding decision-making in the complex environments of public schools. The time-consuming nature of the process suggests that educators may need additional coaching and support as they develop competencies in applying the framework. Streamlining the steps in the process would be a significant step toward increasing the usability of the tool.
Citation: Hollands, F., Pan, Y., & Escueta, M. (2019). What is the potential for applying cost-utility analysis to facilitate evidence-based decision making in schools? Educational Researcher, 48(5), 287-295.
Evidence-based interventions have the potential to improve educational outcomes for students. Often these programs are introduced with an initial training, but once the training has been completed, no additional follow-up support is available. This can result in the educational initiative not being fully adopted and frequently being abandoned soon after initial adoption. To change this cycle, ongoing coaching or implementation support has been suggested as an alternative. The current study by Owen and colleagues (2021) evaluated the impact of implementation supports on the outcomes of students who participated in the implementation of Say All Fast Minute Every Day Shuffled (SAFMEDS). This program is designed to promote fast and accurate recall. In this instance, the goal was to increase fluency with math facts. This was a large randomized trial in which teachers received training on implementing SAFMEDS and, following training, were assigned to either a No Support group or an Implementation Support group. Implementation Support consisted of three face-to-face meetings with a teacher and email contact initiated by the teacher. All of the students in the study had been identified as performing below standards for their age. The results suggest that across grade levels (Grades 1-2 and Grades 3-5) Implementation Support resulted in small effect size improvements compared to the No Support group. For Grades 1-2, the effect size was d = 0.23, and for Grades 3-5, it was d = 0.25. These are relatively small effect sizes; however, they are larger than the average effect sizes reported in the professional development literature for studies that apply coaching elements to math programs. It should also be noted that the Implementation Support consisted of three hours across a school year. This is a relatively low-intensity dose of support and one that is likely to be practical in most school contexts.
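For readers less familiar with the d statistic reported above, the following sketch shows the textbook form of Cohen's d: the difference between two group means divided by the pooled standard deviation. It is an illustration only. The scores are invented, and the sketch ignores the clustering of students within teachers and schools, which a cluster-randomized trial such as this one would normally account for in its analysis.

```python
from math import sqrt
from statistics import mean, variance
from typing import Sequence


def cohens_d(treatment: Sequence[float], control: Sequence[float]) -> float:
    """Standardized mean difference using the pooled sample standard deviation."""
    n1, n2 = len(treatment), len(control)
    # statistics.variance returns the sample variance (n - 1 denominator).
    pooled_var = ((n1 - 1) * variance(treatment)
                  + (n2 - 1) * variance(control)) / (n1 + n2 - 2)
    return (mean(treatment) - mean(control)) / sqrt(pooled_var)


# Hypothetical post-test scores (e.g., correct math facts per minute).
support_group = [2, 4, 6]
no_support_group = [1, 3, 5]

d = cohens_d(support_group, no_support_group)
print(f"d = {d:.2f}")  # means differ by 1, pooled SD is 2, so d = 0.50
```

Read this way, a d of about 0.25 means the average student in the supported group scored roughly a quarter of a standard deviation higher than the average student in the unsupported group.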
The important take-away from this research is that some level of Implementation Support will likely be necessary to gain benefit from empirically-supported interventions such as SAFMEDS. The challenge for researchers is to identify the minimum dosage of Implementation Support to improve outcomes and the critical components of the Implementation Support so that it is efficient and effective.
Citation: Owen, K. L., Hunter, S. H., Watkins, R. C., Payne, J. S., Bailey, T., Gray, C., … & Hughes, J. C. (2021). Implementation Support Improves Outcomes of a Fluency-Based Mathematics Strategy: A Cluster-Randomized Controlled Trial. Journal of Research on Educational Effectiveness, 14(3), 523-542.
At the core of evidence-based education is data-based decision making. Once an empirically-supported intervention has been adopted, it is necessary to monitor student performance to determine if the program is effective for an individual student. Educators report needing assistance in determining what to do with the student performance data. Often, external support is necessary for educators to successfully navigate the decision-making process because many training programs are not sufficient.
A recent meta-analysis by Gesel and colleagues (2021) examined the impact of professional development on teachers’ knowledge, skill, and self-efficacy in data-based decision making. Knowledge was assessed by a multiple-choice test to determine if teachers understood the concepts of data-based decision making. It was not a measure of teachers’ application of that knowledge. Skill was a direct measure of how well teachers applied their knowledge of data-based decision making. In most instances, this was assessed under ideal conditions with intense support from researchers and consultants. Self-efficacy was a measure of the teachers’ confidence to implement data-based decision making. The overall effect size for the combined measures was 0.57, which is generally considered a moderate effect; however, the effect sizes for the individual measures varied widely (Knowledge range from -0.02 to 2.28; Skill range from -1.25 to 1.96; Self-efficacy range from -0.08 to 0.78). The ranges for each of the measures suggest that the average effect size of 0.57 does not adequately reflect the effects of professional development. The variability could be a function of the specific training methods used in each of the individual studies, but the training methods were not described in this meta-analysis. It should be noted that all of the studies in this meta-analysis were conducted with intensive support from researchers and consultants. It is not clear how well the results of this meta-analysis generalize to the more standard conditions found in teacher preparation programs and professional development.
Given the importance of data-based decision making to student progress, there is considerable work to be done to identify effective and efficient training methods. It appears that we are a long way from this goal. Ultimately, the goal is for data-based decision making to be standard practice in every classroom in the United States. This will require identifying the critical skills necessary and the most effective methods for teaching those skills.
Citation: Gesel, S. A., LeJeune, L. M., Chow, J. C., Sinclair, A. C., & Lemons, C. J. (2021). A meta-analysis of the impact of professional development on teachers’ knowledge, skill, and self-efficacy in data-based decision-making. Journal of Learning Disabilities, 54(4), 269-283.
Contextual fit refers to the extent that procedures of the selected program are consistent with the knowledge, skills, resources, and administrative support of those who are expected to implement the plan. Packaged curricula and social programs are developed without a specific context in mind; however, when implementing that program in a particular context, it will often require some adaptations of the program or the setting to increase the fidelity of implementation. One challenge to improving contextual fit is to determine which features of the program or the environment need to be adapted to improve fit.
A recent study by Monzalve and Horner (2021) addressed this question. The authors developed the Contextual Fit Enhancement Protocol to identify components of a behavior support plan to adapt. The logic of the study was that by increasing contextual fit, fidelity of implementation would increase, and student outcomes would be improved. Four student-teacher dyads were recruited. To be included in the study, the student had an existing behavior support plan that was judged technically adequate but was being implemented with low fidelity. During baseline, no changes were made to the plan. The percentage of the support plan components implemented was measured as well as student behavior. Following baseline, researchers met with the team responsible for implementing the plan and reviewed the Contextual Fit Enhancement Protocol. During this meeting the goals and procedures of the plan were confirmed, the contextual fit of the current plan was assessed, specific adaptations to the plan were made to increase contextual fit, and an action plan for implementing the revised plan was developed. Researchers continued to measure fidelity of implementation and student behavior. After at least 5 sessions of implementing the revised plan, the implementation team met with the researcher to re-rate the original plan and the revised plan for contextual fit. Items that were rated low were again reviewed and adapted. Following the review of the Contextual Fit Enhancement Protocol and revised plan, fidelity of implementation increased substantially and student problem behavior decreased.
There are two important implications from this study. First, there is no reason to assume that the initial version of the plan or even a revised version of the plan will get everything right because intervention is complex. This is an iterative process. Periodic reappraisal of the plan is necessary. The second important point is that student behavior is a function of the technical adequacy of the plan and how well that plan is implemented. If a plan is technically adequate, is a good contextual fit, and is implemented with high levels of fidelity (even with less than 100%), then positive student outcomes will most likely be achieved.
Monzalve, M., & Horner, R. H. (2021). The impact of the contextual fit enhancement protocol on behavior support plan fidelity and student behavior. Behavioral Disorders, 46(4), 267-278.
Kendra Guinness of the Wing Institute at Morningside provides an excellent summary of the importance of contextual fit and how it can enhance the implementation of evidence-based practices. Practices are often validated under very different conditions than the usual practice settings. In establishing the scientific support for an intervention, researchers often work very closely with the research site providing close supervision and feedback, assuring that all necessary resources are available, and training the implementers of the components of the intervention. In the usual practice settings, the intervention is often implemented without all of the necessary resources, and training and feedback are limited. As a result, the program as developed is not a good fit with the local circumstances within a school or classroom. In this overview, Ms. Guinness defines contextual fit, describes the key features of it, and summarizes the empirical evidence supporting it.
Briefly, contextual fit is the match between the strategies, procedures, or elements of an intervention and the values, needs, skills, and resources available in the setting. One of the best empirical demonstrations of the importance of contextual fit is research by Benazzi et al. (2006). Behavior support plans were developed in three different ways: (1) behavior support teams without a behavior specialist, (2) behavior support teams with a behavior specialist, and (3) behavior specialists alone. The plans were rated for technical adequacy and contextual fit. For technical adequacy, the plans developed by behavior specialists alone or by teams that included a behavior specialist were rated highest. For contextual fit, plans developed by teams, with or without a behavior specialist, were rated higher than plans developed by behavior specialists alone.
Additional evidence of the importance of contextual fit comes from research by Monzalve and Horner (2021). They evaluated the effect of the Contextual Fit Enhancement Protocol. First, they had teachers implement a behavior support plan without feedback from researchers and measured fidelity of implementation and the level of student problem behavior. Subsequently, the researchers met with the implementation team, reviewed the goals and procedures of the plan, identified adaptations to improve the contextual fit, and planned next steps for implementing the revised behavior support plan. Before the team meeting, the intervention plan was implemented with 15% fidelity and student problem behavior occurred during 46% of the observation period. Following the meeting, fidelity of implementation increased to 83% and problem behavior was reduced to 16% of the observation period.
These data clearly suggest that intervention does not occur in a vacuum and there are variables other than the components of the intervention that influence its implementation and student outcomes. Much more needs to be learned about adapting interventions to fit a particular context without reducing the effectiveness of the intervention.
Guinness, K. (2022). Contextual Fit Overview. Original paper for the Wing Institute.
Benazzi, L., Horner, R. H., & Good, R. H. (2006). Effects of behavior support team composition on the technical adequacy and contextual fit of behavior support plans. Journal of Special Education, 40(3), 160–170.
Monzalve, M., & Horner, R. H. (2021). The impact of the contextual fit enhancement protocol on behavior support plan fidelity and student behavior. Behavioral Disorders, 46(4), 267–278. https://doi.org/10.1177/0198742920953497
Performance feedback is often considered a necessary part of training educators. The challenge is to provide the feedback in a timely manner so that it positively impacts skill acquisition. Often, the feedback is delayed by hours or days, which may limit its impact. Real-time performance feedback is considered optimal but may be considered unfeasible in many educational contexts.
One option is to use technology such as “bug in the ear” to deliver feedback in real time. Sinclair and colleagues (2020) conducted a meta-analysis to determine if feedback delivered via technology could be considered empirically supported. In the review, 23 studies met inclusion criteria. Twenty-two of the studies were single case designs and one was a group design. The reported findings were that real-time performance feedback is an effective method for increasing skill acquisition of educators. The authors cautioned that this type of feedback is an intensive intervention and suggested that it is not feasible to use for training all teachers. They suggest that it should be considered as an intervention when other training methods have not proven effective.
In this context, it becomes feasible to support those educators who have not benefitted from less intensive interventions. If it is considered part of a multi-tiered system of support for educators, it can play an important role in training. It can improve the performance of educators and perhaps reduce turnover because it allows educators to develop the skills to be successful.
Sinclair, A. C., Gesel, S. A., LeJeune, L. M., & Lemons, C. J. (2020). A review of the evidence for real-time performance feedback to improve instructional practice. The Journal of Special Education, 54(2), 90-100.