Evaluating the Evaluators: The Utility of SETs in Educational Leadership

One of the most difficult questions in educational management today is how to evaluate teacher performance accurately. The most commonly used means of evaluating instructors is the evaluation form that students fill out at the end of each semester. This article examines the utility of student evaluations both in determining teacher success and as a basis for coaching teachers to improve.

The starting point of this article is an organization based on love as opposed to fear. Fear of failure and job loss can indeed be a strong motivator, but quality work is unlikely to ensue. In a fear-based environment, good employees will quit, and “even if people stay with the organization, they typically don’t perform up to their real capabilities” (Daft, 2015). In a fear-based environment, evaluations will be viewed as another way the employee can be “got.”

With love comes the ability to give honest feedback. Feedback is critical for successful coaching. Followers require timely, specific feedback targeted at future improvement. This includes negative feedback, but delivered impersonally, with an eye to improvement rather than as a threat hanging overhead. Coaching also includes communicating candidly, with leaders being crystal clear about what they expect moving forward (Daft, 2015).

At the university level, feedback is largely based on Student Evaluations of Teaching (SETs). These evaluations typically consist of a series of closed-ended questions given at semester’s end. How useful are these evaluations in assessing performance? Diette and Kester (2015) found that student evaluations do measure quality of instruction to some degree. They found that five variables impact the perceived quality of instruction: clear communication, meaningful and conscientious evaluation of student work, instructor enthusiasm, course organization and the approachability of the instructor.

On the other hand, Pounder’s literature review found that students punish instructors who assign significant homework, give more quizzes and grade hard. Students with higher grades give higher SET scores, with grades accounting for between 9% and 20% of the variance in responses. Relationship and rapport with students are also extremely important variables. He also found that smaller classes and classes scheduled toward the end of the week result in higher scores (Pounder, 2007). Some of these practices, such as demanding homework and rigorous grading, can be signs of high ethics and quality instruction, but they mean more work for the students and thus lower scores for the instructor.

SETs can also be compared to the rating system employed on an online professor-rating website. There, professors are rated on a five-point Likert scale in terms of “overall quality” and “level of difficulty,” along with a chilli pepper for “hot” teachers and no chilli pepper for teachers who are “not hot.” Constand and Pace (2014) found that there is a significant positive relationship between easiness and the “overall quality” of the professor. Additionally, there is a positive correlation between “hotness” and “overall quality.” Obviously, no serious institution will limit its hiring to “hot” teachers who give easy classes, but evaluations do tend to favour this kind of professor.

On a cynical note, Uijtdehaage and O’Neal slipped a fictitious lecturer into the stack of evaluations given to pre-clinical medical students at UCLA. They found that only 34% of students gave the decoy an “N/A,” with 66% assigning a score from 1 to 5. Even when they added a picture that matched no one on faculty, 49% of students gave a score to the decoy (Uijtdehaage and O’Neal, 2015). When half to two-thirds of students will score a person who doesn’t exist, the entire exercise of SETs is called into question.

This research confirms something I have often noticed in my work around the world: teacher scores really are linked to easiness and grades. I try to challenge my students to work hard and stretch themselves, and I mark honestly, but some of my colleagues inflate grades and give very easy assignments, so I tend to get lower scores than they do. I have also noticed as an administrator that the teachers with the highest class averages tend to set the easiest quizzes and hand out the most questionable materials, in some cases even including incorrect points. Yet they are the “best” teachers according to the students.

All that being said, how can SETs be used effectively? All of the sources noted above, and others researched by the author, agree that SETs are useful in evaluating teaching performance. However, they also agree that most institutions over-rely on them, letting SETs make up the majority of the teaching component of instructors’ performance appraisals. Given the problems outlined above, a “one-shot” evaluation at term’s end gives little chance to respond to criticism and puts undue weight on a measurement that is not completely reliable.

In the business world today, 360° feedback is gaining traction. Under this approach, a variety of sources provide feedback to employees so that they can improve their performance. In the field of education, these sources would include deans, TAs and RAs, students, colleagues and relevant outside actors.

In their research on 360° feedback, Rai and Singh conclude that “the improvement in performance by 360° feedback is largely due to improved interpersonal communication, finer leader–member exchange quality, more perceived organizational support and better quality of working life” (Rai and Singh, 2013, p. 70). For educators this would mean more exchanges with those around them, resulting in better relationships, greater happiness, and better understanding of management views.

SETs would still form a valuable part of this exchange. It might be useful to have classes complete three or more SETs spaced throughout the semester. Regular feedback gives the instructor a chance to make adjustments and improve, and gives the leader a chance to coach them through the process. Furthermore, regular feedback minimizes the impact of grades, quizzes and difficulty on SET scores while further opening communication channels, which has been shown to lead to greater student happiness and achievement.

Another component of a successful 360° feedback process would be the addition (but not substitution) of a feedforward mechanism. The traditional performance appraisal feedback process has been shown to have a potentially negative impact on employee performance and attitudes. Feedforward turns the traditional process on its head. Instead of telling employees about their performance as per the SETs, managers ask questions and engage in active listening. Employees are asked to talk about positive outcomes and positive processes that they have been a part of in their classes (Budworth et al., 2015). This gives management the employee’s perspective on their achievements, which a traditional review would miss, and it minimizes negative feelings and mistakes due to issues like the halo effect.

In the future, I hope institutions won’t rely on SETs alone to make retention and salary decisions, because this encourages unethical behaviour. SETs alone are not an accurate measure of instructor performance and may lead HR departments to make poor hiring and retention decisions that come back to haunt them later. I hope that any administrator reading this will consider the value of multi-source feedback and feedforward interviews in developing a full picture of staff performance. The best use of this instrument would be to move from using SETs to “weed out” so-called “bad” instructors to using them to work with staff, mentoring and coaching them to develop their full potential.


