Some of the problems with module evaluation

By Dr Tim Herrick, School of Education

Please note this blog post was originally written in 2019, and was published on the original Elevate blog.

 


Understanding how students are using our teaching to support their learning is clearly an essential part 

of professionalism within higher education.  Conscientious teachers want to know on a personal level 

what is going well and what could be developed; systematically, it’s useful to identify issues common 

across departments or institutions; and more holistically, it helps inform our obligations to the

Quality Assurance Agency that we know what happens in our classrooms.  

 

So we devote time and energy to student evaluation of teaching, at module, programme, and year level;

and the University has recently provided us with resources to facilitate that process, including standard 

questions to ask all students.


And yet, I have concerns about some of what is taken as common practice.  These are less about the 

specific questions that we ask - although much ink has been spilled on that particular matter - or about

practicalities such as the timing of the questions: before or after feedback and grades are

received or, for courageous departments and individuals, mid-way through the module rather than at

the end.  My concerns are more about the systematic biases that may be present in any system of 

teaching evaluation - and perhaps more pointedly, how groups of students perceive individual teachers.


A while ago, through an LSE blog post, I was made aware of a piece of research in Innovative Higher 

Education calling attention to gender biases in how students evaluate teaching.  The study is nicely 

designed - it’s based on student evaluations of an online module, with one male teacher and one female.

The online class was divided into four groups - one taught by the male presenting a male identity, and

one taught by the female presenting a female identity.  The final two reflect the ingenuity of the design, 

as the third was taught by the male under a female identity, and the fourth by the female under a male 

identity.  So students were experiencing similar, sometimes identical, interactions, with the same 

teacher, sometimes under their true gender identity, sometimes under the assumed one, without ever 

meeting their teacher in person.


It would be nice to say that this led to consistent student evaluations, regardless of their perceptions of 

gender.  Unfortunately, it did not.  While all four groups gave positive scores to all “four” of their

instructors, the scores for the “male” teachers were higher than those for the “female” teachers.  As the authors

explain:


"the same instructor, grading under two different identities, was given lower ratings half the time 

with the only difference being the perceived gender of the instructor"


And by way of illustration:


"when the actual male and female instructors posted grades after two days as a male, this was

 considered by students to be a 4.35 out of 5 level of promptness, but when the same two 

instructors posted grades at the same time as a female, it was considered to be a 3.55 out of 5 

level of promptness"


These findings were reproduced by the authors of the LSE blog post mentioned above.  The explanation the original article

puts forward is that students have higher expectations of female teachers, wanting them to be 

approachable, empathetic, and warm in their communications, as well as knowing their subject matter 

and being clear in how it is explained.  The bar for men is set lower, leading to that horrible double bind - 

women are penalised for not meeting higher expectations, while men are rewarded for doing anything 

beyond a lower level of expectation.


So what to do?  First of all, recognise the problem - and while we’re there, look hard at the more complex 

data on ethnicity and student evaluation of teaching - and if anyone knows of research

on dis/ability, I would be glad to read it.  Secondly, share the problem with our students - in this woke 

era, I doubt many of them would want to collaborate in a(nother) system that appears to privilege white 

men.  Encourage them to consider their own presumptions before completing the evaluation form; and 

perhaps, like Geography, hold a conversation with them about what kinds of feedback are most useful 

for staff to receive.  

 

And lastly, alongside the standardised evaluation instruments, think of other ways in which we can come

to know how our students are experiencing our teaching (to slip in some quick plugs, there

are some great ideas on the Elevate student engagement webpages, and I run a

Student Observation of Teaching scheme).  We may need, for the purposes of quality

assurance, to reach a minimum threshold of student evaluation; but to reduce the impact of systematic 

biases such as gender, and to hear most clearly their suggestions for how teaching might be enhanced,

I would argue that we also need to do more.