Hidden in plain sight
Research is powerful. It can chime with your intuition, or shatter preconceptions. Like when half of all observers in an experiment, asked to count passes of a ball, failed to spot a gorilla enter the game.
On Monday 13th January, Professor Rob Coe gave a speech on lesson observations in English schools at an event co-hosted by the Teacher Development Trust.
It was utterly shattering in its implications for school leaders. It turns out we are all complicit in this year’s brain gym.
Ben Goldacre in Bad Science demolished brain gym as a widely but uncritically adopted fad, an unscientific and useless intervention. Tom Bennett in Teacher Proof and Dan Willingham have demolished others such as VAK learning styles as pervasive but unevidenced. At ResearchEd 2013, Tom asked, what is this year’s brain gym? What are we falling for right now?
Professor Coe’s collation of the research suggests it is graded observations. I agree. Grading lessons is not reliable – two different observers who see the same lesson are unlikely to agree. Nor is it valid – even if they agree that what they see is good practice, it often isn’t.
Here are Professor Coe’s killer stats:
- If a lesson is judged outstanding, the probability that a second observer would give a different judgement is up to 78%
- If a lesson is judged inadequate, the probability that a second observer would give a different rating is 90%
But that’s in the robust, $50 million MET project; most school observations are not as robust (Strong et al, 2011):
- Fewer than 1% of those judged inadequate are genuinely inadequate
- Only 4% of those judged outstanding actually produce outstanding learning gains
- Overall, 63% of judgements will be wrong
Prof Coe is rightly scathing: ‘tossing a coin would have been better’; ‘you might as well decide you don’t like someone’ as give them unsatisfactory.
The effect sizes of observation as an intervention are also very low: 0.22 and 0.11. As John Hattie says, setting the bar at zero is absurd; most interventions have some effect, so his threshold for effectiveness is 0.4, which graded observations do not meet.
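To make that comparison concrete, here is a minimal sketch that checks the two effect sizes quoted above against Hattie's 0.4 threshold. The function and variable names are my own, purely for illustration, and the rounding to two decimal places is an assumption made for readability.

```python
# Hattie's threshold for calling an intervention effective, as quoted above.
HATTIE_THRESHOLD = 0.4

# The two effect sizes quoted above for observation as an intervention.
observation_effect_sizes = [0.22, 0.11]


def meets_threshold(effect_size, threshold=HATTIE_THRESHOLD):
    """Return True if the effect size reaches the effectiveness threshold."""
    return effect_size >= threshold


def shortfall(effect_size, threshold=HATTIE_THRESHOLD):
    """Return how far the effect size falls below the threshold (2 d.p.)."""
    return round(threshold - effect_size, 2)


# One (effect size, meets threshold?, shortfall) tuple per quoted figure.
results = [
    (d, meets_threshold(d), shortfall(d)) for d in observation_effect_sizes
]

for d, ok, gap in results:
    verdict = "meets" if ok else "falls short of"
    print(f"Effect size {d} {verdict} the {HATTIE_THRESHOLD} threshold (gap: {gap})")
```

Neither figure gets close: the larger of the two is barely more than half the threshold.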
Graded observations: the gorilla in the classroom
The evidence shows that grading lessons is not reliable, valid or useful. But intuition and experience tell me that it is also counterproductive and damaging.
Damaging, as some fifty teachers tell here of the pressure and pain they felt after being downgraded. What if they had known the 90% probability that a second opinion would have changed their rating?
Counterproductive, as David Didau shows here: ‘the cult of the outstanding lesson is retarding learning.’ The focus on busy engagement in protocols over memorable instruction is problematic: it is precisely this distractor that Professor Coe says compromises validity.
So what do we do about it?
First, do no harm: end numerical judgements
Doctors take the Hippocratic oath: first, do no harm. So should school leaders. But we are harming teachers’ professionalism by grading them out of 4, often in 20 minutes. There’s no way a surgeon would be graded out of 4 on a 20-minute observation of an operation.
We must stop grading lessons. Professor Coe says we should ‘stop doing what we’re doing’; ‘if you don’t want to use observations for grading, it may not matter that they’re not reliable.’
If we just use them formatively, teachers can focus on improving rather than being judged, and school leaders can combine quantitative assessment data, qualitative feedback from colleagues and their own intuition to form nuanced judgements of teaching quality.
Then, follow the bright spots: use formative-only observations
‘Sow the seed of the end of the judgemental approach to school leadership,’ said Alison Peacock at the same event. A primary head, she eschews grading lessons and instead uses lesson study to build a culture of trust.
In years to come, we may well look back on grading, like brain gym, as a travesty and a historical curiosity. Now, though, this business of grading observations must end. Let’s get the gorilla off our backs.