Life after levels: who’ll create a mastery assessment system?

Whoever redesigns their curriculum and assessment for life after levels will reap the benefits

A great many schools I know are now considering the question of what to do about assessment. ‘Is there an alternative to national levels?’ they are asking. After all, assessment drives the curriculum: the curriculum cannot be considered without considering how it is being assessed. Here is the argument that I am building up on this blog:

Our curriculum and assessment aren’t designed with memory in mind.

National levels are imprecise, ill-sequenced and confusing.

So we must redesign our curricula and assessment for memory with precision, sequencing and visibility in mind.

“There is plenty of mileage in Joe Kirby’s mastery model, but it needs flesh on the bones to become a viable proposition,” said Chris Hildrew in a recent blog. This blogpost tries to flesh out the model, asking: what might a mastery assessment model look like?

Rationale: Why redesign assessment with memory in mind?

There is a threefold rationale for changing assessment. First, the current system inhibits pupils’ memory, both intuitively and empirically, as I set out in my last blogpost. Second, research from cognitive science signposts the way to curriculum design that enhances rather than inhibits memory. Third, international comparisons show that high-performing school systems use mastery assessment models, and some schools in England are already ahead of the curve, pioneering mastery in Maths and Humanities.

The Scientific Rationale

Cognitive science shows us that we forget things when we give them insufficient focus, time or attention, or when we practise, use, revisit, consolidate or apply them too little. So, when we grumble as teachers that students don’t use grammar properly, even though they’ve learned it, we need to ask ourselves: have they really learned it? Have we really taught it with sufficient time, focus and attention? Have we sufficiently revisited it? Have we consolidated it in their minds? Have they mastered it? Have they automated it in their long-term memories? As Dan Willingham says, ‘practice makes perfect: but only if you practice beyond the point of perfection.’

Intuitively, teachers’ instincts on this are corroborated by the scientific research. As Michael Tidd asked in a recent blogpost,

‘Is mastery assessment just for maths? Mastery assessment is built on the premise of covering fewer topics in greater depth each year. This strikes me as sensible. Too often I have taught children at KS3 who have raced through the curriculum, picking up bits of skills, but for whom the basics of number knowledge and calculation are still insecure. The comparison to the end-moments of the game, Jenga, is too often fitting: students who lack the secure base on which to build their higher knowledge soon come crashing down.

‘I’m not convinced I’ve given them enough time to really securely practise and secure their use of those skills. And so, just like the kids who can’t do their tables in Y10, I’ve got students who haven’t applied even half of what they’ve learned. I fear that the downfall of the process has been the movement on to another genre and another set of techniques for the next fortnight. Indeed, I know many schools where each block lasts a week before moving on.’

Syntheses of the scientific research from Daniel Willingham and Robert Bjork converge on these key principles:

Distributing practice (rather than cramming): ‘it is virtually impossible to become proficient at any mental task without extended, dedicated practice distributed over time.’

Overlearning: keep pupils learning after they know the material to prevent forgetting: ‘a good rule of thumb is to put in another 20 percent of the time it took to master the material’.

Interleaving: we learn content better when it is revisited, consolidated and interleaved with upcoming problem types.

Testing frequently: testing students frequently helps them remember material, because using our memory improves our memory: the act of retrieval helps us remember the things we recall, and makes them more recallable in the future (Bjork, 1975).
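As a rough illustration of the overlearning rule of thumb, here is a minimal Python sketch of the arithmetic. The 20 percent figure comes from the quotation above; spreading the extra time evenly over several later sessions is my own assumption, made to combine overlearning with distributed practice, and the function name and parameters are hypothetical.

```python
def overlearning_plan(mastery_minutes, sessions, extra_fraction=0.20):
    """Return (total extra minutes, extra minutes per session).

    mastery_minutes: time it took to master the material.
    sessions: number of later sessions to spread the extra practice over
              (an assumption of this sketch, not part of the rule of thumb).
    extra_fraction: share of the mastery time to add on (20% by default).
    """
    if sessions < 1:
        raise ValueError("sessions must be at least 1")
    extra = mastery_minutes * extra_fraction
    return extra, extra / sessions


# Example: a topic that took 60 minutes to master, revisited in 4 later lessons.
extra, per_session = overlearning_plan(60, sessions=4)
print(f"Extra practice: {extra:.0f} minutes, about {per_session:.0f} per session")
```

Under these assumptions, a topic that took an hour to master would earn roughly twelve more minutes of practice, a few minutes at a time.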

The International Rationale

With 30 years of expertise in transnational assessment comparisons, Tim Oates notes that ‘a distinctive feature of high-performing systems is a radically different approach to pupil progression, as a fundamental rather than surface element’:

‘Crude categorisation of pupil abilities and attainment is eschewed in favour of encouraging all pupils to achieve adequate understanding before moving on to the next topic or area. Achievement is interpreted in terms of the power of effort rather than the limits of ability. Teachers in such systems see their task as ensuring that all pupils have developed an adequate level of understanding of the key concepts and content in a block of learning prior to moving onto the next block of content. Labelling of differential attainment is of secondary importance. This approach appears to be particularly concerned with securing a suitable degree of understanding by all pupils prior to moving on to the next set of learning objectives. The approach to pupil progression used by some high-performing countries could be referred to as a ‘mastery model’, and this emphasis could be replicated in the English context.’

Oates suggests combining *resolute commitment to essential knowledge for all* with *monitoring to record the attainment of pupils who are ‘ready to progress’*. If we took this approach, far fewer pupils would end up in the long tail of underachievement that blights English education, with 20% innumeracy and illiteracy persisting over decades.

What are the principles of mastery assessment?

Mastery curriculum and assessment is designed for precision, sequencing and visibility. The curriculum is precisely sequenced to focus on a much greater depth of concepts, which are rigorously checked for deep understanding. All pupils are expected to master all the concepts in assessments, and there is no room for underachievement: any pupil who does not master the content is entitled to precise support and targeted intervention. For instance, a pupil who has not understood all the concepts required to make expected progress by the end of the year would stay in for summer school, where their teachers would ensure they understood them all deeply. Parents are crystal-clear on whether their child has achieved the high expected threshold in each subject each year. Teachers have crystal-clear visibility on who to support.

What would a curriculum and assessment system look like if designed with these principles: interleaved practice for overlearning, and frequent, low-stakes testing?

The roadmap to meaningful rigour

The empirical success of schools from the USA that use mastery assessment is encouraging. Paul Bambrick-Santoyo from the successful Uncommon Schools network explains:

‘Curriculum standards are meaningless until you define how to assess them. The level of mastery that will be reached by the students is determined largely by what sort of questions students are expected to answer. We should not teach, then write an assessment to match – we should create a rigorous assessment, then teach to meet its standards. Assessments are not the end of the teaching and learning process: they’re the starting point. Teachers must see the assessments at the beginning of the teaching cycle. Interim assessments that define the high level of rigour needed to succeed have a ripple effect improving visibility and instruction, planning and feedback.’

I see two imperatives that change the game when designing mastery assessment:

The curriculum must be frontloaded and interleaved.
Assessment must be cumulative and revisited.

Let’s take each imperative in turn.

The curriculum must be frontloaded and interleaved

One of the most useful concepts for curriculum design is the Pareto principle, also known as the 80:20 principle. This idea, first observed by the economist Vilfredo Pareto, holds that surprisingly often, around 20% of the inputs lead to around 80% of the outputs. In languages, 20% of the vocabulary is used 80% of the time. In education more broadly, 20% of the most vital concepts hold 80% of the value for academic achievement. The trick is to work out which concepts are the most vital for each subject, and so which should be frontloaded.

Curriculum design armed with the insight of the 80:20 principle can benefit from identifying threshold concepts in each subject. Threshold concepts are those that represent ‘seeing things in a new way… a portal, opening up a new and previously inaccessible way of thinking about something. It represents a transformed way of understanding, or interpreting, or viewing something without which the learner cannot progress; a transformed view or landscape, how people ‘think’ in a particular discipline: tacit, troublesome, fruitful knowledge. My experience discovering this lens was a revelation, akin to the experience I had when I put on my first pair of eyeglasses – suddenly everything was sharp and clear.’

Only this morning, Alex Quigley posted on threshold concepts as a way of redesigning the curriculum. Just as Tom Sherrington urged us, in a recent blogpost on the importance of clarity, to ‘define the butterfly’, so we must define the threshold concepts in our own disciplines, and sequence them over the curriculum and assessment so that they are frontloaded and interleaved throughout. An example is the Maths Mastery curriculum, whose visual Year 7 overview frontloads and interleaves core concepts.

This is in stark contrast to the widespread ‘spiral’ textbook Year 7 Maths overview I put on the last blogpost, which has some 18 topics and spends only two weeks on each.

Assessment must be cumulative and revisited

The key thing about the sequencing of mastery assessment is that there are incentives for teachers to revisit core concepts and vital material. Most assessments in the schools I know are six-weekly, roughly half-termly, but each is stand-alone: content from previous half-terms is not retested, but allowed to be completely forgotten. In English, poetry is tested in one half-term, then left until next year. In Maths, negative numbers are dealt with in a couple of weeks a year, raced on from, forgotten, then retaught from scratch next year. It’s a sequence designed for forgetting rather than remembering.

Paul Bambrick-Santoyo offers a solution:

Design the test to reassess earlier material: effective assessments revisit material from earlier in the year. This review of past material is vital to remembering it and learning new concepts. Teachers have a chance to see if their re-teaching efforts were effective. One such method is to make tests longer as the year progresses.

We can’t just do stand-alone unit tests; assessment must be cumulative.
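To make ‘cumulative’ concrete, here is a minimal Python sketch of one way a test blueprint could grow through the year. It is an illustration under my own assumptions, not Bambrick-Santoyo’s method: the unit names, the function name and the question counts (ten questions on the new unit, three review questions on each earlier unit) are all hypothetical.

```python
def cumulative_blueprint(units, current, new_items=10, review_items=3):
    """Return {unit: question_count} for the test at the end of units[current].

    Every earlier unit is retested with a few review questions, so each
    test is longer than the one before it.
    """
    if not 0 <= current < len(units):
        raise IndexError("current must be a valid position in units")
    # A few review questions for every unit taught before the current one.
    blueprint = {unit: review_items for unit in units[:current]}
    # The bulk of the questions assess the unit just taught.
    blueprint[units[current]] = new_items
    return blueprint


# Hypothetical half-termly units for a Year 7 Maths course.
units = ["Place value", "Negative numbers", "Fractions", "Algebra"]

for position, unit in enumerate(units):
    plan = cumulative_blueprint(units, position)
    print(f"Test after '{unit}': {sum(plan.values())} questions -> {plan}")
```

In this sketch negative numbers are never ‘left until next year’: they reappear on every later test, and the test length rises steadily as the year progresses.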

Frequent Low-Stakes Testing through Multiple Choice Questions

Daisy Christodoulou has effectively made the case that levels have resulted in complex tasks crowding out precise tasks. Multiple choice questions are not used in many English or humanities departments I know. There is a myth that they aren’t very rigorous. This seems ludicrous when you consider that Harvard Business School relies on the GMAT multiple-choice assessment test for a core component of its MBA recruitment. The benefits to precision are worthwhile. The visibility that question-level, objective-level precision yields is unmatched by complex, extended tasks. There is a place for multiple choice questions. Phil Stock’s blog posts show some excellent examples of how useful they can be. Paul Bambrick-Santoyo makes the case that they should be combined with open questions:

In an open-ended question, the rubric defines the rigour
In a multiple choice question, the options define the rigour
Effective assessment will combine them to mastery
Both are necessary, complementary sides of the same coin

Of course, mastery assessment must be reliable and dependable, otherwise it falls foul of exactly the same validity and reliability tests that levels fell foul of. This depends on effective moderation, which is a story for another time. Another story for another time is how to deal with benchmarking against ‘national progression’. But the difficult thing about open questions is that although they are easy to set, they are hard to mark, moderate, standardise and agree on, given what Daisy Christodoulou calls the ‘adverb problem’ of abstract criteria. Multiple choice questions are the opposite: they are very hard to set, but very easy to mark. A lot of the work can be done up front by curriculum experts, whereas in extended writing, a lot of the work must be done downstream by teachers. Multiple choice mastery assessments can thereby help us out when it comes to reliability and dependability.

The case for changing levels is overwhelming. The case for an alternative is just getting started. Mastery assessment offers a way out of the impasse. Whoever dares replace levels with mastery assessment stands to reap the benefits, for teachers’ instruction and feedback, and pupils’ memory and achievement.