The Unnatural Teacher 

There are people who say they knew from a very early age that they wanted to be a teacher. My sister-in-law, for example, used to play at being a teacher whilst at home. Now, she’s a successful primary teacher. I had no such youthful aspirations. I hadn’t a clue what I wanted to do until after I’d left university and realised I wanted to make something worthwhile of the knowledge I’d gleaned from my degree. I was a parent’s frustration and a careers adviser’s nightmare. Meanwhile, some teachers are described in the early days of their careers as being a “natural in the classroom.”


Just as I didn’t feel a natural calling to the profession, nor do I think I would have been described as a natural in my formative years as a classroom practitioner. I’ve had to come to terms with the fact that I don’t inhabit my body well and have fairly random limbs; I have had to mentally rehearse explanations and questions; I have had to carefully craft classroom sequences. Perhaps (you might argue I’m jealous) this is why I have questions about the whole concept of the natural teacher. I am an unnatural teacher. Why would I be content with anyone else being described as a natural in the classroom?

It seems more than a little churlish, though, to dismiss the possibility out of hand. Wouldn’t it be great (despite also being a likely plot for a piece of dystopian fiction) if school leaders could naturally select preformed, awesome teachers?

Is there really such a thing as a “natural” teacher or are people described as such just likeable, empathetic, responsive and/or authoritative people? Nick Rose does a great job of explaining the research on this here and here. In short, he tells us that:

  • This report from Rockoff et al finds that “the correlations between teacher characteristics and student outcomes are typically very small or non-existent.”
  • Reviews of effective teaching instead point to two factors with strong evidence of impact on student outcomes:
  1. Teachers’ content knowledge, including their ability to understand how students think about a subject and identify common misconceptions
  2. Quality of instruction, which includes using strategies like effective questioning and the use of assessment
  • Sadler et al’s 2013 study on science teaching finds that subject knowledge alone is not sufficient; in addition, teachers need to be able to identify and address students’ misconceptions.
  • According to some psychologists, Theory of Mind (the ability to infer how others think or feel) plays an important role in our initial ability to teach. Rose highlights this 2002 report from Strauss, Ziv and Stein which, he says, points out that “the ability to teach arises spontaneously at an early age without any apparent instruction and that it is common to all human cultures as evidence that it is an innate ability.” He then lists the following as teaching attributes displayed by five-year-olds when attempting to help younger children to play a board game:
  1. Demonstration
  2. Explanation
  3. Demonstration mixed with explanation
  4. Specific direction or instruction
  5. Question used to check for understanding
  6. Responsiveness to actions from the learner
  7. Narration of own teaching decisions (“Now I’m going to tell you…”)

At the end of his blog, based on the fact that these skills are displayed by five-year-olds teaching other children how to play a board game, Rose poses the question, “What is the ‘technical’ or ‘professional’ body of knowledge or set of skills required of an effective teacher, which can actually be taught?” Given that many of these things are achievable by five-year-olds, where do we go beyond this?

Nick’s premise is that many five-year-olds could have a go at doing these things. Many five-year-olds “could” cut someone open if you gave them a scalpel. Cutting someone open doesn’t make you a surgeon. Having a natural propensity to direct another child in how to play a board game doesn’t necessarily lead to your being an effective classroom practitioner. What matters, in my mind, is not that we can do these things to different degrees of success from the age of five or four or three, but the extent to which we can develop teachers’ levels of impact when they do each of these things.

In this post, Michael Fordham makes the case that skills cannot be taught, instead advocating the view that what we perceive to be the teaching of skills is essentially the building of someone’s knowledge. That is to say, we can teach someone to ‘know that’ if they do x, then y happens or doesn’t happen, and we can teach them to ‘know how’ doing A affects or doesn’t affect B. I think there are important messages here for how we educate teachers in the knowledge relating to teaching.

Many of the seven elements of teaching which Rose lists above, as well as the many elements which he doesn’t list, can be broken down into procedures with related ‘know-thats’ and ‘know-hows’. In addition, each of these procedures can be refined, practised and honed, with teacher educators providing further knowledge at the point of practice. We can unpick and learn more about what happens if we deliver an explanation in a certain way; we can look at how demonstrating x in a certain way affects y. Though we can’t achieve a dystopian sense of perfection, we can become better teachers. We do not have to be stuck as unnatural teachers if we are supported to become more knowledgeable.

In order for this to happen, there are certain things which I think schools and those involved in ITE could do better.


Behaviour

If teachers in their early days are to flourish, they need an environment in which they can learn to teach rather than learn crowd control. This, I think, means they need to train in schools which have a clear and reliable system for dealing with behaviour issues. All teachers need this; novice teachers particularly need it so that they know what is possible for students of all levels of prior attainment. Imagine trying to learn to drive towards the Magic Roundabout in Swindon for the first time whilst your instructor was shouting at you, prodding you and lobbing food around the car. You’d stop and get out to maintain your sanity. We owe it to our students, and to the novice teachers who are learning to teach them, to maintain an effective system of behaviour in schools.

Content, pacing, interleaving and coaching

Harry Fletcher-Wood identifies here that, just like the students in our classrooms, teachers in their early days forget content from their programmes. Having read this, I was left wondering why we rush them through so much so quickly. In the first instance, I think we need to consider whether there is content that could be stripped out of the ITE curriculum. In the second, if we are serious about educating our least experienced teachers, there is a job to do in pacing the programme out over a longer period of time, so that there is time for interleaving the content as well as space for more deliberate practice of the isolated procedures which teachers use frequently. In this way, there would be more opportunities for coaches to build on the ‘know-that’ and ‘know-how’ knowledge that our novice teachers possess and process.


Assessing novice teachers

The model we use for assessing the teachers we’re educating goes hand in hand with the content of the ITE curriculum. Hundreds of schools have moved away from isolated lesson observations, yet many if not most still use them with their novice teachers. If we have to judge whether novices can do a hundred and one different things from the very beginning of the year, it encourages a rushed approach to educating them. It’s possible, in this model, to assume that a novice has something pinned down when, in fact, there are gaps in their skills. My preference would be a model closer to that offered in Get Better Faster by Paul Bambrick-Santoyo. In the book, he outlines in fine detail the elements of teaching which you’d expect a teacher to have grasped in their first three months. There is a great deal in there, but it also holds back a lot of the content which many of our current models of ITE assessment include, acknowledging that novices are having to get a full grip on these early principles rather than rush on. It’s also far more developmental, as each element is linked to a practice and coaching sequence to support teachers in building these elements over their first few months.

Subject curriculum and assessment

One of the biggest errors we make is expecting those learning to teach, and less experienced teachers, to plan longer sequences of learning too soon. I think there are two main reasons we do this. The first is that, in many cases, nationally, we recognise teachers as novices for two years: the ITE year and the NQT year. This means that we “have” to train them in curriculum and assessment design at some point in this process, squeezed in amongst lots of other elements, so we do it (at least in my experience) too quickly and therefore pretty badly. The second reason is that, in some schools, less experienced teachers make up a high proportion of the teaching staff, so we need them to take on curriculum planning earlier than we might ideally wish. Time spent on the nitty-gritty of planning a longer sequence of learning is time that less experienced teachers can’t spend on the granular work of framing questions in individual exchanges in class, working on their modelling, honing their use of the school’s behaviour system or embedding their routines.

Clarity of curriculum and teaching model in schools

Just as having a clear behaviour system in schools is important, so I also think it helps all teachers, particularly those in their initial years, if there is a clarity to the school’s teaching model. Ours looks like this, drawing on Shaun Allison and Andy Tharby’s Making Every Lesson Count, Martin Robinson’s model of the Trivium and Doug Lemov’s Teach Like a Champion:

Codification Document

Without a model like this, those learning to be teachers (especially unnatural ones like I was) can find it difficult to make the connection between their education programmes and the reality of the classroom – the world of the school can seem a very fuzzy place and, I think, trainees need clarity rather than fuzziness.

Real Assessment vs Gummy Assessment 

This post forms the basis for my presentation at TLT17. Elements of it are drawn from a number of the other posts on this blog.

I’d like to think my children have fairly high levels of cultural capital. My daughter and I have read the Narnia series and The Secret Garden, and we’re just finishing off Salman Rushdie’s Haroun and the Sea of Stories. We do ballet lessons, swimming lessons, orchestra, choir. We go to National Trust properties in the summer. We probably seem so middle class that, even after the first Channel 4 series of Great British Bake Off, you’d think we were still struggling to cope with the departure of Mary Berry (we do actually see it as being equivalent to the advent of anarchy).

And yet my cosy middle-class life of easy aspirations has been polluted by this: my little boy has discovered Real Food vs Gummy Food. This is a challenge in the YouTubes, the main variation of which entails two participants each selecting one of two covered items of food. One is a real item of food like a banana or a fried egg, or even something more substantial like a roast dinner, and the other is a Haribo-style gummy version. One child has to eat the real version; the other has the gummy item. They score the food out of ten using a carefully crafted marking rubric and then start again with more food.

There are hundreds, possibly thousands, of videos in the YouTubes featuring children and adults taking part in this kind of whacky challenge. I have actually used some of the time I’d carefully built up in my attempts at a work-life balance to watch them, saving you the need to do so, as I think there is at least a tenuous link between this and what I really want to focus on here. Plus, it’s always good to start with an anecdote.

My contention here will be that we often use a gummy form of assessment rather than a real one. In my training and for the first fifteen years of teaching, my knowledge of assessment was limited to a few dubiously remembered quotations from Inside the Black Box by Black and Wiliam, a load of ‘assessment for learning’ strategies and APP. I think many who trained at around the same time as me fell into the trap of fairly carelessly pulling our favourite squidgy bits of synthetic assessment from the Starmix pack rather than being able to use a more real form of assessment: one carefully interwoven with curriculum design and deliberate practice.

How did we come to a point where we fell into our bag of gummy assessment? It begins, I’m afraid for me, with a confession. I used to be a Local Authority, National Strategies Consultant for literacy. My PGCE training year at Warwick was built, in large part, around National Strategies materials. During my NQT year and second year of teaching, the department I was a part of worked closely with National Strategy advisers, and at the school I moved to in order to become Head of English, we continued to use National Strategy consultants and support materials. When I got the job as consultant, it was as if my whole teaching career had led me to the unquestioning peddling of other people’s materials. This was essentially a delivery model. Schools, especially schools in which students were underperforming, did most if not all of what they were told rather than developing an understanding of the principles behind curriculum, pedagogy and assessment design.

In my consultancy work, I was guilty of advising others to use more group work, more discovery learning, less teacher talk, heavy scaffolding, short extracts of texts rather than full novels, plays or non-fiction texts and, in terms of assessment, the monstrous behemoth of the Assessing Pupil Progress materials with their hundreds of different objectives and progress grids. I’d go into other teachers’ classrooms to teach one-off lessons to ‘demonstrate’ how to do this, as if it were possible to do in an individual lesson.

This was a delivery model: band-aids for bumps and bruises which often generated rashy allergies. When we use this kind of assessment, we over-complicate matters for our teachers, and this has a knock-on impact on the experiences of our students in the classroom. We should be better than this. I hope we are becoming better than this because we’re taking a more coordinated approach to curriculum and assessment design.

Curriculum and Assessment

Have you ever tried connecting two hoses together while the water’s flowing? It’s a tricky, exceptionally splashy business that’s easy to get wrong. You can say the same in education of curriculum design and assessment. The issue, I think, is that you can’t focus on assessment whilst divorcing it from curriculum design. If you do, you’ll end up soaking wet.

Let me exemplify this by using the example of swimming.

Both of our kids are making their way through the stages of Aqualetes, based on the nationally recognised Swim England Learn to Swim programme. As you watch the sessions unfold over time, you can see the way everything has been carefully sequenced: from the way the instructors get children used to having water in their hair, through doggy paddle, breaststroke, back and front crawl, to developing the children’s knowledge of survival strategies. I’m still not quite convinced by butterfly or backwards sculling, but the rest all makes sense.

The other week, I watched as one of the teachers called a boy back to the side of the pool, re-explained the correct leg movement for breaststroke, showing him with her arms, then gave him two more opportunities to show her that he could do it correctly. The boy set off the first time and you could tell from the reduction in his speed and slight awkwardness in his movement that he was really thinking carefully about the corrections he’d been coached to make. His legs were moving better, but the front half of his body was submerging more between each stroke. This corrective stage was affecting his fluency but he was trying to do exactly as he’d been told. The second time through, his performance improved. It wasn’t, by any means, perfect but it was more fluid and resembled breaststroke more closely. This was Stage 5 swimming and he was moving closer to Stage 5 performance.

Knowing what’s required in Stage 5, and what the child should have been able to do in the previous stage, enables the teacher to isolate and identify where the issue is for the learner. Assessment is easier if you understand the sequencing of prior learning.

But assessment and curriculum alone are not enough for students to improve their performance in a discipline. Once an aspect of the curriculum has been grasped, whether it’s back crawl, simultaneous equations or the use of subordinating conjunctions, students need to continue deliberately practising these granular elements, or steps within procedures, both to improve and to maintain them.

In their book Peak, Anders Ericsson and Robert Pool propose that:

“Some activities, such as playing pop music in pop music groups, solving crossword puzzles and folk dancing have no standard training approaches. Whatever methods there are seem slapdash and produce unpredictable results.

Other activities, like classical music performance, mathematics and ballet are blessed with highly developed, broadly accepted training methods. If one follows these methods carefully and diligently, one will almost surely become an expert.”

Peak by Anders Ericsson and Robert Pool

Some time ago, Bodil Isaksen wrote a blog entitled A Lesson is the Wrong Unit of Time (sadly I can no longer find this available online) in which she argued that we fall into a trap of attempting to plan learning into chunks of an hour, or however long your school’s lessons are, because that feels convenient. This isn’t how learning works. I think, though, that many schools – certainly all the schools I’ve worked in or with – fall into a similar trap with curriculum and assessment design, for which a half term is the wrong unit of time. How many of us have, past or present, looked at the shape of a year and decided to have six units in our English curriculum because that’s the way the academic year is split up in the calendar and those are the points at which we report to parents? If we want the children we teach to move from being novices in our subjects towards becoming experts, then we need to accept that it’s more complex than this at the level of curriculum and assessment, but less complicated than we try to make it at the level of teaching.

Curriculum alone

Swindon Academy’s curriculum design follows a mastery model. Mastery has become a commonplace term in education, used to mean so many different things that it runs the risk of becoming meaningless, so it’s worth explaining what we mean here by a mastery curriculum. For us, mastery can be contrasted with other approaches, such as spiral curricula, which require pupils to move through content at a pre-determined pace, often changing units after four weeks or a half term because it is time to move on rather than because the students have understood the content contained within the module. Our model is based on the following four principles:

  • It’s pitched to the top with clearly mapped, carefully sequenced schemes of learning
  • There’s a shared belief that the vast majority of students can meet our high expectations
  • We have a clear model of teaching
  • There is a process of assessing and closing the gaps

You might be forgiven for thinking that students taught in a mastery system would never need to return to content they had mastered. However, in establishing our curriculum we were keenly aware that, if we wanted students to be genuinely successful, they would need to retain this knowledge way beyond the first testing. Even wizened Jedi need to practise to maintain their level of skill.


Likewise, elite athletes returning to their sport have to find different methodologies to regain their form, and some never do. Once attained, mastery can fade, and it may or may not return.

Jessica Ennis-Hill

Meanwhile, it is also possible for many of us to pass a driving test on the second, third, fourth or umpteenth attempt, to claim at that point that we have mastered driving, and then almost immediately to begin developing driving habits which suggest we had never mastered the procedures in the first place.

A question which is commonly heard in school staff rooms across the country is: ‘Why don’t our students remember what they’ve been taught? How come when it comes to the exam, they seem to forget so much?’ We also wonder why our students don’t use and apply the basic rules of spelling, grammar and numeracy we have taught them – especially when they are writing in subjects other than English or using mathematical processes outside of their maths lessons. To understand why this happens, there are two models of memory and the mind which we believe it’s important for every one of our teachers to know.

The first model of the mind is from Daniel Willingham, which he discusses at length in his book Why Don’t Students Like School? Willingham identifies that the crucial cognitive structures of the mind are working memory (a system which can become a bottleneck as it is largely fixed, limited and easily overloaded) and long-term memory (a system which is like an almost limitless storehouse).

Willingham Memory Model

To exemplify the difference between being able to recall knowledge as a single fact and having to work through an unnecessarily laborious process when facts aren’t stored in long-term memory, Willingham uses the mathematical calculation 18 × 7.
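To make the contrast concrete, here’s a little sketch of my own (not Willingham’s) in Python: recall from long-term memory is a single lookup, while deriving 18 × 7 without the stored fact means holding a chain of partial results in working memory.

```python
# Illustrative sketch only: recall as a single retrieval vs derivation
# as a sequence of intermediate steps held in working memory.

# Recall: the answer is simply retrieved from the "storehouse".
known_facts = {(18, 7): 126}

def recall(a, b):
    return known_facts[(a, b)]

# Derivation: each partial product and the running total has to be
# held in working memory until the final sum is reached.
def derive(a, b):
    tens, units = divmod(a, 10)       # 18 -> 1 ten and 8 units
    partial_1 = tens * 10 * b         # 10 x 7 = 70
    partial_2 = units * b             # 8 x 7 = 56
    return partial_1 + partial_2      # 70 + 56 = 126

print(recall(18, 7))  # one step
print(derive(18, 7))  # several steps competing for working-memory space
```

Both routes reach 126; the point is that the second route uses up the very capacity a novice needs for thinking about anything else.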


It’s worth bearing in mind that, as Willingham admits himself, this is a highly simplified model. A range of other models divide working memory into a set of subsystems. Alan Baddeley, for example, has developed a model which includes a phonological loop, which deals with spoken and written material, a visuo-spatial sketchpad, which deals with visual and spatial information, and an episodic buffer, which binds information together into episodes or events.

Baddeley Model

The central executive in this model monitors, evaluates and responds to information from three main sources:

  • The external environment (sensory information)
  • The internal environment (body states)
  • Previous representations of external and internal environments (carried in the pattern of connections in neural networks)

These alternative models have implications for the ways in which we differentiate learning experiences for students. We don’t currently have a clear map of the information processing pathways and there is evidence that the feedback and feed-forward pathways are more complex than the diagram here shows, but this is a useful representation for us to think about in terms of teaching and learning.

For teachers, a key learning point from both of these models is that if nothing has changed in long-term memory, then nothing has been learned and nothing can be recalled or applied. Our teaching should therefore minimise the chances of overloading students’ working memories and maximise the retention in their long-term memories. Willingham maintains that this requires deliberate, repeated practice. The models therefore have implications for curriculum design, lesson planning, pedagogy and the strategies which students need to develop in order to move towards independence.

The second model of memory I think teachers should be aware of stems from Robert Bjork’s work on learning and forgetting. Again, I’m sure many of you are familiar with his work, but to quickly recap: the storage strength and retrieval strength of a memory explain why we remember some things better than others. Storage strength is how well learned something is; retrieval strength is how accessible or recallable it is.

I’ve adapted the following diagram and explanation from David Didau’s blog.

Bjork - Storage and Retrieval Strength

Making learning easier causes a boost in retrieval strength in the short term, leading to better performance. However, because the deeper processing that encourages long-term retention is missing, that retrieval strength quickly evaporates. The very weird fact of the matter is that, when you feel you’ve forgotten how to do something because the task you’ve taken on is difficult, you are actually creating the capacity for learning. If you don’t feel like you’ve forgotten, you limit your ability to learn.

So we actually want students to feel like they’ve forgotten some of their knowledge. When learning is difficult, students make more mistakes and naturally they infer that what they’re doing must be wrong. In the short-term, difficulties inhibit performance, causing more mistakes to be made and more apparent forgetting. However, it is this “forgetting” that actually benefits students in the long-term – relearning forgotten material takes demonstrably less time with each iteration. I think this connects to Robert Coe’s best proxy for learning being students having to think hard about challenging subject content. This could have the following implications for our curriculum design:

  • We should space learning sessions on the same topic apart rather than massing them together
  • We should interleave topics so that they’re studied together rather than discretely
  • We should test students on material rather than having them simply restudy it
  • We ought to have learners generate target material through a puzzle or other kind of active process, rather than simply reading it passively
  • We should explore ways of making learning appropriately challenging rather than easy
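As an entirely illustrative sketch of the first three implications (my own toy example in Python, not a published algorithm), spacing can be written down as expanding review intervals and interleaving as mixing topics within one quiz rather than blocking them:

```python
import random
from datetime import date, timedelta

# Toy illustration: expanding gaps between retrieval attempts (spacing)
# and mixed-topic quizzes (interleaving). The interval values are
# arbitrary, chosen only to show the expanding pattern.
INTERVALS = [1, 3, 7, 14, 30]  # days between successive retests

def review_dates(start, intervals=INTERVALS):
    """Return the dates on which a topic should be retested."""
    day, dates = start, []
    for gap in intervals:
        day = day + timedelta(days=gap)
        dates.append(day)
    return dates

def interleave(topics, questions_per_topic):
    """Build one quiz that mixes questions from several topics."""
    quiz = [(t, q) for t in topics for q in range(questions_per_topic)]
    random.shuffle(quiz)  # topics appear together, not in discrete blocks
    return quiz

print(review_dates(date(2017, 9, 1)))
print(interleave(["fractions", "ratio", "algebra"], 2))
```

The dates are test dates, not restudy dates: the point of each return visit is retrieval, not rereading.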

Assessment alone:

We’ve found, in introducing our mastery curriculum as well as our teaching and learning model, that it’s useful to have a shared vocabulary so that teachers can have efficient and effective conversations about their work and their impact. This should also be the case with assessment practices. The following terms (all of which I’ve learnt from reading the work of Daisy Christodoulou) will, I think, be key to developing a shared understanding of assessment practices:


Domain

The domain is the entirety of the knowledge from which an exam or assessment could draw to test a student’s understanding or ability. At Key Stages 4 and 5, this is defined by the specification, though there are also elements of knowledge from previous Key Stages which aren’t listed in specifications but still form part of the domain.


Sample

The sample indicates the parts of the domain which are assessed in a specific task or exam. It’s rare we’d assess the whole of a domain, as the assessment would be overly cumbersome. Well-designed assessments are carefully thought through: samples should represent the domain effectively so that valid inferences can be made from the data gained from the assessment.


Validity

The validity of an assessment relates to how useful it is in allowing us to make the inferences we’d wish to draw from it. “A test may provide good support for one inference, but weak support for another.” (Koretz D, Measuring Up) We do not describe a test itself as valid or invalid, but rather the inferences which we draw from it.


Reliability

Daisy argues that if an assessment is reliable, it would “show little inconsistency between one measurement and the next.”

Test reliability can be affected by:


Sampling

  • Most tests don’t directly measure a whole domain; they only sample from it because the domain is too big. If the sample is too narrow, the assessment can become unreliable.
  • If the sample is always the same, teachers will strategically teach to the test to seemingly improve student performance.


Marking

  • Different markers may apply a mark scheme or rubric differently.
  • One marker’s standards may fluctuate during a marking period.
  • Teachers can consciously or subconsciously be biased towards individuals or groups of students.


The pupil on the day

  • Performance on a particular day can vary between the start and end of a test.
  • Students perform differently due to illness, time of day, whether they have eaten, or the emotional impact of life experiences.
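To see why narrow sampling hurts reliability, here’s a toy simulation of my own (not from Christodoulou’s work, and the mastery figures are invented): a pupil with fixed levels of mastery across five topics sits a narrow test and a broad test many times, and the narrow sample produces a far wider spread of scores.

```python
import random

# Toy simulation: the pupil's "true" probability of answering a
# question correctly in each topic. These values are invented.
mastery = {"blue": 0.9, "green": 0.8, "yellow": 0.7,
           "purple": 0.4, "light_blue": 0.2}

def sit_test(topics, questions_per_topic, rng):
    """Simulate one sitting: fraction of sampled questions answered correctly."""
    asked = correct = 0
    for topic in topics:
        for _ in range(questions_per_topic):
            asked += 1
            correct += rng.random() < mastery[topic]
    return correct / asked

rng = random.Random(42)
# Narrow sample: 5 questions drawn from a single topic.
narrow = [sit_test(["blue"], 5, rng) for _ in range(1000)]
# Broad sample: 50 questions spread across the whole domain.
broad = [sit_test(list(mastery), 10, rng) for _ in range(1000)]

spread = lambda scores: max(scores) - min(scores)
print(spread(narrow), spread(broad))  # narrow test gives noisier scores
```

The same pupil, with the same underlying knowledge, gets wildly different scores on the narrow test from one sitting to the next; the broad sample is far more stable, which is the sampling point above in miniature.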

Difficulty model

In this form of assessment, students answer a series of questions of increasing difficulty. A high jump competition or a GCSE Maths exam paper are good examples of this model.

Quality model

Here, students perform a range of tasks and the marker judges how well they have performed, most often in relation to a set of criteria. Figure skating competitions and English and history GCSE papers use this model.

General issues which Christodoulou identifies with the most common assessment models:

  • A focus on the teaching and assessment of generic skills can lead to teachers paying insufficient attention to the knowledge required as a foundation for those skills. For example, vocabulary, number bonds, times tables, historical chronologies or relevant subject-specific facts can be overlooked in favour of how to evaluate or problem solve.
  • Generic skill teaching makes deliberate practice far more challenging, as it focuses on larger-scale success as opposed to fine-grained assessment and training. For example, formative assessment in sport may take place during a match rather than a drill; here, the coach may miss an issue which a student has with a specific aspect of the sport and so never address it.
  • Using only exam questions for assessment, especially though not exclusively in subjects whose exams are based on the quality model, can hide weaknesses at a smaller scale.

To more fully grasp this, take a look at these two videos, imagine you’re a cricket coach and think about which clip would be most useful to support your formative coaching of a batsman and which would be most helpful in picking a batsman for your team.

In the first of the two clips, as a coach, you can see the player’s ability to repeatedly respond to a specific situation. The ball lands in almost exactly the same spot every time. As with the swimming coach earlier on, you can provide feedback on the response and potentially provoke an immediate change in processing. However, this drill doesn’t provide you with information about how the player will respond to the same situation in match play. The second clip may or may not provide you with this, as you could watch hours of footage without seeing the same kind of ball bowled down the pitch. When it does come, there are factors other than the bowling motion, the ball’s movement through the air and the bounce off the pitch which could affect the player’s reaction: the pattern of preceding balls, the length of time the batsman has been at the crease, the relationship between the batsman and his current partner, the quality of sledging the batsman has been exposed to. There are, therefore, times when we need to drill our students in the really granular elements of our subjects, to be able to provide them with high-impact, immediate feedback, and, I believe, times when we need to allow them to play something more akin to the full match.

This requires a greater understanding of assessment design and the relationship between the curriculum and the assessment.

When we teach, we teach the whole of a subject domain. A (hopefully representative) sample of the domain is used for summative assessments. If the domain is colour and you’ve taught the whole spectrum, then this sample could be a good one.

Domain and Sample

The sample in terminal exams won’t be the same year on year though – just as no one cricket match is the same as another. If your students had sat GCSEs which covered this sample, and you hadn’t taught them much about light blue, then you might be quite relieved.

Domain and Sample 2

If you were to turn this into a school subject, Spanish, and imagine that light blue is the equivalent of aspects of language relating to family, and you’d spent quite a lot of your curriculum time on family, then you could feel like kicking yourself – though you may well be wrong to do so.

When producing assessments, you need to consider how the sample you’ve selected might skew your outcomes, the inferences you draw from these outcomes and the actions you take as a result of those inferences. Depending on the assessment format, this sample could be used to make valid summative inferences if you’ve just taught the blue, yellow and green elements.

Domain and Sample 3

This sample, meanwhile, may be less effective in supporting valid summative inferences if you’ve taught the blue, yellow and green elements, but could be used well if you’ve just taught yellow and green. Having said this, it doesn’t assess the aspects of purple students were taught last year to see how much they’ve retained.

Domain and Sample 4
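The sampling idea in the colour diagrams above can be sketched in code. The sketch below is illustrative only: the topic names, sample size and random draw are invented for the purpose (a real exam sample is designed, not drawn at random).

```python
import random

# A toy domain of topics, standing in for the colour spectrum above.
DOMAIN = ["red", "orange", "yellow", "green", "blue", "light blue", "violet"]

def draw_sample(domain, size, seed=None):
    """Draw a random 'exam sample' of topics from the domain."""
    rng = random.Random(seed)
    return rng.sample(domain, size)

def coverage(sample, taught):
    """Proportion of sampled topics the curriculum actually covered."""
    return sum(topic in taught for topic in sample) / len(sample)

taught = {"blue", "yellow", "green"}        # what was taught this year
sample = draw_sample(DOMAIN, size=4, seed=1)

# Low coverage means poor results may reflect the sample drawn,
# not students' grasp of what was taught.
print(sample, coverage(sample, taught))
```

The point of the toy is that the same cohort can look strong or weak depending on which slice of the domain happens to be sampled – which is exactly why inferences need to be drawn with the sample in mind.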

Two added complications arise in a subject like English language. The first is that the primary domain in the GCSE papers is a set of procedures and the second is that there is what could be described as a hidden domain. As English teachers in AQA centres, we know that the pattern of questions will always be the same on both papers. If you take each strip of colour below to be a separate question, you could teach the procedures students need to follow for these questions ad nauseam. This would cover the domain which will invariably be sampled in the paper.

Domain and Sample

The second English language paper, though, can feature texts from any number of different domains: geography, history, religion, philosophy, science. Much of the vocabulary, grammar and linguistic knowledge required is also hidden if you only teach the procedures for responding to the question types. Again, this highlights the need both to drill students in the granular details and to give them opportunities for match play.

Hidden Domain and Sample

Bringing the hoses back together

Typically, in my experience at least, schools will map their curriculum and assessment so that it looks something like this:

Typical map of assessment and curriculum

Chunks of time are allocated to specific blocks of the curriculum. Often these blocks are dealt with discretely and assessed separately; students forget content and, as they are not required to recall it frequently, or potentially at all, they are less successful when their abilities are sampled in a terminal examination.

An alternative model is to carefully sequence content and interleave both content and assessment so that students are having to more frequently recall elements of the subject. This would look a little more like the model below. Each module introduces new curricular content, but also involves further assessment of prior content to secure a greater chance of increasing storage strength and retrieval strength.

Enhanced map of curriculum and assessment
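A minimal sketch of the interleaved model might look like this, where each module pairs new content with retrieval of earlier modules. The module names and the “look back two modules” rule here are placeholders, not a prescription:

```python
def interleaved_schedule(modules, lookback=2):
    """Pair each module's new content with retrieval practice
    drawn from up to `lookback` earlier modules."""
    schedule = []
    for i, module in enumerate(modules):
        review = modules[max(0, i - lookback):i]
        schedule.append({"teach": module, "retrieve": review})
    return schedule

for entry in interleaved_schedule(["Unit 1", "Unit 2", "Unit 3", "Unit 4"]):
    print(entry)
```

Widening the lookback window, or weighting review towards the content students most need in the long term, would push the model closer to the spaced, cumulative assessment described above.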

To support our implementation of these principles in school, we’ve identified two aspects we want to address: a teacher element and a student element.

What are our next steps for the teacher element?[1]

  1. Ensure the curriculum is effectively mapped out and sequenced, establishing the factual and procedural knowledge which students will learn. Divide the knowledge from the curriculum into that which students need in the long term and that which students need for a specific unit. Ensure the bulk of curriculum and prep/revision time is spent on students focusing on retaining the most important knowledge. Build space into the curriculum to assess retention of knowledge from previous units which students need in the long term.
  2. Establish when students will be assessed both summatively (whole Academy calendar) and formatively (faculty curriculum overviews). As far as possible, this should take into consideration: the completion of teaching of all elements, and enough time between teaching and testing both for revision and to ensure that our inferences are based on learning rather than performance.
  3. Ensure that the purpose of each assessment is clear to all involved in its design, delivery, marking and provision of feedback. The format of the test should enable the function to be achieved. It should also ensure that the inferences drawn from the results are as valid as possible. The main purposes of our summative assessments include re-streaming students, reporting to parents, establishing attainment and progress over time in teaching groups and cohorts of students to report to governors. A key question for you here is whether your summative assessments are reliable enough to enable you to validly infer that certain students are working at “age related expectations” in your subject. Formative assessments should be used to identify potential gaps in knowledge, misconceptions or deficiencies in ability that can be subsequently addressed.
  4. Design assessments aligned with this timing and purpose. Using Christodoulou’s principles for summative and formative assessments will help here. Over time, two separate banks could be built up: one of summative and one of formative assessment tasks. For summative assessment, it’s also worth asking yourself the following questions, based on those found in Bambrick-Santoyo’s book Driven by Data. Do assessments in each year:
    • Address the same standard of skill/content as the end of Key Stage assessment?
    • Match the end of Key Stage assessment in format?
    • Enable students to move beyond that year’s content/skill level?
    • Reassess previously taught content which is necessary to retain until the end of the Key Stage?
  5. Trial the use of comparative judgement in subjects where a substantial proportion of assessment uses the quality model. 
  6. Preview assessment tasks to ensure that:
  • Questions don’t provide clues as to the answer.
  • Questions are actually testing that students have learned or can apply the knowledge you wanted rather than something else.
  • Questions are worded accurately and any unnecessary information is removed.
  7. Review assessments after use to establish whether they provided you with information that enabled you to make the inferences you wished. Make amendments to assessment items, where required, if they are to be reused in the future. 
  8. Standardise the conditions in which summative assessments take place and the ways in which they are marked. 
  9. Ensure that, where data from assessments is used to make key decisions, the data is sufficiently reliable. For example, when moving students between sets, data from more than one assessment is utilised.
  10. Develop the teaching and learning review which forms part of each teacher’s CPD Booklet to ensure that teachers have action plans in place to address gaps in attainment.
  11. Establish procedures for Curriculum Leaders to review and summarise teachers’ action plans, sharing them with their Line Managers for quality assurance.

The Student Element. 

Over the past two years, a number of our faculties have been trialling the use of knowledge organisers and low-stakes testing or quizzing as part of the process of curriculum design. Different models have emerged, sometimes with different purposes and using different frameworks. We want to make the use of knowledge organisers, self-testing and flashcards a core part of our students’ prep across subjects.

In order to secure the highest impact of this work, we need to evaluate the models currently in use to generate a set of shared principles and uses for these tools. We need to be sensibly consistent in our approach, keeping in mind the differences between the subjects that we teach. There are certainly potential benefits to the use of both knowledge organisers and quizzing, but we need to ensure these are harnessed effectively in each subject area.

Why should we bother with quizzing and knowledge organisers? Aren’t they just fads?

The term knowledge organiser could be a fad, but the idea of organising knowledge into schemas certainly is not – it has been going on for centuries.

As subject specialists, having carefully mapped our curriculum through from Key Stage 3 to Key Stage 5, it would be both wise and desirable to look for the most effective methods to ensure that students retain as much of the knowledge we are teaching them from one year to the next and, of course, into their lives beyond school. 

On a more pragmatic level, in order to support our students to do well with the new GCSE qualifications, we need to help them develop methods for retaining knowledge in the longer term. These qualifications are now more demanding. They require students to retain knowledge longer as they are based increasingly on terminal examinations rather than coursework and they ask more of them in terms of problem solving.

Even if it weren’t for this though, over the course of the last century, hundreds of cognitive science studies have ranked practice testing as one of the most effective methods of improving the retention of information and procedures in long-term memory. In 2013, five cognitive scientists (Dunlosky, Rawson, Marsh, Nathan and Willingham) collated hundreds of such studies and showed that practice testing has higher utility for retention and learning than many other study techniques.

The table below is taken from John Dunlosky’s “Strengthening the Student Toolkit”. In this paper, he argues that, “while some [study] strategies are broadly applicable, like practice testing and distributed practice, others do not provide much – if any – bang for the buck.” Low-stakes practice testing is one of the most effective study methods. 


Alongside this sits Cognitive Load Theory and the work of John Sweller. Our teaching and learning handbook outlines the idea that our working memories have limited capacity, coping with only approximately 7 ± 2 items of information. Once we go beyond these limits, our thinking processes become bogged down. These ideas have been refined over the last couple of decades into a set of instructional principles called Cognitive Load Theory. In their book “Efficiency in Learning”, Sweller et al. argue that, “Taken together, the research on segmenting content tells us that:

  • Learning is more efficient when supporting knowledge, such as facts and concepts, is taught separately from main lesson content.
  • Teaching of process stages should be preceded by teaching the names and functions of components in the process.
  • Teaching of task steps should be segmented from teaching of supporting knowledge such as the reasons for the steps and/or concepts associated with the steps.”

Well-designed knowledge organisers or schemas and effective self-testing could therefore be useful in reducing the cognitive load on our students when they are applying knowledge in performance, production or problem solving.

Knowledge Organisers

In a blog post entitled “Knowledge Organisers: Fit for Purpose?”, Heather Fearn describes how she looked at lots of examples of knowledge organisers and found that there was often confusion over their purpose, which caused the documents to be muddled in design. As a result, they were confusing for students to use. She identifies three valid purposes:

  • A curriculum mapping tool for the teacher
  • A reference point for the pupil
  • A revision tool for the pupil and possibly parents

Given that we have Schemes of Learning for teachers to make use of and text books for students as a wider reference resource, I believe a useful definition of a knowledge organiser at Swindon Academy would be:

A structured, single A4 sheet which students, teachers and parents can use to create low stakes practice quizzes. The sheet identifies the raw knowledge which needs to be recalled swiftly in order to be successful within the assessment for a specific unit. This could include: 

  • Definitions of terms, concepts or key ideas
  • Components of a process
  • People/Characters involved in a chronology
  • Processes/Chronologies/Narrative summaries
  • The steps in procedures

Use the following to check the formatting of your knowledge organisers.

  • Identify knowledge which will be required beyond the end of the unit and knowledge which is only required in the medium term.
  • Include the absolute key words so that peer markers of quizzes can check they have been used in answers.
  • If you have to write more than one sentence, consider your phrasing. This will make your own explanations clearer and more efficient when you speak.
  • Don’t have too many sections/categories – four or five are probably sufficient.
  • If including images, ensure these are the same format as those you will use in your actual lessons.
  • Spellcheck your knowledge organiser.
  • Don’t include ‘thoughts to consider’.
  • If it isn’t essential it shouldn’t be there.


In his blog post “One Scientific Insight for Curriculum Reform”, Joe Kirby of Michaela Community School poses the question: “what’s the optimal format and frequency of low-stakes testing or retrieval practice?” He cites various research papers from Roediger et al. In terms of format, he maintains that “Applied research suggests [well designed] multiple-choice questions are as effective as short-answer questions. The latest research study is as recent as March 2014, so this is a fast-evolving field, and one to keep an eye on.” With regard to frequency, he adds that shorter, more frequent quizzes outperform longer, less frequent ones. However, current research suggests that the impact on long-term memory is maximised if this testing is spaced and interwoven.

He then goes on to summarise the work of a number of cognitive psychologists from the book “Make It Stick” in the following set of principles for self-testing:

  • Use frequent quizzing: testing interrupts forgetting
  • Roll forward into each successive quiz questions on work from the previous term.
  • Design quizzing to reach back to concepts and learning covered earlier in the term, so retrieval practice continues and learning is cumulative.
  • Frequent low-stakes quizzing in class helps the teacher verify that students are in fact learning as well as they appear to be and reveals the areas where extra attention is needed.
  • Cumulative quizzing is powerful for consolidating learning and concepts from one stage of a course into new material encountered later.
  • Simply including one retrieval-practice test in a class yields a large improvement in final exam scores, and gains continue to increase as the frequency of testing increases.
  • Effortful retrieval makes for stronger learning and retention. The greater the effort to retrieve learning, *provided that you succeed*, the more learning is strengthened by retrieval.
  • In virtually all areas of learning, you build better mastery when you use testing as a tool.
  • One of the best habits to instill in a learner is regular self-quizzing.

What are our next steps for the student element?

  1. Design knowledge organisers which fit the definition above for Schemes of Learning in Years 7-9.
  2. Use the checklist above to review the knowledge organisers.
  3. Devise self-tests or drills which could be used to assess students’ retention of the knowledge. This should include:
  • Completion of a blanked out timeline
  • Matching definitions and key terms
  • Labelling key diagrams from the organiser
  • Answering questions based on the knowledge organiser
  • A crossword with definitions from the organiser as the clues
  • Translation exercises for MFL using vocabulary from the organiser
  • Short answer questions and multiple choice questions based on the knowledge from the organiser
  4. Generate a prep schedule for students for self-testing of the sections of each knowledge organiser. In the first week, students will produce flashcards based on the organiser; in future weeks, students will use Look And Say And Cover And Write And Check (LASACAWAC) or an online quizzing platform for a specific proportion of their prep each week.
  5. Ensure knowledge organisers are stuck into each prep book.
  6. Train students in how to use their knowledge organisers.
  7. Ensure that, as students move through the key stage, they are frequently testing themselves and being assessed in class on knowledge from previous units which they require at the end of the key stage.
  8. Add the schedule to E-Praise (the online homework record we use).
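The prep schedule in step 4 could be generated along these lines. This is a rough sketch under assumed parameters – the section names, the week count and the “two sections per week” rotation rule are all invented for illustration, not our actual E-Praise setup:

```python
def prep_schedule(sections, weeks):
    """Week 1: make flashcards for every section of the organiser.
    Later weeks: rotate sections for self-testing so each one
    comes round again at spaced intervals."""
    schedule = {1: [("flashcards", s) for s in sections]}
    for week in range(2, weeks + 1):
        # Simple rotation: shift the starting section each week.
        start = (week - 2) % len(sections)
        rotated = sections[start:] + sections[:start]
        schedule[week] = [("self-test", s) for s in rotated[:2]]
    return schedule

plan = prep_schedule(["Key terms", "Timeline", "Diagrams", "Processes"], weeks=6)
for week, tasks in plan.items():
    print(week, tasks)
```

In practice the rotation would be set against the curriculum map, so that sections needed at the end of the key stage keep resurfacing.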

Here are the knowledge organisers for terms one and two for Year 7.

[1] Some of the following are taken from Phil Stock’s sequence of blog posts “Principles of Assessment.”

We bring the stars out

In the AQA GCSE English Language qualification, Section B of both exam papers features a writing task. Paper 1 offers either a descriptive or a narrative task, whilst Paper 2 requires students to write a letter, newspaper article, essay, leaflet or speech. On the whole, in 2017, our Year 11 students performed better in the non-fiction task than the creative one. I know some people are of the view that the question (the nasty one with the bus) was partly to blame for their students’ weaker responses, but I’m not willing to accept that. 

Describe a time when your bus journey was actually stressful rather than just a bit tedious.

And we can do this until we pass out. 
Our cohort walked into their exams better prepared for writing the non-fiction response, in large part, because we’d given them a clearer strategy. We’d shared and pulled apart more models, they had a clearly defined method for dealing with the question, and they’d frequently and deliberately practised elements or the whole of the task. Almost every week we had the whole year group in the exam hall to sit a non-fiction question with a short input. Then our Head of English and I would sit and mark all the responses before preparing whole-cohort feedback and tweaking the planning for the following week. It worked. 

Surely we’d done the same with the narrative and descriptive tasks? You’d think so, but at Christmas of Year 11, the cohort had been better at the creative task and the non-fiction had been a relative flop.  

Describe a time when you felt like a one eyed mangy cat because you hadn’t taught something as well as you might.

We flipped our focus and arguably went too far the other way. 

What I want to explore here is a method of using what we did with paper 2 and developing it for paper 1. In the first instance, I want to look at planning. 

Check out my visual, Check out my audio:

The following is taken from the exam board report from the June 2017 series:

“Unfortunately, there was also considerable evidence of a lack of planning. Occasionally, spider diagrams were used, which may generate ideas but do not help with organisation or cohesion, whilst other ‘plans’ consisted of mnemonics, usually linguistic techniques the student intended to include regardless, which may aid some of the less able students but tends to stifle the creativity of the most able. A lack of planning also resulted in unnecessarily lengthy responses, where the more a student wrote, the greater the deterioration in ideas, structure and accuracy. Many students would have benefitted from a quality rather than quantity approach: having the confidence to take time to plan, and then craft a shaped and structured response in two or three sides, with time at the end to revise and improve. This would certainly have helped those who started ambitious narratives but managed to get no further than establishing the two characters because they set out to achieve the impossible in the time given.”

These kinds of comments are fairly typical in examiners’ reports. Unfortunately, when I asked AQA if they had released or would be releasing any examples of really effective responses with efficient and impactful plans, they said that their exemplar responses didn’t include plans as these aren’t marked. I think this is a shame, as it’d be useful to see what the exam board believes an effective plan looks like. 

A big part of me feels that planning primarily makes an examiner more likely to assume a student has written well. Plans are a proxy for better writing, like handwriting. They act like a suit at an interview. 

In the long term, what actually makes students better at crafting effective narratives and descriptions is a well mapped out curriculum, moving students through the composition, structuring and connecting of sentences and paragraphs and the development of a range of mental models of effective creative writing. I’ve just started reading The Writing Revolution by Judith Hochman and Natalie Wexler and would encourage you to do the same if you’re looking for a model of how to do this. 

There are, though, students who arrive at the start of year 11 who don’t have these tools at their disposal. We have to ask ourselves what we can do for these students. Could a clearly structured planning process with modelling and exemplification of how this could transfer into a complete composition provide them, in the short term, with a better grade and, in the long term, a better grasp of how to structure their writing? Perhaps. And because the answer was perhaps, I was willing to give it a go. 

When I asked for a suggested planning model on Twitter, a few very helpful people came forward. Some suggested strategies which were more useful for the non-fiction writing. Some suggested strategies which were great for questions where there was a linked image but I was looking for something which could work for the question with the image and the one without. Some suggested strategies which were focused mainly on literary or linguistic techniques and I wanted a model which primarily supported students to plan the structure of their writing. 

Describe a time when things didn’t go according to plan.

My students needed a shortcut. What was required was something that could be done swiftly, within about ten minutes; something that worked for both descriptive and narrative writing tasks both with and without linked images; and something where students believed that the planning process would genuinely improve their writing – it’s not a straightforward process convincing students with gaps in their knowledge from the past that spending time on part of an answer which isn’t even marked is a worthwhile use of their time in an exam. 

We about to branch out. 

Luckily, @JoanneR85 (I’m afraid I don’t know her actual name at the time of writing) got in touch with a suggestion I’m working on with my two groups. Her suggestion was something she called:

  1. Drop
  2. Shift
  3. Zoom in
  4. Zoom out

I’ve added two additional steps to this so far. It’s now become:

  1. Thoughts/Feelings + Contrasts
  2. Motif
  3. Drop
  4. Shift
  5. Return/Zoom in
  6. Zoom out and leave

The first two steps here are intended to help students create a thread through their writing, whilst steps three to six provide them with a strategy for planning out, in note form, the phases in their writing.

They begin by dropping the narrative voice into the text.

Then they shift to another time, contrasting mood or alternative place based on the stimulus. 

The third step is to return to the original point in time or location and mood and/or zoom in on a tiny detail in a way that illuminates the character’s feelings. 

Finishing off involves zooming out and leaving the location. 

The motif must appear at a number of points in the text and at least twice – once towards the beginning and once at the end.

Here are two worked examples: one for a description based on an image and one for a narrative based task. 

Worked example 1:

Write a description of a forest based on this image. 

Thoughts/Feelings + Contrasts. 

  • Mystery/Creepiness/Confusion vs Clarity/Understanding
  • Sadness/Upset/Depression vs Happiness/Contentment 

Motif

  • Mist – linked to confusion
  • Red flowers – linked to poppies/remembrance 
  • Stream – reflectiveness

Drop in 

Early morning makes the forest seem dreary, tall trees – personify like soldiers, expect singing birds but none there, rotting leaves, autumn – all is beginning to die, mention the motif of the red flowers, link to poppies, grandfather has just died. 

Shift

Six months earlier, bright day, walking through the forest with your grandfather, describe him physically and one thing you remember him doing that shows what he was like as a person. Needs to be a happy memory to contrast. Make it on Remembrance Day so he’s wearing a poppy. 

Return and/or zoom in 

Back in the present. Dreary again, return from my thoughts. Zoom in on a twig and describe in detail – size, pick it up and texture, strength, should you step on it and break it or leave it and move on?

Zoom out and leave 

Look around again and notice the red flowers. They’re beginning to lose their petals, crumpling up at the edges. A bird begins to sing in the background. You leave the scene to return home. Describe the mist as if it’s wrapping itself around you – protective but confusing.

Worked example 2:

Write a story that begins with the sentence: ‘This was going to be a brilliant day, one of those days when it’s best to jump out of bed because everything is going to turn out perfectly.’

Thoughts/Feelings + Contrasts

  • Optimism/Hope vs Pessimism
  • Anticipation/Excitement vs Gloom/Disappointment 

Motif

Broken baby Jesus from a Christmas crib scene. 

Drop in

Flashback to priest’s first Christmas as a priest. Pews heaving. Describe two members of the congregation. Grandmother holding a baby – the first the priest had baptised. Baby Jesus in the crib freshly painted. Looks vivid, glorious. 

Shift

Flash forward. Priest in his church looking around at Christmas time alone ten years later. Incense smell, candles burning – lights dim. Sense of emptiness. 

Return and/or zoom in

He picks up the Jesus figurine. It’s cracked, paint peeling, yellowing. Link it to the child from before who tragically died last week. He’d led the funeral service. 

Zoom out and leave

Describe the church as he walks out – empty pews, no sounds. Still holding the figurine. He drops it in the bin before leaving the church for the last time. 

They say hello, they say hola and they say bonjour

Describe a time when you were optimistic about the future.

I introduced this planning strategy to my classes this week by sharing each step. 

I modelled a step with one image, then gave the students two minutes to complete the step with another image, before moving on to the next step. 

This meant that the planning lasted twelve minutes in total which would leave students plenty of time to write their responses. The benefit, hopefully, will be that they have cognitive space to focus on how they’re writing rather than what they’re writing. Fingers crossed, the overarching structure will be much tighter too. 

I’ll let you know how they got on with their writing soon. 

Litteranguage – Part 2

In this sequence of posts, which I began here, I’m exploring how far the English Language GCSE is fit for purpose.

One issue for the qualification is the long list of purposes it’s meant to fulfil. In Part 1, I looked at the GCSE’s relationship with the curriculum. This time, we’ll look at its connections with assessment and explore how well it assesses performance relating to a proportion of the content of the National Curriculum for English at Key Stage 4.  

The way that we walk, the way that we talk, so easily caught. 

In terms of the first of these two purposes, it’s worth reminding ourselves that the government publish both a National Curriculum document and a document which identifies the Subject Content and Assessment Objectives for Key Stage 4. Exam boards then produce their specifications, such as this one for AQA English Language, and their examination papers are based on these. Here are the specimen exams for AQA Paper 1 and Paper 2. In the case of most English exam boards, and certainly in the case of all of AQA’s sample papers, as well as this year’s first proper exam paper, the format and sequence of the questions is formulaic. In addition, the exam boards provide training to teachers to help them support their students to answer these formulaic question types. Teachers become examiners with the dual purpose of making a little more money for the summer and understanding how best to answer the questions so that they can share this with their students the following year. Schools will organise for Edexcel to share their students’ marked scripts this year so that their future students get better grades than those of schools who have not. We do this not because we want to know how to teach the English curriculum better, but because we want to teach the exam better so that our students get better results. At the moment, that means being better than other students in their cohort.

Most, if not all, of us are complicit in this. We want our students to do well in their exams for all kinds of reasons. Perhaps it’s because the results will have an impact on their life chances. Perhaps it’s because the results affect their self-esteem or our own sense of self-worth. Perhaps it’s because the results are a measure of our department or our school. Perhaps it’s due to performance related pay.

Water’s running in the wrong direction, got a feeling it’s a mixed up sign.

In order to more fully understand why this is important in relation to assessment, we need to remind ourselves of three key pieces of terminology relevant to the field. The following are adapted from my reading of Daisy Christodoulou’s latest book. If you’ve not yet read it, then you should.

Domain

The domain is the entirety of the knowledge from which an exam/assessment could draw to test a student’s understanding/ability. In the case of Key Stage 4 English, this is defined by the two governmental documents mentioned above and the exam board specification. However, there are also vast expanses of knowledge from previous Key Stages, as well as life generally, which aren’t listed in the specification that still form part of the domain for English Language. The subject is currently an aleph – a point at which all other subjects meet.

Sample

The sample indicates the parts of the domain which are assessed in a specific task or exam. It’s rare we’d assess the whole of a domain as the assessment would be overly cumbersome. Well designed assessments are carefully thought through. Samples should represent the domain effectively so that valid inferences can be made based on the data gained from the assessment. The sample in English Language is defined each year through the choice of texts and tasks which are set in the exams. For example, if we take AQA’s Specimen Paper 2, we can see that the texts and questions sample (to name but a few) students’ knowledge of:

  • Vocabulary, including “remorselessly,” “obliged,” “confidentially” and “endeavour.”
  • Father son relationships and educational experiences in social and historical contexts which may be vastly different to those in their own lives.
  • Irony.
  • Voice.
  • Linguistics and literary terminology.
  • Typical linguistic and structural features of a broadsheet newspaper article.

In addition, though, because the question types are formulaic and because AQA and other exam boards produce booklets like this, there is a wave of procedural knowledge which it appears students need in order to respond to each task type. This domain is sampled too. In the worst case, students need to know that AQA Paper 1 Question 4 asks them to evaluate but doesn’t actually want them to evaluate. 


The validity of an assessment relates to how useful it is in allowing us to make the inferences we’d wish to draw from it. “A test may provide good support for one inference, but weak support for another.” (Koretz D, Measuring Up) We do not describe a test as valid or invalid, but rather the inferences we draw from it.

I think the main inference we draw from someone’s grade in an English Language GCSE is that they have a certain level of proficiency in reading and writing. This level of proficiency could be measured against a set of criteria, it could be established through comparison with a cohort of other students, a combination of the two or through an odd statistical fix. At a student level, we measure these proficiencies in order to decide whether students:

  • Have the ability to take a qualification at a higher level.
  • Are able to communicate at a level appropriate to the job for which they’ve applied.

In order to be able to make these inferences, education institutions, training providers and employers need to have a shared understanding of the relationship between grades and competency or ability levels.

Problems arise here, firstly because these three groups want different parts of the domain emphasised in the sample. Many employers would want to know that applicants who have a grade 4 have basic literacy skills. In this report from 2016, though the overall tone is one of positivity about improvements in schools, the CBI reports that 50% of the employers they’d surveyed said “school and college is not equipping all young people with…skills in communication.” Further to this, 38% of respondents said “There should be a [greater] focus in this [secondary] phase on developing pupils’ core skills such as communication.” 35% said the same of literacy and numeracy. When employers talk of communication or literacy skills, they tend to mean competency in reading for meaning, fluency of speech and clarity of writing in terms of spelling, punctuation, grammar and structure. In addition to these things, educational establishments – particularly where the student is applying for further qualifications in English Language – are likely to want to know how well the student can analyse language and write creatively. As students only receive a single grade for all of these things, it’s possible that a student has performed adequately in terms of language analysis but not the other aspects, or vice versa. This could lead to invalid inferences being made.

Further problems occur because what you can infer from the grades depends on whether your results are criterion referenced or norm referenced, as well as on how far the system of comparable outcomes, which we now use, is understood by those making the inferences. In the 2017 CBI report, we’re told that “more than a third of businesses (35%) are wholly unaware of the GCSE grading reform in England.” In addition, there is a concern raised that, “Many young people still leave school without the solid educational foundations needed for success in work and life: on the academic side alone, more than a third of candidates did not achieve a grade C or better in GCSE English (39.8%)” Given the changes to the assessment system over the past couple of years, it is unlikely that this percentage will change dramatically. In making this statement in the Executive Summary, the CBI seems unaware that this is the case. It’s clear that either further explanation needs to be provided to employers (and likely educators too) or another set of changes needs to be made to the qualification if more valid inferences are to be drawn from results over the coming years.

I couldn’t help but realise the truth was being compromised. I could see it in your eyes. 


More often than not, when English teachers talk about the reliability of the GCSE, they are thinking about the ten students whose papers they requested a remark on, some of which were successfully upgraded. In the circles inhabited by assessment experts, the term reliability tends to be used in relation to specific assessments rather than qualifications which, like the English GCSE, rely on different assessments each year and are made up of multiple components consisting of a number of different questions or tasks. A reliable assessment would “show little inconsistency between one measurement and the next.” (Making Good Progress? – Daisy Christodoulou) 100% reliability at qualification level is a utopian concept, as it would require each element of the qualification to be absolutely reliable. Task, component and qualification reliability can be affected by marking, but it can also be impacted by sampling and a range of factors relating to students.


As we’ve already established, most tests don’t directly measure a whole domain; they only sample from it, as the domain is too big. If the sample is too narrow, the assessment can become unreliable. Moreover, if the sample is always the same, teachers can fall into the trap of strategically teaching to the test to seemingly improve student performance.
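As a toy illustration of this relationship between sample size and reliability (all numbers invented, and items treated as equally weighted and independent, which real exam questions are not), we can simulate how much a student’s score swings from one sitting to the next when an exam samples only a small slice of a big domain:

```python
import random

random.seed(42)

DOMAIN_SIZE = 1000  # distinct "items" of knowledge the qualification could, in principle, test

def score_spread(proficiency, sample_size, sittings=2000):
    """Mean and standard deviation of a student's scores when each exam
    sitting samples `sample_size` items at random from the domain.
    `proficiency` is the fraction of the domain the student has mastered."""
    known = set(random.sample(range(DOMAIN_SIZE), int(proficiency * DOMAIN_SIZE)))
    scores = [
        sum(item in known for item in random.sample(range(DOMAIN_SIZE), sample_size)) / sample_size
        for _ in range(sittings)
    ]
    mean = sum(scores) / sittings
    sd = (sum((s - mean) ** 2 for s in scores) / sittings) ** 0.5
    return mean, sd

# The same student (60% of the domain mastered) sat against a narrow
# and a broad sample. Both estimates centre near 0.6, but the narrow
# sample's score swings far more from one sitting to the next.
narrow = score_spread(0.6, sample_size=10)
broad = score_spread(0.6, sample_size=100)
```

The same effect underlies some of the instability teachers see between mock and final grades: a different sample drawn from the same domain can legitimately produce a different score for the same student.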

The domain from which the English Language GCSE texts can be sampled is huge. It could be argued that this is problematic, as students who have a broader general knowledge, much of which they’ve gained outside of school through access to a wider range of books or experiences, are at an advantage.

Of equal concern is the limited sample of question styles. As the question style remains the same and the range of texts which can be drawn from is so huge, teachers will look for the aspects of the exams their students can have most control over, and confidence in, whilst in the exam hall. This heightens the extent to which teachers focus on how to respond to the question types, rather than on how to make students better communicators by building their knowledge.

Marking and grading

The two biggest issues in terms of marking in English are that:

  • Different markers may apply the mark scheme rubric differently. 
  • One marker’s standards may fluctuate over time during a marking period.
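A toy model makes the first issue concrete (the 40-mark paper, the marks and the grade boundary below are all invented for illustration): a marker who is one mark harsher on each assessment objective changes nothing for most scripts, but flips the outcome for a script sitting near a boundary.

```python
# Hypothetical 40-mark paper split across four assessment objectives;
# all marks and the boundary are invented for illustration.
GRADE_4_BOUNDARY = 28

def total(script, severity=0):
    """Total a script as marked by a marker whose `severity` shifts
    every assessment objective down by that many marks."""
    return sum(max(0, mark - severity) for mark in script.values())

borderline = {"AO1": 8, "AO2": 7, "AO3": 9, "AO4": 6}   # true total: 30
secure_fail = {"AO1": 5, "AO2": 6, "AO3": 5, "AO4": 5}  # true total: 21

# A lenient marker passes the borderline script (30 >= 28); a marker one
# mark harsher per objective fails it (26 < 28). The weaker script fails
# with either marker, so its outcome is unaffected by marker severity.
outcomes = {
    "borderline": (total(borderline) >= GRADE_4_BOUNDARY,
                   total(borderline, severity=1) >= GRADE_4_BOUNDARY),
    "secure_fail": (total(secure_fail) >= GRADE_4_BOUNDARY,
                    total(secure_fail, severity=1) >= GRADE_4_BOUNDARY),
}
```

This is also why remark requests cluster around grade boundaries: marker variance of this kind only changes outcomes for scripts close to a threshold.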

On page 25 of Ofqual’s Marking Consistency Metrics is a graph which highlights the relative consistency in marking of different subjects. The report states “the quality of marking for physics components is higher than that for the more ‘subjective’ English language or history components.” Interestingly, though perhaps unsurprisingly as it’s marked in an even more subjective way, English literature is gauged as being less consistent. 

What we have to ask ourselves here, though, is whether we’re happy with this level of reliability given the number of remarks we might request each year, whether we want an English qualification which takes the same objective approach as physics at GCSE level in order to increase reliability, or whether there might be a third way, perhaps in the form of comparative judgement, which might increase reliability but also maintain opportunities for extended, open-answer questions.


Students’ performance can vary from one day to another, as well as between the start and end of a test. They can perform differently due to illness, time of day, whether they have eaten, and as a result of the emotional impact of life experiences. However, it is difficult to apportion any blame for this to the test itself.

Let me see you through. I’ve seen the dark side too. 

There are numerous issues with the GCSE in terms of assessment. The domain is enormous, the sample is skewed by the formulaic nature of the assessment, the reliability of marking is affected by marking rubrics and marker accuracy, and the inferences we can draw are blurry in their validity. Despite this, the nature of the assessments themselves is, in spirit, closer to the study of English than what we might end up with were we to seek a more traditionally reliable method. It’s clear that, even if we’re not happy with what we have at present, we need to be careful what we wish for in terms of change.

The Cock and Bull

The place had been there for years, but we’d not been in before now. We thought we’d try it this time though, Kate and I, just to snatch a quick drink after the film which had finished later than expected. It was, we thought, close to last orders. No time to get anywhere else and it was good to escape the muzzy autumn rain. 

From the outside, it looked like one of those places where the locals, the elders, those women and men who’d been around the block a bit, would attempt to hold court, share the benefit of their wisdom and experience. Some of them might stare as you walked through the door but, if you stayed long enough, you’d learn something about something or someone or something else. You’d expect an odour of wet dog, a dart board, Scampi Fries and Nobby’s Nuts behind the bar. You’d expect wisdom. You’d expect. 

We stepped through the door into a small porch-like area. On one wall was a pin board and a set of ads, presumably from the locals. Some asked for stuff; others were offering stuff. Someone from the Roundabout Mums and Tots Group had put up a poster for a ‘Hot Choc Friday.’ Must be the new Tea and Coffee Morning. Someone else had crossed out the second h. On the other wall someone had stencilled the phrase “The definition of insanity is doing the same thing over and over and expecting different results. Albert Einstein.” Someone else had scrawled out Einstein’s name and, underneath, there sat what was presumably the culprit’s graffiti tag. 

Kate looked uncertain, but the cold outside and the relative heat of the air we could feel beyond the door drew us from the porch through to the pub. 

The single room beyond was a small but overly typical pub: typically stripped-back bare floorboards, typically mish-mashed reconditioned rustic furniture, typically drippy candles in old wine bottles. Of all the tables, the two largest were already taken. The rest of the room was empty, barring the lingering smells of stale beer and stale competition. Tonight was Quiz Night.

The first group, on the far side of the room, were huddled together at a table that was too big for them. One or two were wearing tweed, some office wear: skirts, blazers, trousers. Others were in chunky knit sweaters. There were beards as well as hairless faces, some shaven, some not; the odd novelty pair of socks; an odour of ales, cigars, port, whisky, pies, bangers, mash, gravy. There were few who spoke in the group; those who did were highly vocal though incomprehensible. Occasionally, one of the tweeds would attempt to voice something across to the other table which was, more often than not, poorly received.

Nearer to us, by the door, were what must have been the other team. They were more miscellaneous and there were more of them, though their table was still overly substantial. They were all looking downward, as if they were fully focused on the competition at hand. If it’d been last month, they’d have been on the gin. The month before it would have been prosecco. The month before that, fruit cider – probably pear. This month was Aperol. They were drawn to the effervescent but, for them, there was no best drink overall – just whatever was right for the context of that evening. Each time a tweed called across to this group, some seemed deeply aggrieved, some violently stunned and some nervously looked round their group, intrigued.

Kate and I wandered over to the bar to order some drinks. The bar offerings were as divided as the clientele and as odd as the little man who popped up to serve us. His hair was fluorescent, swept upwards into a fountainous spray; his grin moronic, but his eyes were hypnotic. He apologised for the delay – he’d been putting on a tune for the music round. He’d have been better off apologising for the tune as it was, I think, something by Justin Timberlake. Kate plumped for some kind of energy drink in a futuristic-looking can called Progress whilst I chose a pint of White Rabbit.

Whilst he served us, and in between switching tracks for the quiz, the sprout-haired man got to talking. He told us more about the divisions in the room. At one stage, he said, these had all been sensible people with sensible ways of communicating. Over a clip of some kind of cover of True Colours, we found out that the reason some of the tweeds didn’t speak was because their mouths had been violently clamped closed. During a snippet from one Ariana Grande track or another, we were told that the effervescents had had their eyes and ears glued shut. They used to be some of the brightest minds of their generation, used to work together, talk together, have their disagreements but (part of the Sound of Silence was playing now) that all came to an end.

I took a swig from my glass. The beer tasted curious. Kate took a sip from her Progress then looked over to me or tried to. Her eyes were jammed shut. I attempted to speak but my lips wouldn’t move. 

The barman grinned his inane grin, looked at us with his hypnotic eyes. Turning from the bar, I noticed that, snuck into each group, there were one or two members who looked just like this man. 

Turning back, the barman told us this was the tiebreaker round. No one could leave, he said, until we could tell him aloud the name of the soundtrack from which all the tunes had been picked. 

No one had left yet and, as the tunes went round and round and round, it felt unlikely we ever would. 

The sound of silence rang in Kate’s ears.

Nosmo King – There’s nothing worse than an ex-prog teacher.

H. Vernon Watson (1886–1952) was a popular English variety artist. He toured the music halls before World War I but remained relatively obscure until the 1920s, when he became increasingly popular despite his terrible routines, often misogynistic jokes and the fact he blacked up, taking to the stage under the name Nosmo King.

We had it all along. Smoke. 

Until recently, I assumed Nosmo King was a relatively innocent (non-blacked-up) character who my dad often referred to when we were on a fun-packed family holiday back in the mid-eighties. A typical Wells reference to Nosmo King began with an hour-long hunt around a drizzly Cornish tin mining town or a Welsh coal mining village for a pub which:

  1. Was willing to allow children in
  2. Served food for lunch
  3. Had a tiny area of about three tables with little plastic signs declaring it a “No Smoking” zone

“Oh look, it’s Nosmo King,” Dad would hilariously declare on spotting the signs. We would all reluctantly mumble a laugh, before ordering a round of crinkly cheese sandwiches with cress and blue Panda Pop garnish. This relief would be short lived, however, as the smoke mysteriously drifted across from the surrounding areas of the pub and Dad began the ritual complaining of the worst kind of anti-smoker: the ex-smoker.

Rockin’ the Suburbs

It would be difficult, these days, to find a pub that didn’t do at least two of the three things on that list. Numbers one and two now make fairly sound business sense, whilst number three could end up in a legal case. It would also be rare for a performer to take the decision to black up on stage or screen as the practice is widely held to be racist; it is mainly associated with derogatory portrayals of racial stereotypes.

Go ahead you can laugh all you want, but I’ve got my philosophy. 

These two examples demonstrate that, as individuals, groups and societies, our attitudes and our resulting actions can change over time – we’d hope for the better, though quite often this is not the case. In this old post, I outlined how my philosophy relating to education has changed over the past few years; I am an ex-progressive. I thought that post would get things off my chest. However, I feel the need to speak out again as the yellowing smoke of those denying the need for debate has come creeping once more towards my now traditionally No Smoking zone of traditional education. It’s even making a muddled mess of my metaphors.

I’m missing the war (ba ba ba ba)

The first example of this occurred in a conversation on Twitter in which a headteacher claimed that the teachers in their school weren’t interested in the debate. This headteacher went on to say that they wouldn’t be wasting their time trying to explain the debate to their teachers as taking an interest in the debate might encourage teachers to take sides and prevent them from being reflective and open minded.

Fighting the battle of who could care less

The second, similar point of view came from the #Nobestwayoverall thread on Twitter. The argument is that there is no single way of teaching that has been proven to be the best methodology in all contexts. In this line of argument, traditionalism is a set of teaching strategies which can sit alongside any other. It is a pragmatic view as it means teachers should be free to select the best way of teaching for their context.

I don’t want to take on this view in its entirety here, but rather focus on a more specific claim from the same thread – that acceptance of the #nobestwayoverall viewpoint would end the ‘trad’ and ‘everyone else’ Twitter war for good. 

I don’t get many things right the first time. In fact, I am told that a lot. 

The issue I have with these two stances is that they seem to imply that the debate between traditionalists and progressives is primarily about the processes of education rather than its priorities. This is, perhaps, because many of the, admittedly very useful, books and blogs which have been written about the debate (some of which can be found at the end of this post) have set out the differences in a manner which pits the processes one by one, head to head.

If you switch this and focus on the differences in the priorities of the two ideologies rather than the processes, I think you can make more sense of the longevity of the debate. Apologies if any of the following seems caricatured. If you feel there are inaccuracies, then please let me know, but keep in mind that it is very difficult to capture the contrasts in two sets of beliefs that various people have defined in different ways over the years.

In essence, traditionalists see the purpose of education as being the passing on of a body of factual and procedural knowledge (a tradition). The priority is the tradition because retaining it and passing it on are seen as being beneficial to society and individual children now and, therefore, society and adults in the future. Authority lies with the tradition and the individuals and institutions who pass it on – teachers and schools. Children are expected to behave in a way which is respectful towards the tradition as well as these individuals and institutions.

The progressive school of thought pits itself against this sequence of priorities. It is harder to define as it is, I think, a broader “church.” The educational priority of progressives is the development of the child. For some this means the development of the child as an individual. For others it relates to the development of the child as a part of society. For most, it is about preparing children for their lives after school. As a result of the child coming first, the curriculum and pedagogy are built around the children who are being taught. In many ways, this results in a symbiotic relationship between the curriculum and the pedagogy, as the process being used to learn can be as important as, or more important than, the material or knowledge which is being manipulated in the process.

To exemplify this (and again I risk caricature here), a progressive teacher might want their students to improve their critical thinking. They will spend time selecting materials to use, but likely more time on the methods they will use (the activity or activities), as it is the practice of critical thinking which they want the students to have and develop more than the retention of the content in the materials. Students will learn about the topic of the materials, though this will be a by-product of them learning how to be critical thinkers. In contrast, a traditional teacher will carefully select the knowledge which they wish students to learn and retain. Becoming a critical thinker will likely be, in their minds, a by-product of learning the knowledge – the more knowledge their students retain, the more critical they can be when they encounter others’ viewpoints relating to that topic in the future. Thus, progressives are not unconcerned with knowledge and traditionalists are not uncaring toward children. It is just that, in terms of education, their priorities are different.

From this dichotomous set of priorities and purposes stem a set of associated methodologies rather than the other way round. 

Come pick me up. I’ve landed. 

So, where does this leave us with the views highlighted earlier on?

In the case of #Nobestwayoverall, it is difficult to see that the hashtag will bring an end to the debate, as the debate is primarily about what “best” means rather than about the ways we teach – though some people have tried to make it about this. If we can’t agree on a definition for the purposes and priorities of education, then the empirical evidence offered by science in its different forms, used by one side or the other, is not going to help us to agree on the methods we should use.

In terms of the other line of argument, it seems to me that it is easier to achieve the purpose a leader has in mind for their school if that sense of purpose is shared as broadly as possible by the people who are working within the school. If the purpose is changeable or the priorities are debatable, then perhaps a pragmatic approach is required. Perhaps the fixedness of the curriculum and teaching methodology matters less in these circumstances, though I doubt it. If there is a clarity and steadfastness to the purpose, then a more fixed set of methods is likely to be shared too and, one would think, the purpose is more likely to be achieved. 

It seems to me that a shared sense of purpose, curriculum and methodology is most likely the best way. But then I’m an ex-progressive: the worst kind of teacher. 


Dory says…

“Some activities, such as playing pop music in pop music groups, solving crossword puzzles and folk dancing have no standard training approaches. Whatever methods there are seem slapdash and produce unpredictable results. Other activities, like classical music performance, mathematics and ballet are blessed with highly developed, broadly accepted training methods. If one follows these methods carefully and diligently, one will almost surely become an expert.” Peak by Anders Ericsson and Robert Pool

Saturday mornings in the Wells household begin with a manic game of hunt the swimming kit before a trip to the local leisure centre for my son’s swimming lesson. 

Both of our kids are making their way through the stages of Aqualetes, based on the nationally recognised Swim England Learn to Swim programme. As you watch the sessions unfold over time, you can see the way everything has been carefully sequenced – from the way the instructors get children used to having water in their hair through doggy paddle, breaststroke, back and front crawl to developing the children’s knowledge of survival strategies. I’m still not quite convinced by butterfly or backwards sculling, but the rest all makes sense.

Last week, I watched as one of the teachers called a boy back to the side of the pool, re-explained the correct leg movement for breaststroke, showing him with her arms, then gave him two more opportunities to show her that he could do it correctly. The boy set off the first time and you could tell from the reduction in his speed and slight awkwardness in his movement that he was really thinking carefully about the corrections he’d been coached to make. His legs were moving better, but the front half of his body was submerging more between each stroke. This corrective stage was affecting his fluency but he was trying to do exactly as he’d been told. The second time through, his performance improved. It wasn’t, by any means, perfect but it was more fluid and resembled breaststroke more closely. This was Stage 5 swimming and he was moving closer to Stage 5 performance. 

“You’ve been struggling to make things right. That’s how a superhero learns to fly.” The Script. 

Watching this expert swimming teacher and her pupil reminded me of a couple of other interactions I’d seen recently on Twitter about Direct Instruction and scripted lessons. If you’re uncertain about what Direct Instruction is, then this is worth reading. In essence, it’s a carefully sequenced, carefully scripted approach to teaching which is designed to ensure mastery of the content covered. The programmes, written by experts, have been tested and reworked based on a set of core principles. They are designed for students who, based on assessments, are currently at a very similar level in terms of the knowledge gaps they address. They are designed to maximise the impact of the interactions between teacher and students. Here’s a page of scripted material from Corrective Reading, available from McGraw Hill. 

Direct Instruction was road-tested in one of the most extensive educational experiments ever conducted: Project Follow Through. Details of how this was done can be found here.

The objections to Direct Instruction on Twitter seem to come in two main forms. The first is that the Direct Instruction programmes are likely to be detrimental to students as their scripted nature negates the fact that students are human beings who don’t talk in scripts – scripting is dehumanising, opponents argue, and students aren’t all the same. The second is that scripting deprofessionalises teachers. In particular, there is a concern that new teachers will not learn how to plan themselves or how to adapt to the needs of different learners. 

We run two Direct Instruction programmes at our school and are planning on implementing a third next year. Though I haven’t personally taught one, I’ve observed a number of sessions and I’d disagree with both of the issues raised above, though I can partially understand why people might raise them. 

The students who are learning through these programmes need the structured support which they receive from the teacher, the materials and the method of delivery, even more so than the boy in the pool last Saturday. They need explanations which have been planned with precision; they need their teacher to ask questions which elicit responses from them, either individually or chorally, in order for that teacher to intervene and correct swiftly where necessary; and they need each session to build on the last with application of new material as well as knowledge that has previously been mastered. All students need these things, but these students have gaps in their knowledge which need closing quickly, efficiently and in a humane way. Some of the critics of Direct Instruction seem to imply that it is a robotic process. When I walk past the classrooms where these sessions take place, it seems far from robotic. These classrooms are places of joy because the students taking part are motivated by the fact that they are building their knowledge together from one session to the next. 

The teacher leading the session loves it too and certainly doesn’t feel de-skilled by the process of using the script. They can see, over time, the students retaining more knowledge and able to apply this in more skilled ways. The feedback they get from the students, because of the way the scripts are designed, is more frequent and interventions or corrections can be made immediately due to the clarity of the questioning. 

So, should we be concerned for new teachers? Well, when I look back on my very first lesson ten years ago, I think not. I was going to be teaching my mentor’s top set Year 9 group for a one-off lesson on persuasive speaking. I must have spent the week beforehand preparing myself. I recorded, on videotape, Thursday’s episode of Question Time. Amongst the panel members that week were Richard Branson, Theresa May, presumably a model of poor speech, and Margaret Beckett. I spent ages and ages watching a short clip, rewinding again and again and again until I’d pulled together worksheets on a rickety word processor for each of the speakers. Different students would focus on different members of the panel and then feed back to other students in jigsaw groups.

Genius and a worthwhile use of a new teacher’s time? 


I thought it was OK, and it probably wasn’t a car crash. But “probably not a car crash” is not a good model of a lesson for students, and it is not a good model for teacher training either. As a result of all the time I’d spent on resourcing, I’d spent almost no time on the precision of my explanation of what makes excellent panel debate, or on the concision of the questions I’d ask afterwards. I also had no sense of how this lesson fitted into the whole curriculum and, because I was so busy focusing on whether I could pause the VHS at the precise moment I wanted, I had no time to focus on the basics of classroom management. I had a limited mental model of what expert teaching could be like and I’d certainly not practised teaching using someone else’s model – a decent model mapped out or scripted by an expert. I was basically sinking in style rather than swimming through substance.

Though I’m not advocating their use for every lesson, what Direct Instruction scripts offer is a way to rapidly and efficiently build students’ knowledge whilst also benefiting trainee and expert teachers. 

As an added bonus, I’ve yet to see any DI materials featuring faltering politicians.