Integrating Assessment, Knowledge and Practice

“If we want pupils to develop a certain skill, we have to break that skill down into its component parts and help pupils to acquire the underlying mental model [to see in their mind what it should look like in practice]. Similarly, when developing assessments for formative purposes we need to break down the skills and tasks that feature in summative assessments into tasks that will give us valid feedback about how pupils are progressing towards that end goal.”

(Daisy Christodoulou – Making Good Progress)

“Deliberate practice develops skills other people have already figured out how to do and for which there are effective training methods.

  • It takes you out of your comfort zone. 
  • It involves well defined, specific goals. 
  • It requires full attention. 
  • It involves feedback and modification. 
  • It requires a focus on specific aspects of procedures.”

(Summarised from Anders Ericsson – Peak)

Have you ever tried connecting two hoses together while the water’s flowing? It’s a tricky, splashy business that’s easy to get wrong. There’s a pivotal point in education where you can say the same of curriculum design and assessment. 

Swindon Academy’s curriculum design follows a mastery model based on the following set of key principles:

(Diagram: Curriculum Model)

Our Curriculum Leaders and the teachers in their teams have worked hard to develop curriculum overviews and schemes of learning which reflect these principles, often drawing on the best, publicly available resources. In some faculties and in some key stages, this work is further advanced than in others. Whatever stage the teams are at in this development though, they would agree that there is more to do, especially with the recent introduction of new GCSE and A-Level specifications and changes to vocational courses.

Over the course of last half term, I met with each of the Curriculum Leaders to review the Swindon Academy Teaching and Learning Model.

(Diagram: Codification Document)

These discussions confirmed that the model still effectively summarises what we would want to occur in classrooms on a day-to-day basis as well as over time. Two areas of the model, though, came up consistently in discussion as needing redevelopment:

  1. The assessment element no longer describes current feedback practices which now vary from faculty to faculty due to the introduction of faculty feedback policies.
  2. Prep (homework) needs to feature more prominently to establish clearer expectations and reflect its importance in developing students’ independence.

Alongside this review, I’ve been reading a range of research relating to cognitive science as well as educational texts on assessment, including the books Driven by Data by Paul Bambrick-Santoyo and Making Good Progress by Daisy Christodoulou, and blogs by Phil Stock, Heather Fearn and Joe Kirby. These have made me consider, in a different light, how we could tighten up our assessment model so that we:

  1. Know that the assessment systems which are being used are as reliable as we can make them.
  2. Have a shared understanding of the range of valid inferences we can draw from the data provided by these systems.
  3. Ensure that we maximise the impact of these systems, without damaging their reliability.
  4. Continue to increase the level to which students take responsibility for their own progress.

The remainder of this blog is taken from a paper designed to kick start this process. It is divided into two elements: a teacher element and a student element. The first focuses on curriculum and assessment design whilst the second looks at the use of knowledge organisers and self-testing as prep.

The teacher element:

We’ve found, in introducing a number of Doug Lemov’s Teach Like a Champion strategies, that it’s useful to have a shared vocabulary so that we can have efficient and effective conversations about teaching. This should also be the case with assessment practices.

Key definitions:

The following terms will be, I think, key to developing our shared understanding of assessment practices:

Domain:

The domain is the entirety of the knowledge from which an exam/assessment could draw to test a student’s understanding/ability. At Key Stages 4 and 5, this is defined by the specification, though there are also elements of knowledge from previous Key Stages which aren’t listed in specifications but that still form part of the domain.

Sample:

The sample indicates the parts of the domain which are assessed in a specific task or exam. It’s rare that we’d assess the whole of a domain, as the assessment would be overly cumbersome. A well-designed sample should therefore represent the domain effectively, so that valid inferences can be made from the data the assessment provides.
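
To make the idea of sampling concrete, here is a minimal sketch in Python (the topics and question banks are invented for illustration, not a real specification) of how an assessment might draw a stratified sample so that every area of the domain is represented rather than a narrow slice of it.

```python
import random

# Hypothetical domain: each topic maps to a bank of question identifiers.
# The topics and bank sizes are illustrative only.
domain = {
    "Cells":       [f"cells_q{i}" for i in range(1, 21)],
    "Respiration": [f"resp_q{i}" for i in range(1, 21)],
    "Genetics":    [f"gen_q{i}" for i in range(1, 21)],
    "Ecology":     [f"eco_q{i}" for i in range(1, 21)],
}

def sample_assessment(domain, questions_per_topic=3, seed=None):
    """Draw a stratified sample: a few questions from every topic, so the
    paper represents the whole domain rather than one corner of it."""
    rng = random.Random(seed)
    paper = []
    for topic, bank in domain.items():
        paper.extend(rng.sample(bank, questions_per_topic))
    return paper

print(sample_assessment(domain, questions_per_topic=3, seed=1))
```

Because a fresh sample can be drawn each time, repeatedly teaching to one fixed paper becomes far less rewarding, which also matters for the reliability issues discussed below.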

Validity:

The validity of an assessment relates to how useful it is in allowing us to make the inferences we’d wish to draw from it. “A test may provide good support for one inference, but weak support for another.” (Koretz D, Measuring Up) We do not describe a test itself as valid or invalid; rather, it is the inferences we draw from it that are valid or invalid.

Reliability:

If an assessment is reliable, it would “show little inconsistency between one measurement and the next.” (Christodoulou)

Test reliability can be affected by:

Sampling:

  • Most tests don’t directly measure a whole domain; they only sample from it as the domain is too big. If the sample is too narrow, the assessment can become unreliable.
  • If the sample is always the same, teachers will strategically teach to the test to seemingly improve student performance.

Marking:

  • Different markers may apply a mark scheme rubric differently.
  • One marker’s standards may fluctuate during a marking period.
  • Teachers can consciously or subconsciously be biased towards individuals or groups of students.

Students:

  • Performance on a particular day can vary between the start and end of a test.
  • Students perform differently due to illness, time of day, whether they have eaten, emotional impact of life experiences.

Difficulty model

In this form of assessment, students answer a series of questions of increasing difficulty. A high jump competition or a GCSE Maths exam paper are good examples of this model.

Quality model

Here, students perform a range of tasks and the marker judges how well they have performed, most often in relation to a set of criteria. Figure skating competitions and English and history GCSE papers use this model.

General issues which Christodoulou identifies with the most common assessment models:

  • A focus on the teaching and assessment of generic skills can lead to teachers paying insufficient attention to the knowledge required as a foundation for those skills. For example, vocabulary, number bonds, times tables, historical chronologies or relevant, subject-specific facts can be overlooked in favour of how to evaluate or problem-solve.
  • Generic skill teaching makes deliberate practice far more challenging as it focuses on larger-scale success as opposed to fine-grained assessment and training. For example, formative assessment in sport may take place during a match rather than a drill. Here, the teacher may miss an issue which a student has with a specific aspect of the sport and then not address it.
  • Using only exam questions for assessment, especially though not exclusively for subjects whose exams are based on the quality model, can hide weaknesses which are at a smaller scale.

Specific issues which Christodoulou identifies with ongoing descriptor assessment and exam based tests:

Limitations with using descriptor based assessments to formatively assess:

  • Descriptors can be vague or unspecific.
  • Using assessment descriptors to feed back can be unhelpful as they describe performance rather than explain how to improve.
  • Descriptors focus on performance rather than long term learning.

Limitations with using descriptor based assessments to summatively assess:

  • Tasks are often not taken in the same conditions by all students which makes assessment less reliable.
  • Descriptors are interpreted differently by different markers.
  • Judgement based on descriptors is subject to bias.

Limitations with using exam based assessments to formatively assess:

  • By their nature, these tests have to sample from a wider domain so we cannot identify precise areas of strength and weakness for students.
  • As questions become more challenging, it also becomes more difficult to identify which aspects of a question students did well or badly on.
  • Exams are designed to provide grades and grades aren’t sensitive enough to measure progress in individual lessons.

Limitations with using exam based assessments to summatively assess:

  • If we use exam formats and grades too often with students we can end up teaching to the short term rather than the longer term.
  • All students need to take the assessments in the same conditions to secure levels of reliability.

Assessment Solutions: 

Having established these issues, Christodoulou suggests the following principles for effective formative and summative assessment:

Formative assessment principles:

  1. The tasks/questions set need to allow teachers/students to easily identify issues and next steps. In particular, if you teach a subject which is normally assessed through the quality model in exams, it is worth considering a more fine-grained testing approach to assess formatively.
  2. The process needs to include repetition to build to mastery otherwise the formative assessment won’t have the desired impact.
  3. Once material has been mastered, students need to be required to frequently retrieve key learning from their long term memories.
  4. Formative assessment should be recorded as raw marks as this makes it easiest to track from one lesson to the next.

Summative Assessment Principles:

  1. Summative assessments should be taken in standardised conditions and marked in a way which maximises reliability.
  2. They should cover a representative sample of a significant domain.
  3. Scaled scores are more reliable than raw marks for summative assessment (see the sketch after this list).
  4. Enough time should pass between summative assessments for students to make worthwhile improvements.
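
As an illustration of the third principle, here is a minimal sketch of one common way of producing scaled scores: standardising raw marks against the cohort so that results from papers of different length and difficulty sit on a comparable scale. The class marks and the choice of a mean of 100 and standard deviation of 15 are assumptions for illustration, not a prescribed Academy method; exam boards derive their scaled scores from much larger samples and anchor tests.

```python
from statistics import mean, pstdev

def scale_scores(raw_marks, target_mean=100.0, target_sd=15.0):
    """Re-express raw marks as standardised scores with a chosen mean and
    standard deviation, so different assessments can be compared."""
    m, sd = mean(raw_marks), pstdev(raw_marks)
    if sd == 0:  # everyone scored the same mark
        return [target_mean] * len(raw_marks)
    return [round(target_mean + target_sd * (x - m) / sd, 1) for x in raw_marks]

# Hypothetical raw marks out of 60 for a small class.
raw = [32, 45, 51, 28, 39, 44]
print(scale_scores(raw))  # the same marks re-expressed on a mean-100, SD-15 scale
```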

What are our next steps for the teacher element?[1]

  1. Ensure the curriculum is effectively mapped out and sequenced, establishing the factual and procedural knowledge which students will learn. Divide the knowledge from the curriculum into that which students need in the long term and that which students need for a specific unit. Ensure the bulk of curriculum and prep/revision time is spent on students focusing on retaining the most important knowledge. Build space into the curriculum to assess retention of knowledge from previous units which students need in the long term.
  2. Establish when students will be assessed both summatively (whole Academy calendar) and formatively (faculty curriculum overviews). As far as possible, this should take into consideration the completion of teaching of all elements, and enough time between teaching and testing both for revision and so that our inferences reflect learning rather than performance.
  3. Ensure that the purpose of each assessment is clear to all involved in its design, delivery, marking and provision of feedback. The format of the test should enable the function to be achieved. It should also ensure that the inferences drawn from the results are as valid as possible. The main purposes of our summative assessments include re-streaming students, reporting to parents, establishing attainment and progress over time in teaching groups and cohorts of students to report to governors. A key question for you here is whether your summative assessments are reliable enough to enable you to validly infer that certain students are working at “age related expectations” in your subject. Formative assessments should be used to identify potential gaps in knowledge, misconceptions or deficiencies in ability that can be subsequently addressed.
  4. Design assessments aligned with this timing and purpose. Using Christodoulou’s principles for summative and formative assessments will help here. Over time, two separate banks could be built up: one of summative and one of formative assessment tasks. For summative assessment, it’s also worth asking yourself the following questions, based on those found in Bambrick-Santoyo’s book Driven by Data. Do assessments in each year:
    • Address the same standard of skill/content as the end of Key Stage assessment?
    • Match the end of Key Stage assessment in format?
    • Enable students to move beyond that year’s content/skill level?
    • Reassess previously taught content which is necessary to retain until the end of the Key Stage?
  5. Trial the use of comparative judgement in subjects where a substantial proportion of assessment uses the quality model (a minimal pairwise-comparison sketch follows this list).
  6. Preview assessment tasks to ensure that:
    • Questions don’t provide clues as to the answer.
    • Questions are actually testing that students have learned or can apply the knowledge you wanted rather than something else.
    • Questions are worded accurately and any unnecessary information is removed.
  7. Review assessments after use to establish whether they provided you with information that enabled you to make the inferences you wished. Make amendments to assessment items, where required, if they are to be reused in the future.
  8. Standardise the conditions in which summative assessments take place and the ways in which they are marked.
  9. Ensure that, where data from assessments is used to make key decisions, the data is sufficiently reliable. For example, when moving students between sets, data from more than one assessment should be utilised.
  10. Develop the teaching and learning review which forms part of each teacher’s CPD Booklet to ensure that teachers have action plans in place to address gaps in attainment.
  11. Establish procedures for Curriculum Leaders to review and summarise teachers’ action plans, sharing them with their Line Managers for quality assurance.
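
For step 5, the core mechanic of comparative judgement is straightforward to sketch: judges repeatedly see two pieces of work and record which is better, and a ranking is built from the accumulated decisions. The sketch below uses invented script names and judgements and ranks by simple win proportion; dedicated comparative-judgement tools fit a statistical model (such as Bradley-Terry) to the same data, which also gives a measurement scale and reliability estimates.

```python
from collections import defaultdict

# Hypothetical judgements: (winner, loser) pairs recorded each time a judge
# decides which of two pieces of work is the better response.
judgements = [
    ("script_A", "script_B"), ("script_C", "script_A"),
    ("script_C", "script_B"), ("script_A", "script_D"),
    ("script_B", "script_D"), ("script_C", "script_D"),
]

wins = defaultdict(int)
appearances = defaultdict(int)
for winner, loser in judgements:
    wins[winner] += 1
    appearances[winner] += 1
    appearances[loser] += 1

# Rank scripts by the proportion of comparisons they won.
ranking = sorted(appearances, key=lambda s: wins[s] / appearances[s], reverse=True)
for script in ranking:
    print(f"{script}: won {wins[script]} of {appearances[script]} comparisons")
```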

The student element:

Over the past two years, a number of our faculties have been trialling the use of knowledge organisers and low-stakes testing or quizzing as part of the process of curriculum design. Different models have emerged, sometimes with different purposes and using different frameworks. We want to make the use of knowledge organisers, self-testing and flashcards a core part of our students’ prep across subjects.

In order to secure the highest impact of this work, we need to evaluate the models currently in use to generate a set of shared principles and uses for these tools. We need to be sensibly consistent in our approach, keeping in mind the differences between the subjects that we teach. There are certainly potential benefits to the use of both knowledge organisers and quizzing, but we need to ensure these are harnessed effectively in each subject area.

Why should we bother with quizzing and knowledge organisers? Aren’t they just fads?

The term ‘knowledge organiser’ could be a fad, but the idea of organising knowledge into schemas certainly is not: it has been going on for centuries.

As subject specialists who have carefully mapped our curriculum through from Key Stage 3 to Key Stage 5, it would be both wise and desirable for us to look for the most effective methods of ensuring that students retain as much as possible of the knowledge we are teaching them from one year to the next and, of course, into their lives beyond school.

On a more pragmatic level, in order to support our students to do well with the new GCSE qualifications, we need to help them develop methods for retaining knowledge in the longer term. These qualifications are now more demanding: they require students to retain knowledge for longer, as they are increasingly based on terminal examinations rather than coursework, and they ask more of students in terms of problem solving.

Even if it weren’t for this though, over the course of the last century, hundreds of cognitive science studies have ranked practice testing as one of the most effective methods of improving the retention of information and procedures in the long-term memory. “In 2013, five cognitive scientists (Dunlosky, Rawson, Marsh, Nathan, Willingham 2013) collated hundreds of such studies and showed that practice testing has a higher utility for retention and learning than many other study techniques.”

The table below is taken from John Dunlosky’s “Strengthening the Student Toolkit”. In this paper, he argues that, “while some [study] strategies are broadly applicable, like practice testing and distributed practice, others do not provide much – if any – bang for the buck.” Low-stakes practice testing is one of the most effective study methods.

(Table: utility of study techniques, from Dunlosky’s “Strengthening the Student Toolkit”)

Alongside this sits Cognitive Load Theory and the work of John Sweller. Our teaching and learning handbook outlines the idea that our working memories have limited capacity, coping with only approximately 7 ± 2 items of information. Once we go beyond these limits, our thinking processes become bogged down. These ideas have been refined over the last couple of decades into a set of instructional principles called Cognitive Load Theory. In their book Efficiency in Learning, Sweller et al. argue that, “Taken together, the research on segmenting content tells us that:

  • Learning is more efficient when supporting knowledge, such as facts and concepts, is taught separately from main lesson content.
  • Teaching of process stages should be preceded by teaching the names and functions of components in the process.
  • Teaching of task steps should be segmented from teaching of supporting knowledge such as the reasons for the steps and/or concepts associated with the steps.”

Well-designed knowledge organisers or schemas and effective self-testing could therefore be useful in reducing the cognitive load on our students when they are applying knowledge in performance, production or problem solving.

Knowledge Organisers

In a blog post entitled “Knowledge Organisers: Fit for Purpose?”, Heather Fearn describes how she looked at many examples of knowledge organisers and found that there was often confusion over their purpose, which caused the documents to be muddled in design. As a result, they were confusing for students to use. She identifies three valid purposes:

  • A curriculum mapping tool for the teacher
  • A reference point for the pupil
  • A revision tool for the pupil and possibly parents

Given that we have Schemes of Learning for teachers to make use of and textbooks for students as a wider reference resource, I believe a useful definition of a knowledge organiser at Swindon Academy would be:

A structured, single A4 sheet which students, teachers and parents can use to create low-stakes practice quizzes. The sheet identifies the raw knowledge which needs to be recalled swiftly in order to be successful within the assessment for a specific unit. This could include the following (a minimal data-structure sketch appears after the list):

  • Definitions of terms, concepts or key ideas
  • Components of a process
  • People/Characters involved in a chronology
  • Processes/Chronologies/Narrative summaries
  • The steps in procedures
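
One way of checking that an organiser really is raw, numbered, quizzable knowledge is to think of it as structured data. The sketch below is a deliberately small, hypothetical example (the unit, sections and items are invented, not a real Swindon Academy organiser): items are numbered within a handful of sections, and a long_term flag marks knowledge needed beyond the end of the unit, mirroring the colour-coding described in the checklist that follows.

```python
# A minimal, hypothetical knowledge organiser as structured data.
# Numbering items within each section supports easy self-testing
# ("test me on Key terms 1 and 2"); long_term marks knowledge required
# beyond the end of the unit (the colour-coding on the printed sheet).
knowledge_organiser = {
    "unit": "Rivers (example only)",
    "sections": {
        "Key terms": [
            {"n": 1, "prompt": "erosion",
             "answer": "the wearing away of rock and soil by moving water", "long_term": True},
            {"n": 2, "prompt": "deposition",
             "answer": "the dropping of sediment when a river loses energy", "long_term": True},
        ],
        "Processes": [
            {"n": 1, "prompt": "stages of the water cycle",
             "answer": "evaporation, condensation, precipitation, collection", "long_term": True},
        ],
        "Key facts": [
            {"n": 1, "prompt": "longest river in the UK",
             "answer": "the River Severn", "long_term": False},
        ],
    },
}

# A quick sanity check against the formatting advice below.
assert len(knowledge_organiser["sections"]) <= 5, "no more than four or five sections"
```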

Use the following to check the formatting of your knowledge organisers.

  • Colour code knowledge which will be required beyond the end of the unit and knowledge which is only required in the medium term.
  • Number each item in each section to enable easy self-testing.
  • Embolden the absolute key words so that peer markers of quizzes can check they have been used in answers.
  • If you have to write more than one sentence, consider your phrasing. This will make your own explanations clearer and more efficient when you speak.
  • Don’t have too many sections/categories – four or five are probably sufficient.
  • If including images, ensure these are the same format as those you will use in your actual lessons.
  • Spellcheck your knowledge organiser.
  • Don’t include questions or ‘thoughts to consider’.
  • If it isn’t essential it shouldn’t be there.

Self-testing

In his blog, “One Scientific Insight for Curriculum Reform”, Joe Kirby of Michaela Community School poses the question: “what’s the optimal format and frequency of low-stakes testing or retrieval practice?” He cites various research papers from Roediger et al. In terms of format, he maintains that “Applied research suggests [well designed] multiple-choice questions are as effective as short-answer questions. The latest research study is as recent as March 2014, so this is a fast-evolving field, and one to keep an eye on.” With regard to frequency, he adds that shorter, more frequent quizzes outperform longer, less frequent ones. However, current research suggests that the impact on our long-term memory is maximised if this testing is spaced and interwoven.

He then goes on to summarise the work of a number of cognitive psychologists from the book “Make It Stick” in the following set of principles for self-testing (a short sketch of cumulative quizzing follows the list):

  • Use frequent quizzing: testing interrupts forgetting
  • Roll questions on work from the previous term forward into each successive quiz.
  • Design quizzing to reach back to concepts and learning covered earlier in the term, so retrieval practice continues and learning is cumulative.
  • Frequent low-stakes quizzing in class helps the teacher verify that students are in fact learning as well as they appear to be and reveals the areas where extra attention is needed.
  • Cumulative quizzing is powerful for consolidating learning and concepts from one stage of a course into new material encountered later.
  • Simply including one test (retrieval practice) in a class yields a large improvement in final exam scores, and gains continue to increase as the frequency of testing increases.
  • Effortful retrieval makes for stronger learning and retention. The greater the effort to retrieve learning, provided that you succeed, the more learning is strengthened by retrieval.
  • In virtually all areas of learning, you build better mastery when you use testing as a tool.
  • One of the best habits to instill in a learner is regular self-quizzing.
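
One practical reading of the ‘roll forward’ and ‘cumulative quizzing’ principles is a weekly quiz that always mixes a few questions from earlier units in with the current one. The sketch below is a minimal illustration under that assumption; the question banks and the 40% ‘rolled forward’ share are invented for the example, not a prescribed scheme.

```python
import random

def build_weekly_quiz(current_questions, past_questions, total=10, past_share=0.4, seed=None):
    """Build a quiz that draws mostly on the current unit but always rolls a
    share of questions forward from units taught earlier in the course."""
    rng = random.Random(seed)
    n_past = min(int(total * past_share), len(past_questions))
    n_current = min(total - n_past, len(current_questions))
    quiz = rng.sample(past_questions, n_past) + rng.sample(current_questions, n_current)
    rng.shuffle(quiz)
    return quiz

# Hypothetical question banks from earlier and current units.
past = ["U1: define photosynthesis", "U1: word equation for aerobic respiration",
        "U2: label the parts of the heart", "U2: define diffusion"]
current = [f"U3: question {i}" for i in range(1, 11)]

print(build_weekly_quiz(current, past, total=10, past_share=0.4, seed=7))
```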

What are our next steps for the student element?

  1. Design knowledge organisers which fit the definition above for Schemes of Learning in Years 7-9.
  2. Use the checklist above to review the knowledge organisers.
  3. Devise self-tests or drills which could be used to assess students’ retention of the knowledge (a minimal quiz-generation sketch follows this list). These could include:
  • Completion of a blanked out timeline
  • Matching definitions and key terms
  • Labelling key diagrams from the organiser
  • Answering questions based on the knowledge organiser
  • A crossword with definitions from the organiser as the clues
  • Translation exercises for MFL using vocabulary from the organiser
  • Short answer questions and multiple choice questions based on the knowledge from the organiser
  4. Generate a prep schedule for students for self-testing of the sections of each knowledge organiser. In the first week, students will produce flashcards based on the organiser and in future weeks, students will use Look And Say And Cover And Write And Check (LASACAWAC) or an online quizzing platform for a specific proportion of their prep each week.
  5. Ensure knowledge organisers are stuck into each prep book.
  6. Train students in how to use their knowledge organisers.
  7. Ensure that, as students move through the key stage, they are frequently testing themselves and being assessed in class on knowledge from previous units which they require at the end of the key stage.
  8. Add the schedule to E-Praise.
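
As a concrete illustration of step 3, the sketch below generates two simple drills from a set of numbered organiser items: a cover-write-check prompt sheet and a crude multiple-choice version that uses the other answers in the same section as distractors. The items, wording and number of options are assumptions for illustration rather than finished drills.

```python
import random

# Hypothetical numbered items from one section of a knowledge organiser.
items = [
    {"n": 1, "prompt": "erosion", "answer": "the wearing away of rock and soil by moving water"},
    {"n": 2, "prompt": "deposition", "answer": "the dropping of sediment when a river loses energy"},
    {"n": 3, "prompt": "transportation", "answer": "the movement of sediment downstream by the river"},
    {"n": 4, "prompt": "abrasion", "answer": "rocks carried by the river wearing away the bed and banks"},
]

def cover_write_check(items):
    """Prompt-only sheet: the student writes each answer from memory, then checks."""
    return [f"{item['n']}. Define: {item['prompt']}" for item in items]

def multiple_choice(items, n_options=3, seed=None):
    """Crude MCQ generator: distractors are the answers to other items in the section."""
    rng = random.Random(seed)
    questions = []
    for item in items:
        distractors = rng.sample([i["answer"] for i in items if i is not item], n_options - 1)
        options = distractors + [item["answer"]]
        rng.shuffle(options)
        questions.append({"question": f"Which is the definition of '{item['prompt']}'?",
                          "options": options, "answer": item["answer"]})
    return questions

for line in cover_write_check(items):
    print(line)
print(multiple_choice(items, seed=3)[0])
```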

[1] Some of the following are taken from Phil Stock’s sequence of blog posts “Principles of Assessment.”

 
