Making the Grade
The fallout from the A level results, and this week the GCSE results, will be uncomfortable for the government and upsetting and challenging for teachers and students alike. Arguments over whether this year's results are robust and fair miss one key issue.
Put simply: "Has the exam system in England ever been robust and fair for individual pupils?"
For those of us who did well in exams, and whose children also did well, it is too easy to be confident. Accepting that our success and others' failure is a systemic problem, not a result of competence and capability, is not easy.
Let me be clear: I do not have confidence in the exam system in England as a measure of either success or capability.
Take, for instance, the argument that teachers overestimate grades. Where is the evidence that this is the case, rather than that the exam system misses what teachers see? On that I am neutral; for me, the evidence cuts both ways.
I have appealed in the past against teacher grades because they were (in my opinion) too low. One individual got a 2:1 and a master's degree at a university that would not have offered a place on the basis of teacher grades. That's OK if, like me, you know the system and have the confidence to challenge. What about those less fortunate?
A recent example is an Eton-educated Nobel prize winner who was told by a teacher that he would never make a scientist.
I think it is important to describe what, in my opinion, a robust system would look like, and to show why the current system should not be seen as such.
Try this as a thought experiment. Imagine that I gave an exam paper submission to 100 examiners. Let me assume that it is "objectively" a C grade. Would all 100 examiners give it a C? If not, what is the spread? Is the spread the same for English Literature, Physics and Geography, to take just three examples? If you cannot provide clear, evidenced answers to these questions, how can you be confident that the system is objective?
If we look at the examiners themselves, the same challenge appears. Are all examiners equally consistent in their marking, or do some tend to mark up or down? Where is the evidence, reviewed and published, to demonstrate robustness?
We also know that the month you are born still has an effect on GCSE grades. What is robust about that?
So what does objective look like? Imagine the shy boy answering a question on Romeo and Juliet. He wonders why he is drawn to Romeo and not Juliet. How do you know that it is his understanding of the literature that you are marking? If you think that is unfair, let me illustrate with an example. Some years ago, I sat in on a filming in a school where the pupils had been engaged in developing the school's policy on bullying. The maturity of the children in a school that was pretty mainstream was impressive. One girl, aged 12 or 13, talked about a boy in her class who was being bullied. She offered the opinion that "he was probably gay but didn't know it yet". Importantly, the teachers did not know that the boy was a victim of bullying.
When my eldest son was doing his GCSEs, one of his teachers told him, "you will lose marks for knowing that". If a teacher feels that, correctly or incorrectly, then this is a sorting game of sheep and goats, not a measure of achievement and capability.
I have known children who have missed out on grades after divorce, separation, or the death of parents, siblings and pets. I cannot objectively give a measure of the impact, but then neither can the exam system. I would add that I suspect a classmate of mine missed out because of hay fever. Children with health issues such as leukaemia and asthma, whose schooling is disrupted, have had their grades affected every year, not just this one.
So, the high-stakes exam system is, for me, a winner-takes-all loaded gun, embedding inequality and privilege in the outcomes.
Can we do better? Well, if we want to use exams, then each paper needs to be marked by, say, five independent assessors. If they all agree on a "B", then that is a measure of confidence. This model is often used for assessing loans, grants and investments in businesses. It does not guarantee success, of course, but what it does do is reduce reliance on potentially biased individuals. If I were an examiner and woke up today in a foul mood, would I mark a paper the same today as yesterday? I would not bet on it.
The really interesting cases, in my experience, are where you get two As, a C and two Ds, for instance. I've seen it more often in "creative subjects", but some non-traditional thinkers in subjects like mathematics (a highly creative discipline, by the way) often don't fit the narrow models of assessment of our exam system. The problem is that bringing people together to reach a consensus on a "B" eliminates the value that comes from the diverse views and the richness of the different perceptions.
So, for me, for a system to be robust it has to have more than one measure, to allow the individual, parents, universities, FE and employers access to a richer view of an individual. If someone got an A, B, B, C and D in English, that is as interesting as someone who got straight Bs. Some years ago at a conference, I was talking to a teacher about a gifted pupil, "Tom".
Tom lived for poetry. He had written poems since he was 10. He memorised many and could recite them fluently without notes. Within a week of discovering a new poet, he could attempt writing a poem in their style. With a play, he would always volunteer to be part of a reading. However, novels bored him; he switched off and was difficult to engage. Let's grade him: A* (Poetry), B (Play), D (Novel). What is a fair grade that reflects his achievements and potential? A grade of B is there to meet the needs of the exam system, not the learner. The nuance is lost in an overall grade. Interestingly, if I had that discussion today, I would probably suggest audiobooks.
More importantly, there are already models that command respect in grading skill levels. Parents are quite happy if a child is doing grade 6 piano and grade 2 flute at the same time. They are quite happy for a child to sit when ready and to have the chance to resit. Yet in the school setting, the pressure is there for a child to be at level 8, say, for all subjects. That puts unnecessary pressure on pupils, teachers and schools.
Imagine how society would react if you could only take the driving test once, at 17, and barriers were raised to stop you retaking it.
I feel sorry for the students, their families, teachers and their schools in the current tragedy. It is about time that we acknowledged that we have the "emperor's new exam system".
This year's bizarre algorithmic system is not robust, but then we have never had a robust system as far as I am concerned. Let's open our eyes and build something that we can have more confidence in. Carpe diem.