What Is 'Test Fairness'?
Teaching English with Oxford
What exactly is a ‘fair test’, and why is it so crucial for English language teaching? This blog post takes a closer look.
Test fairness, justice and bias
At first glance, ‘fairness’, ‘justice’ and ‘bias’ may seem like overlapping terms. However, language testers use them to describe different aspects of testing. ‘Test fairness’ is more than just giving everyone the same exam. It's about ensuring that assessments accurately measure language proficiency without unfairly disadvantaging any group of test-takers. Fairness is therefore closely linked to validity: test-takers need confidence that the test measures what its developers claim it measures. Justice is about ensuring that test scores are used for the correct purposes. English language test scores should only be used to evaluate English language proficiency.
Bias, on the other hand, is the enemy of fairness. Bias is something we measure statistically. It occurs when test items or procedures systematically favour or disadvantage certain groups of test-takers for reasons unrelated to their language ability. As educators and test developers, it's our responsibility to identify and eliminate any bias in our assessments.
Equity and equality – is there a difference?
A more recent trend in education is the discussion around ‘equity’ versus ‘equality’. While equality means treating everyone the same way, equity recognizes that different individuals may need different support to achieve fair outcomes. This distinction is crucial in language testing, and it is often illustrated with a widely circulated image (see Figure 1).
This image, while often shared, is rarely interpreted in any depth. Here, each element of Figure 1 is interpreted in terms of English language testing. The image depicts two scenarios of people watching a baseball game over a fence. The fence represents a hypothetical language test, with the top of the fence symbolizing a cut score or pass mark. The baseball game itself represents the success or achievement that follows from passing the test.
On the left side, labelled "Equality," each person stands on an identical box, meaning all test-takers take the test under the same conditions. However, we can see that the shortest person still can't see over the fence – they can't "pass" the test despite the equal treatment. On the right side, labelled "Equity," the boxes are redistributed based on need. These boxes can be interpreted as changes to the test based on specific needs. Now, all three individuals can see over the fence, representing fair access to success on the test. This is what we strive for in equitable language testing – providing appropriate support or accommodations to ensure all test-takers have a fair chance to demonstrate their true language abilities.
Universal design, modifications and accommodations
Universal design in testing aims to create assessments that are accessible to the widest possible range of test-takers from the start. In our baseball analogy (Figure 1), the boxes on the left side represent universal design principles. These could include features like clear instructions, consistent layout, or adjustable font sizes and colour schemes that benefit all test-takers without giving an unfair advantage to any group.
When universal design isn't enough, we may need to make modifications or offer accommodations. In Figure 1, these are represented by the redistribution of boxes on the right side. Accommodations could range from:
These accommodations are provided according to individual needs, for example for test-takers with specific learning disabilities.
It's important to note that accommodations should always level the playing field, not give an unfair advantage. The goal is to remove construct-irrelevant barriers while still measuring the intended language skills. This is where the analogy breaks down on the right-hand side of the figure: in a real test, not all test-takers can or will receive the same score!
Investigating test bias
Finally, we come to bias. Identifying bias in language tests is a critical step towards fairness. This often involves rigorous statistical analyses, such as Differential Item Functioning (DIF), which can reveal whether test items behave differently for different groups of test-takers. It’s important to note that differences in average performance between groups of test-takers do not, by themselves, indicate bias. For example, in the Oxford Test of English Advanced Pilot study, there were observable performance differences between language groups (Romance vs. Turkic languages). What we’re looking for are questions or tasks that behave differently from the overall trend in the test, that is, items on which test-takers of similar overall ability perform differently depending on the group they belong to.
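To make the idea of DIF more concrete, here is a minimal sketch of one common approach, the Mantel-Haenszel procedure. This is an illustration only: it is not necessarily the method used in the pilot study mentioned above, and the item data, group labels and function names are all hypothetical. The sketch matches test-takers on total score and then asks whether, among test-takers with the same score, one group is more likely to answer the item correctly.

```python
import math
from collections import defaultdict


def mantel_haenszel_odds_ratio(records):
    """Common odds ratio for one item across total-score strata.

    records: iterable of (group, total_score, correct) tuples, where group is
    "reference" or "focal" and correct is True/False for the item under study.
    """
    # One 2x2 table per stratum:
    # [reference right, reference wrong, focal right, focal wrong]
    tables = defaultdict(lambda: [0, 0, 0, 0])
    for group, total_score, correct in records:
        offset = 0 if group == "reference" else 2
        tables[total_score][offset + (0 if correct else 1)] += 1

    numerator = denominator = 0.0
    for ref_right, ref_wrong, focal_right, focal_wrong in tables.values():
        n = ref_right + ref_wrong + focal_right + focal_wrong
        numerator += ref_right * focal_wrong / n
        denominator += ref_wrong * focal_right / n
    if denominator == 0:
        raise ValueError("odds ratio undefined for these data")
    return numerator / denominator


def mh_delta(odds_ratio):
    """ETS delta scale: 0 means no DIF, negative values favour the reference group."""
    return 2.35 * math.log(1 / odds_ratio)


def expand(counts):
    """Turn (group, total_score, n_right, n_wrong) rows into per-person records."""
    for group, score, n_right, n_wrong in counts:
        yield from [(group, score, True)] * n_right
        yield from [(group, score, False)] * n_wrong


# Hypothetical item A: the reference group does better overall (8 of 12 correct
# versus 7 of 12), but within each score band both groups perform identically.
ITEM_A = [
    ("reference", 10, 2, 2), ("focal", 10, 4, 4),
    ("reference", 20, 6, 2), ("focal", 20, 3, 1),
]

# Hypothetical item B: within each score band the reference group does better.
ITEM_B = [
    ("reference", 10, 8, 2), ("focal", 10, 4, 6),
    ("reference", 20, 9, 1), ("focal", 20, 6, 4),
]

for name, counts in [("A", ITEM_A), ("B", ITEM_B)]:
    ratio = mantel_haenszel_odds_ratio(expand(counts))
    print(name, round(ratio, 2), round(mh_delta(ratio), 2))
```

With these made-up numbers, item A has an odds ratio of 1: the groups differ on average, but not once they are matched on total score, so the item shows no DIF. Item B has an odds ratio of about 6, which would flag it for review. A flagged item is a candidate for bias, not proof of it; reviewers would still need to examine the content of the item.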
Some possible sources of bias
Bias can creep into our tests in many ways. Cultural content that favours certain groups, gender-stereotyped language, or test formats that advantage test-takers from particular educational backgrounds are just a few examples. As English becomes increasingly global, we must also consider the fairness of using only one variety of English in our tests. Including diverse varieties of English in our content and standards frameworks can help make our assessments more equitable for learners from different linguistic backgrounds.
Concluding thoughts
As we navigate the complexities of assessing English in a global context, we must remember that fairness goes beyond just the test itself. It's about creating transparent assessment practices, dedicating classroom time to explain our assessment methods, and always being open to questions and requests for accommodations.
Returning to our baseball analogy (Figure 1), we can see that true fairness in testing isn't about giving everyone the same thing (equality), but about giving each test-taker what they need to have an equal opportunity to demonstrate their language skills (equity). By keeping fairness at the forefront of our testing practices and continually questioning and improving our methods, we can ensure that our assessments truly reflect our students' language abilities, regardless of their background or circumstances. After all, isn't that what it's all about?
How could your institution apply the principles of equity we've discussed to daily language teaching and assessment practices to create a more equitable testing experience for all learners? Share your thoughts in the comments.
Article by Dr. Nathaniel Owen