Moderation of marking: departmental and peer regulation of marking
Moderation acts as a check on the reliability of marking, ensuring that marking is fair and consistent with the criteria
provided. Practice in this area is governed by the Academic Standards Quality Handbook, Section 15. The excerpts below
present the kinds of moderation available to NTU staff, depending on whether the focus is on consistency or on consensus:
a) Checking consistency: In this approach, the goal of moderation is to confirm that a marker has applied the marking
criteria consistently.
“Moderation of marking is generally undertaken by reviewing a sample of students' marked work. This involves the moderator
in reviewing (rather than marking in the full sense) an agreed sample of work to establish whether the marking is at the appropriate
standard, consistent, and in line with the explicit assessment criteria” (Section 15, ASQ Handbook p. 13).
b) Emphasis on seeking consensus or inter-marker agreement: This approach emphasises that moderation is aimed at obtaining
agreement (consensus) across markers as a way to enhance reliability.
“Moderation can also be completed in specific instances through double or team marking. In this case student work is independently
marked by more than one marker. Double or team marking can be undertaken as blind marking, where each marker is unaware of
the marks allocated by the other(s), or as second marking, where all markers are aware of the marks they have assigned.”
“Double or team marking should be used as the moderation process for dissertations and major projects/studio work at final
award level.” (Section 15, ASQ Handbook p. 13)
The guidelines above leave room for different interpretations in practice. In conversations about moderation practices
at NTU, there are two issues that may affect both the practicalities and the effectiveness of moderation. Firstly,
practitioners differ in whether they adopt a consensus or a consistency approach. Secondly, moderation can be conducted
at different points (before marking, mid-way or after marking is complete) and with different aims. These practices vary
across contexts.
Emphasis of moderation on consistency versus consensus
In practice, moderation combines aspects of both the consistency and consensus approaches described above. There might be
an element of checking the marker's consistency, but consensus on marks across markers is also commonly sought. Most
interviewees reported applying a mixture of both while prioritising inter-rater consensus on marks over checking
consistency. In practice, this priority translates into a form of second marking exercise where the aim is to check that
both markers agree on the marks given.
Most interviewees approached moderation primarily as an exercise in reaching agreement on marks; only one interviewee
described a consistency-checking approach. This colleague's moderation consisted primarily of checking another marker's
consistency in applying the marking criteria.
Moderation types
The main types of moderation of marking are pre-marking and post-marking moderation. Post-marking moderation is conducted
in all cases. Pre-marking moderation might be conducted in addition to it.
- pre-marking moderation: moderation that takes place before marking. Some teams engage in it in a range of ways:
discussions on standards and expectations of performance; or team marking, where each team member marks a set of scripts,
followed by a meeting to discuss the standards applied and marks awarded. These meetings may result in refining a marking
grid for markers.
- post-marking moderation: moderation that takes place after marking. This is the most common type at NTU. It generally
checks consistency of marking and looks at fails and borderline cases. Practices vary, as described above, according to
whether the emphasis is placed on consensus or on consistency checking. They also vary by mode: some teams meet face to
face, while others routinely conduct moderation without a face-to-face meeting. Factors which influence this process are
time constraints and perceptions of necessity.
The amount of moderation and the way samples are chosen also vary widely. For example, on some occasions the first marker
might make a selection of scripts to be moderated. On other occasions, the complete set of marked scripts is made
available to the moderator so that they can make their own selection.
Reflecting on the nature of moderation: principles and practices
In the “marking as a sequence” section, the challenges of marking essays reliably were explored. The relatively
unconstrained nature of essays poses challenges for the reliability of marking (Brown, 2001), creating an issue for both
inter- and intra-marker consistency. The reported practices at NTU suggest a preference for inter-marker agreement as a
form of moderation. This discussion is based on a small number of interviews, however, so this conclusion might not be
representative of team practices more broadly.
Literature on the relative effectiveness of the two approaches
According to Brown (2001), moderation is more effective and efficient if interpreted as an exercise to verify the
self-consistency of a marker. This is because essay-type answers make inter-marker agreement intrinsically difficult to
reach. Reaching agreement between markers (the consensus-oriented approach) is not superior to reaching a high level of
marker self-consistency for improving the reliability of marks. Equally, from a practical perspective, an emphasis on
inter-marker consensus to some extent resembles double marking and is far more resource-intensive than a
consistency-check approach.
Meeting face to face in a post-marking moderation exercise might not be necessary if the exercise is conducted as a check
on a marker's self-consistency. Equally, face-to-face interaction might pose challenges to the transparency of the
exercise.
Most of the literature also indicates that trial marking in teams before marking begins might be more effective than
second marking as post-marking moderation (Brown, 2001; Meadows and Billington, 2005).
However, no single method can ensure reliability on its own, particularly in marking essay answers. A greater number of
checks does not necessarily improve reliability unless an extensive standardisation exercise is conducted. One technique
known to enhance reliability robustly is multiple marking, where all markers (e.g. 2 or 4) read and award marks to all
scripts. The final mark for each script is the average of the marks from all markers (Meadows and Billington, 2005).
Standardisation exercises are conducted in large-scale testing (national tests) but are too labour-intensive for most
other contexts. A certain amount of disparity across markers is natural in essay marking. Given the constraints, it might
be more appropriate for moderation to aim at checking consistency in marking rather than debating individual marks to
reach consensus.
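
As a rough illustration of the multiple marking arithmetic described above, the sketch below averages every marker's
mark for each script. The function name, marker names, script identifiers and marks are all hypothetical and chosen only
for illustration; the sources cited here do not prescribe any particular implementation.

```python
from statistics import mean


def final_marks(marks_by_marker):
    """Average every marker's mark for each script.

    marks_by_marker maps a marker name to {script_id: mark}.
    Multiple marking assumes every marker has marked every script,
    so a mismatch in the sets of scripts is treated as an error.
    """
    all_marks = list(marks_by_marker.values())
    if not all_marks:
        return {}
    script_ids = set(all_marks[0])
    for marks in all_marks[1:]:
        if set(marks) != script_ids:
            raise ValueError("every marker must mark every script")
    # Final mark per script = mean of the marks awarded by all markers.
    return {s: mean(m[s] for m in all_marks) for s in sorted(script_ids)}


# Hypothetical marks, for illustration only.
example = {
    "marker_a": {"script_1": 62, "script_2": 48},
    "marker_b": {"script_1": 58, "script_2": 54},
}
print(final_marks(example))
```

With only two markers, a single divergent mark shifts the average noticeably, which is one way to see why the number of
markers matters in this approach.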
References
BROWN, G., 2001. Assessment: A guide for lecturers. LTSN Assessment series. [Accessed 15 September 2011].
MEADOWS, M. and BILLINGTON, L., 2005. A Review of the Literature on Marking Reliability. AQA Research Paper. RP
CADQ Nottingham Trent University Dryden Centre 202 Dryden Street Nottingham NG1 4FZ