Explore three key strategies used in the UK higher education system to ensure that standards are comparable across institutions: QAA subject benchmarks, qualification frameworks, and the role of external examiners. Examine challenges in degree classification algorithms and marking reliability, and recent developments in the field.
Guide 2: External Examining – National Developments
Liverpool Hope University External Examiner Guidance – 2018
University Registrar
Contents
1. Three strategies for assuring that standards are comparable across institutions: slides 3-5
2. Some challenges: slides 6-12
 2a. Degree classification rules: slides 6-9
 2b. Accuracy and reliability of marking: slides 10-12
3. Recent developments: slides 13-16
 3a. UUK Degree Algorithms Project: slides 13-14
 3b. HEA External Examining Project: slides 15-16
4. And finally: slide 17
Three Strategies for assuring that standards are comparable across institutions [a] • QAA Subject Benchmarks • “Subject Benchmark Statements set out expectations about standards of degrees in a range of subject areas. They describe what gives a discipline its coherence and identity, and define what can be expected of a graduate in terms of the abilities and skills needed to develop understanding or competence in the subject.” [qaa.ac.uk] • To see benchmarks for each subject, go to • http://www.qaa.ac.uk/assuring-standards-and-quality/the-quality-code/subject-benchmark-statements
Three Strategies for assuring that standards are comparable across institutions [b] • QAA Frameworks for Higher Education Qualifications • For example: “Bachelor's degrees with honours [Level 6] are awarded to students who have demonstrated: • a systematic understanding of key aspects of their field of study, including acquisition of coherent and detailed knowledge, at least some of which is at, or informed by, the forefront of defined aspects of a discipline • an ability to deploy accurately established techniques of analysis and enquiry within a discipline • conceptual understanding that enables the student: • to devise and sustain arguments, and/or to solve problems, using ideas and techniques, some of which are at the forefront of a discipline • to describe and comment upon particular aspects of current research, or equivalent advanced scholarship, in the discipline • an appreciation of the uncertainty, ambiguity and limits of knowledge • the ability to manage their own learning, and to make use of scholarly reviews and primary sources”. • For details, go to • http://www.qaa.ac.uk/en/Publications/Documents/qualifications-frameworks.pdf
Three Strategies for assuring that standards are comparable across institutions [c] • External Examiners • External examining has played an important role in UK higher education since the 19th century. • At Liverpool Hope, we are very grateful for the work undertaken by External Examiners, and value their comments very highly. • In principle, the use of external examiners can provide assurance to students, employers and other stakeholders that standards are comparable across institutions. • However, there are at least two reasons to suppose that the External Examining system, even when used in conjunction with the other strategies, cannot currently give complete confidence that standards are always comparable across institutions [or indeed within institutions]. • These reasons are explored in the next section.
2. Challenges:[a] Degree Classification Algorithms (i) • In almost all cases Level 6 Honours degrees are classified using the following scheme: • First class [typically 70+] • Upper second class [typically 60-69] • Lower second class [typically 50-59] • Third class [typically 40-49]. • Level 7 Masters degrees tend to be classified differently: • Distinction [70+] • Merit [60+] • Pass [50+ or 40+]. • However: • the scale can vary across Universities [especially for Masters degrees]; • each University is able to set its own detailed rules: there is no national benchmark.
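As a minimal sketch, the typical Level 6 banding above can be expressed as a simple lookup. The boundaries here are illustrative only, since each university sets its own detailed rules:

```python
# Illustrative sketch of the typical honours bands described above.
# Band boundaries vary by institution; these are the common values.
def classify_honours(overall: float) -> str:
    """Map an overall percentage mark to the usual honours bands."""
    if overall >= 70:
        return "First class"
    if overall >= 60:
        return "Upper second class"
    if overall >= 50:
        return "Lower second class"
    if overall >= 40:
        return "Third class"
    return "Fail"

print(classify_honours(68.9))  # Upper second class
```

As the next slide shows, however, the overall figure fed into such a banding can itself be computed in very different ways.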
2. Challenges:[a] Degree Classification Algorithms (ii) – types of variation • Type of rule: • University A overall aggregate [eg 1st if 69.5 or higher overall]; • University B preponderance [eg 1st if 2/3 of modules are 70+]. • Weighting of levels to the overall aggregate: • University C 10% C[4] / 30% I[5] / 60% H[6]; • University D 0% C[4] / 30% I[5] / 70% H[6]; • University E 0% C[4] / 20% I[5] / 80% H[6]; • An overall 68.9 has a different meaning in the 3 institutions. • Which marks are included in the overall aggregate: • University F – ALL; • University G – ignore poorest 20 credits at I[5] & H[6]. • Students have to do better to get 68.9 in F than in G. • Mitigating Circumstances: • University H – ignore [due to fit to sit policy]; • University I – can exclude work affected by MC from calculation. • Students with mit. circs. get a higher mark in I than in H.
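The variations above can be made concrete with a short sketch. This is illustrative only, not any university's actual algorithm: the same module marks are run through an "aggregate" rule (University A style) and a "preponderance" rule (University B style), and the same per-level averages through the C/D/E level weightings:

```python
# Illustrative only: same marks, different rules, different outcomes.
marks = [72, 71, 70, 70, 62, 61]  # six equally weighted modules

# University A style (aggregate): a first if the mean is 69.5+.
mean = sum(marks) / len(marks)  # 67.67 -> not a first
aggregate_first = mean >= 69.5

# University B style (preponderance): a first if 2/3 of modules are 70+.
preponderance_first = sum(m >= 70 for m in marks) >= (2 / 3) * len(marks)

print(aggregate_first, preponderance_first)  # False True

# Level weighting: identical per-level means give three different
# overall aggregates under the C/D/E weighting schemes above.
level_means = {4: 55.0, 5: 65.0, 6: 70.0}
weightings = {"C": (0.10, 0.30, 0.60),
              "D": (0.00, 0.30, 0.70),
              "E": (0.00, 0.20, 0.80)}
for uni, (w4, w5, w6) in weightings.items():
    overall = w4 * level_means[4] + w5 * level_means[5] + w6 * level_means[6]
    print(uni, round(overall, 1))  # C 67.0, D 68.5, E 69.0
```

The same student is a first under the preponderance rule but not under the aggregate rule, and the same per-level performance yields overall figures two points apart depending on the weighting.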
2. Challenges:[a] Degree Classification Algorithms (iii) – impact on students’ awards • Woolf & Turner (1997): • took sets of marks from real UK graduates, and ran each student’s marks through classification rules from different UK Universities; • 15% of the graduates might have been given a different classification by another University. • In 2015, 370,910 students gained a classified award [HESA, 2016]: • 15% of 370,910 is roughly 55,600; • over 55,000 2015 graduates might have been given a different classification by another University!
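A back-of-envelope check of the scaling above, applying Woolf & Turner's 15% figure to the 370,910 classified awards reported by HESA for 2015:

```python
# Scaling Woolf & Turner's 15% estimate to the 2015 HESA cohort.
# Integer arithmetic first to keep the result exact.
classified_awards_2015 = 370_910
potentially_different = 15 * classified_awards_2015 / 100
print(potentially_different)  # 55636.5
```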
2. Challenges:[a] Degree Classification Algorithms (iv) – reflections • Role of External Examiners? • external examiners CAN provide assurance that a University is following its own rules; • they CANNOT, in principle, provide assurance that the University’s classification rules are in line with national standards; • they CANNOT, in principle, provide assurance that “a first class degree in Institution X means the same as a first class degree in all other institutions”. • Can they at least provide assurance that every mark is accurate?
2. Challenges:[b] Accuracy and reliability of marking (i) • It is sometimes assumed that an External Examiner can provide assurance that each assessment has been given the “correct” mark on a percentage scale. • This may hold for some assessments [eg multiple choice tests], but the nature of most assessments means that in many cases, it is unlikely that anyone can mark reliably to a high degree of precision. • Numerous studies have demonstrated the unreliability of marking; perhaps the most interesting is the study on the next slide.
2. Challenges:[b] Accuracy and reliability of marking (ii) • Bloxham et al (2015)** • 24 academics in Psychology, Chemistry, Nursing and History [all external examiners] each assessed 5 assessments in their subject; • all work had previously been awarded marks at the 2ii/2i borderline, but • participants’ grades ranged from 1st class to 3rd, • only 7 participants placed all 5 in adjacent classes; • in relation to rank ordering: • just 1 assessment [History] was given the same rank by all examiners, • 9 assessments were ranked both best and worst by different examiners; • evidence of differences even when examiners were looking for the same criteria! • ** Sue Bloxham, Birgit den-Outer, Jane Hudson & Margaret Price (2015): “Let’s stop the pretence of consistent marking: exploring the multiple limitations of assessment criteria”, Assessment & Evaluation in Higher Education, DOI: 10.1080/02602938.2015.1024607
2. Challenges:[b] Accuracy and reliability of marking (iii) • Reflections: • it is important for all examiners [internal and external] to be aware of the limits to the precision with which some assessments can be graded or marked; • assuming that it will not be possible to identify the precise mark that every assessment is worth, an External Examiner: • ought not to impose mark changes for individual students, • but could usefully provide overall feedback about • broad standards achieved by students, and • the appropriateness of assessment tasks and exam questions; • in principle, it might be possible to enhance reliability if consideration were given to developing national expectations for awarding different grades [probably at discipline level].
3. Recent Developments:[a] UUK Degree Algorithms Project (i) • UUK is conducting a project for HEFCE • Objectives: • To “look at the range of models employed by the sector”. • To “assess whether there are trends that may undermine wider confidence in degree standards”. • To “consider whether concerns about threshold effects at degree boundaries are influencing the types of algorithms…being employed”. • Questions: • “What are the pedagogical foundations of degree algorithms?” • “What other drivers of change are affecting the design of…algorithms?” • “Who are the internal and external stakeholders?” • “What are the characteristics of the different rules and processes that are currently used?” • “What principles should be taken into account when considering changes?”
3. Recent Developments:[a] UUK Degree Algorithms Project (ii) The report highlights how, between 2007-08 and 2016-17, the combined number of first-class and upper second-class (1st and 2:1, or 'upper') honours degrees awarded in the UK grew by 55 per cent. The proportion of 1sts has doubled from 13 per cent to 26 per cent of all classified degrees, and 75 per cent of undergraduate students are now expected to graduate with upper degrees. The trends present two main challenges: • the need to maintain and strengthen the public's confidence in the integrity of academic standards in the context of improving student attainment, and • the need to respond to ongoing improvements in student attainment within the current approach to classifying and calibrating student attainment, to enable differentiation. The report sets out the case that these complex challenges require a clear and demonstrable response, individually by institutions and collectively by the UK and national sectors.
4. And finally…… • We hope you have found this presentation useful. • If you have any queries please email Dr Cathy Walsh [University Registrar] at walshc@hope.ac.uk