Addressing the challenge of imperfect website accessibility, the BITV-Test introduces a graded rating scale to assess flaws and determine conformance to WCAG Level AA. The scale offers nuanced evaluations, ensuring both frequency and criticality of accessibility issues are reflected in the final site score. Utilizing a reliable evaluation process, the test aims to provide a fair and accurate assessment of web accessibility.
1. Problem addressed • Real-life websites, even those that strive to be accessible, usually show less-than-perfect accessibility. • WCAG Techniques and Failures use binary (pass/fail) tests, which makes minor flaws hard to handle: should the tester neglect them, or be too strict? • The German BITV-Test (www.bitvtest.eu) uses a 5-point graded rating scale to address this problem.
2. Major difficulties • When rating individual instances, the result often falls somewhere between pass and fail. • Some ratings apply not to instances but to patterns. What level of deficiency constitutes a failure? • Some instances are critical, others minor. • Often, some instances on a page pass while others fail. Should the page then pass or fail a particular success criterion?
3. The graded rating approach • The BITV-Test has 50 checkpoints mapping to WCAG Level AA, each with a weight of 1, 2 or 3 points (adding up to 100 points). • A full “pass” contributes 100% of the checkpoint weight. Further grades: 75%, 50%, 25%, 0%. • Ratings reflect both the frequency and the criticality of flaws. • Results per page are aggregated to a site score (X of 100 points) based on the page sample.
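The weighting arithmetic on this slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the BITV-Test's own implementation: the checkpoint ids and weights are hypothetical toy values (the real test uses 50 checkpoints whose weights add up to 100), and the slide does not say how pages are combined, so the site score is assumed here to be the plain mean of the page scores.

```python
from fractions import Fraction

# The five grades of the rating scale, as fractions of the checkpoint weight.
GRADES = {Fraction(0), Fraction(1, 4), Fraction(1, 2), Fraction(3, 4), Fraction(1)}


def page_score(weights, ratings):
    """Sum weight * grade over all checkpoints for one page.

    weights: checkpoint id -> weight (1, 2 or 3 points)
    ratings: checkpoint id -> grade (0, 0.25, 0.5, 0.75 or 1)
    """
    total = Fraction(0)
    for checkpoint, weight in weights.items():
        grade = Fraction(ratings[checkpoint])
        if grade not in GRADES:
            raise ValueError(f"{checkpoint}: {grade} is not one of the five grades")
        total += weight * grade
    return total


def site_score(weights, pages):
    """ASSUMPTION: the site score is the unweighted mean of the page scores."""
    if not pages:
        raise ValueError("the page sample is empty")
    return sum(page_score(weights, page) for page in pages) / len(pages)


# Hypothetical toy data: three checkpoints worth 6 points in total.
weights = {"cp1": 3, "cp2": 2, "cp3": 1}
pages = [
    {"cp1": 1, "cp2": 0.75, "cp3": 0.5},  # 3 + 1.5 + 0.5 = 5 points
    {"cp1": 0.5, "cp2": 1, "cp3": 0},     # 1.5 + 2 + 0 = 3.5 points
]
print(float(site_score(weights, pages)))  # 4.25 of 6 possible points
```

With the real weights the maximum would be 100, so the result would read directly as "X of 100 points".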
4. The reliability of graded ratings • Reliability can be expressed as the degree of replicability in an independent test by another tester. • The BITV conformance test is conducted as an independent tandem test followed by an arbitration phase. • Arbitration corrects oversights and rectifies ratings that are too lenient or too strict. • Experience shows that the 5-point graded rating scale is quite reliable. A statistics function has been added to quantify inter-evaluator reliability.
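The slide does not say which statistic the BITV-Test computes, so the following is only one possible way to quantify agreement between the two tandem testers before arbitration. It assumes ratings are recorded as percentages (0, 25, 50, 75, 100) per checkpoint; the function name and the two measures (share of identical ratings, mean gap in percentage points) are illustrative choices, not the tool's actual statistics function.

```python
def tandem_agreement(ratings_a, ratings_b):
    """Compare two testers' ratings for the same page.

    ratings_a, ratings_b: checkpoint id -> grade in percent (0, 25, 50, 75, 100)
    Returns (share of identical ratings, mean absolute gap in percentage points).
    """
    shared = ratings_a.keys() & ratings_b.keys()
    if not shared:
        raise ValueError("the testers have no checkpoints in common")
    gaps = [abs(ratings_a[cp] - ratings_b[cp]) for cp in shared]
    exact_share = sum(gap == 0 for gap in gaps) / len(gaps)
    mean_gap = sum(gaps) / len(gaps)
    return exact_share, mean_gap


# Hypothetical ratings from two independent testers.
tester_a = {"cp1": 100, "cp2": 75, "cp3": 50, "cp4": 0}
tester_b = {"cp1": 100, "cp2": 50, "cp3": 50, "cp4": 25}
print(tandem_agreement(tester_a, tester_b))  # (0.5, 12.5)
```

A high share of identical ratings and a small mean gap would support the claim that the graded scale is replicable; the checkpoints where the testers differ are the ones arbitration has to resolve.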