
Challenges of Piloting Test Items

This presentation by Branka Petek discusses the challenges the School of Foreign Languages (SFL) in Slovenia faced when piloting test items. It explains why piloting is needed: a clear picture of candidates' language skills requires good test items, and good test items cannot be produced without pre-testing. The challenges covered are finding an appropriate population for piloting, administering the items, choosing the test format, and carrying out statistical analyses. The lessons learned concern population size, similarity to the testing population, level of proficiency, test fatigue, reliable administration, course timing, and test format. Statistical analyses based on classical test theory (CTT) and item response theory (IRT) are also discussed. The presentation concludes that piloting is an essential part of the testing cycle and that reliable results require an investment of time, effort, and money.


Presentation Transcript


  1. Challenges of Piloting Test Items • Branka Petek, School of Foreign Languages, Slovenia

  2. Content • Challenges Slovenia had to face when piloting test items • What we learned from experience

  3. Why pilot test items? • To get a clear picture of candidates’ language skills. • To get a clear picture, we need good test items. • It is impossible to have good test items without pre-testing.

  4. Challenges SFL had to face • Appropriate population for piloting • Administration of the items • Test format • Statistical analyses

  5. Population for piloting • Size • Similarity to the Slovenian testing population • Level of proficiency • Test fatigue

  6. Lessons learned • SIZE: the population should be as big as possible, but anything is better than nothing; • SIMILARITY: the population should be similar to the testing population; • LEVEL OF PROFICIENCY: proficiency should follow a normal (or near-normal) distribution, otherwise the results will be unreliable; • TEST FATIGUE: Have the candidates taken part in piloting before? Are they tired of taking tests and piloting?

  7. Administration • Administrators • Time • Courses • Collecting data on test takers

  8. Lessons learned • ADMINISTRATORS: the results are most reliable when we administer the tests ourselves; • TIME: depends on the course cycle; • COURSES: courses designed to prepare students for STANAG tests normally give the most reliable results; • QUESTIONNAIRES: if well designed, they help investigate the face validity of tests, the time allocated, the clarity of rubrics, the appropriacy of test methods, and text topics.

  9. Test format • Length • Number of items • Task types • Topics (cultural background, influence of the course)

  10. Lessons learned • LENGTH: similar to the live test version; • NUMBER OF ITEMS: approximately the same number of items as in the live test; • TASK TYPES: different countries use different methods, so candidates might not be familiar with the task types we use; • FAMILIARITY WITH THE TOPICS: e.g. military topics (cultural background).

  11. Statistical analyses • CTT • IRT • ‘Manual check’ • The influence of a particular population

  12. Lessons learned • With a small population, CTT is the only option; • Sometimes there are fewer than 30 test takers, which calls for manual checking: looking for odd answers and strange behaviour can help eliminate some problems and improve the items; • With a small population the data is less reliable, so there is always an element of risk.

  13. Piloting in a perfect world and in the real world • A perfect-world piloting session would mean at least 300 test takers, IRT analysis, revising the test items, repiloting, a second IRT analysis, a final version of the test, and experts to determine cut-off scores. • In the real world, piloting is difficult to plan and carry out. • It is an absolutely essential part of the testing cycle. • Piloting internationally can produce more reliable results but also presents many pitfalls we have to be aware of. • Being aware of possible problems might help us plan. • The more we invest (in time, effort and money), the more we get.

  14. Thank you Questions, suggestions?
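
Slides 11 and 12 name CTT as the only workable analysis for a small pilot population. The slides do not say which statistics were computed, so the following is only a minimal sketch of two common CTT item statistics: item facility (the proportion of test takers who answered correctly) and item discrimination (here the correlation between an item and the total score on the remaining items). The function name `item_analysis` and the `pilot_scores` data are hypothetical and made up purely for illustration.

```python
from statistics import mean, pstdev


def item_analysis(responses):
    """Compute facility and discrimination for each item.

    responses: one row per test taker, each row a list of 0/1 item scores.
    (Hypothetical helper; not taken from the presentation.)
    """
    n_items = len(responses[0])
    results = []
    for i in range(n_items):
        item = [row[i] for row in responses]
        # Rest score: total score excluding the item under analysis.
        rest = [sum(row) - row[i] for row in responses]
        facility = mean(item)
        sd_item, sd_rest = pstdev(item), pstdev(rest)
        if sd_item == 0 or sd_rest == 0:
            # No variation (e.g. everyone answered correctly), so the
            # correlation is undefined; flag the item for a manual check.
            discrimination = None
        else:
            m_item, m_rest = mean(item), mean(rest)
            cov = mean((x - m_item) * (y - m_rest) for x, y in zip(item, rest))
            discrimination = cov / (sd_item * sd_rest)
        results.append(
            {"item": i + 1, "facility": facility, "discrimination": discrimination}
        )
    return results


# Made-up scores for six test takers on four items (illustration only).
pilot_scores = [
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [1, 1, 1, 1],
    [1, 0, 0, 0],
    [1, 1, 0, 1],
]

for row in item_analysis(pilot_scores):
    print(row)
```

With a population this small the figures are unstable, which is the risk slide 12 points out. An item with no variation, or with a discrimination near zero or below, would be a natural candidate for the manual check the slides describe.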
