UNIT IV ITEM ANALYSIS IN TEST DEVELOPMENT

UNIT IV ITEM ANALYSIS IN TEST DEVELOPMENT • CHAP 14: ITEM ANALYSIS • CHAP 15: INTRODUCTION TO ITEM RESPONSE THEORY • CHAP 16: DETECTING ITEM BIAS

CHAPTER 14 ITEM ANALYSIS *The goal of test construction is to create a test with minimum length and good reliability and validity. *Item Analysis is the computationand examination of any statistical property of an item response distribution. *Item Analysis is a process that we go through when constructing a new test or subtests from a pool of items withgood reliability and validity.

CHAPTER 14 ITEM ANALYSIS • *Categories of Item Parameters *Item parameters fall into 3 categories of indices. 1. Indices that describe the distribution of responses to a single item (e.g., mean and variance of item responses). 2. Indices that describe the degree of relationship between the response to the item and some criterion of interest. (Example on the next slide.)

CHAPTER 14 ITEM ANALYSIS • Ex. The relationship between the questions (items) and the criterion of interest, e.g., depression, in Factor Analysis. • 3. Indices that are a function of both the item mean/variance and the relationship to a criterion of interest. • Ex. First find the mean and variance of your items, then calculate the relationship between the item variances and the criterion of interest (e.g., depression) for two groups.
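As a rough illustration of the first two categories, the sketch below computes the mean and variance of one dichotomous item and its correlation with a criterion score. The data, the variable names, and the helper pearson_r are made up for this example; with a 0/1 item, the Pearson correlation is the point-biserial correlation.

```python
import math
import statistics

def pearson_r(x, y):
    """Pearson correlation; with a 0/1 item this is the point-biserial r."""
    mean_x, mean_y = statistics.fmean(x), statistics.fmean(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical data: one item scored 1/0 and a criterion score per examinee
item = [1, 1, 0, 0]
criterion = [10, 8, 4, 2]

# Category 1: distribution of responses to a single item
item_mean = statistics.fmean(item)
item_variance = statistics.pvariance(item)  # equals p * (1 - p) for a 0/1 item

# Category 2: relationship between the item and the criterion
r = pearson_r(item, criterion)
print(item_mean, item_variance, round(r, 3))
```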

CHAPTER 14 ITEM ANALYSIS • Item Difficulty “P” • P = f/N, i.e., the number of examinees who answered the item correctly (f) divided by the total number of examinees (N). (See your midterm item analysis and Chap 5.) The higher the P value, the easier the item.
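A minimal sketch of P = f/N, assuming responses are already scored 1 (correct) or 0 (incorrect). The score matrix and the function name item_difficulty are hypothetical.

```python
def item_difficulty(responses):
    """Return P = f / N for one item; responses are 1 (correct) or 0 (incorrect)."""
    n_examinees = len(responses)
    if n_examinees == 0:
        raise ValueError("need at least one examinee")
    n_correct = sum(responses)  # f
    return n_correct / n_examinees

# Hypothetical scored responses: rows are examinees, columns are items
scores = [
    [1, 1, 0],
    [1, 0, 0],
    [1, 1, 0],
    [1, 1, 1],
]

# zip(*scores) turns the rows into one column of responses per item
p_values = [item_difficulty(column) for column in zip(*scores)]
print(p_values)  # [1.0, 0.75, 0.25]
```

Here the first item is the easiest (everyone answered it correctly) and the third is the hardest.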

CHAPTER 14 ITEM ANALYSIS • *Steps in Item Analysis • In a typical item analysis, the test developer takes 7 steps (they are similar to the process of test construction in Chapter 4; see the next two slides).

FYI: Process of Test Construction (Chap 4) • 1- Identifying purposes of test score use • 2- Identifying behaviors to represent the construct • 3- Preparing test specifications, e.g., Bloom's Taxonomy • 4- Item construction • 5- Item review

Process of Test Construction • 6- Preliminary item tryouts • 7- Field test • 8- Statistical analysis • 9- Reliability and validity • 10- Guidelines

CHAPTER 14 ITEM ANALYSIS • *7 Steps in Item Analysis • 1. Describe what properties of the test scores are of greatest importance. Ex. When I select questions for your midterm/final exam, I look for questions similar to those on the qualifying/comprehensive exam or the EPPP exam.

CHAPTER 14 ITEM ANALYSIS • *Steps in Item Analysis • 2. Identify the item parameters (e.g., mean, variance) most relevant to these properties. 3. Administer the items to a sample of examinees representative of those for whom the test is intended. Ex. An IQ test for children or a depression test for adults.

CHAPTER 14 ITEM ANALYSIS • Steps in Item Analysis • 4. Estimate for each item the parameters identified in step 2 (e.g., variance). 5. Establish a plan for item selection. Ex. Use the item difficulties (P) from the item analysis to select the items.

CHAPTER 14 ITEM ANALYSIS • Steps in Item Analysis • 6. Select the final subset of items, or use the data (the items in your item analysis) for test revision. Ex. Take out all questions with very high or very low item difficulties. 7. Conduct a cross-validation (validity) study. Ex. Use SPSS to compare the results of 2 tests or 2 classes (e.g., this year's class and last year's class), e.g., with Confirmatory Factor Analysis.
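Steps 5 and 6 can be sketched as a simple filter on item difficulty. The cutoffs 0.20 and 0.80 are an assumption chosen only for illustration (the slides do not give cutoffs), and the item names and P values are hypothetical.

```python
def select_items(p_values, low=0.20, high=0.80):
    """Keep items whose difficulty P lies within [low, high].

    The default cutoffs are illustrative assumptions, not values from the slides.
    """
    return [item for item, p in p_values.items() if low <= p <= high]

# Hypothetical item difficulties from an item analysis
p_values = {"q1": 0.95, "q2": 0.60, "q3": 0.10, "q4": 0.45}

kept = select_items(p_values)
print(kept)  # ['q2', 'q4']
```

The very easy item (q1) and the very hard item (q3) are dropped; the remaining items would then go to the cross-validation study in step 7.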

UNIT V TEST SCORING AND INTERPRETATION • CHAP 17: CORRECTING FOR GUESSING AND OTHER SCORING METHODS • CHAP 18: SETTING STANDARDS • CHAP 19: NORMS AND STANDARD SCORES • CHAP 20: EQUATING SCORES FROM DIFFERENT TESTS

UNIT V TEST SCORING AND INTERPRETATION • CHAPT 19 • NORMS AND STANDARD SCORES

CHAPTER 19 NORMS AND STANDARD SCORES • *Alfred Binet (1910): Ratio IQ = ratio of MA (mental age) to CA (chronological age) • *Lewis Terman: Ratio IQ = (MA/CA) × 100; he standardized it. • *Deviation IQ: uses norms to estimate the IQ. We use norms when we want to compare an examinee's raw score on a test to the distribution of scores (scaled or standard scores) for a sample from a well-defined population. (Example on the next slide.)

CHAPTER 19 NORMS AND STANDARD SCORES • Ex. When we want to estimate the IQ of a 20-year-old person, we compare his/her raw score on the subtests of an IQ test with the scores of people of his/her age, which are his/her norm (standard scores). This technique tells us where the person stands among people of his/her age.
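The two kinds of IQ can be sketched as follows. The ratio IQ follows the (MA/CA) × 100 formula above. For the deviation IQ, the slides do not give a formula, so the sketch assumes the common convention of a mean of 100 and a standard deviation of 15 in the age norm group; the function names and the example numbers are hypothetical.

```python
def ratio_iq(mental_age, chronological_age):
    """Ratio IQ = (MA / CA) * 100."""
    return mental_age / chronological_age * 100

def deviation_iq(raw_score, norm_mean, norm_sd, iq_mean=100, iq_sd=15):
    """Place a raw score on an IQ scale using the age group's norms.

    iq_mean=100 and iq_sd=15 are an assumed convention, not taken from the slides.
    """
    z = (raw_score - norm_mean) / norm_sd  # standing within the age group
    return iq_mean + iq_sd * z

print(ratio_iq(10, 8))           # 125.0
print(deviation_iq(60, 50, 10))  # 115.0
```

In the second call the examinee scored one standard deviation above the mean of his/her age group, which corresponds to 115 on the assumed scale.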

NORMS AND STANDARD SCORES • *9 Basic Steps in Conducting a Norming Study (p. 432) • 1. Identify the population of interest. Ex. Students, employees of a company, inmates, patients, etc. 2. Identify the most critical statistics that will be computed for the sample data. Ex. Standard deviation (σ), variance (σ²), mean (M), sum of squares (SS), proportion (p).

NORMS AND STANDARD SCORES • *9 Basic Steps in Conducting a Norming Study (p. 432) • 3. Decide on the tolerable amount of sampling error, that is, the discrepancy between the sample statistic (M) and the population parameter (µ), i.e., M − µ. The Central Limit Theorem describes the distribution of sample means by 3 characteristics: 1. Central tendency (the mean of the sample means equals µ). 2. The shape of the distribution (it approaches normal as n increases). 3. Variability, i.e., the standard error of the mean (σm).

9 Basic Steps in Conducting a Norming Study (p. 432) • 4. Devise a procedure for drawing a sample from the population of interest. There are 4 types of probability sampling. • I. Simple Random Sampling: give everyone in the population an equal chance of being selected. Ex. Draw names from a hat. • II. Systematic Sampling: compute k = N/n and select every kth name on the list. Ex. CAU population N = 1500 and sample size n = 150, so k = N/n = 1500/150 = 10; select every 10th student.

9 Basic Steps in Conducting a Norming Study (p. 432) • Sampling cont. • III. Stratified Sampling: “strata” means different layers. We use stratified sampling when we want to compare 2 different groups (e.g., male and female CAU doctoral students). First we randomly select males, then we randomly select females.

9 Basic Steps in Conducting a Norming Study (p. 432) • Sampling cont. • IV. Cluster Sampling: we use cluster sampling when the population consists of units rather than individuals, such as classes. Ex. Miami-Dade school district: if we want to conduct research with Miami-Dade 2nd graders (1,000 2nd-grade classes), we randomly select about 10 of these 1,000 classes to be in our sample and then conduct the research.
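The four sampling procedures can be sketched with Python's standard random module. Everything below is illustrative: the population lists, the 600/900 split between strata, the 20 pupils per class, and the fixed seed are assumptions. The systematic sample also uses a random starting point within the first k names, which is a common refinement rather than something stated in the slides.

```python
import random

rng = random.Random(0)  # fixed seed so the sketch is repeatable

# Hypothetical population list, N = 1500
population = [f"student_{i:04d}" for i in range(1500)]
n = 150

# I. Simple random sampling: every student has an equal chance
simple = rng.sample(population, n)

# II. Systematic sampling: k = N / n, then every kth name from a random start
k = len(population) // n
start = rng.randrange(k)
systematic = population[start::k]

# III. Stratified sampling: sample randomly within each stratum
strata = {"male": population[:600], "female": population[600:]}  # assumed split
stratified = {name: rng.sample(members, 75) for name, members in strata.items()}

# IV. Cluster sampling: randomly pick whole classes, then use everyone in them
classes = {
    f"class_{c:04d}": [f"class_{c:04d}_pupil_{p:02d}" for p in range(20)]
    for c in range(1000)
}
chosen_classes = rng.sample(sorted(classes), 10)
cluster = [pupil for name in chosen_classes for pupil in classes[name]]

print(len(simple), k, len(systematic), len(cluster))
```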

9 Basic Steps in Conducting a Norming Study (p. 432) • 5. Estimate the minimum sample size (n) required to hold the sampling error within the specified limits. There are different statistical procedures for estimating n; n should be ≥ 30. • 1. n = (σ/d)², where d is the effect size, d = (M − µ)/σ. • 2. n = (σ/σm)², where σm = σ/√n is the standard error of the mean for a population (ex. z score). • Sm = S/√n is the estimated standard error of the mean for a sample (ex. t distribution).
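A minimal sketch of the second formula, n = (σ/σm)², which is σm = σ/√n solved for n. The function name and the example values are hypothetical; the floor of 30 reflects the n ≥ 30 guideline above.

```python
import math

def min_sample_size(sigma, tolerable_se, floor=30):
    """Smallest n with sigma / sqrt(n) <= tolerable_se, but never below floor."""
    if sigma <= 0 or tolerable_se <= 0:
        raise ValueError("sigma and tolerable_se must be positive")
    n = math.ceil((sigma / tolerable_se) ** 2)  # round up to a whole examinee
    return max(n, floor)

# Hypothetical case: sigma = 15 and a tolerable standard error of 1.5 points
print(min_sample_size(15, 1.5))  # 100
```

Halving the tolerable standard error roughly quadruples the required sample size, because n grows with the square of σ/σm.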

NORMS AND STANDARD SCORES • The Effect Size • Ex. Independent-samples t-test (two groups)
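The formula on this slide did not survive in the transcript. One common effect size for two independent groups is Cohen's d with a pooled standard deviation; the sketch below assumes that choice, and the function name and scores are hypothetical.

```python
import math
import statistics

def cohens_d(group_a, group_b):
    """Cohen's d for two independent groups, using the pooled standard deviation.

    Assumed choice of effect size; the slide's own formula is not available.
    """
    n_a, n_b = len(group_a), len(group_b)
    var_a = statistics.variance(group_a)  # sample variance (n - 1 denominator)
    var_b = statistics.variance(group_b)
    pooled_sd = math.sqrt(((n_a - 1) * var_a + (n_b - 1) * var_b) / (n_a + n_b - 2))
    return (statistics.fmean(group_a) - statistics.fmean(group_b)) / pooled_sd

# Hypothetical scores for two classes
print(cohens_d([6, 8, 10], [2, 4, 6]))  # 2.0
```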

9 Basic Steps in Conducting a Norming Study (p. 432) • 6. Draw the sample and collect the data. • 7. Compute the values of the group statistics of interest and their standard errors: Sm = S/√n or σm = σ/√n. The standard error of the mean estimates the typical discrepancy between M and µ, also known as sampling error.
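Step 7 can be sketched for the mean as follows. The function names and the sample are hypothetical; statistics.stdev uses the n − 1 denominator, which matches the sample estimate S.

```python
import math
import statistics

def estimated_standard_error(sample):
    """Sm = S / sqrt(n), using the sample standard deviation S."""
    return statistics.stdev(sample) / math.sqrt(len(sample))

def population_standard_error(sigma, n):
    """sigma_m = sigma / sqrt(n), when the population sigma is known."""
    return sigma / math.sqrt(n)

# Hypothetical norming sample of raw scores
sample = [2, 4, 4, 4, 5, 5, 7, 9]
print(statistics.fmean(sample), round(estimated_standard_error(sample), 3))
print(population_standard_error(15, 100))  # 1.5
```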

9 Basic Steps in Conducting a Norming Study (p. 432) • 8. Identify the types of normative scores that will be needed, and prepare the normative score conversion table (see the next 2 slides). 9. Prepare written documentation of the normative scores.

NORMS AND STANDARD SCORES • Types of Normative Scores • Raw Score: score on a subtest or a test. • Scaled Score: normative score for a specific age.

Normative Scores (Wechsler)

*Normative Scores

NORMS AND STANDARD SCORES • *Usefulness of Scaled Scores • Scaled scores are useful for two purposes: 1. Scaled scores relate the examinee's performance to the percentile rank scores of the norm group and their grade level. 2. In evaluation and research, the mean scaled score is a better estimate of average group performance than the mean raw score.
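To make the link between raw scores, scaled scores, and percentile ranks concrete, here is a small sketch. The norm group is made up, the percentile rank counts half of the tied scores (one common definition), and the scaled score assumes a linear conversion to a scale with mean 10 and standard deviation 3, which is an assumption and not a conversion table from the slides.

```python
def percentile_rank(raw_score, norm_scores):
    """Percent of the norm group below raw_score, counting half of the ties."""
    below = sum(score < raw_score for score in norm_scores)
    equal = sum(score == raw_score for score in norm_scores)
    return 100 * (below + 0.5 * equal) / len(norm_scores)

def scaled_score(raw_score, norm_mean, norm_sd, scale_mean=10, scale_sd=3):
    """Linear conversion of a raw score to an assumed scale (mean 10, SD 3)."""
    z = (raw_score - norm_mean) / norm_sd
    return round(scale_mean + scale_sd * z)

# Hypothetical raw scores for one age group in the norming sample
norm_group = [10, 12, 14, 14, 16, 18, 20, 22, 24, 30]

print(percentile_rank(16, norm_group))  # 45.0
print(scaled_score(16, 18, 6))          # 9
```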

Normative Scores • Multiply by 5 to convert to percentile. This means that neither the USA nor Iran uses a normal distribution in its grading system: the USA distribution is negatively skewed and the Iran distribution is positively skewed.

CHAPTER 19 NORMS AND STANDARD SCORES • *Echternacht (1971): 3-Step Process for Grade and Age Equivalent Scores • 1. Convert the raw scores to scaled scores. • 2. Calculate the median scaled score for each grade level and plot the medians on a bivariate scatter plot. • 3. Connect the points and draw a smooth curve. It is similar to the Deviation IQ, i.e., the child's performance is compared with that of others at a particular age or grade level.
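The three steps can be sketched roughly as follows. Note the substitution: instead of drawing a smooth curve through the medians, the sketch connects them with straight lines (linear interpolation). The scaled scores are hypothetical, and the sketch assumes the medians increase with grade level.

```python
import statistics

# Step 1 is assumed done: hypothetical scaled scores, grouped by grade level
scaled_by_grade = {2: [38, 40, 42], 3: [48, 50, 52], 4: [58, 60, 62]}

# Step 2: median scaled score for each grade, as (median, grade) points
medians = sorted(
    (statistics.median(scores), grade) for grade, scores in scaled_by_grade.items()
)

def grade_equivalent(scaled_score, points):
    """Step 3, simplified: read the grade off straight lines between the medians."""
    for (lo_score, lo_grade), (hi_score, hi_grade) in zip(points, points[1:]):
        if lo_score <= scaled_score <= hi_score:
            fraction = (scaled_score - lo_score) / (hi_score - lo_score)
            return lo_grade + fraction * (hi_grade - lo_grade)
    raise ValueError("scaled score is outside the range of the grade medians")

print(grade_equivalent(45, medians))  # 2.5
```

A scaled score of 45 falls halfway between the grade 2 and grade 3 medians, so its grade equivalent is 2.5.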
