VALIDITY
MAJOR KINDS OF VALIDITY • Validity: the test measures what it was designed to measure. • content validity • criterion validity • construct validity
CONTENT VALIDITY • Content validity refers to the degree to which the items on the test reflect the intended domain. There are two aspects to this part of validation: content relevance and content coverage.
Content Relevance • The investigation of content relevance requires the specification of the behavioral domain in question and the attendant specification of the task or test domain. It is generally recognized that this involves the specification of the ability domain. Examining content relevance also requires the specification of the test method facets.
Content Coverage • Content coverage is the extent to which the tasks required in the test adequately represent the behavioral domain in question.
Content validity • Content validity is established through a logical analysis, which is basically an analysis of the correspondence between the test items and the content being covered. In many cases this is determined by how closely the items match the objectives. Content validity is established not by statistical analysis but by inspection of the items, so it does not generate a validity coefficient. This is different from reliability and from other forms of validity, where the evidence is expressed in terms of test scores and their statistical properties.
Test of Content Validity • A test is administered twice, once before and once after the instruction. A significant difference between the two sets of scores is taken as evidence of content validity. • $t = \dfrac{\bar{X}_1 - \bar{X}_2}{\sqrt{\dfrac{\sigma_{x_1}^2 + \sigma_{x_2}^2 - 2r\sigma_{x_1}\sigma_{x_2}}{n-1}}}$ • where • $\bar{X}_1$: mean of Test 1 • $\bar{X}_2$: mean of Test 2 • $\sigma_{x_1}$: standard deviation of Test 1 • $\sigma_{x_2}$: standard deviation of Test 2 • $r$: correlation coefficient between Test 1 and Test 2 • $n$: number of test takers • The decision is made by comparing $t$ with the critical value in the t-test table.
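The t formula on this slide can be sketched in Python as follows. This is a minimal illustration, not a definitive implementation: the score lists `pre` and `post` are made-up data, the function names are hypothetical, and the standard deviations are assumed to be population values (divided by n), which is what makes the n-1 in the denominator give the usual paired t statistic.

```python
import math

def mean(xs):
    return sum(xs) / len(xs)

def pop_sd(xs):
    # Population standard deviation (divides by n), assumed for the sigma terms.
    m = mean(xs)
    return math.sqrt(sum((x - m) ** 2 for x in xs) / len(xs))

def pearson_r(xs, ys):
    # Product-moment correlation between the two administrations.
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / len(xs)
    return cov / (pop_sd(xs) * pop_sd(ys))

def paired_t(test1, test2):
    # t = (mean1 - mean2) / sqrt((s1^2 + s2^2 - 2*r*s1*s2) / (n - 1))
    n = len(test1)
    s1, s2 = pop_sd(test1), pop_sd(test2)
    r = pearson_r(test1, test2)
    var_diff = s1 ** 2 + s2 ** 2 - 2 * r * s1 * s2
    return (mean(test1) - mean(test2)) / math.sqrt(var_diff / (n - 1))

# Hypothetical scores for eight students before and after instruction.
pre = [52, 60, 45, 70, 58, 63, 49, 55]
post = [61, 66, 50, 78, 65, 70, 58, 60]

t_value = paired_t(pre, post)
print(f"t = {t_value:.3f} with {len(pre) - 1} degrees of freedom")
```

A negative t here simply means the second administration had the higher mean; its absolute value is what gets compared with the tabled critical value for n-1 degrees of freedom.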
CRITERION VALIDITY • Criterion validity is concerned with the degree to which the test scores are accurate and useful predictors of performance on some other measure. This other measure is called the criterion measure, and it may be a different test, a future behavior pattern, or almost any other variable of interest. Criterion validity therefore involves the relationship, or correlation, between the test scores and scores on some measure representing an identified criterion. The correlation coefficient can be computed between the scores on the test being validated and the scores on the criterion. A correlation coefficient used in this way is called a validity coefficient.
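As a rough sketch of how a validity coefficient might be computed, the snippet below calculates the product-moment correlation between test scores and criterion scores. The function name `validity_coefficient` and both score lists are hypothetical and chosen only for illustration.

```python
import math

def validity_coefficient(test_scores, criterion_scores):
    # Pearson product-moment correlation between test and criterion scores.
    n = len(test_scores)
    mx = sum(test_scores) / n
    my = sum(criterion_scores) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(test_scores, criterion_scores))
    sxx = sum((x - mx) ** 2 for x in test_scores)
    syy = sum((y - my) ** 2 for y in criterion_scores)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical data: scores on the test being validated and on a criterion measure.
new_test = [55, 62, 70, 48, 66, 74, 59, 51]
criterion = [58, 60, 75, 50, 63, 80, 57, 55]

r_xy = validity_coefficient(new_test, criterion)
print(f"validity coefficient = {r_xy:.2f}")
```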
CRITERION VALIDITY • There are two slightly different types of criterion validity: concurrent validity and predictive validity. • Information on concurrent criterion validity is the most commonly used in language testing. Such information typically takes one of two forms:
CRITERION VALIDITY • (1) examining differences in test performance among groups of individuals at different levels of language ability, or • (2) examining correlations among various measures of a given ability. The process of establishing concurrent validity is one of administering the two measures - the criterion measure and the measure being validated - at about the same time.
Test of Criterion Validity • 1. Product-moment correlation coefficient • 2. When one variable is a continuous variable (e.g. scores) and the other is a dichotomous variable (e.g. sex): point-biserial correlation • 3. When one or both of the variables are of the grade (rank) type: rank correlation
Continuum and Dichotomous • $r = \dfrac{\bar{X}_p - \bar{X}_q}{\sigma}\sqrt{pq}$ • where • $p$: proportion of cases in the first category of the dichotomous variable • $q$: proportion of cases in the second category of the dichotomous variable • $\bar{X}_p$: mean of the continuous variable for the cases corresponding to $p$ • $\bar{X}_q$: mean of the continuous variable for the cases corresponding to $q$ • $\sigma$: standard deviation of the continuous variable
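The formula above (commonly called the point-biserial correlation) can be sketched as below. The data and the function name `point_biserial` are hypothetical, the dichotomous variable is assumed to be coded 1 for the first category and 0 for the second, and the standard deviation is assumed to be the population value computed over all scores.

```python
import math

def point_biserial(scores, groups):
    # groups: 1 = first category (proportion p), 0 = second category (proportion q)
    n = len(scores)
    first = [s for s, g in zip(scores, groups) if g == 1]
    second = [s for s, g in zip(scores, groups) if g == 0]
    p = len(first) / n
    q = len(second) / n
    mean_all = sum(scores) / n
    # Population standard deviation of the continuous variable.
    sd = math.sqrt(sum((s - mean_all) ** 2 for s in scores) / n)
    mean_p = sum(first) / len(first)
    mean_q = sum(second) / len(second)
    return (mean_p - mean_q) / sd * math.sqrt(p * q)

# Hypothetical test scores and a two-category grouping variable.
scores = [55, 62, 70, 48, 66, 74, 59, 51]
groups = [1, 0, 1, 0, 0, 1, 1, 0]

r_pb = point_biserial(scores, groups)
print(f"point-biserial r = {r_pb:.2f}")
```

With this coding, the result equals the product-moment correlation between the scores and the 0/1 variable, which is a convenient cross-check.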
Grade Type • $r = 1 - \dfrac{6\sum D^2}{n(n^2-1)}$ • where • $D$: difference between the two grades of each student • $n$: number of students
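A minimal sketch of the grade-type formula (the rank-difference correlation) follows. The two rank lists and the function name `rank_correlation` are hypothetical, and the sketch assumes each student has a distinct rank on each measure (no ties).

```python
def rank_correlation(ranks_a, ranks_b):
    # r = 1 - 6 * sum(D^2) / (n * (n^2 - 1)), with D the per-student rank difference.
    n = len(ranks_a)
    sum_d2 = sum((a - b) ** 2 for a, b in zip(ranks_a, ranks_b))
    return 1 - 6 * sum_d2 / (n * (n ** 2 - 1))

# Hypothetical ranks of five students on a test and on a teacher's rating.
test_ranks = [1, 2, 3, 4, 5]
teacher_ranks = [2, 1, 3, 5, 4]

rho = rank_correlation(test_ranks, teacher_ranks)
print(f"rank correlation = {rho:.2f}")  # prints "rank correlation = 0.80"
```

Here the squared differences sum to 4, so r = 1 - 24/120 = 0.80.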
Concurrent Validity • Concurrent validity is involved if the scores on the criterion are obtained at the same time as the test scores. Predictive validity is involved if we are concerned about a test score's relationship with some criterion measured in the future.
Predictive Validity • Predictive validity indicates the extent to which an individual's future level on the criterion is predicted from prior test performance. For example, when test scores are used for selection purposes, such as choosing individuals for jobs or admitting applicants to college, the predictive validity of the test is of concern.
CONSTRUCT VALIDITY • The determination of construct validity is essentially a search for evidence that will help us understand what the test is really measuring and how the test works across a variety of settings and conditions. A construct is a trait, attribute, or quality: something that cannot be observed directly but is inferred from psychological theory. Tests do not measure constructs directly; rather, they measure performance or behavior that reflects constructs.
Correlational Analysis • A logical analysis of test content can usually give some indication of the number and nature of the constructs reflected by the test. • If tests (or items) measure the same constructs, scores on the tests should be correlated; conversely, scores on tests that measure different constructs should have low correlations.
Correlational Analysis • Suppose we have a test (Test VA) that is hypothesized to measure verbal ability. Scores on this test are correlated with scores on seven other measures: six tests (including another known verbal ability test) and individual weight. The pattern of correlations with these measures is as follows:

Other Measure           Correlation between Test VA and Other Measure
Verbal Ability Test     .85
Creativity Test         .65
Self-Concept Test       .30
Vocabulary Test         .89
Math Aptitude Test      .71
Reading Test            .57
Individual weight       .03
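One way to read the correlations on this slide is to rank the other measures by how strongly they correlate with Test VA; if Test VA measures verbal ability, verbal measures would be expected near the top and unrelated measures such as weight near the bottom. The sketch below does only that ranking. It assumes the seven coefficients on the slide are listed in the same order as the seven measure names.

```python
# Correlations with Test VA, taken from the slide (name/value pairing assumed
# to follow the order in which they are listed).
correlations = {
    "Verbal Ability Test": 0.85,
    "Creativity Test": 0.65,
    "Self-Concept Test": 0.30,
    "Vocabulary Test": 0.89,
    "Math Aptitude Test": 0.71,
    "Reading Test": 0.57,
    "Individual weight": 0.03,
}

# Rank the measures from highest to lowest correlation with Test VA.
ranked = sorted(correlations.items(), key=lambda item: item[1], reverse=True)

for name, r in ranked:
    print(f"{name:<22} {r:.2f}")
```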
Correlation Matrix • One way of assessing the construct validity of a test is to correlate the different test components with each other. • Since the reason for having different test components is that they all measure something different and therefore contribute to the overall picture of language ability attempted by the test, we should expect these correlations to be fairly low, possibly on the order of +.3 to +.5. If two components correlate very highly with each other, say +.9, we might wonder whether the two subtests are indeed testing different traits or skills, or whether they are testing essentially the same thing.
Correlation Matrix • The correlations between each subtest and the whole test, on the other hand, might be expected, at least according to classical test theory, to be higher, possibly around +.7 or more.
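The two kinds of correlation discussed on these slides, subtest with subtest and subtest with the whole test, can be computed as in the sketch below. The subtest names and scores are hypothetical, and the whole-test score is assumed to be the simple sum of the subtest scores.

```python
import math

def pearson_r(xs, ys):
    # Product-moment correlation between two score lists.
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical subtest scores for eight test takers.
subtests = {
    "reading":   [12, 15, 9, 18, 14, 11, 16, 13],
    "listening": [10, 14, 11, 15, 12, 13, 17, 9],
    "writing":   [14, 13, 8, 17, 15, 10, 12, 16],
}
names = list(subtests)

# Whole-test score, assumed to be the sum of the subtest scores.
totals = [sum(person) for person in zip(*subtests.values())]

# Correlation matrix among subtests, and each subtest against the total.
matrix = {a: {b: pearson_r(subtests[a], subtests[b]) for b in names} for a in names}
with_total = {a: pearson_r(subtests[a], totals) for a in names}

for a in names:
    row = "  ".join(f"{matrix[a][b]:+.2f}" for b in names)
    print(f"{a:<10} {row}   total: {with_total[a]:+.2f}")
```

Because each subtest contributes to the total, the subtest-total correlations are inflated by that overlap, which is one reason they tend to be higher than the correlations among subtests.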
Problem: Ambiguity • It is impossible to make clear, unambiguous inferences regarding the influence of various factors on test scores on the basis of a single correlation between two tests. For example, if we found that a multiple-choice test of cohesion were highly correlated with a multiple-choice test of rhetorical organization, there would be three possible inferences: (1) the test scores are affected by a common trait (textual competence); (2) they are affected by a common method (multiple-choice); or (3) they are affected by both trait and method.
Problem: Ambiguity • Because of the potential ambiguities involved in interpreting a single correlation between two tests, correlational approaches to construct validation of language tests have typically involved correlations among large numbers of measures.
Factor Analysis • A commonly used procedure for interpreting a large number of correlations is factor analysis, which is a group of analytical and statistical techniques whose common objective is to represent a set of [observed] variables in terms of a smaller number of hypothetical variables.
Factor Analysis • Factor analysis is a procedure for analyzing a set of correlation coefficients between measures; the procedure analytically identifies the number and nature of the constructs underlying the measures.
Factor Analysis • Factor analysis is a statistical procedure; it does not provide names for the factors or constructs. Factors may be considered artificial variables - they are not variables that are originally measured but variables generated from the data.
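To give a feel for how a factor is "generated from the data", the sketch below extracts the first principal component of a correlation matrix by power iteration and reports the loadings of each measure on it. This is a deliberately simplified stand-in, not a full factor analysis: real factor-analytic procedures also estimate communalities, extract several factors, and usually rotate them. The correlation matrix and measure names are hypothetical.

```python
import math

def first_component_loadings(corr, iterations=200):
    # Power iteration for the dominant eigenvector of a correlation matrix.
    n = len(corr)
    v = [1.0 / math.sqrt(n)] * n
    for _ in range(iterations):
        w = [sum(corr[i][j] * v[j] for j in range(n)) for i in range(n)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    # Rayleigh quotient gives the eigenvalue for the unit-length vector v.
    eigenvalue = sum(v[i] * sum(corr[i][j] * v[j] for j in range(n)) for i in range(n))
    # Loadings: eigenvector entries scaled by the square root of the eigenvalue.
    loadings = [math.sqrt(eigenvalue) * x for x in v]
    return eigenvalue, loadings

# Hypothetical correlations among four measures.
measures = ["vocabulary", "reading", "cohesion", "math"]
corr = [
    [1.00, 0.80, 0.75, 0.20],
    [0.80, 1.00, 0.70, 0.25],
    [0.75, 0.70, 1.00, 0.15],
    [0.20, 0.25, 0.15, 1.00],
]

eigenvalue, loadings = first_component_loadings(corr)
for name, loading in zip(measures, loadings):
    print(f"{name:<12} {loading:.2f}")
```

The procedure returns only numbers. Deciding that a component on which the verbal measures load highly should be called something like "verbal ability" is an interpretive step left to the analyst.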