Validity
Definitions • Extent to which a test measures what it purports to measure • Extent to which a test is used in an impartial, just, and equitable way • Validity is what the test measures and how well it does so (Anastasi, 1954) • A test is valid to the degree that we know what it measures or predicts (Cronbach, 1954) • Validity = trustworthiness
Validity Process • Validity is determined through an ongoing process, not by a single score or decision: • through theory and hypotheses • through correlations, regressions, and factor analysis • through an examination of the consequences • Validity is a characteristic of test scores and their use, not of the test itself
3 Traditional Methods • Content validity • Construct validity • Criterion-related validity • predictive & concurrent Logical Empirical
What is Content Validity? • Are the behaviors sampled by the test representative of the attribute being assessed? • Am I fully measuring what I think I am measuring? • Steps: • Describe the content domain • Identify domains measured by the test • Compare the structure of the test with the content domain to analyze representativeness
Determining Content Validity • Primary outcome is a judgment about how well the test samples the content domains of the attribute • No statistical tests to determine • Easier to assess for concrete domains • Facts vs. abstract/complex concepts
Content Validity Strategies • What can you do to ensure a high degree of content validity? • Align content to standards carefully • Engage multiple stakeholders in the development and auditing process • Seek domain specific expert opinion
What is Construct Validity? What are Constructs? • names associated with hypothetical abstract concepts, but still connected with observable entities Why are they important? • constructs are the central means we have for connecting operations in research to language communities • they often carry social and political implications • the naming of things is a key problem for all sciences
Construct Validity • Does a test provide a good measure of the construct of interest? • Usually an ongoing process that involves continual development, change, and refinement • Takes the form of an argument, presenting evidence for and against
Construct Validity • Construct explication • identify behaviors related to construct (convergent validity) • identify other constructs and decide if they are related or not (discriminant validity) • Establish nomological networks • identify behaviors related to each additional construct and assess relationships • interrelated laws supporting a construct
Types of Construct Validity • Convergent validity • the correlation between “like” behaviors/measures/constructs (e.g., similar or the same constructs) • Discriminant validity • the correlation between “unlike” or dissimilar measures
Calculating Construct Validity • Correlate scores on test with other measures or tests • It should have significant correlations with similar behaviors or tests (convergent) • It should be unrelated to unlike, dissimilar behaviors or tests (discriminant) • Factor analysis (unidimensionality)
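The correlation check described above can be sketched in a few lines of Python. This is a minimal illustration, not a prescribed procedure: the scores, the variable names, and the hand-written `pearson_r` helper are all assumptions made up for this example.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two equal-length lists with at least 2 scores")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when a variable is constant")
    return cov / sqrt(var_x * var_y)


# Hypothetical scores for six test takers (invented for illustration only).
new_test = [12, 15, 9, 20, 17, 11]
similar_measure = [30, 36, 25, 44, 40, 27]  # meant to tap the same construct
unrelated_measure = [5, 6, 5, 5, 4, 5]      # meant to tap something unrelated

convergent_r = pearson_r(new_test, similar_measure)
discriminant_r = pearson_r(new_test, unrelated_measure)

print("convergent r:", round(convergent_r, 2))
print("discriminant r:", round(discriminant_r, 2))
```

A high convergent correlation together with a near-zero discriminant correlation would count as supporting evidence; with real data, sample size and significance would also need to be considered.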
Example: Construct of Love • Define love • Grounded in existing theoretical and popular conceptions of love • Measure it • highly inter-correlated items (r = .85) • factor analysis • assess its relationship to similar variables (convergent) and dissimilar variables, e.g., hate, like (discriminant)
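One way to check whether items are "highly inter-correlated" is to average the correlations over all item pairs. The sketch below assumes three hypothetical items rated by five respondents; the item names and ratings are invented and are not taken from the Herman (2004) example.

```python
from itertools import combinations
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


# Hypothetical 1-5 ratings from five respondents on three scale items.
items = {
    "item_1": [1, 2, 3, 4, 5],
    "item_2": [2, 2, 3, 5, 5],
    "item_3": [1, 3, 3, 4, 4],
}

# Correlate every pair of items, then average the pairwise correlations.
pair_correlations = {
    (a, b): pearson_r(items[a], items[b])
    for a, b in combinations(items, 2)
}
mean_inter_item_r = sum(pair_correlations.values()) / len(pair_correlations)

for pair, r in pair_correlations.items():
    print(pair, round(r, 2))
print("mean inter-item r:", round(mean_inter_item_r, 2))
```

The mean inter-item correlation is only a rough summary; factor analysis, as listed above, is the more thorough check of unidimensionality.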
Example: Love (think evidence) • Convergent • Positive relationship to ‘in loveness scale’ • Positive relationship to ‘probability of marrying coefficient’ • Positive relationship to ‘never felt this before coefficient’ • Positive relationship with ‘gazing adoringly’ • Discriminant • Positive but different relationship to self-reported ‘friendness coefficient’ (i.e., like, not love) • Negative relationship to ‘hate coefficient’ • Negative relationship to ‘social desirability scale’ • Positive relationship to ‘glancing’ Herman (2004)
Convergence Correlation Matrix Herman (2004)
Discriminant Correlation Matrix Keith Herman (2004)
What is Criterion Validity? • Judgment regarding how well a test can be used to infer an individual’s standing on a measure of interest (the criterion). • Criterion should be reliable, relevant, and valid. • The primary concern is prediction: how well the test predicts the criterion of interest.
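Because the primary concern is prediction, a simple least-squares line is one common way to express how test scores map onto a criterion. The sketch below is illustrative only: the data are invented, and `fit_line` and `predict` are hypothetical helper names rather than part of any particular library.

```python
def fit_line(x, y):
    """Ordinary least-squares fit of y = intercept + slope * x."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two equal-length lists with at least 2 values")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    if sxx == 0:
        raise ValueError("cannot fit a line when all test scores are equal")
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical data: test scores and a later criterion measure
# (for example, a performance rating) for five people.
test_scores = [1, 2, 3, 4, 5]
criterion = [2.1, 3.9, 6.2, 7.8, 10.0]

slope, intercept = fit_line(test_scores, criterion)


def predict(score):
    """Predicted criterion value for a given test score."""
    return intercept + slope * score


print("slope:", round(slope, 2), "intercept:", round(intercept, 2))
print("predicted criterion for a score of 6:", round(predict(6), 2))
```

How closely such predictions track the observed criterion is one piece of criterion-related evidence; it is only meaningful if the criterion itself is reliable, relevant, and valid.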
Types of Criterion Validity • Predictive (over time) • follow subjects over time • limited by time and feasibility • Concurrent (at the same time) • single point in time and pre-selected subjects • limitations: restricted range
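The restricted-range limitation can be seen numerically: when only high scorers are retained (as with pre-selected subjects), the observed test-criterion correlation tends to shrink. The following sketch uses invented, deterministic data and an assumed cutoff of 7 purely to illustrate the effect.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


# Hypothetical applicant pool: test scores 1..10, with a criterion that
# follows the test score plus a fixed alternating error of +2 / -2.
test_scores = list(range(1, 11))
errors = [2, -2] * 5
criterion = [score + err for score, err in zip(test_scores, errors)]

full_r = pearson_r(test_scores, criterion)

# Keep only people at or above an assumed selection cutoff of 7.
selected = [(s, c) for s, c in zip(test_scores, criterion) if s >= 7]
selected_scores = [s for s, _ in selected]
selected_criterion = [c for _, c in selected]

restricted_r = pearson_r(selected_scores, selected_criterion)

print("full-range r:", round(full_r, 2))
print("restricted-range r:", round(restricted_r, 2))
```

The underlying relationship is the same in both samples; only the spread of test scores differs, which is why concurrent studies on pre-selected groups can understate a test's validity.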
The Language of Validity • Validity of inferences: internal (inside the test) and external (relationship to other tests; generalizability) • Traditional categories: construct (convergent, discriminant), content, criterion (concurrent, predictive) • Reliability: inter-rater, parallel forms, internal, test-retest • When we talk about validity, we are addressing reasons why we might not trust inferences
Wrap-up • Validity is a complex, evolving judgment about the quality of inferences made from test scores • Recent attention has focused not only on the psychometric properties of a test (i.e., reliability and conventional validity) but also on the social consequences related to test use • Awareness of the social consequences of assessment is critical for researchers and educators alike