Assessing the Quality of Research • What is validity? • Types of Validity • Examples in the Measurement of • Height & Weight • Learning Style Orientation
Validity • Validity • Evidence that a measure assesses the construct/concept accurately and in a meaningful way • Reliability • Evidence that a measure assesses the construct consistently
Corr b/w Objective (O) & Self-Reports (SR) of Height (H) & Weight (W), as cited on later slides:
        O-H    O-W    SR-H   SR-W
O-H
O-W     .55
SR-H    .98    .56
SR-W    .68    .92    .69
Validity vs. Reliability • Reliability is a necessary but not a sufficient condition for validity • e.g., A measuring tape is not a valid way to measure weight, although the tape reliably measures height and height correlates with weight (see the simulation below)
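The tape-measure point is easy to see in a quick simulation. A minimal sketch on simulated data (not from the slides): the tape reads height almost identically on two occasions, so it is highly reliable, yet it correlates only moderately with weight, so it is not a valid weight measure.

```python
# Reliable but not valid: a tape measure used as a "weight" measure.
import numpy as np

rng = np.random.default_rng(0)
n = 1_000
height = rng.normal(170, 10, n)                    # true height in cm
weight = 0.9 * height - 90 + rng.normal(0, 15, n)  # weight correlated with height

tape_t1 = height + rng.normal(0, 0.5, n)  # two tape readings: tiny error only
tape_t2 = height + rng.normal(0, 0.5, n)

print("reliability (test-retest r):", np.corrcoef(tape_t1, tape_t2)[0, 1])  # ~.99
print("'validity' as a weight measure (r with weight):",
      np.corrcoef(tape_t1, weight)[0, 1])  # moderate, nowhere near 1: reliable != valid
```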
Types of Validity • Content Validity • Criterion Validity (Predictive, Concurrent) • Construct Validity (Convergent, Discriminant) • Adapted from Sekaran, 2004
Content Validity • Extent to which items on the measure are a good representation of the construct • e.g., Is your job interview based on what is required for the job? • Can be based on judgments of the researcher or of independent raters (one quantitative index is sketched below) • e.g., Expert (supervisors, incumbents) ratings of the job relevance of interview questions
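The slides establish content validity through expert judgments. One common quantitative index for such judgments, not named on the slides, is Lawshe's content validity ratio (CVR); the panel counts below are hypothetical.

```python
# Lawshe's content validity ratio: CVR = (n_e - N/2) / (N/2), ranging -1 to +1,
# where n_e is the number of panelists rating an item "essential" out of N.
def cvr(n_essential: int, n_panelists: int) -> float:
    half = n_panelists / 2
    return (n_essential - half) / half

# 8 supervisors/incumbents judge whether each interview question is essential.
essential_counts = [7, 4, 8, 2]  # hypothetical counts for 4 questions
for i, n_e in enumerate(essential_counts, 1):
    print(f"question {i}: CVR = {cvr(n_e, 8):+.2f}")  # keep items with high CVR
```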
An Example of How Content Validity of the Learning Style Orientation Measure is Established • 112 items derived from 2 procedures based on theory about learning events… • Participants (Ps) generated critical incidents of learning events • Two types of learning events: theoretical & practical (see next slide for examples) • Two types of outcomes: success & failure • 4 events from each of 67 participants • Ps indicated yes/no to action- & reflection-oriented statements
Obtaining Data on "Content Valid" Items Generated Qualitatively (aka Item Development Phase Study) • 154 Ps rated the 112 items on 5-point Likert (agree/disagree) scales, with statements like • I like problems that don't have a definitive solution • I like to put new knowledge to immediate use
Feedback on method section • Describing vs. including the questionnaire • Specific • Relevant • Graded on irrelevant details • What is an irrelevant detail?
Quantitative Analyses of "Content Valid" Items Generated Qualitatively • Ps' responses were factor analyzed • 5-factor solution (i.e., 5 dimensions) • What is factor analysis? (demo sketch below, if time permits) • Retained 54 of the 112 original items • The 54 items were sorted for content by 8 grad students blind to the number and types of dimensions
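For the "demo if time permits": a minimal exploratory factor analysis sketch using scikit-learn's FactorAnalysis, run on simulated two-factor data rather than the actual 112-item responses.

```python
# Exploratory factor analysis demo: recover two latent dimensions from six items.
import numpy as np
from sklearn.decomposition import FactorAnalysis

rng = np.random.default_rng(1)
n = 154                                        # sample size matching the slide
action, reflection = rng.normal(size=(2, n))   # two latent learning orientations

# six observed Likert-type items, three loading on each latent factor
items = np.column_stack([
    action + rng.normal(0, .6, n), action + rng.normal(0, .6, n),
    action + rng.normal(0, .6, n),
    reflection + rng.normal(0, .6, n), reflection + rng.normal(0, .6, n),
    reflection + rng.normal(0, .6, n),
])

fa = FactorAnalysis(n_components=2, rotation="varimax").fit(items)
print(np.round(fa.components_.T, 2))  # loadings: items 1-3 on one factor, 4-6 on the other
```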
Simplifying what the factor analysis of the 54 items means • Computed sub-scales based on the factor analysis & found high reliabilities • .81 to .91 • Computed correlations b/w the 5 factors • Range from .01 to .32 (more on the implications of this later…) • Only 1 is significant • Followed up with a more stringent test: replicating the 5 factors with new data using a confirmatory factor analytic technique
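The .81 to .91 sub-scale reliabilities are presumably Cronbach's alpha, the standard internal-consistency statistic. A minimal sketch on simulated items (not the LSOM data):

```python
# Cronbach's alpha: alpha = k/(k-1) * (1 - sum(item variances)/variance(total score))
import numpy as np

def cronbach_alpha(items: np.ndarray) -> float:
    """items: (n_respondents, n_items) array of item scores."""
    k = items.shape[1]
    item_vars = items.var(axis=0, ddof=1)
    total_var = items.sum(axis=1).var(ddof=1)
    return k / (k - 1) * (1 - item_vars.sum() / total_var)

rng = np.random.default_rng(2)
trait = rng.normal(size=200)  # one latent trait driving all items
subscale = np.column_stack([trait + rng.normal(0, .5, 200) for _ in range(8)])
print(f"alpha = {cronbach_alpha(subscale):.2f}")  # high, as for the LSOM sub-scales
```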
Further Validating the Learning Style Orientation Measure in a Follow-up Study • 350 -193 Ps completed the • new LSOM • old LSI (competitor/similar construct) • Personality (firmly established related construct, as per theory)
Results Demonstrating the Content Validity of the LSOM in the Second Study • Confirmatory factor analysis shows the 5 dimensions re-extracted with new data (see the sketch below) • More sophisticated than just demonstrating high reliability of sub-scales • Comparing reliabilities of LSOM subscales = .74 to .87 to reliabilities of… • Old learning style subscales = .83 to .86 • Personality subscales = .86 to .95
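A minimal confirmatory factor analysis sketch, assuming the third-party semopy package (lavaan-style model syntax). The two-factor model, item names, and data below are simulated stand-ins, not the authors' actual five-factor analysis.

```python
# CFA: specify the factor structure in advance, then test how well it fits new data.
import numpy as np
import pandas as pd
import semopy  # assumption: third-party SEM package, lavaan-like syntax

rng = np.random.default_rng(3)
n = 350
f1, f2 = rng.normal(size=(2, n))  # two of the five hypothesized dimensions
df = pd.DataFrame({f"a{i}": f1 + rng.normal(0, .7, n) for i in range(1, 4)} |
                  {f"b{i}": f2 + rng.normal(0, .7, n) for i in range(1, 4)})

desc = """Action =~ a1 + a2 + a3
Reflection =~ b1 + b2 + b3"""
model = semopy.Model(desc)
model.fit(df)
print(semopy.calc_stats(model).T)  # fit indices (CFI, RMSEA, ...) for the 2-factor model
```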
Implications of Content Validity Analyses of the LSOM • Not firmly established that LSOM is something different and/or better than LSI
What you learned so far • What is validity • How is it different from reliability? • Learning Check: in the Essays data, how will you establish validity? • One type of validity is content validity • How to establish content validity? • Dual Career Relationship measure • What are the limitations of the notion of content validity?
What's next… Types of Validity • Content Validity • Criterion Validity (Predictive, Concurrent) • Construct Validity (Convergent, Discriminant) • Adapted from Sekaran, 2004
Criterion Validity • Extent to which a new measure relates to another known measure • Demonstrated by the validity coefficient • Correlation between the new measure and a known measure (see the sketch below) • e.g., do scores on your job interview predict performance evaluation scores? • New terms to keep in mind • new measure = predictor • known measure = criterion
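A minimal sketch of computing a validity coefficient, on simulated interview and performance data:

```python
# The validity coefficient is simply the correlation between predictor and criterion.
import numpy as np
from scipy.stats import pearsonr

rng = np.random.default_rng(4)
interview = rng.normal(size=120)                      # predictor: interview scores
performance = .4 * interview + rng.normal(0, 1, 120)  # criterion: performance ratings
r, p = pearsonr(interview, performance)
print(f"validity coefficient r = {r:.2f}, p = {p:.4f}")
```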
Predictive (Criterion) Validity • Scores on the predictor (e.g., selection test) are collected some time before scores on the criterion (e.g., job performance) • Able to differentiate individuals on a criterion assessed in the future • Weaknesses • Due to management pressures, applicants may be chosen based on high predictor scores, leading to range restriction (see the simulation below) • http://cnx.rice.edu/content/m11196/latest/ • Measures of job performance (highly tailored to the predictor) are developed for validation
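Since the linked range-restriction demo may no longer be available, here is a minimal simulation of the effect: when only high scorers on the predictor are hired, the validity coefficient computed on the hired group shrinks.

```python
# Range restriction: selecting on the predictor truncates its variance,
# which attenuates the observed predictor-criterion correlation.
import numpy as np

rng = np.random.default_rng(5)
test = rng.normal(size=5_000)               # selection test (predictor)
perf = .5 * test + rng.normal(0, 1, 5_000)  # job performance (criterion)

r_all = np.corrcoef(test, perf)[0, 1]
hired = test > np.quantile(test, .8)        # only the top 20% are hired
r_hired = np.corrcoef(test[hired], perf[hired])[0, 1]
print(f"r in full applicant pool: {r_all:.2f}")    # ~.45
print(f"r among those hired:      {r_hired:.2f}")  # noticeably smaller
```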
Concurrent (Criterion) Validity • Scores on predictor and criterion are collected simultaneously (e.g., police officer study) • Distinguishes between participants in the sample who are already known to be different from each other • Weaknesses • Range restriction • Does not include those who were not hired, or who were fired • Differences in test-taking motivation • Differences in experience • Employees vs. applicants, because experience with the job can affect scores on the performance evaluation (i.e., criterion)
Concurrent vs. Predictive Validity • Predictor & criterion variables are collected at the same vs. different times • For predictive validity, the predictor variable is collected before the criterion variable • Degree of range restriction is greater vs. less (concurrent designs include only current employees, so restriction is typically greater)
Example of Criterion Validity: the Learning Style Orientation Measure • Additional variance explained by the new LSOM vs. the old LSI on criteria (i.e., preferences for instruction & assessment); see the regression sketch below
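"Additional variance explained" is typically tested with hierarchical regression (an assumption here; the slides do not name the analysis): enter the old LSI first, then check the increase in R-squared when the new LSOM is added. A minimal sketch on simulated data:

```python
# Hierarchical regression: incremental R-squared of the new measure over the old.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(6)
n = 193
lsi = rng.normal(size=n)                                 # old measure
lsom = .5 * lsi + rng.normal(0, 1, n)                    # new measure, overlaps with old
preference = .3 * lsi + .3 * lsom + rng.normal(0, 1, n)  # criterion

step1 = sm.OLS(preference, sm.add_constant(lsi)).fit()
step2 = sm.OLS(preference, sm.add_constant(np.column_stack([lsi, lsom]))).fit()
print(f"R2 (LSI only):    {step1.rsquared:.3f}")
print(f"R2 (LSI + LSOM):  {step2.rsquared:.3f}")
print(f"incremental R2:   {step2.rsquared - step1.rsquared:.3f}")  # LSOM's added value
```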
Types of Validity • Content Validity • Criterion Validity (Predictive, Concurrent) • Construct Validity (Convergent, Discriminant) • Adapted from Sekaran, 2004
Construct Validity • Extent to which hypotheses about the construct are supported by data • Define the construct, generate hypotheses about the construct's relations to other constructs • Develop a comprehensive measure of the construct & assess its reliability • Examine the relationship of the new measure of the construct to other similar & dissimilar constructs (using different methods) • Examples: height & weight; Learning Style Orientation measure
2 Ways of Establishing Construct Validity • Different measures of the same construct should be more highly correlated than different measures of different constructs (aka multi-trait multi-method) • e.g., the corr b/w objective height & SR height should be higher than the corr b/w objective height & objective weight • Different measures of different constructs should have the lowest correlations • e.g., objective height & subjective weight
Correlations between Objective (O) & Self-Reports (SR) of Height & Weight:
        O-H    O-W    SR-H   SR-W
O-H
O-W     .55
SR-H    .98    .56
SR-W    .68    .92    .69
Convergent Validity Coefficients • Absolute size of the correlation between different measures of the same construct • Should be large & significantly different from zero • Example of Height & Weight • Objective and subjective measures of height are correlated .98 • Objective and subjective measures of weight are correlated .92
Discriminant Validity Coefficients • Relative size: correlations between the same construct measured by different methods should be higher than correlations between • different constructs measured by the same method • different constructs measured by different methods
Using the Example of Different Measures of Height & Weight to Understand Discriminant Validity
Discriminant Validity Across Constructs • STRONG CASE: Are the correlations b/w the same construct measured by different methods significantly higher than the corr b/w different constructs measured by the same method? • Note: objective measures of height & weight are corr .55 & subjective measures of height & weight are corr .69 • So to establish the strong case, show that .92 & .98 are significantly greater than .55 & .69 • Not enough to compare visually; need to convert the rs to z scores and test the difference (see the sketch after the next slide)
Discriminant Validity Across Measures • WEAK CASE: Are the correlations b/w the same construct measured by different methods significantly different from the corr b/w different constructs measured by different methods? • Note: objective height & subjective weight are corr .68 & subjective height & objective weight are corr .56 • So to establish the weak case, demonstrate that .92 & .98 are significantly higher than .56 & .68 (after converting the rs to z scores and comparing the zs)
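A minimal sketch of the r-to-z comparison using Fisher's transformation. The sample size n is hypothetical (the slides do not report it), and this simple z test treats the two rs as independent; correlations from one sample strictly call for a dependent-correlations test (e.g., Steiger's), but the logic is the same.

```python
# Fisher r-to-z: test whether two correlations differ significantly.
import numpy as np
from scipy.stats import norm

def compare_rs(r1: float, r2: float, n1: int, n2: int) -> float:
    """Two-tailed p for H0: r1 == r2 (independent-samples approximation)."""
    z1, z2 = np.arctanh(r1), np.arctanh(r2)        # Fisher r-to-z transform
    se = np.sqrt(1 / (n1 - 3) + 1 / (n2 - 3))      # SE of the difference in z
    return 2 * norm.sf(abs(z1 - z2) / se)

n = 200  # hypothetical sample size
print(f"convergent .92 vs same-method .69:       p = {compare_rs(.92, .69, n, n):.4f}")
print(f"convergent .98 vs different-method .68:  p = {compare_rs(.98, .68, n, n):.4f}")
```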
Types of Validity • Content Validity • Criterion Validity (Predictive, Concurrent) • Construct Validity (Convergent, Discriminant) • Adapted from Sekaran, 2004
Using the LSOM Item Development Study (aka Study 1) to understand Construct Validity
Recall the 2 Ways of Establishing Construct Validity • Different measures of the same construct should be more highly correlated than different measures of different constructs (aka multi-trait multi-method) • e.g., subscales of the LSOM should correlate more highly with each other than the LSOM correlates with personality • Different measures of different constructs should have the lowest correlations • e.g., corr b/w LSOM & personality
Convergent Validity of the LSOM in the Item Development Study • Established via • High reliabilities of the LSOM subscales (.81 to .91) • Correlations b/w different measures (subscales) of learning style = .01 to .32; these should be significant (not too high and not too low) • Note: only 1 corr is significant (could be due to sample size? see the sketch below), so weak support for convergent validity of the new LSOM in Study 1; hence the second validation study
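Why "could be due to sample size?": the p-value of a fixed r depends strongly on n. A minimal sketch using Study 1's n = 154 and the slide's range of rs:

```python
# Significance of a correlation via the t-test on r, at two sample sizes.
import math
from scipy.stats import t

def p_for_r(r: float, n: int) -> float:
    """Two-tailed p for H0: rho = 0."""
    tval = r * math.sqrt((n - 2) / (1 - r**2))
    return 2 * t.sf(abs(tval), df=n - 2)

for r in (.01, .16, .32):
    print(f"r = {r:.2f}: p = {p_for_r(r, 154):.3f} at n = 154, "
          f"p = {p_for_r(r, 1000):.3f} at n = 1000")
# Only the largest r in the .01-.32 range reaches significance at n = 154.
```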
Discriminant Validity in the LSOM Item Development Phase • Correlations between different measures of different constructs (i.e., learning style & personality; .01 to .42) should be lower than, and significantly different from, correlations between different measures of the same construct (i.e., subscales of learning style; .01 to .32)
Conclusions from the LSOM Item Development Phase Study • Convergent & discriminant validity was not sufficiently established, so the researchers collected additional data to firmly establish the validity of the measure
Examining the LSOM Validation Study to understand Construct Validity
Method & Procedure of the Validation Study • 350 -193 Ps completed the • new LSOM (predictor) • old LSI (competitor/similar construct) • Personality (related construct, as per theory) • Preferences for instructional & assessment methods (criterion)
Convergent Validity of the LSOM in the Validation Study • To examine the correlation (r) b/w similar measures of the key construct, compare the correlations b/w the different subscales (measures) of the new learning style measure (.01 to .23) to • the rs b/w similar measures of other similar & dissimilar constructs in the study • Similar constructs = different subscales of the old learning style measure (.23 to .40) • Dissimilar constructs = different subscales of personality (.01 to .27)
Discriminant Validity of the LSOM in the Validation Study • Examine correlations (r) between measures of similar constructs • r between new learning style subscales & old learning style subscales = .01 to .31 • Examine rs b/w measures of different constructs • r b/w new learning style & personality subscales = .01 to .55 • r b/w old learning style & personality subscales = .02 to .38
Criterion Validity can be an indirect way of establishing Construct Validity
Establishing Better Criterion Validity of the LSOM • Additional variance explained by the new LSOM vs. the old LSI on criteria (i.e., preferences for instruction & assessment), as in the hierarchical-regression sketch shown earlier
What you learned today • Kind of evidence you should look for when deciding on what measures to use • Content Validity • Criterion Validity • Concurrent vs. Predictive • Construct validity • Convergent & Discriminant
Implications of What you learned today for your Method Section • Did you examine relevant sources to establish validity of your measures? • How will you report that information?