Measuring Social Variables
Measurement To speak responsibly about our world, we need ways to document and measure what is out there. In the social sciences, measurement is sometimes complex. How do you define poverty? • Income level • Nourishment • Living conditions • Property How should a researcher define poverty? There is an official definition:
Measurement To do research on poverty, one needs a definition of poverty that one then uses consistently and communicates to consumers of the research. One must also devise strategies for collecting information that is used to define a variable like poverty. (A variable is a measured concept that can take on more than one value or category—opposite of a constant. Sociologists measure concepts with variables.) “Operationalization” is the term used to denote the ways one measures concepts to form variables.
Measurement “Operationalization” is the term used to denote the ways one measures concepts to form variables. • One can make observations. • One can use official statistics. • One can use available data, but be careful to ensure that the data measure the variables you want. Researchers sometimes stretch the limits by using items that do not measure what they claim to measure, e.g., measuring sports enthusiasm by extent of knowledge of NASCAR or frequency of playing golf. • One can ask questions of respondents.
Measurement If one is going to collect data using questions, one must learn to construct good questions. This will be more fully addressed in SY 382. You should err on the side of using others’ questions that have already been used to measure your concepts. Types of Questions:
Measurement Types of Questions: Closed-ended questions offer respondents a limited set of response options that should be mutually exclusive and exhaustive (unless “check all that apply” is useful) • Easy to process and quantify, efficient • Much thought must go into constructing each question • May obscure what people really think • Better for larger samples/quantitative research Example: Which type of television program do you enjoy the most? a. Drama b. Comedy c. Romance d. Talk e. News
Measurement Types of Questions: Open-ended questions allow respondents to write in their answers, without response options • Preferable if full range of responses cannot be anticipated • Useful for exploratory research • Avoid vague questions; vague questions lead to differing interpretations by various respondents • Better for smaller samples/qualitative research Example: Please tell us the type of television programming you prefer to watch and what you enjoy most about that programming.
Measurement Indexes and Scales A series of questions is used to measure a concept more comprehensively than would be possible with a single question. These are especially appropriate for measuring concepts that we know exist but cannot see. • We know the following exist, but we cannot directly view them: Self-esteem, Well-being, Gender Identity, Depression. Index: Each item is equally weighted to create a sum or average. Scale: Some items add more value to the total measure than other items.
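The difference between an index and a scale can be sketched in a few lines. This is a minimal illustration, not a scoring procedure from the slides: the responses and the weights are invented, and real weights would have to come from theory or from analysis of the items.

```python
# Hypothetical responses of one person to four items, each coded 1-4.
responses = [3, 4, 2, 4]

# Index: every item counts equally, so the score is a plain sum (or mean).
index_sum = sum(responses)
index_mean = index_sum / len(responses)

# Scale: some items contribute more than others.
# These weights are made up purely for illustration.
weights = [1.0, 2.0, 1.0, 0.5]
scale_score = sum(w * r for w, r in zip(weights, responses))

print(index_sum, index_mean, scale_score)
```

With equal weights the two scores carry the same information, which is why an equally weighted instrument is technically an index even when it is called a scale.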
Measurement For example, researchers typically operationalize self-esteem by using the Rosenberg Self-Esteem Scale (which is technically an index).
Measurement Levels of Measurement One must know the nature of one’s variables in order to understand which manipulations are appropriate (and, later, which statistical tests to use, because statistics require mathematically manipulating the values). • Nominal Level of Measurement • Ordinal Level of Measurement • Interval Level of Measurement • Ratio Level of Measurement
Measurement Levels of Measurement Nominal Level of Measurement • Items or responses are assigned to categories along a dimension of types. • A nominal variable classifies persons, places or things without implying any rank among them. • For example: Race: 1=black, 2=white, 3=Asian Cars: 1=Chevy, 2=Honda, 3=Ford • It makes no sense to add, subtract, multiply, or divide these.
Measurement Levels of Measurement Ordinal Level of Measurement • Items or responses are assigned to categories along a dimension of types with increasing value (or in order). • An ordinal variable ranks persons, places or things, but there is no accurate way to gauge the distance between them. • For example: Professor Rank: 1=Assistant Prof., 2=Associate, 3=Full Sexy Cars: 1=Green Gremlin, 2=Blue Impala, 3=Red Audi • It sometimes makes no sense to add, subtract, multiply, or divide these. Sociologists, using good judgment, sometimes do so anyway.
Measurement Levels of Measurement Interval Level of Measurement • Items or responses are assigned to their place along a dimension of increasing value, and there is a specific distance measure between each place on the dimension. • An interval variable assigns persons, places or things to a continuum that has specific intervals between units of measure, but does not have an absolute zero point. Units of measure are somewhat arbitrarily assigned, as with Fahrenheit vs. Celsius. • For example: Self-Esteem: Scale ranges from 10 to 40 Income Categories: 1=under $10K, 2=$10,001 – $20K, 3=over $20,001 • It makes sense to add and subtract these, but it sometimes makes no sense to multiply or divide them. Sociologists, using good judgment, sometimes do so anyway.
Measurement Levels of Measurement Ratio Level of Measurement • Items or responses are quantified and assigned to their place along a dimension of increasing value. There is a specific distance measure between each place on the dimension and an absolute zero point. • A ratio variable notes the number of persons, places or things on a continuum that has a zero point and has specific intervals between units of measure. Units of measure denote quantity. • For example: Age: 1=1 year, 2=2 years, 3=3 years, etc. Income: 0=no income, 1=$1, 2=$2, 3=$3, etc. • It makes sense to add, subtract, multiply, and divide these. Sociologists typically treat their ordinal and interval level variables as ratio variables.
Measurement While each variable we use has a number assigned to responses, we must remember whether the numbers are meaningful or not. For nominal variables, the numbers are meaningless. For ordinal variables, we sometimes treat the numbers as meaningful if one can make an argument for doing so.
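As a rough sketch of why the level of measurement matters, the example below pairs each level with a summary that respects it. The pairing (mode for nominal, median for ordinal, mean and ratios for ratio variables) is the conventional textbook one rather than something stated on the slides, and all data are invented.

```python
from statistics import mean, median, mode

# Hypothetical coded responses for five people.
car_make = [1, 2, 2, 3, 2]        # nominal: 1=Chevy, 2=Honda, 3=Ford
prof_rank = [1, 1, 2, 3, 3]       # ordinal: 1=Assistant, 2=Associate, 3=Full
age_years = [20, 30, 30, 40, 60]  # ratio: true zero, equal units

# Counting categories is meaningful at every level.
most_common_make = mode(car_make)

# A median only requires that the codes can be put in order.
middle_rank = median(prof_rank)

# A mean assumes equal distances between adjacent values.
average_age = mean(age_years)

# "Three times as old" only makes sense because age has a true zero.
age_ratio = age_years[4] / age_years[0]

print(most_common_make, middle_rank, average_age, age_ratio)
```

Python would happily compute `mean(car_make)` as well; the result would simply be meaningless, which is the point of remembering whether the numbers carry information.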
Measurement The special case of dichotomous variables: A dichotomous variable can take one of two values. For example: Sex: 0=Male, 1=Female Race: 0=Other, 1=Hispanic Cars: 0=Other, 1=SUV Are dichotomous variables nominal, ordinal, interval, or ratio?
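One property that helps explain why dichotomous variables are treated as a special case: when the two categories are coded 0 and 1, the mean of the codes equals the proportion of cases coded 1, so arithmetic on the numbers stays interpretable. A minimal sketch with invented data:

```python
# Hypothetical 0/1 codes for ten respondents: 0=Other, 1=SUV.
drives_suv = [1, 0, 0, 1, 1, 0, 0, 0, 1, 0]

# The mean of a 0/1 variable is the share of respondents coded 1.
proportion_suv = sum(drives_suv) / len(drives_suv)

print(proportion_suv)
```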
Measurement Assessing our Measures Researchers have an obligation to assess how well their measures of variables work. The issues are reliability and validity. From Vogt… Validity: A term to describe a measurement instrument or test that accurately measures what it is supposed to measure; the extent to which a measure is free of systematic error. Validity requires reliability, but the reverse is not true. Reliability: Freedom from measurement error. In practice, this boils down to the consistency or stability of a measure or test from one use to the next.
Measurement Assessing our Measures Types of Validity • Face Validity • Content Validity • Criterion Validity Concurrent Validity Predictive Validity • Construct Validity
Measurement Assessing our Measures Validity Face Validity The measure is accurate “on its face.” Good sense tells you that you are measuring what you intend to measure. Self-esteem example: Valid How much do you like yourself? Invalid How much money do you spend each month?
Measurement Assessing our Measures Validity Content Validity The measure covers the full range of the concept’s meaning. Self-esteem example: Valid Ten questions covering aspects of self-worth and feelings. Invalid One question to capture this latent, vague concept.
Measurement Assessing our Measures Validity Criterion Validity The measure is accurate if it gives results similar to those of an already established measure of the same phenomenon. Self-esteem example: Valid My self-esteem scale matches the results of the Rosenberg Self-Esteem Scale. Invalid My self-esteem scale does not perform like the Rosenberg scale.
Measurement Assessing our Measures Validity Concurrent Criterion Validity The measure is accurate if it corresponds with another indicator of the phenomenon that is measured at the same time. Self-esteem example: Valid Subjects with high self-esteem on my measure are observed smiling in the setting more than others. Invalid Subjects with high self-esteem on my measure are heard making comments such as “I suck” and “I can’t ever win” more often than others.
Measurement Assessing our Measures Validity Predictive Criterion Validity The measure is accurate if it predicts the scores on another indicator of the phenomenon that is measured in the future. Self-esteem example: Valid Subjects with high scores on my self-esteem scale report fewer suicidal episodes six months later. Invalid Subjects with high scores on my self-esteem scale report no fewer or more suicidal episodes six months later.
Measurement Assessing our Measures Validity Construct Validity The measure is accurate if it corresponds with other phenomena as specified by a theory. Self-esteem example: Valid Subjects with high scores on my self-esteem scale report better well-being, more confidence, and have a positive outlook as they should according to theories of self-esteem. Invalid Subjects with high scores on my self-esteem scale show no signs of better well-being, confidence, or positive outlook while they should according to theories of self-esteem.
Measurement Assessing our Measures Types of Reliability Tests • Test-retest Reliability • Interitem Reliability • Alternate-forms Reliability • Split-halves Reliability People • Interobserver Reliability
Measurement Assessing our Measures Reliability Test-retest Reliability The measure is reliable if a subsequent administration yields similar scores. There should be a high correlation between the sample’s two scores on the measure. Self-esteem example: Reliable Those with high (low) scores with the first administration of my scale have high (low) scores the second time. Not Reliable The second scores on my scale are not like the first.
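The "high correlation between the sample’s two scores" can be checked with a Pearson correlation. The sketch below is one way to do that; the scores are invented, and what counts as "high" is a judgment the slides leave open.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)

# Hypothetical self-esteem totals (range 10-40) for six people,
# from a first and a second administration of the same scale.
time1 = [12, 18, 25, 31, 36, 40]
time2 = [14, 17, 27, 30, 35, 39]

r = pearson_r(time1, time2)
print(round(r, 3))
```

A value near 1 means people kept roughly the same relative position across the two administrations; a value near 0 means the second scores are not like the first.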
Measurement Assessing our Measures Reliability Interitem Reliability (most commonly used) When using several items to measure a concept, the scores on each individual item should be similar to those on the others. • Often to avoid response bias, researchers use positively and negatively worded items. In these cases, each item’s responses should be correlated with the others of the same valence. Self-esteem example: Reliable All ten items on my self-esteem scale are highly correlated. Not Reliable Several items on my self-esteem scale do not behave the same as the others.
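The slides do not name a statistic for interitem reliability. One widely used choice is Cronbach's alpha, sketched below as an illustration. The sketch assumes that any negatively worded items have already been reverse-coded so that all items point in the same direction, and the answers are invented.

```python
from statistics import variance

def cronbach_alpha(rows):
    """Cronbach's alpha.

    rows: one list of item scores per respondent, all the same length.
    """
    k = len(rows[0])                      # number of items
    items = list(zip(*rows))              # regroup scores by item
    item_var_sum = sum(variance(item) for item in items)
    total_var = variance([sum(row) for row in rows])
    return (k / (k - 1)) * (1 - item_var_sum / total_var)

# Hypothetical answers from five people to four items coded 1-4.
answers = [
    [4, 4, 3, 4],
    [3, 3, 3, 2],
    [2, 2, 1, 2],
    [1, 2, 1, 1],
    [4, 3, 4, 4],
]

alpha = cronbach_alpha(answers)
print(round(alpha, 2))
```

Alpha rises toward 1 as the items move together across respondents; items that "do not behave the same as the others" pull it down.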
Measurement Assessing our Measures Reliability Alternate-forms Reliability Different versions of the measure are reliable if they give similar scores when given at different times. • Alternate forms may include slightly different wording, different question order, etc. Self-esteem example: Reliable My re-worded self-esteem scale gives similar scores with the same people as the first. Not Reliable The second scores on my re-worded scale are not like the first.
Measurement Assessing our Measures Reliability Split-halves Reliability Measures on an instrument are randomly divided in half. The two halves are reliable if they yield similar scores. Self-esteem example: Reliable Persons’ responses to the first five self-esteem questions are highly correlated with their responses to the second five. Not Reliable Persons’ responses to the first half of my scale do not correlate with the second half responses.
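A split-halves check can be sketched as follows: randomly assign items to two halves, total each half per person, and correlate the two totals. The Spearman-Brown step at the end is an addition not mentioned on the slides; it is a standard adjustment for the fact that each half is only half as long as the full instrument. The data and the random seed are arbitrary.

```python
import random
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)

def spearman_brown(r_half):
    """Estimate full-length reliability from a half-test correlation."""
    return 2 * r_half / (1 + r_half)

# Hypothetical answers from five people to six items coded 1-4.
answers = [
    [4, 4, 3, 4, 4, 3],
    [3, 3, 3, 2, 3, 3],
    [2, 2, 1, 2, 2, 1],
    [1, 2, 1, 1, 1, 2],
    [4, 3, 4, 4, 3, 4],
]

# Randomly divide the six item positions into two halves of three.
rng = random.Random(0)            # fixed seed so the split is repeatable
item_ids = list(range(6))
rng.shuffle(item_ids)
half_a, half_b = item_ids[:3], item_ids[3:]

# Total each half separately for every respondent, then correlate.
scores_a = [sum(row[i] for i in half_a) for row in answers]
scores_b = [sum(row[i] for i in half_b) for row in answers]
r_half = pearson_r(scores_a, scores_b)
full_length_estimate = spearman_brown(r_half)

print(half_a, half_b, round(r_half, 3), round(full_length_estimate, 3))
```

Because the result depends on which items land in which half, it is reasonable to repeat the split several times before drawing a conclusion.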
Measurement Assessing our Measures Reliability Interobserver Reliability To eliminate idiosyncratic interpretation of data, researchers often have more than one person rate the same phenomena. When two or more persons’ scores are similar, there is interobserver reliability. Self-esteem example: Reliable Several clinicians agree, similarly rating high self-esteem people as high on my scale and low self-esteem people as low on it. Not Reliable Several clinicians cannot agree on who is high and who is low on self-esteem when rating persons with my scale.
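Agreement between two raters can be summarized as simple percent agreement, or with Cohen's kappa, which discounts the agreement expected by chance. Neither statistic is named on the slides; both are shown here as common options, using invented ratings from two hypothetical clinicians.

```python
from collections import Counter

def cohen_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters who classified the same cases."""
    n = len(rater_a)
    # Share of cases on which the two raters gave the same rating.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Agreement expected by chance, from each rater's category totals.
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)
    return (observed - expected) / (1 - expected)

# Hypothetical ratings of the same eight people by two clinicians.
clinician_1 = ["high", "high", "low", "low", "high", "low", "high", "low"]
clinician_2 = ["high", "high", "low", "low", "high", "low", "low", "low"]

agreement = sum(a == b for a, b in zip(clinician_1, clinician_2)) / len(clinician_1)
kappa = cohen_kappa(clinician_1, clinician_2)

print(agreement, kappa)
```

Here the clinicians agree on seven of eight people, but half of that agreement could arise by chance given how often each uses "high" and "low", so kappa is lower than the raw agreement.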
Measurement Assessing our Measures • We must always remember that things must be reliable to be valid. “My scale gives a different reading each time I step on it.” • However, something that is reliable is not always valid. “A broken clock is right only twice a day.” • Something that is true is not always useful, either. “No one ever drowned in sweat.” The bottom line is that we must always assess whether we are measuring useful things and whether we are measuring what we intend to measure. Keeping these validity and reliability concepts in mind will help us do that.