SAMPLING

Presentation Transcript


    1. SAMPLING PROBABILITY & NONPROBABILITY SAMPLING. WITH PROBABILITY SAMPLING WE KNOW THE PROBABILITY OF SELECTION FOR ANY ELEMENT IN OUR POPULATION. PROBABILITY SAMPLING ALWAYS INVOLVES SOME KIND OF RANDOM SELECTION.

    2. PROBABILITY OF SELECTION (P.O.S.) IS THE LIKELIHOOD AN ELEMENT WILL BE SELECTED INTO THE SAMPLE. IN A POPULATION CENSUS THE P.O.S. = 1.0; AS SAMPLE SIZE DECREASES, SO DOES THE P.O.S. INTO THE SAMPLE (e.g., YOUR P.O.S. INTO A SAMPLE OF 2,000 CANADIANS IS GREATER THAN YOUR P.O.S. INTO A SAMPLE OF 100 CANADIANS). WITH RANDOM SAMPLING, CASES ARE SELECTED ON THE BASIS OF CHANCE. PARADOXICALLY, THIS REQUIRES CAREFUL CONTROL OF THE SAMPLING PROCESS.

    3. EVEN WITH RANDOM SAMPLING WE MAY END UP WITH AN UNREPRESENTATIVE SAMPLE IF WE HAVE…. (1) A MATERIALLY INADEQUATE OR INCOMPLETE SAMPLING FRAME (e.g., A TELEPHONE DIRECTORY FROM ANY LARGE CITY…UNLISTED PHONE NUMBERS CAN RANGE FROM 10% - 30%+ OF THE POPULATION). (2) AN INADEQUATE RESPONSE RATE (e.g., BE CAUTIOUS IF NONRESPONSE RATE EXCEEDS 30% OF THE SAMPLE).

    4. PROBABILITY SAMPLING: KNOWN, NON-ZERO CHANCE OF SELECTING EACH ELEMENT INTO THE SAMPLE. NO MAJOR SYSTEMATIC BIAS….CHANCE DETERMINES THE ELEMENTS INCLUDED. USE WHEN MAJOR RESEARCH GOAL IS TO GENERALIZE FINDINGS TO A LARGER POPULATION. STILL SUBJECT TO RANDOM SAMPLING ERROR.

    5. SAMPLE SIZE & POPULATION HOMOGENEITY (“SAMENESS”) IMPACT RANDOM SAMPLING ERROR. The proportion of the population that the sample represents has little effect.

    6. THE LARGER THE SIZE OF THE SAMPLE, THE MORE CONFIDENCE WE CAN HAVE IN THE SAMPLE’S REPRESENTATIVENESS (less random sampling error). THE MORE HOMOGENEOUS OUR POPULATION, THE MORE CONFIDENCE WE CAN HAVE IN THE SAMPLE’S REPRESENTATIVENESS (less random sampling error). SAMPLING FRACTIONS HAVE LITTLE IMPACT ON SAMPLE REPRESENTATIVENESS…OR ON RANDOM SAMPLING ERROR.

    7. DIFFERENT TYPES OF PROBABILITY SAMPLES VARY IN RANDOM SAMPLING ERROR: SIMPLE RANDOM. SYSTEMATIC. STRATIFIED. CLUSTER.

    8. SIMPLE RANDOM SAMPLING. REQUIRES A PROCEDURE FOR ASSIGNING UNIQUE NUMBERS TO ALL ELEMENTS IN THE SAMPLING FRAME, AND THEN IDENTIFYING CASES ON THE BASIS OF CHANCE…BY USING A RANDOM NUMBERS TABLE, A COMPUTER PROGRAM THAT GENERATES RANDOM NUMBERS, OR RANDOM DIGIT DIALING (FOR TELEPHONE INTERVIEWS OR WHENEVER AN ADEQUATE SAMPLING FRAME IS UNAVAILABLE).
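A minimal Python sketch of simple random sampling, assuming the sampling frame is already available as a list; the frame, the seed, and the sizes are invented for illustration.

```python
import random

# Hypothetical sampling frame: every element gets a unique index (0..N-1).
sampling_frame = [f"person_{i}" for i in range(17_000)]   # N = 17,000
n = 500                                                    # desired sample size

# random.sample() draws n distinct elements by chance, standing in for a
# random numbers table or any other random-number generator.
random.seed(42)                      # fixed seed only so the draw is reproducible
simple_random_sample = random.sample(sampling_frame, n)

# Probability of selection is equal for every element: p = n / N
print(len(simple_random_sample), "cases; p =", n / len(sampling_frame))
```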

    9. PROBABILITY OF SELECTION IS EQUAL FOR EACH ELEMENT. IF: n = 500 and N = 17,000, then p = n/N = 500/17,000 ≈ .03. IF: n = 2,000 and N = 30,000,000, then p = n/N = 2,000/30,000,000 ≈ .00007. SIMPLE RANDOM SAMPLING IS AN “EPSEM” METHOD (Equal Probability of SElection Method).

    11. SYSTEMATIC RANDOM SAMPLING: THE FIRST ELEMENT IS SELECTED RANDOMLY FROM THE LIST….THEN EVERY nth ELEMENT IS SELECTED. CONVENIENT WHEN ELEMENTS ARE ARRANGED SEQUENTIALLY.

    12. 3 STEPS IN SYSTEMATIC SAMPLING: (1) THE TOTAL NUMBER OF CASES DIVIDED BY THE DESIRED SAMPLE SIZE PROVIDES YOU WITH YOUR SAMPLING INTERVAL (I = N / n)…where I is your sampling interval, N is the population size, and n is the sample size. (2) A NUMBER FROM 1 TO I (YOUR SAMPLING INTERVAL) IS SELECTED RANDOMLY. (3) AFTER THE FIRST CASE IS SELECTED, EVERY Ith CASE IS SELECTED FOR YOUR SAMPLE.
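A sketch of those three steps in Python, with a made-up frame and sample size; the random start is 0-based only because of Python indexing.

```python
import random

frame = [f"case_{i}" for i in range(10_000)]   # N = 10,000 (hypothetical frame)
n = 500                                        # desired sample size

# Step 1: sampling interval I = N / n
I = len(frame) // n                            # 10,000 / 500 = 20

# Step 2: pick a random starting point within the first interval
random.seed(1)
start = random.randrange(I)                    # 0..I-1 (0-based counterpart of "1 to I")

# Step 3: after the first case, take every I-th case
systematic_sample = frame[start::I]

print("interval =", I, "start =", start, "sample size =", len(systematic_sample))
```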

    14. SYSTEMATIC RANDOM SAMPLING TYPICALLY PROVIDES SAMPLES AS REPRESENTATIVE AS SIMPLE RANDOM SAMPLES. AVOID SYSTEMATIC RANDOM SAMPLING IF SOME UNDERLYING PATTERN OR PERIODICITY IS IN YOUR SAMPLING FRAME….SEE THE HANDOUT. PERIODICITY IS RARE SO DON’T BE PARANOID ABOUT IT!

    15. STRATIFIED RANDOM SAMPLING: STRATIFIED RANDOM SAMPLING USES POPULATION INFORMATION (e.g., CENSUS DATA) TO MAKE SAMPLING MORE EFFICIENT & EASIER.

    16. ELEMENTS ARE IDENTIFIED BY STRATA (E.G., GENDER, ETHNICITY, AGE, EDUCATION, RELIGION, GEOGRAPHIC REGION, ETC.). ELEMENTS ARE SELECTED (WITH SIMPLE OR SYSTEMATIC RANDOM SAMPLING) WITHIN EACH STRATUM.

    18. WHY IS STRATIFIED RANDOM SAMPLING MORE EFFICIENT? SIMPLE OR SYSTEMATIC RANDOM SAMPLING CAN YIELD DISPROPORTIONATE SUB-GROUPS IN SAMPLE. PROPORTIONATE STRATIFIED SAMPLING CAN ELIMINATE THIS SOURCE OF RANDOM SAMPLING ERROR. DISPROPORTIONATE STRATIFIED SAMPLING ENABLES YOU TO DO STATISTICAL ANALYSES WITH UNWEIGHTED DATA; AND STATISTICAL ESTIMATION WITH WEIGHTED DATA.
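A minimal sketch of proportionate stratified sampling in Python. The strata, their sizes, and the sample size are invented; each stratum receives the same sampling fraction, and a simple random sample is drawn within it. A disproportionate design would simply override n_h for the small strata and weight the data later.

```python
import random
from collections import defaultdict

random.seed(7)

# Hypothetical population of 10,000, tagged by stratum (e.g., region from census data).
strata_sizes = {"East": 6_000, "Centre": 3_000, "West": 1_000}
frame = [(f"{s}_{i}", s) for s, size in strata_sizes.items() for i in range(size)]

n = 500  # total sample size

# Group the frame by stratum.
by_stratum = defaultdict(list)
for element, stratum in frame:
    by_stratum[stratum].append(element)

# Proportionate allocation: every stratum gets the same sampling fraction n / N,
# so sub-groups appear in the sample in their population proportions.
sample = []
for stratum, members in by_stratum.items():
    n_h = round(n * len(members) / len(frame))      # this stratum's share of the sample
    sample.extend(random.sample(members, n_h))      # simple random sample within the stratum
    print(stratum, "gets", n_h, "cases")

print("total sample size:", len(sample))
```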

    19. WHY WOULD YOU WANT TO USE DISPROPORTIONATE STRATIFIED SAMPLING? TO ENSURE ENOUGH CASES ARE INCLUDED FROM SMALL STRATA SO MEANINGFUL ANALYSES & COMPARISONS CAN BE PERFORMED.

    20. CLUSTER SAMPLING: REQUIRES LESS PRIOR INFORMATION THAN STRATIFIED SAMPLING. USEFUL FOR SURVEYS OF A LARGE, DISPERSED POPULATION & OF DEVELOPING SOCIETIES, WHERE SAMPLING FRAMES ARE HARD TO CONSTRUCT.

    21. CLUSTER SAMPLING: A CLUSTER IS A NATURALLY OCCURRING GROUP OF ELEMENTS IN A POPULATION (E.G., UNIVERSITIES, CITY BLOCKS, ETC.). EACH ELEMENT APPEARS IN ONE AND ONLY ONE CLUSTER. DRAWING A CLUSTER SAMPLE IS AT LEAST A 2-STAGE PROCESS…CAN INVOLVE MORE STAGES DEPENDING ON THE NUMBER OF CLUSTER LEVELS.

    22. CLUSTER SAMPLING: FIRST STAGE …RANDOM SELECTION OF CITY BLOCKS. SECOND STAGE…RANDOM SELECTION OF HOUSEHOLDS FROM CITY BLOCKS IN 1ST STAGE. THIRD STAGE …RANDOMLY SELECTED PERSON FROM EACH HOUSEHOLD IN 2ND STAGE.
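A sketch of the first two stages in Python, with made-up blocks and households; a third stage (one person per selected household) would follow the same pattern.

```python
import random

random.seed(3)

# Hypothetical population: 200 city blocks, each a naturally occurring cluster
# of 20-60 households. Every household belongs to exactly one block.
blocks = {f"block_{b}": [f"block_{b}_hh_{h}" for h in range(random.randint(20, 60))]
          for b in range(200)}

# Stage 1: random selection of city blocks (clusters).
selected_blocks = random.sample(list(blocks), 25)

# Stage 2: random selection of households within each selected block.
households = [hh for b in selected_blocks
              for hh in random.sample(blocks[b], 5)]

print(len(selected_blocks), "blocks,", len(households), "households")
```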

    24. CLUSTER SAMPLING: AS A RULE, RANDOM SAMPLING ERROR WILL BE MINIMIZED, AND PRECISION OF STATISTICS MAXIMIZED, IF….... 1. # OF CLUSTERS IS MAXIMIZED. 2. # OF SELECTIONS WITHIN EACH CLUSTER IS MINIMIZED. NOTE….THIS ADDS TO THE COST!

    25. CLUSTER SAMPLING: SAMPLING ERROR IS HIGHEST IN CLUSTER SAMPLING. ERROR INCREASES AS THE # OF CLUSTERS DECREASES. ERROR DECREASES AS THE HOMOGENEITY WITHIN CLUSTERS INCREASES.

    26. DETERMINING SAMPLE SIZE TIME AND MONEY CONSTRAINTS INFLUENCE SAMPLE SIZE. THE LOWER YOUR SAMPLING ERROR MUST BE, THE LARGER YOUR SAMPLE MUST BE. THE MORE DIVERSE YOUR POPULATION IS, THE LARGER YOUR SAMPLE MUST BE. THE MORE COMPLEX YOUR ANALYSIS, THE LARGER YOUR SAMPLE MUST BE. THE STRONGER YOUR EXPECTED RELATIONSHIPS, THE SMALLER YOUR SAMPLE CAN BE.
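The slides give only qualitative guidance here. As one concrete illustration of the first trade-off, the standard textbook formula for estimating a proportion, n = z²·p(1−p)/e², shows how the required sample grows as the tolerated sampling error shrinks. This formula is not from the slides; the 95% confidence level and p = 0.5 are conventional, conservative assumptions.

```python
import math

def sample_size_for_proportion(margin_of_error, z=1.96, p=0.5):
    """n = z^2 * p * (1 - p) / e^2 for estimating a population proportion.

    z = 1.96 corresponds to 95% confidence; p = 0.5 is the most conservative
    (most diverse) assumption about the population. Both are illustrative defaults.
    """
    return math.ceil(z ** 2 * p * (1 - p) / margin_of_error ** 2)

for e in (0.05, 0.03, 0.02):
    print(f"margin of error {e:.0%} -> n = {sample_size_for_proportion(e)}")
# 5% -> 385, 3% -> 1068, 2% -> 2401: the lower the tolerated error, the larger the sample.
```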

    27. SAMPLE SIZE…RULES OF THUMB: LOCAL OR REGIONAL STUDIES….250 - 750. PROVINCIAL OR NATIONAL STUDIES…1000 - 2500. NATIONAL STUDIES WITH MULTIPLE AND COMPLEX RESEARCH GOALS…..10,000 - 15,000.

    28. ASSESSING MEASUREMENT: VALIDITY AND RELIABILITY VALIDITY IS THE CONFIDENCE WE HAVE THAT A SURVEY QUESTION OR QUESTIONS (FORMING AN INDEX) ARE REALLY MEASURING…… 1. WHAT WE WANT THEM TO MEASURE! 2. WHAT WE THINK THEY ARE MEASURING! 3. THE CONCEPT WE INTEND THEM TO MEASURE!

    29. MAJOR TYPES OF VALIDITY: 1. FACE VALIDATION 2. CONTENT VALIDATION 3. CRITERION VALIDATION 4. CONSTRUCT VALIDATION. THERE ARE NO PERFECT MEASURES….THE VALIDITY OF QUESTIONS VARIES WITH TIME, PLACE, & SAMPLE.

    30. FACE VALIDITY IS THE QUESTION APPROPRIATE “ON ITS FACE,” OR IN AN OBVIOUS WAY? The following question: “What is your favourite vegetable?” is obviously not a valid measure of how often a respondent drinks booze. EVERY QUESTIONNAIRE ITEM MUST BE EXAMINED FOR FACE VALIDITY. NOT ALL QUESTIONS HAVE OBVIOUS MEANINGS….SO FACE VALIDATION IS OFTEN NOT ENOUGH.

    31. CONTENT VALIDITY Do measure(s) cover the full range of the concept’s meaning? Range is determined by consulting: experts, research literature, sociological imagination, theory, experience, commonsense. The MAST (Michigan Alcoholism Screening Test) includes 24 questions measuring the full range of alcohol use (recognition by self/others; legal, social, familial, work problems; help-seeking behavior; health problems; etc.).

    32. CRITERION VALIDITY The score obtained on one or more measure(s) is statistically associated with another measure(s) of the same phenomenon with proven validity. The criterion can be measured during or after measuring the variable to be validated. Predictive Criterion Validity (LSAT test scores and future grades in Law School). Concurrent Criterion Validity (sales aptitude test scores with recent sales performance of sales representatives).
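A minimal sketch of a concurrent criterion-validity check in Python: correlate the measure being validated with an already-validated criterion. The aptitude-test and sales figures are invented, and the Pearson correlation is written out from its definition.

```python
# Hypothetical scores for ten sales representatives: an aptitude test
# (the measure to be validated) and recent sales performance (the criterion).
test_scores = [55, 60, 62, 70, 71, 75, 80, 82, 90, 95]
sales =       [10, 14, 13, 18, 17, 20, 22, 25, 27, 30]

def pearson_r(x, y):
    """Pearson correlation coefficient, computed directly from its definition."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sum((a - mean_x) ** 2 for a in x) ** 0.5
    sd_y = sum((b - mean_y) ** 2 for b in y) ** 0.5
    return cov / (sd_x * sd_y)

# A strong correlation with the criterion is evidence of concurrent criterion validity.
print(round(pearson_r(test_scores, sales), 3))
```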

    33. More on criterion validity… Validate an index measuring religious beliefs by correlating it with attendance at religious services. Validate an index measuring political beliefs by correlating the index with voting behavior, political donations, membership in a political party.

    34. CONSTRUCT VALIDITY Measure(s) is/are related to other measures from theory or existing research. Discriminant validity is tested where scores on the measure to be validated are compared to scores on another measure of the same variable AND to scores on variables that measure different but related concepts (e.g., Addiction Severity Index related more to the Addictive Personality Index than to various problems associated with drug and alcohol abuse).

    35. RELIABILITY A measure is reliable if it provides CONSISTENT SCORES when measuring a phenomenon. Reliable measures have little random error. Unreliability: 1. You get substantially different measurement results when the thing being measured has not changed. 2. Answers to questions forming an index are poorly related. 3. Similar versions of a measure garner very different answers. 4. Ratings by two or more trained observers are poorly correlated. Valid measures will ALWAYS be reliable…but reliable measures are not necessarily valid!!

    36. TEST-RETEST RELIABILITY Do two measures of a phenomenon yield identical (or very similar) results at two different times?

    37. INTER-ITEM RELIABILITY Are multiple items measuring a single concept (an index) strongly correlated or statistically associated with each other?

    38. ALTERNATE FORMS RELIABILITY Are slightly different versions of the same index strongly associated or correlated? Split-halves reliability is a variant of alternate forms reliability.

    39. INTEROBSERVER RELIABILITY Are the measures or observations from two (or more) trained observers strongly associated or correlated?

    40. CRONBACH’S ALPHA: A STATISTICAL MEASURE OF RELIABILITY Cronbach’s alpha varies from 0 to 1. A higher alpha value means higher index reliability. Simple interpretation…the average correlation between the set of questions in an index. Theoretical (complex) interpretation…the correlation between your index and all other possible indices measuring the concept with the same number of questions.
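A minimal sketch of the calculation in Python, using the common variance-based formula alpha = k/(k−1) × (1 − sum of item variances / variance of total scores); the three-question index and the six respondents' answers are invented.

```python
from statistics import variance

def cronbach_alpha(items):
    """items: one list of respondent scores per question in the index.

    alpha = k/(k-1) * (1 - sum of item variances / variance of total scores)
    """
    k = len(items)
    total_scores = [sum(answers) for answers in zip(*items)]   # per-respondent index totals
    item_variance_sum = sum(variance(question) for question in items)
    return k / (k - 1) * (1 - item_variance_sum / variance(total_scores))

# Invented data: six respondents answering a 3-question index on a 1-5 scale.
q1 = [4, 5, 3, 4, 2, 5]
q2 = [4, 4, 3, 5, 2, 4]
q3 = [5, 5, 2, 4, 3, 5]
print(round(cronbach_alpha([q1, q2, q3]), 2))   # about 0.89 for these made-up answers
```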

    41. MORE ON CRONBACH’S ALPHA Value of Cronbach’s alpha is determined by: 1. Number of questions in your index; 2. The correlations among the items. An index with a large number of items in it can have a high Cronbach’s alpha even if the correlations between the items are modest. A small index can have a high Cronbach’s alpha only if the correlations between the items are high. SPSS RELIABILITIES provides data that allow you to optimize the value of Cronbach’s alpha.
