Working with Array in Stata Vivien W. Chen Izumi Mori
Do you … • Write repetitive programs for a set of variables or data files? • Use longitudinal data that repeat the same data management for multiple years? • Want to reduce the chance of coding errors due to repeated programming?
The Purpose of This Workshop • Introduce the Stata counterparts of SAS arrays: forvalues & foreach • Show how to use local & global macros • Introduce commands for temporary files and variables: tempfile & tempvar • Reshape the data structure for longitudinal data
Example Data (Source data: PISA 2000, 2003, 2006)
The Use of forvalues & foreach • forvalues is a loop that executes commands once for each value of a numeric local macro. The values are given as a range placed before the braces “{” and “}”. A range may be 1/5, meaning 1 to 5; or 0(5)100, meaning 0 to 100 in steps of 5. For an arbitrary list of numbers that does not form a range, use foreach with numlist.
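The range forms described above can be tried without any data in memory. The following is a minimal sketch; the macro names i, p, k, n and years are chosen only for illustration.

```stata
* 1/5 means 1, 2, 3, 4, 5
forvalues i = 1/5 {
    display "i = `i'"
}

* 0(5)100 means 0, 5, 10, ..., 100; count how many times the loop runs
local k = 0
forvalues p = 0(5)100 {
    local k = `k' + 1
}
display "0(5)100 ran `k' times"

* Evenly spaced years can also be written as a range
local years
forvalues n = 2000(3)2006 {
    local years `years' `n'
}
display "years visited: `years'"
```

Run the block as a whole from a do-file, because local macros typed one line at a time in the Command window are not shared with a do-file.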
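Example 1: replace missing values for XXX in each year file
One solution is:
    use file2000, clear
    replace XXX=. if XXX<0
    save file2000, replace
    use file2003, clear
    replace XXX=. if XXX<0
    save file2003, replace
    use file2006, clear
    replace XXX=. if XXX<0
    save file2006, replace
Alternatively:
    * forvalues needs a range: 2000(3)2006 expands to 2000 2003 2006
    forv n=2000(3)2006 {
        use file`n', clear
        replace XXX=. if XXX<0
        * save the change; this overwrites the year file
        save file`n', replace
    }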
foreach is a more general statement. It can loop over a list of variables, numbers, or names, and it can also take a list of names for new variables to be created.
• Example 2: (the same task as Example 1)
    foreach n of numlist 2000 2003 2006 {
        use file`n', clear
        replace XXX=. if XXX<0
        * save the change; this overwrites the year file
        save file`n', replace
    }
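The list types that foreach accepts can be compared side by side. This is a minimal sketch that assumes the auto dataset shipped with Stata as a stand-in for the workshop data; the new variable names z1 and z2 are hypothetical.

```stata
sysuse auto, clear

* of varlist: existing variables (abbreviations and ranges are expanded)
foreach v of varlist price mpg weight {
    quietly summarize `v'
    display "`v': mean = " %9.2f r(mean)
}

* of numlist: numbers, including ranges
foreach n of numlist 1/3 10 {
    display "n = `n'"
}

* in: any list of words
foreach w in low mid high {
    display "w = `w'"
}

* of newlist: names for variables that do not exist yet
foreach nv of newlist z1 z2 {
    generate `nv' = 0
}
```

With of varlist and of newlist, Stata checks the names before the loop starts, so a misspelled or already existing variable stops the loop before any command has run.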
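Example 3: In each year file, (1) create dummy variables for mother’s education; (2) create interaction terms for gender and foreign born with mother’s education and parental occupation; (3) create a year ID in each file; and (4) append those files together.
    foreach n of numlist 2000 2003 2006 {
        use file`n', clear
        * (1) dummies for mother's education
        tab med, gen(med)
        * (2) interaction terms
        foreach x of varlist female fborn {
            gen `x'_occ=`x'*occ
            * tab, gen(med) names the dummies med1, med2, ... (there is no med0);
            * 1/7 assumes med has 7 categories (0 to 6)
            foreach n2 of numlist 1/7 {
                gen `x'_med`n2'=`x'*med`n2'
            }
        }
        * (3) year ID
        gen yr=`n'
        * save the new variables so that they are still there when the files are appended
        save file`n', replace
    }
    * (4) append the files
    use file2000, clear
    append using file2003 file2006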
Are these the alternatives?
    foreach n of numlist 2000 2003 2006 {
        use file`n', clear
        tab med, gen(med)
        foreach x of varlist female fborn {
            gen `x'_occ=`x'*occ
            gen `x'_med=`x'*med       /* (?) */
        }
        gen yr=`n'
        append using file`n'        /* (?) */
    }
gen `x'_med=`x'*med: If you want to interact with mother’s education as a continuous variable, you can do so.
append using file`n': This won’t work, because it is inside the loop over years, so each year file is only appended to the copy of itself already in memory.
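One way to build the stacked file from inside the year loop is to keep a running result in a temporary file and append it on every pass after the first. The following is a minimal sketch, not the presenters' own solution: the toy data stand in for file2000, file2003 and file2006, and the names f2000, f2003, f2006, stacked and first are hypothetical.

```stata
* Toy stand-ins for the three year files (assumption: 5 observations each)
tempfile f2000 f2003 f2006
foreach n of numlist 2000 2003 2006 {
    clear
    set obs 5
    generate female = mod(_n, 2)
    generate occ = 20 + _n
    save `f`n''
}

* Process each year and add it to the running stacked file
tempfile stacked
local first = 1
foreach n of numlist 2000 2003 2006 {
    use `f`n'', clear
    generate female_occ = female*occ
    generate yr = `n'
    * from the second year on, bring in the years processed so far
    if `first' == 0 {
        append using `stacked'
    }
    save `stacked', replace
    local first = 0
}

use `stacked', clear
tabulate yr
```

Temporary files are erased when the do-file ends, so save the final data under a permanent name if it is needed later.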
Make Use of Macros: local & global • A local macro exists only within the do-file or program that creates it and disappears when that do-file or program ends. • A global macro remains available until the user exits Stata or drops it with macro drop.
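The difference in scope can be seen by calling a small program from a do-file. This is a minimal sketch that assumes the auto dataset shipped with Stata; the program name show_scope and the macro contents are hypothetical.

```stata
sysuse auto, clear

capture program drop show_scope
program define show_scope
    * a local macro defined by the caller is not visible here, so the brackets are empty
    display "local x1 seen inside the program: [`x1']"
    * a global macro is visible everywhere
    display "global x2 seen inside the program: [$x2]"
end

local x1 weight length
global x2 foreign headroom

show_scope

* in the do-file that defined them, both macros expand as expected
regress mpg `x1'
regress mpg `x1' $x2
```

Because a global macro stays defined for the rest of the session, it can silently affect later do-files; macro drop x2 removes it once it is no longer needed.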
Example 4: regression analysis
    local x1 female fborn
    local x2 occ med1-med6 cultposs
    /* global x1 female fborn
       global x2 occ med1-med6 cultposs */
    reg p1vmath `x1'
    reg p1vmath `x1' `x2'
    /* reg p1vmath $x1
       reg p1vmath $x1 $x2 */
Can I just use “by” instead of “foreach” for each country?