SJTU CMGPD 2012 Methodological Lecture Day 3

nolcha

Presentation Transcript


  1. SJTU CMGPD 2012 Methodological Lecture, Day 3: Position and Status Variables

  2. Variables for position
     • The basic and analytic files include a variety of indicator variables for whether a male holds a position
     • These are based on the statuses recorded in the registers
     • A file with hanyu pinyin for the raw occupations has been released (DS 6)
     • Occupations with original Chinese characters are released as a PDF
     • It turned out to be difficult to include Chinese characters in the released data

  3. Variables for position
     • In the original data, entries included the official positions held by males.
     • Coders assigned a numeric code to each new position and entered the code into the dataset.
     • Codes started again for each new dataset
     • Coders transcribed the original Chinese into a codebook
     • Can use DATASET and POSITION_CODE to look up the original Chinese in the appendix to the Analytic release codebook
     • DS 6 allows merging of the hanyu pinyin for each code, if you want to create your own position variables from the originals.

  4. Position variables
     • We have provided a variety of flag variables identifying different kinds of position
     • We have a separate file that, for each combination of dataset and numeric position code, specifies the hanyu pinyin and Chinese characters.
     • This file provides flag and other variables describing characteristics of positions.
     • These flags are merged back into the main file to provide variables for analysis.

  5. Created Position Variables
     • HAS_POSITION
       • Any salaried official position or purchased title
       • Doesn’t include miding, piding, etc. Those were statuses, not salaried official positions
     • ESTIMATED_INCOME
       • Imputed income based on stipends associated with the position(s) held by an individual
     • RANK
       • Bureaucratic rank, based on specification of pin in the position

  6. Position variables
     • BI_TIE_SHI, ZHI_SHI_REN, and flags for specific positions
     • JUAN, DING_DAI, etc. for presence of modifiers
     • EXAMINATION for any examination-related title
     • NO_STATUS indicates that no status at all was recorded for a male, even though we would have expected one.

  7. Name variables
     • HAS_SURNAME
     • DIMINUTIVE_NAME
     • RUSTIC_NAME
     • NON_HAN_NAME
     • NUMBER_NAME

  8. Creating New Variables
     • DS 6 contains pinyin for positions
     • DATASET and POSITION_CODE are the basis of a merge back to the data files
     • POSITION_PINYIN is the ‘raw’ position, as transcribed by the coders
     • POSITION_CORE is a stripped down version that includes modifiers
     • Chinese characters are in an appendix to the Analytic File codebook
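The merge described on this slide might look like the minimal sketch below. The file names `analytic.dta` and `ds6_positions.dta` are hypothetical placeholders, and the sketch assumes that DS 6 holds one row per combination of DATASET and POSITION_CODE and carries POSITION_PINYIN and POSITION_CORE.

```stata
* Hypothetical file names; substitute the actual paths of the releases.
use "analytic.dta", clear

* Many person-records can share one position code, hence a many-to-one merge.
merge m:1 DATASET POSITION_CODE using "ds6_positions.dta", ///
    keepusing(POSITION_PINYIN POSITION_CORE)

* Inspect how many records matched before dropping anything.
tabulate _merge

* Drop codes that appear only in DS 6 and never in the data file.
drop if _merge == 2
drop _merge
```

If DS 6 turns out to have more than one row per DATASET and POSITION_CODE, Stata stops with an error on `merge m:1`, which is a useful check in itself.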

  9. Creating new variables
     • Stata lets you search strings for particular values and build an indicator for whether the value is found.
     • Can use this for occupations of special interest
     • For example,
         generate artisan = index(POSITION_PINYIN,"jiang") > 0
         generate juanna = index(POSITION_PINYIN,"juanna") > 0
     • Can code positions manually using Chinese characters in the appendix of the Analytic File codebook
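A substring search flags every position whose pinyin contains the same letters, so it may be worth listing what was matched. The sketch below is one way to check, assuming POSITION_PINYIN has already been merged in from DS 6; the variable name `artisan_check` is made up for illustration. `strpos()` is the current name for `index()` and is case-sensitive, so the string is lowercased first.

```stata
* strpos() returns 0 when the substring is absent, so "> 0" yields a 0/1 flag.
generate artisan_check = strpos(lower(POSITION_PINYIN), "jiang") > 0

* Show the distinct raw positions that were flagged, to spot unintended matches.
tabulate POSITION_PINYIN if artisan_check
```

Positions that should not count can then be excluded with extra conditions, or coded by hand from the codebook appendix.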

  10. Studying attainment
     • We have mainly used event-history analysis
     • Determinants of chances of attaining position by next register
     • Allows for consideration of time-varying characteristics
     • Characteristics of kin
     • An alternative would be to look at determinants of attaining a position by a specific age, with one observation per person
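The one-observation-per-person alternative could be set up along the lines of the sketch below. The cutoff age of 30 and the names `position_by_cutoff` and `one_per_person` are assumptions for illustration. The sketch does not restrict the sample to individuals actually observed up to the cutoff age, which a real analysis would probably need to do.

```stata
* Hypothetical cutoff age; choose one suited to the question.
local cutoff 30

* 1 if the person is ever recorded with a position at or before the cutoff age.
bysort PERSON_ID: egen position_by_cutoff = ///
    max(HAS_POSITION == 1 & AGE_IN_SUI <= `cutoff')

* Tag exactly one record per person for person-level tabulations or models.
egen one_per_person = tag(PERSON_ID)
tabulate position_by_cutoff if one_per_person
```

Because the local macro exists only while the do-file runs, these lines need to be executed together.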

  11. Creating variables to identify attainment of position by next register
        * Flag records at risk of attaining a position (no position in the current register)
        generate at_risk_position = SEX == 2 & PRESENT & NEXT_3 & HAS_POSITION == 0
        * Flag at-risk records where a position is held in the person's next record
        bysort PERSON_ID (YEAR): generate next_position = at_risk_position & HAS_POSITION[_n+1]
        * Total at risk and total attaining a position, by age
        bysort AGE_IN_SUI: egen total_at_risk_position = total(at_risk_position)
        bysort AGE_IN_SUI: egen total_next_position = total(next_position)
        generate p_next_position = total_next_position/total_at_risk_position
        * Plot one record per age
        bysort AGE_IN_SUI: generate first_in_age = _n == 1
        twoway line p_next_position AGE_IN_SUI if AGE_IN_SUI >= 1 & AGE_IN_SUI <= 80 & first_in_age, ///
            ytitle("Proportion attaining position by next register") scheme(s1mono)
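The same age-specific proportions can also be obtained with `collapse`, since the mean of a 0/1 flag among at-risk records is the proportion attaining a position. This is a sketch of an alternative, not the approach used in the lecture. It assumes `at_risk_position` and `next_position` have been created as on this slide; `p_attain` and `n_at_risk` are names invented here.

```stata
* collapse replaces the data in memory, so keep a copy to return to.
preserve

keep if at_risk_position

* One row per age: proportion attaining a position and number at risk.
collapse (mean) p_attain = next_position (count) n_at_risk = next_position, ///
    by(AGE_IN_SUI)

list AGE_IN_SUI n_at_risk p_attain if AGE_IN_SUI >= 1 & AGE_IN_SUI <= 80, noobs

restore
```

Listing the number at risk alongside the proportion helps show which ages rest on very few records.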

  12. bysort
     • bysort groups the records in the dataset according to the values of the specified variables.
     • Each set of records defined by a unique combination of values of the specified variables is treated as a distinct block of records when the command is executed.
     • If a variable is in parentheses, the data are sorted on that variable, but not divided according to the unique values of that variable.
     • [ ] allows access to values from other observations in the same block. [1] says to draw the value of a variable from the first record in the block, [_N] from the last record, [_n+1] from the next record, and so forth
     • _n refers to the location of the current record within the block
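Applied to the register data, the subscripts described above might be used as in the sketch below. PERSON_ID and YEAR come from the earlier slides; the new variable names are made up for illustration.

```stata
* Year of each person's first record.
bysort PERSON_ID (YEAR): generate first_year = YEAR[1]

* Year of each person's last record.
bysort PERSON_ID (YEAR): generate last_year = YEAR[_N]

* Years until the person's next record; missing on the last record.
bysort PERSON_ID (YEAR): generate years_to_next = YEAR[_n+1] - YEAR
```

A subscript that points outside the block, such as `[_n+1]` on the last record, returns a missing value.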

  13. x    y
      1    3
      1    7
      1    8
      1   12
      2   15
      2   21
      2   22
      2   -5
      3  -10
      3   10
      4    8
      4    2
     • Create a variable with the record number within x:
         bysort x (y): generate a = _n
     • Create a flag identifying the first record within x:
         bysort x (y): generate b = _n == 1
     • Create a flag identifying the last record within x:
         bysort x (y): generate c = _N == _n
     • Create a variable with the total number of records with that unique value of x:
         bysort x (y): generate d = _N
     • Create a variable with the y from the next record within x:
         bysort x (y): generate e = y[_n+1]

  14. Results (sorted by x, then y; "." marks a missing value)
      x    y   a   b   c   d    e
      1    3   1   1   0   4    7
      1    7   2   0   0   4    8
      1    8   3   0   0   4   12
      1   12   4   0   1   4    .
      2   -5   1   1   0   4   15
      2   15   2   0   0   4   21
      2   21   3   0   0   4   22
      2   22   4   0   1   4    .
      3  -10   1   1   0   2   10
      3   10   2   0   1   2    .
      4    2   1   1   0   2    8
      4    8   2   0   1   2    .
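To reproduce these results, the example data can be typed in directly with `input`. The sketch below is self-contained and uses only the commands shown on the previous slide.

```stata
clear
input x y
1 3
1 7
1 8
1 12
2 15
2 21
2 22
2 -5
3 -10
3 10
4 8
4 2
end

* Record number, first-record flag, last-record flag, block size, next y.
bysort x (y): generate a = _n
bysort x (y): generate b = _n == 1
bysort x (y): generate c = _N == _n
bysort x (y): generate d = _N
bysort x (y): generate e = y[_n+1]

* Show the result with a separator line between blocks of x.
list, sepby(x) noobs
```

The last record of each block has no next record, so `e` is missing there.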
