Session I: How to Use STATA & Basic Data Management Commands
What will be covered? • Introduction to STATA Software • General Guidelines in Data entry • Data Management in STATA
Open & Close the Output File
• To open the log file
    log using "directory\path\filename.log"
    log using d:\trials\zinc.log
• To close the log file
    log close
Dataset: zinc.dta
Append & Replace the Existing Log File
• To append to the existing log file
    log using d:\trials\zinc.log, append
• To replace the existing log file
    log using d:\trials\zinc.log, replace
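The log commands above can be tried in a short do-file. This is only a sketch: the file name demo.log is hypothetical and is written to the current working directory rather than to d:\trials.

```stata
* Close any log that is already open (no error if none is open)
capture log close

* Start a new plain-text log, overwriting demo.log if it exists
log using "demo.log", replace text
display "first session"
log close

* Reopen the same file and add to the end of it
log using "demo.log", append text
display "second session"
log close
```

After the second log close, demo.log holds the output of both sessions.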
Open the Data File
• To open the data file
    use "directory\path\filename.dta"
    use d:\trials\zinc.dta
• To save
    save zinc.dta
Dataset: zinc.dta
General Guidelines in Data Entry
• Each row in the datasheet should contain the information of one individual (a record).
• Each column should contain the values of a single characteristic for all individuals (a variable).
• Variable names should not exceed eight characters.
• Variables can be either numeric or string (alphanumeric).
• A numeric variable must contain only numbers.
• Every datasheet must include an identification number.
Data Management using STATA • Inputting Data • Editing Data • Creating and Changing Variables • Saving and Reusing Data • Data Reorganization • Merging and Appending datasets
Inputting Data
• Enter data from the keyboard
    input varlist
    input str25 name age str1 sex
• The easiest way is to copy the data from Excel and paste it directly into the STATA Data Editor
• Transfer from other programs
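A minimal sketch of keyboard entry with the input command from the slide above. The three records are made up for illustration; string values that contain spaces must be enclosed in double quotes.

```stata
* Start with an empty dataset
clear

* Declare the variables: name (string, up to 25 characters),
* age (numeric) and sex (string, 1 character)
input str25 name age str1 sex
"Asha Rao"   24 "F"
"Ravi Kumar" 31 "M"
"Meena Das"  19 "F"
end

* Show what was entered
list
```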
Arithmetic Operators + (Addition) - (Subtraction) * (Multiplication) / (Division) ^ (Raise to power)
Relational Operators > (greater than) < (less than) >= (greater than or equal) <= (less than or equal) == (equal) != (not equal)
Logical Operators & (and) | (or) ! (not)
Expressions
• if: used to restrict a command to the observations that satisfy a condition
• in: used to restrict a command to a range of observation numbers
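The operators and the if and in qualifiers can be combined as in the sketch below. The variable names follow the later slides (treatment, centre, age), but the four records are invented for illustration.

```stata
clear
input treatment centre age
1 3 30
2 3 20
1 4 28
2 4 40
end

* if: list only observations from centre 3 aged over 25
list treatment age if centre == 3 & age > 25

* in: list only the first two observations
list in 1/2

* | (or): count observations from centre 4 or aged under 25
count if centre == 4 | age < 25
```

Note that a single = assigns a value, while == tests for equality inside a condition.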
Editing Data
• Edit using the Data Editor
    edit [varlist] [if] [in]
    edit treatment centre age
    edit treatment age if centre==3 & age>25
Browsing Data
• View (without changing) the data using the Data Editor
    browse [varlist] [if] [in]
    browse treatment centre age
    browse treatment age if centre==3 & age>25
Do this Exercise…
• Edit pcode, treatment and cough for centre 4 only
• Browse the same selection and note the difference
Dataset: zinc.dta
Creating & Changing Variables
• Create a new variable
    generate newvar = exp [if] [in]
    gen totstl24 = s1_tstool_wt + s2_tstool_wt + s3_tstool_wt
Do this Exercise…… zinc.dta Generate total stool output from 0-48 hours
Creating & Changing Variables…contd
• Change the contents of an existing variable
• To replace
    replace oldvar = exp [if] [in]
    replace sodium1 = . if sodium1==0
• To recode
    recode varlist (erule) [(erule) ...] [if] [in]
    recode age (min/6=1) (7/11=2) (12/max=3), gen(agecat)
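The generate, replace and recode commands above can be tried on a small made-up dataset. The variables s1 and s2 are hypothetical stand-ins for the stool-weight variables, and all values are invented.

```stata
clear
input age sodium1 s1 s2
3  135 100 150
9  0   80  60
14 140 120 90
end

* Create a new variable as the sum of two existing ones
generate total = s1 + s2

* Treat a recorded zero as missing
replace sodium1 = . if sodium1 == 0

* Group age into three categories and store them in a new variable
recode age (min/6=1) (7/11=2) (12/max=3), generate(agecat)

list
```

Using generate() with recode leaves the original age variable unchanged.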
Do this Exercise……
• Ex 1: Replace all zeros in serum potassium with missing values.
• Ex 2: Recode pre-admission diarrhea duration into 0-24h, 25-72h and > 72h.
Dataset: zinc.dta
Creating & Changing Variables…contd
• Rename an existing variable
    rename oldvarname newvarname
    ren tlc_t2 tlc2
    ren tlc_t3 tlc3
• Eliminate existing variables
• To drop
    drop varlist
    drop name address
• To keep
    keep varlist
    keep idno age sodium albumin-tlc
Dataset: zinc.dta
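A short sketch of rename, drop and keep on invented data. The variable weight is hypothetical; a range such as tlc2-tlc3 refers to all variables stored between the two names in the dataset's current order.

```stata
clear
input idno age tlc_t2 tlc_t3 weight
1 10 7.5 8.1 9.2
2 14 6.9 7.4 10.5
end

* Give two variables shorter names
rename tlc_t2 tlc2
rename tlc_t3 tlc3

* drop removes the listed variables
drop weight

* keep removes everything except the listed variables (age is dropped here)
keep idno tlc2-tlc3

describe
```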
Saving & Reusing Data in Stata Format
• To save data
    save filename.dta
    save zinc, replace
    clear
• To reuse data
    use filename
    use zinc
Dataset: zinc.dta
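The save, clear and use cycle can be sketched as follows. The file name demo_zinc.dta is hypothetical and the two records are invented; the file is written to the current working directory.

```stata
clear
input idno age
1 10
2 14
end

* Save to disk; replace allows overwriting an existing file
save "demo_zinc.dta", replace

* Remove the data from memory, then load it back from disk
clear
use "demo_zinc.dta"
describe
```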
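Data Reorganization
• Sorting observations and changing variable order
• To sort (sort always orders ascending; use gsort for descending order)
    sort varlist [in]
    sort pcode
• Move specified variables to the front of the dataset
    order varlist
• Move one variable to a specified position
    move varname1 varname2
    (Stata 11 and later: order varname1, before(varname2))
• Alphabetize specified variables and move them to the front of the dataset
    aorder [varlist]
    (Stata 11 and later: order varlist, alphabetic)
Dataset: zinc.dta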
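A sketch of sorting and reordering on invented data. It assumes Stata 11 or later, where the before() and alphabetic options of order cover what move and aorder do.

```stata
clear
input pcode age centre
3 12 1
1 30 2
2 8  1
end

* Ascending sort on pcode
sort pcode
list

* Descending sort on age (the minus sign requests descending order)
gsort -age
list

* Move centre to the front, then place age just before it
order centre
order age, before(centre)

* Put all variables in alphabetical order
order _all, alphabetic
describe
```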
Data Reorganization …contd
• Convert data from wide to long
    reshape long stubnames, i(varlist) j(varname)
    reshape long albumin, i(pcode) j(time)
[Figure: wide-shape data converted to long-shape data]
Data Reorganization …contd
• Convert data from long to wide
    reshape wide stubnames, i(varlist) j(varname)
    reshape wide albumin, i(pcode) j(time)
[Figure: long-shape data converted to wide-shape data]
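The two reshape commands above can be tried on a tiny made-up dataset. It assumes the wide data store albumin at two time points as albumin1 and albumin2, so that albumin is the stub name and the numeric suffix becomes the new variable time.

```stata
clear
input pcode albumin1 albumin2
1 3.5 3.8
2 4.0 4.1
end

* Wide to long: one row per patient per time point
reshape long albumin, i(pcode) j(time)
list

* Long back to wide: one row per patient
reshape wide albumin, i(pcode) j(time)
list
```

In long shape each pcode appears once per time point; in wide shape each pcode appears once.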
Do this Exercise… Convert serum zinc from wide to long shape data using zinclab.dta zinclab.dta
Answer!!! zinclab.dta
Merging & Appending Datasets
• To append datasets
    append using filename
    use zinc1.dta
    append using zinc2.dta
• To merge datasets
    merge [varlist] using filename
    use zinclab
    sort pcode
    save zinclab, replace
    use zincprognostic
    sort pcode
    merge pcode using zinclab
    (Stata 11 and later, for a one-to-one match: merge 1:1 pcode using zinclab, with no prior sorting needed)
Dataset: zinclab.dta
Do this Exercise… Merge file 1 (zinclab.dta) with file 2 (zincprognosis.dta) zinclab.dta
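A self-contained sketch of append and merge, written as a do-file. It assumes Stata 11 or later (merge 1:1 syntax) and uses temporary files and invented records instead of the zinc datasets.

```stata
* Part 1 of the laboratory data
clear
input pcode zinc
1 60
2 75
end
tempfile lab1
save `lab1'

* Part 2 of the laboratory data; append adds the rows of part 1
clear
input pcode zinc
3 80
end
append using `lab1'
tempfile lab_all
save `lab_all'

* Patient-level data; merge adds the zinc column by matching on pcode
clear
input pcode age
1 10
2 14
3 20
4 25
end
merge 1:1 pcode using `lab_all'

* _merge shows which records matched (3) and which came from one file only
tabulate _merge
```

Patient 4 has no laboratory record, so that row keeps a missing zinc value and _merge equal to 1.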
Preparing Data for Analysis Inclusion criterion: children aged ≤ 35 months
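One way to check an inclusion criterion like this is to summarize the variable and then list the records that violate it. In this sketch the variable name age_months and the three records are hypothetical.

```stata
clear
input pcode age_months
1 12
2 40
3 30
end

* Minimum and maximum reveal values outside the expected range
summarize age_months

* List the offending records; Stata treats missing values as larger
* than any number, so they are excluded explicitly
list pcode age_months if age_months > 35 & !missing(age_months)

count if age_months > 35 & !missing(age_months)
```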
Do this Exercise…
The inclusion criterion for the study was pre-admission diarrhea duration < 7 days.
• Ex 1: Convert pre-admission diarrhea duration from hours to days using zincclean.dta
• Ex 2: Find values beyond the expected range
Dataset: zinc.dta
Do this Exercise… Do similar exercise for hemoglobin using zinc.dta zinc.dta
Preparing Data for Analysis …contd What do you mean by 1 & 2??? zinc.dta
Preparing Data for Analysis …contd Label name
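Numeric codes such as 1 and 2 become readable once a value label is attached. The sketch below assumes, purely for illustration, a variable sex coded 1 = Male and 2 = Female; the label name sexlbl is also hypothetical.

```stata
clear
input pcode sex
1 1
2 2
end

* Define a value label (the label name) and attach it to the variable
label define sexlbl 1 "Male" 2 "Female"
label values sex sexlbl

* A variable label describes the variable itself
label variable sex "Sex of child"

* Tables now show the text instead of the codes
tabulate sex
```

The stored values remain 1 and 2; only their display changes.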
Preparing Data for Analysis …contd zinc.dta What is wrong and how to correct it???