Stata statistical software
Desirability • Complete, integrated statistical package for data analysis, data management, and graphics • Fast and easy to use • Broad suite of statistical capabilities • Can quickly and easily import datasets from other statistical packages, spreadsheets, and databases • Publication-quality graphics • Combines an object-oriented (point-and-click) interface with an in-line execution interface: we can either use the menus and click on objects to do the desired task, or type instructions in the Stata language into the Command window.
Statistical capabilities • Standard methods, e.g. linear and generalized linear models (GLM), regressions with count or binary outcomes, ANOVA/MANOVA, ARIMA, cluster analysis, standardization of rates, case–control analysis, and basic tabulations and summary statistics • Advanced techniques, e.g. dynamic panel data (DPD) regressions, generalized estimating equations (GEE), multilevel mixed models, models with sample selection, multiple imputation, ARCH, and estimation with complex survey samples
Data-management facilities • Can combine and reshape datasets • Can manage variables and collect statistics across groups or replicates • Also has advanced tools for managing specialized data such as time-series data, panel/longitudinal data, categorical data, multiple-imputation data, and survey data.
Graphics • Distinctly styled graphs, including regression fit graphs, distributional plots, time-series graphs, survival plots, and contour plots. • One can write scripts to produce graphs and export them to TIFF for publication, to PNG for the web, or to PDF for viewing. • Or, with the integrated Graph Editor, it is possible to click to change anything about a graph or to add titles, notes, lines, arrows, and text. • It is possible to choose between existing graph styles or create your own.
Reading data into Stata: opening an existing Stata dataset • Select Open under the File menu and browse for the Stata file (.dta) • Memory note: older versions of Stata start with a fixed amount of memory (4 megabytes by default in the setup described here). Often your dataset will be larger than this and you will need to increase the amount of memory Stata uses. • In those versions, files larger than about 4,000,000 bytes will not be loaded unless you increase the memory with the "set memory" command, e.g. set mem 10m • Stata 12 and later manage memory automatically, so this step is not needed there.
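The menu steps above have a command-line equivalent. The lines below are a minimal sketch, assuming the example dataset auto that ships with Stata; the file name auto_copy.dta is hypothetical and is written to the current working directory.

```stata
* Load an example dataset shipped with Stata and save a copy.
* "auto_copy.dta" is a hypothetical file name used for illustration.
sysuse auto, clear
save "auto_copy.dta", replace

* Open the saved .dta file; clear drops any data already in memory.
use "auto_copy.dta", clear

* List the variables, their storage types, and their labels.
describe
```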
Reading data into Stata • You must be able to read and transfer your file into Stata. • Using the data editor • Although Stata has its own integrated spreadsheet for entering data manually, it is not as powerful as stand-alone spreadsheet applications such as Excel. So it is preferable to get your raw data ready in Excel and then transfer it to Stata. Once all the data you want to use are properly arranged in Excel, you can simply copy and paste them into the Stata data editor. • Stata uses columns to store variables and rows to store observations.
Reading data into Stata • Using the data editor (continued) • While working in Excel, use the first row for the variable names; choose names that are easy to identify and avoid spaces or special characters. The data must start in the row immediately below the variable names. If you copy and paste directly from your spreadsheet to Stata, make sure you include the row containing the variable names in your selection and paste into the first cell of the Stata editor. • Handling missing values is also important. Stata automatically recognizes blank cells as missing values, so you can leave missing-value cells blank or use “.” to indicate that there is no data in a given cell. If your spreadsheet includes formulas, make sure there are no error messages such as #DIV/0! or #N/A; Stata will not recognize these cells and this may cause an error.
Reading data into Stata • Importing data using the insheet instruction • You can import your data from a spreadsheet directly. This procedure requires two steps: • Save the dataset in your spreadsheet with a .csv file extension • Go to the File menu, then to Import, and select the option "ASCII data created by spreadsheet". This opens a dialog box; click on the Browse button to search for your dataset and, once you find your file, click OK.
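The same import can be typed as a command. This is a sketch under stated assumptions: it first writes a small CSV from the auto example dataset so that it can be run as is, and the file name auto_sample.csv is hypothetical. insheet and outsheet still work in recent versions, although import delimited is the documented command from Stata 13 onwards.

```stata
* Create a small comma-separated file to import.
* "auto_sample.csv" is a hypothetical file name.
sysuse auto, clear
outsheet make price mpg using "auto_sample.csv", comma replace

* Read the CSV back in; the first row supplies the variable names.
insheet using "auto_sample.csv", comma clear

* Equivalent in Stata 13 and later:
* import delimited using "auto_sample.csv", varnames(1) clear

describe
```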
Starting your session • Using “logs” before you get started • It is always a good idea to record your Stata session and to save your output for later viewing and/or printing. • Retyping all the instructions every time you work in Stata can be really boring, and therefore Stata features some options that automatically keep a record of every command you type and of the results you obtain in every session.
Starting your session • Using “logs” before you get started • After you open Stata, type: log using [your log name], text • This starts a new file where Stata saves everything you type in the Command window and the corresponding results. • The text option tells Stata to create a plain-text file that you can open in Notepad or any word processor. If you don't include this option, Stata creates the log in a different format (with a .smcl extension) that you can only view in Stata (although you can convert it to a text file easily). • The file will include the time you opened and closed the log, error messages, tables, regression results, etc.
Log Files • Logs capture all the text printed in the Results window. • To open a log file, type in the Command window: log using y:\mylog, or use the drop-down menu by going to File and opening Log. • You now have a log file on your y drive called mylog that records everything you type in the Command window and the output you see in the Results window. • To turn the log off, type: log close • To write over an existing log file, type: log using y:\mylog, replace • To append to an existing file, type: log using y:\mylog, append • To temporarily stop logging, type in the Command window: log off • To resume logging, type in the Command window: log on • By default Stata saves your log file as .smcl, so we need to translate it to text to look at it in Word: translate y:\mylog.smcl y:\mylog.txt • Alternatively, when you open a log file you can add the option text, which saves everything directly in a text file.
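Put together, a logged session might look like the sketch below. The log name mylog is hypothetical and the file is written to the current working directory; the auto dataset is used only so that there is some output to record.

```stata
* Start a plain-text log, overwriting any earlier file of the same name.
* "mylog" is a hypothetical log name.
log using "mylog", text replace

sysuse auto, clear
summarize price mpg      // recorded in the log

log off                  // pause logging
describe                 // not recorded
log on                   // resume logging

log close                // stop logging and close the file
```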
Do Files • A do-file is extremely important. It is a record of the operations you are carrying out, written just as you would type them one by one during a regular Stata session. • Any command you use in Stata can be part of a do-file. • Do-files are an easy way to clean and document your data, to replicate programs later on, to rerun a program with different data, and many other things. • doedit opens a text editor that allows you to edit do-files and other text files.
Do Files • You can again use the drop-down menu: File -> Do, to open one. • Alternatively, you can click on the icon that looks like a pencil on an envelope. • When using a do-file it is a good idea to make notes to yourself: • to do that, type * at the beginning of a line. Everything you type afterwards on that line is disregarded by Stata, but it will remind you what you did in the program. Longer notes can be enclosed between /* and */. • To run a do-file, highlight the part of your program you want to run and click on the icon that looks like a sheet of paper with a downward arrow next to it (you can also highlight individual command lines and run them separately).
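A short do-file with notes might look like the following sketch. The file name example.do is hypothetical, and the auto dataset stands in for your own data.

```stata
* example.do: a hypothetical do-file used for illustration.
* Lines beginning with an asterisk are comments and are not executed.

/* A block comment can span
   several lines. */

sysuse auto, clear
summarize price mpg      // text after two slashes is also a comment
```

Once saved, the whole file can be run from the Command window with do example.do.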
Finishing your session • Before you finish your session, type in the command line: log close • In your next session you can start a different log file, or you can continue an old one: log using [your log name], append text • This command tells Stata to open your existing log file and add any information produced during the new session. • If you don't include the option append (or replace), Stata will try to create a new log file, and if that file already exists the program will display an error message.
View your log file • Simply look for the folder where you stored the log file (usually the default directory is c:\data\) and open your log file in the word processor of your choice. Notice that every instruction you entered through the command line is preceded by a dot “.” and the resulting output follows these lines.
Displaying the data in Stata • Once you have loaded your data, or if you open a .dta file created with Stat/Transfer, you can use either of the following commands to open the data editor: edit (values can be changed) or browse (read-only)
Stata commands & Syntax • There are a number of ways to ask Stata to run commands: • typing them in the Command window, • running them through a do-file, or • pointing and clicking in the drop-down menus. • Drop-down menus are useful when you are unsure of the commands to run or the options available. However, once you have a working knowledge of the commands, it is easier and faster to run them from the command line or through a do-file. • The basic syntax of any Stata command: command variable-list restrictions, options (restrictions are typically if or in conditions)
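To make the syntax pattern concrete, the sketch below labels each part of two commands. It assumes the auto example dataset shipped with Stata; the variables price, mpg, make, and foreign belong to that dataset, not to the data used in these slides.

```stata
sysuse auto, clear

* command    variable-list   restriction       , options
summarize    price mpg       if foreign == 1   , detail

* The same pattern with an "in" restriction (first five observations).
list         make price      in 1/5            , clean
```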
Stata commands & Syntax Notation • Stata instructions are shown in “this font” • Arguments of a Stata instruction are shown as [text in brackets] • Stata commands are case sensitive: all instructions are typed in lower case. If you get an error message after executing a command, first check that you typed it correctly without any upper-case letters.
Stata commands & Syntax Basic Commands: • Stata handles two different types of variables: • numeric variables (whose values are only real numbers), and • string variables (whose values are combinations of alphabetic and/or numeric characters). • Arithmetic operators: + addition, - subtraction, * multiplication, / division, ^ raise to a power • Relational operators: > greater than, < less than, >= greater than or equal to, <= less than or equal to, == equal to (the relational operator for equality is a pair of equal signs), ~= not equal to (!= also works) • Logical operators: & and, | or, ~ not (! also works) • Common functions: abs(x) absolute value, exp(x) exponential, ln(x) natural logarithm, log(x) natural logarithm (a synonym for ln(x)), log10(x) logarithm to base 10, sqrt(x) square root
Stata commands & Syntax Basic mathematical operations in Stata: • Stata has a simple calculator function, display (or di), e.g. display 2+2; di sqrt(2)/2; di normal(-1.1) (the cumulative standard normal, written normprob() in older materials)
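A few calculator lines that can be typed as is are sketched below; they use only the operators and functions listed on the previous slide.

```stata
display 2 + 2             // prints 4
di sqrt(2)/2              // square root, then division
di 2^10                   // raise to a power
di ln(10) == log(10)      // prints 1: log() and ln() are both the natural log
di normal(-1.1)           // cumulative standard normal at -1.1
```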
Stata commands & Syntax Working with data, e.g. Descriptive Statistics • describe landha reg • list landha reg • sum landha reg • table landha reg • ttest reg==kongovillage • ttest reg==0 • corr landha reg
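The same commands can be tried without the village data. The sketch below assumes the auto example dataset shipped with Stata, so the variable names differ from landha and reg; the test value 20 is arbitrary.

```stata
sysuse auto, clear

describe price mpg foreign     // storage types and labels
list make price mpg in 1/5     // first five observations
summarize price mpg            // mean, sd, min, max
tabulate foreign               // frequency table
ttest mpg == 20                // one-sample t test against 20
ttest mpg, by(foreign)         // two-sample t test across groups
correlate price mpg            // correlation matrix
```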
Stata commands & Syntax Manipulating Data • gen: generates variables • Let’s say we want to split the “landha” variable into small and large farms by creating a dummy variable: gen landha2=1 if landha>5 • But what happens to those that have a landha of 5 or less? Stata assigns them a missing value (.). We need to give them a numeric value, and since we want a dummy, let’s give them a 0: replace landha2=0 if landha2==. • We can also do this using the drop-down menu. All the commands for data manipulation are found under Data → Create or Change Variables. After you use the drop-down menu, Stata places the equivalent command in the Results window and runs it. So someone who did not know the command beforehand can copy it and keep it somewhere else (like in a do-file) for later use. Of course, it is also stored in the log file.
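One caveat about the two-step approach: Stata treats a missing numeric value as larger than any number, so an observation with missing landha satisfies landha>5 and would be coded 1. A one-line alternative that keeps such observations missing is sketched below, using the auto example dataset; the variable name highmpg and the cutoff 25 are hypothetical.

```stata
sysuse auto, clear

* (mpg > 25) evaluates to 1 when true and 0 when false;
* the if condition leaves the dummy missing where mpg is missing.
generate byte highmpg = (mpg > 25) if !missing(mpg)

* Check the result, showing any missing values as their own row.
tabulate highmpg, missing
```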
Stata commands & Syntax Regression • All regression commands can be found in the drop-down menu by going to Statistics and then choosing the appropriate sub-menu. • regress or reg: runs a simple OLS regression: regress depvariable independentvariables (the dependent variable comes first and the variables are separated by spaces, not commas) • To add variables to the command line, you may either type the variable name or click on it in the Variables window.
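As an illustration, a minimal sketch on the auto example dataset shipped with Stata follows; the choice of price as the dependent variable is arbitrary.

```stata
sysuse auto, clear

* OLS of price on mileage and weight: dependent variable first.
regress price mpg weight

* The same model on domestic cars only, with robust standard errors.
regress price mpg weight if foreign == 0, vce(robust)
```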
Stata commands & Syntax Graphing • The drop-down menus in Stata will be most useful for graphing. • These are some of the more frequently used graphs, which can be produced by typing the following commands or by using the drop-down menu: hist landha for a histogram; scatter yvar xvar for a scatter plot of two variables. Use the drop-down menu to insert titles and legends. • To save a graph, right-click on it and choose Save As. • Or you can copy and paste your graph directly into Word by using the right mouse button.
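Graphs can also be titled and saved from a script instead of the menus. The sketch below assumes the auto example dataset; the output file names are hypothetical, and the file suffix tells graph export which format to write.

```stata
sysuse auto, clear

* Histogram with a title, exported as PNG.
histogram price, title("Distribution of price")
graph export "price_hist.png", replace

* Scatter plot: y variable first, then x variable.
scatter price mpg, title("Price against mileage")
graph export "price_mpg.png", replace
```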
Demonstration (Kongo village data)
Getting help • To get help on a particular command, type: help commandname, e.g. help regress • To obtain all references to a topic, type: search topic, e.g. search regression • An easy way to get help, especially if you don’t know the command name, is to use the drop-down menu. Go to Help and then choose Stata Command, Search, or Contents. • For additional help, Stata manuals are available in the statlab, and online help is available at: http://www.stata.com/
Training manual • Nicholas Minot (2009), ‘Using Stata for Survey Data Analysis,’ International Food Policy Research Institute Washington, DC, USA