Summer SAS Workshop Lecture 3
SAS Workshop Website www.musc.edu/~simpsona/SASWorkshop/
Part I of Lecture 3
• Thinking through a programming problem
• Programming logic
• Subsetting data
How to program?
• What are your goals?
• What does your data look like?
• How does your data need to look to accomplish your goals?
• What is the first thing you type and run when you open SAS and want to start coding?
Dropping or Keeping Variables
• You often get big data sets of which you only want to use a part.
• First, drop the variables that you don’t want (or keep the ones you do want).
Data newdata;
    set annie.olddataset;
    Keep name ssnumber visdate dob;
Run;
Or
Data newdata;
    set annie.olddataset (Keep= name ssnumber visdate dob);
Run;
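The slide mentions dropping variables but only shows KEEP. A minimal sketch of the DROP form follows. The data set `olddataset` and the variable `phone` are made up for illustration and stand in for annie.olddataset.

```sas
* Hypothetical sample data standing in for annie.olddataset;
data olddataset;
    input name $ ssnumber visdate dob phone;
    datalines;
Ann 1111 15000 3000 5551234
Bob 2222 15010 4000 5555678
;
run;

* DROP statement removes phone and keeps everything else;
data newdata;
    set olddataset;
    drop phone;
run;

* Equivalent DROP= data set option on the SET statement;
data newdata2;
    set olddataset (drop=phone);
run;
```

Both steps produce the same output here. One difference worth knowing: with the DROP= option on SET, the variable is never read into the step, so it cannot be used in later statements of that step.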
Conditional Logic [If-Then-Else]
• Frequently, you want an assignment statement to apply to some observations but not all - under some conditions, but not others.
• This is also how you create new variables by recategorizing the old variables into new groupings.
1) IF condition THEN action;
2) IF condition THEN action;
   ELSE IF condition THEN action;
   ELSE IF condition THEN action;
3) IF condition THEN action;
   ELSE IF condition THEN action;
   ELSE action;
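As an illustration of form 3, here is a sketch that recategorizes a numeric variable into groups. The data set `patients`, the variable names, and the cut points are all hypothetical.

```sas
* Hypothetical sample data with one missing age;
data patients;
    input id age;
    datalines;
1 25
2 47
3 70
4 .
;
run;

data patients2;
    set patients;
    length agegroup $ 8;
    * Handle missing first because missing compares as less than 40;
    if age = . then agegroup = 'Missing';
    else if age < 40 then agegroup = 'Under 40';
    else if age < 65 then agegroup = '40-64';
    else agegroup = '65+';
run;
```

The LENGTH statement is there so that the new character variable is wide enough for the longest label, whichever assignment SAS encounters first.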
IF-THEN-ELSE Rules
• A single IF-THEN statement can only have one action. If you add the keywords DO and END, then you can execute more than one action (the statements between DO and END form a group that runs once when the condition is true; it is not a loop).
• You can also specify multiple conditions with the keywords AND and OR.
*Remember that SAS considers missing numeric values to be smaller than non-missing values.
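A small sketch of these rules, using a hypothetical data set `visits` with made-up blood pressure variables and thresholds. It combines conditions with OR and AND, and uses DO and END to run two assignments under one IF.

```sas
* Hypothetical sample data with one missing reading;
data visits;
    input id sbp dbp;
    datalines;
1 150 85
2 118 76
3 . 80
;
run;

data visits2;
    set visits;
    length bpgroup $ 8;
    * DO and END let one IF-THEN carry out two assignments;
    if sbp >= 140 or dbp >= 90 then do;
        bpgroup = 'High';
        flag = 1;
    end;
    * Require both readings to be present before calling it not high;
    else if sbp ne . and dbp ne . then do;
        bpgroup = 'Not high';
        flag = 0;
    end;
run;
```

Observation 3 matches neither condition, so `bpgroup` stays blank and `flag` stays missing for it rather than being classified from incomplete data.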
Comparison Operators
• These operators can be coded using symbols or mnemonics.
Symbol   Mnemonic   Meaning
=        EQ         Equals
~=       NE         Not equal
>        GT         Greater than
<        LT         Less than
>=       GE         Greater than or equal
<=       LE         Less than or equal
&        AND        All comparisons must be true
|        OR         At least one comparison must be true
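To show that the two spellings are interchangeable, here is a sketch with a hypothetical data set `rats`. The same condition is written once with a symbol and once with a mnemonic.

```sas
* Hypothetical sample data;
data rats;
    input ratid weight;
    datalines;
1 250
2 310
3 .
;
run;

* Symbol form;
data heavy_symbol;
    set rats;
    if weight >= 300;
run;

* Mnemonic form of the same condition;
data heavy_mnemonic;
    set rats;
    if weight ge 300;
run;
```

Both output data sets contain the same observations. Which form to use is a matter of style, though staying with one form throughout a program makes it easier to read.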
Subsetting
• Often programmers find that they want to use some of the observations in a data set and exclude the rest. The most common way to do this is with a subsetting IF statement in a DATA step.
Syntax: IF expression;
Ex: IF Sex = 'f';
Subsetting (cont.)
• If the expression is true, then SAS continues with the DATA step. If the expression is false, then no further statements are processed for that observation; that observation is not added to the data set being created; and SAS moves to the next observation.
• While the subsetting IF statement tells SAS which observations to include, the DELETE statement tells SAS which observations to exclude:
IF expression THEN DELETE;
IF Sex = 'm' THEN DELETE; (same as IF Sex = 'f'; when Sex takes only the values 'm' and 'f' and is never missing)
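The two approaches can be sketched side by side. The data set `people` and its values are hypothetical.

```sas
* Hypothetical sample data;
data people;
    input name $ sex $;
    datalines;
Ann f
Bob m
Cal m
Dee f
;
run;

* Subsetting IF keeps only observations where the expression is true;
data females;
    set people;
    if sex = 'f';
run;

* DELETE removes every observation that is not female;
data females2;
    set people;
    if sex ne 'f' then delete;
run;
```

Writing the DELETE condition as `sex ne 'f'` rather than `sex = 'm'` keeps the two steps equivalent even if some values of `sex` are blank or miscoded. Character comparisons are case sensitive, so 'F' would not match 'f'.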
Open SAS code from website • Go through the code. • Run the program. • Questions? • How could we make a new data set with only Males in it?
Part II of Lecture 3
• Merging data sets
• SAS Functions
Combining Data Sets Using One-to-One Match Merge
• Use a match merge when you have two data sets with related data and you want to combine them.
• If you merge two data sets and they have variables with the same names (besides the BY variables), then variables from the second data set will overwrite any variables having the same names in the first data set.
• All observations from the old data sets will be included in the new data set, whether they have a match or not.
Match Merge Example
Proc Sort Data = Rat;
    BY RatID Date;
Run;
Proc Sort Data = Rat2;
    BY RatID Date;
Run;
DATA BigRat;
    MERGE Rat Rat2;
    BY RatID Date;
Run;
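To see which observations matched, one option is the IN= data set option. The sketch below builds small hypothetical versions of Rat and Rat2 (the values, and the variables `weight` and `dose`, are made up) and records where each merged observation came from.

```sas
* Hypothetical sample data with plain numbers standing in for dates;
data rat;
    input ratid date weight;
    datalines;
1 1 250
1 2 255
2 1 300
;
run;

data rat2;
    input ratid date dose;
    datalines;
1 1 5
2 1 10
2 2 10
;
run;

proc sort data=rat;
    by ratid date;
run;

proc sort data=rat2;
    by ratid date;
run;

data bigrat;
    merge rat (in=inrat) rat2 (in=inrat2);
    by ratid date;
    * Each flag is 1 if that data set contributed to the observation and 0 otherwise;
    from_rat = inrat;
    from_rat2 = inrat2;
run;

proc print data=bigrat;
run;
```

IN= variables are temporary, which is why they are copied into `from_rat` and `from_rat2`. Observations found in only one input still appear in `bigrat`, with missing values for the variables that come from the other data set.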
SAS Functions
• SAS provides built-in functions that simplify some complex programming problems.
• They usually perform arithmetic or mathematical calculations.
Syntax of a function used in an expression: NewVar = FunctionName (VariableName);
Common Functions
Log ( ); Log10 ( ); Sin ( ); Cos ( ); Tan ( ); Int ( );
SQRT ( ); Weekday ( ); MDY ( , , ); Round (x, 1); Mean ( ); RANUNI ( );
Put ( ); Input ( ); Lag ( ); Dif ( ); N ( ); NMISS ( );
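A short sketch using a few of these functions in assignment statements. The data set `funcs` and all variable names and values are hypothetical.

```sas
data funcs;
    input month day year score1 score2 score3;
    * MDY builds a SAS date value from month, day and year;
    visdate = mdy(month, day, year);
    * WEEKDAY returns 1 for Sunday through 7 for Saturday;
    dow = weekday(visdate);
    * MEAN, N and NMISS work across their arguments and MEAN skips missing values;
    avg = mean(score1, score2, score3);
    n_scores = n(score1, score2, score3);
    n_miss = nmiss(score1, score2, score3);
    * ROUND with a second argument of 1 rounds to the nearest whole number;
    avg_r = round(avg, 1);
    root = sqrt(score1);
    format visdate date9.;
    datalines;
6 15 2005 4 9 .
6 16 2005 16 25 36
;
run;

proc print data=funcs;
run;
```

Because MEAN ignores missing arguments, the first observation is averaged over two scores rather than three. Comparing `n_scores` and `n_miss` next to `avg` makes that visible.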
Part III of Lecture 3 Debugging
Debugging? “If debugging is the process of removing bugs, then programming must be the process of putting them in.” –From some strange, but insightful website
Syntactic Errors vs. Logic Errors
• We will focus mainly on syntax errors; however, it is also possible for SAS to calculate a new variable using syntactically correct code that results in inaccurate calculations, i.e., a logic error.
• For this reason, it is always wise to check values of a new variable against values of the original variable used in the calculation.
READ THE LOG WINDOW!!
• I know that I say this all of the time, and that is because too many people begin skipping this step and then can’t figure out why their program isn’t working.
• If you have an ERROR message, look at that line as well as a few of the lines above it.
• Don’t ignore Warnings and Notes in the log simply because your program seems to have run. They could indicate a serious error that just did not happen to be syntactically incorrect. In that case, check your logic or add some Proc Prints to understand what is going on inside your program.
Debugging: The Basics
The better you can read and understand your program, the easier it is to find the problem(s).
• Put only one SAS statement on a line.
• Use indentation to show the different parts of the program within DATA and PROC steps.
• Use comment statements GENEROUSLY to document your code.
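One way to check a new variable against the original is a cross-tabulation. The sketch below uses a hypothetical data set `check` with a deliberate logic error: the code is syntactically fine, but the missing age ends up in the 'Under 40' group.

```sas
* Hypothetical sample data with a deliberate logic error in the grouping;
data check;
    input age;
    length agegroup $ 8;
    if age < 40 then agegroup = 'Under 40';
    else agegroup = '40+';
    datalines;
25
.
52
;
run;

* List every combination of original and new values including missing;
proc freq data=check;
    tables age*agegroup / missing list;
run;
```

The log shows no error for this step. The problem only shows up in the PROC FREQ listing, where the missing age is paired with 'Under 40'.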
Know your colors
• Make sure that you are using the enhanced editor and know what code is generally what color (e.g., comments are green).
Scroll Up
• Remember that your output and log windows are scrolled to the very bottom of the screen. Scroll ALL the way up and check the whole thing.
• Look for common mistakes first (semicolons and spelling errors!).
• Make sure you haven’t typed an ‘O’ where you want a ‘0’ or vice versa. This can cause SAS to think that your numeric or character variable should be changed to the other variable type. SAS may do this automatically when you don’t want it done!
What is wrong here?
*Read the data file ToadJump.dat using a list input
Data toads;
    Infile 'c:\MyRawData\ToadJump.dat';
    Input ToadName$ Weight Jump1 Jump2 Jump3;
Run;
Here is the log window…
___________________________________________________________
1    *Read the data file ToadJump.dat using the list input
2    Data toads;
3    Infile 'c:\MyRawData\ToadJump.dat';
     ------
     180
ERROR 180-322: Statement is not valid or it is used out of proper order.
4    Input ToadName$ Weight Jump1 Jump2 Jump3;
     -------
     180
ERROR 180-322: Statement is not valid or it is used out of proper order.
5    Run;
___________________________________________________________
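Both errors trace back to the first line: the comment has no closing semicolon, so SAS treats everything up to the semicolon after "Data toads" as part of the comment. With no DATA statement in effect, INFILE and INPUT are out of order. A corrected version is sketched below. It assumes the raw data file exists at the path shown on the slide.

```sas
* Read the data file ToadJump.dat using list input;
data toads;
    infile 'c:\MyRawData\ToadJump.dat';
    input ToadName $ Weight Jump1 Jump2 Jump3;
run;
```

This is why it pays to look a few lines above the line that the ERROR message points to. Here the log flags lines 3 and 4, but the mistake is on line 1.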
SAS is still running… • You need to check the message above the menu on the Log window. • If it says, as in this example, "DATA STEP running", then steps must be taken to stop the program from running. • Even though SAS will continue to process other programs, results of such programs may be inaccurate, without any indication of syntax problems showing up in the log.
SAS is still running… (cont.)
• Several suggestions to stop the program are:
  • Submit the following line: '; run;
  • Submit the following line: *))%*'''))*/;
  • If all else fails, exit SAS entirely (making sure that the revised program has been saved) and restart it.
TPA Data Practice
• Go to the website and download the TPA sample data set. Save it in a place that you can successfully write the Libname to point to!
• Either find a SAS program that you can change to fit the current problem or begin writing the code with a blank Editor page.
• The Goal: See how much of Tables 1 and 2 from the published paper you can reproduce with your sample.
• We will practice Proc Boxplot together.
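As a warm-up for Proc Boxplot, here is a minimal sketch on made-up data. The data set `demo` and the variables `group` and `value` are hypothetical and are not taken from the TPA sample. Substitute your own data set and variable names.

```sas
* Hypothetical sample data and not the TPA data set;
data demo;
    input group $ value;
    datalines;
A 4
A 7
A 5
A 9
B 10
B 12
B 8
B 15
;
run;

* PROC BOXPLOT expects the data to be sorted by the group variable;
proc sort data=demo;
    by group;
run;

* One box per level of group showing the distribution of value;
proc boxplot data=demo;
    plot value*group;
run;
```

In the PLOT statement the analysis variable comes before the asterisk and the grouping variable comes after it.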