
Chapter 1: Introduction


sdickson

Presentation Transcript


  1. Chapter 1: Introduction

  2. Chapter 1: Introduction

  3. Objectives • List the tasks in the SAS Programming 3 course. • Explain the naming convention that is used for the course files. • Compare the three levels of exercises that are used in the course. • Describe, at a high level, how data is used and stored at Orion Star Sports & Outdoors. • Navigate to the Help facility.

  4. Tasks in the SAS Programming 3 Course • The course topics include techniques for the following data management tasks: • compressing SAS data sets • creating indexes for a quick retrieval of subsets • performing table lookups using arrays, hash objects, or formats • combining data by merging, using the SQL procedure, or using multiple SET statements • combining summary and detail data • sorting and grouping data • developing a program quickly

  5. Resource Utilization • As programmers, you want to perform these tasks as efficiently as possible and optimize the use of the following resources: • programmer time • I/O • CPU • memory • data storage space • network bandwidth

  6. Business Scenarios • The business scenarios are opportunities to compare multiple techniques for performing the tasks. • For example: • Task: Table Lookups • Possible Techniques: • DATA step MERGE statement • PROC SQL joins • Formats in PUT functions or in FORMAT statements • DATA step arrays • DATA step hash objects
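As a rough illustration of one technique from the list above (a format used in a PUT function), the following sketch looks up a label for a code. The format name $ctry, the codes, and the labels are invented for this example and are not taken from the course data.

```sas
/* Sketch: table lookup with a user-defined format.              */
/* The format $ctry and its values are hypothetical.              */
proc format;
   value $ctry 'AU'  = 'Australia'
               'US'  = 'United States'
               other = 'Unknown';
run;

data _null_;
   length code $ 2 name $ 20;
   do code = 'AU', 'US', 'ZZ';
      name = put(code, $ctry.);   /* the lookup happens here */
      put code= name=;
   end;
run;
```

The same lookup could be written with a MERGE, a PROC SQL join, an array, or a hash object; which one uses the fewest resources depends on the data.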

  7. 1.01 Multiple Answer Poll • What type(s) of SAS programs do you write? • Data manipulation with the DATA step • Data analysis with procedures • Report writing • A combination of the above • SAS training only; no programs written • Other

  8. Filename Conventions • Pattern: p304d01x, read as course ID (p3), chapter # (04), type (d), item # (01), and placeholder (x). • Sample filenames: p304a01, p304a02, p304a02s, p304d01, p304d02, p304e01, p304e02, p304s01, p304s02 • Example: The SAS Programming 3 course ID is p3, so p304d01 = SAS Programming 3, Chapter 4, Demo 1.
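For illustration only, the convention can be unpacked with a short DATA step. The variable names are made up for this sketch, and the column positions assume a two-character course ID followed by a two-digit chapter, a one-letter type, a two-digit item number, and an optional one-character placeholder, as in p304d01.

```sas
/* Sketch: split a course filename into its parts.                */
/* All variable names here are hypothetical.                       */
data _null_;
   length filename $ 8 course_id $ 2 type $ 1 placeholder $ 1;
   filename    = 'p304d01';
   course_id   = substr(filename, 1, 2);             /* p3            */
   chapter     = input(substr(filename, 3, 2), 2.);  /* chapter #     */
   type        = substr(filename, 5, 1);             /* d = demo      */
   item        = input(substr(filename, 6, 2), 2.);  /* item #        */
   placeholder = substr(filename, 8, 1);             /* blank if none */
   put course_id= chapter= type= item= placeholder=;
run;
```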

  9. Three Levels of Exercises  You are not expected to complete all of the exercises in the time allotted. Choose the exercise or exercises that are at the level with which you are most comfortable.

  10. Orion Star Sports & Outdoors Orion Star Sports & Outdoors is a fictitious global sports and outdoors retailer with traditional stores, an online store, and a large catalog business. The corporate headquarters is located in the United States with offices and stores in many countries throughout the world. Orion Star has about 1,000 employees and 90,000 customers, processes approximately 150,000 orders annually, and purchases products from 64 suppliers.

  11. Orion Star Data As is the case with most organizations, Orion Star has a large amount of data about its customers, suppliers, products, and employees. Much of this information is stored in transactional systems in various formats. Using applications and processes such as SAS Data Integration Studio, this transactional information was extracted, transformed, and loaded into a data warehouse. Data marts were created to meet the needs of specific departments such as Marketing.

  12. The SAS Help Facility

  13. 1.02 Quiz • Start your SAS session. • Open the Help facility. • Determine the path to use to obtain information about the SAS component objects.

  14. 1.02 Quiz – Correct Answer • Determine the path to use to obtain information about the SAS component objects. • Information relevant to this course can be found by following this path in the SAS Help facility: Contents tab > SAS Products > Base SAS > SAS 9.2 Language Reference: Dictionary > Dictionary of Component Object Language Elements

  15. SAS OnlineDoc • You can also obtain information from SAS OnlineDoc. • Information relevant to this course can be found by following this path in SAS OnlineDoc: Contents tab > Products Documentation A-Z > Base SAS > SAS 9.2 Language Reference: Dictionary > Dictionary of Component Object Language Elements

  16. Chapter 1: Introduction

  17. Objectives • Identify the resources used by a SAS program. • Report computer resource usage using SAS system options. • Interpret resource usage statistics in your operating environment. • Benchmark resource usage.

  18. Running a SAS Program • What resources are required to run a SAS program? • The programmer must perform the following tasks: • determine program specifications • write the program • test the program • execute the program • maintain the program

  19. Running a SAS Program • The computer must perform the following actions: • load the required SAS software into memory • compile the program • read the data • execute the compiled program • store output data files • store output reports

  20. What Resources Are Used? • CPU • programmer time • I/O • network bandwidth • memory • data storage space

  21. 1.03 Multiple Answer Poll • Which of the following resources do you need to conserve? • CPU • I/O • Memory • Data storage space • Network bandwidth • Your time

  22. Understanding Efficiency Trade-offs When you decrease the use of one resource, the use of other resources might increase. Resource usage is dependent on your data. A specific technique might be more efficient with one data set and less efficient with another.

  23. Understanding Efficiency Trade-offs (Data Space and CPU) Less data space often implies more CPU time. Decreasing the size of a SAS data set can result in an increase in CPU usage.
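One way to see the trade-off between data space and CPU is to store the same data with and without compression and compare the log statistics. This is only a sketch: the data set names and the generated data are hypothetical, and the size and CPU figures depend on your data and site.

```sas
/* Sketch: same data stored uncompressed and compressed.          */
/* WORK.PLAIN and WORK.SQUEEZED are hypothetical names.            */
options fullstimer;

data work.plain;
   length text $ 200;             /* mostly blank, so it compresses well */
   do i = 1 to 100000;
      text = 'mostly blank';
      output;
   end;
run;

data work.squeezed(compress=yes); /* smaller file, extra CPU to compress */
   set work.plain;
run;

options nofullstimer;
```

The log for the second step includes a note about the change in data set size, and the CPU time of the two steps can be compared directly.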

  24. Understanding Efficiency Trade-offs (I/O and Memory) Less I/O often implies more memory. Decreasing the number of I/O operations comes at the expense of increased memory usage.
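The SASFILE statement is one example of trading memory for I/O: it holds a data set in memory so that later steps do not reread it from disk. The sketch below assumes that enough memory is available; WORK.LOOKUP and its variables are hypothetical.

```sas
/* Sketch: hold a data set in memory while several steps read it. */
/* WORK.LOOKUP and its variables are hypothetical.                 */
data work.lookup;
   do id = 1 to 100000;
      value = id * 2;
      output;
   end;
run;

sasfile work.lookup load;       /* read the file into memory once  */

proc means data=work.lookup;    /* this step reads from memory     */
   var value;
run;

proc print data=work.lookup(obs=5);
run;

sasfile work.lookup close;      /* release the memory              */
```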

  25. Deciding What Is Important for Efficiency • Your Programs • Your Site • Your Data

  26. Understanding Efficiency at Your Site • Hardware • Operating Environment • System Load • SAS Environment

  27. 1.04 Multiple Choice Poll • This class uses SAS 9.2. • What is the latest version of SAS that you are running? • SAS 8.2 • SAS 9.1 • SAS 9.2 • Other

  28. Knowing How Your Program Will Be Used • The importance of efficiency increases with the following: • the complexity of the program and/or the size of the files being processed • the number of times that the program will be executed

  29. Knowing Your Data

  30. 1.05 Multiple Answer Poll • What type(s) of data do you use? • SAS data sets • External files • Data from a relational database – for example, Oracle, Teradata, or SQL Server • Excel spreadsheets • OLAP cubes • Information maps • Other

  31. Considering Trade-Offs • In this class, many tasks are performed using one or more techniques. • To decide which technique is most efficient for a given task, benchmark, or measure and compare, the resource usage of each technique. • You should benchmark with the actual data to determine which technique is the most efficient. The effectiveness of any efficiency technique depends greatly on the data with which you use the technique.

  32. Running Benchmarks: Guidelines • To benchmark your programming techniques, do the following: • Turn on the appropriate options to report resource usage. • Test each technique in a separate SAS session. • Test only one technique or change at a time, with as little additional code as possible. • Run your tests under the conditions that your final program will use (for example, batch execution, large data sets, and so on).

  33. Running Benchmarks: Guidelines (continued) • Run each program several times and base your conclusions on averages, not on a single execution. (This is more critical when you benchmark elapsed time.) • Exclude outliers from the analysis because that data might lead you to tune your program to run less efficiently than it should. • Turn off the options that report resource usage after testing is finished, because they consume resources. • Note: In a multi-user environment, other computer activities might affect the running of your program.
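The averaging step in these guidelines could be done by keying the recorded times into a small data set and summarizing them. This is a sketch only: the data set name, the variable names, and every timing value below are made-up placeholders, not measurements.

```sas
/* Sketch: summarize CPU times recorded from repeated runs.       */
/* All values below are placeholders, not real measurements.       */
data work.timings;
   input technique $ run cpu_seconds;
   datalines;
a 1 0.98
a 2 1.01
a 3 0.97
b 1 0.75
b 2 0.78
b 3 0.74
;
run;

proc means data=work.timings mean min max maxdec=2;
   class technique;
   var cpu_seconds;
run;
```

Showing the minimum and maximum next to the mean makes it easier to spot a run that was affected by other activity on the machine.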

  34. 1.06 Multiple Choice Poll • Which of the following SAS programs should be benchmarked? • A report that shows all the customers in the United Kingdom in March 2006 • A report that calculates trends in sales at the end of every day for every department • A report showing the projected total cost of a 5% cost-of-living increase in employee salaries for a Human Resources project conducted on January 1, 2007 • A yearly report that calculates the average sales of a line of apparel for the clothing manager

  35. 1.06 Multiple Choice Poll – Correct Answer • Which of the following SAS programs should be benchmarked? • A report that shows all the customers in the United Kingdom in March 2006 • A report that calculates trends in sales at the end of every day for every department (correct answer) • A report showing the projected total cost of a 5% cost-of-living increase in employee salaries for a Human Resources project conducted on January 1, 2007 • A yearly report that calculates the average sales of a line of apparel for the clothing manager

  36. Tracking Resource Usage • SAS options: • STIMER • FULLSTIMER • STATS (z/OS only) • MEMRPT (z/OS only)

  37. Tracking Resources with SAS Options • Windows, UNIX: OPTIONS STIMER | NOSTIMER; OPTIONS NOFULLSTIMER | FULLSTIMER; • z/OS: STIMER | NOSTIMER (invocation option only); OPTIONS STATS | NOSTATS; OPTIONS MEMRPT | NOMEMRPT; OPTIONS NOFULLSTIMER | FULLSTIMER;
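A minimal sketch of how these options are typically used: turn FULLSTIMER on before the step that you want to measure, and turn it off afterward. The DATA step in the middle is a hypothetical stand-in for the code being measured.

```sas
/* Sketch: report detailed resource usage for one step.           */
options fullstimer;      /* start writing detailed statistics     */

data _null_;             /* hypothetical step to be measured      */
   do i = 1 to 1000000;
      x = sqrt(i);
   end;
run;

options nofullstimer;    /* stop reporting after testing          */
```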

  38. Business Scenario • You should benchmark to determine the most efficient technique for creating a new variable based on a condition. • The following methods can be used: • IF-THEN with an assignment statement • IF-THEN/ELSE with an assignment statement • SELECT/WHEN with an assignment statement
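The second and third methods might look like the following sketch. It is not the content of the course files p301a01b or p301a01c, which the transcript does not include. It reuses the variable names and value ranges from the p301a01a log but runs fewer iterations and uses an arbitrary constant seed.

```sas
/* Sketch: IF-THEN/ELSE, which stops testing after the first match */
data _null_;
   length var $ 30;
   do x = 1 to 1000000;
      var1 = 10000000 * ranuni(12345);
      if var1 > 1000000       then var = 'Greater than 1,000,000';
      else if var1 >= 500000  then var = 'Between 500,000 and 1,000,000';
      else if var1 >= 100000  then var = 'Between 100,000 and 500,000';
      else if var1 >= 10000   then var = 'Between 10,000 and 100,000';
      else if var1 >= 1000    then var = 'Between 1,000 and 10,000';
      else                         var = 'Less than 1,000';
   end;
run;

/* Sketch: SELECT/WHEN, which also stops at the first true WHEN    */
data _null_;
   length var $ 30;
   do x = 1 to 1000000;
      var1 = 10000000 * ranuni(12345);
      select;
         when (var1 > 1000000)  var = 'Greater than 1,000,000';
         when (var1 >= 500000)  var = 'Between 500,000 and 1,000,000';
         when (var1 >= 100000)  var = 'Between 100,000 and 500,000';
         when (var1 >= 10000)   var = 'Between 10,000 and 100,000';
         when (var1 >= 1000)    var = 'Between 1,000 and 10,000';
         otherwise              var = 'Less than 1,000';
      end;
   end;
run;
```

With plain IF-THEN statements, every condition is evaluated on every iteration, which is why the three methods can differ in CPU time.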

  39. 1.07 Quiz • Open and submit p301a01a. Record the user CPU: ____________ • Exit SAS. Start SAS. Open and submit p301a01b. Record the user CPU: ____________ • Exit SAS. Start SAS. Open and submit p301a01c. Record the user CPU: ____________ • Which technique is most efficient? • Note: In z/OS, record the CPU time.

  40. Sample Windows Log (p301a01a, Partial SAS Log)

        5    options fullstimer;
        6    data _null_;
        7       length var $ 30;
        8       retain var2-var50 0 var51-var100 'ABC';
        9       do x=1 to 100000000;
        10         var1=10000000*ranuni(x);
        11         if var1>1000000 then var='Greater than 1,000,000';
        12         if 500000<=var1<=1000000 then var='Between 500,000 and 1,000,000';
        13         if 100000<=var1<500000 then var='Between 100,000 and 500,000';
        14         if 10000<=var1<100000 then var='Between 10,000 and 100,000';
        15         if 1000<=var1<10000 then var='Between 1,000 and 10,000';
        16         if var1<1000 then var='Less than 1,000';
        17      end;
        18   run;

        NOTE: DATA statement used (Total process time):
              real time           1.26 seconds
              user cpu time       0.98 seconds
              system cpu time     0.04 seconds
              Memory              278k
              OS Memory           4976k
              Timestamp           6/29/2010 12:39:21 PM

  41. Sample UNIX Log (p301a01a, Partial SAS Log)

        1    options fullstimer;
        2    data _null_;
        3       length var $30;
        4       retain var2-var50 0 var51-var100 'ABC';
        5       do x=1 to 10000000;
        6          var1=10000000*ranuni(x);
        7          if var1>1000000 then var='Greater than 1,000,000';
        8          if 500000<=var1<=1000000 then var='Between 500,000 and 1,000,000';
        9          if 100000<=var1<500000 then var='Between 100,000 and 500,000';
        10         if 10000<=var1<100000 then var='Between 10,000 and 100,000';
        11         if 1000<=var1<10000 then var='Between 1,000 and 10,000';
        12         if var1<1000 then var='Less than 1,000';
        13      end;
        14   run;

        NOTE: DATA statement used (Total process time):
              real time                       6.62 seconds
              user cpu time                   5.14 seconds
              system cpu time                 0.01 seconds
              Memory                          526k
              OS Memory                       5680k
              Timestamp                       6/29/2010 11:55:32 AM
              Page Faults                     82
              Page Reclaims                   0
              Page Swaps                      0
              Voluntary Context Switches      91
              Involuntary Context Switches    48
              Block Input Operations          91
              Block Output Operations         0

  42. Sample z/OS Log p301a01a Partial SAS Log
