330 likes | 430 Views
Displaying Query Results. Group 8- Last but not the Least Arturo Cantillep/Greg Lynch/Nora Troy-Shaw/Yanxin Zhao. Displaying Query Results. .1 Introduction Presenting Data/What the Boss Wants .2 Its Power Ordering Data/Producing an Ordered Report Enhancing Query Output
E N D
Displaying Query Results • Group 8- Last but not the Least • Arturo Cantillep/Greg Lynch/Nora Troy-Shaw/Yanxin Zhao
Displaying Query Results .1 Introduction Presenting Data/What the Boss Wants .2 Its Power Ordering Data/Producing an Ordered Report Enhancing Query Output Usage (Business Scenario) .3 Summarizing Data/Summary Functions .4 Count Functions .5 Data Grouping Groups and Subgroups .6 using the where clause .7 using the having clause .8 Using the find function .9 using boolean expression .10 Summary
Displaying Query Results • 0.1Introduction 0-3 • 0.2A Section Title 0-4 • <Type title of demo here.> 0-5 • Exercises 0-6 • 0.3Chapter Summary 0-7 • 0.4Solutions 0-8 • Solutions to Exercises 0-8 • Solutions to Student Activities (Polls/Quizzes) 0-8 • Presenting Data .1 Introduction • SQL – a one stop shop alternative for the following: • Data Step – it can create new table,new variables within a table , add insert data • Proc Report – do summary, count,mean, in a nice format too… • Proc Print – present the result to the user
.1 Presenting Data Objectives • Display a query’s result in a specified order • Use SAS formats, labels, and titles to enhance the appearance and useability of a query’s output Business Scenario ( a very realistic one) • You had been instructed by by your boss to cut company costs by producing a list of old, highly paid sales staff and who had served the longest in the company
.1 What the Boss Wants • Data on the service years of the employees ranked (descending) • Data on the Salary (descending) of each employee • Data on the age of each employee (descending) Sample Report • Mr. X 25 yrs. service yrs $50,000 salary 40yrs.old age of employee….
.2 Ordering Data Use the ORDER BY clause to sort query results in a specific order Descending order (by following the column name with the DESC keyword) Order the query results by specifying the following: Any column name from any table in the from clause, even if the column is not on the select list A column name or number representing the position of an item in the select list an expression a combination of any of the above individual items separated by commas
.2 Ordering Data (Our Boss wish list…) From the Sales staff database, list the employee ID, employee name,their years of tenure with the company, their salary and their age Example1: Arrange it by descending years of tenure (Chopping block 1) PROC SQL; select employee_id, employee_name, int(yrdif(emp_Hire_date,today(),"ACTUAL")) as tenure, Salary,int(yrdif(Birth_date,today(),'ACTUAL')) as Age from arturo.salesstaff order by tenure desc; quit;
.2 Ordering Data PARTIAL PROC SQL OUTPUT …. The SAS System 17:38 Wednesday, May 26, 2010 48 Employee ID Employee_Name tenure Salary Age --------------------------------------------------------------------------------------------_______ 120172 Comber, Edwin 36 $28,345 66 • 120174 Simms, Doungkamol 36 $26,850 66 • 121086 Plybon, John-Michael 36 $26,820 65 • 120151 Phaiyakounh, Julianna 36 $26,520 65 • 121035 Blackley, James 36 $26,460 66 • 120166 Nowd, Fadi 36 $30,660 65 • 121060 Spofford, Elizabeth 36 $28,800 65 • 121073 Court, Donald 36 $27,100 61 • 121075 Sugg, Kasha 36 $28,395 65 • 120167 Tilley, Kimiko 36 $25,185 56 • 121138 Tolley, Hershell 36 $27,265 61 • 120154 Hayawardhana, Caterina 36 $30,490 65 • ** - Tenure is in descending order
.2 Producing an Ordered Report • Remember to sort the output in descending order of tenure and then by Age Proc SQL; select employee_id, employee_name, int(yrdif(emp_Hire_date,today(),"ACTUAL")) as tenure, Salary, int(yrdif(Birth_date,today(),'ACTUAL')) as Age from arturo.salesstaff where salary gt 30000 order by tenure desc, salary desc;
Producing an Ordered Report • The Output Sample The SAS System 17:38 Wednesday, May 26, 2010 53 Employee Annual Employee ID Employee_Name tenure Salary Age --------------------------------------------------------------------------------------- 120166 Nowd, Fadi 36 $30,660 65 120154 Hayawardhana, Caterina 36 $30,490 65 121081 Knudson, Susie 34 $30,235 61 120125 Hofmeister, Fong 31 $32,040 55 120129 Roebuck, Alvin 24 $30,070 45 120159 Phoumirath, Lynelle 23 $30,765 46 120158 Pilgrim, Daniel 22 $36,605 45 121080 Chinnis, Kumar 22 $32,235 51 121021 Farren, Priscilla 16 $32,985 35
.2a Enhancing Query Output You can use SAS formats and tables to customize PROC SQL Output. In the SELECT list, after the column name, but before the commas that separate the columns, you can include the following: • Text in quotation marks (ANSI) or the label = Column modifier (SAS enhancement) to alter the column heading ie use labels instead of variable names • The FORMAT = column modifier to alter the appearance of the values in that column ie. Formatting cash amounts with dollar sign and commas PROC SQL: procsql; select employee_id label= "Employee Identifier" , employee_name label= "Employee Name" , int(yrdif(emp_Hire_date,today(),"ACTUAL")) as tenure label="Years of Service", Salary label="Income" format = dollar12., int(yrdif(Birth_date,today(),'ACTUAL')) as Age label="Employee Age" from learn.salesstaff where salary gt 30000 order by tenure desc, salary desc; quit;
Enhancing Query Output PARTIAL PROC SQL OUTPUT …. The SAS System 13:22 Thursday, May 27, 2010 2 Employee Years of Employee Identifier Employee Name Service Income Age ------------------------------------------------------------------------------ 120166 Nowd, Fadi 36 $30,660 65 120154 Hayawardhana, Caterina 36 $30,490 65 121081 Knudson, Susie 34 $30,235 61 120125 Hofmeister, Fong 31 $32,040 55 120129 Roebuck, Alvin 24 $30,070 45 ** - Tenure is in descending order
.2b Business Scenario Produce a report of salary listing of active employees who has wages above $30000 + their 7% bonus. The requestor provided this sketch of the desired report. Proposed Annual Savings Employee Number 9999 Salary + bonus: $32012.32 Additional Techniques to use: • define a new column containing the same constant character value for every row • Using SAS titles and footnotes • Use a combination of these techniques to produce the proposed Annual Savings Report
.2 Enhancing Query Output The code: PROC SQL; title 'Annual Savings Plan 2011'; procsql; select employee_id label="Employee ID", Salary*1.07 as NewSalary label="Salary 2011" format=dollar12.2 from arturo.salesstaff where Salary > 30000 and emp_term_date < 1 order by NewSalary desc; quit; • TITLE and FOOTNOTE statements must precede the SELECT statement. • PROC SQL has an option, DQUOTE=, which specifies whether PROC SQL treats values within double quotation marks (" ") as variables or strings. • With the default, DQUOTE=SAS, values within double quotation marks are treated as text strings. • With DQUOTE=ANSI, PROC SQL treats a quoted value as a variable. This feature enables you to use reserved words such as AS, JOIN, GROUP, or DBMS names and other names that are not normally permissible in SAS, such as table names, column names, or aliases. The quoted value can contain any character. • Values in single quotation marks are always treated as text strings.
Enhancing Query Output PARTIAL PROC SQL OUTPUT …. Annual Savings Plan 2011 14:51 Thursday, May 27, 2010 13 Employee ID Salary 2011 ------------------------------------ 120158 $39,167.35 121063 $38,509.30 121021 $35,293.95 121099 $35,015.75 120135 $34,764.30 121085 $34,491.45 121080 $34,491.45 121022 $34,464.70 120166 $34,446.51
We need to find the total salary for all employees in the company. /*Find total salary for all active employees*/ procsql; Title 'Total Salary For Active Employees'; select "TOTAL:" , sum(Salary) format =comma12.2 from greg.salesstaff where Emp_Term_Date is missing; quit;
.4 The COUNT Function The COUNT function returns the number of rows returned by a query. General form of the COUNT function: COUNT (*| argument ) argument can be the following: * (asterisk), which counts all rows n a column name, which counts the number n of non - missing values in that column 33 33 Total Salary For Active Employees 23 08:24 Saturday, May 29, 2010 ------------------------------- TOTAL: 3,512,777.25
Summary Functions Example: Determine the total number of current employees. proc sql; select count(*) as Count from learn.salestaff where emp_term_date is missing ; quit; PROC SQL Output Count ƒƒƒƒƒƒƒƒ 308 s103d07 34 34
.5 Grouping Data We can produce output calculated by group using SQL. Here, we calculate the average salary by gender: odsrtffile="grouping1.rtf"; procsql; title"Grouping by Gender"; select Gender, avg(Salary) as Average_Salary format=dollar8.from learn.salesstaff groupby gender; odsrtfclose; run;
Groups and Subgroups odsrtffile="grouping2.rtf"; procsql; title"Counts by Job Title"; select Job_Title as Title, Gender, count(*) as Counts from staff groupby Job_Title, gender; odsrtfclose; run; We can produce output calculated by group and subgroup. Here, we calculate counts for each gender by job title:
.6 Using the WHERE Clause odsrtffile="grouping3.rtf"; procsql; title"Average Salary of Women Employees"; select job_title as Title, avg(salary) format=dollar8.as Average from staff where gender="F" groupby job_title; odsrtfclose; run; We can create output specified within particular restraints by using the WHERE clause. Here, we have the average salaries for women employees by job title:
.7 Using the HAVING Clause odsrtffile="grouping4.rtf"; procsql; title"Count by Title of Women Employees"; select job_title as Title, count(*) as Women from staff groupby Job_title, gender having gender eq "F" orderby Count desc; odsrtfclose; run; Alternatively, we can use the HAVING clause to specify which data we want to display. Here, we count the number of women employees by job title.
.8 Using the FIND Function odsrtffile="grouping5.rtf"; procsql; title"Summary Information of Women Employees"; select Job_title, count(find(gender, "F", "I")>0) as Women, avg(salary) format=dollar8.as Average from staff groupby Job_title, gender having gender="F"; odsrtfclose; run; The FIND function is a useful tool for locating data and counting data. Here, we use the FIND function to count the number of women employees and display salaries by job title.
The find function • The FIND function returns the starting position of a substring within a string. NOTICE: the string must be character value. • The general form of FIND function is: FIND(string,substring<,modifier(s)><,startpos>) STRING --- constant, variable, or expression to be searched. SUBSTRING --- constant, variable, or expression sought within the string. MODIFIER(S) --- i=ignore case, t=trim trailing blanks. STARTPOS --- an integer specifying the start position and direction of the search.
The find function • EXP: find the starting position of the substring F in the character variable Gender. procsql; select gender, find(Gender,"F","t") "female_employee" from learn.Salesstaff ; quit; Because “F” is in the first position of the first substring, so the the value returned by FIND function is “1”; and “F” is not is the second substring , so the returned value is “0”
.9 Using Boolean Expressions • Boolean expressions evaluate to TRUE(1) or FALSE(0). • They are used in this SELECT list to distinguish rows that have “F” in the Gender column. procsql; select Job_Title,Gender, (find(Gender,"F","i")>0) "female_employee" from learn.Salesstaff ; quit; The boolean expression will produce the value 1 when Gender contains “F” and 0 when is does not.
Using Boolean Expressions Female_Employee to Male_Employee Partial output Employee female_ Employee Job Title Gender employee ========================================== Sales Rep. I F 1 Sales Rep. I F 1 Sales Rep. II M 0 Sales Rep. III F 1 Sales Rep. I F 1 Sales Rep. II F 1 Sales Rep. II M 0 Sales Rep. II M 0 Sales Rep. IV M 0 Sales Rep. I M 0 Sales Rep. II M 0 Sales Rep. II M 0 Sales Rep. II F 1
Using Boolean Expressions Use the counted value to calculate the percentage. procsql; title "Female_Employee to Male_Employee Ratios"; select job_title, sum((find(Gender,"F","i")>0)) as Female_employee, sum((find(Gender,"F","i")=0)) as Male_employee, calculated Female_employee/calculated Male_employee "F/M Ratio" format=percent8.1 from learn.salesstaff group by job_title ; quit;
Using Boolean Expressions Female_Employee to Male_Employee Ratios Female_ Male_ F/M Employee Job Title employee employee Ratio ================================================================= Sales Rep. I 21 42 50.0% Sales Rep. II 25 25 100.0% Sales Rep. III 15 19 78.9% Sales Rep. IV 7 9 77.8%
Summary • In summary, the SQL procedure provides the following capabilities: • ordering of report • enhanced report production through labels and formats • summarization of data • use of counts and data grouping • use of selection criteria like where and having • use of find function and boolean expressions These characteristics makes the analysis of data a breeze…. thanks to PROC SQL.