200 likes | 704 Views
Correlated Subqueries in PROC SQL. Barry Campbell Reproductive Care Program of Nova Scotia. Overview. Quick review of PROC SQL Subqueries and examples Correlated subqueries and examples. Quick review of PROC SQL. SQL: Structured Query Language Talk to relational databases
E N D
Correlated Subqueriesin PROC SQL Barry Campbell Reproductive Care Program of Nova Scotia
Overview • Quick review of PROC SQL • Subqueries and examples • Correlated subqueries and examples
Quick review of PROC SQL • SQL: Structured Query Language • Talk to relational databases • PROC SQL is SAS’s implementation SQL • Integrates elements of SAS language: e.g., functions, formats, labels • Alternative to Data Step when you want to think relationally
What is a subquery? • SELECT statement embedded in a SQL query instead of a column name, table name or expression • Use in SELECT, FROM, WHERE, or HAVING clauses • Scalar or vector depending on context • Can refer to same or different table • AKA “Nested” or “Inner” queries
Subquery in SELECT clause SELECT student_id, (SELECT COUNT(*) FROM courses WHERE student_id = S.student_id) num_courses, (SELECT MAX(grade) FROM grades WHERE student_id = S.student_id) best_grade FROM students S Pull number of courses and best grade from other tables. Could also be done with a JOIN.
Subquery in FROM clause Create inline table C to join with A and B using IDs A (admits) B (patients) C (DISTINCT doctors)
Subquery in WHERE clause Systolic blood pressure
What is a correlated subquery • Results of inner query constrained by outer • Uses a common variable or key to correlate inner and outer • Typically in the WHERE clause
What’s it good for? • Selection depends on aggregate results from the same table • Selection depends on data about same entity in a different table • Combine multiple steps into one • Pass-through queries (execute on DBMS)
What’s it good for? Examples • Improving grades: • Show students whose average grade improved at least 10% over last year’s • Flooding events: • List cities and dates on which rainfall was at least 10x the city’s daily average
Correlated Subquery Example SELECT student_id, name FROM students S WHERE (SELECT MEAN(grade) FROM grades WHERE student_id = S.student_id AND year = 2011) > (SELECT MEAN(grade) FROM grades WHERE student_id = S.student_id AND year = 2010) * 1.1 Who improved their average by at least 10%?
Correlated Subquery Example SELECT city_id, rainfall_mm, event_date FROM rainfall_data R WHERE rainfall_mm > (SELECT MEAN(rainfall_mm) * 10 FROM rainfall_data WHERE city_id = R.city_id) Where and when was the heavy rain? Correlation: inner to outer
Correlated Subquery Example From patient visit registry, list all patients and the maximum amount charged for each one
Correlated Subquery Example Which patients had vital signs taken more than once in the same visit?
Correlated Subquery Example Show me profits on CPUs from top performing suppliers
Correlated Subquery Example Build dataset from X but exclude observations found in Y
Summary • Subqueries and correlated subqueries: compact way to write complex data manipulation. • Combine selection with aggregation • Think relationally rather than procedurally. • Good way to improve understanding of relationships among database tables
Resources • PROC SQL for DATA Step Die-Hards, Christianna S. Williams http://www2.sas.com/proceedings/forum2008/185-2008.pdf Various data manipulation tasks using Data Step and SQL in an increasingly complex series of examples, including sub-queries and correlated sub-queries. • Advanced SQL Processing, Destiny Corporation http://www.nesug.org/proceedings/nesug02/hw/hw007.pdf Advanced topics in PROC SQL including HAVING, FULL JOINs and creation of Views, Indexes, and Data sets. Later examples of sub-queries and correlated sub-queries. • Working With Subquery in the SQL Procedure Lei Zhang, Domain Solutions Corphttp://www.nesug.org/proceedings/nesug98/dbas/p005.pdf - Advanced examples of subqueries and correlatedsubqueries in all clauses of the SQL statement. • An Animated Guide: Knowing SQL Internal Processes makes SQL Easy - Russ Laveryhttp://www.phuse.eu/download.aspx?type=cms&docID=597 A graphical representation of the SQL process and some rules for describing/predicting the SQL process. Detailed examples with good explanations of pros and cons of subqueries. • http://beginner-sql-tutorial.com/sql-subquery.htm