600 likes | 787 Views
Geoff Kimber. SQL Reporting. With thanks to Shawn Toy. Objectives. Realize the importance of learning SQL Recognize SQL platform differences Identify relational database concepts Apply SQL syntax to a real-life example Review SQL references. Hierarchical Databases.
E N D
Geoff Kimber SQL Reporting
Objectives • Realize the importance of learning SQL • Recognize SQL platform differences • Identify relational database concepts • Apply SQL syntax to a real-life example • Review SQL references
Hierarchical Databases • Data organized in tree structure with parent and child segments • Parent to child is a one to many relationship • Implies repeating information, generally in child segments • Fileman • Prescription file with refill multiple field
Relational Databases • Data consists of records stored in tables • Each row is unique • Column values are of the same data type • Sequence of rows and columns is insignificant
What is SQL? • Structured Query Language [SQL] • Is an ANSI computer language that is used to interact with many relational databases • Oracle • MS-SQL Server • MySQL, PostgreSQLand others • SQL is a set-based language • SQL is structured like English • SEQUEL (Simple English QUEry Language)
Why Learn SQL? • VA is truly data rich • Each site has between 2-4 TB of data† • non-imaging data • VA FileManager [FileMan] • Only within VistA • Limited functionality for non-programmers • Runs in the production environment • Workflow and resource concerns † - Pham R. The VA Data Lifecycle (Internals, Data Flows, and Business Intelligence)
Why Learn SQL? • Data repositories (warehouses) • Moves the analysis off-line • Provides accessible, yet very secure, data • ODBC, ADO and LINQ • Greater functionality and access control • The “official” data is secured within VistA • Independent of VistA • Microsoft Active Directory based access • Maintain the front-line stance of CPRS/VistA
Why Learn SQL? • Various VA relational databases • Corporate Data Warehouse (CDW) • Regional Data Warehouse (RDW) • VISN Data Warehouse (VDW) • Potential for local data warehouses • Extract raw data from VistA off-hours • UPSERT the data into an SQL server automatically • Users run their own reports and access is secure
Why Learn SQL? • Data warehouses = relational databases • SQL = the language of relational databases • Relational databases are everywhere • Learn a portable skill • Take a systems approach to collaboration • Build reusable code • Work more efficiently
Why SQL Server? • VA has an enterprise license • Microsoft SQL Server is functionally superior to Microsoft Access • Secure • Backups • Procedural code • No database size limit (software) • Efficiency
Microsoft Access & SQL Server • Two of the different SQL flavors • Transact SQL (Microsoft SQL Server) • Manage databases/services, procedural code • Jet SQL (Microsoft Access) • Information retrieval, program component • Each program has its own SQL flavor • Core keywords and functions follow standards • ANSI guidelines are “considered”
A Set-Based Language • In SQL: • Something is either part of a set or it is not • Order is meaningless • SQL Server supports procedural code • VBA can be used in MS Access for procedures
Corporate data warehouse structure • Dimension tables • Fact tables
Dimension tables • Contain attributes that describe records in fact tables • Each record contains unique identifier • Generally more stable • Drug name, provider name, patient name and address • Similar to Drug file (#50), New Person file (#200), Patient file (#2)
Fact tables • Capture operational data • Contain multiple unique identifiers • One identifier for each record • One identifier for each data element in dimension tables • Similar to Fileman IEN
Fact tables • Usually contain numerical and date fields • Prescription fills, Outpatient visits, lab test results • Similar to prescription file (#52), V POV (#9000010.07)
Dimension table examples • Dim.LocalDrug • LocalDrugSID • Sta3n (3 digit station number) • LocalDrugNameWithDose • VaClassification • PricePerDispenseUnit • NationalDrugSID
Dimension table examples • Dim.NationalDrug • NationalDrugSID • DrugNameWithoutDose • DosageForm • Strength
Dimension table examples • sPatient.sPatient • PatientSID • PatientName • PatientSSN • AddressLine1 • City • State • ZipCode
Fact table examples • RxOut.RxOutpat • RxOutpatSID • PatientSID • ProviderSID • LocalDrugSID • IssueDate • RxNumber
Fact table examples • Rxout.RxoutpatFill • RxoutpatFillSID • RxOutpatSID • FillType • ReleaseDateTime • Qty • DaysSupply • UnitPrice
CDW Tables • Metadata • Information about data • Includes CDW to VistA field crosswalk • http://vaww.cdw.r02.med.va.gov/metadata/Reports/Forms/AllItems.aspx
Common SQL Clauses • Clauses perform set-based calculations
Common SQL Functions Microsoft Access supports the aggregate functions FIRST and LASTSQL Server has a COUNT(DISTINCT...) aggregate function
Common SQL Operators Access uses “*” as a wildcard and double quotes; SQL Server uses “%” and single quotesAccess uses the ampersand to concatenate “&”, SQL Server the plus sign “+”
Code example SelectLocalDrugSID ,NationalDrugSID ,LocalDrugNameWithDose fromvdw.dim.localdrug WhereLocalDrugNameWithDoselike '%atenolol%'
Code example • Case is not normally important, but helps with readability • Spaces, tabs, and hard returns have limited importance • Important in text fields • Not important between commands • Help improve readability • Microsoft query analyzer color codes key words for readability
Code example • SQL databases have many different data types • Varchar • Int • Bigint • Smallint • Decimal • Float • Datetime • Money
Code example • Select • Main command used to retrieve data from SQL tables • Followed by list of fields to return • ‘*’ (without quotes) returns all fields in table • Normally, avoid using ‘*’ unless you need all the fields in a table
Code example • From vdw.dim.localdrug • Indicates source of data from which you are selecting • Vdw.dim.localdrug • VDW is database name • VDW => VISN Data Warehouse • RDW => Regional Data Warehouse • CDW => Corporate Data Warehouse • Dim is schema • Localdrug is table name
Code example • Where • Allows you to restrict or filter the table contents to just the results you want • Extremely versatile • Like • ‘%’ is a multicharacter wildcard • ‘%atenolol%’ - contains ‘atenolol’ • ‘atenolol%’ – begins with ‘atenolol’ • ‘%atenolol’ – ends with ‘atenolol’
Code example • Case is not normally important, but helps with readability • Spaces, tabs, and hard returns have limited importance • Important in text fields • Not important between commands • Help improve readability • Microsoft query analyzer color codes key words for readability
Code example SelectLocalDrugSID ,NationalDrugSID ,LocalDrugNameWithDose fromvdw.dim.localdrug Wherelocaldrugnamewithdose like '%atenolol%'
Fileman equivalent Select VA FileMan Option: PRINT File Entries OUTPUT FROM WHAT FILE: DRUG// SORT BY: GENERIC NAME// 'GENERIC NAME["ATENOLOL" WITHIN GENERIC NAME["ATENOLOL", SORT BY: FIRST PRINT FIELD: NUMBER THEN PRINT FIELD: PSNDF:NUMBER THEN PRINT FIELD: GENERIC NAME THEN PRINT FIELD:
Joins • Used to connect two or more tables together using data elements that are shared by individual tables • Similar to fileman jumps, with more power • Inner joins, outer joins and full joins
Inner Joins • Produces a result that includes only the records that are the same in both tables
Inner joins – Example Selecta.localdrugnamewithdose , b.strength fromvdw.dim.localdrug as a inner join vdw.dim.nationaldrug b ona.NationalDrugSID = b.NationalDrugSID whereLocalDrugNameWithDose like ‘%atenolol%’
Outer joins • Results contain all records in one table and only matching records in another table • Left joins – return all results in table specified in ‘from’ statement and all matching tables from ‘joined’ table • Right joins – return all results in table specified in ‘joined’ table ‘from’ statement and all matching tables from ‘from’ statement