220 likes | 315 Views
Data Preparation in the Quadstone System Version 5. Thursday, 17 th February 2005 7.30am PST / 10.30am EST / 3.30pm GMT / 16.30 CET Please join the teleconference call now; if you have any difficulty, contact support@quadstone.com . Starting in 15 minutes. Starting in 10 minutes.
E N D
Data Preparation in theQuadstone System Version 5 Thursday, 17th February 2005 7.30am PST / 10.30am EST / 3.30pm GMT / 16.30 CET Please join the teleconference call now; if you have any difficulty, contact support@quadstone.com. Starting in 15 minutes Starting in 10 minutes Starting in 5 minutes Starting in 2 minutes Starting now
How to ask questions • Return to WebEx Event Manager: • Use Q&A (not Chat): • You can return to full-screen view:
Data Preparation in the Quadstone System V5 • Presenter: Joshua Lewis, Quadstone Consultant • Overview: Interactive data-preparation: sorting, aggregating, joining, and deriving • Audience: Experienced Quadstone System users looking to undertake ad-hoc preparation of data • Format: • A live demo with slides for sign-posting • Follow-up exercises in the form of a workbook and dataset • Duration: 1 hour, including Q&A
Interactive data preparation • Wizards for each action • Right-click in Quadstone System Explorer (QSE), or • Click on (or drag file to) Quadstone System Shortcut Bar • Choose parameters as appropriate • Best practice: keep an audit trail
Customer IDs To be filled Analysis dataset A simple data preparation process Customer data Transaction data SORT SORT DERIVE MEASURE JOIN DERIVE
Joining foci in QSE Join Fields adds fields from a secondary focus to a primary focus. Pre-requisites: • Key fields in both foci, of the exact same datatype • The records in both foci must be sorted by the key fields • Subtly different from Import Fields in Decisionhouse
Field derivation CustomerID StartDate Age Gender 163004 187006 188008 190006 268006 36004 ... 22/10/1998 09/05/1999 01/09/1996 04/11/1997 02/03/1995 22/01/1995 ... NULL NULL 49 43 52 72 ... 1 1 2 1 1 1 ... Derivation CustomerID StartDate Age Gender MonthsTenure 163004 187006 188008 190006 268006 22/10/1998 09/05/1999 01/09/1996 04/11/1997 02/03/1995 NULL NULL 49 43 52 1 1 2 1 1 10 3 35 21 53
Deriving new fields • Requires: • A focus (need not be sorted) • A derivations (.tml) file containing TML descriptions of the fields to be derived • Create a new TML file via right-click in QSE, then right-click again to Edit • Best practice: Develop and debug the FDL expressions interactively in Decisionhouse, before embedding them within the TML file
Derivation syntax create Tenure :=countwholemonths(StartDate,today()); create YoungMan := Age < 30 and Gender = 1; • Creates one output record per input record • The first example counts the number of months since a person became a customer; the second creates a flag to identify a specific segment of customers • General syntax: create<fieldname>:=<FDL expression> ; • See online help: Field Derivation Language (FDL) reference
Sorting foci • Required when combining foci and grouping records • Sort will usually, but not always, be on customer ID • Best practice: check sort order first, to see if a sort is needed • Best practice: sort once, upstream (adding keys if needed), to minimize time-consuming re-sorting
CustomerID Date TransType Value 6434000000 6434000000 6434000000 6434000001 6434000001 6434000002 ... 05/02/1998 07/02/1998 08/02/1998 02/02/1998 05/03/1998 03/02/1998 ... ATM ATM SD SD SD SD ... 30.00 54.42 83.80 29.49 40.00 46.26 ... Rollup (aggregation) CustomerID MostRecentDate MostFreqTrans AverageValue 6434000000 6434000001 6434000002 … 08/02/1998 05/03/1998 03/02/1998 ... ATM SD SD ... 56.07 34.75 46.26 ... Transaction measurement
Simple aggregation • Requires: • A focus (sorted by the grouping key field, e.g., CustomerID) • Selection of an appropriate key field (for grouping records) • An aggregations (.tml) file containing TML descriptions of the aggregations • Ignore the Functions and Statistics options • Example TML files are in the online help and the ext/demo/dbc folderof your installation
Aggregation syntax create NumberOfPurchases :=count(); create ValueOfPurchases :=sum(Amount); • Processes each group of transaction records that share the same CustomerID (grouping key), to create one output record per CustomerID • The first example counts the number of transactions for each customer; the second sums the values in the Amount field for each customer • General syntax: create<fieldname>:=<aggfn>( <arguments> ); • See online help: Transaction Measurement Language (TML) reference
Customers.ftr Visits.ftr IDTotalVisits A 3 C 2 D 4 ID Age A 56 B 23 C 31 Sorted Sorted Join on ID Customers.ftr ID Age TotalVisits A 56 3 B 23 Null C 31 2 Joining foci
Combining datasets Append fields: abut equal-length datasets Join fields: match on common key(s) Merge records: interleave using common key(s) + = + = + =
Metadata • You can import metadata from a previous dataset using a template focus • Includes all derivations, selections, binnings, interpretations and comments • Allows re-use of metadata developed interactively in Decisionhouse • You can import metadata in XML form • Allows metadata (e.g. a data dictionary) to be maintained externally (converted to XML) • Currently supports comments only
FOCUS FOCUS Advanced data preparation • Advanced TML aggregation syntax (filter, split) • Data Build Commands • Data Build Manager qsbuild TRANSFORMING qssort qsrenamefields qsselect ENHANCING qsimportmetadata qsupdate [qsinterp] [qsexportmetadata] EXPORTING qsdbcreatetable qsdbinsert qsdbupdate qsexportflat IMPORTING qsdbaccess qsimportdb qsgenfdd qsimportflat qsimportfocus COMBINING qsjoin qsappendfields qsmerge MANAGING qscopy qslink qsmove qsremove REPORTING qsdescribe qsaudit qsdtsnapshot qsscsnapshot qsxt qsxt2spec [qsinfo] MEASURING qsderive qsmeasure qsmeasuretrack qstml qssettings
Where to find out more • Quadstone System Help; for example: • Working with flat files, database tables, and foci • Transaction Measurement Language (TML) reference • Field Derivation Language (FDL) reference • Quadstone System data-build command and TML reference • Examples of TML • ext/demo/dbc folder of your installation • More example TML and data • Quadstone System Support website: http://support.quadstone.com/ • Advanced Data Preparation training course: contact support@quadstone.com
After the webinar • These slides, a workbook and data are available via www.quadstone.com/training/webinars/ • Any problems or questions, please contact support@quadstone.com
Upcoming webinars See www.quadstone.com/training/webinars/. If there’s a webinar topic you’d like to see, please let us know via support@quadstone.com.
Your feedback Please email support@quadstone.com