1 / 22

Data Preparation in the Quadstone System Version 5

Data Preparation in the Quadstone System Version 5. Thursday, 17 th February 2005 7.30am PST / 10.30am EST / 3.30pm GMT / 16.30 CET Please join the teleconference call now; if you have any difficulty, contact support@quadstone.com . Starting in 15 minutes. Starting in 10 minutes.

treva
Download Presentation

Data Preparation in the Quadstone System Version 5

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. Data Preparation in theQuadstone System Version 5 Thursday, 17th February 2005 7.30am PST / 10.30am EST / 3.30pm GMT / 16.30 CET Please join the teleconference call now; if you have any difficulty, contact support@quadstone.com. Starting in 15 minutes Starting in 10 minutes Starting in 5 minutes Starting in 2 minutes Starting now

  2. How to ask questions • Return to WebEx Event Manager: • Use Q&A (not Chat): • You can return to full-screen view:

  3. Data Preparation in the Quadstone System V5 • Presenter: Joshua Lewis, Quadstone Consultant • Overview: Interactive data-preparation: sorting, aggregating, joining, and deriving • Audience: Experienced Quadstone System users looking to undertake ad-hoc preparation of data • Format: • A live demo with slides for sign-posting • Follow-up exercises in the form of a workbook and dataset • Duration: 1 hour, including Q&A

  4. Interactive data preparation • Wizards for each action • Right-click in Quadstone System Explorer (QSE), or • Click on (or drag file to) Quadstone System Shortcut Bar • Choose parameters as appropriate • Best practice: keep an audit trail

  5. Customer IDs To be filled Analysis dataset A simple data preparation process Customer data Transaction data SORT SORT DERIVE MEASURE JOIN DERIVE

  6. Joining foci in QSE Join Fields adds fields from a secondary focus to a primary focus. Pre-requisites: • Key fields in both foci, of the exact same datatype • The records in both foci must be sorted by the key fields • Subtly different from Import Fields in Decisionhouse

  7. Field derivation CustomerID StartDate Age Gender 163004 187006 188008 190006 268006 36004 ... 22/10/1998 09/05/1999 01/09/1996 04/11/1997 02/03/1995 22/01/1995 ... NULL NULL 49 43 52 72 ... 1 1 2 1 1 1 ... Derivation CustomerID StartDate Age Gender MonthsTenure 163004 187006 188008 190006 268006 22/10/1998 09/05/1999 01/09/1996 04/11/1997 02/03/1995 NULL NULL 49 43 52 1 1 2 1 1 10 3 35 21 53

  8. Deriving new fields • Requires: • A focus (need not be sorted) • A derivations (.tml) file containing TML descriptions of the fields to be derived • Create a new TML file via right-click in QSE, then right-click again to Edit • Best practice: Develop and debug the FDL expressions interactively in Decisionhouse, before embedding them within the TML file

  9. Derivation syntax create Tenure :=countwholemonths(StartDate,today()); create YoungMan := Age < 30 and Gender = 1; • Creates one output record per input record • The first example counts the number of months since a person became a customer; the second creates a flag to identify a specific segment of customers • General syntax: create<fieldname>:=<FDL expression> ; • See online help: Field Derivation Language (FDL) reference

  10. Sorting foci • Required when combining foci and grouping records • Sort will usually, but not always, be on customer ID • Best practice: check sort order first, to see if a sort is needed • Best practice: sort once, upstream (adding keys if needed), to minimize time-consuming re-sorting

  11. CustomerID Date TransType Value 6434000000 6434000000 6434000000 6434000001 6434000001 6434000002 ... 05/02/1998 07/02/1998 08/02/1998 02/02/1998 05/03/1998 03/02/1998 ... ATM ATM SD SD SD SD ... 30.00 54.42 83.80 29.49 40.00 46.26 ... Rollup (aggregation) CustomerID MostRecentDate MostFreqTrans AverageValue 6434000000 6434000001 6434000002 … 08/02/1998 05/03/1998 03/02/1998 ... ATM SD SD ... 56.07 34.75 46.26 ... Transaction measurement

  12. Simple aggregation • Requires: • A focus (sorted by the grouping key field, e.g., CustomerID) • Selection of an appropriate key field (for grouping records) • An aggregations (.tml) file containing TML descriptions of the aggregations • Ignore the Functions and Statistics options • Example TML files are in the online help and the ext/demo/dbc folderof your installation

  13. Aggregation syntax create NumberOfPurchases :=count(); create ValueOfPurchases :=sum(Amount); • Processes each group of transaction records that share the same CustomerID (grouping key), to create one output record per CustomerID • The first example counts the number of transactions for each customer; the second sums the values in the Amount field for each customer • General syntax: create<fieldname>:=<aggfn>( <arguments> ); • See online help: Transaction Measurement Language (TML) reference

  14. Customers.ftr Visits.ftr IDTotalVisits A 3 C 2 D 4 ID Age A 56 B 23 C 31 Sorted Sorted Join on ID Customers.ftr ID Age TotalVisits A 56 3 B 23 Null C 31 2 Joining foci

  15. Combining datasets Append fields: abut equal-length datasets Join fields: match on common key(s) Merge records: interleave using common key(s) + = + = + =

  16. Metadata • You can import metadata from a previous dataset using a template focus • Includes all derivations, selections, binnings, interpretations and comments • Allows re-use of metadata developed interactively in Decisionhouse • You can import metadata in XML form • Allows metadata (e.g. a data dictionary) to be maintained externally (converted to XML) • Currently supports comments only

  17. FOCUS FOCUS Advanced data preparation • Advanced TML aggregation syntax (filter, split) • Data Build Commands • Data Build Manager qsbuild TRANSFORMING qssort qsrenamefields qsselect ENHANCING qsimportmetadata qsupdate [qsinterp] [qsexportmetadata] EXPORTING qsdbcreatetable qsdbinsert qsdbupdate qsexportflat IMPORTING qsdbaccess qsimportdb qsgenfdd qsimportflat qsimportfocus COMBINING qsjoin qsappendfields qsmerge MANAGING qscopy qslink qsmove qsremove REPORTING qsdescribe qsaudit qsdtsnapshot qsscsnapshot qsxt qsxt2spec [qsinfo] MEASURING qsderive qsmeasure qsmeasuretrack qstml qssettings

  18. Where to find out more • Quadstone System Help; for example: • Working with flat files, database tables, and foci • Transaction Measurement Language (TML) reference • Field Derivation Language (FDL) reference • Quadstone System data-build command and TML reference • Examples of TML • ext/demo/dbc folder of your installation • More example TML and data • Quadstone System Support website: http://support.quadstone.com/ • Advanced Data Preparation training course: contact support@quadstone.com

  19. Questions and answers

  20. After the webinar • These slides, a workbook and data are available via www.quadstone.com/training/webinars/ • Any problems or questions, please contact support@quadstone.com

  21. Upcoming webinars See www.quadstone.com/training/webinars/. If there’s a webinar topic you’d like to see, please let us know via support@quadstone.com.

  22. Your feedback Please email support@quadstone.com

More Related