620 likes | 732 Views
The Wonders of Field Derivation Language (FDL). Wednesday, May 18, 2005 2pm UK/Ireland, 3pm Central European, 9am Eastern US Thursday, May 19, 2005 9am Pacific, 12pm Eastern, 5pm UK/Ireland. Please join the teleconference call now; if you have any difficulty, contact support@quadstone.com .
E N D
The Wonders ofField Derivation Language (FDL) Wednesday, May 18, 2005 2pm UK/Ireland, 3pm Central European, 9am Eastern US Thursday, May 19, 2005 9am Pacific, 12pm Eastern, 5pm UK/Ireland Please join the teleconference call now; if you have any difficulty, contact support@quadstone.com. Starting in 15 minutes Starting in 10 minutes Starting in 5 minutes Starting in 2 minutes Starting now
How to ask questions • Return to WebEx Event Manager: • Use Q&A (not Chat): • You can return to full-screen view:
The Wonders of Field Derivation Language (FDL) • Presenter: Eileen MacGregor, Support Manager • Overview: Field Derivation Language (FDL) is one of the most powerful features of the Quadstone System. It allows you to transform or combine fields to produce more useful data for analysis. However, the programmatic nature of FDL can be off-putting. This webinar aims to provide practical examples and useful tips for creating your own derived fields. • Audience: Anyone who wants to enhance their analysis with derived fields. • Format: A live demo, with slides for an introduction • Duration: 90 minutes
Overview • How do I derive a field? • Tips and guidelines • Working with numbers, strings, dates and fields in FDL • Practical applications • Flagging segments • Dealing with Nulls • Converting between data types • Adding reference information • Random selections • Matching patterns in text • Gotchas!
Colour coding in examples • FDL functions in red • Field names in blue • Literals (number/date/string) purple • All other FDL syntax in black
What do derivations do? • Most derivations manipulate existing data, but derivations do not need to refer to any field names — they can also use only functions: Age + 10; countwholeyears( DOB, today() ); if ( rndUniform() < 0.4 ) then "Y" else "N"; today();
FDL “laws” • You can do virtually anything in FDL • There is always more than one way to do everything • A good chance you’ll get it wrong first time — use the Help • ( ) and ; are your friends • You won’t break anything by trying
How to derive a field In Decisionhouse, derivations are done in the Table Viewer, by clicking on the New Field button:
How to derive a field By default, a single line derivation textbox appears:
How to derive a field — tip A multi-line textbox can be accessed by clicking on
How to derive a field — tip • All built-in functions can be accessed by clicking on
How to derive a field — tip • Function and field names can be entered partially, then completed using TAB-completion • Useful when there are lots of similarly-named fields
How to derive a field — tip • Use the Quadstone System Help, which can be accessed by pressing F1
Derivations — guidelines • Field names must be SQL-compliant (no reserved words and with only alphanumeric characters and the underscore _ character)
Derivations — guidelines • Working with fields, strings and dates • Fieldnames can be single-quoted or have no quotes: 'SpendInCents' / 100 is identical to SpendInCents / 100 • Auto-generated fields from DTB and SCB use single quotes
Derivations — guidelines • Working with fields, strings and dates • String values must be enclosed in double quotes Gender = "Male"
Derivations — guidelines • Working with fields, strings and dates • Date values must be preceded by a hash DOB < #01/01/1984 FirstTrans > #2000/06/30
Derivations — what if I get it wrong? If there is a mistake in the derivation, an error will be displayed:
Derivations — what if I get it wrong? If you derive a field, but use the wrong input fields/functions etc. • Either delete the field using and start again • Or simply correct the input FDL and press <Return>
Flagging a segment • To flag a segment of the population using a logical expression: if Gender = "F" and Income > 35000 then "Target" else "No Target"
Handling Nulls in data • Missing data (Nulls) can be transformed: if isnull(Response) then 0 else Response nvl(Response, 0)
Handling Nulls in data • Special functions handle Null data: minnonnull(CallsQ1, CallsQ2, CallsQ3) sumnonnull(Spend1, Spend2, Spend3)
Converting between types There are three datatypes for fields in the Quadstone System: • Numeric tointeger() / toreal() • String tostring() • Date todate()
Converting to numeric • Integer → Real • Real → Integer • Note anything after the decimal point will be truncated • Dates and suitable strings can be converted to numeric field types • Nulls will be generated for unsuitable strings
Converting to numeric • toreal(TotalTax) • tointeger(CurrentBalance) • tointeger(DateField) • tointeger( substr(StateCode, 2, 4) )
Converting to string • Numerics and dates can be converted to string values • Why convert to string? In order to use other string functions.
Converting to string tostring( today() ) concat(tostring(HouseNo), " ",StreetName)
Converting to string Example How do I convert a date field into a string field in the format “YYYY-MM” ?
Converting to string The answer initially given: M1 := month(Mydate); Yr := tostring(year(Mydate)); M2 :=if (M1 < 10) then concat("0", tostring(M1)) else tostring(M1); concat(Yr, "-", M2)
Converting to string The solution I came up with yesterday: x := tostring( Mydate ); concat( substr(x, 6, 9), "-", substr(x, 3, 4) )
Converting to date • Before the type-conversion functions, only individual numeric date components could be used to create a date field, using: • adddays, addmonths, addyears
Converting to date addyears(addmonths(adddays(#01/01/2000, MyDay - 1), MyMonth - 1), MyYear - 2000)
Converting to date • Much easier now • todate(20050515) — integer is in YMD format • todate("15/05/2005") — string is in ‘read’ format set in preferences, e.g. European, American • todate("15-May-2005", "%d-%b-%Y") —specified date format
Using lookup tables • Reference information can be added from lookup tables, using ‘key’ values in focus • Lookup tables generally small • Data does not have to be sorted
Using lookup tables I have a StateCode field in my focus, and I want to add the average income value per state. This data is held in another table
Using lookup tables Use the dblookup() function dblookup("focus:D:\Ref\USIncome.ftr", "US_State", "MeanIncome", substr(StateCode, 0, 1))
Selecting records randomly • Datasets can be split into segments, for example Test/Training: • numericTestTrainSplit() • Equal-sized populations: • sampleEqualSize() • Stratified sampling: • sampleStratified()
Selecting records randomly • To split the population into a 40% Test segment and a 60% Training segment: numericTestTrainSplit(0.4)
Selecting records randomly • To sample 2500 records from a population comprising females: sampleExactNumber(2500)
Selecting records randomly • To sample 2500 records from a segment containing 17219 records, where the segment comprises females, and 5000 records from the remaining population of 82781 records:
Selecting records randomly sampleStratified(2500, 17219, 5000, 82781, Gender = 2)
Regular expression matching • It is possible to ‘find’ patterns within string fields: • match() • And ‘replace’ patterns in strings: • replaceall() , replacefirst()
Regular expression matching • To look for StateCode values which begin with an alphabetic character: match("^[A-Z]" , StateCode )
Regular expression matching • To look for StateCode values which only contain alphabetic characters: match("^[A-Z]\+$" , StateCode )
Removing part of text in a string How can I remove the title information from a name field? replacefirst("^Mr & Mrs ", "", FullName)
Removing part of text in a string To handle all the different titles in one derivation: case match ("^Mr & Mrs ", FullName) : replacefirst("^Mr & Mrs ", "", FullName) ; match ("^Mrs ", FullName) : replacefirst("^Mrs ", "", FullName) ; match ("^Mr ", FullName) : replacefirst("^Mr ", "", FullName) ; …
Removing part of text in a string To handle all the different titles in one derivation: replacefirst( "^Mr & Mrs \|^Miss \|^Mrs \|^Ms \|^Mr \|^Rev \|^Dr " , "", FullName)
Regular expression matching One customer’s use: • To determine all possible entries for a product from a free-text field: match( "123", ProductCode )
Global variables • Global or accumulator variables allow values to be computed over entire sets of records For example, to create an index field: global i := 0; i := i + 1;