Formula Status Update New Usage Patterns for FINREP

Formula Status Update: New Usage Patterns for FINREP. Herm Fischer, Formula Working Group, 2008-04-20

Formula Status • Status = “Proposed Recommendation” • 45-day wait before Recommendation • Four “known” implementations • Fujitsu & UBmatrix conformant, in production use • CompSci Resources & New Lido implementations • Major stakeholders • BdE, BdF, BoJ, SEC deployed • FDIC has early-IWD formulas

Specifications are Extendable • Basic usage patterns in PR spec • Producing fact items (output instance document) • Assertions for • Consistency (produced fact vs. reported fact) • Existence • Value • Single input, single output of instances • Extension usage patterns

Extension usage patterns • Implementation prototypes: • Message composition • Formula chaining • Tuple generation • Multi-instance processing • Linkbase & footnotes functions • Custom functions implemented in XPath • Assertion sets • Interesting, not implemented yet: • Very Large Instances processing

Existing prototypes • Chaining and tuple generation • XSB required a prototype to ensure the PR supports it • Multi-instance processing • Became a core part of the chaining proposal • Prototyped (partially) • Linkbase tree walks • Prototype allows the linkbase to control formulas for • Calc linkbase & dimension aggregation roll ups • Movements & totalling by presentation linkbase

Discussing by history vs. Discussing by FINREP urgency • Message generation (needed) • Chaining (cool, but getting by without this) • Tuple generation (just for GL people?) • Multi-instance (it is the chaining solution) • Custom functions with XPath (need this) • Linkbase functions (top priority) • Patterns in linkbase eliminate most formulas • Moves formula semantics from code to linkbase • Critical for success • Assertion sets (in use)

Chaining • Most frequently requested use case • Least needed (based on experience) • Two solutions implemented • Multi-instance approach most flexible • Explicit dependencies useful for tuple output • Facilitates modularization • Helpful to manage large formula projects

Chaining with explicit dependency • Prior result passed as factVariable

Dependency is explicit • The arc assigns prior result to a factVariable • The author explicitly specifies this dependency • Important for tuple output • Difficult to maintain in large formula sets • Difficult to factor formulae to separate files

Tuple chaining must be explicit

Multi-instance chaining solution • Common solution to multi-instance and chaining • Let’s first look at multi-instance and then come back to chaining that uses it

Other multi-instance solutions • Complete DTS with linkbases needed • For tree walks: movements, totaling, and dimensions • fn:doc() not useful to load instances • Does not discover DTS • Does not load linkbases • Cannot handle shared DTS components • Some referenced taxonomies common • Some linkbases evolve

Prior: xfi:inst() was prototyped • Provides a fn:doc() counterpart to load DTS • generalVariables can access these instances • No filtering on multi-instances • factVariables filter on the primary instance; generalVariables use XPath on additional instances • Requires formula execution code to load instances (instead of formula processor infrastructure)

New multi-instance features • All instances loaded by infrastructure (doesn’t have to be coded) • Filters, functions, sequence, covering work • Formulas & filters should be shared across all instances

New multi-instance solution • Instances are represented by an instance resource • instance-variable arc to factVariable • If present, specifies non-default source instance • formula-instance arc from formula • If present, specifies the instance to receive the fact • Instance resources are files or temporary

Instance resources • Could be loaded by processor • E.g., Java code in a server loads primary instance and some prior-period or other-company instances • Or user of GUI adds ‘additional’ instances, such as loading prior-period or other-company instances • Default implied source and result instances • Can be temporary in memory only • Used for chaining and modularization

Multi-instance solution • A better approach to chaining • Implements multiple instance documents • Applies to very large instance solution

Multi-source and result instances

Aspect sources, implicit filtering • Formula aspects come from its variables • Variables from different instances contribute aspects • Aspects independent of the instances they come from • Aspect “covering” is by-aspect, not by-instance

A=B+C; C=D+E use case (Explicit dependency chaining) • Formula 1 (C=D+E) • Result is C, factVariables D & E • factVariables D & E are from the source instance • Formula 2 (A=B+C) • Arc from formula 1, name $r given to Formula 1 result • Result is A, factVariables B & C • factVariable B is from source instance • factVariable $r is from result of formula 1

A=B+C; C=D+E (Example 0027 v-01) Explicit dependency chaining

A=B+C; C=D+E use case (Multi-instance chaining) • Formula 1 (A=B+C) • Result is A, factVariables B & C • factVariable B is from source instance (default) • factVariable C is from result instance (has an arc) • Formula 2 (C=D+E) • Result is C, factVariables D & E • factVariables D & E are from the source instance

A=B+C; C=D+E (Example 0026 v-01) Multi-instance chaining
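The multi-instance chaining idea can be sketched outside of XBRL as plain Python. This is only an illustration under stated assumptions, not the formula processor's actual algorithm: instances are modelled as dictionaries, the `Formula` class and `run_formulas` function are hypothetical names, and the evaluation loop simply retries pending formulas until their inputs can be bound. It shows why the order in which formulas are listed does not matter when C is read from the result instance.

```python
from typing import Callable, Dict, List, Optional

Instance = Dict[str, float]


class Formula:
    """One formula: a result concept, its input bindings, and a compute step."""

    def __init__(self, result: str, inputs: Dict[str, str],
                 compute: Callable[[Dict[str, float]], float]) -> None:
        self.result = result
        # Maps each input concept to the instance it is read from:
        # "source" (the default) or "result" (stands in for the instance arc).
        self.inputs = inputs
        self.compute = compute

    def bind(self, source: Instance, result: Instance) -> Optional[Dict[str, float]]:
        """Return the bound input values, or None if any input is missing."""
        bound: Dict[str, float] = {}
        for concept, where in self.inputs.items():
            instance = source if where == "source" else result
            if concept not in instance:
                return None
            bound[concept] = instance[concept]
        return bound


def run_formulas(source: Instance, formulas: List[Formula]) -> Instance:
    """Evaluate formulas until no pending formula can bind its inputs."""
    result: Instance = {}
    pending = list(formulas)
    progressed = True
    while pending and progressed:
        progressed = False
        for formula in list(pending):
            values = formula.bind(source, result)
            if values is not None:
                result[formula.result] = formula.compute(values)
                pending.remove(formula)
                progressed = True
    return result


# Formula 1 (A=B+C) reads C from the result instance; Formula 2 (C=D+E) produces it.
formula_1 = Formula("A", {"B": "source", "C": "result"}, lambda v: v["B"] + v["C"])
formula_2 = Formula("C", {"D": "source", "E": "source"}, lambda v: v["D"] + v["E"])
chained = run_formulas({"B": 1.0, "D": 2.0, "E": 3.0}, [formula_1, formula_2])
```

Formula 1 cannot bind on the first pass because C is not yet in the result instance; it succeeds once Formula 2 has written C there. No arc between the two formulas is needed, which is the modularization benefit the slides describe.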

COREP Use case 18: Weighted average of member children • Weighted average of its dimensional children by another primary item

Current single-formula solution • Excel formulas: • Make PD controlling fact, get PD and EV of dimensional children • General variable for PDxEV member matching

Single formula (Example 0017 v-01) difficult to explain

Exposure value formula • Each PDxEV is produced by one formula • Result factItem PDxEV is the product for each dimension value • A second formula binds the PDxEV’s of the dim-children to one sequence and the EV’s of the dim-children to a second sequence; a value assertion checks the result
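The arithmetic behind the two-step approach can be sketched in Python. The slides do not spell out the assertion expression, so this assumes the usual weighted average, parent PD = sum(PD x EV of children) / sum(EV of children); the function names, the member keys, and the tolerance are hypothetical.

```python
from typing import Dict, Tuple

# Each dimensional child maps to its (PD, EV) pair.
Children = Dict[str, Tuple[float, float]]


def pd_times_ev(children: Children) -> Dict[str, float]:
    """Step 1: one PDxEV product per dimension member."""
    return {member: pd * ev for member, (pd, ev) in children.items()}


def weighted_average_holds(parent_pd: float, children: Children,
                           tolerance: float = 1e-9) -> bool:
    """Step 2: value assertion comparing the parent PD with the EV-weighted average."""
    products = pd_times_ev(children)
    total_ev = sum(ev for _, ev in children.values())
    if total_ev == 0:
        return False  # assumption: an undefined average fails the assertion
    expected = sum(products.values()) / total_ev
    return abs(parent_pd - expected) <= tolerance


demo_children: Children = {"member1": (0.02, 100.0), "member2": (0.04, 300.0)}
```

With these sample values the weighted average is (2 + 12) / 400 = 0.035, so `weighted_average_holds(0.035, demo_children)` passes while a reported parent PD of 0.03 does not.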

New idea: multiple result instances • The PDxEV result fact items aren’t needed for a real result instance • Only a value assertion is really needed • A temporary-results instance might be useful • Also a temporary facts DTS would be needed (to define the PDxEV result fact item)

Chained formulas (0026 v-20)

Implementation issues • Multi-instance term binding • Variables can be bound to different source instances • (This already exists in xfi:inst() based solution.) • Each term in XPath ‘knows’ its instance/DTS (in the internal model or DOM of implementation) • Function binding • A function with item results must keep the instance/DTS of the function result (based on the input terms)

Tree walking • Current implementation • Navigation returns concepts and attributes: ( (c1, c2, c3), (a11, a12, a13), (a21, a22, a23) ) • Take subsequences with XPath for-loops • Working (geeky) • Change idea (not prototyped yet) • Navigation returns fully resolved relationship nodes • A relationship has reference to arc node attributes • Attributes: rel/@weight, rel/@preferredLabel • Concepts: maybe xfi:from/xfi:to( rel-node )
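The difference between the two return shapes can be illustrated in Python. This is a sketch under an assumed reading of the slide: the current navigation result is taken to be one flat sequence holding n concepts, followed by n values of a first attribute, then n values of a second attribute. The function names and the attribute names passed in are hypothetical.

```python
from typing import Any, Dict, List, Sequence


def split_columns(flat: Sequence[Any], n: int) -> List[List[Any]]:
    """Cut a flat sequence into consecutive subsequences of length n."""
    if n <= 0 or len(flat) % n != 0:
        raise ValueError("flat sequence length must be a positive multiple of n")
    return [list(flat[i:i + n]) for i in range(0, len(flat), n)]


def to_relationships(flat: Sequence[Any], n: int,
                     attr_names: Sequence[str]) -> List[Dict[str, Any]]:
    """Regroup the columns into one record per relationship."""
    columns = split_columns(flat, n)
    concepts, attr_columns = columns[0], columns[1:]
    if len(attr_columns) != len(attr_names):
        raise ValueError("one attribute name is needed per attribute column")
    return [
        {"concept": concept,
         **{name: column[i] for name, column in zip(attr_names, attr_columns)}}
        for i, concept in enumerate(concepts)
    ]


flat_result = ["c1", "c2", "c3", 1.0, 1.0, -1.0, "labelA", "labelB", "labelC"]
relationships = to_relationships(flat_result, 3, ["weight", "preferredLabel"])
```

The per-relationship records correspond to the proposed change: each relationship carries its own attributes, so callers no longer have to slice parallel subsequences by position.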

Use of tree walking • Calculation linkbase checking by formula • Uses xfi function for linkbase tree walk • Roll ups compared • By threshold value • By rounded values same as ordinary calc validation • Extended links managed by formula • EDInet consolidated vs nonConsolidated conflicts • Dimension aggregation by formula • Uses dimension filter child/descendant feature
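A calculation roll-up check driven by a tree walk, with the threshold comparison mentioned above, might look like the following Python sketch. It is not the xfi function itself: the `Relationship` record, the concept names, and the choice to skip children without a reported fact are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Relationship:
    """One calculation arc: parent concept, child concept, and arc weight."""
    parent: str
    child: str
    weight: float


def children_of(rels: List[Relationship], parent: str) -> List[Relationship]:
    """One step of the tree walk: the arcs leaving a parent concept."""
    return [rel for rel in rels if rel.parent == parent]


def rolled_up_value(rels: List[Relationship], facts: Dict[str, float],
                    parent: str) -> float:
    """Weighted sum of the reported children (unreported children are skipped)."""
    return sum(rel.weight * facts[rel.child]
               for rel in children_of(rels, parent) if rel.child in facts)


def failed_roll_ups(rels: List[Relationship], facts: Dict[str, float],
                    threshold: float) -> List[str]:
    """Parents whose reported value differs from the roll-up by more than threshold."""
    failures = []
    for parent in sorted({rel.parent for rel in rels}):
        if parent not in facts:
            continue
        difference = abs(facts[parent] - rolled_up_value(rels, facts, parent))
        if difference > threshold:
            failures.append(parent)
    return failures


calc_rels = [
    Relationship("Assets", "CurrentAssets", 1.0),
    Relationship("Assets", "NonCurrentAssets", 1.0),
    Relationship("NetIncome", "Revenue", 1.0),
    Relationship("NetIncome", "Expenses", -1.0),
]
calc_facts = {"Assets": 100.0, "CurrentAssets": 60.0, "NonCurrentAssets": 40.0,
              "NetIncome": 31.0, "Revenue": 80.0, "Expenses": 50.0}
```

Here `NetIncome` is off by 1, so it fails with a threshold of 0.5 and passes with a threshold of 1.0. The formula logic never names a concept; everything concept-specific lives in the relationship data, which is the point of moving semantics into the linkbase.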

FINREP formulas • Most current formulas can be custom tree walks • Consider optional/required attribute • Consider fall back values by arc attribute • Consider dimension filter by arc attribute • Other attributes as needed • Replace about 72 formulas (BdF count) with a few tree walks

Very Large Instances Use Case • Sizes > ½ million facts, > ½ GB DOMs • Census, tax offices, securities exchanges, etc. • Multi-GB heaps not feasible with Java VMs • Moribund at a couple of GB (incl. code & data) • Data almost always from relational DBMSes

Very Large Instances approach • Basic PR formula solution • All facts, all filters, all variable sets in parallel • Not feasible with very large single- or multi-instances • Multi-instance approach • Allows modularizing processing • Stage formulas to work on parts of very large instance • Cooperative filters & (stored) SQL DB interfaces • Intermediate result instances pass between stages

Staged multi-instance strategy • [Diagram: formula linkbase(s) and a very large relational DB feed a series of stages. Each stage has its formula & variables and filters, reads facts through SQL (lazy load, early GC), and writes an interim instance that the next stage reads; the last stage produces the result.]
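The staged strategy can be sketched with Python's standard `sqlite3` module standing in for the very large relational DB. This is a toy illustration, not a design from the slides: the `facts` table layout, the concept and region names, and the two stage functions are hypothetical. Each stage pushes its filter down into SQL so only the facts it needs are loaded, and the small interim instance is all that passes to the next stage.

```python
import sqlite3
from typing import Dict, List


def make_demo_db() -> sqlite3.Connection:
    """Build a tiny in-memory stand-in for the relational fact store."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE facts (concept TEXT, region TEXT, value REAL)")
    conn.executemany(
        "INSERT INTO facts VALUES (?, ?, ?)",
        [("Sales", "north", 10.0), ("Sales", "north", 15.0),
         ("Sales", "south", 20.0), ("Costs", "north", 4.0)],
    )
    return conn


def regions_for(conn: sqlite3.Connection, concept: str) -> List[str]:
    """List the partitions (regions) that hold facts for one concept."""
    rows = conn.execute(
        "SELECT DISTINCT region FROM facts WHERE concept = ? ORDER BY region",
        (concept,),
    )
    return [row[0] for row in rows]


def stage_one(conn: sqlite3.Connection, concept: str) -> Dict[str, float]:
    """Stage 1: per-region subtotals; only one region's rows are read at a time."""
    interim: Dict[str, float] = {}
    for region in regions_for(conn, concept):
        cursor = conn.execute(
            "SELECT value FROM facts WHERE concept = ? AND region = ?",
            (concept, region),
        )
        # The cursor is consumed lazily, so rows can be discarded as they are summed.
        interim[region] = sum(row[0] for row in cursor)
    return interim


def stage_two(interim: Dict[str, float]) -> float:
    """Stage 2: works only on the small interim instance, not the full data set."""
    return sum(interim.values())


demo_conn = make_demo_db()
interim_instance = stage_one(demo_conn, "Sales")
grand_total = stage_two(interim_instance)
```

The concept and region filters are expressed as SQL `WHERE` clauses rather than applied to facts already in memory, which mirrors the "cooperative filters" bullet above.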

Custom functions with XPath • Custom functions in PR require Java code • Examples of custom functions • Taxonomy and linkbase access • Math with exponentials and recursion (loan value calc) • Prototype adds XPath implementation

a(b,c) = $b + $c (Example 0030 v-01) • <formula:formula value="my-fn:a($b, $c)" …> • <function-impl:function xlink:type="resource" xlink:label="cust-fn-a" name="my-fn:a" output="xs:decimal" value="$b + $c" > <function-impl:input name="b" type="schema-element(xbrli:item)" /> <function-impl:input name="c" type="schema-element(xbrli:item)" /> </function-impl:function>

Precision by unit (Example 0030 v-03) <formula:decimals>my-fn:decimals($b) </formula:decimals> value=" for $unit in local-name-from-QName( xfi:measure-name( xfi:unit-numerator( xfi:unit( $item ))[1] )) return ( if ($unit eq 'JPY') then -5 else -2 ) " >

Recursion (Example 0030 v-04) <function-impl:function xlink:type="resource" xlink:label="cust-fn-a" name="my-fn:power" output="xs:decimal" value=" if ($exp lt 0) then ( 1 div my-fn:power($y, - $exp) ) else ( if ($exp lt 1) then 1 else ($y * my-fn:power($y,$exp - 1)) ) " > <function-impl:input name="y" type="xs:decimal" /> <function-impl:input name="exp" type="xs:decimal" /> </function-impl:function>
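For readers less used to XPath, the same recursion reads as follows in Python. This is a direct transliteration of the expression above for illustration only; the function name `power` and the use of `Decimal` to mirror `xs:decimal` are choices made here, not part of the prototype.

```python
from decimal import Decimal


def power(y: Decimal, exp: Decimal) -> Decimal:
    """Recursive power, mirroring the three branches of the XPath expression."""
    if exp < 0:
        # Negative exponent: invert the result for the positive exponent.
        return Decimal(1) / power(y, -exp)
    if exp < 1:
        # Base case: any exponent in [0, 1) yields 1.
        return Decimal(1)
    # Recursive case: peel off one factor of y.
    return y * power(y, exp - 1)
```

Because the base case treats every exponent below 1 as zero, a fractional exponent such as 2.5 yields the same value as 2; the function is only meaningful for whole-number exponents.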

Present value (Example 0030 v-05) <formula:formula xlink:type="resource" xlink:label="formula1" value="$amountDue * my-fn:power((1 + $interestRate), $numYears)"

Back to FINREP • Segue to • Use of tree walks to consolidate many formulas
