220 likes | 745 Views
Joiner Transformation. -. Introduction. Transformations help to transform the source data according to the requirements of target system and it ensures the quality of the data being loaded into target.
E N D
Introduction • Transformations help to transform the source data according to the requirements of target system and it ensures the quality of the data being loaded into target. • The process of manipulating data. Any manipulation beyond copying is a transformation. Examples include cleansing, aggregating, and integrating data from multiple sources. • Transformations are of two types: Active Passive
Overview • Joiner transformation is used to join source data from two related heterogeneous/homogeneous sources residing in different locations or file systems. • Joins two sources with at least one matching port. • Uses a condition that matches one or more pairs of ports between the two sources. • Can add more Joiner transformations to the mapping in case of joining multiple sources.
Case-Sensitive String Comparison If selected, the PowerCenter Server uses case-sensitive string comparisons when performing joins on string columns. • Cache Directory Specifies the directory used to cache master or detail rows and the index to these rows. By default, the cache files are created in a directory specified by the server variable $PMCacheDir. • Join Type Specifies the type of join: Normal, Master Outer, Detail Outer, or Full Outer. • Tracing Level Amount of detail displayed in the session log for this transformation. The options are Terse, Normal, Verbose Data, and Verbose Initialization.
Joiner Data Cache Size Data cache size for the transformation. Default cache size is 2,000,000 bytes. Joiner Index Cache Size Index cache size for the transformation. Default cache size is 1,000,000 bytes. Sorted Input Specifies that data is sorted. Choose Sorted Input to join sorted data. Using sorted input can improve performance.
Defining a Join Condition • You define one or more conditions based on equality between the specified master and detail sources. • Join conditions only support equality between fields. • For example, if two sources with tables called EMPLOYEE_AGE and EMPLOYEE_POSITION both contain employee ID numbers, the following condition matches rows with employees listed in both sources: EMP_ID1 = EMP_ID2
TYPES OF JOINS • Normal • Master Outer • Detail Outer • Full Outer
NORMAL JOIN Discards all rows of data from the master and detail source that do not match, based on the condition. MASTER OUTER JOIN Keeps all rows of data from detail source and the matching rows from the master source. It discards the unmatched rows from the master source.
DETAIL OUTER Keeps all rows of data from the master source and the matching rows from the detail source. It discards the unmatched rows from the detail source. FULL OUTER Keeps all rows of data from both the master and detail sources.
CACHING • When the PowerCenter Server runs a session with a Joiner transformation, it reads rows from the master and detail sources concurrently and builds index and data caches based on the master rows. • The PowerCenter Server then performs the join based on the detail source data and the cache data. • To improve performance for an unsorted Joiner transformation, use the source with fewer rows as the master source. • To improve performance for a sorted Joiner transformation, use the source with fewer duplicate key values as the master.
JOINER TRANSFORMATION TIPS • Perform joins in a database when possible. • Join sorted data when possible. • For an unsorted Joiner transformation, designate as the master source the source with fewer rows. • For a sorted Joiner transformation, designate as the master source the source with fewer duplicate key values.