Joiner Transformation in Informatica , is a connected and active transformation which let you join data from two heterogeneous source (same source system or different source system). Unlike normal SQL joins , we can join data from file system as well.The Joiner transformation joins sources with at least one matching column.In below article we will go through the properties of Joiner Transformation. We will also discuss the steps of adding /configuring Joiner transformation in Informatica Mapping. We will also check the some performance tuning related guidelines for Joiner Transformation.
Business Purpose of Joiner Transformation:
As the name predict itself , the Joiner transformation is use to join data from two heterogeneous sources or data from the same source. The Joiner transformation uses a condition that matches one or more pairs of columns between the two sources. Out of two input , one is consider as master pipeline ( it is loaded first in the data cache) , then the other one is Detail pipeline which is joined with the Master data.
Note: Joiner can join only two source system , if we need to join N sources then we need to use N-1 Joiner to join them.
Properties of Joiner Transformation :
- Active /Passive : Joiner Transformation is a Active transformation as it also enable filter rows based on join condition and join type
- Connected/Unconnected Transformation : Joiner Transformation is a connected Transformation.
Types of ports in Joiner Transformation:
- Input : For reading input Data
- Output : For providing output Data
- Additionally there is an Master check box which enable you to identify the Master Pipeline and Detail Pipeline
Components of Joiner Transformation:
- Joiner data cache: Joiner always caches the MASTER table. We cannot disable caching. It builds Index cache and Data Cache based on MASTER table.
- Join Type : With the help of Join Type , we can control the output of the join condition. We will discuss it in more detail.
- Master/Detail source port: While adding source ports into joiner , we need to defined master and detail source by clicking on the Master check box. To make one source as master , we can check any port of the source on the “M” check box, it will automatically mark the check for remaining port as well
- Sorted input: Select this option to improve session performance. To use sorted input, you must pass data to the Joiner transformation sorted by joining port, in ascending or descending order.
You can configure the Joiner transformation components and options on the Properties and Ports tab.
A normal join will allow only those records which satisfy the joiner condition for both sources. So remaining records, who don’t match the condition get discarded
Master Outer Join
A master outer join will keeps all rows of data from the detail source and the matching rows from the master source. If any of master records don’t satisfy the condition , those rows get discarded.
Detail Outer Join
Just opposite to Master Outer join , it keep all rows of data from Master Source and the matching rows from Details Source. If any of detail records don’t satisfy the condition , those rows get discarded.
Full Outer Join
A full outer join keeps all rows of data from both the master and detail sources.
Configuring Joiner Transformation Ports:
You can configure the following components on the ports tab
- Port name: Add the name of the port.
- Datatype, precision, and scale: Configure the datatype and set the precision and scale for each port.
- Port type : A port can be input, output. There is no change in the data , so both Input/Output check is marked always.
- Master Port: Use “M” check box to identify master source. You can make any source as master by checking any port of it.
Configuring Joiner Transformation Properties:
Modify the Joiner Transformation properties by clicking on the Properties tab.
|Case-Sensitive String Comparison||If selected, the Integration Service uses case-sensitive string comparisons when performing joins on string columns.|
|Cache Directory||Specifies the directory used to cache master or detail rows and the index to these rows. By default, the cache files are created in a directory specified by the process variable $PMCacheDir. If you override the directory, make sure the directory exists and contains enough disk space for the cache files. The directory can be a mapped or mounted drive.|
|Join Type||Specifies the type of join: Normal, Master Outer, Detail Outer, or Full Outer.|
|Null Ordering in Master||Not applicable for this transformation type.|
|Null Ordering in Detail||Not applicable for this transformation type.|
|Tracing Level||Amount of detail displayed in the session log for this transformation. The options are Terse, Normal, Verbose Data, and Verbose Initialization.|
|Joiner Data Cache Size||Data cache size for the transformation. Default cache size is 2,000,000 bytes. If the total configured cache size is 2 GB or more, you must run the session on a 64-bit Integration Service. You can configure a numeric value, or you can configure the Integration Service to determine the cache size at runtime. If you configure the Integration Service to determine the cache size, you can also configure a maximum amount of memory for the Integration Service to allocate to the cache.|
|Joiner Index Cache Size||Index cache size for the transformation. Default cache size is 1,000,000 bytes. If the total configured cache size is 2 GB or more, you must run the session on a 64-bit Integration Service. You can configure a numeric value, or you can configure the Integration Service to determine the cache size at runtime. If you configure the Integration Service to determine the cache size, you can also configure a maximum amount of memory for the Integration Service to allocate to the cache|
|Sorted Input||Specifies that data is sorted. Choose Sorted Input to join sorted data. Using sorted input can improve performance|
|Master Sort Order||Specifies the sort order of the master source data. Choose Ascending if the master source data is in ascending order. If you choose Ascending, also enable sorted input. Default is Auto.|
|Transformation Scope||Specifies how the Integration Service applies the transformation logic to incoming data. You can choose Transaction, All Input, or Row.|
Must Read : Learn more about Expression Transformation als0
How to use Joiner transformation in Informatica:
Use the following procedure to create an Joiner transformation in Informatica mapping.
Steps to create an Joiner transformation:
- In the Mapping Designer, open a Mapping.
- Click Transformation > Create. Select Expression transformation.
You can also select Transformation by clicking function button on Informatica Designer
- Enter a name and click Done.
- Drag all the input/output ports from the first source into the Joiner transformation. The Designer creates input/output ports for the source fields in the Joiner transformation as detail fields by default. You can edit this property later .
- Select and drag all the input/output ports from the second source into the Joiner
The Designer configures the second set of source fields and master fields by default.
- Double-click the title bar of the Joiner transformation to open the transformation
Click any box in the M column to switch the master/detail relationship for the sources.
- Tip: To improve performance for an unsorted Joiner transformation, use the source with fewer rows as the
master source. To improve performance for a sorted Joiner transformation, use the source with fewer
duplicate key values as the master.
- Click the Add button to add a condition. You can add multiple conditions. The master and detail ports must have matching datatypes. The Joiner transformation only supports equivalent (=) joins.
- Click the Properties tab and configure properties for the transformation (as Mentioned above)
- Connect the output ports to a downstream transformation or target.
Note: After you make the transformation reusable, you cannot copy ports from the source qualifier or other transformations. You can create ports manually within the transformation.
Example of Joiner Transformation:
A well detailed example of Joiner Transformation can be found here.
More about Joiner Transformation:
Joiner Transformation Performance Tuning :
- Use Sorter Transformation before Joiner Transformation to improve the session performance.
- Use required data port only in Joiner to reduce the size of data cache which hold this data.
- Try to use Database join directly via source qualifier as it gives better performance.
- Source with with fewer rows and with fewer duplicate keys should be consider as the master and the other source as detail.
Limitation of Joiner Transformation:
- We cannot use Joiner if any input pipeline contain update strategy transformation
- We cannot use sequence generator directly before the joiner Transformation
So , in this article we have gone through properties of Joiner Transformation ,component/Properties of Joiner Transformation , scenarios where to use it , steps to create it and performance related guidelines. Hope you enjoyed the article. Please feel free to give your feedback.