DIFFERENT TYPES OF JOIN IN DATATAP
Join transformation or join Datastreams allows you to join multiple datasets together.
This article describes 4 types of join: left, right, inner and outer.
The examples below are based on 2 tables, which are labeled as Stream A and Stream B.
The Joins will take the field 'ID' as a common key, and the column 'Name' from the Stream B will be joined to Stream A.
"Name" will be added from Stream B based on the "ID" from Stream A. ID 104 does not have a match in Stream B, therefore a value for Column name is None. Other values from Stream B are not joined.
Right join means that table where we join from has a priority:
- Where there is a match, the row with values from Stream A will be kept (e.g. ID 101);
- Where there is no match will be removed (e.g. originally row ID 104 from Stream A);
- All other rows from the right table will be joined to Stream A (e.g. IDs: 102, 103, 105)
- Since Region is not available in Stream B, no value will be assigned to those rows.
Only the rows where IDs are the same (overlapping) will be joined. Rows where ID is not the same will be removed from Stream A.
Whole data from both tables will be joined together. 'None' values will be added for the ones that don't have any mutual match: