Implementing Join Ranks in Data Services

former_member186897 · ‎11-28-2016

This article is a continuation of my previous article which can be accessed here.

Below is a real case scenario where I implemented the Join rank recently for one of my data migration project. Using join rank had significantly improved the performance, 5 times faster.

Scenario:

I have loaded Good Receipt data in SAP and wanted to compare the loaded data with the transformed data what was attempted to load. This is to track what has been loaded successfully in SAP and what failed to load. In order to do that I need to join the SAP table with my transformed table and extract the matching records. So, for this scenario, I will be using Join Rank and explaining below step by step process.

Steps to implement Join Rank in SAP Data Services.

Step 1:

Create a data flow using following transforms which uses two tables as a source, one SAP table and another transformed table. Used inner join with key fields required. Target table will hold to matching rows from both tables.

Step 2:

Open MSEG Table (double click the table) and change the Join Rank to 30 (By default join rank is 0). Highlighted in yellow. This table has approx. 3.9 million rows.

Step 3:

Open INR_GR_ENR Table (Transformed table) and changed the Joni Rank to 10. This table has 53,500 rows.

Table: MSEG rows count from SAP

Implementation:

You can read in detail how Join Ranks work in Data Services in my previous article here.

MSEG is one of the biggest SAP table with 3.9 million rows. My transformed table (i.e. INT_GR_ENR) has 53,500 rows to load in SAP. Let’s assume that 53,000 rows were loaded successfully where 500 Good Receipts failed to load. So, when I join these two tables it will return me matching 53,000 rows.

What happened when Join Rank not used?

If I simply use inner join on these two tables and run the Job without using any join ranks it takes 10 minutes to produce the result.

Job Execution Time: 10 minutes

What happened when Join Rank is used?

So, since full push down is not possible in this case, I used a join rank of 30 on my biggest table i.e. MSEG and join rank of 10 on the second table which is smaller in size. Since MSEG has higher join rank, it will be a driving table. Now the result has significantly improved and I got the Job completed within 2 minutes which is 5 times faster than the previous run.

Job Execution Time: 2 minutes

It is that simple to implement. Let me know if that helps.

Former Member · ‎11-30-2016

Hello Ansari,

Does Join ranks also improve performance in ABAP dataflows?

Regards,
Sree

former_member186897 · ‎11-30-2016

sree.ranganath4,

Yes, it does. You can try.

former_member186160 · ‎12-01-2016

Ansari, thanks for sharing this informative article.

former_member186897 · ‎12-01-2016

Thanks, swetha.n5 for appreciation!

Former Member · ‎12-15-2016

SAP Data Services is a certified ETL (Extraction, Transformation and Loading) tool from SAP to perform batch loading into SAP HANA.
SAP Data Services is a data integration and transformation software application. It allows users to develop and execute workflows that take data from multiple sources and then allows the user to combine, transform, and refine that data, and then send to a destination system.

Business Scenario:

Suppose you are working in an organization where data is stored in various disparate databases like Oracle, DB2 and other legacy systems. You are asked to recommend the best application for consolidating and replicating data into SAP HANA from SAP and Non SAP sources using the ETL method.

More Informations: http://www.saphanacentral.com/p/23-introduction-to-sap-data-services.html

former_member264991 · ‎09-14-2017

Hello Ansari,

Thanks for explaining us the Join rank technique.

By going through all the documents and discussions I understood that the table with higher number of records will be set as Higher rank and will be driving the Join.

However, I have a doubt here, what if we take the same example of yours and instead of using an Inner Join if we perform left outer join between the two tables,

-- NR_GR_ENR Table (Transformed table) (Rank 10) left outer join MSEG (SAP table) (Rank 30)

would the query still be efficient and time saving?

if not can you please let us know how can we optimize the performance using Join Rank in this scenario - where we are using a left outer join with smaller table LEFT OUTER JOIN Larger table.

Regards,

Rajiv

Implementing Join Ranks in Data Services

SAP PI for Beginners

ABAP 7.40 Quick Reference

Fiori: technical installation and configuration of one app from A - Z