AdemGuler
Participant
In applications with a high volume of data creation, changes, and deletions, such as sales orders, it is crucial to have a mechanism that provides delta records and an extractor that supports delta loading. This means the application should be able to identify and extract only new, modified, or deleted data, ensuring efficient and accurate data loading.

In some cases, nightly full uploads are not what we want, as the time windows for data extraction are limited. The solution comes with the ODP framework.

The ODP framework for CDS extraction provides delta capabilities. There are two options for delta handling that we can choose from:

  • Change Data Capture (CDC) Delta

  • Generic Timestamp / Date based Delta


Let’s check the two delta mechanisms in detail below.

 

Change Data Capture (CDC) Delta


The easiest way to implement delta loading is to use CDC-enabled CDS views by importing them as Remote tables in Datasphere and enabling Real-time access to them.

Let us see how it is done:
// Example CDS view: sales document items (VBAP) joined with headers (VBAK).
// Client handling is implicit in ABAP CDS, so the join condition does not
// need the client field MANDT.
@AbapCatalog.sqlViewName: 'ZSQL_SALES_DELTA'
@AbapCatalog.compiler.compareFilter: true
@AbapCatalog.preserveKey: true
@AccessControl.authorizationCheck: #NOT_REQUIRED
@EndUserText.label: 'CDS View - Delta loading on Sales Document'

// Expose the view to the ODP framework as a fact with automatic,
// trigger-based Change Data Capture.
@Analytics.dataExtraction.enabled: true
@Analytics.dataCategory: #FACT
@Analytics.dataExtraction.delta.changeDataCapture.automatic: true
define view ZCDS_SALES_DELTA as select from vbap
inner join vbak on vbap.vbeln = vbak.vbeln
{

key vbak.vbeln as SalesDocument,
key vbap.posnr as SalesDocumentItem,
vbap.matnr as MaterialNumber,
vbap.kwmeng as CumulativeOrderQuantity,
vbap.vrkme as SalesUnit,
vbap.netwr as NetValue,
vbap.waerk as SalesDocumentCurrency,
vbap.arktx as TextforSalesOrderItem
}

This is our example CDS view for delta loading on Sales Documents.

If we want to declare a Dimension, Fact, Aggregation Level, or Cube, we must include the classification of the CDS view in its header with the following annotation:

@Analytics.dataCategory: #VALUE

Replace #VALUE with one of the following categories:

  • #CUBE

  • #AGGREGATIONLEVEL

  • #DIMENSION

  • #FACT


A CDS view can be enabled for data extraction by just adding the following annotation:

@Analytics.dataExtraction.enabled: true

After adding this annotation, the CDS view is available for extraction via the ODP framework and visible to its data consumers.

The Change Data Capture (CDC) recording mechanism uses database triggers to record any changes to the tables that belong to an ABAP CDS view. For suitable CDS views, such as simple projection views, this can be enabled automatically with the following annotation:

@Analytics.dataExtraction.delta.changeDataCapture.automatic

Please refer to this link for more details: https://help.sap.com/docs/SAP_DATA_INTELLIGENCE/3a65df0ce7cd40d3a61225b7d3c86703/55b2a17f987744cba62...


Adding the CDS view to Datasphere as a Remote table



Choosing the SQL view name of the CDS view



Importing and deploying the CDS view as a Remote table



Deployment completed as a Remote table in Datasphere


To load the initial data into the remote table, we first have to run a snapshot for it.


Running Snapshot


Let's take a closer look at some of the data we initially loaded, because after enabling the Real-Time Access feature we will see that data changed in the source system also changes in Datasphere without reloading or scheduling.


The data we are looking at as an example


We will see that the description in the "Short text for sales order item" field of the Sales Documents changes in the source system and is updated in Datasphere using the Real-Time Access feature.

In order to load the changes (delta records) created in the source system into Datasphere, we need to enable the data replication feature for the remote table as "Real-Time Access".


Enabling the Real-Time Access feature



Enabled the Real-Time Access feature


When we look at the details from the Data Integration Monitor menu, it should be as follows:


Data Integration Monitor - general



Data Integration Monitor - details




  • Switching replication type for remote table 'ZSQL_SALES_DELTA' from batch replication to real-time replication.

  • Subscribed to source system data changes.

  • Successful initial data transfer for real-time replication of remote table 'ZSQL_SALES_DELTA'. Now real-time replication will automatically add delta data to remote table 'ZSQL_SALES_DELTA'.


Now let's change the "Short text for sales order item" field description of the Sales Documents in the source system and see how it is reflected in Datasphere.


In the source system we change the data we specified as an example above



All data updated in the source system are as above


The refresh frequency cannot be adjusted, but in my observation it takes up to about one hour for updated records to appear in Datasphere.


All data updates that we have as an example


 

Generic Timestamp / Date based Delta


To load data on a date basis, the relevant application tables must contain date/time information that is updated whenever the application data changes.

“upd_tmstmp” is the field in the VBAK table that will trigger delta records in our example.

To define which field triggers the delta, we use the annotation

@Analytics.dataExtraction.delta.byElement.name

This element can be a date (ABAP type DATS) or a UTC time stamp.

Let's update the CDS view code with the delta element annotation.


Updated CDS view
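Since the updated view above is shown only as a screenshot, here is a sketch of what it might look like, combining the original view with the byElement annotation and the cast expression discussed below. Treat it as an illustration rather than the author's exact code; in particular, the optional maxDelayInSeconds safety interval is my assumption, not taken from the blog.

@AbapCatalog.sqlViewName: 'ZSQL_SALES_DELTA'
@AbapCatalog.compiler.compareFilter: true
@AbapCatalog.preserveKey: true
@AccessControl.authorizationCheck: #NOT_REQUIRED
@EndUserText.label: 'CDS View - Delta loading on Sales Document'
@Analytics.dataExtraction.enabled: true
@Analytics.dataCategory: #FACT
// Generic delta: the element named here identifies new/changed records
@Analytics.dataExtraction.delta.byElement.name: 'LastChangeDate'
// Optional safety interval (assumption): re-reads records of the last 30 minutes
@Analytics.dataExtraction.delta.byElement.maxDelayInSeconds: 1800
define view ZCDS_SALES_DELTA as select from vbap
inner join vbak on vbap.vbeln = vbak.vbeln
{
key vbak.vbeln as SalesDocument,
key vbap.posnr as SalesDocumentItem,
vbap.matnr as MaterialNumber,
vbap.kwmeng as CumulativeOrderQuantity,
vbap.vrkme as SalesUnit,
vbap.netwr as NetValue,
vbap.waerk as SalesDocumentCurrency,
vbap.arktx as TextforSalesOrderItem,
// Delta-relevant date derived from the change timestamp in VBAK
cast(substring(cast(vbak.upd_tmstmp as abap.char(32)),1,8) as abap.dats) as LastChangeDate
}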


Please be aware: when converting a time stamp to a date, we can use the tstmp_to_dats function, or we can use the cast keyword as below.

cast(substring(cast(upd_tmstmp as abap.char(32)),1,8) as abap.dats) as LastChangeDate

Both coding structures will give the same result.
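For reference, a sketch of the tstmp_to_dats variant, following the documented parameter pattern tstmp_to_dats( tstmp, tzone, clnt, on_error ) and assuming upd_tmstmp is a short-form timestamp as the function expects:

// 'NULL' makes the function return a null value instead of raising an error
tstmp_to_dats( vbak.upd_tmstmp,
               abap_system_timezone( $session.client, 'NULL' ),
               $session.client,
               'NULL' ) as LastChangeDate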

Once the view is ready, we will create a Data Flow in Datasphere and restrict LastChangeDate = CurrentDate - 1 in a Projection node.


This is our Data Flow - general view


The operation performed at the TO_DATE node: since the "LastChangeDate" field we created in the CDS view arrives as a string type, it needs to be converted to a date type in this node.


TO_DATE node
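The exact syntax depends on the functions offered by the Datasphere expression editor; assuming a HANA-style TO_DATE conversion function with a format pattern, the calculated column might look like this (the 'YYYYMMDD' pattern matches the string form of an ABAP DATS value):

TO_DATE("LastChangeDate", 'YYYYMMDD')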


The operation performed at the FILTER node: the expression applied to obtain the changes (delta records) from the day before.


FILTER node
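A possible filter expression, assuming HANA-style date functions (CURRENT_DATE, ADD_DAYS) are available in the node's expression language:

"LastChangeDate" = ADD_DAYS(CURRENT_DATE, -1)

This keeps only the records changed the day before, which the scheduled daily run then loads into the target table.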


There are two ways to run the Data Flow we created:

  • Run the Data Flow manually by pressing the run button, or

  • Schedule the Data Flow to run automatically on a daily basis at a specified time.

With both methods, all data changed the day before in the source system (the delta records) will be loaded into Datasphere.

Since we chose the second method in our example, the delta records obtained in the target table node one day later, after the scheduled data flow ran, are shown below.


The result of changes (delta records) in Datasphere



The result of Scheduled Data Flow in Datasphere


 

Best Regards.

 
18 Comments
JulianJuraske
Participant
Hello Adem,

thanks for the great summary!

Do we not transfer the whole data table in the second option from the source system to the target system?
Shouldn't the purpose of delta be to transfer only the required data records?

Is there a way to pass the value (change date) down as a filter to the source (e.g. as in a BW extractor)?
Otherwise I could also just drop the table and load full in the Data Flow, or implement my own delta logic within Datasphere (using an SQL view), or use the Append option in combination with UPSERT.


Best Regards
Julian

MKreitlein
Active Contributor

Hello Adem,

thanks a lot for that Blog.

What is missing here, for my understanding, is how exactly the delta transfer happens technically.

I mean, there are two possibilities,

  • the one you showed, where you need the Data Provisioning Agent
  • the other with Dataflows or Replication Flows, where you need the Cloud Connector.

What exactly happens if there is a network issue or the respective server (DPA or CC) is down before the next data replication?

Where is the information stored about which records have been transferred successfully and which not?

I tried to find the details about this in the Online Help, but I could not really find the answers.

Thanks, Martin

AdemGuler
Participant

Hello Julian,

For the first question:
I did not see the need to mention this separately in order not to make the blog too long.
The purpose of using delta logic is to exclusively retrieve newly created, modified, and deleted records from the source system. This allows for a more efficient and streamlined process of transferring data to the target system.

For the second question:
I do not know. While the first two options are viable, I haven't tested the logic for setting it as UPSERT. However, it's important to note that when a primary key is present in our target table, we can employ that option to avoid any potential duplicate key errors.

Best Regards.

JulianJuraske
Participant
Hello Adem,

In your first option (CDC) you only transfer the changed records (after the initial full load) = delta.

In the second option (DF) you ALWAYS (not only initially) transfer the whole data from source to target.
Basically, what you are doing is a full load from source to target, filtering out the required records in the target system (by setting a change timestamp in the source).
The second option (filtering in the target system) can be achieved with many different approaches (e.g. an SQL view comparing today's data vs. yesterday's, or the APPEND UPSERT mode).
However, for me this is not a real delta, since you do not push a filter down to the source to reduce the amount of records transferred.

Best Regards
Julian
JulianJuraske
Participant
Hello Martin,

the CDC is using the ODP Framework.

There was a good blog (from 2020)
https://blogs.sap.com/2020/05/30/using-delta-extraction-capability-of-abap-cds-to-sap-data-warehouse...

So I would assume you could repeat the request sent to Datasphere.
MKreitlein
Active Contributor
Hello Julian,

When reading the blog, I would agree that this is true for Remote Tables.

But I disagree for Replication Flows!

In my project, we have 8 different CDS Views with CDC, all used in Replication Flows, and for none of them can I see entries in ODQMON.

I see all the triggers in DHCDCMON... but there is nothing where you could "repeat" anything, only stop it completely.

The only way to see a request in ODQMON is if I execute a Delta in RODPS_REPL_TEST ... but this is only for testing, not for the real replication to Datasphere.

So my question regarding no 2) is still not answered...

Thanks, Martin
BenedictV
Active Contributor
Hello Julian,

For the second option, isn't the DAY-1 filter a kind of safety interval like the one we use in FI extraction? I haven't tried this myself, but won't the delta still work as usual? If I have a year's worth of data in the source, I presume the delta only picks the DAY-1 records rather than extracting the entire data set and filtering it in DS. I have to check.
armaansingla1992
Explorer
Hi Martin Kreitlein,

 

I totally agree with you. To summarise this from my point of view, we have 2 options

  • Remote Table Replication which requires the Data Provisioning Agent.

  • Data flows or Replication Flows, which requires the Cloud Connector.


For delta replication, option 1 supports both CDC-based delta and the ODP framework (timestamp-based) delta.

For CDC based delta, it relies on triggers and can be checked in DHCDCMON.

For ODP-based delta, it can be checked in ODQMON. The problem with this option is how to adjust the refresh frequency. As per the SAP Note below, it could be 1 hour or even 1 day.

https://me.sap.com/notes/3155988/E

Do you know of any option to set the scheduled frequency?

For delta replication, option 2 supports only CDC-based delta, hence relies only on triggers, and can be checked in DHCDCMON.

Regards,

Armaan

 
albertosimeoni
Participant
The problem is that:

FIRST: all the rows are moved from the ERP to Datasphere.

SECOND: the filter is applied.

Only if a filter is simple is it pushed down to ODP.

Just adding a cast after day - 1 makes the filter invalid for pushdown.

So the delta is basically useless, because the real problem (data fetch from the SAP application layer and data transfer) is not addressed.
PhilMad
Participant
Hi Julian, are you sure about the entry in ODQ for option one? The last time I checked, I didn't see it for DSP as a receiver. Cheers.
cardoso
Product and Topic Expert
Hi Armaan,

Answering your question #1: it's not possible to change/adjust the frequency for real-time replications. The frequency is determined at runtime according to the scenario's capabilities.

Regards,
Ismael
armaansingla1992
Explorer
Hi cardoso,

 

Thanks for your response!

The issue with this is that we expect the data to be replicated in real time if we enable real-time replication. What are the factors influencing this frequency, so that they can be adjusted to avoid any delay in data replication?

 

Regards,

Armaan
cardoso
Product and Topic Expert
Hi Armaan,

I'm not aware of all details since the heuristic relies on ODP's source object.
A good rule of thumb could be:

  • 15s for real-time-capable ODPs

  • 1 day for delta-capable ODPs when heuristic expects updates only once per day (e.g. when delta/CDC mechanism is based on a date field (ABAP type DATS))

  • 1 h for any other delta-capable ODP


Regards,
Ismael
amandwivedi
Explorer

Hello Adem,

Excellent blog.


I tried replicating the same code/scenario, but I am getting an error in Datasphere when enabling real-time access.

Although, if I replicate any standard CDS extractor as a custom Z extractor, it works.

Can you help me with what I might be missing here?

Thanks
AdemGuler
Participant

Hello Julian,

Your comment about the second option is wrong, because "Data is not accessed locally in SAP Datasphere. It’s read from a virtual table, which accesses the data directly from the source system via the adaptors."

For more details, see the information at this link: https://help.sap.com/docs/SAP_DATASPHERE/be5967d099974c69b77f4549425ca4c0/4dd95d7bff1f48b399c8b55dbd...

What I am doing in the second option with data access "remote" is reading data from a virtual table, filtering on the day-before records in the source system, and loading the resulting records into the target system. In a nutshell, this is called "delta loading".

Best Regards.

AdemGuler
Participant
Hello Aman,

I suggest you check the information at the link below, including the cases described there. https://help.sap.com/docs/SAP_DATASPHERE/be5967d099974c69b77f4549425ca4c0/441d327ead5c49d580d8600301...

 

Best Regards.
JulianJuraske
Participant

Hello Adem,

I never said you persist the data of the source table, but you are transferring the whole data set, so it goes through the network and that takes time (basically a full load).

In a case where you have millions of data records, this is not what you want to do.

As Alberto says, you are filtering in the second step within the Data Flow.

What would be interesting is whether you could push the filter down to the source, so the virtual table only delivers the requested data records.

Best Regards
Julian

Arjun_KT
Participant
Hello Adem,

Excellent blog.

I have created a custom Z table in the S/4HANA system and loaded data into it.

Later, we created a CDS view with Change Data Capture (CDC) delta and replicated the initial load into Datasphere.

When I load additional data into the backend Z table, that data does not show up in Datasphere. Even when I checked ODQMON, no entries were shown.

The query uses Change Data Capture (CDC) delta with a trigger-based mechanism. Is this only possible with standard tables in S/4HANA, or can we enable Change Data Capture (CDC) delta for Z tables as well?

The S/4HANA version is 1909.

 

Regards

Arjun