Technology Blogs by SAP
Learn how to extend and personalize SAP applications. Follow the SAP technology blog for insights into SAP BTP, ABAP, SAP Analytics Cloud, SAP HANA, and more.
cancel
Showing results for 
Search instead for 
Did you mean: 
felixbartler
Product and Topic Expert
Product and Topic Expert
Custom Python operators are the most convenient way to develop non-standard integrations in SAP Data Intelligence. The vast range of Python libraries available is virtually unlimited. In this blog post, I will demonstrate how to proxy TCP traffic of third-party libraries through the SAP Cloud Connector, enabling the integration of on-premises systems to be as effortless as internet-accessible ones.

Please note that this blog post is a continuation of my previous post, "Proxy Third-Party Python Library Traffic - General", where I provided a more comprehensive explanation of the approach and libraries used.


Szenario Architecture


Szenario: In this example, I will show how to connect to a PostgreSQL database via the Cloud Connector. The database is running on my Windows machine, and the Cloud Connector is connected to that machine. This replicates a real world on-premises setup.

Prerequisite:


Before proceeding with the steps outlined in this guide, it is essential to have an instance of the SAP Cloud Connector installed. While it is possible to install the cloud connector on a server, for the purposes of this demonstration, we will be using a Windows machine. We recommend following the instructions provided in this blog (https://blogs.sap.com/2021/09/05/installation-and-configuration-of-sap-cloud-connector/) to install and configure the cloud connector.

The second requirement is a BTP subaccount with a Data Intelligence cluster. To this subaccount, we will connect the Cloud Connector.

1. Configuration:


The first step is to create a configuration in the Cloud Connector that connects to our subaccount and exposes the PostgreSQL host as a TCP resource.

 


Cloud Connector configuration


On the BTP end, we can check the cockpit and the connected cloud connectors in the respective menu tabs. If you cannot see this tab, you may be missing some roles. It is important to note that we see the LocationID “FELIXLAPTOP”, which is an identifier that distinguishes multiple cloud connectors connected to the same subaccount.


Registered Cloud Connector Resources



2. Creating a Data Intelligence Connection:


For our purposes, we do not want to hard-code the connection details, because we need a little help from the connection management to access the Connectivity Service of BTP. In the Connection Management application from SAP Data Intelligence we can create connections of all types. We create a connection of type HTTP with host, port and SAP Cloud Connector as the gateway.

In there we specify the internal host of the Cloud Connector resource, its port and the Postgres username and password.

 


Connection with Cloud Connector Gateway


Note: Not all connection types allow you to access via the Cloud Connector. See the official product documentation for details.

3. Developing a Custom Operator:


In the operators menu of Data Intelligence we create a new custom operator based on the Python3 operator.

We build a custom Dockerfile with the required libraries. We use psycopg2 to connect to Postgres, SocketSwap as a local proxy, and sapcloudconnectorpythonsocket to open a socket via the Cloud Connector.
FROM $com.sap.sles.base

RUN python3 -m pip --no-cache-dir install 'psycopg2-binary' --user
RUN python3 -m pip --no-cache-dir install 'sapcloudconnectorpythonsocket' --user
RUN python3 -m pip --no-cache-dir install 'SocketSwap' --user

Now to the key part, the custom script:
import psycopg2
from SocketSwap import SocketSwapContext
import multiprocessing
from operators.<your_operator>.socket_factory import socket_factory

multiprocessing.set_start_method("spawn") # needed because DI Python Operator is itself executed in Threads

postgres_connection = api.config.http_connection

api.logger.info(str(postgres_connection))


def connect_postgres():
"""
This function demos how to easily setup the local proxy using the SocketSwapContext-Manager.
It exposes a local proxy on the localhost 127.0.0.1 on port 2222
The connection factory is provided to handle the creation of a socket to the remote target
"""

with SocketSwapContext(socket_factory, [postgres_connection], "127.0.0.1", 2222):
api.logger.info("Proxy started!")
# Set up a connection to the PostgreSQL database
conn = psycopg2.connect(
host="127.0.0.1",
database="postgres",
user="postgres",
password="password",
port=2222
)
api.logger.info("Postgres connected successfully!")

# Create a cursor object to execute SQL queries
cur = conn.cursor()

# Execute a SELECT query to retrieve data from a table
cur.execute("SELECT CURRENT_TIMESTAMP;")

# Fetch all the rows returned by the query
rows = cur.fetchall()

# Print the rows to the console
for row in rows:
api.logger.info(str(row))

# Close the cursor and connection
cur.close()
conn.close()


def gen():
api.logger.info("Generator Start")

connect_postgres()


api.add_generator(gen)

And we need a second file in which we put the function we are passing into the SocketSwap Proxy. Note: This function is placed in a seperate file on the top level, to make it pickleable.
from sapcloudconnectorpythonsocket import CloudConnectorSocket

def socket_factory(postgres_connection):
cc_socket = CloudConnectorSocket()
cc_socket.connect(
dest_host=postgres_connection["connectionProperties"]["host"], # virtualhost
dest_port=postgres_connection["connectionProperties"]["port"], # 5432
proxy_host=postgres_connection["gateway"]["host"], # connectivity-service-proxy
proxy_port=20004, # 20004 SOCKS5 Proxy Port of connectivity service
token=postgres_connection["gateway"]["authentication"], # auth token
location_id="FELIXLAPTOP"
)
return cc_socket

In the custom operator, script we just use a generator to start the connect_postgres function. Inside that function, we use the standard psycopg functions to make a small SQL query to the database. The only difference here: We wrap it in the SocketSwapContext to redirect its traffic through a CloudConnectorSocket.

Note: The SocketSwapContext takes mainly 3 arguments: A socket_factory, a callable that returns a connected socket, socket_factory_args, the local proxy host and its port. As we can see in the example, I pass the socket_factory function to the context manager. This function uses the already introduced CloudConnectorSocket functionality to open a Python socket using the Cloud Connector as proxy server. Check out my previous post about TCP and the SAP Cloud Connector to learn more details about this.

The local proxy is setup automatically in a background daemon process and listens to the given host, port. Usually this will be the localhost and a free port. In your third-party library you can then connect to that proxy and its traffic will be redirected automatically to the address specified as destination in the cloud connector socket.

4. Test the custom operator:


We can wrap the new operator in a Data Intelligence Graph and fill the configuration parameter with the POSTGRES_ONPREM_SYSTEM connection ID.


Custom Operator Parameters


In this example, we can check the logs and see the following:
Generator Start 
Proxy started!
Postgres connected successfully!
(datetime.datetime(2023, 4, 24, 10, 10, 52, 593842, tzinfo=datetime.timezone.utc),)

The proxy registered successfully and was able to redirect the traffic coming to localhost:2222 to the on-premises PostgreSQL. As a result of the query, we get the current_timestamp in the python datetime object.

I hope this blog gave you a perspective of the possibilities you have even through the Cloud Connector to interact with any system. If you have any questions of scenarios for the SocketSwap package, feel free to leave a comment.

 

 

 
12 Comments
0 Kudos
Great blog Felix!!! Congrats
felixbartler
Product and Topic Expert
Product and Topic Expert

Note: I made some changes to the code example. The original example did not work in all cases. I added the lines:

import multiprocessing
multiprocessing.set_start_method("spawn")

Additionally I improved the SocketSwap library to take in arguments for the socket_factory function. This is available as of SocketSwap==0.1.11

with SocketSwapContext(socket_factory, [postgres_connection], "127.0.0.1", 2222):

See in this snippet, I put the postgres_connection details into the iterable parameter to the Context Manager.

On top of all this, we need to make sure, that the socket_factory is pickleable. This is done by placing it in a seperate file on top-level. You will only need to do this if your code is mainly located in the script.py part of the operator.

johnr19651
Explorer
0 Kudos
Thanks For the Blog Felix,

We are trying to connect to an MS SQL server on-prem using the method you have outlined in this blog but running into an issue with the python script.

Using pyodbc to establish a connection to on-prem MSSQL server via SAP Cloud connector:
When running the test graph I get the following error.
Any advice on resolving this issue?

 

Error:

Python Operator's user provided script: 'Config' object has no attribute 'http_connection' File "<custom script>", line 10, in <script body> ; the following error also happened during shutdown: error with process "mssqlconnectivity1": Error while executing Python Operator's user provided script: 'Config' object has no attribute 'http_connection' File "<custom script>", line 10, in <script body>

Python Code:

import pyodbc
from SocketSwap import SocketSwapContext
import multiprocessing
from sapcloudconnectorpythonsocket import CloudConnectorSocket

multiprocessing.set_start_method("spawn")   # needed because DI Python Operator is itself executed in Threads
mssql_connection = api.config.http_connection
api.logger.info(str(mssql_connection))

def socket_factory(sql_server_connection):
cc_socket = CloudConnectorSocket()
cc_socket.connect(
dest_host=sql_server_connection["connectionProperties"]["host"],      # virtualhost
dest_port=sql_server_connection["connectionProperties"]["port"],      # SQL Server port (usually 1433)
proxy_host=sql_server_connection["gateway"]["host"],                  # connectivity-service-proxy
proxy_port=20004,                                                     # 20004 SOCKS5 Proxy Port of connectivity service
token=sql_server_connection["gateway"]["authentication"],             # auth token
location_id="YOUR_LOCATION_ID"                                        # location ID
)
return cc_socket

def connect_mssql():
"""
This function demos how to easily setup the local proxy using the SocketSwapContext-Manager.
It exposes a local proxy on the localhost 127.0.0.1 on port 2222
The connection factory is provided to handle the creation of a socket to the remote target
"""
with SocketSwapContext(socket_factory, [mssql_connection], "svrsql1-2018.sp.local", 1433):
api.logger.info("Proxy started!")
# Set up a connection to the ms sql database
conn = pyodbc.connect(
host="host",
database="DBV",
user="user",
password="password",
port=1433
)
api.logger.info("MS SQL connected successfully!")
# Create a cursor object to execute SQL queries
cur = conn.cursor()
# Execute a SELECT query to retrieve data from a table
cur.execute("SELECT CURRENT_TIMESTAMP;")

# Fetch all the rows returned by the query
rows = cur.fetchall()
# Print the rows to the console
for row in rows:
api.logger.info(str(row))
# Close the cursor and connection
cur.close()
conn.close()

def gen():
api.logger.info("Generator Start")
connect_mssql()

api.add_generator(gen)

 
felixbartler
Product and Topic Expert
Product and Topic Expert
0 Kudos
Hi John,

have a look at my other Blog Post in Step 3. You need to create a custom operator with the Connection as a parameter. https://blogs.sap.com/2023/04/14/sap-data-intelligence-python-operators-and-cloud-connector-tcp/


Let me know if I can assist any further 🙂
johnr19651
Explorer
0 Kudos
Thanks Felix,

I have reviewed the article and created a custom Gen 1 python operator.  When running the operator in a graph I get the following error.  Any ideas on what I'm missing?

Thanks,

John

Group: group1; Messages: Graph failure: operator.PythonCloudOperator:pythoncloudoperator1: Error while running generators provided by user script: EXCEPTION NEGOTIATIONG STATUS ERROR AFTER COMMAND BYTE FORBIDDEN: Connection not allowed by ruleset. No matching host mapping found in Cloud Connector access control settings, see Configure Access Control (TCP). [file '/home/vflow/.local/lib/python3.9/site-packages/sapcloudconnectorpythonsocket/sapcloudconnectorpythonsocket.py', line 85] Container is terminated. ExitCode=63
felixbartler
Product and Topic Expert
Product and Topic Expert
0 Kudos

Hi John, this message means, that the host you use in the socket is not configured in the Cloud Connector. In step 1 you can see a screenshot of how it should look like in BTP under the Cloud Connectors Tab and there you also see the host that is exposed. Make sure it is reachable.This host you can reference in your code. (Typically called virtualhost) For additional help you can also send me over additional details of your setup (connection management entry, code, cloud connector config screenshot).

Best Regards,

felix.bartler@sap.com

johnr19651
Explorer
0 Kudos
Hi Felix,

We have the TCP for the MS SQL server resource configured and reachable, both in the cloud connector and in DI.  The basis is asking for details on how the HTTP resource should be configured.  Do you have any more detail on the HTTP resource configuration?

 

Thanks,

John
felixbartler
Product and Topic Expert
Product and Topic Expert
0 Kudos
Hi John,

in step 1 you see the cloud connector configuration, it actually needs to be a TCP configuration. (Not HTTP). Maybe if they are finished you can take a screenshot and send that over, then we can have a look together.

Best Regards,

Felix
johnr19651
Explorer
0 Kudos
Thanks Felix,

 

I see.  So the HTTP resource configuration is not applicable.  Sorry, I assumed that the HTTP connection in DI was actually referencing the HTTP resource in the cloud connector.   I'll try the HTTP connection with TCP resource....

Thanks,

John

 

felixbartler
Product and Topic Expert
Product and Topic Expert
0 Kudos
Yes so be careful: 1. screenshot from cloud connector - there we use TCP - 3. Screenshot DI Connection Management (here we just use HTTP as a "dummy" - does not have anything to do with it really. We only need a connection type that supports the cloud connector)
johnr19651
Explorer
0 Kudos
Thanks Felix,

The HTTP connection is now set with TCP resource details but still getting the original error as shown below... Below is also a screen shot of the DI connection, Host field is the same as the virtual host in TCP resource in the cloud connector.  Note that we have a DI MSSQL connection working.

 

Group: group1; Messages: Graph failure: operator.PythonCloudOperator:pythoncloudoperator1: Error while running generators provided by user script: EXCEPTION NEGOTIATIONG STATUS ERROR AFTER COMMAND BYTE FORBIDDEN: Connection not allowed by ruleset. No matching host mapping found in Cloud Connector access control settings, see Configure Access Control (TCP). [file '/home/vflow/.local/lib/python3.9/site-packages/sapcloudconnectorpythonsocket/sapcloudconnectorpythonsocket.py', line 85]

 

felixbartler
Product and Topic Expert
Product and Topic Expert
0 Kudos
Okay, looks good, is the host behind the red box the same thing you can see in the BTP Cockpit under Cloudconnectors? (Under Exposed-Backends see the 2. Screenshot). By the way you can also just send me an email to my work address and we figure it out. felix.bartler@sap.com