Technology Blogs by SAP
LauraNevin
Product and Topic Expert

As of the QRC 02/2021 release of SAP HANA Cloud, data lake, you can query files in your data lake file container that contain structured data without ever having to load the data into a database.

What is SQL on Files?


SQL on Files is a capability of the Data Lake Files service in SAP HANA Cloud, data lake that lets you query structured data in files stored in your data lake file container.

As of the time of this writing, SQL on Files supports the following structured file formats:

    • Optimized Row Columnar (ORC)
    • Comma-Separated Values (CSV)
    • Apache Parquet

SQL on Files is considered a bridge between the Data Lake IQ and Data Lake Files components of SAP HANA Cloud, data lake.



When and why would I use this feature?


Use SQL on Files to lower the cost of analyzing large amounts of data of unknown value that is sitting in files.

SQL on Files allows you to explore and filter the data before moving aggregations of it, or all of it, into a database such as Data Lake IQ, NSE disk storage, or SAP HANA Cloud, HANA database. You can even create views on the data.

You can also use SQL on Files in cases where you just want to keep the data in files so that other tools such as Apache Spark can access it.
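
The explore-then-load workflow described above can be sketched locally in plain Python. This is only an analogy using the standard library, not the SAP HANA Cloud, data lake API: the sample data, the column names, the `region_totals` table, and both helper functions are hypothetical, and an in-memory SQLite database stands in for the target database. The raw rows stay in the file; only the filtered aggregate is loaded.

```python
import csv
import io
import sqlite3
from collections import defaultdict

# Hypothetical sample data standing in for a CSV file in a file container.
SAMPLE_CSV = """region,product,amount
EMEA,widget,120.50
EMEA,gadget,80.00
APJ,widget,200.00
AMER,widget,15.25
APJ,gadget,40.00
"""


def aggregate_from_file(file_obj, min_amount):
    """Stream rows from a CSV file, keep rows with amount >= min_amount,
    and total the amount per region. The file itself is never loaded
    into a database."""
    totals = defaultdict(float)
    for row in csv.DictReader(file_obj):
        amount = float(row["amount"])
        if amount >= min_amount:
            totals[row["region"]] += amount
    return dict(totals)


def load_aggregates(conn, totals):
    """Store only the aggregated result in the database (SQLite here,
    as a stand-in for whichever database you would actually use)."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS region_totals "
        "(region TEXT PRIMARY KEY, total REAL)"
    )
    conn.executemany(
        "INSERT OR REPLACE INTO region_totals VALUES (?, ?)",
        sorted(totals.items()),
    )
    conn.commit()


# Explore and filter the file first, then load just the aggregate.
totals = aggregate_from_file(io.StringIO(SAMPLE_CSV), min_amount=50.0)
conn = sqlite3.connect(":memory:")
load_aggregates(conn, totals)
rows = conn.execute(
    "SELECT region, total FROM region_totals ORDER BY region"
).fetchall()
print(rows)
```

With SQL on Files, the filtering and aggregation would be expressed as SQL against the files in the file container rather than as Python code; the sketch only illustrates why deferring the load can keep the database small.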

How do I access this feature?


While SQL on Files is enabled by default in your SAP HANA Cloud, data lake instance, there are a few things you need to do before you can start using it.

    1. Follow the steps in this video, Data Lake Files and SQL on Files, to see how to configure your file container, set up authentication, and create a SQL on Files user in your SAP HANA Cloud, data lake instance.

    2. Add files to your file container using the steps found here: Adding Files to a File Container.

    3. Visit this topic to find the workflow that applies to your SAP HANA Cloud, data lake configuration (stand-alone vs. HANA DB-managed): Use SQL on Files.

    4. Query data in the files you added to the container using the steps found here: Queries Using SQL on Files.



Where can I find more information?


~ Happy squealing on files! ~

 

12 Comments
former_member184466
Contributor
Be sure to check out www.sap.com/data-lake for details on SAP HANA Cloud, data lake.
LauraNevin
Product and Topic Expert
Great link, Ina; thank you. I've also added it to the list of helpful resources in the post!
kapilpokharna
Associate
Can we use this feature to run SQL queries directly on files sitting in S3 or Azure Data Lake?
LauraNevin
Product and Topic Expert

Hi Kapil, thanks for commenting.

Yes, Data Lake Files uses the concept of a Files container, which is an SAP-managed object store on your hyperscaler (Azure, AWS S3, Google Cloud Storage). You don't have to create the Files container; the data lake provisioning process does that for you automatically. You then use the SQL on Files feature to query files in the Files container. These two topics should get you started: Understanding Data Lake Files, and Understanding SQL on Files.

Hope this is helpful.

zili_zhou
Advisor
Hi Kapil,

My understanding is no. From what I can see, the file container is provisioned together with the data lake. I do not see an option to connect from the data lake to S3 or Azure Data Lake directly.

AWS has its own SQL-on-files function in Athena. You could then connect Athena to HANA Cloud or DWC and use those files as remote tables.

 

best regards

Zili
Cocquerel
Active Contributor
The link to the video (https://www.youtube.com/watch?v=wv2Yc3CSlFg) is private.

How can I get access?
LauraNevin
Product and Topic Expert
Hi Michael, that's new - thanks for reporting it; it now doesn't work for me either. I was able to locate the video although the URL is slightly different: Data Lake Files and SQL on Files - YouTube

I will fix the link in the blogpost, as well.

Thanks again,

Laura
Cocquerel
Active Contributor
Thanks
philipp_becker
Explorer
Hi Laura,

It seems that the link in the blog post is still not fixed. Was confused until I found this comment. Can you adjust the blog post accordingly?

Thanks!
LauraNevin
Product and Topic Expert
Hi Philipp, the URL works for me, although there is an unusual start with no visuals at the beginning.

Originally, I scraped the URL from the URL field at the top of the video. Just now, though, I got the link from the Share button on the video. The video ID remained the same but the start of the URL is a bit different. Anyway, I hope it works for you but let me know if not.

If not, searching for "data lake files and sql on files" in YouTube will get you to the video more directly.

Laura
philipp_becker
Explorer
Hi Laura,

The link in the blog post points to a private video that may only be accessed by people that the video was shared with, which is probably also the reason that the link works for you.

What works for me, though, is the new link that you posted in your comment on March 8th.

Best regards,
Philipp
LauraNevin
Product and Topic Expert
Got it, I have now updated the blogpost link with the link I used in my March 8 comment. Hope this works for all. Thanks for alerting me, Philipp.

best regards,

Laura