danielsblog01
Participant
This guide shows a methodical approach to finding data with a large memory footprint and calculating the memory savings after moving that data to disk via Native Storage Extension (NSE). The TPC-H dataset is used throughout this guide.

Assumptions:

  • Reader has access to a provisioned SAP HANA instance (SPS04 or later)

  • Reader has familiarity with Database Explorer, SAP HANA Cockpit, and HANA System Views


 

Observing the Memory Footprint

There are a few system views that are useful for observing the memory footprint: M_CS_TABLES, M_CS_PARTITIONS, and M_CS_ALL_COLUMNS.

First, observe which tables have the largest memory footprint. The following SQL queries the M_CS_TABLES view and converts the values in the size columns from bytes to megabytes.
SELECT
TABLE_NAME,
SCHEMA_NAME,
MEMORY_SIZE_IN_TOTAL / 1024 / 1024 AS MEMORY_TOTAL_IN_MB,
MEMORY_SIZE_IN_PAGE_LOADABLE_MAIN / 1024 / 1024 AS MEMORY_SIZE_IN_PAGE_LOADABLE_IN_MB
FROM M_CS_TABLES WHERE SCHEMA_NAME='TPCH' ORDER BY MEMORY_SIZE_IN_TOTAL DESC;


Notice: There is quite a large table, LINEITEM. Let’s observe this table’s footprint at the column level.
SELECT
TABLE_NAME,
COLUMN_NAME,
MEMORY_SIZE_IN_TOTAL / 1024 / 1024 AS MEMORY_SIZE_IN_TOTAL_IN_MB,
MEMORY_SIZE_IN_PAGE_LOADABLE_MAIN / 1024 / 1024 AS MEMORY_SIZE_IN_PAGE_LOADABLE_IN_MB
FROM M_CS_ALL_COLUMNS WHERE TABLE_NAME='LINEITEM' ORDER BY MEMORY_SIZE_IN_TOTAL DESC;


Notice: Columns L_ORDERKEY, L_COMMENT, L_EXTENDEDPRICE, and L_PARTKEY are quite large. Also, there seem to be multiple date columns that could be good candidates for partitioning. Let’s observe these columns to get a sense of the data.
SELECT
L_ORDERKEY,
L_COMMENT,
L_EXTENDEDPRICE,
L_RECEIPTDATE,
L_SHIPDATE,
L_COMMITDATE
FROM LINEITEM ORDER BY L_SHIPDATE DESC;

Notice: L_COMMENT is long text data that doesn’t immediately appear useful for day-to-day analytics, so it makes sense to offload this column to disk. Also, notice that the L_SHIPDATE range is from 2012-01-02 to 2018-12-01. For typical year-over-year analysis we only need the latest two years of data, so we can create a partition and place all data with an L_SHIPDATE before 2017 on disk as well.
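
For reference, the ship date range and the row distribution quoted above can be confirmed with a quick aggregate query. This is a minimal sketch against the same LINEITEM table used throughout this guide:
-- Confirm the L_SHIPDATE range and see how many rows fall into each year;
-- the older years are the candidates for the page loadable partition.
SELECT
YEAR(L_SHIPDATE) AS SHIP_YEAR,
COUNT(*) AS ROW_COUNT,
MIN(L_SHIPDATE) AS EARLIEST_SHIPDATE,
MAX(L_SHIPDATE) AS LATEST_SHIPDATE
FROM LINEITEM
GROUP BY YEAR(L_SHIPDATE)
ORDER BY SHIP_YEAR;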

Before moving any data, it is useful to check the current buffer cache configuration with the following views. To see results in these views, the user running the query requires adequate system privileges.
SELECT * FROM M_BUFFER_CACHE_STATISTICS;
SELECT * FROM M_BUFFER_CACHE_POOL_STATISTICS;

Right now, the buffer cache is configured for 1 GB. The buffer cache size can be altered with the following SQL, which requires the INIFILE ADMIN system privilege.
ALTER SYSTEM ALTER CONFIGURATION ('indexserver.ini', 'system') SET ('buffer_cache_cs', 'max_size') = '<SIZE IN MB>' WITH RECONFIGURE;
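
For example, to set the buffer cache to 2048 MB (a hypothetical value for this test system, not a sizing recommendation):
ALTER SYSTEM ALTER CONFIGURATION ('indexserver.ini', 'system') SET ('buffer_cache_cs', 'max_size') = '2048' WITH RECONFIGURE;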

 

Decide Which Data to Unload

Native Storage Extension supports moving portions of a table to disk, which allows more flexibility in the performance-cost trade-off of storing data on disk.
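
The load unit can be set at three grains: the whole table, a single partition, or a single column. The following is only a sketch; the column and partition variants are applied later in this post, and the table-level form with CASCADE is shown for completeness and should be checked against the documentation for your revision.
-- Entire table page loadable; CASCADE propagates the setting to all partitions and columns
ALTER TABLE LINEITEM PAGE LOADABLE CASCADE;
-- A single column page loadable (the column definition is repeated)
ALTER TABLE LINEITEM ALTER ("L_COMMENT" NVARCHAR(44) PAGE LOADABLE);
-- A single partition page loadable (once the table has been partitioned)
ALTER TABLE LINEITEM ALTER PARTITION 2 PAGE LOADABLE;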

From the simple analysis of the LINEITEM table, the L_COMMENT column can be offloaded and the L_SHIPDATE column can be used to partition the table.

To partition the table on a date column while a table has a primary key, a multilevel partition is required.
ALTER TABLE LINEITEM PARTITION BY HASH (L_ORDERKEY) PARTITIONS 1, RANGE(L_SHIPDATE) (
PARTITION '1990-01-01' <= VALUES < '2017-12-02',
PARTITION OTHERS
);

Now the newly created partition and the L_COMMENT column can be offloaded from memory, and the memory savings can be calculated.

First, observe the M_CS_TABLES view again to see the sizes of the newly created partitions.
SELECT
TABLE_NAME,
PART_ID,
MEMORY_SIZE_IN_TOTAL / 1024 / 1024 AS MEMORY_SIZE_IN_TOTAL_IN_MB,
MEMORY_SIZE_IN_PAGE_LOADABLE_MAIN / 1024 / 1024 AS MEMORY_SIZE_IN_PAGE_LOADABLE_IN_MB
FROM M_CS_TABLES WHERE TABLE_NAME='LINEITEM' ORDER BY MEMORY_SIZE_IN_TOTAL DESC;


Notice: There is a larger and a much smaller partition, partitions 2 and 1 respectively. Now, set the partition with ID 2 and column L_COMMENT to have the PAGE LOADABLE load unit.

 

Setting the Load Unit

The SQL syntax for setting load units can be found in the SAP HANA documentation. In this case, the following statements were used.
ALTER TABLE LINEITEM ALTER ("L_COMMENT" NVARCHAR(44) PAGE LOADABLE);
ALTER TABLE LINEITEM ALTER PARTITION 2 PAGE LOADABLE;

Verify the changes in the M_CS_TABLES and M_CS_ALL_COLUMNS views, where the LOAD_UNIT column can be observed. A value of PAGE indicates data stored on disk and COLUMN indicates data stored in memory.
SELECT 
TABLE_NAME,
PART_ID,
MEMORY_SIZE_IN_TOTAL / 1024 / 1024 AS MEMORY_SIZE_IN_TOTAL_IN_MB,
MEMORY_SIZE_IN_MAIN / 1024 / 1024 AS MEMORY_SIZE_IN_MAIN_IN_MB,
LOAD_UNIT
FROM M_CS_TABLES
WHERE TABLE_NAME='LINEITEM' ORDER BY MEMORY_SIZE_IN_TOTAL DESC;

SELECT
TABLE_NAME,
COLUMN_NAME,
MEMORY_SIZE_IN_TOTAL / 1024 / 1024 AS MEMORY_SIZE_IN_TOTAL_IN_MB,
MEMORY_SIZE_IN_MAIN / 1024 / 1024 AS MEMORY_SIZE_IN_MAIN_IN_MB,
LOAD_UNIT
FROM M_CS_ALL_COLUMNS
WHERE TABLE_NAME='LINEITEM' ORDER BY MEMORY_SIZE_IN_TOTAL DESC;


Observe that the partition with PART_ID = 2 has already gone from using 33.22 MB of main memory to only 0.8194 MB: (33.22 - 0.8194) / 33.22 = 0.975, a 97.5% reduction in main memory usage for this partition. Unload the table completely from memory with the following SQL.
UNLOAD LINEITEM;

Load the data back into memory using the following SQL.
LOAD LINEITEM ALL;

Together, these statements unload the LINEITEM table completely from memory and load it back: column loadable data returns to main memory, while page loadable data goes to the buffer cache.
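
As a quick sanity check, the load status and load unit can be confirmed after the reload; the LOADED column of M_CS_TABLES reports whether a table or partition is currently fully, partially, or not loaded.
SELECT
TABLE_NAME,
PART_ID,
LOADED,
LOAD_UNIT,
MEMORY_SIZE_IN_MAIN / 1024 / 1024 AS MEMORY_SIZE_IN_MAIN_IN_MB
FROM M_CS_TABLES
WHERE TABLE_NAME='LINEITEM';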

 

Calculate the Total Memory Savings

To get a baseline of the table’s size in memory, on disk, and in page loadable disk storage, first revert the earlier load unit changes, reload the table, and then run the final query below.
ALTER TABLE TPCH.LINEITEM ALTER ("L_COMMENT" NVARCHAR(44) COLUMN LOADABLE);
ALTER TABLE TPCH.LINEITEM ALTER PARTITION 2 COLUMN LOADABLE;
UNLOAD TPCH.LINEITEM;
LOAD TPCH.LINEITEM ALL;
SELECT SUM(MEMORY_SIZE_IN_MAIN / 1024 / 1024), SUM(DISK_SIZE / 1024 / 1024 ), SUM(DISK_SIZE_IN_PAGE_LOADABLE / 1024 / 1024 ) FROM M_TABLE_PARTITIONS WHERE TABLE_NAME='LINEITEM';


We can observe that the table currently takes up 248.67 MB in main memory, and likewise in DISK_SIZE if we were to unload it. Also note that, since I’ve made the table completely COLUMN LOADABLE by reversing the load unit changes I made earlier, there is currently 0 space taken up in the page loadable disk storage. Now, I can compare this to the results after I’ve made my partition and the L_COMMENT column PAGE LOADABLE.
ALTER TABLE TPCH.LINEITEM ALTER ("L_COMMENT" NVARCHAR(44) PAGE LOADABLE);
ALTER TABLE TPCH.LINEITEM ALTER PARTITION 2 PAGE LOADABLE;
UNLOAD TPCH.LINEITEM;
LOAD TPCH.LINEITEM ALL;
SELECT SUM(MEMORY_SIZE_IN_MAIN / 1024 / 1024), SUM(DISK_SIZE / 1024 / 1024 ), SUM(DISK_SIZE_IN_PAGE_LOADABLE / 1024 / 1024 ) FROM M_TABLE_PARTITIONS WHERE TABLE_NAME='LINEITEM';


The result shows that the table now takes only 83.54 MB of main memory and 170.35 MB in the page loadable disk storage. Comparing main memory usage: (248.67 - 83.54) / 248.67 = 0.664, a 66.4% reduction in the main memory footprint for this table.
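
If you prefer to let the database do the arithmetic, one option is to snapshot the baseline value into a local temporary table before switching the load unit and compute the reduction after the reload. This is a sketch, assuming CREATE LOCAL TEMPORARY TABLE ... AS is available on your revision; #LINEITEM_BASELINE is an illustrative name, not part of the guide.
-- Before changing the load unit: capture the baseline main memory size
CREATE LOCAL TEMPORARY TABLE #LINEITEM_BASELINE AS (
SELECT SUM(MEMORY_SIZE_IN_MAIN) AS MEM_MAIN
FROM M_TABLE_PARTITIONS
WHERE TABLE_NAME='LINEITEM'
);

-- After ALTER ... PAGE LOADABLE, UNLOAD, and LOAD ... ALL: relative reduction in percent
SELECT ROUND(100.0 * (B.MEM_MAIN - C.MEM_MAIN) / B.MEM_MAIN, 1) AS MAIN_MEMORY_REDUCTION_PCT
FROM #LINEITEM_BASELINE B,
(SELECT SUM(MEMORY_SIZE_IN_MAIN) AS MEM_MAIN
FROM M_TABLE_PARTITIONS
WHERE TABLE_NAME='LINEITEM') C;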

The same steps can be applied at the column level as well; the statements are shown below.
ALTER TABLE TPCH.LINEITEM ALTER ("L_COMMENT" NVARCHAR(44) COLUMN LOADABLE);
UNLOAD TPCH.LINEITEM;
LOAD TPCH.LINEITEM ALL;
SELECT SUM(MEMORY_SIZE_IN_MAIN / 1024 / 1024) FROM M_CS_ALL_COLUMNS WHERE TABLE_NAME='LINEITEM' AND COLUMN_NAME='L_COMMENT';

ALTER TABLE TPCH.LINEITEM ALTER ("L_COMMENT" NVARCHAR(44) PAGE LOADABLE);
UNLOAD TPCH.LINEITEM;
LOAD TPCH.LINEITEM ALL;
SELECT SUM(MEMORY_SIZE_IN_MAIN / 1024 / 1024) FROM M_CS_ALL_COLUMNS WHERE TABLE_NAME='LINEITEM' AND COLUMN_NAME='L_COMMENT';

Below are the results for the main memory footprint of the L_COMMENT column as column loadable and page loadable, respectively.


 

Summary

To calculate the main memory savings of NSE:

  1. Note the MEMORY_SIZE_IN_MAIN value in M_CS_TABLES or M_CS_ALL_COLUMNS for a table or column.

  2. Set the load unit of a table, column, or partition to PAGE LOADABLE.

  3. UNLOAD the data from main memory. The easiest way to do this is to UNLOAD the entire table then LOAD it back.

  4. Use the new MEMORY_SIZE_IN_MAIN value in M_CS_TABLES or M_CS_ALL_COLUMNS for a table or column to calculate the memory savings. (OLD_MEM_VALUE - NEW_MEM_VALUE) / (OLD_MEM_VALUE)


Following these steps, it is possible to calculate the main memory savings of NSE for any table, column or partition.

If you’ve found this blog post useful feel free to give it a like! Questions are welcome in the comment section below or on the community questions page.

Have another NSE topic you want to learn more about? Comment down below!
13 Comments
Cocquerel
Active Contributor
0 Kudos
Hello,

Is there a way to estimate the memory savings without doing the conversion? Same question in case the table is initially in Dynamic Tiering and we would like to estimate the impact of switching to NSE.
Regards,

Michael
jgleichmann
Active Contributor
0 Kudos
Hi Michael,

 

The real memory savings depend on the size of the buffer cache. Nobody can tell you how each column will use the buffer cache. Some objects are fully loaded into the cache, some partly, and some stay at disk level.

In the end you cannot estimate savings per table. You can only summarize all paged objects and compare them to the buffer cache for an overall view. This means you can estimate the amount of memory which you can page out via NSE per table, but please keep in mind that these are not your savings. These are your best-case savings if nothing is loaded into memory.

Regards,

Jens
Cocquerel
Active Contributor
0 Kudos
Do you mean that, in the worst-case scenario (table always loaded into the buffer cache), a page loadable table may consume more memory than a column loadable table?
danielsblog01
Participant
Hi Michael,

A page loadable table will not consume more memory than a column loadable table. This is because a page loadable table will exist partly on disk and partly in the buffer cache (in memory), whereas a column loadable table/partition/column will always be fully loaded in memory. The memory savings come from the fact that you only ever load the data you need from disk into the buffer cache, while the less frequently accessed data resides on disk.

As Jens points out, as your workloads change, your memory savings will change as well. The more data you require from disk, the more data could potentially be loaded into the buffer cache, altering the exact memory savings you get from NSE.

Hope this helps,

Daniel
jgleichmann
Active Contributor
Hi Daniel,

 

Thanks for sharing the details regarding NSE. It is a great feature to optimize memory usage, license costs, and TCO. But it is a complex topic and the candidates should be chosen wisely. The analysis to find the right candidates and estimate the savings is really difficult, because you cannot use the advisor in all scenarios, as it is costly in terms of performance.

Some follow-ups here:
"To partition the table on a date column while a table has a primary key, a multilevel partition is required."

This is not correct; you can use range partitioning with heterogeneous partitioning (44. What is heterogeneous partitioning?) for attributes which are not part of the primary key, without using multilevel partitioning.




"A value of PAGE indicates data stored on disk and COLUMN indicates data stored in memory."

I'm struggling with this statement, because with the COLUMN attributes the data is also stored on disk. Simply all data (nearly all besided temp and caches) is persistent and stored on disk. PAGE attribute means only that the data is handled different on disk and memory level regarding dictionary and buffer cache.




"Use the new MEMORY_SIZE_IN_MAIN value in M_CS_TABLES or M_CS_ALL_COLUMNS for a table or column to calculate the memory savings. (OLD_MEM_VALUE – NEW_MEM_VALUE) / (OLD_MEM_VALUE)"

With old and new memory values you can only take a current snapshot of the loaded data, but it depends on the usage/access of the data. If nobody is using the NSE data, you will have 100% savings? I think this is not a good indication of the savings. Another thing is the comparison of memory and disk values. The real savings are the old average loaded main memory values of all affected NSE tables compared to the size of the buffer cache. You cannot break it down to one table. With only one table in NSE you can describe it, but this is not the real world. You won't use NSE for only one table.

Additional info: Use M_CS_COLUMNS_PERSISTENCE view if the table is not partitioned and you want to know the disk size.
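
For example, a minimal check (which size columns are available in this view depends on the revision):
SELECT * FROM M_CS_COLUMNS_PERSISTENCE WHERE TABLE_NAME='LINEITEM';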

 

Regards,

Jens

 
danielsblog01
Participant
0 Kudos
Hi Jens,

Thank you for your feedback!

The information on single-level range partitioning can be found in the SAP HANA documentation, where the following is documented:
Range partitioning is similar to hash partitioning in that if the table has a primary key on it, the partitioning columns must be part of the key.

You are correct that you can achieve a partition on the date column with heterogeneous partitioning. Heterogeneous partitioning is a form of multi-level partitioning, as best said in the documentation:
Heterogeneous partitioning offers more flexibility for range-range and range-hash two-level partitioning schemas.

Specifically, it allows us to achieve an "unbalanced" second level partitioning scheme.

As for the difference between PAGE and COLUMN loadable, you are correct that data is persisted on disk. This means data is regularly saved there for backup and recovery purposes. When querying the data, data set to COLUMN loadable is processed in memory. Data set to PAGE loadable will be paged into the buffer cache from disk (if there is enough room), or read from disk if the buffer cache is full.

Regarding the method of calculating memory savings outlined in this blog: it outlines the "best possible" savings. It may come across as shallow in terms of the real world, but it is a lesson on leveraging HANA system views and NSE to check and monitor the difference in memory after moving some data to NSE.

 

Best,

Daniel

 
 
jgleichmann
Active Contributor
Interesting interpretation of my explanation:)

But as already stated by Daniel, this is not the case. With NSE you only need a little more disk space compared to the normal COLUMN variant. This overhead is caused by the dictionary stored together with the NSE variant. The overhead is not a fixed value; it depends on the variation of values. You can assume an overhead of 5-10%. But disk space should not be a bottleneck 😉

 

Regards,

Jens
RobertWaywell
Product and Topic Expert
You absolutely can measure savings per table/column/partition. Those savings will be specific to a given workload and a given data set, but the measurement should be highly repeatable.

 

In this particular blog post, Daniel has focused on the system views available to provide insight into memory and disk usage and how to interpret the values returned by those system views. In order to determine the savings for a workload you need to take the additional steps of testing the workload before and after converting objects (tables/columns/partitions) from column loadable to page loadable.

One common approach is to use workload capture & replay to capture a real-world workload, measure the baseline performance of the replay on a test system, make the changes to the LOAD UNIT configuration, and then re-run the replay to evaluate the difference in both memory consumption and performance.

 

You are correct that the buffer cache size sets a maximum limit on how much memory can be consumed by page loadable data. However, an improperly sized buffer cache - specifically one that is too large - can eliminate potential memory savings, since the caching algorithm will happily make use of the full configured size of the buffer cache before ejecting any stale pages from the cache.

Again, this leads into further evaluation building on the information outlined in this blog. Once you understand the size of the data you are converting to page loadable, you then need to look at the configured size of the buffer cache and its used size. I've worked with multiple customers who, at least during their POCs, needed to reduce the size of the buffer cache from the default configuration in order to see paging actually happening, and only then could they realize the memory savings from utilizing NSE.
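
As a starting point, the configured and used sizes can be compared directly (a sketch, assuming the MAX_SIZE and USED_SIZE columns of the statistics view shown earlier in the blog):
SELECT CACHE_NAME, MAX_SIZE / 1024 / 1024 AS MAX_SIZE_MB, USED_SIZE / 1024 / 1024 AS USED_SIZE_MB
FROM M_BUFFER_CACHE_STATISTICS;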
jgleichmann
Active Contributor

Hi Daniel,

 

the following statement is only true for homogeneous partitioning:

"Range partitioning is similar to hash partitioning in that if the table has a primary key on it, the partitioning columns must be part of the key."

It is not valid for heterogeneous partitioning!

Heterogeneous partitioning is a valid single-level partitioning. You can also combine it as range-range or range-hash, but this was not my initial statement. You can use it if you don't want to use a column inside the PKEY. So, you can directly use the needed column without using the PKEY and multilevel partitioning. AFAIK, if you use a HASH-RANGE partitioning, which is frequently used with the client column MANDT/MANDANT, only the range partitions are paged out, but I have not tested this myself. I just found it in some official slides for SPS05 (maybe you can shed some light on it):

Source: HANA20-SPS05_WhatsNew_MissionCriticalDataCenterOperations-Efficiency

The documentation states:

"single level heterogeneous partitioning is also possible"

I think there is still some room for improvement in the documentation. I have also tested and verified this behavior with the prominent example table VBAK:

alter table SAPREX.VBAK partition by range (ERDAT) ((
PARTITION 19700101 <= VALUES < 20180101,
PARTITION 20180101 <= VALUES < 20190101,
PARTITION 20190101 <= VALUES < 20200101,
PARTITION 20200101 <= VALUES < 20210101,
PARTITION 20210101 <= VALUES < 20220101,
PARTITION 20220101 <= VALUES < 20230101,
PARTITION OTHERS
));

Field ERDAT is not part of the PKEY and it is a valid single level heterogeneous partitioning.

VBAK Column

VBAK Heterogeneous partitioning

Another topic is performance. Homogeneous partitioning will also be faster than heterogeneous. But if the read/write ratio is right, you won't have many reads on the NSE data. But here we are back at the beginning of my comment: 'candidates should be chosen wisely'. There should be some more details and examples in the documentation.

 

Regards,

Jens

former_member665799
Discoverer
Hi Daniel,

Thanks for this beautifully written article.

Just one thing I wanted to add on top of what you mentioned. There is an alternate syntax for the SQL statement to alter a column from COLUMN loadable to PAGE loadable and vice versa. This new syntax, however, will not change the measurements related to memory savings.
ALTER TABLE TPCH.LINEITEM ALTER ("L_COMMENT" ALTER PAGE LOADABLE);

Here, instead of using the column definition, i.e. NVARCHAR(44), we have used the keyword ALTER.

Kindly note that this new syntax works when TRANSACTION AUTOCOMMIT DDL is ON.

Regards,

Souvik
ruediger_karl
Advisor
Hello Jens, Daniel,

Let me clarify quickly. In HANA on-prem we introduced heterogeneous partitioning, which allows a more flexible definition for range, range-range, and range-hash partition schemas. With heterogeneous partitioning in SPS05, there is no requirement to partition on a primary key column.

Daniel didn't use the heterogeneous partitioning definition for his example, but referred to the "older" range-partitioning definition, which requires partitioning on the primary key column.

Hope this clarifies,

Ruediger

 
jgleichmann
Active Contributor
0 Kudos
Hi Ruediger,

 

Thanks for the clarification. Maybe you can say more about performance facts and when heterogeneous partitioning should be used. Is the statement regarding the picture (see comment above) from "SPS05_WhatsNew" in the context of HASH-RANGE still valid?

 

Regards,

Jens
ruediger_karl
Advisor
Hi Jens,

Here are my comments.

  1. NSE (page loadable) can be applied on range-partitions only.

  2. A heterogeneous partition schema might still show somewhat slower performance than the "older" (non-heterogeneous) partition schema in SPS05, depending on the queries. That will be improved over the next SPS.

  3. The heterogeneous partition schema definition provides more flexibility (location placement on 1-level and 2-level) and efficiency (avoiding empty or unneeded partitions), but might show somewhat slower performance in some cases (see 2.). There is no strict guideline, but you would need to find the best trade-off for your use case.


Additionally, here are the supported partitioning schemas in NSE and the required partitioning definition in SPS05:

hash-range (non-heterogeneous partitioning)
range, range-range, range-hash (heterogeneous partitioning)

Regards, Ruediger