Technology Blogs by SAP
Learn how to extend and personalize SAP applications. Follow the SAP technology blog for insights into SAP BTP, ABAP, SAP Analytics Cloud, SAP HANA, and more.
In the past months, I’ve been working on the SAP HANA dump analyzer: an intelligent, easy-to-use Java executable that automatically analyzes HANA issues. You can get the latest SAP HANA dump analyzer here. You can also check the changes for the HANA dump analyzer.

There will be a series of blogs to explain the essential features of the SAP HANA dump analyzer and to introduce intelligent solutions built on top of it (the Autonomous Self-Analysis System).

This post is going to give you an overview of the SAP HANA dump analyzer.
The Problem

If you’ve ever supported or managed an SAP HANA system, you have most likely worked with SAP HANA runtime dumps already. An SAP HANA runtime dump is a text file that provides various information about the current state of the SAP HANA database. HANA runtime dumps are frequently used to troubleshoot technical issues like system hangs, high memory consumption etc. However, they are plain text files containing massive amounts of information, including hundreds of call stacks, thread information, HANA statistics etc. This makes it difficult to analyze the dumps and to derive a conclusion within a reasonable time frame.

The very first challenge in analyzing a HANA runtime dump is to recognize the pattern in thousands of thread call stacks.

A single HANA thread call stack from the HANA runtime dump looks like the following:
1817914258[thr=73928]: JobWrk11358 at
1: 0x00007fcf3887dfd9 in syscall+0x15 (libc.so.6)
2: 0x00007fcf3a5e2019 in Synchronization::BinarySemaphore::timedWait(unsigned long, Execution::Context&)+0x255 at LinuxFutexOps.hpp:53 (libhdbbasis.so)
3: 0x00007fcf4008b935 in Executor::X2OldLock::calculate(Executor::X2Statistics&)+0x4a1 at X2OldLock.cpp:609 (libhdbexecutor.so)
4: 0x00007fcf4002bd39 in Executor::PlanExecutor::calculateX2(TRexCommonObjects::TRexApiError&, Executor::X2Statistics&)+0x295 at PlanExecutor.cpp:862 (libhdbexecutor.so)
5: 0x00007fcf4002ca67 in Executor::PlanExecutor::calculate(TRexCommonObjects::TRexApiError&, Executor::X2Statistics&)+0x1b3 at PlanExecutor.cpp:687 (libhdbexecutor.so)
6: 0x00007fcf5cf92038 in JoinEvaluator::JoinAPI::execute(TRexCommonObjects::TRexApiError&)+0x784 at PlanExecutor.h:61 (libhdbcsapi.so)
7: 0x00007fcf5c5a35ca in TRexAPI::JoinSearchImpl::executeSearch(Execution::Context&, TRexAPI::PreparedQuery const&, TRexAPI::QueryRuntime&, ltt::smartptr_handle<TRexCommonObjects::InternalTableBase>&)+0xe16 at JoinSearchImpl.cpp:40 (libhdbcsapi.so)
8: 0x00007fcf5c5e3236 in TRexAPI::SearchAPI::extractResults(Execution::Context&, TRexAPI::Search::RowProjectors*, TRexAPI::Search::RawResultContext*)+0x152 at SearchAPI.cpp:317 (libhdbcsapi.so)
9: 0x00007fcf5c5e3f7e in TRexAPI::SearchAPI::fetchAll(Execution::Context&, bool)+0x3a at SearchAPI.cpp:893 (libhdbcsapi.so)
10: 0x00007fcf4366be57 in ptime::TrexOltpSearch::search(Execution::Context&, bool, bool)+0x93 at trex_oltp_query.cc:194 (libhdbcswrapper.so)
11: 0x00007fcf45c28e89 in ptime::Trex_oltp_search::do_open(ptime::OperatorEnv&, ptime::QEParams, int) const+0x425 at qe_trex_search.cc:4215 (libhdbrskernel.so)
12: 0x00007fcf45b94e6c in ptime::Table::open(ptime::Env&, ptime::QEParams, int) const+0x148 at qe_table.cc:230 (libhdbrskernel.so)
13: 0x00007fcf45c6a2ef in ptime::Itab_materializer::do_open(ptime::OperatorEnv&, ptime::QEParams, int) const+0x27b at qe_itab_materializer.cc:94 (libhdbrskernel.so)
14: 0x00007fcf45b94e6c in ptime::Table::open(ptime::Env&, ptime::QEParams, int) const+0x148 at qe_table.cc:230 (libhdbrskernel.so)
15: 0x00007fcf45c12fb3 in ptime::Trex_oltp_search::evaluateChildren(ptime::OperatorEnv&, TRexAPI::QueryRuntimeData&, ptime::QEParams) const+0x80 at qe_trex_search.cc:3849 (libhdbrskernel.so)
16: 0x00007fcf45c28c87 in ptime::Trex_oltp_search::do_open(ptime::OperatorEnv&, ptime::QEParams, int) const+0x223 at qe_trex_search.cc:4169 (libhdbrskernel.so)
17: 0x00007fcf45b94e6c in ptime::Table::open(ptime::Env&, ptime::QEParams, int) const+0x148 at qe_table.cc:230 (libhdbrskernel.so)
18: 0x00007fcf435f54e5 in ptime::TrexPlanOp::executePtimeOp(ltt_adp::vector<Executor::PlanData*, ltt::integral_constant<bool, true> > const&, ltt_adp::vector<Executor::PlanData*, ltt::integral_constant<bool, true> > const&, TRexCommonObjects::TRexApiError&, Executor::ExecutionInfo const&)+0x111 at trex_plan.cc:385 (libhdbcswrapper.so)
19: 0x00007fcf435f573c in ptime::TrexPlanOp::executePop(ltt_adp::vector<Executor::PlanData*, ltt::integral_constant<bool, true> > const&, ltt_adp::vector<Executor::PlanData*, ltt::integral_constant<bool, true> > const&, TRexCommonObjects::TRexApiError&, Executor::ExecutionInfo const&)+0x38 at trex_plan.cc:266 (libhdbcswrapper.so)
20: 0x00007fcf4008d46d in Executor::X2OldLock::runPopTask(Executor::X2::PopTaskInfo&, int&, ltt::allocator&, ltt::allocator&)+0x14a9 at X2OldLock.cpp:2473 (libhdbexecutor.so)
21: 0x00007fcf4007d2ec in Executor::X2OldLock::runPopJob(Executor::X2Job*)+0x78 at X2OldLock.cpp:2090 (libhdbexecutor.so)
22: 0x00007fcf4007e7c3 in Executor::X2OldLockJob::run(Execution::JobObject&)+0x1f0 at X2OldLock.cpp:4495 (libhdbexecutor.so)
23: 0x00007fcf3a32915b in Execution::JobObjectImpl::run(Execution::JobWorker*)+0x1217 at JobExecutorImpl.cpp:1098 (libhdbbasis.so)
24: 0x00007fcf3a3347d4 in Execution::JobWorker::runJob(ltt::smartptr_handle<Execution::JobObjectForHandle>&)+0x3b0 at JobExecutorThreads.cpp:217 (libhdbbasis.so)
25: 0x00007fcf3a337037 in Execution::JobWorker::run(void*&)+0x1f3 at JobExecutorThreads.cpp:436 (libhdbbasis.so)
26: 0x00007fcf3a38f637 in Execution::Thread::staticMainImp(void**)+0x743 at Thread.cpp:463 (libhdbbasis.so)
27: 0x00007fcf3a390cc8 in Execution::Thread::staticMain(void*)+0x34 at ThreadMain.cpp:26 (libhdbbasis.so)
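To get a first rough overview programmatically, a thread entry like the one above can be split into its thread name and function frames with a few regular expressions. This is a minimal sketch, not the analyzer’s actual parser, and it assumes exactly the line layout shown above:

```python
import re

# Hypothetical minimal parser for thread entries in the format shown above;
# the real runtime-dump format has more variants than this sketch handles.
THREAD_RE = re.compile(r"^(\d+)\[thr=(\d+)\]:\s+(\S+)")
FRAME_RE = re.compile(r"^\s*\d+:\s+0x[0-9a-f]+\s+in\s+(.+?)\+0x[0-9a-f]+")

def parse_thread(block: str):
    """Split one thread entry into (thread_name, [function frames])."""
    lines = block.strip().splitlines()
    m = THREAD_RE.match(lines[0])
    name = m.group(3) if m else "unknown"
    frames = []
    for line in lines[1:]:
        fm = FRAME_RE.match(line)
        if fm:
            frames.append(fm.group(1))
    return name, frames

example = """1817914258[thr=73928]: JobWrk11358 at
1: 0x00007fcf3887dfd9 in syscall+0x15 (libc.so.6)
2: 0x00007fcf3a5e2019 in Synchronization::BinarySemaphore::timedWait(unsigned long, Execution::Context&)+0x255 at LinuxFutexOps.hpp:53 (libhdbbasis.so)
"""
name, frames = parse_thread(example)
print(name, frames[0])  # prints: JobWrk11358 syscall
```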

 

For a busy HANA system, the thread call stacks could look like the following (an example of around 2000 working HANA thread call stacks):



Understanding the above HANA thread call stacks within limited time feels insurmountable because there is simply too much data to study!
FlameGraph for HANA

Inspired by Brendan Gregg’s FlameGraphs, the HANA thread call stacks are visualized as flame graphs. Each column is one HANA thread. Different threads with similar call stacks are grouped together; e.g. many threads on the right side of the following FlameGraph are blocked by the savepoint (i.e. on the call stack frame DataAccess::SavepointLock::lockShared). The FlameGraph visualization is intuitive: you naturally look at the bigger sections of the threads flame graph, with or without deep knowledge of HANA.
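Grouping identical stacks is essentially Brendan Gregg’s “folded stacks” preprocessing step. A sketch of the idea follows, with made-up frame names; since runtime dumps print stacks leaf-first, each stack is reversed before joining:

```python
from collections import Counter

def fold(stacks):
    """Collapse thread stacks (given leaf-first, as printed in a runtime
    dump) into flame-graph 'folded' lines: 'root;...;leaf <count>'."""
    counts = Counter(";".join(reversed(s)) for s in stacks)
    return [f"{stack} {n}" for stack, n in counts.most_common()]

# Made-up frame names standing in for real HANA frames:
stacks = [
    ["lockShared", "Savepoint", "run"],  # leaf-first: lockShared is the leaf
    ["lockShared", "Savepoint", "run"],
    ["syscall", "timedWait", "run"],
]
for line in fold(stacks):
    print(line)  # the two identical savepoint stacks collapse into one line
```

Fed to a flame-graph renderer, those folded lines produce exactly the “wider column = more threads” effect described above.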



There are different variations of the flame graph to analyze the dump from different angles, e.g.

  • Reversed FlameGraph can be used for better visualization if many threads are waiting on the same lock, i.e. the leaf call stack frames are the same, though they may come from different parent call stack frames.

  • Memory FlameGraph can be used to visualize HANA allocator memory consumption when M_HEAP_MEMORY is available from the dump.

  • Concurrent FlameGraph visualizes OLAP query execution (i.e. threads hierarchy).
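Of these variants, the reversed one is the simplest to sketch: it keeps the frame order leaf-first, so threads that end on the same frame (e.g. the same lock) share a common prefix and merge at the base of the graph. A sketch with hypothetical frame names:

```python
from collections import Counter

def fold_reversed(stacks):
    """Folded stacks for a reversed flame graph: keep the leaf-first order,
    so threads blocked on the same leaf frame merge at the graph's base."""
    counts = Counter(";".join(s) for s in stacks)
    return [f"{stack} {n}" for stack, n in counts.most_common()]

# Two hypothetical threads waiting on the same lock via different callers:
stacks = [
    ["lockShared", "pathA", "run"],
    ["lockShared", "pathB", "run"],
]
for line in fold_reversed(stacks):
    print(line)  # both lines share the 'lockShared' prefix, merging at the base
```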


With the better visualization from the FlameGraph, it’s much easier to see a pattern in the thread call stacks.
The Auto Analyzer

However, is this already the full package for my desired HANA dump analyzer?

I want the HANA dump analyzer to answer me directly:

  • Is there any issue from the runtime dump?

  • If yes, what is the issue, and how is that conclusion reached?

  • What are the possible workarounds or solutions and how to move forward?


With these questions, I tried to find an answer: the Auto Analyzer feature of the HANA dump analyzer was created to automatically and systematically analyze the issue from the runtime dump and create an analysis report. An example of the analysis report looks like the following:


SAP HANA dump analyzer

The following part describes the latest version of the SAP HANA dump analyzer in detail, including the auto analyzer and the expert mode.

SAP HANA dump analyzer is a Java program that can be executed directly when Java is properly installed. The GUI looks like the following:



It allows drag and drop of runtime dumps; a dump is automatically analyzed after double-clicking the selected runtime dump or clicking the “Auto Analyzer” button, and the analysis report is created automatically. The analysis report can be saved as a single HTML page via the browser.

The SAP HANA dump analyzer can also be executed via the command line to analyze a provided SAP HANA runtime dump and return the analysis report. The help page of the SAP HANA dump analyzer command line is available via:
java -jar HANADumpAnalyzer.jar -help

It’s possible to integrate the SAP HANA dump analyzer with your monitoring infrastructure together with other tools. In the best case, this can achieve an Autonomous Self-Analysis System that self-detects and analyzes HANA issues in certain scenarios. Please understand that the Autonomous Self-Analysis System with SAP HANA dump analyzer may help you automate certain monitoring tasks and analyze a scenario, but you need to know what you want to achieve, and the setup should be tested thoroughly, as you use it at your own risk.
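As an illustration of such an integration, the sketch below watches a trace directory and runs the analyzer on every new runtime dump. It is a hypothetical example: the trace path and the positional dump argument are assumptions; consult `java -jar HANADumpAnalyzer.jar -help` for the real command-line options.

```python
import subprocess
from pathlib import Path

def build_command(dump: str, heap: str = "4G"):
    """Assemble the analyzer invocation (the argument order is assumed)."""
    return ["java", f"-Xmx{heap}", "-jar", "HANADumpAnalyzer.jar", dump]

def analyze_new_dumps(trace_dir: Path, analyzed: set):
    """Run the analyzer on every dump not seen before; keep going on failures."""
    for dump in sorted(trace_dir.glob("*rtedump*.trc")):
        if dump in analyzed:
            continue
        subprocess.run(build_command(str(dump)), check=False)
        analyzed.add(dump)

print(build_command("indexserver.rtedump.trc"))
```

A real setup would schedule `analyze_new_dumps` periodically and forward the generated reports to the monitoring system.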

The SAP HANA dump analyzer can run on Windows, Linux and macOS. It doesn’t connect to the SAP HANA database, so neither credentials nor a running SAP HANA database are required to use it.

Currently the following analyzers have been implemented. The analyzers automatically analyze the issues; if an analyzer finds an issue, it creates a tab page in the analysis report and also shows the issue on the summary page. The summary page additionally provides general runtime dump information, e.g. the runtime dump name, the time when the runtime dump was generated, and the runtime dump duration.

Crash Analyzer (HANA crash issue)

The Crash Analyzer analyzes whether there is a crash issue in the dump. If there is a HANA crash issue, it creates a crash analysis report showing where HANA crashes, e.g. the crash call stack, the exception violation condition etc. Usually you need to report an SAP incident for the crash issue if the exception violation is not directly clear.
OOM Analyzer (HANA OOM issue)

The OOM Analyzer analyzes whether there is an OOM issue in the dump. If there is an OOM issue, it creates an OOM analysis report including information such as the global allocation limit & Inter-Process Memory Management (IPMM), and the memory consumption distribution across different connections and heap allocators.

Here is an example of the memory consumption distribution analysis from the OOM analysis report. It analyzes how memory is consumed by different connections. In case one expensive query is the biggest memory consumer, the OOM analyzer provides the conclusion directly:
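The kind of conclusion drawn here can be illustrated with a few lines of code over hypothetical per-connection memory figures (the names and numbers below are invented):

```python
# Hypothetical per-connection memory figures (invented numbers) to show the
# aggregation behind the "biggest memory consumer" conclusion.
consumers = {
    "Connection 300541 (expensive query)": 412_000_000_000,
    "Connection 300102": 9_000_000_000,
    "Pool/RowEngine allocators": 25_000_000_000,
}

total = sum(consumers.values())
top, top_bytes = max(consumers.items(), key=lambda kv: kv[1])
share = top_bytes / total
print(f"{top}: {share:.0%} of {total / 1e9:.0f} GB total")
if share > 0.5:
    print("Conclusion: a single connection dominates the memory consumption.")
```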

HANA Workload Analyzer (HANA job worker exhaustion issue), i.e.

  • all available job workers are busy
  • no new job workers can be started anymore
  • jobs are queuing up


The Workload Analyzer analyzes whether there is a job worker exhaustion issue. If there is, it tries to analyze in the analysis report how the job workers are configured and what the job workers are busy with. To provide details on what the job workers are doing, the workload analyzer provides e.g. an OLAP workload concurrency FlameGraph, a pie chart visualization of thread numbers per application & statement, and a flame graph visualization of the job worker call stacks.

An example of workload analyzer:

High CPU Analyzer (high CPU issue), i.e.

  • more than (60% * PROCESSOR_NUM) threads are running, i.e. not waiting on synchronization, or
  • there are many running threads, but no CPU_INFO is captured in the runtime dump


The High CPU Analyzer analyzes whether there is a (potential) CPU resource exhaustion issue in the runtime dump. In this case, it provides an analysis including CPU load statistics, a concurrency FlameGraph visualizing how the OLAP load is using thread resources, and a threads stack flame graph visualizing the thread call stacks.
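The detection rule quoted above can be restated as a small predicate. This is only a sketch of the heuristic, with the “many running threads” threshold as an assumed parameter:

```python
# The high CPU detection rule, restated as a predicate. The "many running
# threads" threshold for the second rule is an assumed parameter.
def high_cpu_suspected(running: int, processors: int,
                       cpu_info_present: bool, many: int = 64) -> bool:
    # Rule 1: more than 60% * PROCESSOR_NUM threads are running.
    if running > 0.6 * processors:
        return True
    # Rule 2: many running threads, but no CPU_INFO section in the dump.
    return running >= many and not cpu_info_present

print(high_cpu_suspected(100, 128, True))   # 100 > 76.8 running threads
print(high_cpu_suspected(70, 128, True))
print(high_cpu_suspected(70, 128, False))   # no CPU_INFO captured
```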

An example of threads call stack flame graph on the High CPU analysis tab:

Savepoint Analyzer (savepoint blocked issue; many threads are blocked on the savepoint)

The Savepoint Analyzer analyzes whether there is a savepoint blocked issue in the runtime dump. If the savepoint is blocked and blocks lots of other threads, the savepoint analyzer provides a savepoint blocked analysis including the call stack of the savepoint blocker, further information about the blocker (e.g. the running SQL), and the threads blocked by the savepoint.

An example of savepoint blocked analysis:

Waitgraph Analyzer (a waitgraph is detected; many threads are blocked)

The Waitgraph Analyzer analyzes whether there is a blocked situation that is visible in the waitgraph. In this case, it provides an analysis including the waitgraph and a threads call FlameGraph.

An example of the waitgraph on the analysis report:

Blocked Transactions Analyzer (many transactions are blocked)

The Blocked Transactions Analyzer analyzes whether there are many blocked transactions. If so, it provides an analysis including a blocked transaction graph and the thread call stacks visualized in a threads stack FlameGraph.

An example of the blocked transaction visualization on the analysis report:

IndexHandle State Analyzer (many threads are waiting to acquire an index handle)

The IndexHandle State Analyzer analyzes whether there are many threads waiting to acquire an index handle. In this case, it visualizes the blocking situation and provides a threads stack FlameGraph.

An example of indexHandle internal state analysis from analysis report:



More detailed documentation can be found here.

If the auto analyzer doesn’t find a known scenario, or you want to perform some individual analysis, you can switch to the “Expert Mode” tab and use the analysis options provided there, e.g.:

Call stack representation via flame graph (menu: Flame Graph -> Stack -> Create Flame Graph)

A flame graph represents call stacks in a way that more frequent call stacks are displayed larger than less frequent ones.

Example:



Further options like showing differences between call stacks are available.
Memory allocation representation via flame graph (menu: Flame Graph -> Memory -> Create Flame Graph)

A similar flame graph can be created for memory allocation.

Example:

Call stack generation via DOT (menu: Dot Graph -> Create Dot Graph)

A different way to display call graphs is the DOT format. In this case, boxes are colored in different shades of red depending on the number of threads in a module. Final modules (where the thread actually works) are marked with a blue frame.
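A simplified version of such a DOT rendering can be generated from root-first stacks as follows; the red shading by thread count is left to the DOT viewer here, and the frame names are hypothetical:

```python
from collections import Counter

def stacks_to_dot(stacks):
    """Emit a DOT call graph from root-first stacks; node and edge labels
    record how many threads pass through each module (shading is left to
    the renderer)."""
    nodes, edges = Counter(), Counter()
    for s in stacks:
        for frame in s:
            nodes[frame] += 1
        for a, b in zip(s, s[1:]):
            edges[(a, b)] += 1
    out = ["digraph callgraph {"]
    for n, c in nodes.items():
        out.append(f'  "{n}" [label="{n}\\n{c} threads"];')
    for (a, b), c in edges.items():
        out.append(f'  "{a}" -> "{b}" [label="{c}"];')
    out.append("}")
    return "\n".join(out)

print(stacks_to_dot([["run", "open", "search"], ["run", "open", "wait"]]))
```

The resulting text can be rendered with any Graphviz-compatible viewer.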

Example:

Extraction of [INDEXMANAGER_WAITGRAPH] locking scenarios (menu: Wait Graph -> Create Wait Graph)

The [INDEXMANAGER_WAITGRAPH] section of a runtime dump may already contain a wait graph in DOT format; it is extracted and displayed.
Extraction of monitoring view data (menu: Statistics)

The [STATISTICS] section of a runtime dump contains raw data of specific monitoring views. This data can be extracted and opened with Excel. The M_SERVICE_THREADS_STATISTICS data is available if the dump contains the corresponding section.
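A minimal sketch of such a section extraction, assuming (as a simplification of the real dump layout) that section headers are lines of the form [NAME]:

```python
def extract_section(dump_text: str, section: str) -> str:
    """Pull one [SECTION] block out of a runtime dump. This assumes section
    headers are bare '[NAME]' lines, a simplification of the real format."""
    out, inside = [], False
    for line in dump_text.splitlines():
        if line.startswith("["):
            inside = line.strip() == f"[{section}]"
            continue
        if inside:
            out.append(line)
    return "\n".join(out)

dump = "[STACK_SHORT]\n...\n[STATISTICS]\ncol1;col2\n1;2\n[OTHER]\nx"
print(extract_section(dump, "STATISTICS"))
```

The extracted delimiter-separated rows can then be saved as a .csv file and opened in Excel.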


More detailed documentation can be found here.

Please feel free to post any feedback about the SAP HANA dump analyzer on this blog, or write an email to nina.li01@sap.com. In case the SAP HANA dump analyzer is not working as expected or needs to be fixed, please attach the HANA runtime dump when you write to me. Thanks!
35 Comments
former_member589367
Discoverer

What a wonderful tool and great article!

It is definitely helpful to many customers and developers too.

I learn a lot from your article.

Thank you!

Very valuable tool and it should be included in HANA Studio


Thanks a lot Nina for sharing this not-so-well-known tool!

It could be very powerful once automated to provide human-readable info.

<external link removed by moderator>

former_member182505
Contributor
HI

Nice blog and very informative

 

Thanks

Sadiq
former_member182505
Contributor
Hello

Thanks For the blog,

I was trying the tool and I am facing an issue. I am using Windows 10 with JDK 8.0 installed.

The issue is that when I drag and drop a trace file of size 1 GB, it gives me a “Java heap space” error.

Please check if you can help me fix it.

Thanks

Sadiq

Hi Sadiq,

Thanks for using SAP HANA dump analyzer.

In case you are uploading a large runtime dump, the HANA dump analyzer can run into problems like a “GC overhead limit exceeded” or “Java heap space” error. You can increase the Java heap (e.g. to 1 GB) by starting the HANA dump analyzer via the command:

java -Xmx1G -jar HANADumpAnalyzer.jar

This is also documented in SAP KBA 2498739.

Please feel free to contact me if it’s still not working.

Kind regards,

Nina

former_member182505
Contributor
Hello Nina

Thanks for the reply and the SAP Note you gave; I have a clearer idea now.

I already achieved 1500M by adding the environment variable _JAVA_OPTIONS=-Xmx1500M on my Windows machine.

Do you have any Note which explains more about the dumps?

 

Thanks

Sadiq

 
Sure, Sadiq. SAP KBA 2400007 - FAQ: SAP HANA Runtime Dumps provides more details about the SAP HANA runtime dump. Check if this is detailed enough for you.


 

Very detailed information. Very useful.

former_member612251
Participant
I doubt this will ever happen; SAP has not been developing Studio for nearly 2 years.
fei_ma
Member
Thanks for sharing. Very useful.
Hello Nina,

 

Could you provide an alternative link to download the jar? The developer trial portal shows an “internal error” when downloading.

 

Thanks
Hi Theo,

Do you still face the problem? The download link currently works for me.

 

Kind regards,

Nina
It works now !

 

Thanks for your reply.
jgleichmann
Active Contributor
Hi nina.li3,

thanks for this wonderful tool. Until now it was a mess to get useful details out of a dump via an editor. I have loved the flame graph since I played around with it in the context of the CPU performance degradation from Spectre/Meltdown. Really great that you brought these 2 things together.

One issue I was facing is the creation of mixed flame graphs under Mac OS. After creation there is a redirect to port 9123 (http://127.0.0.1:9123) which is not working. You have to open the file manually in the folder private/var/folders/ky/*, unlike the auto analyzer, which works perfectly. Is this just a Mac OS issue?

One general wish: better documentation of the thread details for a better understanding of an analysis. Currently you can handle such issues only with experience. Only with more transparency will more people use such tools.

Best Regards,

Jens
former_member673856
Discoverer
 

Hi

Why do I get an error when I load a tenant dump?

For input string:"100gb(107374182400b)"

 

Thank You

Nandam

 
Hi Nandam,

Thanks for providing feedback. I would assume you've provided a composite OOM dump. This is a known issue that will be fixed in the next release of the SAP HANA dump analyzer. As a temporary workaround, please modify the following line in the [MEMORY_LIMIT_VIOLATION] section

Composite limit=100gb (107374182400b)

to

Composite limit=107374182400

and check if this works?

 

Kind regards,

Nina
former_member673856
Discoverer
Hi Nina

You are right. I did provide a composite OOM dump, and right after I changed the value, it worked! Thank you, I appreciate it. I can analyze now.

Next question: can I also analyze the system info dumps and tenant dumps which we generate via the python script, apart from the RTE dumps? It looks like when I open one, I get the same issue; I believe inside the dump there is a composite OOM dump pointing to 100gb.

 

Thank You

Nandam
Hi Nandam,

The SAP HANA dump analyzer can analyze HANA runtime dumps; it is not able to analyze other traces, e.g. the indexserver trace.

 

Kind regards,

Nina
Hi Nandam,

 

Just to let you know, the latest version, HANADumpAnalyzer 1.0.202004281800, fixed the bug where a HANA composite OOM dump was not analyzed and the error popup message “For input string: XXgb(XXXXXXb)” appeared.

 

Kind regards,

Nina
How can I get the HANA dump analyzer? Do I need to install it separately?
qiansheng_wang
Advisor
Hi Jithu,

SAP HANA Dump Analyzer can be downloaded from the following URL (also in the early part of this blog), or via the download link in KBA 2498739 - How-To: Analyzing Runtime Dumps with SAP HANA Dump Analyzer.

It does not need additional installation steps. It is a standalone jar file and can be used directly via double-clicking, or via the command "java -jar HANADumpAnalyzer.jar" if Java is installed.

Best regards,
Qiansheng
andruha
Discoverer
Hi Nina,

Thank you for the wonderful tool. It works for us in many cases, like OOM and other types of issues. However, in some cases it fails.

The question I have: I'm trying to use it to analyze savepoint blockers from an auto-generated runtime dump file like: indexserver_hanac1502b.30040.rtedump.20201118-232652.097153.savepoint_blocked.trc

Unfortunately the tool fails to parse this type of runtime dump trace file. I'm getting:

Runtime Dump Incomplete


The runtimedump is incomplete (i.e. STACK_SHORT or STATISTICS is missing from the runtime dump), please analyze it manually!

It seems as if certain blocks are missing in that specific type of trace. I wonder if that can be fixed?

Best Regards!
Andrei
nicholas_chang
Active Contributor
Hi,

 

The file is not found:

"File Not Found"

 

Could you please help.

 

Thanks!
qiansheng_wang
Advisor
Hi Nicholas,

I will need more information to solve this issue.

May I know the steps to reproduce this issue in your environment? If you like, you can also send me an email to "qiansheng.wang AT sap.com" with the example runtime dump and the screenshots. I will check and update you.

Best regards,
Qiansheng Wang
qiansheng_wang
Advisor
Hi Andrei,

Thank you very much for the helpful feedback. I will support the savepoint_blocked runtime dump and improve the scenario in the new release.

Thank you very much for your feedback again!

Best regards,
Qiansheng
nicholas_chang
Active Contributor
Hi qiansheng.wang

Thanks for your reply. Actually I just wanted to download the Dump Analyzer from the link below, but ended up with "File not found" when I clicked the download button after registration.

 

Thanks!
qiansheng_wang
Advisor

Hi Nicholas,

It is strange. I tried the link in your comment, and I can download using that URL; the download starts automatically after a successful logon.

Would you please try whether the following URL works?
https://developers.sap.com/trials-downloads.html?search=SAP+HANA+dump+analyzer

Best regards,
Qiansheng Wang

nicholas_chang
Active Contributor
I'm getting this:

"Your request for information is being reviewed. Once approved, you’ll receive an e-mail with additional instructions."
qiansheng_wang
Advisor
Hi Nicholas,

Thanks for the update; I am happy to know that there is progress.

Based on my experience, the review can usually be completed within one day, and then you will get another email that contains a direct link to download.

The executable file can be obtained by following the download link in the email. Feel free to let me know if there is any other issue.

Have a nice day.

Best regards,
Qiansheng Wang
nicholas_chang
Active Contributor



Hi Qiansheng,


Thanks for your help. It is very helpful. I will let you know how it goes once I get the reply. Thanks again!


nicholas_chang
Active Contributor
Hi Qiansheng,

I received the link, clicked the download button, and got the message below:

The file you have requested cannot be found. Please try your download again with a valid URL. If the problem persists, please report an incident under component XX-SER-SAPSMP-SWC.

 

Thanks!
qiansheng_wang
Advisor
Hi Nicholas

Is it OK to forward the email to me on "qiansheng.wang AT sap.com"? I would like to follow up on it, e.g. where the link points to.

Thank you very much.

Best regards,
Qiansheng
satdreddy4pypl
Discoverer
Hello Nina,

the latest version of the dump analyzer gives a Java heap space error (despite setting 2 GB or even 3 GB of memory for Java, e.g. -Xmx2048m in the Java runtime parameters). Any suggestions on how we can avoid these errors?

Thanks

Satish
qiansheng_wang
Advisor
Hi Satish,

Sometimes the runtime dump is very big, and it might need more memory to complete the analysis.
Would you please try to increase the heap space to e.g. 5 GB or 10 GB as a test?

If the issue is still there, is it possible to send your runtime dump to me at "qiansheng.wang AT sap.com", so I can check the details? If needed, I can also create a container for you to upload the big files.

Thank you very much.

Best regards,
Qiansheng Wang