Technology Blogs by SAP
tao_shen

In this blog, we will explore the concept of workload patterns within the HANA Platform, including how to define and understand these patterns. Our primary focus will be on identifying opportunities to make the HANA workload more comprehensible and manageable through various workload management methods. Gaining an understanding of workload patterns is fundamental, as it lays the groundwork for further investigation into workload-related issues, such as those pertaining to CPU usage, memory allocation, and expensive SQL statements.

What Is a Workload Pattern in the HANA Platform?

In the SAP HANA Platform, a workload pattern represents the distinct characteristics of database operations within a specific timeframe. It includes not only the types of queries executed and their frequency but also the execution times and resource usage like CPU and memory. Additionally, it encompasses the pattern of jobs originating from different applications, such as S/4 HANA or Fiori Launchpad, and those from external systems through RFC.

The combination of these characteristics leads to a complex workload pattern in SAP HANA, which manages both OLTP (Online Transaction Processing) and OLAP (Online Analytical Processing) workloads. Analyzing these patterns is vital for database administrators and developers. It helps prioritize tasks to stabilize the system and then reduce resource consumption and address performance issues, particularly at the query level. Understanding these patterns is the first step in tackling workload complexities in SAP HANA.

How to Capture the Workload Pattern in the HANA Platform?

To effectively analyze workload patterns in SAP HANA, abstract concepts must be translated into monitorable metrics. This includes CPU utilization, memory consumption, HANA thread samples, locks, and other information gathered by the statistics server. As a starting point for workload analysis, the following monitoring results are particularly insightful:

  1. CPU Utilization: This encompasses both user CPU (used by activities within the HANA platform) and system CPU (primarily related to OS activities). Monitoring CPU utilization is vital as it reveals the overall workload intensity, identifies critical time frames, and indicates the maximum CPU usage.

  2. Memory Consumption: Memory management in HANA is multifaceted, involving not just the data loaded into memory but also aspects like working memory, buffering, and caching. Even the residual memory used by the OS is significant. It’s important to ensure that the memory used by HANA remains well below its allocated limit and that there are no significant fluctuations over time due to issues like expensive statements, unmanaged workloads, or memory leaks.

  3. Thread Samples: Analyzing thread samples is crucial in workload assessment. Detailed thread information helps to correlate CPU and memory data and identify major contributors to workload during peak business hours, such as specific application users, job names, code sources, and the number of threads. Thread samples are used extensively in this blog series to illustrate various examples.

  4. Other Metrics: As previously mentioned, nearly all data collected by the statistics server can be valuable. Different issues require different metrics: for memory-related problems, SQL cache and table information might be relevant, while for CPU issues, aspects like savepoints, garbage collection, table partitioning activities, or missing secondary indexes could be important. This requires extensive data collection and substantial experience.

The list of HANA system monitoring views can be found on the SAP Help Portal: Monitoring Views | SAP Help Portal

The monitoring scripts can be found in SAP Note 1969700.

Draw a Workload Pattern

SAP HANA offers a variety of tools to gather crucial workload information without the need for initial scripting. These include the HANA Cockpit, HANA Studio, DBACockpit, Solution Manager, the Health Monitoring tool from SAP Cloud, and the SAP BTP cockpit. For more in-depth insights, however, I recommend becoming familiar with HANA's native monitoring views, which can provide a deeper understanding of the system.

For instance, a database administrator seeking detailed thread sample information can use the script "HANA_Threads_ThreadSamples_FilterAndAggregation" from SAP Note 1969700. This script allows the administrator to select specific criteria such as time frame, thread status, application user, or passport action for targeted analysis.

The output from this script is typically a comprehensive table that can easily be exported to an Excel file. I often use pivot tables in Excel to condense and organize this data. Additionally, creating a pivot chart can be extremely helpful for a visual representation and further analysis of the workload.

For this part, please refer to SAP Note 3001300 - How to: Analyze HANA Issues Using Graphical Thread Sample Results for more information.
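For readers who prefer a script to a spreadsheet, the pivot step can be sketched in Python. This is only a minimal sketch and not part of either SAP Note: it assumes the script output was exported as rows, and the column names `app_user`, `job`, and `samples` are hypothetical placeholders rather than the script's actual output columns.

```python
from collections import defaultdict

# Hypothetical export of thread-sample rows; the column names are
# assumptions for illustration, not the script's real output columns.
rows = [
    {"app_user": "BATCH_1", "job": "Job_A", "samples": 120},
    {"app_user": "BATCH_1", "job": "Job_A", "samples": 50},
    {"app_user": "EndUser_1", "job": "Job_B", "samples": 100},
    {"app_user": "EndUser_3", "job": "Job_B", "samples": 30},
]

def pivot_samples(rows):
    """Sum sample counts per (application user, job) pair, like a pivot table."""
    totals = defaultdict(int)
    for row in rows:
        totals[(row["app_user"], row["job"])] += int(row["samples"])
    # Largest contributors first, as in a sorted pivot table.
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)

pivot = pivot_samples(rows)
for (user, job), count in pivot:
    print(f"{user} + {job}: {count}")
```

The sorted result can feed a chart directly, or be cut to the top N contributors before plotting.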

What Can We Learn from the Workload Pattern?

Understanding what can be gleaned from a workload pattern is crucial. In this section, I’ll illustrate this with two examples that closely mimic real-life analyses. These examples will shed light on the typical appearance of a workload pattern and the insights that can be derived from it. It's important to remember that there's no single, definitive format for a workload pattern. The most effective pattern is one that clearly explains the situation or pinpoints specific issues or concerns.

Example 1 - Analyze the Workload Pattern by Measuring the Contributors to the Number of Running Threads

In this example, we've analyzed the workload distribution by capturing specific job information (or statement) and identifying the application users executing these jobs. The measurement was conducted by gathering thread activity from the SAP HANA platform over a designated period. We compiled the top 10 contributors, pairing application user names with their respective job names, and ranked them by the number of running threads they accounted for during this interval. The chart illustrates each user-job combination as a percentage of the total workload within that timeframe.

The number of running threads consumed can also be read as an approximation of the number of logical CPUs consumed.
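One way to make this reading concrete is to turn sample rows into an average number of concurrently running threads. The sketch below assumes that each exported row represents one thread observed in a running state at one sampling timestamp. The data and the `logical_cpus` value are made up for illustration.

```python
def average_running_threads(sample_timestamps):
    """Average number of running threads per sampling point.

    sample_timestamps holds one entry per sampled running thread, so a
    timestamp that appears three times means three threads were running.
    """
    if not sample_timestamps:
        return 0.0
    distinct_points = len(set(sample_timestamps))
    return len(sample_timestamps) / distinct_points

# Illustrative data: 3 sampling points with 4, 2, and 3 running threads.
samples = ["10:00:00"] * 4 + ["10:00:01"] * 2 + ["10:00:02"] * 3
avg_threads = average_running_threads(samples)

logical_cpus = 8  # hypothetical host size, not taken from the blog
print(f"average running threads: {avg_threads:.1f}")
print(f"approximate share of logical CPUs: {avg_threads / logical_cpus:.0%}")
```

Sampling points with no running thread do not appear in such an export, so this average tends to overestimate the load unless the number of sampling points is derived from the analyzed time frame instead.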

[Figure: top 10 application user and job combinations, each shown as a percentage of the total running threads]

So what can we read from this chart? Let me explain:

  • 'BATCH_1 + Job_A' is the predominant workload, accounting for 34% of the total running threads. This suggests that Job_A, run by the batch user BATCH_1, is likely a resource-intensive batch job crucial for business operations.
  • 'EndUser_1 + Job_B' comes in next with 20% of the workload. This significant figure implies that Job_B, linked to EndUser_1, may be a routine task or a recurring query essential to the user's activities. However, if this end user typically does not schedule frequent critical jobs, this activity should be scrutinized to ensure the job's placement is appropriate.
  • 'EndUser_2 + Job_C' occupies 11% of the workload. It's important to ascertain whether this load results from a one-off job with high resource demands or a regularly executed job that consumes substantial CPU intermittently.
  • 'EndUser_3 + Job_B' accounts for 7% of the workload. The recurrence of Job_B with another user, EndUser_3, indicates that this job may be a standardized task or report executed by multiple users. This casts suspicion on the high workload linked to 'EndUser_1 + Job_B', suggesting EndUser_1 might be using an inefficient query or job variable, leading to increased workload.
  • 'BATCH_2 + Job_D' and 'BATCH_2 + Job_E' contribute 7% and 6% to the workload, respectively, totaling 13%, which is noteworthy. The existence of multiple jobs under BATCH_2 hints at a set of related tasks that together draw on considerable resources. Implementing workload management techniques, such as workload classes, and setting a total concurrency limit for BATCH_2 could prove beneficial.
  • 'various + SAPMHTTP' is responsible for 5% of the threads. Typically, SAPMHTTP represents jobs originating from network activity. The 'various' label suggests these are initiated by numerous external users. A surge in simultaneous connections could impose a heavy load on the system, posing a challenge in managing these activities.
  • '_sys_statistics + ?' makes up 4% of the workload. These are generally system statistics server tasks that gather monitoring data periodically. If they occupy a significant workload portion, further investigation into the statistics server jobs is warranted. Moreover, assigning a workload class to the _sys_statistics user could prevent such scenarios.
  • 'SAP_WRFT + Job_F' and 'Basis + Job_G' each represent 1% of the workload. Their smaller shares imply they might be infrequent or not resource-intensive, perhaps routine maintenance or occasional background tasks. SAP_WRFT, a standard user for SAP workflow requests, should be monitored closely, and a workload class may be necessary. Meanwhile, 'Basis' likely pertains to system maintenance; if it leads to substantial workload, a detailed review is advised.

With this analysis, we have a clearer picture of the areas requiring review as the next step. Much clearer, right?
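Two of the observations above, the combined share of BATCH_2 and the reuse of Job_B by several users, can also be derived mechanically. The following sketch uses the percentages quoted in the list. The 10% threshold is an arbitrary choice for illustration, not a recommended limit.

```python
from collections import defaultdict

# Shares (percent of running threads) as quoted in the list above.
shares = {
    ("BATCH_1", "Job_A"): 34,
    ("EndUser_1", "Job_B"): 20,
    ("EndUser_2", "Job_C"): 11,
    ("EndUser_3", "Job_B"): 7,
    ("BATCH_2", "Job_D"): 7,
    ("BATCH_2", "Job_E"): 6,
    ("various", "SAPMHTTP"): 5,
    ("_sys_statistics", "?"): 4,
    ("SAP_WRFT", "Job_F"): 1,
    ("Basis", "Job_G"): 1,
}

per_user = defaultdict(int)
users_per_job = defaultdict(list)
for (user, job), pct in shares.items():
    per_user[user] += pct
    users_per_job[job].append((user, pct))

# Users whose jobs add up to a notable share (threshold chosen arbitrarily).
heavy_users = {user: pct for user, pct in per_user.items() if pct >= 10}

# Jobs run by more than one user: candidates for comparing variants or filters.
shared_jobs = {job: users for job, users in users_per_job.items() if len(users) > 1}

top10_total = sum(shares.values())
print(heavy_users)
print(shared_jobs)
print(f"top 10 contributors cover {top10_total}% of running threads")
```

Grouping by user surfaces BATCH_2 as a 13% contributor even though neither of its jobs stands out alone, and grouping by job puts the two Job_B users side by side for comparison.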

Example 2 - Workload Pattern Analysis with Aggregated Date Integration

The second example illustrates the workload distribution for top jobs across weekdays and weekends, segmented into daily percentages. This visualization lets us analyze workload distribution trends both day by day and across the entire week. It also supports a focused comparison between different workdays or between different weekends. The data could be further refined to show hourly distributions or to focus on peak business hours, depending on the specific requirements of the analysis. Such granularity can reveal patterns and inform decisions on resource allocation and system optimization tailored to the operational demands of the business.

[Figure: daily workload share of the top jobs across weekdays and weekends]

What insights can the workload pattern provide us this time?

  • Job 1 is prevalent during the weekdays, maintaining a consistent share of the workload daily. This pattern indicates a core business function requiring regular execution. While this job may not necessarily be optimized, it warrants an examination for any costly SQL statements that could be optimized to reduce resource utilization. Moving part of the workload to a secondary site with the HANA active/active read-enabled (AARE) feature in a high-availability environment could also be a strategic option.
  • Job 2 appears every day, suggesting a routine operational task, perhaps related to daily tasks or table replications. The workload volume from this job remains quite unchanged, displaying a stable percentage of the total workload.
  • Job 3 shows activity predominantly on the weekends, which could be associated with weekly maintenance, backups, or batch jobs timed for periods of low business activity. If there is no adverse business impact, it may not require intervention.
  • Job 4 exhibits a varying workload, indicating potential system instability issues, such as system hangs or lost user connections. The erratic nature of this job could stem from inefficient SQL statements lacking optimization, execution on an incorrect HANA engine, or possibly inappropriate end-user configuration or job scheduling.
  • Job 5 presents only sporadically but contributes a significant workload spike. This could result from poorly tested expensive statements, unexpected user query criteria, or one-time operations like batch data loads or initial load phases for system landscape transformation (SLT), online table repartitioning, or month-end closing tasks. Jobs in production should be well-planned and finely tuned, with execution carefully monitored. Implementing global workload management parameters and workload classes may mitigate such scenarios.
  • Job (internal) likely includes inherent HANA operations such as merging, compression, or statistics server activities. This internal job's workload increases during weekdays in tandem with business activities and decreases over the weekend. Generally speaking, this workload can occasionally cause additional strain due to factors such as suboptimal scheduling of statistics jobs, large table sizes, or impractical table partitions, which in turn can affect the system's overall performance.
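The daily percentages behind such a chart, and the weekday versus weekend split used in the readings above, can be sketched as follows. The sample counts are invented for illustration, and the sketch assumes one input row per day and job.

```python
from collections import defaultdict
from datetime import date

# Hypothetical daily sample counts: (day, job, running-thread samples).
# Assumes at most one row per (day, job) combination.
daily_samples = [
    (date(2024, 2, 12), "Job 1", 60), (date(2024, 2, 12), "Job 2", 20),
    (date(2024, 2, 13), "Job 1", 58), (date(2024, 2, 13), "Job 2", 21),
    (date(2024, 2, 17), "Job 2", 20), (date(2024, 2, 17), "Job 3", 40),
    (date(2024, 2, 18), "Job 2", 19), (date(2024, 2, 18), "Job 3", 45),
]

def daily_shares(records):
    """Return {(day, job): percent of that day's samples}."""
    day_totals = defaultdict(int)
    for day, _job, count in records:
        day_totals[day] += count
    return {(day, job): 100.0 * count / day_totals[day]
            for day, job, count in records}

def weekday_weekend_profile(records):
    """Average daily share per job, split into weekday and weekend days."""
    buckets = defaultdict(list)
    for (day, job), pct in daily_shares(records).items():
        # date.weekday() returns 5 for Saturday and 6 for Sunday.
        kind = "weekend" if day.weekday() >= 5 else "weekday"
        buckets[(job, kind)].append(pct)
    return {key: sum(values) / len(values) for key, values in buckets.items()}

profile = weekday_weekend_profile(daily_samples)
for (job, kind), pct in sorted(profile.items()):
    print(f"{job} {kind}: {pct:.0f}% of daily workload")
```

Each average covers only the days on which a job actually appears, so a job missing from the weekend bucket, like Job 1 here, simply did not run on those days.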

Understanding the nuances in workload distribution is vital for addressing system performance challenges, orchestrating resource allocation, and refining job scheduling to optimize the efficiency of the HANA platform.

The goal of analyzing workload patterns is not solely to evaluate the performance of individual SQL statements but to understand the broader impact a single statement can have on the system. A query might run for an extended period without consuming excessive CPU or memory and could, therefore, seem like a candidate for exclusion from workload analysis. However, this perspective can be misleading.

Long-running jobs can cause issues beyond resource consumption—such as database locks, garbage collection delays, savepoint contentions, or adverse effects on other user operations. In these scenarios, it becomes crucial to scrutinize the query in question for tuning opportunities. Identifying and optimizing such statements is essential to maintain overall system health and prevent disruptions to user activities.

 

🙉🙈🙊

Workload Analysis for HANA Platform Series

This blog post is part of the 'Workload Analysis for HANA Platform Series'. In upcoming posts, we will demonstrate how to analyze issues related to CPU, threads, and NUMA nodes.

Stay tuned as we explore these aspects in detail, providing insights and strategies to optimize your HANA environment.
