Time Series Analysis

LauraTozzo · 02-13-2024

When setting up the forecast process, the business needs are often the first and only thing taken into account. However, the underlying statistical properties of the data you use for forecasting heavily impact the quality of your forecast and should be considered as well.

The historical data you use as a basis for your forecast can have patterns. Recognizing these patterns (such as trends and seasonality) helps in deciding which forecasting algorithms are more suitable and ultimately helps increase the forecast accuracy.

In SAP Integrated Business Planning, there are two processes aimed at understanding the statistical nature of the data used for forecasting:

Time series analysis: this process looks for properties such as trend, seasonality, and intermittency in the time series.
Change point detection: this process looks for significant level shifts and trend changes in the time series.

Both of these are part of what we call Forecast Automation, and are regulated by the parameters you define in the Manage Forecast Automation Profiles app.

The purpose of this blog is to give an overview of what time series analysis and change point detection do, how to best set up the related parameters, and how to leverage the results for forecasting. It is designed with experts in mind, specifically those tasked with configuring the SAP IBP system to use these features, and it therefore contains in-depth technical details. If your objective is merely to gain a basic understanding of Time Series Analysis and Change Point Detection, you may find it sufficient to read the opening section of each chapter.

Time Series Analysis

Time series analysis can be used to identify four basic patterns in the historical data:

Continuous: Time series with very few missing values and no other patterns
Intermittent: Time series with a lot of zero values
Lumpy: Intermittent time series in which relatively large deviations can be observed
Irregular: Continuous time series with high volatility but no specific pattern

The following graphic shows how these properties relate to each other:

graphic time series.png

With the exception of irregularity and lumpiness, the above patterns may additionally have the following features and characteristics:

Trend: Time series in which the values change in a consistent direction (downward or upward).
Seasonality: Time series in which the data experiences regular and predictable changes that recur every seasonal cycle (e.g. every year). Seasonality can be additive (for example, the sales value in a given month was 2 tons more than the average sales) or multiplicative (for example, the sales value in a given month was 2.5 times the average sales).
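
To make the difference between the two seasonality types concrete, here is a minimal Python sketch that uses the numbers from the examples above. The baseline of 10 tons and the function names are illustrative assumptions and are not part of SAP IBP.

```python
def additive_season(average, offset):
    """Additive seasonality: the seasonal effect is added to the average."""
    return average + offset


def multiplicative_season(average, index):
    """Multiplicative seasonality: the average is scaled by a seasonal index."""
    return average * index


average_sales = 10.0  # tons; hypothetical baseline chosen for illustration

print(additive_season(average_sales, 2.0))        # 2 tons more than the average
print(multiplicative_season(average_sales, 2.5))  # 2.5 times the average
```

With additive seasonality the seasonal swing keeps the same absolute size when the overall level grows, while with multiplicative seasonality the swing grows in proportion to the level.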

By “time series property” we mean the combination of these patterns and characteristics for a given time series. For example, a time series property can be “Intermittent with additive seasonality”.

If you are not sure about the nature of time series data that you want to forecast, you can set the system to analyze the values, and save the identified properties in the background. This process is called time series analysis and consists of four steps:

Setting the rules
As an administrator or configuration expert, you set up the rules for the analysis in the Manage Forecast Automation Profiles app. This is the most challenging part of the process and where we are going to deep dive into the different settings.
Running the analysis
As a business user, you schedule and run a Forecast Automation application job in the Application Jobs app, for which you select the Time Series Analysis option.
Checking the results (optional)
You can check the results of the latest forecast automation job by opening the corresponding forecast automation profile in the Manage Forecast Automation Profiles app and choosing Show Analysis Results.
Leveraging the results
See section “How to use time series analysis to improve your forecast”.

Seasonality

A time series is seasonal if it displays a repeating pattern or cycle over a fixed period, typically over months or weeks. To detect whether a time series is seasonal, a statistical method called a seasonality test is used.

The results of a seasonality test can impact the type of predictive model that is most appropriate for data forecasting, and for this reason, understanding seasonality in your dataset is crucial for accurate forecasting.

When setting up a profile in the Manage Forecast Automation Profiles app, there are a few settings that are relevant for seasonality.

Seasonality Settings.png

Sensitivity for Seasonality Test

Define how sensitively the system should look for seasonality in the time series. This setting is a threshold for the autocorrelation coefficient: the lower the value, the more easily seasonality will be identified.

For example, the default 0.3 coefficient means that seasonality is identified in the time series when the highest autocorrelation coefficient is above 0.3. If you choose a lower value, the system will identify seasonality in more cases.
The value 0.3 as default was chosen after internal tests that showed good results in most cases.
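
The idea behind the sensitivity threshold can be sketched in a few lines of Python. This is a simplified illustration, not the SAP IBP implementation: the function names, the range of lags that is searched, and the sample data are assumptions, and a real test would typically remove any trend before computing autocorrelations.

```python
def autocorrelation(values, lag):
    """Sample autocorrelation of `values` at the given lag."""
    n = len(values)
    mean = sum(values) / n
    variance_sum = sum((v - mean) ** 2 for v in values)
    if variance_sum == 0:
        return 0.0  # a constant series has no pattern to correlate
    covariance_sum = sum(
        (values[i] - mean) * (values[i + lag] - mean) for i in range(n - lag)
    )
    return covariance_sum / variance_sum


def detect_seasonality(values, sensitivity=0.3):
    """Return (is_seasonal, cycle_length) based on the highest autocorrelation."""
    max_lag = len(values) // 2  # assumption: search cycles up to half the history
    if max_lag < 2:
        return False, None
    best_lag = max(
        range(2, max_lag + 1), key=lambda lag: autocorrelation(values, lag)
    )
    if autocorrelation(values, best_lag) > sensitivity:
        return True, best_lag
    return False, None


history = [10, 20, 30, 40] * 6  # hypothetical series that repeats every 4 periods
print(detect_seasonality(history))
```

Lowering `sensitivity` makes it easier for the highest coefficient to exceed the threshold, which is why a lower value leads to seasonality being identified in more cases.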

Calculate Number of Periods in a Season

You can decide whether the system should automatically detect the number of periods in a season or whether you want to define it manually. If you define the length of the cycle manually, your data will only be considered seasonal when the autocorrelation coefficient of the specified cycle is higher than the sensitivity coefficient. This way you can eliminate false seasonality detection, which is typically caused by heavy noise in the time series. For example, if you are planning in calendar weeks but expect a yearly seasonality, automatic detection might recognize quarterly or half-year cycles that are not relevant for your business. Additionally, in case of high noise, automatic detection might result in a false seasonality with a cycle of two periods.

You should keep in mind that the system will only look for cycles with the exact length you specify, which may lead to distorted results in some cases. For example, if you specify a seasonal cycle of 24 months, with 24 being an integral multiple of 12, it may happen that time series analysis identifies a seasonality cycle of 24 months and doesn’t detect that the actual length is 12 months.
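
The 24-month pitfall can be reproduced with a small Python sketch. The function names and the monthly values are made up for illustration; the only thing taken from the text above is the rule that a manually specified cycle is accepted when its autocorrelation coefficient exceeds the sensitivity threshold.

```python
def autocorrelation(values, lag):
    """Sample autocorrelation of `values` at the given lag."""
    n = len(values)
    mean = sum(values) / n
    variance_sum = sum((v - mean) ** 2 for v in values)
    if variance_sum == 0:
        return 0.0
    covariance_sum = sum(
        (values[i] - mean) * (values[i + lag] - mean) for i in range(n - lag)
    )
    return covariance_sum / variance_sum


def is_seasonal_for_cycle(values, cycle_length, sensitivity=0.3):
    """Check only the manually specified cycle length against the threshold."""
    return autocorrelation(values, cycle_length) > sensitivity


# Hypothetical monthly pattern with a true cycle of 12 months, five years long.
monthly_pattern = [12, 10, 11, 14, 18, 25, 30, 28, 20, 15, 12, 11]
history = monthly_pattern * 5

print(is_seasonal_for_cycle(history, 12))  # the true cycle passes the check
print(is_seasonal_for_cycle(history, 24))  # so does its multiple, 24 months
```

Because the series also repeats every 24 months, the check for a 24-month cycle succeeds and nothing in it reveals that the shorter 12-month cycle is the real one.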

Seasonality Type

Set the preference that the system should apply when identifying the type of the seasonality pattern. You have the following options:

Choose Automatically - the system automatically decides if the pattern can be better described as multiplicative or additive, and saves the outcome of this decision in the target key figure.
Prefer Multiplicative - the system tries to identify each seasonality pattern as multiplicative. If this is not possible for the time series, no seasonality information is saved for it in the target key figure.
Prefer Additive - the system tries to identify each seasonality pattern as additive. If this is not possible for the time series, no seasonality information is saved for it in the target key figure.

Setting your preference for the type of seasonality is useful, for example, if you use the seasonality indices as independent variables in forecast models with algorithms that are capable of handling such variables (for example, gradient boosting of decision trees).

Trend

Trends are patterns or consistent behaviors occurring over time. In statistical terms, a trend is often represented as a linear relationship between a variable (in the case of demand forecasting, the historical sales) and time.

To evaluate the presence of a trend, a so-called trend test is performed. The trend test identifies whether a time series has an upward trend, a downward trend, or no trend, and calculates the de-trended time series.

When setting up a profile in the Manage Forecast Automation Profiles app, there are two settings that are relevant for trend.

Trend Settings.png

Significance of Trend Test

The significance value in a trend test provides a measure of the statistical evidence that the observed trend is actually present in the data rather than merely the result of random variation.

If the significance is small, there is strong evidence of a trend. Conversely, if the significance is high, there is weak or no evidence of a trend.
For example, a value of 0.05 (the default) means that a trend is only identified if there is less than a 5% chance that the observed data could occur by random chance alone if there were really no trend.

The significance does not directly provide information about the magnitude or direction of the trend, but rather about the confidence with which we can assert that a trend exists at all.
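
To illustrate how a significance threshold is applied, the following Python sketch uses the Mann-Kendall test, a common non-parametric trend test. This is an assumption made for illustration only: this blog does not state which trend test SAP IBP uses, and the function names and sample data are hypothetical.

```python
import math


def mann_kendall(values):
    """Return (p_value, s) of a two-sided Mann-Kendall trend test.

    Uses the normal approximation without tie correction.
    """
    n = len(values)
    s = 0
    for i in range(n - 1):
        for j in range(i + 1, n):
            diff = values[j] - values[i]
            s += (diff > 0) - (diff < 0)  # +1 for a rise, -1 for a fall
    variance = n * (n - 1) * (2 * n + 5) / 18
    if s > 0:
        z = (s - 1) / math.sqrt(variance)
    elif s < 0:
        z = (s + 1) / math.sqrt(variance)
    else:
        z = 0.0
    p_value = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal tail
    return p_value, s


def detect_trend(values, significance=0.05):
    """Report a trend only if the p-value is below the significance setting."""
    p_value, s = mann_kendall(values)
    if p_value < significance:
        return "upward" if s > 0 else "downward"
    return "none"


rising = [3, 5, 4, 7, 8, 10, 9, 12, 13, 15]  # hypothetical sales history
print(detect_trend(rising))
print(detect_trend([1, 2, 1, 2, 1, 2, 1, 2]))
```

Note that the p-value decides only whether a trend is reported; the direction comes from the sign of the test statistic, which matches the point above that the significance says nothing about magnitude or direction.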

Consider Change Points

Specify if you want the system to consider change points when performing time series analysis. In case of a trend change, this means that the system only looks for trends in the segment after the last change point.

If Consider Change Points is selected, the Minimum Interval setting for Change Point Detection has an impact on the detection of the trend: if change points are allowed to be detected close to each other, the segment after the last change point can be short. Since the trend slope is calculated from this segment, it might be imprecise.

Additionally, it makes sense to use this setting if you have a relatively long history, where splitting the data depending on trend changes will still ensure that the system is able to identify a trend in the last segment. If the history is too short, this might lead to not having enough data points to identify a trend.

Intermittency and volatility

Intermittency and volatility settings.png

Intermittency

A time series is intermittent if it contains a lot of zero values. “A lot” is determined by the parameters of the test.

Intermittency Detection Method

To differentiate between continuous and intermittent time series you can use different methods. You have the following options:

Zeros or Missing Values
The system checks the proportion of zeros or missing values compared to existing values. If the proportion is higher than the percentage you specify, the “intermittent” property will be assigned to the time series. Missing values are automatically replaced with zeros during the analysis.
Average Demand Interval
The system calculates the average interval between periods with data that is not missing or zero. If there is data in two adjacent periods, the interval is 1. The default value for this setting is 1.33. This means that the system calculates all intervals in the time series and, if their average is higher than 1.33 (i.e. on average, roughly one value out of every four is zero), the “intermittent” property is assigned to the time series.

Note that the Zeros or Missing Values method can falsely detect intermittency if the missing values in your data should not be interpreted as zeros.

Using the method of Zeros or Missing Values will classify a time series as intermittent even when long periods without any sales are followed by periods of continuous sales, as only the total number of zero values versus the number of non-zero values is relevant in this case. If you want to consider a time series as intermittent only when it consistently shows some zero values between non-zero values, Average Demand Interval would be a better choice.
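
The difference between the two methods can be seen in a short Python sketch. The function names and sample series are hypothetical, and the 30% threshold for the share of zeros is an example value chosen for illustration; only the 1.33 default for the average demand interval comes from the text above.

```python
def zero_share(values):
    """Fraction of periods that are zero or missing (None is treated as zero)."""
    zeros = sum(1 for v in values if not v)
    return zeros / len(values)


def average_demand_interval(values):
    """Mean distance, in periods, between successive non-zero values."""
    positions = [i for i, v in enumerate(values) if v]
    if len(positions) < 2:
        return float("inf")  # too few demand periods to measure an interval
    gaps = [b - a for a, b in zip(positions, positions[1:])]
    return sum(gaps) / len(gaps)


# Six periods without sales, followed by continuous sales.
late_start = [0, 0, 0, 0, 0, 0, 5, 4, 6, 5, 7, 6]
# Zeros scattered between the sales periods.
sporadic = [5, 0, 3, 0, 0, 4, 0, 6, 0, 2, 0, 7]

for name, series in [("late_start", late_start), ("sporadic", sporadic)]:
    by_zeros = zero_share(series) > 0.30              # example threshold
    by_adi = average_demand_interval(series) > 1.33   # default from the text
    print(name, by_zeros, by_adi)
```

Both series consist of 50% zeros, so the Zeros or Missing Values method flags both. Only the second series has zeros spread between the non-zero values, so only it exceeds the Average Demand Interval threshold.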

Volatility

A time series is volatile if the volume of the data changes a lot. This is measured by checking how random the data points are and how much the data varies in relation to the mean of the dataset.

To find out if a time series is volatile, one needs to see whether the observed time series values are just random (i.e., white noise) or if they follow some underlying behavior.

If white noise is detected in the data series, it implies that the fluctuations in the data are completely random and do not follow a specific pattern or trend. If there is a lot of white noise and no trend or seasonality is identified in the data, the time series is considered irregular.

It’s crucial to understand the presence of white noise in your data, because if it's present, traditional time series forecasting methods are likely to be ineffective. Very irregular time series are virtually impossible to forecast, as there's no apparent relationship between past and future values. Excluding the time series from forecasting or using a simple Copy Past Periods method might be a better approach.
You might also consider using Confidence Prediction Intervals (available for all exponential smoothing algorithms) to cover such cases: point forecasting on irregular data is very inaccurate, but a range can be very informative.

Probability for White Noise Test

This setting is needed to specify the level of confidence with which the results of the white noise tests should be taken into account.

The higher you set the probability, the more certain the system must be that the data is white noise before it considers the time series irregular.

For example, the default 0.9 level of probability means that the time series is considered irregular only if white noise is identified in it with a confidence level of 90% or more. A low threshold such as 0.2 is likely to lead to false positives, because the system will consider the time series irregular whenever white noise is identified in the data with a confidence level of 20% or more.
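
A white noise check can be sketched with the Ljung-Box test, a standard test for the absence of autocorrelation. Treat the following Python code as an illustration under stated assumptions: this blog does not say which white noise test SAP IBP uses or how its result is mapped to the probability setting, so comparing the test's p-value with the threshold is a simplification, and all names and data are hypothetical.

```python
import math


def autocorrelation(values, lag):
    """Sample autocorrelation of `values` at the given lag."""
    n = len(values)
    mean = sum(values) / n
    variance_sum = sum((v - mean) ** 2 for v in values)
    if variance_sum == 0:
        return 0.0
    covariance_sum = sum(
        (values[i] - mean) * (values[i + lag] - mean) for i in range(n - lag)
    )
    return covariance_sum / variance_sum


def ljung_box_p_value(values, lags=4):
    """p-value of the Ljung-Box test for the first `lags` autocorrelations."""
    if lags % 2 != 0:
        raise ValueError("this sketch supports an even number of lags only")
    n = len(values)
    q = n * (n + 2) * sum(
        autocorrelation(values, k) ** 2 / (n - k) for k in range(1, lags + 1)
    )
    # Closed-form chi-square survival function for even degrees of freedom.
    half = q / 2
    return math.exp(-half) * sum(
        half ** i / math.factorial(i) for i in range(lags // 2)
    )


def looks_like_white_noise(values, probability=0.9, lags=4):
    """Assumption: compare the p-value directly with the probability setting."""
    return ljung_box_p_value(values, lags) >= probability


seasonal = [10, 20, 30, 40] * 6  # hypothetical, clearly structured series
print(ljung_box_p_value(seasonal))
print(looks_like_white_noise(seasonal))
```

A strongly patterned series produces a p-value close to zero and is therefore not treated as white noise, regardless of whether the threshold is 0.9 or 0.2. The threshold matters for series in between, where a low setting accepts much weaker evidence.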

If a time series is both intermittent and volatile, then it is called “lumpy”.

Threshold for Lumpy Demand (CV Squared)

To evaluate how lumpy a time series is (i.e. how much the volume of the data changes), the system divides the square of the standard deviation by the square of the mean, which gives the squared coefficient of variation (CV squared). Here you can specify the threshold for this value; the default is 0.5.
A low coefficient of variation (e.g. 0.1) indicates that the data points are close to the mean, meaning low variability, while a high coefficient of variation indicates that there is high variability from the mean.
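
The calculation itself is short. The following Python sketch computes CV squared over all values of the series using the population variance; whether SAP IBP includes zero periods or uses a different variance estimator is not stated here, so both choices are assumptions, as are the function name and the sample data.

```python
import statistics


def cv_squared(values):
    """Squared coefficient of variation: variance divided by squared mean."""
    mean = statistics.fmean(values)
    if mean == 0:
        return 0.0  # no demand at all; nothing to measure
    return statistics.pvariance(values) / mean ** 2


steady = [10, 11, 9, 10, 10, 11, 9, 10]  # hypothetical stable demand
spiky = [0, 0, 40, 0, 0, 0, 40, 0]       # hypothetical demand in rare bursts

threshold = 0.5  # default threshold from the text
print(cv_squared(steady), cv_squared(steady) > threshold)
print(cv_squared(spiky), cv_squared(spiky) > threshold)
```

Both series have the same mean of 10, but only the second one exceeds the threshold, because its values lie far from the mean.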

How to use time series analysis to improve your forecast

You can make use of the results of time series analysis in different applications.

In the Manage Forecast Models app, you can choose the Consider Time Series Properties option after setting the system to utilize multiple forecasts using the Choose Best Forecast method. If you do so, the system checks the time series properties that were identified by the most recent forecast automation job and uses them to filter out the algorithms that are not expected to calculate an appropriate forecast.

If you select the Automatically Generated Seasonality Dummy in the advanced algorithms Multiple Linear Regression or Gradient Boosting of Decision Trees, the system will create an independent variable based on the season cycle found during Time Series Analysis. If no results from Time Series Analysis are available, the system will create a new Seasonality Dummy by running an ad-hoc seasonality test.

Additionally, you can set the outlier correction preprocessing algorithm to consider time series properties. If you do so, the algorithm can detect outliers that don’t vary significantly from the mean or median but do vary from the seasonality or trend pattern in the data.

In the Manage ABC/XYZ Segmentation Rules app, the system considers the results of Time Series Analysis automatically during XYZ segmentation.

And finally, you can use the results of time series analysis in the SAP IBP, add-in for Microsoft Excel, where you can use them to limit and filter your planning views if you have saved the properties in an attribute and other results in key figures.
To find out more on how to save time series properties in an attribute, have a look at the documentation.

Change Points Detection

Change point detection is a machine-learning-based algorithm that detects major changes that occurred in the time series and had long-term effects on the data. The following changes may occur:

Level shift: when the mean of the time series values alters significantly
Trend change: when the direction or slope of a trend alters significantly

This is shown in the following graphic, where the red circle marks a level shift and the green circle marks a trend change:

change points 0.png

A level shift may happen, for example, when a new sales channel such as online shopping is opened for a product, a product is introduced for a new market, or the legal environment changes (for example, a medicine becomes subsidized). Such changes often result in a higher mean of time series values. It is also possible that the mean of actual sales decreases; this happens, for example, when a new competitor enters the market or a subsidy is discontinued.

The algorithm identifies a trend change in the following cases:

If the change between two adjacent sections of the trend slope is more than the minimum percentage defined in the forecast automation profile.
For example, if there was a positive trend in the sales of a product in November because Christmas was approaching and the trend continued with a much steeper slope in December when Christmas was really close, the algorithm will identify a trend change in the time series.
If the change between two adjacent sections of the trend slope is not large enough to meet the minimum requirement but a significant level shift can also be observed between the sections.
For example, if there was a positive trend in the sales of a product in November and the trend continued with the same slope but much higher values in December, the algorithm will identify a trend change even though the slope of the trend didn't change.

If both the level shift and the trend change are significant in a time series, the analysis will only identify the trend change.

Change point detection consists of the following steps:

Set the rules
As an administrator or configuration expert, you set up the rules for change point detection in the Manage Forecast Automation Profiles app.
Run change point detection
As a business user, you schedule and run a Forecast Automation job in the Application Jobs app, for which you select the Change Point Detection option. You can also select Time Series Analysis for the same application job, but this is not mandatory.
Check the results (optional)
Optionally, you can check the results of the forecast automation job that you just ran. To do so, select or open the corresponding forecast automation profile in the Manage Forecast Automation Profiles app, choose Show Analysis Results, and then Change Points.
Leverage detection results
See section “How to use Change Points detection to improve your forecast”.

Let’s take a deep dive into the parameters you can set in the Manage Forecast Automation Profiles app related to change point detection.

Change Points settings.png

Minimum Interval

When change point detection is performed, the system divides the historical horizon into time intervals that are either bordered by two change points, or by the first and last change points and the respective ends of the historical horizon. With this setting you can specify the minimum length of these intervals to help the system calculate statistically meaningful results.

For example, you can specify that there should be a minimum interval of 6 months between two successive change points. This also means that there must be at least 6 months between the start of the historical horizon and the first change point, as well as between the last change point and the end of the historical horizon.

The interval is defined in terms of the periodicity you have chosen for the target calculation level.

Note that you can only enter an integer equal to or higher than 6 for this setting.
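
The effect of the minimum interval can be expressed as a simple check on segment lengths. In the Python sketch below, the function name and the convention that a change point is the 0-based index of the first period of a new segment are assumptions made for illustration.

```python
def respects_minimum_interval(change_points, n_periods, minimum_interval=6):
    """Check that every segment created by the change points is long enough.

    `change_points` holds the 0-based index of the first period of each
    new segment; `n_periods` is the length of the historical horizon.
    """
    borders = [0] + sorted(change_points) + [n_periods]
    return all(b - a >= minimum_interval for a, b in zip(borders, borders[1:]))


# 36 months of history with a minimum interval of 6 months.
print(respects_minimum_interval([12, 24], 36))  # all segments have 12 periods
print(respects_minimum_interval([12, 15], 36))  # middle segment is too short
print(respects_minimum_interval([3], 36))       # too close to the start
print(respects_minimum_interval([32], 36))      # too close to the end
```

The same rule covers the distance between two change points and the distance to either end of the horizon, because the horizon borders are treated like change points.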

Minimum Level Shift (%) and Minimum Trend Change (%)

The system uses these settings to perform the following steps:

It distinguishes trends from levels by applying the following rule: a slope is a level if its absolute value doesn’t reach 50% of the value selected for minimum trend change (default is 20%). Otherwise, it is a trend.
The system checks the following in order to prevent the identification of change points that are too small to be significant:
- If a different level is found on either side of a change, the system checks whether the difference between the means of the values on those levels is larger than the minimum level shift. If it is larger, the change is identified as a level shift. If it is smaller, no change point is detected.
- If a trend was found on one or both sides of a change, the system checks the following:
  - Whether the change between the two slopes is larger than the minimum trend change
  - Whether the difference between the means of the values on the two slopes is larger than the minimum level shift

If at least one of these conditions is met, the change is identified as a trend change. Otherwise, no change point is detected.
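
The decision rules above can be summarized in a small Python function. This is a sketch under several assumptions: slopes are expressed as a percentage so that they are comparable with the Minimum Trend Change (%) setting, the level shift is measured relative to the mean before the change, and the 20% value used for the minimum level shift in the examples is arbitrary. How SAP IBP normalizes these quantities internally is not described in this blog.

```python
def classify_change(left_slope_pct, right_slope_pct, left_mean, right_mean,
                    min_level_shift_pct, min_trend_change_pct=20.0):
    """Classify a candidate change as level shift, trend change, or nothing."""
    # A slope counts as a level if it stays below 50% of the minimum trend change.
    level_limit = 0.5 * min_trend_change_pct
    left_is_level = abs(left_slope_pct) < level_limit
    right_is_level = abs(right_slope_pct) < level_limit

    # Relative difference between the means on both sides (assumed definition).
    if left_mean == 0:
        shift_pct = float("inf") if right_mean != 0 else 0.0
    else:
        shift_pct = abs(right_mean - left_mean) / abs(left_mean) * 100.0
    significant_shift = shift_pct > min_level_shift_pct

    if left_is_level and right_is_level:
        return "level shift" if significant_shift else "no change point"

    significant_slope_change = (
        abs(right_slope_pct - left_slope_pct) > min_trend_change_pct
    )
    if significant_slope_change or significant_shift:
        return "trend change"
    return "no change point"


print(classify_change(0, 0, 100, 150, min_level_shift_pct=20))
print(classify_change(15, 15, 100, 160, min_level_shift_pct=20))
print(classify_change(2, 3, 100, 105, min_level_shift_pct=20))
```

The second call mirrors the November and December example from earlier: the slope is unchanged, but the jump in the mean is large enough for the change to be classified as a trend change.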

How to use Change Points detection to improve your forecast

You can set the Multiple Linear Regression, Auto-ARIMAX/SARIMAX, Gradient Boosting of Decision Trees and Extreme Gradient Boosting algorithms to consider change points for a more accurate forecast. If none of these algorithms is set to consider change points, you can deactivate Change Point detection during the Time Series Analysis, as the change points will not be used anywhere else.
Whenever change points are detected in a time series, the series is split so that the algorithm treats each segment separately. For very short time series, this might result in a worse forecast.

If the Consider Change Points option is selected for these algorithms in the Manage Forecast Models app and change points were previously found in the time series, the system divides the time series into segments between the adjacent change points.
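
Splitting a series at its change points is straightforward to sketch in Python. The function name, the index convention (a change point is the 0-based index of the first period of a new segment), and the sample data are assumptions for illustration.

```python
def split_at_change_points(values, change_points):
    """Divide a series into segments between adjacent change points."""
    borders = [0] + sorted(change_points) + [len(values)]
    return [values[a:b] for a, b in zip(borders, borders[1:])]


# Hypothetical history with two level shifts.
history = [10] * 6 + [20] * 6 + [15] * 6
segments = split_at_change_points(history, [6, 12])

for segment in segments:
    print(len(segment), segment)
```

With short histories, each segment contains only a few data points, which is why considering change points can make the forecast worse in such cases.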

Example

Let’s say that there are two change points in the time series, both of which were caused by external factors and may not occur again in the future. Change point detection will result in the following micro chart in this case:

change points 1.png

When the change points are not considered by the Multiple Linear Regression algorithm, the time series is interpreted as one having an upward trend and the forecast is calculated as if this trend was expected to continue, which is probably not accurate. This is shown in the following chart:

change points 2.png

However, when the change points are considered, the ex-post forecast fits the historical data perfectly and no further changes are predicted for the future horizon. This is illustrated by the chart below:

change points 3.png

Additional settings in Forecast Automation Profiles

In addition to all the settings related to the algorithmic details of time series analysis and change point detection, you can specify how you want the results to be saved. These outputs are optional; you can always review the results directly in the Manage Forecast Automation Profiles app.

Attribute for time series properties
To save the time series property (e.g. “Intermittent with upward trend”) in an attribute, which can then be used to filter for time series of a similar type.
A typical use case for saving the time series properties to an attribute is to define different forecast jobs with different forecast models and use the time series property as a planning filter for the jobs. For example, you could define one job for sporadic items without outlier correction and another job for the other items with outlier correction.
Key figure for seasonality patterns
To save the seasonality pattern found by the seasonality test in a key figure. This key figure, together with the key figure for trend pattern and the key figure for residual pattern, makes up the original input key figure. By saving all of them to key figures you can visualize the different statistical components of your time series. The seasonality pattern key figure can also be used as an independent variable for advanced algorithms.
Key figures for average demand interval
To save the average demand interval (time-independent) in a key figure. This key figure can be used to understand how sporadic the time series is.
Key figures for trend pattern
To save the trend found by the trend test in a key figure. This key figure, together with the key figure for seasonality pattern and the key figure for residual pattern, makes up the original input key figure. By saving all of them to key figures you can visualize the different statistical components of your time series. The trend pattern key figure can also be used as an independent variable for advanced algorithms.
Key figures for residual pattern
To save what is left of the original key figure after removing seasonal pattern and trend to a key figure. This key figure, together with the key figure for seasonality pattern and the key figure for trend pattern, forms the original input key figure. By saving all of them to key figures you can visualize the different statistical components of your time series.
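
The relationship between the three pattern key figures can be illustrated with a simple additive decomposition in Python. This sketch assumes a linear trend and additive seasonality and is not the method used by SAP IBP; the function name and the sample data are hypothetical.

```python
def decompose_additive(values, cycle_length):
    """Split a series into linear trend, seasonal pattern, and residual."""
    n = len(values)
    xs = range(n)
    x_mean = sum(xs) / n
    y_mean = sum(values) / n

    # Least-squares line through the data as the trend component.
    slope = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, values)) / sum(
        (x - x_mean) ** 2 for x in xs
    )
    intercept = y_mean - slope * x_mean
    trend = [intercept + slope * x for x in xs]

    # Seasonal value: average detrended value at each position in the cycle.
    detrended = [y - t for y, t in zip(values, trend)]
    seasonal_by_position = []
    for position in range(cycle_length):
        same_position = detrended[position::cycle_length]
        seasonal_by_position.append(sum(same_position) / len(same_position))
    seasonal = [seasonal_by_position[i % cycle_length] for i in range(n)]

    # Residual: whatever trend and seasonality do not explain.
    residual = [y - t - s for y, t, s in zip(values, trend, seasonal)]
    return trend, seasonal, residual


history = [12, 18, 15, 10, 14, 21, 17, 12, 16, 23, 20, 13]  # hypothetical
trend, seasonal, residual = decompose_additive(history, cycle_length=4)

rebuilt = [t + s + r for t, s, r in zip(trend, seasonal, residual)]
print(all(abs(a - b) < 1e-9 for a, b in zip(rebuilt, history)))
```

Adding the three components back together reproduces the original series, which is the sense in which the pattern key figures together make up the input key figure.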

Conclusion

Time Series Analysis and Change Point detection help you understand the pattern in your data and ultimately improve the quality of your forecast. I hope this deep dive into Forecast Automation in SAP IBP has helped you understand the tools we offer to make this process automated and tailored to your needs.

For more information see the detailed documentation here.

SAP IBP: Enhancing Forecast Accuracy with Time Series Analysis and Change Point Detection
