How do you calculate average forecast error?

Training and test sets

It is important to evaluate forecast accuracy using genuine forecasts. Consequently, the size of the residuals is not a reliable indication of how large true forecast errors are likely to be. The accuracy of forecasts can only be determined by considering how well a model performs on new data that were not used when fitting the model.

When choosing models, it is common practice to separate the available data into two portions, training and test data, where the training data is used to estimate any parameters of a forecasting method and the test data is used to evaluate its accuracy. Because the test data is not used in determining the forecasts, it should provide a reliable indication of how well the model is likely to forecast on new data.

The size of the test set is typically about 20% of the total sample, although this value depends on how long the sample is and how far ahead you want to forecast. The test set should ideally be at least as large as the maximum forecast horizon required. The following points should be noted.

  • A model which fits the training data well will not necessarily forecast well.
  • A perfect fit can always be obtained by using a model with enough parameters.
  • Over-fitting a model to data is just as bad as failing to identify a systematic pattern in the data.

Some references describe the test set as the “hold-out set” because these data are “held out” of the data used for fitting. Other references call the training set the “in-sample data” and the test set the “out-of-sample data”. We prefer to use “training data” and “test data” in this book.

Functions to subset a time series

The window() function introduced in Chapter 2 is useful when extracting a portion of a time series, such as we need when creating training and test sets. In the window() function, we specify the start and/or end of the portion of time series required using time values. For example,

window(ausbeer, start=1995)

extracts all data from 1995 onward.

Another useful function is subset() which allows for more types of subsetting. A great advantage of this function is that it allows the use of indices to choose a subset. For example,

subset(ausbeer, start=length(ausbeer)-4*5)

extracts the last 5 years of observations from ausbeer. It also allows extracting all values for a specific season. For example,

subset(ausbeer, quarter = 1)

extracts the first quarters for all years.

Finally, head() and tail() are useful for extracting the first few or last few observations. For example, the last 5 years of ausbeer can also be obtained using

tail(ausbeer, 4*5)
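
As a further illustration (our own sketch, not taken from the text), window() can be used to create the training and test sets discussed earlier; the split point below is an arbitrary choice for the quarterly ausbeer series.

train <- window(ausbeer, end = c(2007, 4))   # training data up to 2007 Q4
test  <- window(ausbeer, start = 2008)       # test data from 2008 Q1 onward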

Forecast errors

A forecast “error” is the difference between an observed value and its forecast. Here “error” does not mean a mistake, it means the unpredictable part of an observation. It can be written as \[ e_{T+h} = y_{T+h} - \hat{y}_{T+h|T}, \] where the training data is given by \[\{y_1,\dots,y_T\}\] and the test data is given by \[\{y_{T+1},y_{T+2},\dots\}\].

Note that forecast errors are different from residuals in two ways. First, residuals are calculated on the training set while forecast errors are calculated on the test set. Second, residuals are based on one-step forecasts while forecast errors can involve multi-step forecasts.

We can measure forecast accuracy by summarising the forecast errors in different ways.
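
To make this concrete, here is a minimal sketch of computing forecast errors in R, continuing the illustrative ausbeer split above and using a seasonal naïve forecast (the choice of method and split point is ours, not the text's).

fc <- snaive(train, h = length(test))   # forecasts covering the test period
e  <- test - fc$mean                    # forecast errors y[T+h] - yhat[T+h|T]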

Scale-dependent errors

The forecast errors are on the same scale as the data. Accuracy measures that are based only on \[e_{t}\] are therefore scale-dependent and cannot be used to make comparisons between series that involve different units.

The two most commonly used scale-dependent measures are based on the absolute errors or squared errors: \[\begin{align*} \text{Mean absolute error: MAE} & = \text{mean}(|e_{t}|),\\ \text{Root mean squared error: RMSE} & = \sqrt{\text{mean}(e_{t}^2)}. \end{align*}\] When comparing forecast methods applied to a single time series, or to several time series with the same units, the MAE is popular as it is easy to both understand and compute. A forecast method that minimises the MAE will lead to forecasts of the median, while minimising the RMSE will lead to forecasts of the mean. Consequently, the RMSE is also widely used, despite being more difficult to interpret.
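
Continuing the sketch above, both measures can be computed directly from the error series e.

mae  <- mean(abs(e))      # mean absolute error
rmse <- sqrt(mean(e^2))   # root mean squared error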

Percentage errors

The percentage error is given by \[p_{t} = 100 e_{t}/y_{t}\]. Percentage errors have the advantage of being unit-free, and so are frequently used to compare forecast performances between data sets. The most commonly used measure is: \[ \text{Mean absolute percentage error: MAPE} = \text{mean}(|p_{t}|). \] Measures based on percentage errors have the disadvantage of being infinite or undefined if \[y_{t}=0\] for any \[t\] in the period of interest, and having extreme values if any \[y_{t}\] is close to zero. Another problem with percentage errors that is often overlooked is that they assume the unit of measurement has a meaningful zero. For example, a percentage error makes no sense when measuring the accuracy of temperature forecasts on either the Fahrenheit or Celsius scales, because temperature has an arbitrary zero point.
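
A corresponding sketch for the MAPE, again using the illustrative e and test series from above:

p    <- 100 * e / test    # percentage errors p[t] = 100 e[t] / y[t]
mape <- mean(abs(p))      # mean absolute percentage error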

They also have the disadvantage that they put a heavier penalty on negative errors than on positive errors. This observation led to the use of the so-called “symmetric” MAPE (sMAPE) proposed by Armstrong (1978, p. 348), which was used in the M3 forecasting competition. It is defined by \[ \text{sMAPE} = \text{mean}\left(200|y_{t} - \hat{y}_{t}|/(y_{t}+\hat{y}_{t})\right). \] However, if \[y_{t}\] is close to zero, \[\hat{y}_{t}\] is also likely to be close to zero. Thus, the measure still involves division by a number close to zero, making the calculation unstable. Also, the value of sMAPE can be negative, so it is not really a measure of “absolute percentage errors” at all.

Hyndman & Koehler (2006) recommend that the sMAPE not be used. It is included here only because it is widely used, although we will not use it in this book.

Scaled errors

Scaled errors were proposed by Hyndman & Koehler (2006) as an alternative to using percentage errors when comparing forecast accuracy across series with different units. They proposed scaling the errors based on the training MAE from a simple forecast method.

For a non-seasonal time series, a useful way to define a scaled error uses naïve forecasts: \[ q_{j} = \frac{\displaystyle e_{j}} {\displaystyle\frac{1}{T-1}\sum_{t=2}^T |y_{t}-y_{t-1}|}. \] Because the numerator and denominator both involve values on the scale of the original data, \[q_{j}\] is independent of the scale of the data. A scaled error is less than one if it arises from a better forecast than the average naïve forecast computed on the training data. Conversely, it is greater than one if the forecast is worse than the average naïve forecast computed on the training data.

For seasonal time series, a scaled error can be defined using seasonal naïve forecasts: \[ q_{j} = \frac{\displaystyle e_{j}} {\displaystyle\frac{1}{T-m}\sum_{t=m+1}^T |y_{t}-y_{t-m}|}. \]

The mean absolute scaled error is simply \[ \text{MASE} = \text{mean}(|q_{j}|). \]
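
A sketch of the seasonal case (m = 4 for the quarterly ausbeer series used in the illustrations above), where the scaling factor is the in-sample MAE of seasonal naïve forecasts computed on the training data:

scale <- mean(abs(diff(train, lag = 4)))   # mean of |y[t] - y[t-4]| over the training data
q     <- e / scale                         # scaled errors
mase  <- mean(abs(q))                      # mean absolute scaled error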

Examples

Using the pipe operator %>%, the RMSE of the residuals from a drift forecast of the goog200 series can be computed as follows.

goog200 %>% rwf(drift=TRUE) %>% residuals() -> res
res^2 %>% mean(na.rm=TRUE) %>% sqrt()
#> [1] 6.169

The left hand side of each pipe is passed as the first argument to the function on the right hand side. This is consistent with the way we read from left to right in English. When using pipes, all other arguments must be named, which also helps readability. When using pipes, it is natural to use the right arrow assignment -> rather than the left arrow. For example, the first line above can be read as “Take the goog200 series, pass it to rwf() with drift=TRUE, compute the resulting residuals, and store them as res”.

We will use the pipe operator whenever it makes the code easier to read. In order to be consistent, we will always follow a function with parentheses to differentiate it from other objects, even if it has no arguments. See, for example, the use of sqrt() and residuals() in the code above.
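
In practice, the individual accuracy measures rarely need to be computed by hand: the accuracy() function in the forecast package reports RMSE, MAE, MAPE and MASE, among others, for a forecast object, on the training data and, when test data are supplied, on the test data as well. A minimal sketch (ours, not from the text) reusing the illustrative train and test objects from earlier:

fc <- snaive(train, h = length(test))   # forecasts for the test period
accuracy(fc, test)                      # accuracy measures on training and test data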

Example: using tsCV()

The goog200 data, plotted in Figure 3.5, comprises the daily closing stock prices of Google Inc from the NASDAQ exchange for 200 consecutive trading days starting on 25 February 2013.

The code below evaluates the forecasting performance of 1- to 8-step-ahead naïve forecasts with tsCV(), using MSE as the forecast error measure. The plot shows that the forecast error increases as the forecast horizon increases, as we would expect.

e <- tsCV(goog200, forecastfunction = naive, h = 8)
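
The original code that computes and plots the horizon-wise MSE is not reproduced above; the following is our own sketch of what such a computation might look like, assuming ggplot2 and the pipe operator are available (for example via the fpp2 and forecast packages).

# Compute the MSE for each forecast horizon, removing missing values.
mse <- colMeans(e^2, na.rm = TRUE)
# Plot the MSE against the forecast horizon.
data.frame(h = 1:8, MSE = mse) %>%
  ggplot(aes(x = h, y = MSE)) + geom_point()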
