FOREIGN TRADE UNIVERSITY FACULTY OF INTERNATIONAL ECONOMICS
-*** -
ECONOMIC FORECAST MID-TERM ASSIGNMENT
Forecasting Vietnam’s export value from October 2019
to December 2020 by time series analysis method and Box-Jenkins method using seasonal ARIMA model
Lecturer: PhD Chu Thi Mai Phuong
Class: KTEE 418.1
Students:
Hanoi, December 12, 2019
Contents
Abstract
1 Introduction
2 Methods and processes
2.1 Time series analysis method
2.2 Box-Jenkins method and seasonal ARIMA model
3 Data and forecast results
3.1 Data description
3.2 The process of forecasting
4 Conclusion
5 References
Abstract
In this report, we use the time series analysis method and the Box-Jenkins method with a seasonal ARIMA (SARIMA) model to forecast the total export value of Vietnam from October 2019 to December 2020. The forecast results provided by both methods are reliable. Of the two methods, we find time series analysis preferable.
1 Introduction
Nowadays, in the era of globalization, trade is indispensable to the economy of every nation and territory. Because it plays such a crucial role, not only the statistics, analysis and evaluation of imports and exports but also their forecasting is an ongoing task for economists, and especially for policymakers. Besides the state, firms also follow and forecast the import and export situation closely to support their business in line with global trends.
To understand why economic forecasting plays such an important role, we first need to understand what forecasting is. Forecasting is prediction based on statistical data and analysis by scientific methods. The object of forecasting is the state and development trend of a future business, scientific or social activity. A forecast is probabilistic, but it is also reliable because forecasters rely on real data to identify trends.
Vietnam is a country with a favorable geographical position, located in the tropical monsoon climate zone, with the advantage of diverse agricultural products and rare minerals. The export of agricultural products and minerals is a crucial activity that benefits Vietnam's economy. It is the main source of foreign currency revenue, promotes production, creates jobs and is significant for external relations.
However, the lack of complete control over the production and quality of agricultural products sold abroad, as well as the dependence on a few importing countries, appears to be hindering Vietnam's export industry.
Export products are always closely related to climate, crops and production patterns throughout the territory of Vietnam. Understanding the relationship between these products and export value will help the government and firms plan future production and business for the most efficient export activities.
This is the mission of economic forecasting.
To clarify the forecasting method for Vietnam's export activities, our research team uses the econometric software EViews to run models that forecast the total export value of Vietnam from October 2019 to December 2020, based on data collected from the General Statistics Office of Vietnam.
2 Methods and processes
2.1 Time series analysis method
Forecasting process:
Step 1: Identifying the data
Test whether the sequence is multiplicative or additive by observing how its fluctuations change over time.
Step 2: Excluding the seasonal factor from the sequence
The seasonal factor is adjusted by using the ratio-to-moving-average method:
Calculate the centered moving average CMA4 if the sequence is quarterly, or CMA12 if the sequence is monthly.
Calculate the ratio between each observation of the original series and the corresponding value of the moving average series:
Series of ratios: $\frac{Y_t}{Y_t^{MA}} = \frac{Y_{1,3}}{Y_{1,3}^{MA}}, \frac{Y_{1,4}}{Y_{1,4}^{MA}}, \dots, \frac{Y_{m,2}}{Y_{m,2}^{MA}}$
Calculate the ratios for each quarter / month
Adjust the original series by the seasonal indexes: each quarter / month i has a seasonal index that reflects the impact of the season. The adjusted series values are:
Multiplicative model: $Y_{ji}^{SAR} = \frac{Y_{ji}}{SR_i}$
Additive model: $Y_{ji}^{SAD} = Y_{ji} - SD_i$
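The ratio-to-moving-average steps above can be sketched in Python as follows. This is a minimal illustration and not the EViews routine: the function names, the toy quarterly series and the choice to rescale the indexes so that they average to one are assumptions made for the example, and the series is assumed to start in the first season of a year.

```python
from statistics import mean

def centered_ma(y, period):
    """Centered moving average for an even period (CMA4 or CMA12)."""
    half = period // 2
    cma = [None] * len(y)              # undefined at both ends of the sample
    for t in range(half, len(y) - half):
        window = y[t - half : t + half + 1]
        # the two end points of the window get half weight
        total = 0.5 * window[0] + sum(window[1:-1]) + 0.5 * window[-1]
        cma[t] = total / period
    return cma

def seasonal_indexes(y, period):
    """Seasonal indexes SR_i: average ratio Y_t / CMA_t for each season i."""
    ratios = [[] for _ in range(period)]
    for t, (obs, ma) in enumerate(zip(y, centered_ma(y, period))):
        if ma is not None:
            ratios[t % period].append(obs / ma)
    raw = [mean(r) for r in ratios]
    scale = mean(raw)                  # assumed normalisation: indexes average to 1
    return [r / scale for r in raw]

# Toy quarterly data (hypothetical): linear trend times a known seasonal pattern
pattern = [0.8, 1.0, 1.2, 1.0]
y = [(100 + 2 * t) * pattern[t % 4] for t in range(24)]

sr = seasonal_indexes(y, period=4)
y_sa = [obs / sr[t % 4] for t, obs in enumerate(y)]   # multiplicative adjustment
print([round(s, 3) for s in sr])
```

For the additive model the same structure applies with differences Y_t - CMA_t in place of ratios and subtraction in place of division.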
Step 3: Estimating the trend function and forecasting
Estimate the trend function
Violation tests:
Omitted variables test
Autocorrelation test
Heteroskedasticity (non-constant variance) test
Normality test of the residuals
Forecast in the sample
Step 4: Combining the trend and seasonal factors to get the final forecast result
Using the model whose in-sample forecast has the lowest MAPE, we forecast outside the sample to obtain $Y^{SAF}$.
The final forecast values are:
Multiplicative model: $Y^{f} = Y^{SAF} \times SR$
Additive model: $Y^{f} = Y^{SAF} + SD$
2.2 Box-Jenkins method and seasonal ARIMA model
The Box-Jenkins method uses the ARIMA(p, d, q) model, which consists of:
AR(p): the autoregressive component of order p
Y(d): the series made stationary by differencing d times
MA(q): the moving average component of order q
The model has the equation:
$Y^{(d)}_t = c + \Phi_1 Y^{(d)}_{t-1} + \dots + \Phi_p Y^{(d)}_{t-p} + \theta_1 u_{t-1} + \dots + \theta_q u_{t-q} + u_t$
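As a worked illustration of the general equation (a special case chosen for the example, not a model estimated in this report), take ARIMA(1, 1, 1). Writing the first difference out shows how the model translates back to the level of the series:

```latex
\begin{aligned}
Y^{(1)}_t &= Y_t - Y_{t-1} \\
Y^{(1)}_t &= c + \Phi_1 Y^{(1)}_{t-1} + \theta_1 u_{t-1} + u_t \\
\Rightarrow\quad Y_t &= c + (1 + \Phi_1)\, Y_{t-1} - \Phi_1 Y_{t-2} + \theta_1 u_{t-1} + u_t
\end{aligned}
```

The last line follows by substituting the first line into the second for both $Y^{(1)}_t$ and $Y^{(1)}_{t-1}$.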
The SARIMA model was developed from the ARIMA model to fit seasonal time series data, whether the season spans 4 quarters or 12 months in a year, or 7 days in a week. If the observed data series is seasonal, the general ARIMA model is called SARIMA(p, d, q)(P, D, Q), where P and Q are the orders of the seasonal AR and MA components, respectively, and D is the order of seasonal differencing.
Forecasting process:
Step 1: Excluding the seasonal factor from the sequence
Step 2: Applying SARIMA model for the adjusted sequence
Step 2.1 Stationarity test
A time series is stationary if its mean, its variance and its covariances (at each lag) stay the same over time. The sequence must be stationary in order to be used to predict the trend in future periods.
Mean: $E(Y_t) = \mu = \text{const}$
Variance: $Var(Y_t) = \sigma^2 = \text{const}$
Covariance: $Cov(Y_t, Y_{t-p}) = \gamma_p$, which depends only on the lag p and not on t
To see whether the sequence is stationary, we can use the autoregressive model $Y_t = \rho Y_{t-1} + u_t$ with the hypotheses:
$H_0: \rho = 1$, $Y_t$ is non-stationary
$H_1: \rho < 1$, $Y_t$ is stationary
If the sequence is stationary at level, we have I (d = 0)
If the first difference of the sequence is stationary, we have I (d = 1)
If the second difference of the sequence is stationary, we have I (d = 2)
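The idea behind the test can be illustrated with a small simulation. The sketch below is an assumption-laden toy, not the EViews unit root test: it estimates rho by OLS without an intercept for a simulated random walk and for its first difference. A formal test compares a t-type statistic with Dickey-Fuller critical values rather than with the usual t table; the helper names and the simulated data are hypothetical.

```python
import random

def ar1_slope(y):
    """OLS estimate of rho in Y_t = rho * Y_{t-1} + u_t (no intercept)."""
    num = sum(cur * prev for cur, prev in zip(y[1:], y[:-1]))
    den = sum(prev * prev for prev in y[:-1])
    return num / den

def difference(y, d=1):
    """Apply the first-difference operator d times."""
    for _ in range(d):
        y = [cur - prev for prev, cur in zip(y[:-1], y[1:])]
    return y

# Simulated random walk: non-stationary in levels, stationary after one difference
random.seed(0)
walk, level = [], 0.0
for _ in range(500):
    level += random.gauss(0, 1)
    walk.append(level)

rho_level = ar1_slope(walk)              # close to 1
rho_diff = ar1_slope(difference(walk))   # close to 0
print(round(rho_level, 3), round(rho_diff, 3))
```

In this toy case the series would be classified as I(d = 1): the estimate of rho is near one in levels and far below one after differencing once.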
Step 2.2 Determining the p, q values of ARIMA model
After the stationarity test, we determine the orders of the AR and MA components through the Auto-Correlation Function (ACF) and the Partial Auto-Correlation Function (PACF).
The autoregressive model of order p, AR(p), is written as follows:
$Y_t = \phi_0 + \sum_{i=1}^{p} \phi_i Y_{t-i} + u_t$
The value of p is determined from the PACF correlogram.
The moving average model of order q, MA(q), is written as follows:
$Y_t = \theta_0 + \sum_{j=1}^{q} \theta_j u_{t-j} + u_t$
The value of q is determined from the ACF correlogram.
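To make the identification step concrete, the sketch below computes sample ACF and PACF values in plain Python. It is an illustration under assumptions: the PACF is obtained with the Durbin-Levinson recursion (EViews may use a different estimator), and the AR(1) data are simulated for the example. The usual reading is that the PACF of a pure AR(p) process cuts off after lag p, while the ACF of a pure MA(q) process cuts off after lag q.

```python
import random

def acf(y, max_lag):
    """Sample autocorrelations r_0 .. r_max_lag."""
    n = len(y)
    m = sum(y) / n
    dev = [v - m for v in y]
    c0 = sum(d * d for d in dev)
    return [sum(dev[t] * dev[t - k] for t in range(k, n)) / c0
            for k in range(max_lag + 1)]

def pacf(y, max_lag):
    """Partial autocorrelations for lags 1 .. max_lag (Durbin-Levinson)."""
    r = acf(y, max_lag)
    phi_prev, out = [], []
    for k in range(1, max_lag + 1):
        num = r[k] - sum(phi_prev[j] * r[k - 1 - j] for j in range(k - 1))
        den = 1 - sum(phi_prev[j] * r[j + 1] for j in range(k - 1))
        phi_kk = num / den
        # update the lower-order coefficients, then append the new one
        phi_prev = [phi_prev[j] - phi_kk * phi_prev[k - 2 - j]
                    for j in range(k - 1)] + [phi_kk]
        out.append(phi_kk)
    return out

# Simulated AR(1) series with coefficient 0.7 (hypothetical data)
random.seed(1)
y = [0.0]
for _ in range(1999):
    y.append(0.7 * y[-1] + random.gauss(0, 1))

r = acf(y, 3)
p = pacf(y, 3)
print([round(v, 2) for v in r[1:]], [round(v, 2) for v in p])
```

For this simulated series the ACF decays gradually while the PACF is large only at lag 1, which is the pattern that would suggest p = 1 and q = 0.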
Step 2.3 Testing the assumptions of the model
Stability and invertibility test
White noise test
Forecast quality test
Step 2.4 Forecasting outside of the sample
The model is suitable if it passes all of the above tests, and it is then used for forecasting.
Step 3: Forecasting the original data series
After obtaining the forecast of the adjusted series, multiply by the seasonal factors (multiplicative model) or add them (additive model) to get the forecast of the original series.
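The recombination in Step 3 amounts to one line of arithmetic per period. The sketch below shows it for both model types; the function name and all numbers are hypothetical and serve only to illustrate the operation.

```python
def reseasonalise(adjusted_forecast, seasonal_factors, model="multiplicative"):
    """Put the seasonal component back into a seasonally adjusted forecast."""
    if model == "multiplicative":
        return [f * s for f, s in zip(adjusted_forecast, seasonal_factors)]
    if model == "additive":
        return [f + s for f, s in zip(adjusted_forecast, seasonal_factors)]
    raise ValueError("model must be 'multiplicative' or 'additive'")

# Hypothetical numbers for three forecast periods
adjusted = [100.0, 102.0, 104.0]
ratios = [0.95, 1.10, 0.98]        # multiplicative factors (SR)
offsets = [-5.0, 10.0, -2.0]       # additive factors (SD)

mult = reseasonalise(adjusted, ratios)
add = reseasonalise(adjusted, offsets, model="additive")
print(mult, add)
```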
3 Data and forecast results
3.1 Data description:
- The data used in this research is the monthly total export value of Vietnam (unit: million USD) from January 2011 to September 2019, published by the General Statistics Office of Vietnam on its website https://www.gso.gov.vn/. The forecasts are produced with the EViews program.
- Resize data
The very first step when forecasting with EViews is to expand the range of observations to include the periods to be forecast. In our case, as we forecast Vietnam’s export value from October 2019 to December 2020, we click on Range in the Workfile window (currently 2011M01 2019M09, 105 observations) and, under Date specification, change the End date to 2020M12. The workfile now has 120 observations, of which 15 are forecast observations from 2019M10 to 2020M12.
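The observation counts quoted above can be checked with simple month arithmetic. The helper below is a hypothetical convenience function written for this check.

```python
def months_between(start, end):
    """Inclusive number of months between two (year, month) pairs."""
    (y0, m0), (y1, m1) = start, end
    return (y1 - y0) * 12 + (m1 - m0) + 1

in_sample = months_between((2011, 1), (2019, 9))     # estimation sample
full_range = months_between((2011, 1), (2020, 12))   # resized workfile
horizon = full_range - in_sample                     # periods to forecast

print(in_sample, full_range, horizon)  # 105 120 15
```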
- To check whether the data have a seasonal factor, we open the series export and choose View → Graph → Seasonal Graph
[Figure: seasonal graph "EXPORT by Season" with the means by season]
The graph shows clearly that the means by season differ from one period to another, so this data series has a seasonal factor. Therefore, when running the model, we have to remove the seasonal factor from the data series in order to forecast with high accuracy.
3.2 The process of forecasting
- Step 1: Identify the data
By using the command "line export", we have the following graph:
[Figure: line graph of EXPORT, 2011 to 2020]
The graph shows that the amplitude of the fluctuations widens over time. Thus, we conclude that the multiplicative model is suitable for the data.
- Step 2: Seasonal Adjustment (Detach the seasonal component)
To detach the seasonal component of this data, we do as follows:
Open the series export → Proc → Seasonal Adjustment → Moving Average Methods
In the Adjustment Method box, we choose Ratio to moving average – Multiplicative
In the Series to calculate box, we name the adjusted series exportsa and the seasonal factor sr
Sample: 2011M01 2020M12
Included observations: 105
Ratio to Moving Average
Original Series: EXPORT
Adjusted Series: EXPORTSA
Scaling Factors:
8 1.088853
From the third step onward, the two methods take different approaches:
Time series analysis method with multiplicative model
- Step 3: Estimate the exportsa series based on the trend function
In the Command window, we type the commands:
genr t=@trend(2011m01) to create the trend variable t
ls exportsa c t to estimate exportsa on the trend variable t
Dependent Variable: EXPORTSA
Method: Least Squares
Date: 12/11/19 Time: 22:22
Sample (adjusted): 2011M01 2019M09
Included observations: 105 after adjustments

Variable    Coefficient   Std. Error   t-Statistic   Prob.
C           6673.193      197.0313     33.86870      0.0000
T           142.5878      3.273565     43.55734      0.0000

R-squared            0.948506    Mean dependent var      14087.76
Adjusted R-squared   0.948006    S.D. dependent var      4458.812
S.E. of regression   1016.704    Akaike info criterion   16.70538
Sum squared resid    1.06E+08    Schwarz criterion       16.75594
Log likelihood       -875.0327   Hannan-Quinn criter.    16.72587
F-statistic          1897.242    Durbin-Watson stat      1.169695
Prob(F-statistic)    0.000000
As T has a very large t-statistic and P-value = 0.0000 < 5%, the coefficient of T, and hence the model, is statistically significant at the 5% significance level.
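The significance statistics in the output can be reproduced from the reported coefficients and standard errors. The short check below uses only numbers from the table above; the variable names are ours. With a single regressor besides the constant, the F-statistic equals the square of the slope's t-statistic.

```python
coef = {"C": 6673.193, "T": 142.5878}     # reported coefficients
se = {"C": 197.0313, "T": 3.273565}       # reported standard errors

# t-statistic = coefficient / standard error
t_stats = {name: coef[name] / se[name] for name in coef}

# one slope coefficient, so F = t^2
f_from_t = t_stats["T"] ** 2

print({k: round(v, 3) for k, v in t_stats.items()}, round(f_from_t, 2))
```

The results agree with the reported t-statistics (33.86870 and 43.55734) and F-statistic (1897.242) up to rounding.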
Omitted Variable Test:
We have the hypotheses:
H0: The model does not omit any variable
H1: The model omits variable(s)
On the estimation window, we click View → Stability Diagnostics → Ramsey RESET Test and choose Number of fitted terms = 1
Specification: EXPORTSA C T
Omitted Variables: Squares of fitted values
Statistic          Value      df        Probability
t-statistic        6.641224   102       0.0000
F-statistic        44.10586   (1, 102)  0.0000
Likelihood ratio   37.73265   1         0.0000
According to this result, we have P-value = 0.0000 < α = 5%, so we reject H0 and accept H1.
The model has omitted variable(s).
After adding the variables t^2 and t^3, the model is statistically significant but still does not pass the omitted variable test.
Specification: EXPORTSA C T T^2
Omitted Variables: Squares of fitted values
Statistic          Value      df        Probability
t-statistic        2.188769   101       0.0309
F-statistic        4.790708   (1, 101)  0.0309
Likelihood ratio   4.865928   1         0.0274

Specification: EXPORTSA C T T^2 T^3
Omitted Variables: Squares of fitted values
Statistic          Value      df        Probability
t-statistic        3.320142   100       0.0013
F-statistic        11.02335   (1, 100)  0.0013
Likelihood ratio   10.97988   1         0.0009
We therefore run the least squares regression of log(exportsa) on the trend: ls log(exportsa) c t
Dependent Variable: LOG(EXPORTSA)
Method: Least Squares
Date: 12/11/19 Time: 23:06
Sample (adjusted): 2011M01 2019M09
Included observations: 105 after adjustments

Variable    Coefficient   Std. Error   t-Statistic   Prob.
C           8.962084      0.012285     729.5251      0.0000
T           0.010392      0.000204     50.91531      0.0000

R-squared            0.961786    Mean dependent var      9.502473
Adjusted R-squared   0.961415    S.D. dependent var      0.322716
S.E. of regression   0.063391    Akaike info criterion   -2.660123
Sum squared resid    0.413898    Schwarz criterion       -2.609571
Log likelihood       141.6565    Hannan-Quinn criter.    -2.639638
F-statistic          2592.369    Durbin-Watson stat      1.743267
Prob(F-statistic)    0.000000
As T has a very large t-statistic and P-value = 0.0000 < α = 5%, the model is statistically significant at the 5% significance level.
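To read the log-linear trend in practical terms, the sketch below converts the estimated coefficients into an implied monthly growth rate and a trend value for a given month. It is a hedged illustration rather than the forecast reported by the authors: it assumes that @trend(2011m01) equals 0 in January 2011 and increases by 1 each month, it applies a plain exponential with no retransformation correction, and it returns the seasonally adjusted value before the seasonal factor is put back. The function names are ours.

```python
import math

INTERCEPT = 8.962084   # coefficient C in the LOG(EXPORTSA) regression
SLOPE = 0.010392       # coefficient on T

def trend_index(year, month, base=(2011, 1)):
    """Assumed value of t: 0 in the base month, +1 for each later month."""
    return (year - base[0]) * 12 + (month - base[1])

def exportsa_fitted(year, month):
    """Trend value of exportsa implied by the log-linear model."""
    return math.exp(INTERCEPT + SLOPE * trend_index(year, month))

monthly_growth = math.exp(SLOPE) - 1          # about 1.04% per month
first_forecast = exportsa_fitted(2019, 10)    # first out-of-sample month, t = 105

print(round(100 * monthly_growth, 3), round(first_forecast, 1))
```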
On the estimation window, we click View → Stability Diagnostics → Ramsey RESET Test → Number of fitted terms = 1
Specification: LOG(EXPORTSA) C T
Omitted Variables: Squares of fitted values
Statistic          Value      df        Probability
t-statistic        1.315627   102       0.1912
F-statistic        1.730874   (1, 102)  0.1912
Likelihood ratio   1.766833   1         0.1838
As P-value > α = 5%, we do not reject H0: the model has no omitted variable.
Testing Heteroskedasticity
We have the hypotheses:
H0: The model does not suffer from heteroskedasticity
H1: The model suffers from heteroskedasticity
On the estimation window, we click View → Residual Diagnostics → Heteroskedasticity Tests and choose Breusch-Pagan-Godfrey
Heteroskedasticity Test: Breusch-Pagan-Godfrey
F-statistic           4.422442   Prob. F(1,103)        0.0379
Obs*R-squared         4.322714   Prob. Chi-Square(1)   0.0376
Scaled explained SS   7.775218   Prob. Chi-Square(1)   0.0053
It can be seen that P-value = 0.0379 < α = 0.05, so we reject H0.
The model suffers from heteroskedasticity at the significance level α = 5%.
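The chi-square probability attached to the Obs*R-squared statistic can be reproduced by hand. The sketch below relies on the fact that a chi-square variable with one degree of freedom is the square of a standard normal variable, so its upper-tail probability can be written with the complementary error function; the function name is ours.

```python
import math

def chi2_sf_1df(x):
    """Upper-tail probability P(X > x) for a chi-square(1) variable."""
    return math.erfc(math.sqrt(x / 2.0))

lm_stat = 4.322714                 # Obs*R-squared from the test output
p_value = chi2_sf_1df(lm_stat)
reject_h0 = p_value < 0.05         # decision at the 5% level

print(round(p_value, 4), reject_h0)
```

The same function applies to any LM statistic with one degree of freedom.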
Testing Autocorrelation
We have the hypotheses:
H0: The model does not have autocorrelation
H1: The model has autocorrelation
On the estimation window, we click View → Residual Diagnostics → Serial Correlation LM Test and choose Lags to include = 1
Breusch-Godfrey Serial Correlation LM Test:
F-statistic     1.629849   Prob. F(1,102)        0.2046
Obs*R-squared   1.651399   Prob. Chi-Square(1)   0.1988
It can be seen that P-value = 0.2046 > α = 0.05, so we do not reject H0.
The model does not have autocorrelation.
Normality Test
We have the hypotheses:
H0: The residuals are normally distributed
H1: The residuals are not normally distributed
On the estimation window, we click View → Residual Diagnostics → Histogram - Normality Test
[Figure: histogram of the residuals]
Series: Residuals
Sample 2011M01 2019M09
Observations 105
Mean          -3.56e-16
Median        0.011236
Maximum       0.203016
Minimum       -0.217700
Std. Dev.     0.063086
Skewness      -0.350692
Kurtosis      4.738439
Jarque-Bera   15.37422
Probability   0.000459
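The Jarque-Bera statistic and its probability follow directly from the sample size, skewness and kurtosis listed above. The sketch below recomputes them with the standard formula JB = n/6 · (S² + (K − 3)²/4) and the chi-square distribution with two degrees of freedom, whose upper-tail probability is exp(−JB/2); the function name is ours.

```python
import math

def jarque_bera(n, skewness, kurtosis):
    """Jarque-Bera statistic and its chi-square(2) p-value."""
    jb = n / 6.0 * (skewness ** 2 + (kurtosis - 3.0) ** 2 / 4.0)
    p_value = math.exp(-jb / 2.0)   # upper tail of chi-square with 2 df
    return jb, p_value

jb, p_value = jarque_bera(n=105, skewness=-0.350692, kurtosis=4.738439)
print(round(jb, 4), round(p_value, 6))
```

Both values agree with the reported output (15.37422 and 0.000459) up to rounding.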