Stock Price and Return Volatility Prediction Using ARIMA and ARCH-GARCH Models



NATIONAL ECONOMICS UNIVERSITY

Faculty of Mathematical Economics

ASSIGNMENT

SUBJECT: ECONOMETRICS II

Topic

Stock Price And Return Volatility Prediction Using ARIMA And ARCH-GARCH Models

Student name: Nguyen Thi Lan Nhi

Student ID: 11219243

Class: Actuary 63

Instructor: Bui Duong Hai


I Introduction

Established in early 2000, the Vietnamese stock market has become a very attractive investment channel for investors, from professional investment organizations to individual investors. However, alongside its high profitability, stock investment always carries many potential risks, because investors cannot always accurately predict the future trend of stock prices. Accurate prediction of stock price fluctuations is therefore becoming necessary for individuals and organizations that want to build a sound investment strategy. Many efforts have led to the development of a variety of quantitative modelling methods to forecast stock prices and volatility, ranging from the combination of autoregressive integrated moving average (ARIMA) and generalized autoregressive conditional heteroscedasticity (GARCH) models to the Gaussian process regression (GPR) model and the artificial neural network (ANN) model.

Within the scope of the Econometrics II course, this study focuses on applying ARIMA and ARCH-GARCH models to predict the closing prices and volatility of IMP, the share of Imexpharm Corporation, based on historical data.

2.1 ARIMA Model

ARIMA, which stands for Autoregressive Integrated Moving Average, is a statistical model used for the analysis of time series data and for forecasting future data points within the series. The model is built on the idea that the current value can be explained by its past values and the cumulative impact of past disturbances, assuming the time series is stationary (after differencing, if necessary). The ARIMA model consists of three main components, each of which is characterized by a parameter:

The Autoregressive (AR) component, denoted by p, which signifies the number of lagged values used to predict the current value in the time series. This component captures the relationship between the current value in the time series and its previous values.

The Integrated (I) component, denoted by d, which signifies the order of differencing. This component reflects the number of differencing operations required to make the time series stationary.

The Moving Average (MA) component, denoted by q, which represents the order of the moving average. This component captures the relationship between the current value in the time series and the white noise terms, based on past forecast errors.

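The ARIMA(p, d, q) model can be generalized by the following expression, written here in standard notation with $y_t = \Delta^d Y_t$ the series after d differences, $c$ a constant and $\varepsilon_t$ a white noise error term:

$$y_t = c + \phi_1 y_{t-1} + \dots + \phi_p y_{t-p} + \varepsilon_t + \theta_1 \varepsilon_{t-1} + \dots + \theta_q \varepsilon_{t-q}$$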


2.2 ARCH – GARCH Models

ARCH Model (Autoregressive Conditional Heteroskedasticity)

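The ARCH model was introduced by Robert F. Engle in 1982. It assumes that the conditional variance of the error term at each time point is a function of past squared error terms.

The ARCH(p) model is defined as follows, in standard notation with $z_t$ an i.i.d. shock with mean 0 and variance 1, $\omega > 0$ and $\alpha_i \ge 0$:

$$\varepsilon_t = \sigma_t z_t, \qquad \sigma_t^2 = \omega + \alpha_1 \varepsilon_{t-1}^2 + \dots + \alpha_p \varepsilon_{t-p}^2$$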

GARCH Model (Generalized Autoregressive Conditional Heteroskedasticity):

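The GARCH model, introduced by Tim Bollerslev in 1986, is an extension of the ARCH model. It allows for a more flexible specification of the conditional variance by including lagged values of both the conditional variance and the squared past error terms.

The GARCH(p, q) model is defined as follows, in standard notation:

$$\varepsilon_t = \sigma_t z_t, \qquad \sigma_t^2 = \omega + \sum_{i=1}^{p} \alpha_i \varepsilon_{t-i}^2 + \sum_{j=1}^{q} \beta_j \sigma_{t-j}^2$$

In which: $\sigma_t^2$ is the conditional variance, $z_t$ is an i.i.d. shock with mean 0 and variance 1, $\omega > 0$, $\alpha_i \ge 0$ and $\beta_j \ge 0$. The index convention assumed here takes p as the number of lagged squared errors, as in ARCH(p), and q as the number of lagged conditional variances.

The volatility is stationary if

$$\sum_{i=1}^{p} \alpha_i + \sum_{j=1}^{q} \beta_j < 1$$

Thus, the unconditional variance is:

$$\sigma^2 = \frac{\omega}{1 - \sum_{i=1}^{p} \alpha_i - \sum_{j=1}^{q} \beta_j}$$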

3.1 Data

For this study, I have collected a series of closing share prices of Imexpharm Corporation (HOSE: IMP) from 391 trading sessions between January 1st, 2022 and July 31st, 2023.

Imexpharm Corporation is a leading manufacturer and distributor of pharmaceutical products in Vietnam. The company's products include prescription drugs, over-the-counter drugs, and medical devices. Imexpharm's products are sold in over 50 countries worldwide. Moreover, IMP stock has been listed on HOSE since December 4th, 2006, and it is becoming more popular among Vietnamese investors. The company has a strong track record of growth and profitability. Imexpharm is expected to continue to grow in the coming years, as the Vietnamese pharmaceutical market is expected to grow significantly.


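Moreover, the dataset also includes the 1st difference series, a growth rate series and a log return series, which are derived from the original stock price series using the following formulae, where $P_t$ is the closing price at time t:

The 1st difference: $\Delta P_t = P_t - P_{t-1}$

Growth rate: $g_t = \dfrac{P_t - P_{t-1}}{P_{t-1}}$

Log return: $r_t = \ln\left(\dfrac{P_t}{P_{t-1}}\right)$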
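As an illustration of these three transformations, the following is a minimal NumPy sketch. It assumes the usual definitions (difference, simple growth rate, natural-log return); the function names and the price values are hypothetical and are not the IMP data.

```python
import numpy as np

def derive_series(prices):
    """Return the 1st difference, growth rate and log return of a price series."""
    p = np.asarray(prices, dtype=float)
    diff = p[1:] - p[:-1]               # P_t - P_{t-1}
    growth = diff / p[:-1]              # (P_t - P_{t-1}) / P_{t-1}
    log_ret = np.log(p[1:] / p[:-1])    # ln(P_t / P_{t-1})
    return diff, growth, log_ret

def prices_from_growth(last_price, growth):
    """Rebuild a price path from growth rates, starting after last_price."""
    return last_price * np.cumprod(1.0 + np.asarray(growth, dtype=float))

def prices_from_log_returns(last_price, log_returns):
    """Rebuild a price path from log returns, starting after last_price."""
    return last_price * np.exp(np.cumsum(np.asarray(log_returns, dtype=float)))

# Made-up closing prices, for illustration only.
prices = [60000.0, 61200.0, 60500.0, 62000.0]
diff, growth, log_ret = derive_series(prices)
rebuilt_from_growth = prices_from_growth(prices[0], growth)
rebuilt_from_log = prices_from_log_returns(prices[0], log_ret)
```

The two inverse functions show why forecasts of the growth rate or log return can be mapped back to price levels: both transformations are invertible once the last observed price is known.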

The time series plots for all series in the dataset are provided below:

Figure 1 IMP stock price series


Figure 2 The 1st difference series

Figure 3 Growth rate series

Figure 4 Log return series


3.2 Methodology

Forecasting stock prices using ARIMA model

The Box-Jenkins method is a systematic process for identifying, estimating, and diagnosing autoregressive integrated moving average (ARIMA) time series models. It was developed by George Box and Gwilym Jenkins in the 1970s, and it remains one of the most widely used methods for time series forecasting today.

Figure 5 Box-Jenkins method

In this study, the Box-Jenkins method is applied to three series (closing price, growth rate and log return) to forecast the future prices of IMP stock. The steps involved in the Box-Jenkins method can be summarized as follows:

Box-Jenkins Step 1: Identification

Stationarity Check

Examine the time series plot to identify any trends or seasonality. A stationary time series is often easier to model.

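Conduct Dickey-Fuller (DF) tests with trend, with drift and without drift to check for the stationarity of the series around the time trend, around the long-run mean, and around 0, respectively. In each case, let $\tau = \hat{\delta} / se(\hat{\delta})$ be the test statistic on the lagged level and $\tau_\alpha$ the DF critical value at significance level $\alpha$ (notation assumed here).

o DF test with trend: $\Delta y_t = \beta_1 + \beta_2 t + \delta y_{t-1} + u_t$

If $\beta_2$ is statistically significant and $|\tau| > |\tau_\alpha|$, then we can conclude that the series is stationary around the time trend. If $\beta_2$ is statistically insignificant, then proceed to the DF test with drift.

o DF test with drift: $\Delta y_t = \beta_1 + \delta y_{t-1} + u_t$

If $\beta_1$ is statistically significant and $|\tau| > |\tau_\alpha|$, then we can conclude that the series is stationary around the long-run mean. If $\beta_1$ is statistically insignificant, then proceed to the DF test without drift.

o DF test without drift: $\Delta y_t = \delta y_{t-1} + u_t$

If $|\tau| > |\tau_\alpha|$, then we can conclude that the series is stationary around value 0. In all cases, if $|\tau| < |\tau_\alpha|$, we conclude that the series is non-stationary.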
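The three DF regressions can be sketched with ordinary least squares in NumPy. This is a minimal illustration rather than the procedure used in the study: the function name `df_test`, the `spec` labels and the synthetic white-noise input are assumptions, and the critical values are the standard asymptotic 5% Dickey-Fuller values.

```python
import numpy as np

# Asymptotic 5% Dickey-Fuller critical values for the three specifications.
DF_CRIT_5PCT = {"none": -1.95, "drift": -2.86, "trend": -3.41}

def df_test(y, spec="none"):
    """Regress the differenced series on the lagged level (plus optional
    intercept and time trend) and return (delta_hat, tau)."""
    y = np.asarray(y, dtype=float)
    dy = np.diff(y)
    cols = [y[:-1]]                                   # lagged level y_{t-1}
    if spec in ("drift", "trend"):
        cols.append(np.ones(len(dy)))                 # intercept
    if spec == "trend":
        cols.append(np.arange(1, len(dy) + 1, dtype=float))  # time trend
    X = np.column_stack(cols)
    beta, *_ = np.linalg.lstsq(X, dy, rcond=None)
    resid = dy - X @ beta
    sigma2 = resid @ resid / (len(dy) - X.shape[1])   # residual variance
    se = np.sqrt(np.diag(sigma2 * np.linalg.inv(X.T @ X)))
    return beta[0], beta[0] / se[0]

# Synthetic white noise (stationary by construction), for illustration only.
rng = np.random.default_rng(0)
white_noise = rng.standard_normal(400)
delta, tau = df_test(white_noise, "drift")

# tau is negative for a stationary series, so tau < critical value is the
# same decision as |tau| > |critical value|.
is_stationary = tau < DF_CRIT_5PCT["drift"]
```

Note that the t-ratio on the lagged level must be compared with Dickey-Fuller critical values, not with the usual Student t table, because its distribution under the unit-root null is non-standard.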

Differencing

If the time series is not stationary, take differences until stationarity is achieved. By doing so, the degree of differencing d for the ARIMA model can be identified.

Autocorrelation and Partial Autocorrelation Analysis

Examine the autocorrelation function (ACF) and partial autocorrelation function (PACF) plots to identify potential autoregressive (AR) and moving average (MA) orders.
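To make the correlogram reading concrete, the sketch below computes the sample ACF, the PACF via the Durbin-Levinson recursion, and the lags that fall outside the approximate 95% band of ±1.96/√n. The function names and the simulated AR(1) series are assumptions used only for illustration.

```python
import numpy as np

def acf(x, nlags):
    """Sample autocorrelations rho_0..rho_nlags."""
    x = np.asarray(x, dtype=float)
    x = x - x.mean()
    denom = x @ x
    return np.array([1.0] + [(x[k:] @ x[:-k]) / denom for k in range(1, nlags + 1)])

def pacf(x, nlags):
    """Sample partial autocorrelations via the Durbin-Levinson recursion."""
    rho = acf(x, nlags)
    out = np.zeros(nlags + 1)
    out[0] = 1.0
    phi_prev = np.zeros(0)                 # coefficients of the order k-1 fit
    for k in range(1, nlags + 1):
        num = rho[k] - phi_prev @ rho[k - 1:0:-1]
        den = 1.0 - phi_prev @ rho[1:k]
        phi_kk = num / den
        phi_prev = np.append(phi_prev - phi_kk * phi_prev[::-1], phi_kk)
        out[k] = phi_kk
    return out

def significant_lags(values, n):
    """Lags (>= 1) whose correlation lies outside +/- 1.96 / sqrt(n)."""
    bound = 1.96 / np.sqrt(n)
    return np.where(np.abs(values[1:]) > bound)[0] + 1

# Simulated AR(1) series with coefficient 0.7, for illustration only.
rng = np.random.default_rng(0)
n = 2000
shocks = rng.standard_normal(n)
series = np.zeros(n)
for t in range(1, n):
    series[t] = 0.7 * series[t - 1] + shocks[t]

acf_vals = acf(series, 10)
pacf_vals = pacf(series, 10)
candidate_p = significant_lags(pacf_vals, n)   # suggests AR orders
candidate_q = significant_lags(acf_vals, n)    # suggests MA orders
```

For an AR(1) process the PACF cuts off after lag 1 while the ACF decays gradually, which is the pattern used to propose tentative p and q values.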

Box-Jenkins Step 2: Estimation

ARIMA Model Selection

Use the information gathered in the identification step to choose tentative orders for the ARIMA model.

Estimate the parameters of the chosen ARIMA models using both maximum likelihood estimation (MLE) and ordinary least squares (OLS) methods. For each model, record the estimated coefficients and their standard errors, together with the AIC and BIC values, in a table, so that models can be compared and the best model chosen for each of the three series: closing price, growth rate and log return of IMP stock.

Box-Jenkins Step 3: Diagnostic Checking

Stationary property by unit circle

Use the inverse roots of the autoregressive and moving average polynomials, plotted against the unit circle, to check the model. If all inverse roots lie within the unit circle, conclude that the AR process in the ARIMA model is stationary and the MA process is invertible.
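The unit-circle check can be sketched numerically. The snippet assumes the sign convention in which the AR polynomial is 1 − φ₁z − … − φₚzᵖ and the MA polynomial is 1 + θ₁z + … + θ_qz^q; the function names and coefficient values are hypothetical and are not the estimates from this study.

```python
import numpy as np

def inverse_roots(coefs, kind):
    """Inverse roots of an AR ('ar') or MA ('ma') lag polynomial.

    For AR coefficients phi, the inverse roots solve
    lambda^p - phi_1 lambda^(p-1) - ... - phi_p = 0.
    For MA coefficients theta, they solve
    lambda^q + theta_1 lambda^(q-1) + ... + theta_q = 0.
    """
    coefs = np.asarray(coefs, dtype=float)
    sign = -1.0 if kind == "ar" else 1.0
    return np.roots(np.concatenate(([1.0], sign * coefs)))

def all_inside_unit_circle(roots):
    """True when every inverse root has modulus strictly below 1."""
    return bool(np.all(np.abs(roots) < 1.0))

# Hypothetical MA(4) coefficients, for illustration only.
ma_roots = inverse_roots([0.15, -0.03, 0.05, -0.10], "ma")
ma_ok = all_inside_unit_circle(ma_roots)

# An explosive AR(1) with coefficient 1.2 fails the check.
ar_bad = all_inside_unit_circle(inverse_roots([1.2], "ar"))
```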

Residual Analysis

Examine the residuals of the estimated model for patterns or systematic behavior. The residuals should ideally be white noise.

Check the ACF and PACF of the residuals to ensure there are no significant autocorrelations.

Use the Ljung-Box test to check for the absence of autocorrelation in the residuals. If the p-value of the test is higher than 5%, conclude that the residual series is white noise.
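The Ljung-Box statistic is Q = n(n+2) Σ ρ̂ₖ²/(n−k) over lags k = 1..h. The sketch below computes Q with NumPy and compares it with the 5% chi-square critical value for 10 degrees of freedom (18.307) instead of computing a p-value; the function name and the simulated inputs are assumptions. When the test is applied to ARIMA residuals, the degrees of freedom are usually reduced by the number of estimated AR and MA coefficients.

```python
import numpy as np

CHI2_95_DF10 = 18.307   # 5% critical value of the chi-square with 10 d.f.

def ljung_box_q(x, h):
    """Ljung-Box Q statistic using autocorrelations at lags 1..h."""
    x = np.asarray(x, dtype=float)
    n = len(x)
    x = x - x.mean()
    denom = x @ x
    lags = np.arange(1, h + 1)
    rho = np.array([(x[k:] @ x[:-k]) / denom for k in lags])
    return n * (n + 2) * np.sum(rho ** 2 / (n - lags))

# Simulated inputs, for illustration only.
rng = np.random.default_rng(0)
noise = rng.standard_normal(500)        # behaves like white-noise residuals
random_walk = np.cumsum(noise)          # strongly autocorrelated series

q_noise = ljung_box_q(noise, 10)
q_walk = ljung_box_q(random_walk, 10)

# Rejecting "no autocorrelation" at the 5% level means Q exceeds 18.307.
walk_is_autocorrelated = q_walk > CHI2_95_DF10
```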


Assessing model

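Once satisfactory models are obtained, use these models to make forecasts for the first 10 observations in August 2023. Forecasts of the transformed series are converted back to closing prices, where $\hat{P}_t$ is the forecast price and $\hat{g}_t$ and $\hat{r}_t$ are the forecast growth rate and log return (standard back-transformations, assumed here):

o For growth rate series: $\hat{P}_t = \hat{P}_{t-1}\,(1 + \hat{g}_t)$

o For log return series: $\hat{P}_t = \hat{P}_{t-1}\, e^{\hat{r}_t}$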

Forecasting errors

Calculate and compare the forecasting errors RMSE, MAE and MAPE of all three series to select the best model to forecast the closing prices of IMP stock on the other days of August 2023.
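The three error measures can be written in a few lines. This sketch assumes the usual definitions (root mean squared error, mean absolute error, and mean absolute percentage error expressed in percent); the function names and the actual and forecast values are invented for illustration.

```python
import numpy as np

def rmse(actual, forecast):
    """Root mean squared error."""
    a, f = np.asarray(actual, float), np.asarray(forecast, float)
    return float(np.sqrt(np.mean((a - f) ** 2)))

def mae(actual, forecast):
    """Mean absolute error."""
    a, f = np.asarray(actual, float), np.asarray(forecast, float)
    return float(np.mean(np.abs(a - f)))

def mape(actual, forecast):
    """Mean absolute percentage error, in percent (actual must be non-zero)."""
    a, f = np.asarray(actual, float), np.asarray(forecast, float)
    return float(np.mean(np.abs((a - f) / a)) * 100.0)

# Invented actual and forecast prices, for illustration only.
actual = [100.0, 200.0]
forecast = [110.0, 190.0]
errors = {
    "RMSE": rmse(actual, forecast),
    "MAE": mae(actual, forecast),
    "MAPE": mape(actual, forecast),
}
```

Because RMSE and MAE depend on the scale of the series, the comparison across the three models is only meaningful once all forecasts have been converted back to price levels; MAPE is scale-free but is undefined when an actual value is zero.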

Forecasting stock return volatility using ARCH-GARCH model

Step 1: Residual Calculation

Obtain the residuals from the selected ARIMA model by subtracting the predicted values from the observed values.

Calculate the squared residuals from the selected ARIMA model.

Step 2: Test for conditional heteroskedasticity

The heteroscedasticity test aims to discover whether the variance of the data is constant or time-varying. If the variance is homoscedastic, the volatility is calculated using the standard deviation formula. If it is heteroscedastic, the volatility is calculated using the ARCH-GARCH method.

Form an auxiliary regression of the squared residuals on lagged squared residuals. Use the ARCH-LM test to check whether the squared residuals exhibit autoregressive conditional heteroskedasticity.

The null hypothesis is that there is no conditional heteroskedasticity. If the p-value of the test is lower than 5%, conclude that conditional heteroskedasticity is present.
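The auxiliary regression behind the ARCH-LM test can be sketched as follows. The statistic is LM = T·R², where R² comes from regressing the squared residuals on q of their own lags and T is the number of observations in that regression; under the null it is approximately chi-square with q degrees of freedom. The function name, the simulated ARCH(1) residuals and the use of the 5% critical value for one degree of freedom (3.841) are assumptions for illustration.

```python
import numpy as np

CHI2_95_DF1 = 3.841   # 5% critical value of the chi-square with 1 d.f.

def arch_lm(resid, q):
    """Return (LM statistic, R^2) of the ARCH-LM auxiliary regression."""
    e2 = np.asarray(resid, dtype=float) ** 2
    y = e2[q:]                                          # e_t^2
    lagged = [e2[q - k:len(e2) - k] for k in range(1, q + 1)]  # e_{t-k}^2
    X = np.column_stack([np.ones(len(y))] + lagged)
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    ss_res = np.sum((y - X @ beta) ** 2)
    ss_tot = np.sum((y - y.mean()) ** 2)
    r2 = 1.0 - ss_res / ss_tot
    return len(y) * r2, r2

# Simulated ARCH(1) residuals (omega = 0.2, alpha = 0.5), illustration only.
rng = np.random.default_rng(1)
n = 2000
z = rng.standard_normal(n)
eps = np.zeros(n)
for t in range(1, n):
    eps[t] = np.sqrt(0.2 + 0.5 * eps[t - 1] ** 2) * z[t]

lm_stat, r2 = arch_lm(eps, q=1)
has_arch_effect = lm_stat > CHI2_95_DF1
```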

Step 3: ARCH-GARCH Modeling

Examine the PACF of the squared residuals to identify potential ARCH orders. Examine the significance of the estimated ARCH coefficients.

Use the identified orders to estimate the GARCH model parameters by maximum likelihood estimation (MLE).
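To show what MLE works with in this step, the sketch below builds the conditional variance path of a GARCH(1,1) model and evaluates the Gaussian negative log-likelihood for a given parameter vector. It is not a full estimator: in practice this objective would be passed to a numerical optimizer. The GARCH(1,1) order, the Gaussian assumption, the initialization at the sample variance and all names and values are assumptions.

```python
import numpy as np

def garch11_variance(eps, omega, alpha, beta):
    """Conditional variances sigma2_t = omega + alpha*eps_{t-1}^2 + beta*sigma2_{t-1}."""
    eps = np.asarray(eps, dtype=float)
    sigma2 = np.empty(len(eps))
    sigma2[0] = eps.var()          # assumed starting value: sample variance
    for t in range(1, len(eps)):
        sigma2[t] = omega + alpha * eps[t - 1] ** 2 + beta * sigma2[t - 1]
    return sigma2

def neg_loglik(params, eps):
    """Gaussian negative log-likelihood of a GARCH(1,1) model."""
    omega, alpha, beta = params
    eps = np.asarray(eps, dtype=float)
    sigma2 = garch11_variance(eps, omega, alpha, beta)
    return 0.5 * np.sum(np.log(2.0 * np.pi) + np.log(sigma2) + eps ** 2 / sigma2)

# Made-up residuals and parameter values, for illustration only.
resid = np.array([1.0, -2.0, 0.5, 0.3, -1.2])
sigma2_path = garch11_variance(resid, omega=0.1, alpha=0.2, beta=0.7)
objective = neg_loglik((0.1, 0.2, 0.7), resid)
```

Setting beta to zero reduces the recursion to an ARCH(1) model, so the same objective covers both model families considered in this step.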

Step 4: Forecasting

Use the ARCH-GARCH model to forecast future volatility.
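For a GARCH(1,1) model the multi-step variance forecast follows a simple recursion: the one-step forecast uses the last squared residual and the last conditional variance, and each further step is ω + (α + β) times the previous forecast. The sketch below assumes a GARCH(1,1) specification with hypothetical parameter values and shows the forecasts converging to the unconditional variance ω/(1 − α − β).

```python
import numpy as np

def garch11_forecast(omega, alpha, beta, last_eps, last_sigma2, horizon):
    """Variance forecasts for steps 1..horizon from a GARCH(1,1) model."""
    forecasts = np.empty(horizon)
    forecasts[0] = omega + alpha * last_eps ** 2 + beta * last_sigma2
    for h in range(1, horizon):
        # Beyond one step, the expected squared shock equals the forecast variance.
        forecasts[h] = omega + (alpha + beta) * forecasts[h - 1]
    return forecasts

# Hypothetical parameter values, for illustration only.
omega, alpha, beta = 0.1, 0.1, 0.8
variance_path = garch11_forecast(omega, alpha, beta,
                                 last_eps=2.0, last_sigma2=2.0, horizon=200)
volatility_path = np.sqrt(variance_path)          # volatility = std. deviation
unconditional_variance = omega / (1.0 - alpha - beta)
```

The closer α + β is to 1, the more slowly the forecasts revert to the unconditional level, which is why the stationarity condition matters for long-horizon volatility forecasts.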


4.1 Forecasting stock prices using ARIMA model

4.1.1 Testing for stationarity

The study uses the Dickey-Fuller test to check the stationarity of the stock price series. The result is presented in the table below:

Table 2 DF tests for stock price series

The test statistic is smaller in absolute value than the critical value in all three cases of the DF test: with trend, with drift and without drift. Thus, at the 1% significance level, we can conclude that the stock price series is non-stationary. Therefore, it is necessary to transform this series into a stationary form for better forecasting. This study uses the 1st difference of the original series, the growth rate series and the log return series in the ARIMA model applications.

The table below shows the results of the Dickey-Fuller test for the 1st difference, growth rate and log return series.

Table 3 DF tests for the 1st difference series, growth rate series and log return series

1st difference series

Growth rate series

Log return series

Table 4 DF test's coefficient estimation results

Series            Coefficient      With trend     With drift     Without drift
1st difference    Lagged values    -1.22494***    -1.20925***    -1.20894***
Growth rate       Intercept        -0.00293       0.00001
Log return        Lagged values    -1.12500***    -1.11143***    -1.11119***

*,**,***: significant at 10%, 5%, 1%

From Table 3 and Table 4, at the 1% significance level, we can conclude that the 1st difference series, growth rate series and log return series are stationary around value 0.

4.1.2 Autocorrelation and Partial Autocorrelation

Figure 5 ACF and PACF correlograms of the 1st difference series

In the PACF plot, the series has significant partial correlations at orders 1 and 4, the same orders at which the ACF plot shows significant autocorrelation. Therefore, the 1st difference series has the possible lag orders p = 1, 4 and moving average orders q = 1, 4.

Figure 6 ACF and PACF correlograms of growth rate series

In the PACF plot, the series has a significant partial correlation at order 4, the same order at which the ACF plot shows significant autocorrelation. Therefore, the growth rate series has the possible lag order p = 4 and moving average order q = 4.

Figure 7 ACF and PACF correlograms of log return series

In the PACF plot, the series has a significant partial correlation at order 4, while in the ACF plot the significant autocorrelation orders are 1 and 4. Therefore, the log return series has the possible lag order p = 4 and moving average orders q = 1, 4.

4.1.3 ARIMA Model Selection

Table 5 Four best ARIMA models for the stock price series

ARIMA                   (0,1,4)      (1,1,3)       (2,1,2)      (3,1,2)
Mean                    -15.6142     -16.0750      -14.5890     -15.4734
                        0.1436***    -0.6453***    -0.7480**    -0.8402***
                        -0.0335      0.0465        0.6487***    0.4924**
Information criteria
AIC                     6713.66      6716.50       6715.37      6716.40
BIC                     6737.46      6740.08       6739.17      6744.16

*,**,***: significant at 10%, 5%, 1%

It can be seen that the mean coefficient is not significant at the 10% level in any of the four models above. This is because the degree of differencing is d = 1 for the closing price series, and the 1st difference series is stationary around value 0. Comparing the four models, we can first reject ARIMA (1,1,3), as its AR and MA coefficients are not all significant and its AIC and BIC values are almost the highest. Each of the remaining models has an insignificant coefficient, so the information criteria are considered. As the AIC and BIC values of ARIMA (0,1,4) are the smallest, ARIMA (0,1,4) is the most suitable model for the closing price series of IMP.
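The selection rule based on information criteria can be sketched in a few lines. The sketch uses the standard definitions AIC = 2k − 2 ln L and BIC = k ln n − 2 ln L, and the criterion values reported in Table 5, reading the first information-criteria row as AIC (an assumption); the function and variable names are hypothetical.

```python
import math

def info_criteria(loglik, k, n):
    """Return (AIC, BIC) for a model with log-likelihood loglik,
    k estimated parameters and n observations."""
    aic = 2 * k - 2 * loglik
    bic = k * math.log(n) - 2 * loglik
    return aic, bic

# Criterion values copied from Table 5 (first row assumed to be AIC).
table5 = {
    "(0,1,4)": {"AIC": 6713.66, "BIC": 6737.46},
    "(1,1,3)": {"AIC": 6716.50, "BIC": 6740.08},
    "(2,1,2)": {"AIC": 6715.37, "BIC": 6739.17},
    "(3,1,2)": {"AIC": 6716.40, "BIC": 6744.16},
}

# Lower values indicate a better trade-off between fit and complexity.
best_by_aic = min(table5, key=lambda m: table5[m]["AIC"])
best_by_bic = min(table5, key=lambda m: table5[m]["BIC"])
```

BIC penalizes each extra parameter by ln n rather than 2, so with several hundred observations it favours smaller models more strongly than AIC; here both criteria point to the same model.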

Table 6 Four best ARIMA models for the growth rate series

ARIMA                   (0,0,4)      (2,0,2)       (3,0,2)      (4,0,0)
                        -            0.7450***     0.7553***    -0.0901*
                        -            -0.6509*      -0.5419**    -0.0140

*,**,***: significant at 10%, 5%, 1%

Table 7 Four best ARIMA models for the log return series

ARIMA                   (0,0,4)      (2,0,2)       (3,0,1)       (3,0,2)
Mean                    -0.0003      0.0000        -0.0003       -0.0003
                        0.0972**     -0.8538***    -0.6926***    -0.8517***
                        -0.0009      0.7184***     -             0.5955***
Information criteria
AIC                     -1914.31     -1911.94      -1909.57      -1910.48
BIC                     -1890.50     -1888.13      -1885.76      -1882.70

*,**,***: significant at 10%, 5%, 1%

From Table 6 and Table 7, it can be seen that the estimated mean coefficient is approximately zero, as both the growth rate and log return series are stationary around value 0. Comparing the four models in Table 6, we select ARIMA (2,0,2), as its AR and MA coefficients are all significant and its AIC and BIC values are almost the smallest. Therefore, ARIMA (2,0,2) is the most suitable model for the growth rate series of IMP.

For the four models in Table 7, the AR and MA coefficients are not all significant, so the information criteria are considered. As the AIC and BIC values of ARIMA (0,0,4), or MA(4), are the smallest, ARIMA (0,0,4) is the most suitable model for the log return series of IMP.

4.1.4 Diagnostic Checking

ARIMA (0,1,4) for the stock price series

Figure 8 Unit circle of ARIMA (0,1,4) (inverse MA roots)

Figure 9 Residual series of ARIMA (0,1,4)

Title	Stock Price And Return Volatility Prediction Using ARIMA And ARCH-GARCH Models
Author	Nguyen Thi Lan Nhi
Supervisor	Bui Duong Hai
University	National Economics University
Major	Econometrics
Type	Assignment
Year of publication	2023
City	Hanoi

Format
Pages	19
Size	1.77 MB