This thesis focuses on two main issues. The first is modeling a time series by states in which each state is a deterministic probability distribution (the normal distribution), and assessing the suitability of the model based on experimental results. The second is combining Markov chains and fuzzy time series into new models to improve forecast accuracy, and extending the model with higher-order Markov chains to accommodate seasonal data.
MINISTRY OF EDUCATION
AND TRAINING
VIETNAM ACADEMY OF SCIENCE AND TECHNOLOGY
GRADUATE UNIVERSITY OF SCIENCE AND TECHNOLOGY
This work is completed at:
Graduate University of Science and Technology Vietnam Academy of Science and Technology
Supervisor 1: Assoc. Prof. Dr. Doan Van Ban
Supervisor 2: Dr. Nguyen Van Hung
Reviewer 1: ………
………
Reviewer 2: ………
………
Reviewer 3: ………
………
This Dissertation will be officially presented in front of the Doctoral Dissertation Grading Committee, meeting at: Graduate University of Science and Technology, Vietnam Academy of Science and Technology, at ………… hrs, day ……, month ……, year ……
This Dissertation is available at:
1. Library of Graduate University of Science and Technology
2. National Library of Vietnam
LIST OF PUBLISHED WORKS
[1] Dao Xuan Ky and Luc Tri Tuyen. A Markov-fuzzy combination model for stock market forecasting. International Journal of Applied Mathematics and Statistics™, 55(3):109–121, 2016.
[2] … "… of higher order Markov model and fuzzy time series for stock market forecasting." In The 19th Workshop: Selected Issues of Information and Communication Technology, Hà Nội, pages 1–6, 2016.
[3] Đào Xuân Kỳ, Lục Trí Tuyen, Phạm Quốc Vương, and Thạch Thị Ninh. The Markov model–fuzzy time series in stock forecasting. In The 18th Workshop: Selected Issues of Information and Communication Technology, TP. HCM, pages 119–124, 2015.
[4] Lục Trí Tuyen, Nguyễn Văn Hung, Thạch Thị Ninh, Phạm Quốc Vương, Nguyễn Minh Đức, and Đào Xuân Kỳ. A normal-hidden Markov model in forecasting stock index. Journal of Computer Science and Cybernetics, 28(3):206–216, 2012.
[5] … "… time series forecasting." International Journal of Applied Mathematics and Statistics™, 57(3):1–18, 2018.
Introduction
Time series forecasting, in which a predictive variable X changes over time, is always a challenge to scientists, in Vietnam and globally, because it is not easy to find a suitable probability distribution for the predictive variable at each point of time t. Historical data need to be collected and analyzed in order to find a well-fitting distribution. However, a distribution can only fit the data over a particular period of the time series and varies at other points of time. Therefore, using one fixed distribution for the predicted variable is not appropriate for this analysis.
For the above-mentioned reason, building a time series forecasting model requires a connection between historical and future data, in order to set up a model of the dependence between the data observed at time t and those in the past at t−1, t−2, … If this dependence is set up as

X_t = φ_1 X_{t−1} + φ_2 X_{t−2} + ⋯ + φ_p X_{t−p} + ε_t + θ_1 ε_{t−1} + ⋯ + θ_q ε_{t−q},

we obtain the autoregressive integrated moving average (ARIMA) model [15]. This model is applied widely thanks to its well-developed theory and its integration into most current statistical software such as Eviews, SPSS, Matlab, R, etc.
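As a minimal sketch of such a dependence, the AR(1) special case X_t = φ X_{t−1} + ε_t can be simulated and its coefficient recovered by least squares. The Python fragment below is illustrative only: the value φ = 0.7 and the pure-Python estimator are assumptions for the example, not the dissertation's code, which relies on the software listed above.

```python
import random

def simulate_ar1(phi, n, sigma=1.0, seed=0):
    """Simulate X_t = phi * X_{t-1} + eps_t with Gaussian noise eps_t."""
    rng = random.Random(seed)
    x = [0.0]
    for _ in range(n - 1):
        x.append(phi * x[-1] + rng.gauss(0.0, sigma))
    return x

def fit_ar1(x):
    """Least-squares estimate of phi: minimizes sum_t (x_t - phi * x_{t-1})^2."""
    num = sum(x[t] * x[t - 1] for t in range(1, len(x)))
    den = sum(x[t - 1] ** 2 for t in range(1, len(x)))
    return num / den

series = simulate_ar1(phi=0.7, n=3000)
phi_hat = fit_ar1(series)
```

With a few thousand observations, the estimate phi_hat lands close to the true coefficient, which is the sense in which the model "connects" present and past data.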
However, many real time series do not change linearly, so models such as ARIMA do not fit them. R. Parrelli pointed out in [28] that there are non-linear dependencies in the variance of economic and financial time series. The generalized autoregressive conditional heteroskedasticity (GARCH) model [25,28] is the most popular non-linear time series forecasting model. Its limitation lies in the assumption that the data follow one fixed distribution (usually the normal distribution), while actual data often show a significantly skewed distribution [39] (whereas the normal distribution is symmetric). Another time series forecasting approach is the Artificial Neural Network (ANN), which was developed more recently. ANN models are not based on a deterministic distribution of the data; instead, they function like the human brain, trying to find rules and paths by training on data, experimental testing, and result summarizing. The ANN model is usually used for data classification [23]. More recently, a statistical machine learning theory called the Support Vector Machine (SVM), serving forecasting and classification, caught the attention of scientists [36,11,31]. SVM is applied widely in many areas such as function approximation, regression analysis, and forecasting [11,31]. The biggest limitation of SVM is that, with large training sets, it requires enormous computation as well as the complexity of the associated optimization problem.
To address the limitations and promote the strengths of existing models, a new research trend was introduced, called combined analysis (CA), i.e., a combination of different methods to increase forecast accuracy. Numerous studies have been conducted in this direction, and many combined models have been published [43,5,6]. Some methods use the Markov chain (MC) as well as the hidden Markov model (HMM). Rafiul Hassan [19] developed a combined model by matching an HMM with an ANN and a GA to forecast next-day stock prices. This model aims to identify patterns in historical data similar to the current one; the ANN and GA models are then used to interpolate the neighboring values of the identified data patterns. Yang [41] combined the HMM model with a synchronous clustering technique to increase the accuracy of the forecasting model. The weighted Markov model was used by Peng [27] in predicting and analyzing disease transmission rates in Jiangsu, China. These combined models proved to bring practical and meaningful results and to increase prediction accuracy compared to traditional ones [27,41,19]. The above-mentioned models, despite having improved significantly in prediction accuracy, still face difficulties with fuzzy data (data containing uncertain elements).
To deal with fuzzy data, a new research direction was introduced recently, called Fuzzy Time Series (FTS). The first results of this theory worth mentioning are those of Song and Chissom [34]. Subsequent studies focused on improving the fuzzy time series model and finding better forecasting procedures. Jilani and Nan combined a heuristic model with the fuzzy time series model to improve accuracy [24]. Chen and Hwang expanded the fuzzy time series model into a binary model [14], and Hwang and Yu then developed it into an N-order model to forecast stock indicators [21]. In a recent paper [35], BaiQuing Sun expanded the fuzzy time series model into a multi-order model to forecast future stock prices. Qisen Cai [10] combined the fuzzy time series model with ant colony optimization and regression to obtain better outcomes. In Vietnam, the fuzzy time series model has recently been applied in a number of specific areas, for example the study of Nguyen Duy Hieu and colleagues [2] in semantic analysis. Additionally, the studies of Nguyen Cong Dieu [3,4] combined the fuzzy time series model with techniques for adjusting certain model parameters or exploiting specific characteristics of the data, aiming at better forecast accuracy. The study of Nguyen Cat Ho [1] used hedge algebras in the fuzzy time series model, which showed higher forecast accuracy compared to several existing models.
Up to now, in spite of many new models combining existing ones to improve forecast accuracy, these models tend to be complex while their accuracy does not improve accordingly. Therefore, another direction arises: simplifying the model while maintaining forecast accuracy.
The objective of this dissertation focuses on two key issues. Firstly, to model time series by states in which each state is a deterministic probability distribution (the normal distribution), and to evaluate the suitability of the model based on experimental results. Secondly, to combine the Markov chain and fuzzy time series into new models to improve forecast accuracy, and in addition to extend the model with higher-order Markov chains to accommodate seasonal data.
The dissertation consists of 3 chapters. Chapter I presents an overall study of the Markov chain, hidden Markov, and fuzzy time series models. Chapter II presents the modelling of a time series into states in which 1) each state is a normal distribution with mean μ_i and variance σ_i², i = 1, 2, …, m, where m is the number of states; and 2) the states over time follow a Markov chain. The model was then tested on the VN-Index indicator to evaluate its forecasting efficiency. The last part of the chapter analyzes the limitations and mismatches between forecasting models and deterministic probability distributions, as a motivation for the combined model proposed in Chapter III. Chapter III presents combined Markov chain and fuzzy time series models for time series forecasting. This chapter also presents the extended higher-order Markov chain with two concepts: the conventional higher-order Markov chain (CMC) and the improved higher-order Markov chain (IMC). These models were then programmed in the R language and tested with data sets corresponding exactly to those of the comparison models.
Chapter 1 - Overview & Proposal
1.1 Markov chain
1.1.1 Definitions
Consider an economic or physical system S with m possible states, with state space denoted I = {1, 2, …, m}. The system S evolves randomly in discrete time t = 0, 1, 2, …, n, …; let C_n be the random variable corresponding to the state of the system S at time n (C_n ∈ I).
Definition 1.1.1. A sequence of random variables (C_n, n ∈ ℕ) is a Markov chain if and only if, for all n and all states i_0, i_1, …, i_{n+1} ∈ I,

Pr(C_{n+1} = i_{n+1} | C_n = i_n, …, C_0 = i_0) = Pr(C_{n+1} = i_{n+1} | C_n = i_n)   (1.1.1)

(whenever this conditional probability makes sense).
Definition 1.1.2. A Markov chain is called homogeneous if and only if the probability in (1.1.1) does not depend on n, and non-homogeneous otherwise.
For the time being, we consider the homogeneous case, in which we write γ_ij = Pr(C_{n+1} = j | C_n = i), and define the transition matrix Γ = (γ_ij), i, j ∈ I. To fully define the evolution of a Markov chain, it is also necessary to fix an initial distribution for the state C_0, for example a vector δ = (δ_1, …, δ_m) with δ_i = Pr(C_0 = i).
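The evolution just described (initial state drawn from δ, then transitions according to the rows of Γ) can be sketched as follows; the two-state transition matrix and initial distribution below are hypothetical example values, and the sketch is in Python rather than the dissertation's R.

```python
import random

def simulate_chain(gamma, delta, n, seed=0):
    """Simulate C_0, ..., C_{n-1}: C_0 ~ delta, then C_{t+1} ~ row gamma[C_t]."""
    rng = random.Random(seed)

    def draw(probs):
        # Sample an index from a discrete distribution by inverting the CDF.
        u, acc = rng.random(), 0.0
        for state, p in enumerate(probs):
            acc += p
            if u < acc:
                return state
        return len(probs) - 1

    chain = [draw(delta)]
    for _ in range(n - 1):
        chain.append(draw(gamma[chain[-1]]))
    return chain

gamma = [[0.9, 0.1],   # hypothetical transition matrix: rows sum to 1
         [0.2, 0.8]]
delta = [0.5, 0.5]     # hypothetical initial distribution
chain = simulate_chain(gamma, delta, 1000)
```

Each row of Γ is a conditional distribution over the next state, which is why the sampler only ever looks at the row indexed by the current state.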
1.1.2 Markov chain classification
Take i ∈ I and let d(i) be the greatest common divisor of the set of integers n such that γ_ii^(n) > 0.

Definition 1.2.4. If d(i) > 1, state i is called periodic with period d(i). If d(i) = 1, state i is called aperiodic.

It is easy to see that if γ_ii > 0 then i is aperiodic. However, the converse is not always true.

Definition 1.2.5. A Markov chain all of whose states are aperiodic is called an aperiodic Markov chain.
Definition 1.2.6. A state i is said to reach state j (written i → j) if there exists an integer n ≥ 0 such that γ_ij^(n) > 0; i ↛ j means that i cannot reach j.

Definition 1.2.7. States i and j are said to communicate (written i ↔ j) if i → j and j → i, or if i = j.
Definition 1.2.11. A state i ∈ I of the Markov chain (C_t) is called recurrent if there exist a state j ∈ I and an integer n such that γ_ji^(n) > 0; otherwise, i is called transient.
1.1.3 Markov matrix estimation
Consider a Markov chain (C_t), t = 1, 2, …, and suppose that a chain C_n of n consecutive states is observed. Define the transition count n_ij as the number of times that state i is immediately followed by state j in the chain C_n. The likelihood then takes the form

L(Γ) = ∏_{i,j} γ_ij^{n_ij},

and maximizing it yields the estimate

γ̂_ij = n_ij / Σ_k n_ik.
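The counting estimate above can be sketched directly; the short state sequence used here is a made-up example, and the sketch is in Python rather than the dissertation's R.

```python
def estimate_gamma(chain, m):
    """MLE of the transition matrix: gamma_hat[i][j] = n_ij / sum_k n_ik,
    where n_ij counts transitions from state i to state j."""
    counts = [[0] * m for _ in range(m)]
    for i, j in zip(chain, chain[1:]):   # consecutive pairs (C_t, C_{t+1})
        counts[i][j] += 1
    gamma_hat = []
    for row in counts:
        total = sum(row)
        # If a state was never visited, fall back to a uniform row.
        gamma_hat.append([c / total if total else 1.0 / m for c in row])
    return gamma_hat

chain = [0, 0, 1, 1, 1, 0, 1, 0, 0]      # hypothetical observed states
gamma_hat = estimate_gamma(chain, 2)
# here each state is followed by 0 and by 1 twice each, so both rows are (0.5, 0.5)
```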
1.2 Hidden Markov Model
An HMM includes two basic components: a chain of observations X_t, t = 1, …, T, and a chain of hidden states C_t ∈ {1, 2, …, m}, t = 1, …, T, from which the observations are generated. Indeed, the HMM is a special case of the dependent mixture model [16], with the states C_t acting as the mixture components.
1.2.1 Definitions and Symbols
Denote by X^(t) and C^(t) the histories of the observations and states from time 1 to time t. The simplest HMM model can then be summarized as follows:

Pr(C_t | C^(t−1)) = Pr(C_t | C_{t−1}), t = 2, 3, …,
Pr(X_t | X^(t−1), C^(t)) = Pr(X_t | C_t), t = 1, 2, …

From now on, the m distributions p_i(x) = Pr(X_t = x | C_t = i) are called the state-dependent distributions of the model.
1.2.2 Likelihood and maximum likelihood estimation
For discrete observations X_t, define u_t(i) = Pr(C_t = i) for i = 1, 2, …, m, and let P(x) = diag(p_1(x), …, p_m(x)). Then the likelihood is

L_T = δ P(x_1) Γ P(x_2) Γ P(x_3) ⋯ Γ P(x_T) 1′.   (1.2.4)

When δ is the stationary distribution of the chain (δΓ = δ), we then have

L_T = δ ∏_{t=1}^{T} (Γ P(x_t)) 1′.   (1.2.5)
It is easy to calculate L_T by a recursive algorithm. To find the parameter set that maximizes L_T, we can use two methods:

Direct maximization of the likelihood L_T (MLE): first, from equation (1.2.5), we compute the logarithm of L_T, which is advantageous for finding the maximum based on the forward probabilities α_t. For t = 0, 1, …, T we define the vector φ_t = α_t / w_t, where w_t = α_t 1′.
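The scaled recursion (normalize α_t by w_t at each step and accumulate log w_t) can be sketched as follows. The two-state normal-HMM parameters and the three observations are hypothetical, and the sketch is a Python analogue, not the dissertation's R code.

```python
import math

def hmm_loglik(x, delta, gamma, pdf):
    """Scaled forward recursion for log L_T = log(delta P(x_1) Gamma P(x_2) ... 1')."""
    m = len(delta)
    # alpha_1 = delta * P(x_1)
    alpha = [delta[i] * pdf(x[0], i) for i in range(m)]
    w = sum(alpha)
    loglik = math.log(w)
    alpha = [a / w for a in alpha]          # normalize to avoid underflow
    for obs in x[1:]:
        # alpha_t = alpha_{t-1} Gamma P(x_t)
        alpha = [sum(alpha[i] * gamma[i][j] for i in range(m)) * pdf(obs, j)
                 for j in range(m)]
        w = sum(alpha)
        loglik += math.log(w)               # accumulate log of the scale factor
        alpha = [a / w for a in alpha]
    return loglik

# hypothetical two-state normal-HMM: means 0 and 3, unit variances
def norm_pdf(x, i, mu=(0.0, 3.0)):
    return math.exp(-0.5 * (x - mu[i]) ** 2) / math.sqrt(2 * math.pi)

ll = hmm_loglik([0.1, 2.9, 3.2], [0.5, 0.5],
                [[0.9, 0.1], [0.2, 0.8]], norm_pdf)
```

Without the per-step normalization, the raw product underflows for long series, which is exactly why the logarithm of L_T is computed through the scaled forward probabilities.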
The objective of the Viterbi algorithm is to find the best state sequence i_1, i_2, …, i_T corresponding to the observation sequence x_1, x_2, …, x_T, i.e., the sequence that maximizes the function L_T.
Note that, as h → ∞, the rows of Γ^h converge to the stationary distribution of the Markov chain.
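This convergence of Γ^h can be checked numerically by raising a (hypothetical) two-state transition matrix to a high power; for a two-state chain the stationary distribution is (γ_21, γ_12) / (γ_12 + γ_21). The sketch below is Python, not the dissertation's R.

```python
def mat_mult(a, b):
    """Multiply two square matrices."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def mat_power(gamma, h):
    """Compute Gamma^h by repeated multiplication."""
    result = gamma
    for _ in range(h - 1):
        result = mat_mult(result, gamma)
    return result

gamma = [[0.9, 0.1],
         [0.2, 0.8]]
g100 = mat_power(gamma, 100)
# both rows of Gamma^100 approach the stationary distribution
# (gamma_21, gamma_12) / (gamma_12 + gamma_21) = (2/3, 1/3)
```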
1.3 Fuzzy time series
Chapter 2 HIDDEN MARKOV MODEL IN TIME SERIES FORECASTING
2.1 Hidden Markov model in time series forecasting
According to Chapter 1, the HMM model consists of two basic components: the chain of observations X_t, t = 1, …, T, and the mixture components C_t ∈ {1, 2, …, m}, t = 1, …, T.
To illustrate the HMM model in time series forecasting, let us consider the above time series, denoted X_t, t = 1, …, T. A real problem for investors is to predict how long the stock index will take to go from a bottom to a peak. From the observation that a stock index at a new peak will not stay at that value (or fluctuate slightly around it) forever but will go down after some time, and similarly for oscillations from bottom to peak, we can specify X_max as the longest time the stock's value takes to go from a bottom to a peak. Then 0 ≤ X_t ≤ X_max (see Figure 2.1.1). Investors want to associate states with X_t, such as "short wait", "fairly short wait", "long wait", "very long wait", but do not know how to define them. To solve this problem, we consider each of these states a Poisson distribution with mean (also variance) λ_i, i = 1, 2, 3, 4, "hidden" in the chain X_t. Assuming that these states follow a Markov chain, we have a hidden Markov model for the time series forecasting problem.
Figure 2.1.1: Definition of the time series forecasting problem
2.1.1 HMM model with Poisson distribution
To apply the HMM model to time series forecasting, the dissertation illustrates both parameter estimation methods described in Section 1.2.2 of Chapter 1. For MLE estimation, the dissertation implements the HMM model in R with Poisson state-dependent distributions. The Poisson distribution has one parameter λ > 0, which is both its mean and its variance. Parameter estimation by the MLE method is as follows:
Algorithm 2.1 Maximum likelihood estimation
Input: x, m, lambda0, gamma0
Output: m, lambda0, gamma0, BIC, AIC, mllk
1: procedure POIS.HMM.MLE(x, m, lambda0, gamma0)
2: parvect0 ← pois.HMM.pn2pw(m, lambda0, gamma0) { Transform the model parameters to unconstrained working parameters }
3: mod ← nlm(pois.HMM.mllk, parvect0, x = x, m = m) { Minimize the negative log-likelihood }
4: pn ← pois.HMM.pw2pn(m, mod$estimate) { Transform the working parameters back to model parameters }
5: mllk ← mod$minimum { The minimized negative log-likelihood }
6: np ← length(parvect0) { The number of model parameters }
7: AIC ← 2 ∗ (mllk + np) { The AIC criterion }
8: n ← sum(!is.na(x)) { The number of observations }
9: BIC ← 2 ∗ mllk + np ∗ log(n) { The BIC criterion }
10: return (lambda, gamma, mllk, AIC, BIC)
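Steps 7 and 9 of Algorithm 2.1 can be sketched in Python. The mllk value and observation count below are hypothetical, and the parameter count assumes m(m−1) free transition probabilities plus m Poisson means; the dissertation's own implementation is in R.

```python
import math

def aic_bic(mllk, n_par, n_obs):
    """AIC and BIC as computed in steps 7 and 9 of Algorithm 2.1,
    where mllk is the minimized NEGATIVE log-likelihood."""
    aic = 2 * (mllk + n_par)
    bic = 2 * mllk + n_par * math.log(n_obs)
    return aic, bic

# hypothetical numbers: a 5-state Poisson-HMM has m*(m-1) = 20 free
# transition probabilities plus m = 5 Poisson means, i.e. 25 parameters
aic, bic = aic_bic(mllk=154.6, n_par=25, n_obs=100)
```

Both criteria penalize the number of parameters; BIC's penalty grows with log(n), so for moderate sample sizes it favors smaller m than AIC does.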
2.1.2 HMM model with normal distribution
In the model with normal distributions, the parameters of the Markov chain stay the same, but the parameters of the mixture distributions are the means and the variances, while the initial distribution of the model is taken to be the stationary distribution of the Markov chain.
Calculations of the forward probabilities (FWP) and backward probabilities (BWP) are performed by the function norm.HMM.lalphabeta, which returns lalpha and lbeta, the logarithms of FWP and BWP respectively.
Algorithm 2.3 Calculate the forward and backward probabilities of L_T
Input: x, m, mu, sigma, gamma, delta
Output: la = lalpha, lb = lbeta
1: procedure NORM.HMM.LALPHABETA(x, m, mu, sigma, gamma, delta)
2: if (is.null(delta)) then delta ← solve(t(diag(m) − gamma + 1), rep(1, m)) { If no initial distribution is given, use the stationary distribution of the Markov chain }
3: Calculate the forward probabilities (FWP) in (1.2.6) for lalpha
4: Calculate the backward probabilities (BWP) in (1.2.7) for lbeta
5: return list(la = lalpha, lb = lbeta)
Here, according to the EM algorithm in Section 1.2.2 of Chapter 1, we can immediately perform the parameter estimation with norm.HMM.EM.
Algorithm 2.4 EM algorithm for Normal-HMM
Input: x, m, mu(), sigma(), gamma(), delta(), maxiter, tol
Output: mu, sigma, gamma, delta, mllk, AIC, BIC
1: procedure NORM.HMM.EM(x, m, mu, sigma, gamma, delta, maxiter, tol)
2: mu ← mu(); sigma ← sigma(); delta ← delta() { Assign the parameters their initial values }
3: for iter in 1 : maxiter do
4: fb ← norm.HMM.lalphabeta(x, m, mu, sigma, gamma, delta = delta) { Calculate FWP and BWP }
5: llk ← the log-likelihood value
8: Calculate gamma[j, k]
9: Calculate mu[j]
10: Calculate sigma[j]
11: Calculate delta
12: crit ← sum(abs(mu[j] − mu()[j])) + sum(abs(gamma[j, k] − gamma()[j, k])) + sum(abs(delta[j] − delta()[j])) + sum(abs(sigma[j] − sigma()[j])) { The convergence criterion }
13: if crit < tol then
14: AIC ← −2 ∗ (llk − np) { The AIC criterion }
15: BIC ← −2 ∗ llk + np ∗ log(n) { The BIC criterion }
16: return (mu, sigma, gamma, delta, mllk, AIC, BIC)
17: else { If not converged } mu0 ← mu; sigma0 ← sigma; gamma0 ← gamma; delta0 ← delta { Reassign the new starting parameters }
18: If not converged after maxiter iterations, report "maxiter"

2.2 Experimental results for HMM with Poisson distribution
2.2.1 Parameter estimation
Table 2.2.1 Estimated parameters of the Poisson-HMM model for time.b.to.t with states m = 2, 3, 4, 5
m = 2: λ̂ = (11.46267, 40.90969); δ̂ = (0.6914086, 0.3085914); Γ̂ = (0.8, 0.2; …)
m = 3: λ̂ = (5.78732, 21.75877, 57.17104); δ̂ = (0.3587816, 0.5121152, 0.1291032); Γ̂ = (0.46, 0.47, 0.07; 0.33, 0.47, 0.02; …); mllk = 171.1243
m = 4: λ̂ = (5.339722, 16.943339, 27.711948, 58.394102); δ̂ = (0.3189824, 0.3159413, 0.2301279, 0.1349484); Γ̂ = (0.4, 0.46, 0.07, 0.07; 0.53, 0.29, 0.18, 0; 0, 0, 0.51, 0.49; 0.19, 0.56, 0.25, 0); mllk = 159.898
m = 5: λ̂ = (5.226109, 15.679316, 25.435562, 38.459987, 67.708874); δ̂ = (0.31513881, 0.28158191, 0.22224329, 0.10376304, 0.07727294); mllk = 154.6275
Table 2.2.2 Mean and variance compared with the sample