This thesis focuses on two main issues. The first is modeling a time series by states in which each state is a deterministic probability distribution (the normal distribution), and assessing the suitability of the model based on experimental results. The second is combining Markov chains and fuzzy time series into new models to improve forecast accuracy, and extending the model with higher-order Markov chains to accommodate seasonal data.
MINISTRY OF EDUCATION
AND TRAINING
VIETNAM ACADEMY OF SCIENCE AND TECHNOLOGY
GRADUATE UNIVERSITY OF SCIENCE AND TECHNOLOGY
This work is completed at:
Graduate University of Science and Technology Vietnam Academy of Science and Technology
Supervisor 1: Assoc. Prof. Dr. Doan Van Ban
Supervisor 2: Dr. Nguyen Van Hung
Reviewer 1: ………
………
Reviewer 2: ………
………
Reviewer 3: ………
………
This Dissertation will be officially presented in front of the Doctoral Dissertation Grading Committee, meeting at: Graduate University of Science and Technology, Vietnam Academy of Science and Technology, at ………… hrs, day ……, month ……, year ……
This Dissertation is available at:
1. Library of Graduate University of Science and Technology
2. National Library of Vietnam
LIST OF PUBLISHED WORKS
[1] Dao Xuan Ky and Luc Tri Tuyen. A Markov-fuzzy combination model for stock market forecasting. International Journal of Applied Mathematics and Statistics™, 55(3):109–121, 2016.
[2] … "… of higher order Markov model and fuzzy time series for stock market forecasting." In The 19th Workshop: Selected Issues of Information and Communication Technology, Hà Nội, pages 1–6, 2016.
[3] Đào Xuân Kỳ, Lục Trí Tuyen, Phạm Quốc Vương, and Thạch Thị Ninh. The Markov model–fuzzy time series in stock forecasting. In The 18th Workshop: Selected Issues of Information and Communication Technology, TP. HCM, pages 119–124, 2015.
[4] Lục Trí Tuyen, Nguyễn Văn Hung, Thạch Thị Ninh, Phạm Quốc Vương, Nguyễn Minh Đức, and Đào Xuân Kỳ. A normal-hidden Markov model in forecasting stock index. Journal of Computer Science and Cybernetics, 28(3):206–216, 2012.
[5] … "… time series forecasting." International Journal of Applied Mathematics and Statistics™, 57(3):1–18, 2018.
Introduction
Time series forecasting, in which a predictive variable X changes over time, is always a challenge to scientists, in Vietnam and globally, because it is not easy to find a suitable probability distribution for the predictive variable at each point of time t. Historical data need to be collected and analyzed in order to find a well-fitting distribution. However, a distribution can only fit the data over a particular period of the time series and varies at other points of time. Therefore, using one fixed distribution for the predicted variable is not appropriate for this analysis.
For the above-mentioned reason, building a time series forecasting model requires a connection between historical and future data, in order to set up a model of the dependence between the data observed at time t and those in the past at t−1, t−2, … If this dependence is set up as

X_t = φ_1 X_{t−1} + φ_2 X_{t−2} + ⋯ + φ_p X_{t−p} + ε_t + θ_1 ε_{t−1} + ⋯ + θ_q ε_{t−q},

we obtain the autoregressive integrated moving average (ARIMA) model [15]. This model is applied widely thanks to its well-developed theory and its integration into most current statistical software such as Eviews, SPSS, Matlab, R, etc.
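As a minimal sketch of such a dependence, the AR(1) special case X_t = φ X_{t−1} + ε_t can be simulated and its coefficient recovered by least squares. The Python fragment below is illustrative only: the value φ = 0.7 and the pure-Python estimator are assumptions for the example, not the dissertation's code, which relies on the software listed above.

```python
import random

def simulate_ar1(phi, n, sigma=1.0, seed=0):
    """Simulate X_t = phi * X_{t-1} + eps_t with Gaussian noise eps_t."""
    rng = random.Random(seed)
    x = [0.0]
    for _ in range(n - 1):
        x.append(phi * x[-1] + rng.gauss(0.0, sigma))
    return x

def fit_ar1(x):
    """Least-squares estimate of phi: minimizes sum_t (x_t - phi * x_{t-1})^2."""
    num = sum(x[t] * x[t - 1] for t in range(1, len(x)))
    den = sum(x[t - 1] ** 2 for t in range(1, len(x)))
    return num / den

series = simulate_ar1(phi=0.7, n=3000)
phi_hat = fit_ar1(series)
```

With a few thousand observations, the estimate phi_hat lands close to the true coefficient, which is the sense in which the model "connects" present and past data.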
However, many real time series do not change linearly, so models such as ARIMA do not fit them. R. Parrelli pointed out in [28] that there are non-linear dependencies in the variance of economic and financial time series. The generalized autoregressive conditional heteroskedasticity (GARCH) model [25,28] is the most popular non-linear time series forecasting model. Its limitation lies in the assumption that the data follow one fixed distribution (usually the normal distribution), while actual data often show a significantly skewed distribution [39] (whereas the normal distribution is symmetric). Another time series forecasting approach is the Artificial Neural Network (ANN), which was developed more recently. ANN models are not based on a deterministic distribution of the data; instead, they function like the human brain, trying to find rules and paths by training on data, experimental testing, and result summarizing. The ANN model is usually used for data classification [23]. More recently, a statistical machine learning theory called the Support Vector Machine (SVM), serving forecasting and classification, caught the attention of scientists [36,11,31]. SVM is applied widely in many areas such as function approximation, regression analysis, and forecasting [11,31]. The biggest limitation of SVM is that, with large training sets, it requires enormous computation as well as the complexity of the associated optimization problem.
To address the limitations and promote the strengths of existing models, a new research trend was introduced, called combined analysis (CA), i.e., a combination of different methods to increase forecast accuracy. Numerous studies have been conducted in this direction, and many combined models have been published [43,5,6]. Some methods use the Markov chain (MC) as well as the hidden Markov model (HMM). Rafiul Hassan [19] developed a combined model by matching an HMM with an ANN and a GA to forecast next-day stock prices. This model aims to identify patterns in historical data similar to the current one; the ANN and GA models are then used to interpolate the neighboring values of the identified data patterns. Yang [41] combined the HMM model with a synchronous clustering technique to increase the accuracy of the forecasting model. The weighted Markov model was used by Peng [27] in predicting and analyzing disease transmission rates in Jiangsu, China. These combined models proved to bring practical and meaningful results and to increase prediction accuracy compared to traditional ones [27,41,19]. The above-mentioned models, despite having improved significantly in prediction accuracy, still face difficulties with fuzzy data (data containing uncertain elements).
To deal with fuzzy data, a new research direction was introduced recently, called Fuzzy Time Series (FTS). The first results of this theory worth mentioning are those of Song and Chissom [34]. Subsequent studies focused on improving the fuzzy time series model and finding better forecasting procedures. Jilani and Nan combined a heuristic model with the fuzzy time series model to improve accuracy [24]. Chen and Hwang expanded the fuzzy time series model into a binary model [14], and Hwang and Yu then developed it into an N-order model to forecast stock indicators [21]. In a recent paper [35], BaiQuing Sun expanded the fuzzy time series model into a multi-order model to forecast future stock prices. Qisen Cai [10] combined the fuzzy time series model with ant colony optimization and regression to obtain better outcomes. In Vietnam, the fuzzy time series model has recently been applied in a number of specific areas, for example the study of Nguyen Duy Hieu and colleagues [2] in semantic analysis. Additionally, the studies of Nguyen Cong Dieu [3,4] combined the fuzzy time series model with techniques for adjusting certain model parameters or exploiting specific characteristics of the data, aiming at better forecast accuracy. The study of Nguyen Cat Ho [1] used hedge algebras in the fuzzy time series model, which showed higher forecast accuracy compared to several existing models.
Up to now, in spite of many new models combining existing ones to improve forecast accuracy, these models tend to be complex while their accuracy does not improve accordingly. Therefore, another direction arises: simplifying the model while maintaining forecast accuracy.
The objective of this dissertation focuses on two key issues. Firstly, to model time series by states in which each state is a deterministic probability distribution (the normal distribution), and to evaluate the suitability of the model based on experimental results. Secondly, to combine the Markov chain and fuzzy time series into new models to improve forecast accuracy, and in addition to extend the model with higher-order Markov chains to accommodate seasonal data.
The dissertation consists of 3 chapters. Chapter I presents an overall study of the Markov chain, hidden Markov, and fuzzy time series models. Chapter II presents the modelling of a time series into states in which 1) each state is a normal distribution with mean μ_i and variance σ_i², i = 1, 2, …, m, where m is the number of states; and 2) the states over time follow a Markov chain. The model was then tested on the VN-Index indicator to evaluate its forecasting efficiency. The last part of the chapter analyzes the limitations and mismatches between forecasting models and deterministic probability distributions, as a motivation for the combined model proposed in Chapter III. Chapter III presents combined Markov chain and fuzzy time series models for time series forecasting. This chapter also presents the extended higher-order Markov chain with two concepts: the conventional higher-order Markov chain (CMC) and the improved higher-order Markov chain (IMC). These models were then programmed in the R language and tested with data sets corresponding exactly to those of the comparison models.
Chapter 1 - Overview & Proposal
1.1 Markov chain
1.1.1 Definitions
Consider an economic or physical system S with m possible states, with state space denoted I = {1, 2, …, m}. The system S evolves randomly in discrete time t = 0, 1, 2, …, n, …; let C_n be the random variable corresponding to the state of the system S at time n (C_n ∈ I).
Definition 1.1.1. A sequence of random variables (C_n, n ∈ ℕ) is a Markov chain if and only if, for all n and all states i_0, i_1, …, i_{n+1} ∈ I,

Pr(C_{n+1} = i_{n+1} | C_n = i_n, …, C_0 = i_0) = Pr(C_{n+1} = i_{n+1} | C_n = i_n)   (1.1.1)

(whenever this conditional probability makes sense).
Definition 1.1.2. A Markov chain is called homogeneous if and only if the probability in (1.1.1) does not depend on n, and non-homogeneous otherwise.
For the time being, we consider the homogeneous case, in which we write γ_ij = Pr(C_{n+1} = j | C_n = i), and define the transition matrix Γ = (γ_ij), i, j ∈ I. To fully define the evolution of a Markov chain, it is also necessary to fix an initial distribution for the state C_0, for example a vector δ = (δ_1, …, δ_m) with δ_i = Pr(C_0 = i).
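The evolution just described (initial state drawn from δ, then transitions according to the rows of Γ) can be sketched as follows; the two-state transition matrix and initial distribution below are hypothetical example values, and the sketch is in Python rather than the dissertation's R.

```python
import random

def simulate_chain(gamma, delta, n, seed=0):
    """Simulate C_0, ..., C_{n-1}: C_0 ~ delta, then C_{t+1} ~ row gamma[C_t]."""
    rng = random.Random(seed)

    def draw(probs):
        # Sample an index from a discrete distribution by inverting the CDF.
        u, acc = rng.random(), 0.0
        for state, p in enumerate(probs):
            acc += p
            if u < acc:
                return state
        return len(probs) - 1

    chain = [draw(delta)]
    for _ in range(n - 1):
        chain.append(draw(gamma[chain[-1]]))
    return chain

gamma = [[0.9, 0.1],   # hypothetical transition matrix: rows sum to 1
         [0.2, 0.8]]
delta = [0.5, 0.5]     # hypothetical initial distribution
chain = simulate_chain(gamma, delta, 1000)
```

Each row of Γ is a conditional distribution over the next state, which is why the sampler only ever looks at the row indexed by the current state.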
1.1.2 Markov chain classification
Take i ∈ I and let d(i) be the greatest common divisor of the set of integers n such that γ_ii^(n) > 0.

Definition 1.2.4. If d(i) > 1, state i is called periodic with period d(i). If d(i) = 1, state i is called aperiodic.

It is easy to see that if γ_ii > 0 then i is aperiodic. However, the converse is not always true.

Definition 1.2.5. A Markov chain all of whose states are aperiodic is called an aperiodic Markov chain.
Definition 1.2.6. A state i is said to reach state j (written i → j) if there exists an integer n ≥ 0 such that γ_ij^(n) > 0; i ↛ j means that i cannot reach j.

Definition 1.2.7. States i and j are said to communicate (written i ↔ j) if i → j and j → i, or if i = j.
Definition 1.2.11. A state i ∈ I of the Markov chain (C_t) is called recurrent if there exist a state j ∈ I and an integer n such that γ_ji^(n) > 0; otherwise, i is called transient.
1.1.3 Markov matrix estimation
Consider a Markov chain (C_t), t = 1, 2, …, and suppose that a chain C_n of n consecutive states is observed. Define the transition count n_ij as the number of times that state i is immediately followed by state j in the chain C_n. The likelihood then takes the form

L(Γ) = ∏_{i,j} γ_ij^{n_ij},

and maximizing it yields the estimate

γ̂_ij = n_ij / Σ_k n_ik.
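The counting estimate above can be sketched directly; the short state sequence used here is a made-up example, and the sketch is in Python rather than the dissertation's R.

```python
def estimate_gamma(chain, m):
    """MLE of the transition matrix: gamma_hat[i][j] = n_ij / sum_k n_ik,
    where n_ij counts transitions from state i to state j."""
    counts = [[0] * m for _ in range(m)]
    for i, j in zip(chain, chain[1:]):   # consecutive pairs (C_t, C_{t+1})
        counts[i][j] += 1
    gamma_hat = []
    for row in counts:
        total = sum(row)
        # If a state was never visited, fall back to a uniform row.
        gamma_hat.append([c / total if total else 1.0 / m for c in row])
    return gamma_hat

chain = [0, 0, 1, 1, 1, 0, 1, 0, 0]      # hypothetical observed states
gamma_hat = estimate_gamma(chain, 2)
# here each state is followed by 0 and by 1 twice each, so both rows are (0.5, 0.5)
```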
1.2 Hidden Markov Model
An HMM includes two basic components: a chain of observations X_t, t = 1, …, T, and a chain of hidden states C_t ∈ {1, 2, …, m}, t = 1, …, T, from which the observations are generated. Indeed, the HMM is a special case of the dependent mixture model [16], with the states C_t acting as the mixture components.
1.2.1 Definitions and Symbols
Denote by X^(t) and C^(t) the histories of the observations and states from time 1 to time t. The simplest HMM model can then be summarized as follows:

Pr(C_t | C^(t−1)) = Pr(C_t | C_{t−1}), t = 2, 3, …,
Pr(X_t | X^(t−1), C^(t)) = Pr(X_t | C_t), t = 1, 2, …

From now on, the m distributions p_i(x) = Pr(X_t = x | C_t = i) are called the state-dependent distributions of the model.
1.2.2 Likelihood and maximum likelihood estimation
For discrete observations X_t, define u_t(i) = Pr(C_t = i) for i = 1, 2, …, m, and let P(x) = diag(p_1(x), …, p_m(x)). Then the likelihood is

L_T = δ P(x_1) Γ P(x_2) Γ P(x_3) ⋯ Γ P(x_T) 1′.   (1.2.4)

When δ is the stationary distribution of the chain (δΓ = δ), we then have

L_T = δ ∏_{t=1}^{T} (Γ P(x_t)) 1′.   (1.2.5)
It is easy to calculate L_T by a recursive algorithm. To find the parameter set that maximizes L_T, we can use two methods:

Direct maximization of the likelihood L_T (MLE): first, from equation (1.2.5), we compute the logarithm of L_T, which is advantageous for finding the maximum based on the forward probabilities α_t. For t = 0, 1, …, T we define the vector φ_t = α_t / w_t, where w_t = α_t 1′.
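The scaled recursion (normalize α_t by w_t at each step and accumulate log w_t) can be sketched as follows. The two-state normal-HMM parameters and the three observations are hypothetical, and the sketch is a Python analogue, not the dissertation's R code.

```python
import math

def hmm_loglik(x, delta, gamma, pdf):
    """Scaled forward recursion for log L_T = log(delta P(x_1) Gamma P(x_2) ... 1')."""
    m = len(delta)
    # alpha_1 = delta * P(x_1)
    alpha = [delta[i] * pdf(x[0], i) for i in range(m)]
    w = sum(alpha)
    loglik = math.log(w)
    alpha = [a / w for a in alpha]          # normalize to avoid underflow
    for obs in x[1:]:
        # alpha_t = alpha_{t-1} Gamma P(x_t)
        alpha = [sum(alpha[i] * gamma[i][j] for i in range(m)) * pdf(obs, j)
                 for j in range(m)]
        w = sum(alpha)
        loglik += math.log(w)               # accumulate log of the scale factor
        alpha = [a / w for a in alpha]
    return loglik

# hypothetical two-state normal-HMM: means 0 and 3, unit variances
def norm_pdf(x, i, mu=(0.0, 3.0)):
    return math.exp(-0.5 * (x - mu[i]) ** 2) / math.sqrt(2 * math.pi)

ll = hmm_loglik([0.1, 2.9, 3.2], [0.5, 0.5],
                [[0.9, 0.1], [0.2, 0.8]], norm_pdf)
```

Without the per-step normalization, the raw product underflows for long series, which is exactly why the logarithm of L_T is computed through the scaled forward probabilities.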
The objective of the Viterbi algorithm is to find the best state sequence i_1, i_2, …, i_T corresponding to the observation sequence x_1, x_2, …, x_T, i.e., the sequence that maximizes the function L_T.
Note that, as h → ∞, the rows of Γ^h converge to the stationary distribution of the Markov chain.
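This convergence of Γ^h can be checked numerically by raising a (hypothetical) two-state transition matrix to a high power; for a two-state chain the stationary distribution is (γ_21, γ_12) / (γ_12 + γ_21). The sketch below is Python, not the dissertation's R.

```python
def mat_mult(a, b):
    """Multiply two square matrices."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def mat_power(gamma, h):
    """Compute Gamma^h by repeated multiplication."""
    result = gamma
    for _ in range(h - 1):
        result = mat_mult(result, gamma)
    return result

gamma = [[0.9, 0.1],
         [0.2, 0.8]]
g100 = mat_power(gamma, 100)
# both rows of Gamma^100 approach the stationary distribution
# (gamma_21, gamma_12) / (gamma_12 + gamma_21) = (2/3, 1/3)
```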
1.3 Fuzzy time series
Chapter 2 HIDDEN MARKOV MODEL IN TIME SERIES FORECASTING
2.1 Hidden Markov model in time series forecasting
According to Chapter 1, the HMM model consists of two basic components: the chain of observations X_t, t = 1, …, T, and the mixture components C_t ∈ {1, 2, …, m}, t = 1, …, T.
To illustrate the HMM model in time series forecasting, let us consider the above time series, denoted X_t, t = 1, …, T. A real problem for investors is to predict how long the stock index will take to go from a bottom to a peak. From the observation that a stock index at a new peak will not stay at that value (or fluctuate slightly around it) forever but will go down after some time, and similarly for oscillations from bottom to peak, we can specify X_max as the longest time the stock's value takes to go from a bottom to a peak. Then 0 ≤ X_t ≤ X_max (see Figure 2.1.1). Investors want to associate states with X_t, such as "short wait", "fairly short wait", "long wait", "very long wait", but do not know how to define them. To solve this problem, we consider each of these states a Poisson distribution with mean (also variance) λ_i, i = 1, 2, 3, 4, "hidden" in the chain X_t. Assuming that these states follow a Markov chain, we have a hidden Markov model for the time series forecasting problem.
Figure 2.1.1: Definition of the time series forecasting problem
2.1.1 HMM model with Poisson distribution
To apply the HMM model to time series forecasting, the dissertation illustrates both parameter estimation methods described in Section 1.2.2 of Chapter 1. For MLE estimation, the dissertation implements the HMM model in R with Poisson state-dependent distributions. The Poisson distribution has one parameter λ > 0, which is both its mean and its variance. Parameter estimation by the MLE method is as follows:
Algorithm 2.1 Maximum likelihood estimation
Input: x, m, lambda0, gamma0
Output: m, lambda0, gamma0, BIC, AIC, mllk
1: procedure POIS.HMM.MLE(x, m, lambda0, gamma0)
2: parvect0 ← pois.HMM.pn2pw(m, lambda0, gamma0) { Transform the model parameters to unconstrained working parameters }
3: mod ← nlm(pois.HMM.mllk, parvect0, x = x, m = m) { Minimize the negative log-likelihood }
4: pn ← pois.HMM.pw2pn(m, mod$estimate) { Transform the working parameters back to model parameters }
5: mllk ← mod$minimum { The minimized negative log-likelihood }
6: np ← length(parvect0) { The number of model parameters }
7: AIC ← 2 ∗ (mllk + np) { The AIC criterion }
8: n ← sum(!is.na(x)) { The number of observations }
9: BIC ← 2 ∗ mllk + np ∗ log(n) { The BIC criterion }
10: return (lambda, gamma, mllk, AIC, BIC)
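Steps 7 and 9 of Algorithm 2.1 can be sketched in Python. The mllk value and observation count below are hypothetical, and the parameter count assumes m(m−1) free transition probabilities plus m Poisson means; the dissertation's own implementation is in R.

```python
import math

def aic_bic(mllk, n_par, n_obs):
    """AIC and BIC as computed in steps 7 and 9 of Algorithm 2.1,
    where mllk is the minimized NEGATIVE log-likelihood."""
    aic = 2 * (mllk + n_par)
    bic = 2 * mllk + n_par * math.log(n_obs)
    return aic, bic

# hypothetical numbers: a 5-state Poisson-HMM has m*(m-1) = 20 free
# transition probabilities plus m = 5 Poisson means, i.e. 25 parameters
aic, bic = aic_bic(mllk=154.6, n_par=25, n_obs=100)
```

Both criteria penalize the number of parameters; BIC's penalty grows with log(n), so for moderate sample sizes it favors smaller m than AIC does.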
2.1.2 HMM model with normal distribution
In the model with normal distributions, the parameters of the Markov chain stay the same, but the parameters of the mixture distributions are the means and the variances, while the initial distribution of the model is taken to be the stationary distribution of the Markov chain.
Calculations of the forward probabilities (FWP) and backward probabilities (BWP) are performed by the function norm.HMM.lalphabeta, which returns lalpha and lbeta, the logarithms of FWP and BWP respectively.
Algorithm 2.3 Calculate the forward and backward probabilities of L_T
Input: x, m, mu, sigma, gamma, delta
Output: la = lalpha, lb = lbeta
1: procedure NORM.HMM.LALPHABETA(x, m, mu, sigma, gamma, delta)
2: if (is.null(delta)) then delta ← solve(t(diag(m) − gamma + 1), rep(1, m)) { If no initial distribution is given, use the stationary distribution of the Markov chain }
3: Calculate the forward probabilities (FWP) in (1.2.6) for lalpha
4: Calculate the backward probabilities (BWP) in (1.2.7) for lbeta
5: return list(la = lalpha, lb = lbeta)
Here, according to the EM algorithm in Section 1.2.2 of Chapter 1, we can immediately perform the parameter estimation with norm.HMM.EM.
Algorithm 2.4 EM algorithm for Normal-HMM
Input: x, m, mu(), sigma(), gamma(), delta(), maxiter, tol
Output: mu, sigma, gamma, delta, mllk, AIC, BIC
1: procedure NORM.HMM.EM(x, m, mu, sigma, gamma, delta, maxiter, tol)
2: mu ← mu(); sigma ← sigma(); delta ← delta() { Assign the parameters their initial values }
3: for iter in 1 : maxiter do
4: fb ← norm.HMM.lalphabeta(x, m, mu, sigma, gamma, delta = delta) { Calculate FWP and BWP }
5: llk ← the log-likelihood value
8: Calculate gamma[j, k]
9: Calculate mu[j]
10: Calculate sigma[j]
11: Calculate delta
12: crit ← sum(abs(mu[j] − mu()[j])) + sum(abs(gamma[j, k] − gamma()[j, k])) + sum(abs(delta[j] − delta()[j])) + sum(abs(sigma[j] − sigma()[j])) { The convergence criterion }
13: if crit < tol then
14: AIC ← −2 ∗ (llk − np) { The AIC criterion }
15: BIC ← −2 ∗ llk + np ∗ log(n) { The BIC criterion }
16: return (mu, sigma, gamma, delta, mllk, AIC, BIC)
17: else { If not converged } mu0 ← mu; sigma0 ← sigma; gamma0 ← gamma; delta0 ← delta { Reassign the new starting parameters }
18: If not converged after maxiter iterations, report "maxiter"

2.2 Experimental results for HMM with Poisson distribution
2.2.1 Parameter estimation
Table 2.2.1 Estimated parameters of the Poisson-HMM model for time.b.to.t with states m = 2, 3, 4, 5
m = 2: λ̂ = (11.46267, 40.90969); δ̂ = (0.6914086, 0.3085914); Γ̂ = (0.8, 0.2; …)
m = 3: λ̂ = (5.78732, 21.75877, 57.17104); δ̂ = (0.3587816, 0.5121152, 0.1291032); Γ̂ = (0.46, 0.47, 0.07; 0.33, 0.47, 0.02; …); mllk = 171.1243
m = 4: λ̂ = (5.339722, 16.943339, 27.711948, 58.394102); δ̂ = (0.3189824, 0.3159413, 0.2301279, 0.1349484); Γ̂ = (0.4, 0.46, 0.07, 0.07; 0.53, 0.29, 0.18, 0; 0, 0, 0.51, 0.49; 0.19, 0.56, 0.25, 0); mllk = 159.898
m = 5: λ̂ = (5.226109, 15.679316, 25.435562, 38.459987, 67.708874); δ̂ = (0.31513881, 0.28158191, 0.22224329, 0.10376304, 0.07727294); mllk = 154.6275
Table 2.2.2 Mean and variance compared with the sample