
DSpace at VNU: HIFCF: An effective hybrid model between picture fuzzy clustering and intuitionistic fuzzy recommender systems for medical diagnosis


HIFCF: An effective hybrid model between picture fuzzy clustering and intuitionistic fuzzy recommender systems for medical diagnosis

VNU University of Science, Vietnam National University, Hanoi, Viet Nam

Article info

Article history:

Available online 31 December 2014

Keywords:

Fuzzy sets

Hybrid Intuitionistic Fuzzy Collaborative Filtering

Intuitionistic fuzzy recommender systems

Medical diagnosis

Picture fuzzy clustering

Abstract

The health care support system is a special type of recommender system that plays an important role in medical sciences nowadays. This kind of system often provides a medical diagnosis function based on the historic clinical symptoms of patients to give a list of possible diseases accompanied by their membership values. The most likely disease from that list is then determined by clinicians' experience, expressed through a specific defuzzification method. An important issue in the health care support system is increasing the accuracy of the medical diagnosis function, which involves the cooperation of fuzzy systems and recommender systems in the sense that uncertain behaviors of symptoms and the clinicians' experience are represented by fuzzy memberships whilst the determination of the possible diseases is conducted by the prediction capability of recommender systems. Intuitionistic fuzzy recommender systems (IFRS) are such a combination, which results in better prediction accuracy than the relevant methods constructed on either traditional fuzzy sets or recommender systems only. Based upon the observation that the calculation of similarity in IFRS could be enhanced by integrating the information of the possibility of patients belonging to clusters specified by a fuzzy clustering method, in this paper we propose a novel hybrid model between picture fuzzy clustering and intuitionistic fuzzy recommender systems for medical diagnosis, called HIFCF (Hybrid Intuitionistic Fuzzy Collaborative Filtering). Experimental results reveal that HIFCF obtains better accuracy than IFCF and the standalone methods of intuitionistic fuzzy sets such as De, Biswas & Roy, Szmidt & Kacprzyk, Samuel & Balamurugan, and of recommender systems, e.g. Davis et al. and Hassan & Syed. The significance and impact of the new method contribute not only to the theoretical aspects of recommender systems but also to their applicable roles in health care support systems.

© 2014 Elsevier Ltd. All rights reserved.

1. Introduction

In recent years, the health care support system, or clinical decision support system, has emerged as an important tool in medical sciences to assist clinicians in decision making, especially in medical diagnosis, which specifies the diseases that could be found from a list of measured symptoms of a patient as well as the most likely disease among them. Physicians, nurses and other healthcare professionals use the health care support system to prepare a diagnosis and to review the diagnosis as a means of improving the final result. According to Basu, Fevrier-Thomas, and Sartipi (2011), such systems are computer applications that support and assist clinicians in improved decision-making by providing evidence-based knowledge with respect to patient data. This type of computer-based system consists of three components: a language system, a knowledge system and a problem processing system. It is able to handle complex problems, applying domain-specific expertise to assess the consequences of executing its recommendations. There are two main types of health care support system (Rouse, 2014). The first one uses a knowledge base, applies rules to patient data using an inference engine and displays the results to the end user. Systems without a knowledge base, on the other hand, rely on machine learning to analyze clinical data (Fig. 1). Machine learning methods examine patients' medical history in conjunction with relevant clinical research and are able to predict potential events ranging from drug interactions to disease symptoms. In the medical diagnosis process, characteristics of an individual patient are matched to a computerized clinical knowledge base, and patient-specific assessments and recommendations are then presented to the clinician or the patient for a decision (Rajalakshmi, Mohan, & Babu, 2011).

Expert Systems with Applications. Contents lists available at ScienceDirect. Journal homepage: www.elsevier.com/locate/eswa
http://dx.doi.org/10.1016/j.eswa.2014.12.042
0957-4174/© 2014 Elsevier Ltd. All rights reserved.
Corresponding author at: 334 Nguyen Trai, Thanh Xuan, Hanoi, Viet Nam. Tel.: +84 904 171 284.
E-mail addresses: nguyenthothongtt89@gmail.com (N.T. Thong), sonlh@vnu.edu.vn, chinhson2002@gmail.com (L.H. Son).

An important issue in the health care support system is increasing the accuracy of the medical diagnosis. Previous research concentrated on improving the machine learning methods/knowledge systems that appear in Phase 2 of the medical diagnosis process in Fig. 1, for example:

• A hybrid evolutionary algorithm between genetic programming and genetic algorithms (Tan, Yu, Heng, & Lee, 2003)
• Genetic algorithm (Anbarasi, Anupriya, & Iyengar, 2010)
• The combination of a type-2 fuzzy logic with genetic algorithm
• An evolutionary artificial neural network approach based on the Pareto differential evolution algorithm augmented with local search (Abbass, 2002)
• Complex modular neural network (Kala, Janghel, Tiwari, & …)
• Bayesian networks (Gevaert, De Smet, Timmerman, Moreau, & …)
• C4.5 Rule-PANE, which combines an artificial neural network ensemble with rule induction (Zhou & Jiang, 2003)
• Support vector machines (Kampouraki, Vassis, Belsis, & …)

However, these methods often fail to achieve high prediction accuracy on real medical diagnosis datasets. This is because the relations between the patients and the symptoms and between the symptoms and the diseases (Fig. 1) are often vague, imprecise and uncertain. For instance, doctors may be faced with patients who are likely to have personal problems and/or mental disorders, so that crucial signs and symptoms are missing, incomplete or vague even though the patients' medical histories and physical examinations are available for the diagnosis. Even if the patients' information is clearly provided, how to evaluate the given symptoms/diseases accurately is another challenge that requires well-trained, highly experienced physicians. This evidence raises the need to use fuzzy sets or their extensions to model and assist the techniques that improve the accuracy of diagnosis. The definition of a fuzzy set is stated below.

Definition 1. A Fuzzy Set (FS) (Zadeh, 1965) in a non-empty set $X$ is a function

\[ \mu : X \to [0, 1], \quad x \mapsto \mu(x), \tag{1} \]

where $\mu(x)$ is the membership degree of each element $x \in X$. A fuzzy set can alternatively be defined as

\[ A = \{ \langle x, \mu(x) \rangle \mid x \in X \}. \tag{2} \]

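An extension of FS that is widely applied to the medical prognosis problem is the Intuitionistic Fuzzy Set (IFS), which is defined as follows.

Definition 2. An Intuitionistic Fuzzy Set (IFS) (Atanassov, 1986) in a non-empty set $X$ is

\[ \tilde{A} = \{ \langle x, \mu_{\tilde{A}}(x), \gamma_{\tilde{A}}(x) \rangle \mid x \in X \}, \tag{3} \]

where $\mu_{\tilde{A}}(x)$ and $\gamma_{\tilde{A}}(x)$ are the membership and non-membership degrees of each element $x \in X$, respectively, satisfying

\[ \mu_{\tilde{A}}(x), \gamma_{\tilde{A}}(x) \in [0, 1], \quad \forall x \in X, \tag{4} \]

\[ 0 \le \mu_{\tilde{A}}(x) + \gamma_{\tilde{A}}(x) \le 1, \quad \forall x \in X. \tag{5} \]

The intuitionistic fuzzy index of an element, which shows its non-determinacy, is denoted as

\[ \pi_{\tilde{A}}(x) = 1 - \left( \mu_{\tilde{A}}(x) + \gamma_{\tilde{A}}(x) \right), \quad \forall x \in X. \tag{6} \]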
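As a minimal sketch of Definition 2, the snippet below stores one intuitionistic fuzzy value, checks constraints (4) and (5), and evaluates the index of Eq. (6). The class name `IFV` and its field names are illustrative assumptions, not notation from the paper; the sample value is the (Ram, Temperature) entry of Table 1.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IFV:
    """Intuitionistic fuzzy value (hypothetical helper): membership mu, non-membership gamma."""
    mu: float
    gamma: float

    def __post_init__(self):
        # Constraint (4): both degrees lie in [0, 1].
        if not (0.0 <= self.mu <= 1.0 and 0.0 <= self.gamma <= 1.0):
            raise ValueError("mu and gamma must lie in [0, 1]")
        # Constraint (5): the degrees sum to at most 1 (small tolerance for rounding).
        if self.mu + self.gamma > 1.0 + 1e-12:
            raise ValueError("mu + gamma must not exceed 1")

    @property
    def pi(self) -> float:
        """Intuitionistic fuzzy index, Eq. (6): pi = 1 - (mu + gamma)."""
        return 1.0 - (self.mu + self.gamma)


ram_temperature = IFV(0.8, 0.1)  # entry (Ram, Temperature) of Table 1
print(round(ram_temperature.pi, 2))
```

When `pi` is zero for every element, the pair carries no more information than an ordinary fuzzy membership, which is the reduction to FS mentioned in the text.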

When $\pi_{\tilde{A}}(x) = 0$ for all $x \in X$, IFS returns to the FS of Zadeh. Various studies utilizing FS and IFS for the medical diagnosis process can be found in the literature. De, Biswas, and Roy (2001) extended Sanchez's approach with the notion of intuitionistic fuzzy set theory for medical diagnosis. The information of symptoms – patients and symptoms – diseases is fuzzified by intuitionistic fuzzy memberships, and the possibilities of acquired diseases are calculated based on those membership values and intuitionistic fuzzy relations. Szmidt and Kacprzyk (2001), Szmidt […] of intuitionistic fuzzy set to express new aspects of imperfect information between the sets of symptoms and diagnoses and defined a new similarity measure between intuitionistic fuzzy sets for the […] and intuitionistic fuzzy sets to encounter uncertainty in medical pattern recognition. The experimental results showed that both fuzzy sets and intuitionistic fuzzy sets have powerful capabilities to cope with the uncertainty in medical pattern recognition problems, but intuitionistic fuzzy sets, especially with the measures of Hausdorff and Mitchell, yield a better detection rate as a result of more accurate modeling, at the price of a higher computational cost. Own (2009) studied the switching relation between type-2 fuzzy sets and intuitionistic fuzzy sets to deal with the […] for medical diagnosis without being concerned about how to calculate the best membership function for each fuzzy datum. Neog and Sut […] extended Sanchez's approach for medical diagnosis using the notion of fuzzy soft complement. Xiao et al. (2012) proposed the concept of D–S generalized fuzzy soft sets by combining the Dempster–Shafer theory of evidence and generalized fuzzy soft sets. A new method of evaluation based on D–S generalized fuzzy soft sets […] intuitionistic fuzzy soft set and a new scoring function to compare two intuitionistic fuzzy numbers for multi-criteria medical diagnosis. […] Sanchez's approach for medical diagnosis through the arithmetic mean of an interval valued fuzzy matrix, which is a simpler technique than that of using intuitionistic fuzzy sets. Ahn, Han, Oh, […] degrees based on the relation between symptoms and diseases (three types of headache), and utilized the interval-valued intuitionistic fuzzy weighted arithmetic average operator to aggregate fuzzy information from the symptoms. A measure based on distance between interval-valued intuitionistic fuzzy sets for medical […] proposed a new technique named intuitionistic fuzzy max–min composition to study Sanchez's approach for medical […] intuitionistic fuzzy sets and fuzzy multisets of Yager. Intuitionistic fuzzy multisets are characterized by the count membership and the count non-membership functions, and when the sum of these functions is equal to one, intuitionistic fuzzy multisets return to intuitionistic fuzzy sets. Intuitionistic fuzzy multisets are used to model the symptoms at various timestamps. Other recent works can be found in Ahn (2014), Bora, Bora, Neog, and Sut (2014), Bourgani, Stylios, Manis, and Georgopoulos (2014), Das and Kar (2014), Muthuvijayalakshmi, Kumar, and Venkatesan (2014), Nguyen, Khosravi, Creighton, and Nahavandi (2014), Sanz, Galar, Jurio, Brugos, Pagola, et al. (2014), Shanmugasundaram and […]

[Fig. 1: flow diagram with the panels Clinical Data (Patients-Symptoms), Knowledge System / Machine Learning, Consequent Rules and Results (Patients-Diseases), labelled Phase 1 to Phase 4.]

The limitations of the relevant research utilizing FS and IFS for the medical diagnosis process are as follows. Firstly, these works calculate the relation between the patients and the diseases solely from the relations between the patients and the symptoms and between the symptoms and the diseases. In practical cases where either of these relations is missing, those works cannot be applied. This happens in reality since clinicians sometimes cannot accurately express the values of membership and non-membership degrees of symptoms to diseases or vice versa. Secondly, the information of previous diagnoses of patients cannot be utilized. That is to say, a patient may already have some records in the patients-diseases database. Nevertheless, the next records of this patient are calculated solely on the basis of the relations between the patients and the symptoms and between the symptoms and the diseases. Historic diagnoses of patients are not taken into account, so the accuracy of diagnosis may suffer as a result. Thirdly, the determination of the most likely disease depends on the defuzzification method. For instance, De et al. (2001) used a hybrid function of membership and non-membership values for the defuzzification, Samuel and Balamurugan (2012) relied on the reduction matrix from WPD, and Szmidt and Kacprzyk (2001), Szmidt and Kacprzyk (2003), Szmidt and Kacprzyk (2004), Khatibi […] distance functions. Determination that is independent of the defuzzification method should be investigated for stable performance of the algorithm.

Due to these reasons, a combination of fuzzy sets and a machine learning method is a good choice to eliminate the disadvantages of the relevant works using FS and IFS. Recommender Systems (RS) […] method, which can give users information about the predictive "rating" or "preference" that they would give to an item, thus helping them to choose the appropriate item among numerous possibilities. This kind of expert system is now common in numerous application fields such as personalized systems for books, documents, images, movies, music, shopping and TV programs. Recommender Systems have been applied to medical […] proposed CARE, a Collaborative Assessment and Recommendation Engine, which relies only on a patient's medical history in order to predict future disease risks and combines collaborative filtering methods with clustering to predict each patient's greatest disease risks based on their own medical history and that of similar patients. An iterative version of CARE, called ICARE, which incorporates ensemble concepts for improved performance, was also introduced. These systems required no specialized information and provided predictions for medical conditions of all kinds in a single […] framework expressed in Eq. (7) that assessed patient risk both by matching new cases to historical records and by matching patient demographics to adverse outcomes, so that it could achieve a higher predictive accuracy for both sudden cardiac death and […] approaches such as logistic regression and support vector machines.

\[ R(a, i) = \bar{r}_a + \frac{\sum_{b \in U \setminus \{a\}} SIM(a, b) \times (r_{b,i} - \bar{r}_b)}{\sum_{b \in U \setminus \{a\}} |SIM(a, b)|}, \tag{7} \]

where $a, b$ are patients and $i$ is the considered disease. The similarity between two patients, $SIM(a, b)$, is calculated by the Pearson coefficient from the demographic information of patients. $R(a, i)$ and $r_{b,i}$ are the possibilities of acquiring disease $i$ of patients $a$ and $b$, respectively. $\bar{r}_a$ and $\bar{r}_b$ are the average possibilities of acquiring all diseases of patients $a$ and $b$, respectively. More works on the applications of RS to medical diagnosis can be found in Duan, Street, and Xu (2011), Meisamshabanpoor and Mahdavi (2012), […] and Chau (2010), Son, Cuong, Lanzi, and Thong (2012), Son, Lanzi, Cuong, and Hung (2012), Son, Cuong, and Long (2013), Son, Linh, and Long (2014), Thong and Son (2014), Son (2014a), Son (2014b, […]
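The prediction rule of Eq. (7) can be sketched as follows. This is only an illustration under stated assumptions: the dictionaries `ratings` (known disease possibilities per patient) and `demographics` (numeric demographic vectors) are hypothetical data structures, averages are taken over the diseases recorded for each patient, and neighbours without a value for the target disease are skipped.

```python
from math import sqrt


def pearson(x, y):
    """Pearson coefficient of two equal-length numeric vectors (0 if one is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy) if sx > 0 and sy > 0 else 0.0


def predict(a, i, ratings, demographics):
    """Eq. (7): mean of patient a plus similarity-weighted deviations of other patients."""
    def mean(p):
        return sum(ratings[p].values()) / len(ratings[p])

    num = den = 0.0
    for b in ratings:
        if b == a or i not in ratings[b]:
            continue  # sum runs over the other patients that have a value for disease i
        sim = pearson(demographics[a], demographics[b])
        num += sim * (ratings[b][i] - mean(b))
        den += abs(sim)
    return mean(a) + (num / den if den > 0 else 0.0)


# Hypothetical toy data, not taken from the paper.
ratings = {"a": {"d1": 0.2, "d2": 0.4},
           "b": {"d1": 0.5, "d3": 0.9},
           "c": {"d1": 0.4, "d3": 0.2}}
demographics = {"a": [1, 2, 3], "b": [2, 4, 6], "c": [3, 2, 1]}
print(round(predict("a", "d3", ratings, demographics), 3))
```

Falling back to the patient's own mean when no neighbour is available is a design choice of this sketch, not something stated in the source.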

The standalone RS methods such as the works of Davis et al. […] one. Moreover, they work only if the historic diagnoses of patients are provided for the prediction, and their diagnostic accuracy depends on the defuzzification method. Therefore, a cooperation of fuzzy systems and recommender systems is regarded as an effective strategy to overcome the drawbacks of the research using FS and IFS only, in the sense that uncertain behaviors of symptoms and the clinicians' experience are represented by fuzzy memberships whilst the determination of the possible diseases is conducted by the prediction capability of recommender systems. Intuitionistic fuzzy recommender systems (IFRS) (Son & Thong, 2015) are such a combination, which results in better prediction accuracy than the relevant standalone methods constructed on either traditional fuzzy sets or recommender systems only. That work was the first effort to initiate fuzzy-based recommender systems for the health care support system. In that research, new definitions of single-criterion IFRS (SC-IFRS) and multi-criteria IFRS (MC-IFRS), which extend the definition of RS by taking into account a feature of a user and a characteristic of an item expressed by intuitionistic linguistic labels, were proposed. Next, new definitions of the intuitionistic fuzzy matrix (IFM), which is a representation of SC-IFRS and MC-IFRS in matrix format, and of the intuitionistic fuzzy composition matrix (IFCM) of two IFMs with the intersection/union operation were presented and used to design new similarity degrees of IFMs such as the intuitionistic fuzzy similarity matrix (IFSM) and the intuitionistic fuzzy similarity degree (IFSD). From these similarity functions, a novel method called Intuitionistic Fuzzy Collaborative Filtering (IFCF) was presented for the medical diagnosis problem. IFCF has been validated on benchmark medical diagnosis datasets from the UCI Machine Learning Repository in terms of diagnostic accuracy and showed better performance than the standalone methods of FS and RS.

The motivation and contributions of this paper are as follows. IFCF used IFSD to calculate the similarity between two patients. This measure is the generalization of the hard user-based, item-based and rating-based similarity degrees in RS (Ricci […] integration with the information of the possibility of patients belonging to clusters specified by a fuzzy clustering method. That is to say, if we know which group the new patient belongs to, then the similarities of this patient with others in the group should be given a high influence in the calculation of IFSD. Therefore, in this paper we propose a novel hybrid model between picture fuzzy clustering and intuitionistic fuzzy recommender systems for medical diagnosis, called Hybrid Intuitionistic Fuzzy Collaborative Filtering (HIFCF). HIFCF makes use of a recent picture fuzzy clustering method, namely the Distributed Picture Fuzzy Clustering Method (DPFCM) (Son, 2015), to classify the patients into groups according to the relations information of patients. Then, the possibility of a patient belonging to a certain cluster is used to calculate the similarity degrees between patients. These are supplemented into IFSD to give the final similarity between patients. The new hybrid algorithm HIFCF is validated experimentally on a benchmark UCI Machine Learning Repository dataset and compared with the relevant methods in terms of accuracy. The rest of the paper is organized as follows. Section 2 presents the new algorithm HIFCF. Section 3 validates the proposed model by experiments. Section 4 gives the conclusions and future works of the paper.

2. The proposed method

In this section, we firstly recall some principal terms and algorithms of the Intuitionistic fuzzy recommender system (IFRS) (Son & […] Filtering (IFCF) algorithm in Section 2.1. Secondly, we recall one of the best recently published picture fuzzy clustering methods, namely the Distributed Picture Fuzzy Clustering Method (DPFCM) (Son, 2015), which is used to classify the patients into groups according to their relations information, in Section 2.2. Thirdly, the main contribution of the paper, a novel hybrid model between DPFCM and IFRS for medical diagnosis called Hybrid Intuitionistic Fuzzy Collaborative Filtering (HIFCF), is presented in Section 2.3. Lastly, some theoretical analyses of the new algorithm are made in Section 2.4.

2.1. Intuitionistic fuzzy recommender system

Firstly, the definition of medical diagnosis in the light of intuitionistic fuzzy sets is described as follows.

Definition 3 (Medical diagnosis (Son & Thong, 2015)). Given three lists $P = \{P_1, \ldots, P_n\}$, $S = \{S_1, \ldots, S_m\}$ and $D = \{D_1, \ldots, D_k\}$, where $P$ is a list of patients, $S$ a list of symptoms and $D$ a list of diseases, respectively. The three values $n, m, k \in \mathbb{N}^+$ are the numbers of patients, symptoms and diseases, respectively. The relation between the patients and the symptoms is characterized by the set $R_{PS} = \{R_{PS}(P_i, S_j) \mid \forall i = 1, \ldots, n;\ \forall j = 1, \ldots, m\}$, where $R_{PS}(P_i, S_j)$ […] represented by either a numeric value or an (intuitionistic) fuzzy value depending on the domain of the problem. Analogously, the relation between the symptoms and the diseases is expressed as $R_{SD} = \{R_{SD}(S_i, D_j) \mid \forall i = 1, \ldots, m;\ \forall j = 1, \ldots, k\}$, where $R_{SD}(S_i, D_j)$ reflects the possibility that symptom $S_i$ would lead to disease $D_j$. The medical diagnosis problem aims to determine the relation between the patients and the diseases described by the set $R_{PD} = \{R_{PD}(P_i, D_j) \mid \forall i = 1, \ldots, n;\ \forall j = 1, \ldots, k\}$, where $R_{PD}(P_i, D_j)$ is either 0 or 1, showing whether patient $P_i$ acquires disease $D_j$ or not. The medical diagnosis problem can be briefly represented by the implication $\{R_{PS}, R_{SD}\} \to R_{PD}$.

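Definition 4 (Single-criterion intuitionistic fuzzy recommender systems – SC-IFRS (Son & Thong, 2015)). The utility function $R$ is a mapping specified on $(X, Y)$ as follows:

\[
\begin{aligned}
R : X \times Y &\to D, \\
\begin{pmatrix} (\mu_{1X}(x), \gamma_{1X}(x)), \\ (\mu_{2X}(x), \gamma_{2X}(x)), \\ \vdots \\ (\mu_{sX}(x), \gamma_{sX}(x)) \end{pmatrix}
\times
\begin{pmatrix} (\mu_{1Y}(y), \gamma_{1Y}(y)), \\ (\mu_{2Y}(y), \gamma_{2Y}(y)), \\ \vdots \\ (\mu_{sY}(y), \gamma_{sY}(y)) \end{pmatrix}
&\mapsto
\begin{pmatrix} (\mu_{1D}(D), \gamma_{1D}(D)), \\ (\mu_{2D}(D), \gamma_{2D}(D)), \\ \vdots \\ (\mu_{sD}(D), \gamma_{sD}(D)) \end{pmatrix},
\end{aligned}
\tag{8}
\]

where $\mu_{iX}(x) \in [0, 1]$ (resp. $\gamma_{iX}(x) \in [0, 1]$), $\forall i \in \{1, \ldots, s\}$, is the membership (resp. non-membership) value of the patient to the $i$th linguistic label of feature $X$; $\mu_{jY}(y) \in [0, 1]$ (resp. $\gamma_{jY}(y) \in [0, 1]$), $\forall j \in \{1, \ldots, s\}$, is the membership (resp. non-membership) value of the symptom to the $j$th linguistic label of characteristic $Y$. Finally, $\mu_{lD}(D) \in [0, 1]$ (resp. $\gamma_{lD}(D) \in [0, 1]$), $\forall l \in \{1, \ldots, s\}$, is the membership (resp. non-membership) value of disease $D$ to the $l$th linguistic label. SC-IFRS provides two basic functions:

(a) Prediction: determine the values of $(\mu_{lD}(D), \gamma_{lD}(D))$, $\forall l \in \{1, \ldots, s\}$;

(b) Recommendation: choose $i^* \in [1, s]$ satisfying $i^* = \arg\max_{i = 1, \ldots, s} \{ \mu_{iD}(D) + \mu_{iD}(D) (1 - \mu_{iD}(D) - \gamma_{iD}(D)) \}$.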

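Definition 5 (Multi-criteria intuitionistic fuzzy recommender systems – MC-IFRS (Son & Thong, 2015)). The utility function $R$ is a mapping specified on $(X, Y)$ below:

\[
\begin{aligned}
R : X \times Y &\to D_1 \times \cdots \times D_k, \\
\begin{pmatrix} (\mu_{1X}(x), \gamma_{1X}(x)), \\ (\mu_{2X}(x), \gamma_{2X}(x)), \\ \vdots \\ (\mu_{sX}(x), \gamma_{sX}(x)) \end{pmatrix}
\times
\begin{pmatrix} (\mu_{1Y}(y), \gamma_{1Y}(y)), \\ (\mu_{2Y}(y), \gamma_{2Y}(y)), \\ \vdots \\ (\mu_{sY}(y), \gamma_{sY}(y)) \end{pmatrix}
&\mapsto
\begin{pmatrix} (\mu_{1D}(D_1), \gamma_{1D}(D_1)), \\ (\mu_{2D}(D_1), \gamma_{2D}(D_1)), \\ \vdots \\ (\mu_{sD}(D_1), \gamma_{sD}(D_1)) \end{pmatrix}
\times \cdots \times
\begin{pmatrix} (\mu_{1D}(D_k), \gamma_{1D}(D_k)), \\ (\mu_{2D}(D_k), \gamma_{2D}(D_k)), \\ \vdots \\ (\mu_{sD}(D_k), \gamma_{sD}(D_k)) \end{pmatrix}.
\end{aligned}
\tag{9}
\]

MC-IFRS is the system that provides the two basic functions below:

(a) Prediction: determine the values of $(\mu_{lD}(D_i), \gamma_{lD}(D_i))$, $\forall l \in \{1, \ldots, s\}$, $\forall i \in \{1, \ldots, k\}$;

(b) Recommendation: choose $i^* \in [1, s]$ satisfying $i^* = \arg\max_{i = 1, \ldots, s} \left\{ \sum_{j=1}^{k} w_j \left( \mu_{iD}(D_j) + \mu_{iD}(D_j) (1 - \mu_{iD}(D_j) - \gamma_{iD}(D_j)) \right) \right\}$, where $w_j \in [0, 1]$ is the weight of $D_j$ satisfying the constraint $\sum_{j=1}^{k} w_j = 1$.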

A representation of MC-IFRS in matrix format is demonstrated as follows.

Definition 6 (Son & Thong, 2015). An intuitionistic fuzzy matrix (IFM) $Z$ in MC-IFRS is defined as

\[
Z = \begin{pmatrix}
a_{11} & a_{12} & \cdots & a_{1s} \\
b_{21} & b_{22} & \cdots & b_{2s} \\
c_{31} & c_{32} & \cdots & c_{3s} \\
c_{41} & c_{42} & \cdots & c_{4s} \\
\vdots & \vdots & & \vdots \\
c_{t1} & c_{t2} & \cdots & c_{ts}
\end{pmatrix}.
\tag{10}
\]

In Eq. (10), $t = k + 2$, where $k \in \mathbb{N}^+$ is the number of diseases in […] labels. The entries $a_{1i}$, $b_{2i}$, $c_{hi}$, $\forall h \in \{3, \ldots, t\}$, $\forall i \in \{1, \ldots, s\}$, are the intuitionistic fuzzy values (IFV) consisting of the membership and non-membership values as in Definition 5. $a_{1i} = (\mu_{iX}(x), \gamma_{iX}(x))$, $\forall i \in \{1, \ldots, s\}$, is the IFV of the patient to the $i$th linguistic label of feature $X$. $b_{2i} = (\mu_{iY}(y), \gamma_{iY}(y))$, $\forall i \in \{1, \ldots, s\}$, is the IFV of the symptom to the $i$th linguistic label of characteristic $Y$. $c_{hi} = (\mu_{iD}(D_{h-2}), \gamma_{iD}(D_{h-2}))$, $\forall i \in \{1, \ldots, s\}$, $\forall h \in \{3, \ldots, t\}$, is the IFV of the disease to the $i$th linguistic label. Each row from the third one to the last in Eq. (10) is related to a given disease.

Definition 7 (Son & Thong, 2015). Suppose that $Z_1$ and $Z_2$ are two IFMs in MC-IFRS. The intuitionistic fuzzy similarity matrix (IFSM) between $Z_1$ and $Z_2$ is defined as follows:

\[
\tilde{S} = \begin{pmatrix}
\tilde{S}_{11} & \tilde{S}_{12} & \cdots & \tilde{S}_{1s} \\
\tilde{S}_{21} & \tilde{S}_{22} & \cdots & \tilde{S}_{2s} \\
\tilde{S}_{31} & \tilde{S}_{32} & \cdots & \tilde{S}_{3s} \\
\tilde{S}_{41} & \tilde{S}_{42} & \cdots & \tilde{S}_{4s} \\
\vdots & \vdots & & \vdots \\
\tilde{S}_{t1} & \tilde{S}_{t2} & \cdots & \tilde{S}_{ts}
\end{pmatrix},
\tag{11}
\]

where

\[
\tilde{S}_{1i} = 1 - \frac{1 - \exp\left( -\frac{1}{2} \left( \left| \sqrt{\mu^{(1)}_{iX}(x)} - \sqrt{\mu^{(2)}_{iX}(x)} \right| + \left| \sqrt{\gamma^{(1)}_{iX}(x)} - \sqrt{\gamma^{(2)}_{iX}(x)} \right| \right) \right)}{1 - \exp(-1)},
\tag{12}
\]

\[
\tilde{S}_{2i} = 1 - \frac{1 - \exp\left( -\frac{1}{2} \left( \left| \sqrt{\mu^{(1)}_{iY}(y)} - \sqrt{\mu^{(2)}_{iY}(y)} \right| + \left| \sqrt{\gamma^{(1)}_{iY}(y)} - \sqrt{\gamma^{(2)}_{iY}(y)} \right| \right) \right)}{1 - \exp(-1)},
\tag{13}
\]

\[
\tilde{S}_{hi} = 1 - \frac{1 - \exp\left( -\frac{1}{2} \left( \left| \sqrt{\mu^{(1)}_{iD}(D_{h-2})} - \sqrt{\mu^{(2)}_{iD}(D_{h-2})} \right| + \left| \sqrt{\gamma^{(1)}_{iD}(D_{h-2})} - \sqrt{\gamma^{(2)}_{iD}(D_{h-2})} \right| \right) \right)}{1 - \exp(-1)},
\quad \forall h \in \{3, \ldots, t\}.
\tag{14}
\]
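The element-wise similarity of Definition 7 can be sketched as below, assuming the exponential form with absolute differences of square roots. IFMs are represented here as nested lists of `(mu, gamma)` tuples, which is an illustrative choice rather than the paper's data structure, and the function names are hypothetical.

```python
from math import exp, sqrt


def ifv_similarity(v1, v2):
    """Similarity of two IFVs (mu, gamma) in the exponential form of Definition 7."""
    (mu1, g1), (mu2, g2) = v1, v2
    # Half the sum of absolute differences of square roots; lies in [0, 1].
    d = 0.5 * (abs(sqrt(mu1) - sqrt(mu2)) + abs(sqrt(g1) - sqrt(g2)))
    # Normalised so that identical values give 1 and maximally distant values give 0.
    return 1.0 - (1.0 - exp(-d)) / (1.0 - exp(-1.0))


def ifsm(z1, z2):
    """IFSM of two IFMs given as t x s nested lists of (mu, gamma) tuples."""
    return [[ifv_similarity(a, b) for a, b in zip(r1, r2)] for r1, r2 in zip(z1, z2)]


# Hypothetical 3 x 2 IFMs (one patient row, one symptom row, one disease row).
z1 = [[(0.8, 0.1), (0.6, 0.1)], [(0.2, 0.8), (0.6, 0.1)], [(0.4, 0.1), (0.7, 0.1)]]
z2 = [[(0.8, 0.1), (0.8, 0.1)], [(0.0, 0.6), (0.2, 0.7)], [(0.3, 0.5), (0.2, 0.6)]]
for row in ifsm(z1, z2):
    print([round(s, 3) for s in row])
```

Because every entry uses the same formula, one helper covers the patient row, the symptom row and all disease rows.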

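Definition 8 (Son & Thong, 2015). Suppose that $Z_1$ and $Z_2$ are two IFMs in MC-IFRS. The intuitionistic fuzzy similarity degree (IFSD) between $Z_1$ and $Z_2$ is

\[
SIM(Z_1, Z_2) = \alpha \sum_{i=1}^{s} w_{1i} \tilde{S}_{1i} + \beta \sum_{i=1}^{s} w_{2i} \tilde{S}_{2i} + \chi \sum_{h=3}^{t} \sum_{i=1}^{s} w_{hi} \tilde{S}_{hi},
\tag{15}
\]

[…] $Z_2$. $W = (w_{ij})$ ($\forall i \in \{1, \ldots, t\}$, $\forall j \in \{1, \ldots, s\}$) is the weight matrix of the IFSM between $Z_1$ and $Z_2$ satisfying

\[
\sum_{i=1}^{s} w_{1i} = 1; \quad \sum_{i=1}^{s} w_{2i} = 1; \quad \sum_{i=1}^{s} w_{hi} = 1, \ \forall h \in \{3, \ldots, t\}.
\tag{16}
\]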
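A minimal sketch of the aggregation in Eq. (15) follows. The function name `ifsd`, the nested-list layout of `S` and `W`, and the values chosen for `alpha`, `beta` and `chi` are assumptions made for illustration; any constraint on these three coefficients is not visible in the text above and is therefore not enforced here.

```python
def ifsd(S, W, alpha, beta, chi):
    """Eq. (15): weighted aggregation of an IFSM S (t x s) with weight matrix W (t x s)."""
    for row in W:
        # Eq. (16): the weights of every row sum to 1.
        if abs(sum(row) - 1.0) > 1e-9:
            raise ValueError("each weight row must sum to 1")
    # Weighted similarity of each row of the IFSM.
    weighted = [sum(w * s for w, s in zip(w_row, s_row)) for w_row, s_row in zip(W, S)]
    # Row 1: patient feature, row 2: symptom characteristic, rows 3..t: diseases.
    return alpha * weighted[0] + beta * weighted[1] + chi * sum(weighted[2:])


# Hypothetical values for a case with s = 2 labels and k = 1 disease (t = 3).
S = [[1.0, 1.0], [0.5, 0.5], [0.0, 1.0]]
W = [[0.5, 0.5], [0.5, 0.5], [0.2, 0.8]]
print(round(ifsd(S, W, alpha=0.25, beta=0.25, chi=0.5), 3))
```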

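Definition 9 (Son & Thong, 2015). The formulas to predict the values of linguistic labels of patient $P_u$ ($\forall u \in \{1, \ldots, n\}$) to symptom $S_j$ ($\forall j \in \{1, \ldots, m\}$) according to diseases $(D_1, D_2, \ldots, D_k)$ in MC-IFRS are:

\[
\mu^{P_u}_{iD}(D_j) = \frac{\sum_{v=1}^{n} SIM(P_u, P_v) \times \mu^{P_v}_{iD}(D_j)}{\sum_{v=1}^{n} SIM(P_u, P_v)}, \quad \forall i \in \{1, \ldots, s\},\ \forall j \in \{1, \ldots, k\},\ \forall u \in \{1, \ldots, n\},
\tag{18}
\]

\[
\gamma^{P_u}_{iD}(D_j) = \frac{\sum_{v=1}^{n} SIM(P_u, P_v) \times \gamma^{P_v}_{iD}(D_j)}{\sum_{v=1}^{n} SIM(P_u, P_v)}, \quad \forall i \in \{1, \ldots, s\},\ \forall j \in \{1, \ldots, k\},\ \forall u \in \{1, \ldots, n\}.
\tag{19}
\]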
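Eqs. (18) and (19) are similarity-weighted averages, which the sketch below implements for one linguistic label and one disease. It assumes that the sum is restricted to the patients whose values are already known; the dictionaries `sims` and `known` and the patient names are hypothetical, and the numbers are illustrative only.

```python
def predict_ifv(sims, known):
    """Eqs. (18)-(19): similarity-weighted average of neighbours' (mu, gamma) pairs.

    sims  maps a neighbour to SIM(P_u, P_v);
    known maps the same neighbours to their (mu, gamma) for the target disease.
    """
    total = sum(sims.values())
    if total <= 0:
        raise ValueError("the similarities must have a positive sum")
    mu = sum(sims[p] * known[p][0] for p in sims) / total
    gamma = sum(sims[p] * known[p][1] for p in sims) / total
    return mu, gamma


# Hypothetical similarities and known values, not taken from the paper.
sims = {"P1": 0.9, "P2": 0.6}
known = {"P1": (0.7, 0.1), "P2": (0.2, 0.6)}
mu, gamma = predict_ifv(sims, known)
print(round(mu, 3), round(gamma, 3))
```

Since both outputs are convex combinations with the same weights, the predicted pair still satisfies the IFS constraint that membership plus non-membership does not exceed one.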

Table 1
The relation between the patients and the symptoms.

P      Temperature   Headache     Stomach_pain   Cough        Chest_pain
Ram    (0.8, 0.1)    (0.6, 0.1)   (0.2, 0.8)     (0.6, 0.1)   (0.1, 0.6)
Mari   (0, 0.8)      (0.4, 0.4)   (0.6, 0.1)     (0.1, 0.7)   (0.1, 0.8)
Sugu   (0.8, 0.1)    (0.8, 0.1)   (0, 0.6)       (0.2, 0.7)   (0, 0.5)
Somu   (0.6, 0.1)    (0.5, 0.4)   (0.3, 0.4)     (0.7, 0.2)   (0.3, 0.4)

Table 2
The training dataset with * being the values to be predicted.

P      Viral_Fever   Malaria      Typhoid      Stomach      Chest
Ram    (0.4, 0.1)    (0.7, 0.1)   (0.6, 0.1)   (0.2, 0.4)   (0.2, 0.6)
Mari   (0.3, 0.5)    (0.2, 0.6)   (0.4, 0.4)   (0.6, 0.1)   (0.1, 0.7)
Sugu   *             *            *            *            *
Somu   *             *            *            *            *

Table 3
The extracted SC-IFRS dataset with * being the values to be predicted.

Ram
  Symptoms: Temperature (0.8, 0.1); Headache (0.6, 0.1); Stomach pain (0.2, 0.8); Cough (0.6, 0.1); Chest pain (0.1, 0.6)
  Diseases: Viral fever (0.4, 0.1); Malaria (0.7, 0.1); Typhoid (0.6, 0.1); Stomach problem (0.2, 0.4); Chest problem (0.2, 0.6)
Mari
  Symptoms: Temperature (0.0, 0.8); Headache (0.4, 0.4); Stomach pain (0.6, 0.1); Cough (0.1, 0.7); Chest pain (0.1, 0.8)
  Diseases: Viral fever (0.3, 0.5); Malaria (0.2, 0.6); Typhoid (0.4, 0.4); Stomach problem (0.6, 0.1); Chest problem (0.1, 0.7)
Sugu
  Symptoms: Temperature (0.8, 0.1); Headache (0.8, 0.1); Stomach pain (0.0, 0.6); Cough (0.2, 0.7); Chest pain (0.0, 0.5)
  Diseases: *
Somu
  Symptoms: Temperature (0.6, 0.1); Headache (0.5, 0.4); Stomach pain (0.3, 0.4); Cough (0.7, 0.2); Chest pain (0.3, 0.4)
  Diseases: *

Table 4
The recommended diseases.

P      Viral_Fever   Malaria      Typhoid      Stomach      Chest

Example 1. We illustrate the steps of IFCF by an example in Son […] namely $P$ = {Ram, Mari, Sugu, Somu}, five symptoms $S$ = {Temperature, Headache, Stomach-pain, Cough, Chest-pain} and five diseases $D$ = {Viral-Fever, Malaria, Typhoid, Stomach, Heart}. The relation between the patients and the symptoms is illustrated in Table 1. […] values in this table need to be predicted. Motivated by […] $w_{3i} = 0.2$, the IFSD between Sugu (Somu) and Ram & Mari are shown below.

\[ IFSD(\text{Sugu}, \text{Ram}) = 0.87, \tag{20} \]
\[ IFSD(\text{Sugu}, \text{Mari}) = 0.57, \tag{21} \]
\[ IFSD(\text{Somu}, \text{Ram}) = 0.83, \tag{22} \]
\[ IFSD(\text{Somu}, \text{Mari}) = 0.58. \tag{23} \]

Table 5
The pseudo-code of DPFCM.

Distributed Picture Fuzzy Clustering Method (DPFCM)

Input:
  – Data $X$ with $N$ elements in $r$ dimensions
  – Number of clusters: $C$
  – Number of peers: $P + 1$
  – Fuzzifier $m$
  – Threshold $\varepsilon > 0$
  – Parameters: $\gamma, \alpha_1, \alpha_2, \alpha$, maxIter

Output:
  – $\{V_{ljh} \mid l = \overline{1,P};\ j = \overline{1,C};\ h = \overline{1,r}\}$
  – $\{(u_{lkj}, \eta_{lkj}, \xi_{lkj}) \mid l = \overline{1,P};\ k = \overline{1,Y_l};\ j = \overline{1,C}\}$
  – $\{w_{ljh} \mid l = \overline{1,P};\ j = \overline{1,C};\ h = \overline{1,r}\}$

DPFCM:
  1S:  Initialization:
       – Set the number of iterations: $t = 0$
       – Set $\Delta_{lijh}(t) = \theta_{lijh}(t) = 0$ ($\forall i \ne l$; $i, l = \overline{1,P}$; $j = \overline{1,C}$; $h = \overline{1,r}$)
       – Randomize $\{(u_{lkj}(t), \eta_{lkj}(t), \xi_{lkj}(t)) \mid l = \overline{1,P};\ k = \overline{1,Y_l};\ j = \overline{1,C}\}$ satisfying (31)
       – Set $w_{ljh}(t) = 1/r$ ($l = \overline{1,P}$; $j = \overline{1,C}$; $h = \overline{1,r}$)
  2S:  Calculate cluster centers $V_{ljh}(t)$ ($l = \overline{1,P}$; $j = \overline{1,C}$; $h = \overline{1,r}$) from $(u_{lkj}(t), \eta_{lkj}(t), \xi_{lkj}(t))$, $w_{ljh}(t)$ and $\theta_{lijh}(t)$ by (39)
  3S:  Calculate attribute-weights $w_{ljh}(t+1)$ ($l = \overline{1,P}$; $j = \overline{1,C}$; $h = \overline{1,r}$) from $(u_{lkj}(t), \eta_{lkj}(t), \xi_{lkj}(t))$, $V_{ljh}(t)$ and $\Delta_{lijh}(t)$ by (41)
  4S:  Send $\{\Delta_{lijh}(t), \theta_{lijh}(t), V_{ljh}(t), w_{ljh}(t+1) \mid i, l = \overline{1,P};\ i \ne l;\ k = \overline{1,Y_l};\ j = \overline{1,C}\}$ to the Master
  5M:  Calculate $\{\Delta_{lijh}(t+1), \theta_{lijh}(t+1) \mid i, l = \overline{1,P};\ i \ne l;\ k = \overline{1,Y_l};\ j = \overline{1,C}\}$ by (38) and (40) and send them to the Slave peers
  6S:  Calculate cluster centers $V_{ljh}(t+1)$ ($l = \overline{1,P}$; $j = \overline{1,C}$; $h = \overline{1,r}$) from $(u_{lkj}(t), \eta_{lkj}(t), \xi_{lkj}(t))$, $w_{ljh}(t+1)$ and $\theta_{lijh}(t+1)$ by (39)
  7S:  Calculate positive degrees $\{u_{lkj}(t+1) \mid l = \overline{1,P};\ k = \overline{1,Y_l};\ j = \overline{1,C}\}$ from $(\eta_{lkj}(t), \xi_{lkj}(t))$, $w_{ljh}(t+1)$ and $V_{ljh}(t+1)$ by (37)
  8S:  Compute neutral degrees $\{\eta_{lkj}(t+1) \mid l = \overline{1,P};\ k = \overline{1,Y_l};\ j = \overline{1,C}\}$ from $(u_{lkj}(t+1), \xi_{lkj}(t))$, $w_{ljh}(t+1)$ and $V_{ljh}(t+1)$ by (42)
  9S:  Calculate refusal degrees $\{\xi_{lkj}(t+1) \mid l = \overline{1,P};\ k = \overline{1,Y_l};\ j = \overline{1,C}\}$ from $(u_{lkj}(t+1), \eta_{lkj}(t+1))$, $w_{ljh}(t+1)$ and $V_{ljh}(t+1)$ by (43)
  10S: If $\max_l \{\max\{\|u_{lkj}(t+1) - u_{lkj}(t)\|, \|\eta_{lkj}(t+1) - \eta_{lkj}(t)\|, \|\xi_{lkj}(t+1) - \xi_{lkj}(t)\|\}\} < \varepsilon$ or $t > \text{maxIter}$, then stop the algorithm; otherwise set $t = t + 1$ and return to Step 3S.

S: Operations in Slave peers.
M: Operations in the Master peer.

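Next, use Definition 9 to calculate the predictive IFM results of Sugu and Somu:

\[
\text{Disease(Sugu)} = \left\langle \begin{array}{l} \text{Viral fever}\,(0.49, 0.38);\ \text{Malaria}\,(0.52, 0.22); \\ \text{Typhoid}\,(0.36, 0.52);\ \text{Stomach problem}\,(0.40, 0.34); \\ \text{Chest problem}\,(0.10, 0.68) \end{array} \right\rangle,
\tag{24}
\]

\[
\text{Disease(Somu)} = \left\langle \begin{array}{l} \text{Viral fever}\,(0.47, 0.39);\ \text{Malaria}\,(0.52, 0.22); \\ \text{Typhoid}\,(0.36, 0.51);\ \text{Stomach problem}\,(0.39, 0.47); \\ \text{Chest problem}\,(0.10, 0.68) \end{array} \right\rangle.
\tag{25}
\]

Based on the recommendation function of Definition 4 and Eqs. (24) […] as in Table 4. From this table, we conclude that Sugu and Somu both suffer from Malaria.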
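The conclusion of Example 1 can be checked with a short sketch that applies the score of the recommendation function in Definition 4, membership plus membership times the intuitionistic fuzzy index, to the predicted values of Eqs. (24) and (25). The function names `score` and `recommend` are illustrative, and applying the score across diseases rather than across linguistic labels follows the way the example uses it.

```python
def score(mu, gamma):
    """Score from Definition 4: mu + mu * (1 - mu - gamma)."""
    return mu + mu * (1.0 - mu - gamma)


def recommend(ifvs):
    """Return the key whose (mu, gamma) pair maximises the score."""
    return max(ifvs, key=lambda name: score(*ifvs[name]))


# Predicted values from Eq. (24) and Eq. (25).
sugu = {"Viral fever": (0.49, 0.38), "Malaria": (0.52, 0.22), "Typhoid": (0.36, 0.52),
        "Stomach problem": (0.40, 0.34), "Chest problem": (0.10, 0.68)}
somu = {"Viral fever": (0.47, 0.39), "Malaria": (0.52, 0.22), "Typhoid": (0.36, 0.51),
        "Stomach problem": (0.39, 0.47), "Chest problem": (0.10, 0.68)}

print(recommend(sugu), recommend(somu))  # Malaria Malaria
```

For Sugu, for instance, Malaria scores 0.52 + 0.52 × 0.26 = 0.6552, ahead of Viral fever at 0.49 + 0.49 × 0.13 = 0.5537.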

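2.2. Distributed Picture Fuzzy Clustering Method

[…] Clustering Method on picture fuzzy sets, called DPFCM. Firstly, we recall the definition of picture fuzzy sets.

Definition 10. A Picture Fuzzy Set (PFS) (Cuong & Kreinovich, 2013) in a non-empty set $X$ is

\[ \dot{A} = \{ \langle x, \mu_{\dot{A}}(x), \eta_{\dot{A}}(x), \gamma_{\dot{A}}(x) \rangle \mid x \in X \}, \tag{26} \]

where $\mu_{\dot{A}}(x)$ is the positive degree of each element $x \in X$, $\eta_{\dot{A}}(x)$ is the neutral degree and $\gamma_{\dot{A}}(x)$ is the negative degree, satisfying the constraints

\[ \mu_{\dot{A}}(x), \eta_{\dot{A}}(x), \gamma_{\dot{A}}(x) \in [0, 1], \quad \forall x \in X, \tag{27} \]

\[ 0 \le \mu_{\dot{A}}(x) + \eta_{\dot{A}}(x) + \gamma_{\dot{A}}(x) \le 1, \quad \forall x \in X. \tag{28} \]

The refusal degree of an element is calculated as $\xi_{\dot{A}}(x) = 1 - (\mu_{\dot{A}}(x) + \eta_{\dot{A}}(x) + \gamma_{\dot{A}}(x))$, $\forall x \in X$. In cases where $\xi_{\dot{A}}(x) = 0$, PFS returns to intuitionistic fuzzy sets (IFS) (Atanassov, 1986), and when both $\eta_{\dot{A}}(x) = \xi_{\dot{A}}(x) = 0$, PFS returns to fuzzy sets (FS) (Zadeh, 1965).

Fig. 3. MAE values of algorithms by 2-fold cross validation.

Fig. 4. MAE values of algorithms by 3-fold cross validation.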
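Definition 10 and the two reductions stated with it can be illustrated by a small sketch. The helper names `refusal_degree` and `reduces_to` and the tolerance `tol` are assumptions made for this example; the reduction rules are the ones given in the text.

```python
def refusal_degree(mu, eta, gamma, tol=1e-9):
    """Refusal degree xi = 1 - (mu + eta + gamma) after checking constraints (27)-(28)."""
    degrees = (mu, eta, gamma)
    if any(d < 0.0 or d > 1.0 for d in degrees) or sum(degrees) > 1.0 + tol:
        raise ValueError("degrees must lie in [0, 1] and sum to at most 1")
    return 1.0 - sum(degrees)


def reduces_to(mu, eta, gamma, tol=1e-9):
    """Name the set type a picture fuzzy value reduces to, following the text."""
    xi = refusal_degree(mu, eta, gamma, tol)
    if abs(xi) <= tol and abs(eta) <= tol:
        return "FS"   # neutral and refusal degrees are both zero
    if abs(xi) <= tol:
        return "IFS"  # refusal degree is zero
    return "PFS"


print(reduces_to(0.6, 0.0, 0.4), reduces_to(0.5, 0.2, 0.3), reduces_to(0.5, 0.2, 0.1))
```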

In DPFCM, the communication model is the facilitator, or Master–Slave, model with a Master peer and $P$ Slave peers, and each Slave peer is allowed to communicate with the Master only. Each Slave peer holds a subset of the original dataset $X$, which consists of $N$ data points in $r$ dimensions. We call the subsets $Y_j$ ($j = \overline{1,P}$), with $\bigcup_{j=1}^{P} Y_j = X$ and $\sum_{j=1}^{P} |Y_j| = N$. The number of dimensions in a subset is exactly the same as that in the original dataset. The clustering problem is to divide the dataset $X$ into $C$ groups satisfying the objective function below:

\[
J = \sum_{l=1}^{P} \sum_{k=1}^{Y_l} \sum_{j=1}^{C} \left( \frac{u_{lkj}}{1 - \eta_{lkj} - \xi_{lkj}} \right)^{m} \sum_{h=1}^{r} w_{ljh} \| X_{lkh} - V_{ljh} \|^2 + \gamma \sum_{l=1}^{P} \sum_{j=1}^{C} \sum_{h=1}^{r} w_{ljh} \log w_{ljh} \to \min,
\tag{29}
\]

where $u_{lkj}$, $\eta_{lkj}$ and $\xi_{lkj}$ are the positive, the neutral and the refusal degrees of the $k$th data point to the $j$th cluster in the $l$th Slave peer. This reflects the clustering in the PFS set expressed through Definition 10. $w_{ljh}$ is the attribute-weight of the $h$th attribute to the $j$th cluster in the $l$th Slave peer. $V_{ljh}$ is the center of the $j$th cluster in the $l$th Slave peer according to the $h$th attribute. $X_{lkh}$ is the $k$th data point of the $l$th Slave peer according to the $h$th attribute. $m$ and $\gamma$ are the fuzzifier and a positive scalar, respectively. The constraints for (29) are shown below.

\[ u_{lkj}, \eta_{lkj}, \xi_{lkj} \in [0, 1], \tag{30} \]

\[ u_{lkj} + \eta_{lkj} + \xi_{lkj} \le 1, \tag{31} \]

\[ \sum_{j=1}^{C} \frac{u_{lkj}}{1 - \eta_{lkj} - \xi_{lkj}} = 1, \tag{32} \]

\[ \sum_{j=1}^{C} \left( \eta_{lkj} + \frac{\xi_{lkj}}{C} \right) = 1, \tag{33} \]

\[ \sum_{h=1}^{r} w_{ljh} = 1, \tag{34} \]

\[ V_{ljh} = V_{ijh}, \quad (\forall i \ne l;\ i, l = \overline{1,P}), \tag{35} \]

\[ w_{ljh} = w_{ijh}, \quad (\forall i \ne l;\ i, l = \overline{1,P}). \tag{36} \]

The clustering model in Eqs. (29)–(36) relies on the principles of the PFS set and the facilitator model. By using the Lagrangian method and the Picard iteration, the optimal solutions of this model are shown in Eqs. (37)–(43).

\[
u_{lkj} = \frac{1 - \eta_{lkj} - \xi_{lkj}}{\sum_{i=1}^{C} \left( \dfrac{\sum_{h=1}^{r} w_{ljh} \| X_{lkh} - V_{ljh} \|^2}{\sum_{h=1}^{r} w_{lih} \| X_{lkh} - V_{lih} \|^2} \right)^{\frac{1}{m-1}}}, \quad (\forall l = \overline{1,P};\ k = \overline{1,Y_l};\ j = \overline{1,C}).
\tag{37}
\]

Fig. 5. MAE values of algorithms by 4-fold cross validation.

Fig. 6. MAE values of algorithms by 5-fold cross validation.
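The update of the positive degrees in Eq. (37) can be sketched for a single data point on a single peer as follows. The function and argument names are illustrative, and the sketch assumes that the point does not coincide with any cluster center, so that every weighted distance is strictly positive.

```python
def weighted_sq_dist(x, v, w):
    """Attribute-weighted squared distance: sum_h w_h * (x_h - v_h)^2."""
    return sum(wh * (xh - vh) ** 2 for xh, vh, wh in zip(x, v, w))


def positive_degrees(x, centers, weights, eta, xi, m=2.0):
    """Eq. (37) for one data point x; centers and weights hold one row per cluster."""
    d = [weighted_sq_dist(x, v, w) for v, w in zip(centers, weights)]
    u = []
    for j, dj in enumerate(d):
        # Sum over clusters i of (d_j / d_i)^(1 / (m - 1)), as in fuzzy C-means.
        ratio_sum = sum((dj / di) ** (1.0 / (m - 1.0)) for di in d)
        u.append((1.0 - eta[j] - xi[j]) / ratio_sum)
    return u


# Hypothetical point with two clusters at equal weighted distance.
u = positive_degrees([0.0], [[1.0], [-1.0]], [[1.0], [1.0]], eta=[0.1, 0.1], xi=[0.1, 0.1])
print([round(val, 3) for val in u])
```

By construction the ratios $u_{lkj} / (1 - \eta_{lkj} - \xi_{lkj})$ sum to one over the clusters, which gives a quick sanity check on any implementation of this step.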

Fig. 7. MAE values of algorithms by 6-fold cross validation.

Fig. 8. MAE values of algorithms by 7-fold cross validation.

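\[
\theta_{lijh} = \theta_{lijh} + \alpha_1 (V_{ljh} - V_{ijh}), \quad (\forall i \ne l;\ i, l = \overline{1,P};\ j = \overline{1,C};\ h = \overline{1,r}),
\tag{38}
\]

\[
V_{ljh} = \frac{\sum_{k=1}^{Y_l} \left( \dfrac{u_{lkj}}{1 - \eta_{lkj} - \xi_{lkj}} \right)^{m} w_{ljh} X_{lkh} - \sum_{i=1,\, i \ne l}^{P} \theta_{lijh}}{\sum_{k=1}^{Y_l} \left( \dfrac{u_{lkj}}{1 - \eta_{lkj} - \xi_{lkj}} \right)^{m} w_{ljh}}, \quad (\forall l = \overline{1,P};\ j = \overline{1,C};\ h = \overline{1,r}),
\tag{39}
\]

\[
\Delta_{lijh} = \Delta_{lijh} + \alpha_2 (w_{ljh} - w_{ijh}), \quad (\forall i \ne l;\ i, l = \overline{1,P};\ j = \overline{1,C};\ h = \overline{1,r}),
\tag{40}
\]

\[
w_{ljh} = \frac{\exp\left( -\dfrac{1}{\gamma} \left( \sum_{k=1}^{Y_l} \left( \dfrac{u_{lkj}}{1 - \eta_{lkj} - \xi_{lkj}} \right)^{m} \| X_{lkh} - V_{ljh} \|^2 + \gamma + 2 \sum_{i=1,\, i \ne l}^{P} \Delta_{lijh} \right) \right)}{\sum_{h'=1}^{r} \exp\left( -\dfrac{1}{\gamma} \left( \sum_{k=1}^{Y_l} \left( \dfrac{u_{lkj}}{1 - \eta_{lkj} - \xi_{lkj}} \right)^{m} \| X_{lkh'} - V_{ljh'} \|^2 + \gamma + 2 \sum_{i=1,\, i \ne l}^{P} \Delta_{lijh'} \right) \right)}, \quad (\forall l = \overline{1,P};\ j = \overline{1,C};\ h = \overline{1,r}),
\tag{41}
\]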

glkj¼ 1  nlkjþ

C1 C

PC i¼1nlki

PC i¼1

ulkj

ulki

Pr h¼1

w lih kX lkh V lih k 2

w ljh kX lkh V ljh k 2

mþ1

;

ð8l ¼ 1; P; k ¼ 1; Yl;j ¼ 1; CÞ; ð42Þ

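\[
\xi_{lkj} = 1 - (u_{lkj} + \eta_{lkj}) - \left( 1 - (u_{lkj} + \eta_{lkj})^{\alpha} \right)^{1/\alpha}, \quad (\forall l = \overline{1,P};\ k = \overline{1,Y_l};\ j = \overline{1,C}).
\tag{43}
\]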
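Eq. (43) is a closed-form update, so it can be sketched in a few lines. The function name `refusal` is illustrative, and the restriction of the exponent `alpha` to the interval (0, 1] is an assumption of this sketch rather than a statement taken from the text above.

```python
def refusal(u, eta, alpha):
    """Eq. (43): xi = 1 - (u + eta) - (1 - (u + eta)^alpha)^(1 / alpha)."""
    if not 0.0 < alpha <= 1.0:
        raise ValueError("this sketch assumes 0 < alpha <= 1")
    s = u + eta
    if not 0.0 <= s <= 1.0:
        raise ValueError("u + eta must lie in [0, 1]")
    return 1.0 - s - (1.0 - s ** alpha) ** (1.0 / alpha)


# With alpha = 1 the refusal degree vanishes; smaller alpha yields a larger refusal degree.
print(round(refusal(0.2, 0.05, alpha=1.0), 3), round(refusal(0.2, 0.05, alpha=0.5), 3))
```

With `alpha = 0.5` and `u + eta = 0.25`, the square root is 0.5, the last term is 0.25, and the refusal degree is 1 − 0.25 − 0.25 = 0.5.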

Fig. 9. MAE values of algorithms by 8-fold cross validation.

Fig. 10. MAE values of algorithms by 9-fold cross validation.
