PROBABILISTIC REASONING BASED ON LAYERS OF KNOWLEDGE BASE

Tạp chí Tin học và Điều khiển học, T.17, S.2 (2001), 27-34

TRAN DINH QUE

Abstract. Reasoning in the interval-valued probabilistic logic depends heavily on the basic matrix of truth values of sentences in a knowledge base B and a target sentence S. However, the problem of determining all such consistent truth value assignments for a set of sentences is NP-complete for propositional logic and undecidable for first-order predicate logic.

This paper first presents a method of approximate reasoning in the interval-valued probabilistic logic based on "layers" of a knowledge base. Then, we investigate a method of slightly decreasing the complexity of reasoning via the maximum entropy principle in a point-valued probabilistic knowledge base. Such a method is based on the reduced basic matrix constructed from sentences of the knowledge base without the target sentence.

Tóm tắt. (Vietnamese abstract.) Reasoning in interval-valued probabilistic logic depends heavily on the basic matrix of the truth values of the sentences in a knowledge base B and a target sentence S. However, the problem of determining all consistent truth value assignments for a set of sentences is NP-complete for propositional logic and undecidable for first-order logic. This paper first presents a method of approximate reasoning in interval-valued probabilistic logic based on the "layers" of the knowledge base. We then examine a method that slightly reduces the complexity of reasoning based on the maximum entropy principle in a point-valued probabilistic knowledge base. Such a reasoning method relies on the reduced basic matrix constructed from the sentences of the knowledge base without the target sentence.

1. INTRODUCTION

Among various approaches to handling uncertain information, the paradigm of probabilistic logic has been widely studied in the community of AI researchers (e.g., [1-13]). The interest in probabilistic logic as a research topic for AI was sparked by Nilsson's paper on probabilistic logic [11].

Probabilistic logic, an integration of logic and probability theory, determines the probability of a sentence by means of a probability distribution on a sample space composed of classes of possible worlds. Each class is defined by means of a tuple of consistent truth values assigned to a set of sentences. Deduction in this logic is then reduced to a linear programming problem. However, the problem of determining all such consistent truth value assignments for a set of sentences is NP-complete for propositional logic and undecidable for first-order logic. There have been a great deal of attempts in the AI community to deal with this drawback (e.g., [1], [8], [10], [13]).

This paper first proposes a method of approximate reasoning based on "layers" of an interval-valued probabilistic knowledge base (iKB). The first layer consists of elements of the iKB whose sentences have some logical relationship with the target sentence. The second one contains elements of the iKB whose sentences have some relationship with sentences in the first layer, and so on. Our inference method is based on the idea that the calculation of the value of a sentence is based directly only on its nearest upper layer. Later we consider the deduction of point-valued probabilistic logic via the Maximum Entropy (ME) principle. Like the deduction from an iKB, ME deduction is also based on the matrix composed of vectors of consistent truth values of the target sentence and the sentences in a point-valued knowledge base (pKB). It is possible to build this deduction based on the reduced basic matrix of only the sentences in some layers of the pKB without the target sentence.

The method of constructing layers from sentences in a knowledge base and a method of approximate reasoning based on them will be presented in the next section. Section 3 presents a method of reducing the size of the basic matrix in point-valued probabilistic reasoning via ME. Our approach is to construct the basic matrix of the sentences in the related layers without referring to the goal sentence. Some conclusions and discussions are presented in Section 4.

2. APPROXIMATE REASONING BASED ON LAYERS OF A KNOWLEDGE BASE

2.1 Entailment problem in probabilistic logic

This section overviews the entailment problem of the interval-valued probabilistic logic [3] and of the point-valued probabilistic logic proposed by Nilsson [11].

Given an iKB

B = {(S_i, I_i) | i = 1, ..., l},

in which S_i (i = 1, ..., l) are sentences, I_i (i = 1, ..., l) are subintervals of the unit interval [0, 1], and a target sentence S. From the set of sentences Σ = {S_1, ..., S_l, S_{l+1}} (S_{l+1} = S), it is possible to construct a set of classes of possible worlds. Every class is characterized by a vector of consistent truth values of the sentences in Σ. In this section, we suppose that Ω = {w_1, ..., w_k} is the set of all Σ-classes of possible worlds and (u_{1j}, ..., u_{lj}, u_{l+1,j})^t is the column vector of the truth values of the sentences S_1, ..., S_l, S_{l+1} in the class w_j.

Let P = (p_1, ..., p_k) be a probability distribution over the sample space Ω. The truth probability of a sentence S_i is then defined to be the sum of the probabilities of the possible world classes in which S_i is true, i.e.,

π(S_i) = u_{i1}p_1 + ... + u_{ik}p_k

or

π(S_i) = Σ_{w_j ⊨ S_i} p_j.

We can write these equalities in the form of the following matrix equation

Π = UP,

where Π = (π(S_1), ..., π(S_l), π(S))^t, P = (p_1, ..., p_k)^t and U = (u_{ij}) (i = 1, ..., l+1, j = 1, ..., k). The matrix U will be called the basic matrix of Σ.
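As a concrete illustration, the basic matrix can be built by enumerating all truth assignments to the atoms and keeping the distinct vectors of sentence truth values. The following Python sketch is our own illustrative encoding (the lambda representation of sentences and all names are assumptions, not notation from the paper); it computes the classes of possible worlds for the sentences A, A → B, B → C used later in Example 4.

```python
from itertools import product

def basic_matrix(sentences, atoms):
    """Each distinct vector of sentence truth values, taken over all truth
    assignments to the atoms, is one class of possible worlds, i.e. one
    column of the basic matrix."""
    columns = []
    for values in product([True, False], repeat=len(atoms)):
        world = dict(zip(atoms, values))
        col = tuple(int(s(world)) for s in sentences)
        if col not in columns:          # keep each class only once
            columns.append(col)
    return columns

# A, A -> B, B -> C, reading "->" as material implication
sentences = [
    lambda w: w["A"],
    lambda w: (not w["A"]) or w["B"],
    lambda w: (not w["B"]) or w["C"],
]
columns = basic_matrix(sentences, ["A", "B", "C"])
print(len(columns))   # number of classes of possible worlds
```

For these three sentences the eight truth assignments collapse into five classes, which is the size used in Example 4 below.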

The probabilistic entailment problem is reduced to the linear programming one: finding

α = min π(S), β = max π(S),

where

π(S) = u_{l+1,1}p_1 + ... + u_{l+1,k}p_k,

subject to the constraints

u_{i1}p_1 + ... + u_{ik}p_k ∈ I_i (i = 1, ..., l),
Σ_{j=1}^{k} p_j = 1, p_j ≥ 0 (j = 1, ..., k).

We denote the interval [α, β] by F(S, B), and write B ⊢ (S, F(S, B)).

In the special case when B is a point-valued probabilistic knowledge base (pKB), i.e., all I_i are points α_i in [0, 1], the constraints become equalities

u_{i1}p_1 + ... + u_{ik}p_k = α_i (i = 1, ..., l),
Σ_{j=1}^{k} p_j = 1, p_j ≥ 0 (j = 1, ..., k).


PROBABILISTIC REASONING BASED ON LAYERS OF KNOWLEDGE BASE

However, in general, F(S, B) is not a point value. Some assumption must be added to the constraints to derive a point value for a target sentence. The Maximum Entropy (ME) principle is usually used for such a deduction. We will return to this investigation in Section 3.

2.2 Layers of knowledge base

This subsection is devoted to presenting a procedure to produce layers of a knowledge base. Suppose that B = {(S_i, I_i) | i = 1, ..., l} is an iKB, in which the S_i are propositional sentences and the I_i are interval values of the sentences S_i; S is any target sentence whose probability value we would like to calculate.

The reasoning for deriving the probabilistic value of the sentence S from the knowledge base B depends strongly on the basic matrix of truth values of a subset of sentences in Σ' = {S_1, ..., S_l} that have some logical relationship with the target sentence. We will characterise the relationship by layering the set of sentences in the knowledge base.

A subset B' of B is sufficient for S if the probabilistic values of S deduced from B and B' are the same. It means that if B ⊢ (S, I) and B' ⊢ (S, I') then I = I'.

Denote by atom(φ) the set of atoms occurring in the sentence φ and by atom(Φ) = ∪_{φ∈Φ} atom(φ) the set of all atoms in the sentences in Φ.

Example 1. atom(A → B ∧ C) = {A, B, C}; atom({A ∧ B, C → ¬D}) = {A, B, C, D}.

The following note shows the meaning of introducing the notion of atom.

If B' is a subset of B such that

atom(B' ∪ {S}) ∩ atom(B - B') = ∅,

then B' is sufficient for S.

We now consider a procedure to produce layers of a knowledge base based on the logical dependence between its sentences and the sentence S. Layers of sentences in Σ' are constructed recursively as follows:

L_0^S = {S},
L_1^S = {φ | φ ∈ Σ', φ ∉ L_0^S and atom(φ) ∩ atom(L_0^S) ≠ ∅},
L_2^S = {φ | φ ∈ Σ', φ ∉ ∪_{i=0}^{1} L_i^S and atom(φ) ∩ atom(L_1^S) ≠ ∅},
...
L_n^S = {φ | φ ∈ Σ', φ ∉ ∪_{i=0}^{n-1} L_i^S and atom(φ) ∩ atom(L_{n-1}^S) ≠ ∅}.

With respect to each L_n^S, let

B_n^S = {(φ, I_φ) | (φ, I_φ) ∈ B, φ ∈ L_n^S}.

Note that if S ∉ Σ', then B_0^S = {(S, [0, 1])}; otherwise B_0^S = {(S, I_S) | (S, I_S) ∈ B}.

We call the subset B_n^S the nth layer of the knowledge base B w.r.t. S. If φ ∈ L_i^S, the layer B_{i+1}^S is called the nearest upper-layer of the sentence φ.

It is easy to see that there always exists a number n_0 such that L_{n_0}^S ≠ ∅ but L_{n_0+1}^S = ∅. We denote

B_suf(S) = ∪_{i=0}^{n_0} B_i^S.

It is clear that B_suf(S) is a sufficient subset for S.

Consider the following illustrating example

Example 2. Given a knowledge base

B = {B → A : [.9, 1],
D → B : [.8, .9],
A ∧ C : [.6, .8],
D : [.8, 1],
C : [.2, .7]}

and a target sentence A. The knowledge base can be layered into subsets with respect to the target sentence A:

L_0^A = {A}, B_0^A = {A : [0, 1]},
L_1^A = {B → A, A ∧ C}, B_1^A = {B → A : [.9, 1], A ∧ C : [.6, .8]},
L_2^A = {D → B, C}, B_2^A = {D → B : [.8, .9], C : [.2, .7]},
L_3^A = {D}, B_3^A = {D : [.8, 1]}.

Thus, the sufficient subset for A is

B_suf(A) = B.
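The layering procedure depends only on the atom sets of the sentences, so it can be sketched directly. In the following Python sketch (the dictionary encoding of sentences by their atom sets and the function name are our own assumptions, not the paper's notation), the layers of Example 2 are recovered:

```python
def layers(kb_atoms, target_atoms):
    """kb_atoms maps each sentence to its set of atoms (atom(phi)).
    Builds L_1, L_2, ... above L_0 = {target}: a sentence enters layer n
    when it shares an atom with layer n-1 and lies in no earlier layer."""
    remaining = dict(kb_atoms)
    frontier = set(target_atoms)      # atoms of the previous layer
    result = []
    while True:
        layer = [s for s, ats in remaining.items() if ats & frontier]
        if not layer:
            break
        result.append(layer)
        frontier = set().union(*(remaining[s] for s in layer))
        for s in layer:               # exclude from all later layers
            del remaining[s]
    return result

kb = {
    "B -> A": {"A", "B"},
    "D -> B": {"B", "D"},
    "A & C": {"A", "C"},
    "D": {"D"},
    "C": {"C"},
}
print(layers(kb, {"A"}))
# [['B -> A', 'A & C'], ['D -> B', 'C'], ['D']]
```

The output reproduces the layers L_1, L_2, L_3 computed by hand in Example 2.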

Similarly, layering can be performed for a point-valued probabilistic knowledge base.

In the case a knowledge base is large, it is not easy to derive the smallest interval value for a target sentence S from B_suf(S). Layering gives us a method of calculating an approximate value. The idea of approximate reasoning is that the probabilistic value of each sentence is updated by deriving its value based on the nearest upper-layer of this sentence. And when all sentences of the nearest upper-layer of the target sentence are updated, its value is then calculated. We now formalise the above presentation.

Without loss of generality, we suppose that B is a sufficient knowledge base and S is a target sentence. It is layered into subsets B_0^S, B_1^S, ..., B_{n_0}^S, where B_{n_0}^S is the highest layer in the knowledge base. Recall that the L_i^S (i = 1, ..., n_0) are subsets of sentences w.r.t. S.

Update of a sentence φ is recursively defined as follows:

(i) For all φ ∈ L_{n_0}^S, φ is updated;
(ii) φ ∈ L_i^S (i < n_0) is updated if all ψ ∈ L_{i+1}^S are updated and B_{(i+1,u)}^S ⊢ (φ, I_φ), where B_{(i+1,u)}^S is the updated layer of B_{i+1}^S.

If B_1^S is updated into B_{(1,u)}^S and B_{(1,u)}^S ⊢ (S, I_S), then I_S is the approximate value for S.

Thus, the approximate calculation of the interval value for a sentence consists of three steps:

1. Divide the knowledge base into layers, with the lowest layer being the target sentence S.
2. Update the values for the sentences of B_{i-1}^S from the nearest upper-layer B_i^S. This process starts from i = n_0 and continues until B_1^S is updated into B_{(1,u)}^S.
3. Calculate the value for S from B_{(1,u)}^S.

Example 3 (continued). In Example 2, we have constructed the layers of the knowledge base. If we base the reasoning on the whole B_suf(A), it is necessary to build a 6×14 basic matrix of 6 rows and 14 columns. It is possible to calculate the value for A according to the above approximate method.

In the process of updating, D → B and B → A are stable, i.e., their values are [.8, .9] and [.9, 1], respectively. Since the value of C is [.2, .7], A ∧ C is updated to [.6, .7]. Thus, a value of A is deduced from the first updated layer

B_{(1,u)}^A = {B → A : [.9, 1], A ∧ C : [.6, .7]}.

The basic matrix for the sentences Σ = {B → A, A ∧ C, A} is

1 1 1 0
1 0 0 0
1 1 0 0


We need to compute

α = min (p_1 + p_2), β = max (p_1 + p_2)

on the domain determined by

.9 ≤ p_1 + p_2 + p_3 ≤ 1,
.6 ≤ p_1 ≤ .7,
p_1 + p_2 + p_3 + p_4 = 1, p_j ≥ 0.

The value of A is then [.6, 1].

We now compare the computed value with the value derived from the anytime deduction proposed by Frisch and Haddawy [8]. Anytime deduction is based on a set of thirty-two rules enumerated from (i) to (xxxii). In the above example, applying (xx) first to D : [.8, 1] and D → B : [.8, .9] yields B : [.6, .9], and applying it again to this result and B → A : [.9, 1] yields A : [.5, 1]. In the same way, combining C : [.2, .7] and A : [0, 1] via the rule (xxv) gives A ∧ C : [0, .7] and then with A ∧ C : [.6, .8] via (xvii) gives A ∧ C : [.6, .7]; applying (xxvi) to this result yields A : [.6, 1]. Applying (xvii) to the two ways of computing A, we have A : [.6, 1]. The derived interval equals the interval value of A deduced by our method of approximate reasoning.
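The interval computations in this comparison can be replayed mechanically. The sketch below is our own: it implements only the standard probabilistic bounds used here (modus ponens, the Fréchet bound for conjunction, interval intersection, and projection from a conjunction), the function names are assumptions, and the intermediate values are recomputed rather than quoted.

```python
def modus_ponens(a, ab):
    """A:[a1,a2], A->B:[b1,b2] |- B:[max(0, a1+b1-1), b2]."""
    return (max(0.0, a[0] + ab[0] - 1), ab[1])

def conjunction(a, c):
    """A:[a1,a2], C:[c1,c2] |- A&C:[max(0, a1+c1-1), min(a2, c2)]."""
    return (max(0.0, a[0] + c[0] - 1), min(a[1], c[1]))

def intersect(i, j):
    """Combine two derivations of the same sentence."""
    return (max(i[0], j[0]), min(i[1], j[1]))

def project(ac):
    """A&C:[l,u] |- A:[l,1]."""
    return (ac[0], 1.0)

# first chain: D, D -> B |- B; then B, B -> A |- A
A1 = modus_ponens(modus_ponens((.8, 1), (.8, .9)), (.9, 1))  # about [.5, 1]
# second chain through A & C
AC = intersect(conjunction((0, 1), (.2, .7)), (.6, .8))      # [.6, .7]
A2 = project(AC)                                             # [.6, 1]
A = intersect(A1, A2)                                        # [.6, 1]
print(A)
```

Intersecting the two chains reproduces the interval [.6, 1] obtained by the layered method.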

3. MAXIMUM ENTROPY DEDUCTION WITH THE REDUCED BASIC MATRIX

In this section, we investigate a method of reducing the complexity of computation in applying the Maximum Entropy Principle for deriving a point value for a sentence from a point-valued probabilistic knowledge base.

3.1 Maximum Entropy Deduction

We first review a technique named the Maximum Entropy Principle [11] to select a probability distribution among the distributions satisfying some initial conditions given by a knowledge base.

Suppose that

B = {(S_i, α_i) | i = 1, ..., l}

is a pKB and S is a sentence (S ≠ S_i, i = 1, ..., l). As presented in Section 2, denote by F(S, B) the set of values of π(S) = Σ_{w_i ⊨ S} p_i = u_{l+1,1}p_1 + ... + u_{l+1,k}p_k, where P = (p_1, ..., p_k) varies in the domain defined by the conditional equation

Π = U⁺P,     (1)

where Π = (1, α_1, ..., α_l)^t and U⁺ is the basic matrix composed of columns of truth values of the sentences S_1, ..., S_l, S_{l+1} (S_{l+1} = S) with the first row being units.

According to the Maximum Entropy Principle, in order to obtain a single value for S, we must select a distribution P maximizing the entropy

H(P) = -Σ_{j=1}^{k} p_j ln p_j,     (2)

where P is subject to the constraints determined by the conditional equation (1). Suppose that (p_1, ..., p_k) is a solution of the above problem. Then the probability of S is

F(S, B) = u_{l+1,1}p_1 + ... + u_{l+1,k}p_k.

According to [11], the components of the entropy-maximizing distribution can be written in the product form

p_i = a_0 a_1^{u_{1i}} a_2^{u_{2i}} ... a_l^{u_{li}},     (3)

where (u_{1i}, ..., u_{li})^t is, apart from the leading unit, the ith column of U⁺.


From the initial conditions of the knowledge base, we can compute the a_j and then the p_i. Thus the point probability value of S is then derived. We call the deduction based on the Maximum Entropy Principle the Maximum Entropy deduction, or shortly ME deduction.
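The product form (3) can be checked on a two-sentence toy pKB of our own (not an example from the paper): for B = {A : .7, B : .4} the four classes over Σ = {A, B} are the truth vectors (1,1), (1,0), (0,1), (0,0), and the entropy-maximizing distribution is the independent one, which is exactly of product form. The sketch verifies the constraints and that any feasible perturbation lowers the entropy.

```python
import math

alpha1, alpha2 = 0.7, 0.4          # P(A), P(B)

# product-form coefficients for this toy case (independence)
a0 = (1 - alpha1) * (1 - alpha2)
a1 = alpha1 / (1 - alpha1)
a2 = alpha2 / (1 - alpha2)

# p_i = a0 * a1**u1i * a2**u2i over the classes (1,1),(1,0),(0,1),(0,0)
p = [a0 * a1 * a2, a0 * a1, a0 * a2, a0]

def H(q):
    return -sum(x * math.log(x) for x in q)

# the conditional equation (1) holds for p
assert abs(p[0] + p[1] - alpha1) < 1e-9
assert abs(p[0] + p[2] - alpha2) < 1e-9
assert abs(sum(p) - 1) < 1e-9

# moving along the one-dimensional null space of the constraints
# strictly decreases the entropy, so p is the ME distribution
for t in (0.05, -0.05):
    q = [p[0] + t, p[1] - t, p[2] - t, p[3] + t]
    assert H(q) < H(p)

print(p)   # [0.28, 0.42, 0.12, 0.18] up to rounding
```

The feasible directions (t, -t, -t, t) preserve both marginal constraints and the normalization, so the assertion really does test local optimality.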

3.2 Maximum Entropy Deduction with the Reduced Basic Matrix

As presented above, the ME deduction is based on the basic matrix constructed from the target sentence and all sentences in the initial knowledge base. The larger the basic matrix is, the more complex the computation is. In fact, the coefficients a_i in (3) are only related to the matrix of truth values of the sentences in the knowledge base. The complexity is slightly decreased if ME deduction is based on the basic matrix constructed only from the sentences of the knowledge base without the target sentence.

As presented in Subsection 2.2, the probabilistic inference only depends on the sufficient subset for the target sentence. Without loss of generality, we suppose that B = B_suf(S), Ω = {w_1, ..., w_k} is the set of possible world classes determined by Σ = {S_1, ..., S_l}, and U⁺ is the reduced basic matrix of the sentences in Σ with the first row being units.

In each class w_i, S can have either one truth value (true or false) or both truth values true and false. For ease of presentation, we suppose that on the classes w_1, ..., w_m the sentence S gets one truth value, and on w_{m+1}, ..., w_k, S has both values true and false. Thus, the extended set of possible world classes w.r.t. Σ ∪ {S} has the form

Ω⁺ = F ∪ E,

where F = {w_1, ..., w_m} and E = {w⁺_{m+1}, w⁻_{m+1}, ..., w⁺_k, w⁻_k}. We have the following proposition.

Proposition 1. Suppose that P is a probability distribution satisfying the ME principle on Ω. We have

π(S) = Σ_{w_i ⊨ S, 1 ≤ i ≤ m} p_i + (1/2) Σ_{m+1 ≤ i ≤ k} p_i.     (4)

Proof. Suppose P⁺ = (p_1, ..., p_m, p⁺_{m+1}, p⁻_{m+1}, ..., p⁺_k, p⁻_k) is the probability distribution on Ω⁺ satisfying ME and (1). According to the method of constructing this distribution, we have

p⁺_{m+1} = p⁻_{m+1}, ..., p⁺_k = p⁻_k.

Therefore, if P = (p_1, ..., p_m, p_{m+1}, ..., p_k) is the probability distribution on Ω satisfying (1) and ME, then

p_i = p⁺_i (i = 1, ..., m),
p_i = 2p⁺_i (i ≥ m + 1).

It is easy to derive (4) from these equalities. The proposition is proved.
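Proposition 1 can be seen at work on a minimal example of our own (not from the paper): take Σ = {A ∨ B} with probability .8 and target S = A. On the extended space Ω⁺ the constraint only fixes p(w_1⁺) + p(w_1⁻) = .8, a direct scan shows that the entropy is maximized by the even split, and the resulting probability of S agrees with equality (4).

```python
import math

def entropy(ps):
    return -sum(p * math.log(p) for p in ps if p > 0)

# Omega+ = {w1+, w1-, w2}: in w1 (A|B true) the target A has both
# truth values; in w2 (A|B false) it is false.  Scan x = p(w1+).
best_x, best_h = 0.0, -1.0
for i in range(801):
    x = i / 1000
    h = entropy((x, 0.8 - x, 0.2))
    if h > best_h:
        best_x, best_h = x, h

pi_extended = best_x        # A is true exactly on w1+
pi_prop1 = 0.8 / 2          # equality (4): half of the ambiguous mass
print(pi_extended, pi_prop1)
```

Both computations give .4, illustrating why the reduced matrix over Ω alone suffices.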

In summary, the computation of the point value for a sentence S via ME consists of three steps:

1. Construct the sufficient subset for S to eliminate unnecessary information.
2. Find an entropy-maximizing P based on the reduced basic matrix U⁺ of the sentences in the sufficient subset.
3. Calculate π(S) via the equality (4).

Example 4. Given a knowledge base

B = {A : α_1,
A → B : α_2,
B → C : α_3}

and a target sentence C. It is clear that B = B_suf(C). The reduced basic matrix for the set of sentences in B with the first row being units is

1 1 1 1 1
1 1 0 1 0
1 1 1 0 1
1 0 0 1 1

in which the second row gives the truth values of A, and the third and fourth rows those of A → B and B → C, respectively. Thus, there are five classes of possible worlds w_1, ..., w_5 corresponding to the five column vectors (eliminating the first row).

The components p_i are written in the form

p_i = a_0 a_1^{u_{1i}} a_2^{u_{2i}} a_3^{u_{3i}},

with (a_0, a_1, a_2, a_3) satisfying the system of equations

a_0a_1a_2a_3 + a_0a_1a_2 + a_0a_1a_3 = α_1
a_0a_1a_2a_3 + a_0a_1a_2 + a_0a_2 + a_0a_2a_3 = α_2
a_0a_1a_2a_3 + a_0a_1a_3 + a_0a_2a_3 = α_3
a_0a_1a_2a_3 + a_0a_1a_2 + a_0a_2 + a_0a_1a_3 + a_0a_2a_3 = 1.

Solving yields

a_0 = (1 - α_1)(1 - α_2)(1 - α_3)/[(α_1 + α_2 - 1)(α_2 + α_3 - 1)],
a_1 = (α_1 + α_2 - 1)/(1 - α_1),
a_2 = (α_1 + α_2 - 1)(α_2 + α_3 - 1)/[α_2(1 - α_2)],
a_3 = (α_2 + α_3 - 1)/(1 - α_3).

Thus, the entropy-maximizing P is given by

P = ( (α_1 + α_2 - 1)(α_2 + α_3 - 1)/α_2,
(α_1 + α_2 - 1)(1 - α_3)/α_2,
(1 - α_1)(1 - α_3)/α_2,
1 - α_2,
(1 - α_1)(α_2 + α_3 - 1)/α_2 )^t.
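The closed-form solution for this example can be checked numerically. The formulas coded below are our own re-derivation from the constraint system (they may differ in form from the printed ones), and the values α_1 = .7, α_2 = .8, α_3 = .75 are hypothetical, chosen by us to lie in the admissible region.

```python
q1, q2, q3 = 0.7, 0.8, 0.75   # alpha_1, alpha_2, alpha_3 (illustrative)

a0 = (1 - q1) * (1 - q2) * (1 - q3) / ((q1 + q2 - 1) * (q2 + q3 - 1))
a1 = (q1 + q2 - 1) / (1 - q1)
a2 = (q1 + q2 - 1) * (q2 + q3 - 1) / (q2 * (1 - q2))
a3 = (q2 + q3 - 1) / (1 - q3)

# components p_i = a0 * a1**u1i * a2**u2i * a3**u3i for the five
# classes w1..w5 of the reduced basic matrix
p1 = a0 * a1 * a2 * a3
p2 = a0 * a1 * a2
p3 = a0 * a2
p4 = a0 * a1 * a3
p5 = a0 * a2 * a3

assert abs(p1 + p2 + p3 + p4 + p5 - 1) < 1e-9     # normalization
assert abs(p1 + p2 + p4 - q1) < 1e-9              # pi(A)
assert abs(p1 + p2 + p3 + p5 - q2) < 1e-9         # pi(A -> B)
assert abs(p1 + p4 + p5 - q3) < 1e-9              # pi(B -> C)

pi_C = p1 + (p4 + p5) / 2                          # equality (4)
print(pi_C)
```

All four constraints of the system are reproduced exactly, and the last line applies equality (4) to obtain the point probability of the target C.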

Since C has the single truth value true on w_1 and both truth values in the classes w_4 and w_5 (C is false on w_2 and w_3), the probability of C is then

π(C) = p_1 + (p_4 + p_5)/2.

4. CONCLUSION

This paper has presented a method of layering a knowledge base based on the logical relationship between the sentences of the knowledge base and a target sentence. By means of layers, we can perform approximate reasoning in order to derive an interval value for the sentence. Our approximate method is different from the anytime deduction proposed by Frisch and Haddawy [8]. While ours is based on the process of updating all sentences before deriving an interval value for the target sentence, their anytime deduction is based on a set of rules.


We have also presented a method of calculating the point probabilistic value of a sentence via the Maximum Entropy Principle by not referring to the target sentence when constructing the basic matrix. This method slightly decreases the size of the matrix in the computation process.

We have presented a comparative example between our approximate method and the anytime deduction proposed by Frisch and Haddawy. A complete comparison of this approximate method with the other ones will be a topic of our further work.

Acknowledgement. I am greatly indebted to my supervisor, Prof. Phan Dinh Dieu, for invaluable suggestions.

REFERENCES

[1] K.A. Anderson, Characterizing consistency in probabilistic logic for a class of Horn clauses, Mathematical Programming 66 (1994) 257-271.

[2] F. Bacchus, A.J. Grove, J.Y. Halpern, and D. Koller, From statistical knowledge bases to degrees of belief, Artificial Intelligence 87 (1-2) (1996) 75-143.

[3] P.D. Dieu, On a theory of interval-valued probabilistic logic, Research Report, NCSR Vietnam, Hanoi, 1991.

[4] P.D. Dieu and P.H. Giang, Interval-valued probabilistic logic for logic programs, Journal of Computer Science and Cybernetics 10 (3) (1994) 1-8.

[5] P.D. Dieu and T.D. Que, From a convergence to a reasoning with interval-valued probability, Journal of Computer Science and Cybernetics 13 (3) (1997) 1-9.

[6] R. Fagin, J.Y. Halpern, and N. Megiddo, A logic for reasoning about probabilities, Information and Computation 87 (1990) 78-128.

[7] R. Fagin and J.Y. Halpern, Uncertainty, belief and probability, Computational Intelligence 7 (1991) 160-173.

[8] A.M. Frisch and P. Haddawy, Anytime deduction for probabilistic logic, Artificial Intelligence 69 (1994) 93-122.

[9] R. Kruse, E. Schwecke, and J. Heinsohn, Uncertainty and Vagueness in Knowledge Based Systems, Springer-Verlag, Berlin-Heidelberg, 1991.

[10] R.T. Ng and V.S. Subrahmanian, Probabilistic logic programming, Information and Computation 101 (1992) 150-201.

[11] N.J. Nilsson, Probabilistic logic, Artificial Intelligence 28 (1986) 71-87.

[12] T.D. Que, About semantics of probabilistic logic, submitted to Journal of Computer Science and Cybernetics.

[13] P. Snow, Compressed constraints in probabilistic logic and their revision, Uncertainty in Artificial Intelligence (1991) 386-391.

Received November 15, 1999.

Department of Information Technology,
Posts and Telecommunications Institute of Technology,
Hanoi, Vietnam.
