Tạp chí Tin học và Điều khiển học (Journal of Computer Science and Cybernetics), T.17, S.2 (2001), 27-34

PROBABILISTIC REASONING BASED ON LAYERS OF KNOWLEDGE BASE
TRAN DINH QUE
Abstract. Reasoning in the interval-valued probabilistic logic depends heavily on the basic matrix of truth values of sentences in a knowledge base B and a target sentence S. However, the problem of determining all such consistent truth value assignments for a set of sentences is NP-complete for propositional logic and undecidable for first-order predicate logic.

This paper first presents a method of approximate reasoning in the interval-valued probabilistic logic based on "layers" of a knowledge base. Then we investigate a method of slightly decreasing the complexity of reasoning via the maximum entropy principle in a point-valued probabilistic knowledge base. Such a method is based on the reduced basic matrix constructed from the sentences of the knowledge base without the target sentence.
1. INTRODUCTION

In various approaches to handling uncertain information, the paradigm of probabilistic logic has been widely studied in the community of AI researchers (e.g., [1-13]). The interest in probabilistic logic as a research topic for AI was sparked by Nilsson's paper on probabilistic logic [11].
Probabilistic logic, an integration of logic and probability theory, determines the probability of a sentence by means of a probability distribution on a sample space composed of classes of possible worlds. Each class is defined by means of a tuple of consistent truth values assigned to a set of sentences. Deduction in this logic is then reduced to a linear programming problem. However, the problem of determining all such consistent truth value assignments for a set of sentences is NP-complete for propositional logic and undecidable for first-order logic. There have been a great deal of attempts in the AI community to deal with this drawback (e.g., [1], [8], [10], [13]).
This paper first proposes a method of approximate reasoning based on "layers" of an interval-valued probabilistic knowledge base (iKB). The first layer consists of elements of the iKB whose sentences have some logical relationship with the target sentence. The second one contains elements of the iKB whose sentences have some relationship with sentences in the first layer, and so on. Our inference method is based on the idea that the value of a sentence is calculated directly from its nearest upper layer only. Later we consider the deduction of point-valued probabilistic logic via the Maximum Entropy (ME) principle. Like the deduction from an iKB, ME deduction is also based on the matrix composed of vectors of consistent truth values of the target sentence and the sentences in a point-valued knowledge base (pKB). It is possible to build this deduction on the reduced basic matrix of only the sentences in some layers of the pKB, without the target sentence.
The method of constructing layers from sentences in a knowledge base and a method of approximate reasoning based on them will be presented in the next section. Section 3 presents a method of reducing the size of the basic matrix in point-valued probabilistic reasoning via ME. Our approach is to construct the basic matrix of the sentences in the related layers without referring to the goal sentence. Some conclusions and discussions are presented in Section 4.
2. APPROXIMATE REASONING BASED ON LAYERS OF A KNOWLEDGE BASE
2.1 Entailment problem in probabilistic logic
This section overviews the entailment problem of the interval-valued probabilistic logic [3] and of the point-valued probabilistic logic proposed by Nilsson [11].
Given an iKB

$$B = \{(S_i, I_i) \mid i = 1, \dots, l\},$$

in which $S_i$ $(i = 1, \dots, l)$ are sentences, $I_i$ $(i = 1, \dots, l)$ are subintervals of the unit interval $[0,1]$, and a target sentence $S$. From the set of sentences $\Sigma = \{S_1, \dots, S_l, S_{l+1}\}$ $(S_{l+1} = S)$, it is possible to construct a set of classes of possible worlds. Every class is characterized by a vector of consistent truth values of the sentences in $\Sigma$. In this section, we suppose that $\Omega = \{\omega_1, \dots, \omega_k\}$ is the set of all $\Sigma$-classes of possible worlds and $(u_{1j}, \dots, u_{lj}, u_{l+1,j})^t$ is the column vector of the truth values of the sentences $S_1, \dots, S_l, S_{l+1}$ in the class $\omega_j$.
Let $P = (p_1, \dots, p_k)$ be a probability distribution over the sample space $\Omega$. The truth probability of a sentence $S_i$ is then defined to be the sum of probabilities of the possible world classes in which $S_i$ is true, i.e.,

$$\pi(S_i) = u_{i1}p_1 + \cdots + u_{ik}p_k$$

or

$$\pi(S_i) = \sum_{\omega_j \models S_i} p_j.$$
We can write these equalities in the form of the following matrix equation

$$\Pi = UP,$$

where $\Pi = (\pi(S_1), \dots, \pi(S_l), \pi(S_{l+1}))^t$, $P = (p_1, \dots, p_k)^t$ and $U = (u_{ij})$ $(i = 1, \dots, l+1,\ j = 1, \dots, k)$. The matrix $U$ will be called the basic matrix of $\Sigma$.
The probabilistic entailment problem is then reduced to the linear programming problem of finding

$$\alpha = \min \pi(S), \qquad \beta = \max \pi(S),$$

where

$$\pi(S) = u_{l+1,1}p_1 + \cdots + u_{l+1,k}p_k,$$

subject to the constraints

$$u_{i1}p_1 + \cdots + u_{ik}p_k \in I_i \quad (i = 1, \dots, l),$$
$$\sum_{j=1}^{k} p_j = 1, \qquad p_j \ge 0 \quad (j = 1, \dots, k).$$
We denote the interval $[\alpha, \beta]$ by $F(S, B)$, and write $B \vdash (S, F(S, B))$.
In the special case when $B$ is a point-valued probabilistic knowledge base (pKB), i.e., all the $I_i$ are points $\alpha_i$ in $[0,1]$, the interval constraints become the equalities

$$u_{i1}p_1 + \cdots + u_{ik}p_k = \alpha_i \quad (i = 1, \dots, l),$$

while $\sum_{j=1}^{k} p_j = 1$, $p_j \ge 0$ $(j = 1, \dots, k)$ as before.
However, in general, $F(S, B)$ is not a point value. Some assumption must be added to the constraints to derive a point value for a target sentence. The Maximum Entropy (ME) principle is usually used for such a deduction. We will return to this investigation in Section 3.
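Concretely, the entailment computation can be sketched as a pair of linear programs. The following Python fragment is our own minimal illustration, not from the paper, assuming NumPy and SciPy are available; the helper name `entail` and the two-sentence usage example are hypothetical.

```python
import numpy as np
from scipy.optimize import linprog

def entail(U, intervals):
    """Tightest interval [alpha, beta] for the last row of the basic matrix U.

    U         : (l+1) x k basic matrix; rows 0..l-1 hold the truth values of
                the KB sentences S_1..S_l, the last row those of the target S.
    intervals : list of l pairs (lo_i, hi_i), one per KB sentence.
    """
    k = U.shape[1]
    A_kb = U[:-1, :]
    lo = np.array([iv[0] for iv in intervals])
    hi = np.array([iv[1] for iv in intervals])
    # lo_i <= sum_j u_ij p_j <= hi_i, rewritten as one "<=" system
    A_ub = np.vstack([A_kb, -A_kb])
    b_ub = np.concatenate([hi, -lo])
    A_eq, b_eq = np.ones((1, k)), [1.0]          # sum_j p_j = 1
    c = U[-1, :]                                  # objective: pi(S)
    bounds = [(0, 1)] * k                         # p_j >= 0
    lo_res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=b_eq, bounds=bounds)
    hi_res = linprog(-c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=b_eq, bounds=bounds)
    return lo_res.fun, -hi_res.fun

# Toy iKB {A : [.7,.9], A -> B : [.8,1]} with target B; the four consistent
# classes of {A, A -> B, B} give the columns below.
U = np.array([[1, 1, 0, 0],      # A
              [1, 0, 1, 1],      # A -> B
              [1, 0, 1, 0]])     # B (target)
print(entail(U, [(0.7, 0.9), (0.8, 1.0)]))       # approx. (0.5, 1.0)
```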
2.2 Layers of knowledge base
This subsection is devoted to presenting a procedure for producing layers of a knowledge base. Suppose that $B = \{(S_i, I_i) \mid i = 1, \dots, l\}$ is an iKB, in which the $S_i$ are propositional sentences and the $I_i$ are the interval values of the sentences $S_i$; $S$ is any target sentence whose probability value we would like to calculate.
The reasoning for deriving the probabilistic value of the sentence $S$ from the knowledge base $B$ depends strongly on the basic matrix of truth values of a subset of sentences in $\Sigma' = \{S_1, \dots, S_l\}$ that have some logical relationship with the target sentence. We will characterise this relationship by layering the set of sentences in the knowledge base.
A subset $B'$ of $B$ is sufficient for $S$ if the probabilistic values of $S$ deduced from $B$ and from $B'$ are the same; that is, if $B \vdash (S, I)$ and $B' \vdash (S, I')$ then $I = I'$.
Denote by $\mathrm{atom}(\phi)$ the set of atoms occurring in the sentence $\phi$, and by $\mathrm{atom}(\Phi) = \bigcup_{\phi \in \Phi} \mathrm{atom}(\phi)$ the set of all atoms of the sentences in $\Phi$.

Example 1. $\mathrm{atom}(A \to B \wedge C) = \{A, B, C\}$; $\mathrm{atom}(\{A \wedge B,\ C \to \neg D\}) = \{A, B, C, D\}$.
The following note shows the purpose of introducing the notion of atom. If $B'$ is a subset of $B$ such that

$$\mathrm{atom}(B' \cup \{S\}) \cap \mathrm{atom}(B - B') = \emptyset,$$

then $B'$ is sufficient for $S$.
We now consider a procedure for producing layers of a knowledge base based on the logical dependence between its sentences and the sentence $S$. Layers of sentences in $\Sigma$ are constructed recursively as follows:

$$L_0^S = \{S\},$$
$$L_1^S = \{\phi \mid \phi \in \Sigma,\ \phi \notin L_0^S \text{ and } \mathrm{atom}(\phi) \cap \mathrm{atom}(L_0^S) \neq \emptyset\},$$
$$L_2^S = \{\phi \mid \phi \in \Sigma,\ \phi \notin \textstyle\bigcup_{i=0}^{1} L_i^S \text{ and } \mathrm{atom}(\phi) \cap \mathrm{atom}(L_1^S) \neq \emptyset\},$$
$$\dots$$
$$L_n^S = \{\phi \mid \phi \in \Sigma,\ \phi \notin \textstyle\bigcup_{i=0}^{n-1} L_i^S \text{ and } \mathrm{atom}(\phi) \cap \mathrm{atom}(L_{n-1}^S) \neq \emptyset\},$$
$$\dots$$
With respect to each $L_n^S$, let

$$B_n^S = \{(S_i, I_i) \in B \mid S_i \in L_n^S\}.$$

Note that if $S \notin \Sigma'$, then $B_0^S = \{(S, [0,1])\}$; otherwise $B_0^S = \{(S, I_S) \mid (S, I_S) \in B\}$.
We call the subset $B_n^S$ the $n$th layer of the knowledge base $B$ w.r.t. $S$. If $\phi \in L_i^S$, the layer $B_{i+1}^S$ is called the nearest upper layer of the sentence $\phi$.

It is easy to see that there always exists a number $n_0$ such that $L_{n_0}^S \neq \emptyset$ but $L_{n_0+1}^S = \emptyset$. We denote

$$B_{\mathrm{suf}(S)} = \bigcup_{i=0}^{n_0} B_i^S.$$

It is clear that $B_{\mathrm{suf}(S)}$ is a sufficient subset for $S$.
Consider the following illustrative example.
Example 2. Given a knowledge base

$$B = \{B \to A : [.9, 1],\ D \to B : [.8, .9],\ A \wedge C : [.6, .8],\ D : [.8, 1],\ C : [.2, .7]\}$$

and a target sentence $A$.
The knowledge base can be layered into subsets with respect to the target sentence $A$:

$$L_0^A = \{A\}, \qquad B_0^A = \{A : [0,1]\},$$
$$L_1^A = \{B \to A,\ A \wedge C\}, \qquad B_1^A = \{B \to A : [.9,1],\ A \wedge C : [.6,.8]\},$$
$$L_2^A = \{D \to B,\ C\}, \qquad B_2^A = \{D \to B : [.8,.9],\ C : [.2,.7]\},$$
$$L_3^A = \{D\}, \qquad B_3^A = \{D : [.8,1]\}.$$

Thus, the sufficient subset for $A$ is $B_{\mathrm{suf}(A)} = B$.
Similarly, layering can be performed for a point-valued probabilistic knowledge base.
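The recursive construction of layers is straightforward to implement. The sketch below is our own illustration, assuming sentences come with an `atoms` extractor; the toy uppercase-letter lambda used to replay Example 2 stands in for a real parser.

```python
def layers(kb_sentences, target, atoms):
    """Split KB sentences into layers L_1, ..., L_n0 w.r.t. the target.

    kb_sentences : iterable of sentences (any hashable representation)
    target       : the target sentence S (L_0 = {S} is implicit)
    atoms        : function mapping a sentence to its set of atoms
    """
    remaining = set(kb_sentences)
    frontier = atoms(target)                    # atom(L_0)
    result = []
    while True:
        layer = {s for s in remaining if atoms(s) & frontier}
        if not layer:                           # L_{n0+1} is empty: stop
            break
        result.append(layer)
        remaining -= layer
        frontier = set().union(*(atoms(s) for s in layer))
    return result

# Replaying Example 2 with sentences as strings:
atom = lambda s: {c for c in s if c.isupper()}
kb = ["B->A", "D->B", "A&C", "D", "C"]
print(layers(kb, "A", atom))
# e.g. [{'B->A', 'A&C'}, {'D->B', 'C'}, {'D'}]
```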
When a knowledge base is large, it is not easy to derive the smallest interval value for a target sentence $S$ from $B_{\mathrm{suf}(S)}$. Layers give us a method of calculating an approximate value. The idea of approximate reasoning is that the probabilistic value of each sentence is updated by deriving its value from the nearest upper layer of this sentence. When all sentences of the nearest upper layer of the target sentence have been updated, its value is then calculated. We now formalise the above presentation.
Without loss of generality, we suppose that $B$ is a sufficient knowledge base and $S$ is a target sentence. It is layered into subsets $B_0^S, B_1^S, \dots, B_{n_0}^S$, where $B_{n_0}^S$ is the highest layer in the knowledge base. Recall that the $L_i^S$ $(i = 1, \dots, n_0)$ are the subsets of sentences w.r.t. $S$.

The update of a sentence $\phi$ is recursively defined as follows:

(i) every $\phi \in L_{n_0}^S$ is updated;

(ii) $\phi \in L_i^S$ $(i < n_0)$ is updated if all $\psi \in L_{i+1}^S$ are updated and $B_{(i+1,u)}^S \vdash (\phi, I_{\phi,u})$, where $B_{(i+1,u)}^S$ is the updated layer of $B_{i+1}^S$.

If $B_1^S$ is updated into $B_{(1,u)}^S$ and $B_{(1,u)}^S \vdash (S, I_S)$, then $I_S$ is the approximate value for $S$.
Thus, the approximate calculation of the interval value for a sentence consists of three steps:

1. Divide the knowledge base into layers, with the lowest layer being the target sentence $S$.

2. Update the values for the sentences of $B_{i-1}^S$ from the nearest upper layer $B_i^S$. This process starts from $i = n_0$ and continues until $B_1^S$ is updated into $B_{(1,u)}^S$.

3. Calculate the value for $S$ from $B_{(1,u)}^S$.

A code sketch of this update pass is given below.
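This is a minimal sketch of the three-step procedure, our own illustration: it assumes the `entail` routine from Subsection 2.1 and a hypothetical `basic_matrix` helper that enumerates the consistent truth-value classes for a list of sentences, with the last row corresponding to the last sentence. Intersecting each derived interval with the sentence's original one mirrors the behaviour of Example 3 below.

```python
def approximate_value(layer_kbs, target, entail, basic_matrix):
    """Approximate interval for `target` via the layered update pass.

    layer_kbs : [B_1, ..., B_n0], each a dict {sentence: (lo, hi)},
                with B_1 nearest the target and B_n0 the highest layer.
    """
    updated = dict(layer_kbs[-1])               # the highest layer is stable
    for layer in reversed(layer_kbs[:-1]):      # from B_{n0-1} down to B_1
        new_layer = {}
        for phi, (lo, hi) in layer.items():
            upper = list(updated)
            u = basic_matrix(upper + [phi])     # derive phi from its upper layer
            d_lo, d_hi = entail(u, [updated[s] for s in upper])
            # intersect with phi's original interval
            new_layer[phi] = (max(lo, d_lo), min(hi, d_hi))
        updated = new_layer
    u = basic_matrix(list(updated) + [target])  # finally derive the target
    return entail(u, list(updated.values()))
```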
Example 3 (continued). In Example 2 we constructed the layers of the knowledge base. If we based the computation on the whole of $B_{\mathrm{suf}(A)}$, it would be necessary to build a $6 \times 14$ basic matrix of 6 rows and 14 columns. Instead, it is possible to calculate the value for $A$ by the above approximate method.

In the process of updating, $D \to B$ and $B \to A$ are stable, i.e., their values remain $[.8,.9]$ and $[.9,1]$, respectively. Since the value of $C$ is $[.2,.7]$, $A \wedge C$ is updated to $[.6,.7]$. Thus, a value of $A$ is deduced from the first updated layer

$$B_{(1,u)}^A = \{B \to A : [.9,1],\ A \wedge C : [.6,.7]\}.$$
The basic matrix for the sentences $\Sigma = \{B \to A,\ A \wedge C,\ A\}$ is

$$U = \begin{pmatrix} 1 & 1 & 1 & 0 \\ 1 & 0 & 0 & 0 \\ 1 & 1 & 0 & 0 \end{pmatrix}.$$

We need to compute

$$\alpha = \min\,(p_1 + p_2), \qquad \beta = \max\,(p_1 + p_2)$$

on the domain determined by

$$.9 \le p_1 + p_2 + p_3 \le 1,$$
$$.6 \le p_1 \le .7,$$
$$p_1 + p_2 + p_3 + p_4 = 1, \qquad p_j \ge 0.$$

The value of $A$ is then $[.6, 1]$.
We now compare the computed value with the value derived from the anytime deduction proposed by Frisch and Haddawy [8]. Anytime deduction is based on a set of thirty-two rules enumerated from (i) to (xxxii). In the above example, applying (xx) first to $D : [.8,1]$ and $D \to B : [.8,.9]$ yields $B : [.6,.9]$, and applying (xx) again to this result and $B \to A : [.9,1]$ yields $A : [.5,1]$. In the same way, combining $C : [.2,.7]$ and $A : [0,1]$ via the rule (xxv) gives $A \wedge C : [0,.7]$, and then with $A \wedge C : [.6,.8]$ via (xvii) gives $A \wedge C : [.6,.7]$; applying (xxvi) to this result yields $A : [.6,1]$. Applying (xvii) to the two ways of computing $A$, we have $A : [.6,1]$. The derived interval equals the interval value of $A$ deduced by our method of approximate reasoning.
3. MAXIMUM ENTROPY DEDUCTION WITH THE REDUCED BASIC MATRIX

In this section, we investigate a method of reducing the complexity of computation in applying the Maximum Entropy Principle for deriving a point value for a sentence from a point-valued probabilistic knowledge base.
3.1 Maximum Entropy Deduction
We first review the technique named the Maximum Entropy Principle [11] for selecting a probability distribution among the distributions satisfying the initial conditions given by a knowledge base.
Suppose that

$$B = \{(S_i, \alpha_i) \mid i = 1, \dots, l\}$$

is a pKB and $S$ is a sentence ($S \neq S_i$, $i = 1, \dots, l$). As presented in Section 2, denote by $F(S, B)$ the set of values of $\pi(S) = \sum_{\omega_i \models S} p_i = u_{l+1,1}p_1 + \cdots + u_{l+1,k}p_k$, where $P = (p_1, \dots, p_k)$ varies in the domain defined by the conditional equation

$$U^{+}P = \Pi, \qquad (1)$$

where $\Pi = (1, \alpha_1, \dots, \alpha_l)^t$ and $U^{+}$ is the basic matrix composed of the columns of truth values of the sentences $S_1, \dots, S_l, S_{l+1}$ $(S_{l+1} = S)$, with the first row being units; equation (1) involves the first $l+1$ rows of $U^{+}$.
According to the Maximum Entropy Principle, in order to obtain a single value for $S$, we must select a distribution $P$ solving the following optimization problem:

$$\max H(P) = -\sum_{j=1}^{k} p_j \ln p_j, \qquad (2)$$

where $P$ is subject to the constraints determined by the conditional equation (1).
Suppose that $(p_1, \dots, p_k)$ is a solution of the above problem. Then the probability of $S$ is

$$F(S, B) = u_{l+1,1}p_1 + \cdots + u_{l+1,k}p_k.$$

The entropy-maximizing solution has the product form

$$p_j = a_0 \prod_{i=1}^{l} a_i^{u_{ij}} \quad (j = 1, \dots, k), \qquad (3)$$

where $(u_{1j}, \dots, u_{lj})$ is the $j$th column of $U^{+}$ restricted to the rows of $S_1, \dots, S_l$.
From the initial conditions of the knowledge base, we can compute the $a_i$, and then the $p_j$. Thus the point probability value of $S$ is derived. We call the deduction based on the Maximum Entropy Principle the Maximum Entropy deduction, or shortly ME deduction.
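A numerically naive but concrete illustration of ME deduction (ours, not the coefficient method developed below) simply maximizes the entropy under the constraints (1) with a general-purpose solver, assuming SciPy; the two-sentence pKB in the usage lines is made up.

```python
import numpy as np
from scipy.optimize import minimize

def me_distribution(U_plus, pi):
    """Entropy-maximizing P subject to U_plus @ P = pi, cf. (1) and (2).

    U_plus : (l+1) x k matrix; first row all ones, the remaining rows the
             truth values of the KB sentences S_1..S_l on the classes.
    pi     : vector (1, alpha_1, ..., alpha_l).
    """
    k = U_plus.shape[1]
    neg_entropy = lambda p: np.sum(p * np.log(np.clip(p, 1e-12, 1.0)))
    cons = {"type": "eq", "fun": lambda p: U_plus @ p - pi}
    res = minimize(neg_entropy, np.full(k, 1.0 / k), method="SLSQP",
                   bounds=[(0.0, 1.0)] * k, constraints=[cons])
    return res.x

# pKB = {A : 0.7, A -> B : 0.9}: three consistent classes of {A, A -> B}
U = np.array([[1.0, 1.0, 1.0],     # units row
              [1.0, 1.0, 0.0],     # A
              [1.0, 0.0, 1.0]])    # A -> B
print(me_distribution(U, np.array([1.0, 0.7, 0.9])).round(3))  # ~ [0.6 0.1 0.3]
```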
3.2 Maximum Entropy Deduction with the Reduced Basic Matrix
As presented above, ME deduction is based on the basic matrix constructed from the target sentence and all sentences in the initial knowledge base. The larger the basic matrix is, the more complex the computation is. In fact, the coefficients $a_i$ in (3) are only related to the matrix of truth values of the sentences in the knowledge base. The complexity is therefore slightly decreased if ME deduction is based on the basic matrix constructed only from the sentences of the knowledge base, without the target sentence.
As presented in Subsection 2.2, the probabilistic inference only depends on the sufficient subset for the target sentence. Without loss of generality, we suppose that $B = B_{\mathrm{suf}(S)}$, that $\Omega = \{\omega_1, \dots, \omega_k\}$ is the set of possible world classes determined by $\Sigma' = \{S_1, \dots, S_l\}$, and that $U^{+}$ is the reduced basic matrix of the sentences in $\Sigma'$ with the first row being units.

In each class $\omega_i$, $S$ can have either one truth value (true or false) or both truth values true and false. For ease of presentation, we suppose that on the classes $\omega_1, \dots, \omega_m$ the sentence $S$ gets one truth value, and on $\omega_{m+1}, \dots, \omega_k$, $S$ has both values true and false. Thus, the extended set of possible world classes w.r.t. $\Sigma' \cup \{S\}$ has the form

$$\Omega^{+} = F \cup E,$$

where $F = \{\omega_1, \dots, \omega_m\}$ and $E = \{\omega_{m+1}^{+}, \omega_{m+1}^{-}, \dots, \omega_k^{+}, \omega_k^{-}\}$. We have the following proposition.
Proposition 1. Suppose that $P$ is a probability distribution satisfying the ME principle on $\Omega$. We have

$$\pi(S) = \sum_{\omega_i \models S,\ 1 \le i \le m} p_i + \frac{1}{2} \sum_{m+1 \le i \le k} p_i. \qquad (4)$$
Proof. Suppose that $P^{+} = (p_1^{+}, \dots, p_m^{+}, p_{m+1}^{+}, p_{m+1}^{-}, \dots, p_k^{+}, p_k^{-})$ is the probability distribution on $\Omega^{+}$ satisfying ME and (1). Since $\omega_i^{+}$ and $\omega_i^{-}$ agree on the truth values of all sentences in $\Sigma'$, the product form (3) assigns them equal probabilities; hence, according to the method of constructing this distribution, we have

$$p_{m+1}^{+} = p_{m+1}^{-}, \quad \dots, \quad p_k^{+} = p_k^{-}.$$

Therefore, if $P = (p_1, \dots, p_m, p_{m+1}, \dots, p_k)$ is the probability distribution on $\Omega$ satisfying (1) and ME, then

$$p_i = p_i^{+} \quad (i = 1, \dots, m),$$
$$p_i = 2p_i^{+} \quad (i \ge m+1).$$

It is easy to derive (4) from these equalities. The proposition is proved.
In summary, the computation of the point value for a sentence $S$ via ME consists of three steps:

1. Construct the sufficient subset for $S$ to eliminate unnecessary information.

2. Find an entropy-maximizing $P$ based on the reduced basic matrix $U^{+}$ of the sentences in the sufficient subset.

3. Calculate $\pi(S)$ via the equality (4).

A sketch of the last step is given below.
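Step 3 is a direct reading of equality (4); the following small sketch is our illustration, where each class of the reduced sample space is labelled according to whether $S$ is true, false, or takes both values on it:

```python
def target_probability(p, status):
    """pi(S) from the reduced ME distribution via equality (4).

    p      : ME probabilities of the reduced classes w_1, ..., w_k
    status : per-class label for S: "true", "false", or "both"
    """
    return (sum(pi for pi, s in zip(p, status) if s == "true")
            + 0.5 * sum(pi for pi, s in zip(p, status) if s == "both"))
```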
Example 4. Given a knowledge base

$$B = \{A : \alpha_1,\ A \to B : \alpha_2,\ B \to C : \alpha_3\}$$

and a target sentence $C$. It is clear that $B = B_{\mathrm{suf}(C)}$. The reduced basic matrix for the set of sentences in $B$, with the first row being units, is
$$U^{+} = \begin{pmatrix} 1 & 1 & 1 & 1 & 1 \\ 1 & 1 & 0 & 1 & 0 \\ 1 & 1 & 1 & 0 & 1 \\ 1 & 0 & 0 & 1 & 1 \end{pmatrix},$$

in which the second row contains the truth values of $A$, and the third and fourth ones those of $A \to B$ and $B \to C$, respectively. Thus, there are five classes of possible worlds $\omega_1, \dots, \omega_5$ corresponding to the five column vectors (eliminating the first row).
The components $p_j$ are written in the form (3):

$$p_1 = a_0 a_1 a_2 a_3, \quad p_2 = a_0 a_1 a_2, \quad p_3 = a_0 a_2, \quad p_4 = a_0 a_1 a_3, \quad p_5 = a_0 a_2 a_3,$$

with $(a_0, a_1, a_2, a_3)$ satisfying the system of equations

$$a_0 a_1 a_2 a_3 + a_0 a_1 a_2 + a_0 a_1 a_3 = \alpha_1,$$
$$a_0 a_1 a_2 a_3 + a_0 a_1 a_2 + a_0 a_2 + a_0 a_2 a_3 = \alpha_2,$$
$$a_0 a_1 a_2 a_3 + a_0 a_1 a_3 + a_0 a_2 a_3 = \alpha_3,$$
$$a_0 a_1 a_2 a_3 + a_0 a_1 a_2 + a_0 a_2 + a_0 a_1 a_3 + a_0 a_2 a_3 = 1.$$
Solving yields

$$a_0 = \frac{(1 - \alpha_1)(1 - \alpha_2)(1 - \alpha_3)}{(\alpha_1 + \alpha_2 - 1)(\alpha_2 + \alpha_3 - 1)}, \qquad a_1 = \frac{\alpha_1 + \alpha_2 - 1}{1 - \alpha_1},$$
$$a_2 = \frac{(\alpha_1 + \alpha_2 - 1)(\alpha_2 + \alpha_3 - 1)}{\alpha_2 (1 - \alpha_2)}, \qquad a_3 = \frac{\alpha_2 + \alpha_3 - 1}{1 - \alpha_3}.$$

Thus, the entropy-maximizing $P$ is given by

$$P = \frac{1}{\alpha_2} \begin{pmatrix} (\alpha_1 + \alpha_2 - 1)(\alpha_2 + \alpha_3 - 1) \\ (\alpha_1 + \alpha_2 - 1)(1 - \alpha_3) \\ (1 - \alpha_1)(1 - \alpha_3) \\ \alpha_2 (1 - \alpha_2) \\ (1 - \alpha_1)(\alpha_2 + \alpha_3 - 1) \end{pmatrix}.$$
Since $C$ has one truth value (true) on $\omega_1$, two truth values on the classes $\omega_4$ and $\omega_5$, and the value false on $\omega_2$ and $\omega_3$, the probability of $C$ is then, by (4),

$$\pi(C) = p_1 + \frac{1}{2}(p_4 + p_5) = \frac{(\alpha_1 + \alpha_2 - 1)(\alpha_2 + \alpha_3 - 1)}{\alpha_2} + \frac{1 - \alpha_2}{2} + \frac{(1 - \alpha_1)(\alpha_2 + \alpha_3 - 1)}{2\alpha_2}.$$
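As a numeric sanity check of these formulas, with made-up values $\alpha_1 = .8$, $\alpha_2 = .9$, $\alpha_3 = .8$, and reusing the `me_distribution` and `target_probability` sketches from earlier in this section:

```python
import numpy as np

al1, al2, al3 = 0.8, 0.9, 0.8
# closed-form ME solution of the system above
p = np.array([(al1 + al2 - 1) * (al2 + al3 - 1),
              (al1 + al2 - 1) * (1 - al3),
              (1 - al1) * (1 - al3),
              al2 * (1 - al2),
              (1 - al1) * (al2 + al3 - 1)]) / al2
print(p.round(4))                    # [0.5444 0.1556 0.0444 0.1    0.1556]
print(p[0] + (p[3] + p[4]) / 2)      # pi(C) ~ 0.6722

# the numeric optimizer agrees up to solver tolerance:
U = np.array([[1, 1, 1, 1, 1],       # units row
              [1, 1, 0, 1, 0],       # A
              [1, 1, 1, 0, 1],       # A -> B
              [1, 0, 0, 1, 1]],      # B -> C
             dtype=float)
q = me_distribution(U, np.array([1.0, al1, al2, al3]))
print(target_probability(q, ["true", "false", "false", "both", "both"]))
```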
4. CONCLUSION
This paper has presented a method of layering a knowledge base based on the logical relationship between the sentences of the knowledge base and a target sentence. By means of layers, we can perform approximate reasoning in order to derive an interval value for the sentence. Our approximate method is different from the anytime deduction proposed by Frisch and Haddawy [8]: while ours is based on the process of updating all sentences before deriving an interval value for the target sentence, their anytime deduction is based on a set of rules.
Trang 8We have also presented a method of calculating the point probabilistic value of a sentence via
the Maximum Entropy Principle by not referring to the target sentence when constructing the basic
matrix This method slightly decreases the size of the matrix in the computation process
We have also given a comparative example between our approximate method and the anytime deduction proposed by Frisch and Haddawy. A complete comparison of this approximate method with the other ones will be a topic of our further work.
Acknowledgement. I am greatly indebted to my supervisor, Prof. Phan Dinh Dieu, for invaluable suggestions.
REFERENCES
[1] K. A. Anderson, Characterizing consistency in probabilistic logic for a class of Horn clauses, Mathematical Programming 66 (1994) 257-271.

[2] F. Bacchus, A. J. Grove, J. Y. Halpern, and D. Koller, From statistical knowledge bases to degrees of belief, Artificial Intelligence 87 (1-2) (1996) 75-143.

[3] P. D. Dieu, On a theory of interval-valued probabilistic logic, Research Report, NCSR Vietnam, Hanoi, 1991.

[4] P. D. Dieu and P. H. Giang, Interval-valued probabilistic logic for logic programs, Journal of Computer Science and Cybernetics 10 (3) (1994) 1-8.

[5] P. D. Dieu and T. D. Que, From a convergence to a reasoning with interval-valued probability, Journal of Computer Science and Cybernetics 13 (3) (1997) 1-9.

[6] R. Fagin, J. Y. Halpern, and N. Megiddo, A logic for reasoning about probabilities, Information and Computation 87 (1990) 78-128.

[7] R. Fagin and J. Y. Halpern, Uncertainty, belief and probability, Computational Intelligence 7 (1991) 160-173.

[8] A. M. Frisch and P. Haddawy, Anytime deduction for probabilistic logic, Artificial Intelligence 69 (1994) 93-122.

[9] R. Kruse, E. Schwecke, and J. Heinsohn, Uncertainty and Vagueness in Knowledge Based Systems, Springer-Verlag, Berlin - Heidelberg, 1991.

[10] R. T. Ng and V. S. Subrahmanian, Probabilistic logic programming, Information and Computation 101 (1992) 150-201.

[11] N. J. Nilsson, Probabilistic logic, Artificial Intelligence 28 (1986) 71-87.

[12] T. D. Que, About semantics of probabilistic logic, submitted to Journal of Computer Science and Cybernetics.

[13] P. Snow, Compressed constraints in probabilistic logic and their revision, Proceedings of the Conference on Uncertainty in Artificial Intelligence (1991) 386-391.
Received November 1999.
Department of Information Technology,
Posts and Telecommunications Institute of Technology,
Hanoi, Vietnam.