International Journal of Uncertainty, Fuzziness and Knowledge-Based Systems
Vol 19, No 2 (2011) 275−305
© World Scientific Publishing Company
DOI: 10.1142/S0218488511007003
UNCERTAIN AND FUZZY OBJECT BASES: A DATA MODEL
AND ALGEBRAIC OPERATIONS
TRU H CAO
Ho Chi Minh City University of Technology and John von Neumann Institute VNU-HCM,
268 Ly Thuong Kiet Street, District 10, Ho Chi Minh City, Vietnam
tru@cse.hcmut.edu.vn
HOA NGUYEN
Faculty of Information Technology, Ho Chi Minh City Open University,
97 Vo Van Tan Street, District 3, Ho Chi Minh City, Vietnam
hoa-hanh@hcm.vnn.vn
Received 20 January 2010 Revised 3 January 2011
Fuzzy set theory and probability theory are complementary for soft computing, in particular in object-oriented systems with imprecise and uncertain object properties. However, current fuzzy object-oriented data models are mainly based on fuzzy set theory or possibility theory, and lack a rigorous algebra for querying and managing uncertain and fuzzy object bases. In this paper, we develop an object base model that incorporates both fuzzy set values and probability degrees to handle imprecision and uncertainty. A probabilistic interpretation of relations on fuzzy sets is introduced as a formal basis to coherently unify the two types of measures into a common framework. The model accommodates both class attributes, representing declarative object properties, and class methods, representing procedural object properties. Two levels of property uncertainty are taken into account: value uncertainty of a definite property, and applicability uncertainty of the property itself. The syntax and semantics of the selection and other main data operations on the proposed object base model are formally defined as a full-fledged algebra.
1 Introduction
It is undeniable that the real world is pervaded by uncertain and imprecise information that we have to face, and make decisions on, in daily life. Among the foundations for computer systems to deal with uncertainty and imprecision, probability theory and fuzzy set theory are major ones and complementary to each other. Moreover, real-world problems are also challenging because of their large scale in practice. For handling scale, the object-oriented methodology has proved to be a key approach to data modeling and to system design and implementation. In particular, there has been intensive research and development of fuzzy and probabilistic object-oriented databases, as collectively reported in Refs. 1−4. Surveying research on extending the classical object-oriented data model to deal with uncertainty and imprecision, we identify the following key issues: (1) modeling of partial subclass relationships; (2) definition of partial class membership; (3) representation of uncertain and/or imprecise attribute values; (4) representation and execution of class methods; (5) expression of partial applicability of class properties; and (6) mechanism for inheritance under uncertainty and imprecision. We discuss them in detail in the following.
For the first issue, in the classical object-oriented model, a class hierarchy defines the subclass relation on classes, whereby a class is totally included in any of its super-classes. However, in the probabilistic and fuzzy cases, due to the uncertain applicability of class properties or the imprecision of attribute value ranges, the inclusion between classes naturally becomes graded, and could be computed on the basis of the value ranges of their common attributes (Refs. 5 and 6). As discussed in Ref. 7, a set of classes with a graded inclusion or inheritance relation actually forms a network rather than a hierarchy, because if a class A has some inclusion degree into a class B based on a fuzzy matching of their descriptions, then B usually also has some inclusion degree into A. Moreover, in practice, it is more natural to classify a concept into sub-concepts that are totally subsumed by it than to think of overlapping between a concept and its sub-concepts, though the sub-concepts can overlap each other, as assumed in Ref. 8 for instance.
For the second issue, when attribute values of an object are uncertain and imprecise, its matching degree with a class becomes graded, and different measures have been proposed. In Ref. 9, for instance, a membership function on a set of objects was defined for each class. In Ref. 10, linguistic labels were used to express the strength of the link of an object to a class. In Ref. 8, membership was defined as similarity degrees between objects and classes. In Ref. 11, different measures were mentioned, including a probabilistic one, for membership degrees. Nevertheless, for the soundness of using measures of different meanings, such as possibilistic and probabilistic ones, it remains to be answered how those measures are integrated coherently on a common ground.
For the third issue, many works on fuzzy object-oriented data models did not rely on probability theory, but used fuzzy sets or possibility distributions to represent imprecise attribute values. The works in Refs. 10 and 11 also modeled uncertainty degrees for an attribute having a particular value. However, much less attention was given to uncertainty over a set of values of an attribute, and to a foundation for combining probability degrees and fuzzy sets in the same model.
For the fourth issue, while class methods are common in classical object-oriented systems for modeling object behaviors and parameterized properties, they were often neglected in uncertain and fuzzy extended models. In Refs. 8 and 11, methods were not considered. In Ref. 10, methods were mentioned, but no formal representation or explicit manipulation was provided in the model. In Refs. 7 and 9, which concerned declarative and deductive models, in contrast to imperative and procedural ones, methods were formally defined as Horn clauses and executed as in a theorem proving process.
For the fifth issue, Ref. 12 introduced the notion of fuzzy property as an intermediate between the two extreme notions of required property and optional property, each of which was associated with a possibility degree of applicability of the property to the class. Meanwhile, Ref. 8 assumed that each property of a concept has a probability degree for its occurring in exemplars of that concept. Those are two typical works that model partial applicability of a property to a class of objects by possibility and probability degrees, respectively. We note the distinction between the notion of uncertain property values and that of uncertain property applicability. In the former case, a class or an object surely has a particular property, but it is not sure which one among a given set of values the property takes. Meanwhile, in the latter, it is not even sure if the class or the object has that property. For example, “John owns a car whose brand is probably BMW” and “John probably owns a car” express different levels of uncertainty. In Refs. 7, 10, and 11, the two levels were mixed.
For the sixth issue, due to uncertain class membership and uncertain property applicability, inheritance of a class property by an object naturally becomes uncertain. Uncertain inheritance was not considered in Refs. 8, 9, and 10. In Ref. 11, class membership degrees were used as thresholds to determine which properties in a class would be inherited with respect to an uncertainty degree. In Ref. 7, both membership of an object into a class and applicability of a property to the class were represented by probability intervals and combined into a support pair for the object to inherit the property.
For recent works, Ref. 13 reviewed existing proposals and presented recommendations for the application of fuzzy set theory in a flexible generalized object model. Further, Ref. 14 focused on representing data as constraints on object attributes and answering queries as constraint satisfaction. Meanwhile, for realization of fuzzy object-oriented data models, Ref. 15 was concerned with implementation of their model on an existing platform. In Ref. 16, Fril++ was developed as a fuzzy object-oriented logic programming language. The literature review of fuzzy relational and object-oriented databases in Ref. 17 missed those modeling uncertainty with probability theory.
A common disadvantage of current fuzzy object-oriented models is that they lack a rigorous algebra for querying and managing object bases. In contrast, Ref. 18 introduced a probabilistic model to handle object bases with uncertainty, called POB (Probabilistic Object Base), and developed a full-fledged algebra for it. However, the major shortcomings of the POB model are: (1) it does not allow imprecise attribute values; (2) it does not support class methods; and (3) it does not consider uncertain applicability of class properties. To overcome the first two shortcomings, Refs. 19 and 20 in turn extended POB to an uncertain and fuzzy object base model (UFOB) with class attributes and methods whose values could be fuzzy sets. This paper extends UFOB further with class properties whose applicability to the class objects could also be uncertain. This requires, and results in, a new algebra of operations on uncertain and fuzzy object bases, in which the previous definitions are extended accordingly.
Next, to keep the paper self-contained, Section 2 recalls the probabilistic interpretation of relations on fuzzy sets and the algebra on fuzzy-probabilistic triples introduced in Refs. 19 and 20, as a basis to integrate fuzzy set values into the probability-based framework of POB. Section 3 describes properties of objects in UFOB, which can be imprecise attribute values or computational methods, and can be uncertainly applicable to classes. Section 4 presents the notion of instances and the inheritance mechanism under uncertainty and imprecision in object bases. Sections 5 and 6 define the selection operation and other algebraic operations on the proposed object base model. The definitions in Secs. 3 to 6 are extensions of the corresponding ones in Refs. 19 and 20, for modeling and computing with uncertain applicability of class properties. Finally, Sec. 7 summarizes and concludes the paper.
2 Fuzzy Sets and Probability
2.1 Probabilistic interpretation of fuzzy set relations
In this work, to extend the probabilistic model of POBs with fuzzy set values, we apply the voting model interpretation of fuzzy sets (Refs. 21 and 22). That is, given a fuzzy set A on a domain U, this model defines a mass assignment m_A (i.e., a probability distribution) on the power set of U, where the mass (i.e., probability value) assigned to a subset of U is the proportion of voters who have that subset as a crisp definition for the fuzzy concept A.
Example 1: Let us take the dice example in Ref. 22. Given the dice values from the set {1, 2, 3, 4, 5, 6}, suppose that a score high is defined by the discrete fuzzy set {3:0.2, 4:0.5, 5:0.9, 6:1}, i.e., the membership of value 3 is 0.2, and so on. The voting pattern for a group of 10 persons for this score could be as in Table 1.
Table 1. Voting pattern for high dice values (voters versus scores)
That is, all voters, P1 to P10, vote for value 6 as a high score, while only two of them, P1 and P2, vote for 3 as a high score, and so on. In other words, the crisp definition of P10 for the high score is {6}, while that of P1 and P2 is {3, 4, 5, 6}, for instance. An assumption made in this voting model is that any person who accepts a value as a high score also accepts all values that have higher membership in the fuzzy set high.
This model defines the following mass assignment on the power set of {1, 2, 3, 4, 5, 6}:
{6}: 0.1    {5, 6}: 0.4    {4, 5, 6}: 0.3    {3, 4, 5, 6}: 0.2
where the mass assigned to a subset of {1, 2, 3, 4, 5, 6} (e.g., m_high({5, 6}) = 0.4) is the proportion of voters who have that subset as a crisp definition for the fuzzy concept high score. This mass assignment corresponds to a family of probability distributions on {1, 2, 3, 4, 5, 6}.
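The mass assignment of Example 1 can be reproduced mechanically from the membership degrees. The following is a minimal sketch, assuming a normal discrete fuzzy set (maximum membership 1) and the nested-acceptance assumption of the voting model; the function name `mass_assignment` is illustrative and not part of the paper.

```python
from fractions import Fraction

def mass_assignment(fuzzy_set):
    """Map a discrete fuzzy set {value: membership} to {focal set: mass}.

    Assumes the fuzzy set is normal (some value has membership 1).
    """
    # Distinct positive membership levels, highest first; 0 closes the last gap.
    levels = sorted({mu for mu in fuzzy_set.values() if mu > 0}, reverse=True)
    levels.append(Fraction(0))
    masses = {}
    for level, next_level in zip(levels, levels[1:]):
        # Voters at this level accept every value whose membership is >= level.
        focal = frozenset(v for v, mu in fuzzy_set.items() if mu >= level)
        # Their proportion is the gap down to the next lower level.
        masses[focal] = level - next_level
    return masses

high = {3: Fraction(2, 10), 4: Fraction(5, 10), 5: Fraction(9, 10), 6: Fraction(1)}
for focal, mass in mass_assignment(high).items():
    print(sorted(focal), float(mass))
```

Exact fractions are used so that the masses sum to exactly 1 without floating-point drift.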
On the basis of this voting model we introduce a probabilistic interpretation of the following binary relations on fuzzy sets. We write Pr(e1 | e2) to denote the conditional probability of e1 given e2.
Definition 1. Let A be a fuzzy set on a domain U, B be a fuzzy set on a domain V, and θ be a binary relation from {=, ≤, <, ⊆, ∈} assumed to be valid on (U × V). The probabilistic interpretation of a relation A θ B, denoted by prob(A θ B), is a value in [0, 1] that is defined by
prob(A θ B) = ∑_{S⊆U, T⊆V} Pr(u θ v | u∈S, v∈T) · m_A(S) · m_B(T).
Intuitively, given fuzzy propositions “x is A” and “y is B”, prob(A θ B) is the probability for x θ y being true. For a relation A θ B, if θ ∈ {=, ≤, <, ⊆}, then A and B are presumed to be on the same domain, or compatible ones; if θ is ∈, then B’s domain is to be the power set of A’s domain.
Example 2: In the dice example above, suppose that about_5 is defined by the fuzzy set {6:0.3, 5:1, 4:0.3}, whose mass assignment is:
{5}: 0.7    {4, 5, 6}: 0.3
Given “x is about_5” and “y is high”, prob(about_5 = high) measures how likely it is that x = y, as calculated below:
prob(about_5 = high)
= Pr(u = v | u∈{5}, v∈{6}) · m_about_5({5}) · m_high({6}) +
Pr(u = v | u∈{5}, v∈{5, 6}) · m_about_5({5}) · m_high({5, 6}) +
Pr(u = v | u∈{5}, v∈{4, 5, 6}) · m_about_5({5}) · m_high({4, 5, 6}) +
Pr(u = v | u∈{5}, v∈{3, 4, 5, 6}) · m_about_5({5}) · m_high({3, 4, 5, 6}) +
Pr(u = v | u∈{4, 5, 6}, v∈{6}) · m_about_5({4, 5, 6}) · m_high({6}) +
Pr(u = v | u∈{4, 5, 6}, v∈{5, 6}) · m_about_5({4, 5, 6}) · m_high({5, 6}) +
Pr(u = v | u∈{4, 5, 6}, v∈{4, 5, 6}) · m_about_5({4, 5, 6}) · m_high({4, 5, 6}) +
Pr(u = v | u∈{4, 5, 6}, v∈{3, 4, 5, 6}) · m_about_5({4, 5, 6}) · m_high({3, 4, 5, 6})
= 0.34
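The eight-term sum of Example 2 can be checked numerically. The sketch below assumes that, within each focal set, u and v are drawn uniformly and independently, so that Pr(u = v | u∈S, v∈T) = |S∩T| / (|S|·|T|); this assumption is not stated explicitly in the text but reproduces the value 0.34. The name `prob_equal` is illustrative.

```python
from fractions import Fraction
from itertools import product

def prob_equal(m_a, m_b):
    """prob(A = B) for two mass assignments given as {frozenset: mass}."""
    total = Fraction(0)
    for (s, mass_s), (t, mass_t) in product(m_a.items(), m_b.items()):
        # Assumed: u uniform on S, v uniform on T, independent of each other.
        pr_equal = Fraction(len(s & t), len(s) * len(t))
        total += pr_equal * mass_s * mass_t
    return total

m_about_5 = {
    frozenset({5}): Fraction(7, 10),
    frozenset({4, 5, 6}): Fraction(3, 10),
}
m_high = {
    frozenset({6}): Fraction(1, 10),
    frozenset({5, 6}): Fraction(4, 10),
    frozenset({4, 5, 6}): Fraction(3, 10),
    frozenset({3, 4, 5, 6}): Fraction(2, 10),
}
print(float(prob_equal(m_about_5, m_high)))  # 0.34
```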
Definition 2. Let A and B be two fuzzy sets on the same domain U. The probabilistic interpretation of the relation A → B, denoted by prob(A → B), is a value in [0, 1] that is defined by
prob(A → B) = ∑_{S, T⊆U} Pr(u∈T | u∈S) · m_A(S) · m_B(T).
Unlike prob(A θ B) in Definition 1, prob(A → B) is actually the fuzzy conditional probability (Ref. 23) of “x is B” given “x is A”. Also, for fuzzy sets on continuous domains, the above probabilistic interpretations can be adapted using integration instead of addition.
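Example 3: In the dice example, one has:
prob(high → about_5)
= Pr(u∈{5} | u∈{6}) · m_high({6}) · m_about_5({5}) +
Pr(u∈{5} | u∈{5, 6}) · m_high({5, 6}) · m_about_5({5}) +
Pr(u∈{5} | u∈{4, 5, 6}) · m_high({4, 5, 6}) · m_about_5({5}) +
Pr(u∈{5} | u∈{3, 4, 5, 6}) · m_high({3, 4, 5, 6}) · m_about_5({5}) +
Pr(u∈{4, 5, 6} | u∈{6}) · m_high({6}) · m_about_5({4, 5, 6}) +
Pr(u∈{4, 5, 6} | u∈{5, 6}) · m_high({5, 6}) · m_about_5({4, 5, 6}) +
Pr(u∈{4, 5, 6} | u∈{4, 5, 6}) · m_high({4, 5, 6}) · m_about_5({4, 5, 6}) +
Pr(u∈{4, 5, 6} | u∈{3, 4, 5, 6}) · m_high({3, 4, 5, 6}) · m_about_5({4, 5, 6})
= 0.53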
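As a cross-check of Example 3, the sum in Definition 2 can be evaluated directly. This sketch assumes u is uniform on the conditioning set S, so that Pr(u∈T | u∈S) = |S∩T| / |S|; the function name `prob_implies` is illustrative.

```python
from fractions import Fraction

def prob_implies(m_a, m_b):
    """prob(A -> B): fuzzy conditional probability of 'x is B' given 'x is A'."""
    total = Fraction(0)
    for s, mass_s in m_a.items():
        for t, mass_t in m_b.items():
            # Assumed: u uniform on S, so Pr(u in T | u in S) = |S & T| / |S|.
            total += Fraction(len(s & t), len(s)) * mass_s * mass_t
    return total

m_high = {
    frozenset({6}): Fraction(1, 10),
    frozenset({5, 6}): Fraction(4, 10),
    frozenset({4, 5, 6}): Fraction(3, 10),
    frozenset({3, 4, 5, 6}): Fraction(2, 10),
}
m_about_5 = {
    frozenset({5}): Fraction(7, 10),
    frozenset({4, 5, 6}): Fraction(3, 10),
}
print(float(prob_implies(m_high, m_about_5)))  # 0.53
```

Unlike prob(A = B), this measure is not symmetric in its two arguments, which is why the order of `m_high` and `m_about_5` matters here.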
2.2. Fuzzy-probabilistic triples
Definition 3. Let dom(τ) be the set of values of a type τ. A fuzzy-probabilistic triple on τ is defined to be of the form 〈V, α, β〉, where V ⊆ dom(τ), and α and β are lower and upper bound probability distributions on V.
In UFOB, dom(τ) and V can consist of fuzzy values. Intuitively, given the uncertain value of an attribute represented by 〈V, α, β〉, the probability for that attribute taking a value v∈V is between α(v) and β(v).
Example 4: Suppose the treatment duration of a patient is estimated to be about 30 or about 40 days, with a probability for each between 0.4 and 0.6. Then this information can be represented by the fuzzy-probabilistic triple 〈{about_30, about_40}, 0.8u, 1.2u〉, where about_30 and about_40 are fuzzy sets defining the imprecise treatment durations and u is the uniform distribution. Here, 0.8u and 1.2u respectively denote the lower and upper bound probability distributions α and β that are defined by α(x) = 0.8(1/2) = 0.4 and β(x) = 1.2(1/2) = 0.6 for every x∈{about_30, about_40}.
Given two events e1 and e2 having probabilities in the intervals [L1, U1] and [L2, U2], one may need to compute the probability intervals of the conjunction event e1 ∧ e2, disjunction event e1 ∨ e2, or difference event e1 ∧ ¬e2. In this paper we employ the conjunction, disjunction, and difference strategies given in Refs. 18 and 24, as presented in Table 2, where ⊗, ⊕, and ⊖ denote the conjunction, disjunction, and difference operators, respectively.
Table 2. Examples of probabilistic combination strategies
2. [α(v), β(v)] = ⊕me_{v1∈V1, v2∈V2, v = v1∩v2} [α1(v1), β1(v1)] ⊗ [α2(v2), β2(v2)], for every v∈V, where ⊕me is the mutual exclusion probabilistic disjunction strategy.
In the above computation of [α(v), β(v)], since there can be more than one pair (v1, v2) ∈ V1×V2 such that v = v1∩v2, the probability intervals for those pairs are combined using the mutual exclusion probabilistic disjunction strategy. This is in agreement with Ref. 25 for aggregate operations on probabilistic relational databases. We also note that, for the POB case, each value in V1 or V2 is elementary and non-fuzzy, so V is actually equal to the classical intersection of V1 and V2, and no such probabilistic disjunction is required for [α(v), β(v)]. Meanwhile, with the normality condition of v1∩v2, the definition coincides with that of POB when all fuzzy sets reduce to crisp values.
Example 5: Let fpt1 = 〈{about_48, about_72}, 0.8u, 1.2u〉 and fpt2 = 〈{about_72, about_96}, u, u〉 be fuzzy-probabilistic triples. Then fpt1 ⊗in fpt2, with the independence probabilistic conjunction strategy, is the fuzzy-probabilistic triple fpt = 〈{about_72}, 0.2u, 0.3u〉.
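The result of Example 5 can be traced with a small sketch. Several things here are assumptions made only for illustration: the membership functions of about_48, about_72 and about_96 (the text does not give them), the intersection of fuzzy values by pointwise minimum with only normal intersections kept, and the commonly cited forms of the independence conjunction [L1·L2, U1·U2] and the mutual exclusion disjunction [min(1, L1+L2), min(1, U1+U2)].

```python
from fractions import Fraction

def about(center):
    """Hypothetical discrete fuzzy number, stored as a hashable frozenset."""
    half = Fraction(1, 2)
    return frozenset({(center - 12, half), (center, Fraction(1)), (center + 12, half)})

def intersect(v1, v2):
    """Pointwise-minimum intersection of two discrete fuzzy sets."""
    d1, d2 = dict(v1), dict(v2)
    return frozenset((x, min(d1[x], d2[x])) for x in d1.keys() & d2.keys())

def is_normal(v):
    return any(mu == 1 for _, mu in v)

def conj_in(a, b):   # independence conjunction (assumed form)
    return (a[0] * b[0], a[1] * b[1])

def disj_me(a, b):   # mutual exclusion disjunction (assumed form)
    return (min(1, a[0] + b[0]), min(1, a[1] + b[1]))

def conjunction(fpt1, fpt2):
    """Conjunction of two triples given as {value: (lower, upper)}."""
    result = {}
    for v1, i1 in fpt1.items():
        for v2, i2 in fpt2.items():
            v = intersect(v1, v2)
            if not is_normal(v):
                continue  # assumed: only normal intersections are kept
            interval = conj_in(i1, i2)
            result[v] = disj_me(result[v], interval) if v in result else interval
    return result

a48, a72, a96 = about(48), about(72), about(96)
fpt1 = {a48: (Fraction(2, 5), Fraction(3, 5)), a72: (Fraction(2, 5), Fraction(3, 5))}
fpt2 = {a72: (Fraction(1, 2), Fraction(1, 2)), a96: (Fraction(1, 2), Fraction(1, 2))}
fpt = conjunction(fpt1, fpt2)
print({k == a72: tuple(map(float, v)) for k, v in fpt.items()})
```

Only about_72 ∩ about_72 is normal under these hypothetical membership functions, so the result has the single value about_72 with interval [0.2, 0.3], in line with the example.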
Definition 5. Let fpt1 = 〈V1, α1, β1〉 and fpt2 = 〈V2, α2, β2〉 be fuzzy-probabilistic triples, and ⊕ be a probabilistic disjunction strategy. Then the disjunction of fpt1 and fpt2 under ⊕, denoted by fpt1 ⊕ fpt2, is the fuzzy-probabilistic triple fpt = 〈V, α, β〉, such that:
Definition 6. Let fpt1 = 〈V1, α1, β1〉 and fpt2 = 〈V2, α2, β2〉 be fuzzy-probabilistic triples, and ⊖ be a probabilistic difference strategy. Then the difference of fpt1 and fpt2 under ⊖, denoted by fpt1 ⊖ fpt2, is the fuzzy-probabilistic triple fpt = 〈V, α, β〉, such that:
2.3. An algebra on fuzzy-probabilistic triples
In this work, to introduce methods into UFOB, we propose a principle to extend an algebra on values of a type τ to the corresponding one on fuzzy-probabilistic triples. It is stated as an abstract algebra with operations on fuzzy-probabilistic triples, as in the following definition.
Definition 7. Let 𝒰 = {〈V, α, β〉 | V ⊆ dom(τ)} be a non-empty set of fuzzy-probabilistic triples of type τ. If A = (dom(τ), o1,…, on) is a fuzzy set algebra with operations o1,…, on on dom(τ), then 𝒜 = (𝒰, op1,…, opn) is a fuzzy-probabilistic triple algebra, in which the operations op1, op2,…, opn on 𝒰 are derived from A as follows:
Example 6: Let {real} denote the domain of fuzzy real numbers, 𝒰 = {〈V, α, β〉 | V ⊆ dom({real})} be the corresponding set of fuzzy-probabilistic triples, and A = (dom({real}), ×) be the algebra with the fuzzy multiplication operator × based on the extension principle (Ref. 26). Then 𝒜 = (𝒰, ×) is the corresponding algebra on 𝒰, with the operator × on triples defined by:
〈V1, α1, β1〉 × 〈V2, α2, β2〉 = 〈V, α, β〉, where V = {v = v1 × v2 | v1∈V1, v2∈V2} and
(v1 × v2)(z) = sup_{z = x×y} min[v1(x), v2(y)], for every real number x, y, z, and
[α(v), β(v)] = ⊕me_{v1∈V1, v2∈V2, v = v1×v2} [α1(v1), β1(v1)] ⊗ [α2(v2), β2(v2)], for every v∈V.
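For discrete fuzzy numbers, the extension-principle product (v1 × v2)(z) = sup min[v1(x), v2(y)] reduces to a maximum over finitely many pairs. The sketch below illustrates it on the product 30 × about_60, taking about_60 = {59: 0.5, 60: 1, 61: 0.5} as in the paper's later patient example and treating the crisp number 30 as the fuzzy set {30: 1}; the function name `fuzzy_multiply` is illustrative.

```python
def fuzzy_multiply(v1, v2):
    """Extension-principle product of discrete fuzzy numbers {x: membership}."""
    result = {}
    for x, mu_x in v1.items():
        for y, mu_y in v2.items():
            z = x * y
            # sup over all pairs (x, y) with x * y == z of min(v1(x), v2(y))
            result[z] = max(result.get(z, 0.0), min(mu_x, mu_y))
    return result

about_60 = {59: 0.5, 60: 1.0, 61: 0.5}
thirty = {30: 1.0}  # a crisp value seen as a singleton fuzzy set
about_1800 = fuzzy_multiply(thirty, about_60)
print(about_1800)  # {1770: 0.5, 1800: 1.0, 1830: 0.5}
```

Only `min` and `max` are applied to the membership degrees, so no rounding error is introduced.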
3 Properties of Uncertain and Fuzzy Objects
3.1. UFOB class hierarchies
Fig. 1. An example UFOB class hierarchy
For UFOB we use the same definition of class hierarchy as for POB. Figure 1 is a hierarchy of patients, who are classified as being children, teenagers, or adults and, alternatively, as being out-patients or in-patients. Those subclasses of a class that are connected to a d node are mutually disjoint, and they form a cluster of that class. That is, the class PATIENT has two clusters, {CHILD, TEENAGER, ADULT} and {OUT_PATIENT, IN_PATIENT}. The value in [0, 1] associated with the link between a class and one of its immediate subclasses represents the probability for an arbitrary object of the class belonging to that subclass. For instance, the hierarchy says that 80% of patients are non-resident while the remaining 20% are resident, and that 60% of resident patients are adults.
3.2. UFOB class attributes and methods
As in the classical object-oriented model, each UFOB class is characterized by a number of properties, each of which is an attribute or a method. Each property has its type and value. For a method, its type and value are those of its output, which is defined as a function of the input arguments of the method. For a unified treatment of attributes and methods, an attribute could be considered as a special method with a fixed output and no input argument. Alternatively, a method could be considered as a parameterized attribute, whose value depends on its input arguments. Moreover, each property is associated with a probability interval representing its uncertain applicability to the class in which it is defined. The following definition and example explain these ideas.
Definition 8. Let 𝒫 be a set of properties and 𝒯 be a set of atomic types. Then types are inductively defined as follows:
1. Every atomic type from 𝒯 is a type.
2. If τ is a type, then {τ} is the fuzzy set type of τ.
3. If P1,…, Pk are pairwise different properties from 𝒫, τi’s and τij’s are types, and [li, ui]’s are subintervals of [0, 1], for every i from 1 to k and j from 1 to ni, then τ = [P1(τ11,…, τ1n1): τ1[l1, u1],…, Pk(τk1,…, τknk): τk[lk, uk]] is the tuple type over {P1,…, Pk}. P1,…, Pk are called top-level properties of τ, and τ.Pi and [τ.Pi] denote τi and [li, ui], respectively.
In the definition above, τij’s represent the types of the input arguments of Pi when it is a method, and they are null when Pi is an attribute. Each [li, ui] represents the lower and upper applicability probabilities of property Pi to type τ.
Example 7: In the object base of patients described above, a tuple type can be [name: string, age: {real}, address: string, check_date: datetype, medical_history: {string}[.8, 1], disease: string, duration: {real}, cost_per_day: {real}, total_cost([duration: {real}], [cost_per_day: {real}]): {real}]. A property with uncertain applicability like medical_history: {string}[.8, 1] says that at least 80% of patients have medical histories recorded.
The domain values of a particular type in UFOB are then defined as follows.
Definition 9. Let every atomic type τ∈𝒯 be associated with a domain dom(τ). Then values are defined by induction as follows:
1. For every τ∈𝒯, every v∈dom(τ) is a value of type τ.
2. For every τ∈𝒯, every fuzzy set on dom(τ) is a value of type {τ}.
3. If P1,…, Pk are pairwise different properties from 𝒫, vi’s are values of types τi’s, and [l’i, u’i]’s are subintervals of [0, 1], for every i from 1 to k, then [P1: v1[l’1, u’1],…, Pk: vk[l’k, u’k]] is a value of type τ = [P1: τ1[l1, u1],…, Pk: τk[lk, uk]].
In Definition 8, each probability interval [li, ui] quantifies the uncertain applicability of Pi generally to all objects of type τ. That is, the probability for a random object of type τ having property Pi is between li and ui. Meanwhile, in Definition 9, each probability interval [l’i, u’i] specifies the uncertain applicability of Pi to a specific object of type τ. Therefore, [li, ui] is just the default value and is not necessarily the same as [l’i, u’i].
Example 8: One may say that at least 90% of birds can fly, i.e., the applicability probability interval of the property fly to the class BIRD is [0.9, 1]. However, since a penguin or an injured bird cannot fly, that probability interval for such a particular bird is [0, 0].
Example 9: Let young, middle_aged, old be linguistic labels of fuzzy numbers on dom({real}) as illustrated in Fig. 2.
Fig. 2. Fuzzy values of the attribute age
Then [name: Nguyen, age: middle_aged, address: Saigon, check_date: 21-3-08, medical_history: {cholecystitis}[.7, 1], disease: hepatitis] is a value of the corresponding tuple type.
Each value of a property, or a method argument, of an object is now defined as a fuzzy-probabilistic triple 〈V, α, β〉, where the elements of V are values as defined in Definition 9.
Definition 10. Let P1,…, Pk be pairwise different properties from 𝒫, Vi’s and Vij’s be finite sets of values of types τi’s and τij’s, and [αi, βi]’s and [αij, βij]’s be pairs of probability distributions over Vi’s and Vij’s, for every i from 1 to k and j from 1 to ni. Then fptv = [P1(〈V11, α11, β11〉,…, 〈V1n1, α1n1, β1n1〉): 〈V1, α1, β1〉[l’1, u’1],…, Pk(〈Vk1, αk1, βk1〉,…, 〈Vknk, αknk, βknk〉): 〈Vk, αk, βk〉[l’k, u’k]] is a fuzzy-probabilistic tuple value of type [P1(τ11,…, τ1n1): τ1[l1, u1],…, Pk(τk1,…, τknk): τk[lk, uk]] over {P1,…, Pk}. One writes fptv.Pi and [fptv.Pi] to denote 〈Vi, αi, βi〉 and [l’i, u’i], respectively.
We note that, for each Pi, [li, ui] or [l’i, u’i] represents uncertainty of the applicability of Pi, while [αi, βi] represents uncertainty of the value of Pi over Vi when Pi is applied.
Example 10: Assume we know that the name, age, and address of a patient are Nguyen, middle_aged, and Saigon, respectively. The patient is checked for health status, and the doctor does not know for certain which disease the patient has. However, based on the clinical symptoms, the doctor can judge the probability for the patient suffering from hepatitis or cirrhosis to be 0.5 each. In addition, it is likely at least to degree 0.7 that the patient previously caught another disease that is either cholecystitis or gall-stone, with equal probabilities. Also, suppose that the daily treatment cost is about 60 USD and the estimated treatment duration for that patient is 30 or 32 days, with probabilities between 0.4 and 0.6. Then this information can be represented by the fuzzy-probabilistic tuple value [name: 〈{Nguyen}, u, u〉, age: 〈{middle_aged}, u, u〉, address: 〈{Saigon}, u, u〉, medical_history: 〈{{cholecystitis}, {gall-stone}}, u, u〉[.7, 1], disease: 〈{hepatitis, cirrhosis}, u, u〉, duration: 〈{30, 32}, .8u, 1.2u〉, cost_per_day: 〈{about_60}, u, u〉], where middle_aged and about_60 are linguistic labels of fuzzy sets.
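One possible in-memory representation of such a tuple value is sketched below. The class names `Triple` and `Property` and the helper `scaled_uniform` are hypothetical and chosen only for illustration; fuzzy values are represented by their labels, and a property without an explicit interval is assumed to have applicability [1, 1].

```python
from dataclasses import dataclass
from fractions import Fraction

@dataclass
class Triple:
    """Fuzzy-probabilistic triple <V, alpha, beta>, stored per value."""
    bounds: dict  # value -> (lower probability, upper probability)

    @classmethod
    def scaled_uniform(cls, values, low="1", high="1"):
        # <V, low*u, high*u> with u the uniform distribution on V
        u = Fraction(1, len(values))
        return cls({v: (Fraction(low) * u, Fraction(high) * u) for v in values})

@dataclass
class Property:
    value: Triple
    # Applicability interval [l', u']; assumed [1, 1] when not stated.
    applicability: tuple = (Fraction(1), Fraction(1))

patient = {
    "name": Property(Triple.scaled_uniform(["Nguyen"])),
    "age": Property(Triple.scaled_uniform(["middle_aged"])),
    "disease": Property(Triple.scaled_uniform(["hepatitis", "cirrhosis"])),
    "duration": Property(Triple.scaled_uniform([30, 32], "0.8", "1.2")),
    "medical_history": Property(
        Triple.scaled_uniform([frozenset({"cholecystitis"}), frozenset({"gall-stone"})]),
        (Fraction("0.7"), Fraction(1)),
    ),
}
low, high = patient["duration"].value.bounds[30]
print(float(low), float(high))  # 0.4 0.6
```

Keeping the applicability interval on the property, separate from the value bounds inside the triple, mirrors the two levels of uncertainty distinguished in the text.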
3.3. UFOB schemas
UFOB schemas extend those of POB with class methods.
Definition 11. A UFOB schema is a hextuple (𝒞, τ, ⇒, me, p, f) where:
1. 𝒞 is a finite set of classes.
2. τ maps each class c to a tuple type τ(c) representing c’s properties and their types.
3. ⇒ is a binary relation on 𝒞 such that (𝒞, ⇒) is a directed acyclic graph, whereby each arc c ⇒ d means c is an immediate subclass of d.
4. me maps each class c∈𝒞 to a partition of the set of all immediate subclasses of c, such that the classes in each cluster of the partition me(c) are mutually disjoint.
5. p maps each arc c ⇒ d in (𝒞, ⇒) to a rational number p(c | d) in [0, 1] measuring the conditional probability for an object picked at random uniformly from d belonging to c, such that ∀d∈𝒞, ∀𝒟∈me(d): Σ_{c∈𝒟} p(c | d) ≤ 1.
6. f maps each method Pi(τi1, τi2,…, τini): τi to a function from Cartesian products of fuzzy-probabilistic triples of types τij’s to fuzzy-probabilistic triples of type τi.
Given c1 ⇒ c2 ⇒ … ⇒ ck, one can write c1 ⇒* ck. As in most object-oriented systems, it is assumed here that multiple super-classes of a class do not have a common property.
Example 11: A UFOB schema for the patient database above can be defined as follows:
(𝒞, ⇒), me, and p are given as in Fig. 1.
f defines the method total_cost using an algebra on fuzzy-probabilistic triples introduced above, with the extension principle-based multiplication operation × on fuzzy sets and the independence probabilistic conjunction strategy, as follows:
PATIENT: total_cost([duration: 〈V1, α1, β1〉], [cost_per_day: 〈V2, α2, β2〉]): 〈V, α, β〉
return 〈V, α, β〉 = 〈V1, α1, β1〉 × 〈V2, α2, β2〉
Table 3. Type assignment τ
c               τ(c)
PATIENT         [name: string, age: {real}, address: string, check_date: datetype, disease: string, duration: {integer}, cost_per_day: {real}, total_cost: {real}]
OUT_PATIENT     [check_again: datetype[.95, 1]]
IN_PATIENT      [bed_no: string]
CHILD           [medical_history: {string}[.4, .6]]
TEENAGER        [medical_history: {string}[.6, .8]]
ADULT           [medical_history: {string}[.8, 1]]
OUT_TEENAGER    []
A UFOB schema as defined above may be inconsistent when there is no set of objects that satisfies its class hierarchy and probability assignment. It is consistent if and only if it has a taxonomic and probabilistic model, as in the following definition adapted from Ref. 18.
Definition 12. Let S = (𝒞, τ, ⇒, me, p, f) be a UFOB schema. An interpretation of S is a mapping ε from 𝒞 to the set of all finite subsets of a set 𝒪 of object identifiers. It is said to be a model of S if and only if:
1. ε(c) ≠ ∅ for every c∈𝒞, and
2. ε(c) ⊆ ε(d) for all c, d∈𝒞 such that c ⇒ d, and
3. ε(c)∩ε(d) = ∅ for all c, d∈𝒞 such that c and d belong to the same cluster defined by me, and
4. |ε(c)| = p(c | d)·|ε(d)| for all c, d∈𝒞 such that c ⇒ d.
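The four model conditions can be checked directly for a given interpretation. The following is a minimal sketch on a fragment of the patient hierarchy, using only the probabilities stated in the text (80% out-patients, 20% in-patients); the function name `is_model` and the object identifiers 1 to 5 are illustrative.

```python
from fractions import Fraction
from itertools import combinations

def is_model(arcs, clusters, eps):
    """arcs: {(c, d): p(c | d)}; clusters: list of tuples of classes;
    eps: {class: set of object identifiers}."""
    # Condition 1: every class has a non-empty extension.
    if any(len(objs) == 0 for objs in eps.values()):
        return False
    for (c, d), p in arcs.items():
        # Condition 2: a subclass extension is included in its superclass's.
        if not eps[c] <= eps[d]:
            return False
        # Condition 4: extension sizes respect the conditional probability.
        if len(eps[c]) != p * len(eps[d]):
            return False
    # Condition 3: classes in the same cluster are pairwise disjoint.
    for cluster in clusters:
        for c1, c2 in combinations(cluster, 2):
            if eps[c1] & eps[c2]:
                return False
    return True

arcs = {
    ("OUT_PATIENT", "PATIENT"): Fraction(4, 5),
    ("IN_PATIENT", "PATIENT"): Fraction(1, 5),
}
clusters = [("OUT_PATIENT", "IN_PATIENT")]
eps = {"PATIENT": {1, 2, 3, 4, 5}, "OUT_PATIENT": {1, 2, 3, 4}, "IN_PATIENT": {5}}
print(is_model(arcs, clusters, eps))  # True
```

Checking one interpretation is much simpler than deciding consistency, which asks whether any such model exists; this sketch covers only the former.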
4 UFOB Instances and Inheritance
4.1. UFOB instances
Given a UFOB schema, a UFOB instance over the schema is defined as a base of objects associated with their classes and fuzzy-probabilistic tuple values.
Definition 13. Let S = (𝒞, τ, ⇒, me, p, f) be a UFOB schema and 𝒪 be a set of object identifiers. A UFOB instance over S is a pair (π, ν) where:
1. π maps each c∈𝒞 to a finite subset of 𝒪 such that, for different c1, c2∈𝒞, π(c1)∩π(c2) = ∅. In addition, the mapping π*: 𝒞 → 2^𝒪 is defined by π*(c) = ∪{π(d) | d∈𝒞, d ⇒* c}, comprising the objects that are defined in c or its proper subclasses.
2. For each c∈𝒞, ν maps each o∈π(c) to a fuzzy-probabilistic tuple value [P1: 〈V1, α1, β1〉[l1, u1],…, Pk: 〈Vk, αk, βk〉[lk, uk]] of type τ(c).
Intuitively, π(c) is the set of all objects whose most specific class is c. Meanwhile, π*(c) is the set of all objects that belong to c.
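The mapping π* can be derived from π by propagating each object to all super-classes of its most specific class. The sketch below uses the patient classes; the subclass arcs are read off from the class names and the cluster description of Fig. 1, and π(CHILD) = π(ADULT) = ∅ is an assumption. The function name `pi_star` is illustrative.

```python
def pi_star(pi, arcs):
    """pi: {class: set of oids}; arcs: set of (subclass, superclass) pairs."""
    supers = {}
    for sub, sup in arcs:
        supers.setdefault(sub, set()).add(sup)

    def ancestors(c):
        # All proper super-classes of c, following arcs transitively.
        seen, stack = set(), [c]
        while stack:
            for s in supers.get(stack.pop(), ()):
                if s not in seen:
                    seen.add(s)
                    stack.append(s)
        return seen

    result = {c: set(objs) for c, objs in pi.items()}
    for c, objs in pi.items():
        for a in ancestors(c):
            result.setdefault(a, set()).update(objs)
    return result

pi = {
    "PATIENT": {"o1"}, "OUT_PATIENT": set(), "IN_PATIENT": set(),
    "CHILD": set(), "TEENAGER": set(), "ADULT": set(),
    "OUT_TEENAGER": {"o2", "o3", "o5"}, "IN_ADULT": {"o4"},
}
arcs = {
    ("CHILD", "PATIENT"), ("TEENAGER", "PATIENT"), ("ADULT", "PATIENT"),
    ("OUT_PATIENT", "PATIENT"), ("IN_PATIENT", "PATIENT"),
    ("OUT_TEENAGER", "OUT_PATIENT"), ("OUT_TEENAGER", "TEENAGER"),
    ("IN_ADULT", "IN_PATIENT"), ("IN_ADULT", "ADULT"),
}
star = pi_star(pi, arcs)
print(sorted(star["PATIENT"]))  # ['o1', 'o2', 'o3', 'o4', 'o5']
```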
Example 12: A UFOB instance over the UFOB schema in Example 11 is shown in Table 4 and Table 5. We note that Table 5 shows all inherited properties of an object from its super-classes. Here, about_60 = {59: .5, 60: 1, 61: .5} is the fuzzy number representing the approximate daily treatment cost of the patient denoted by o4, and about_1800 = 30 × about_60 and about_2400 = 40 × about_60 are the fuzzy numbers representing the probable total treatment costs of that patient. Meanwhile, medical_history: 〈{{cholecystitis}}, u, u〉[.7, 1] expresses that it is likely at least to degree .7 that o4 got cholecystitis before.
Table 4. Mapping π and π*
c               π(c)            π*(c)
PATIENT         {o1}            {o1, o2, o3, o4, o5}
OUT_PATIENT     {}              {o2, o3, o5}
IN_PATIENT      {}              {o4}
TEENAGER        {}              {o2, o3, o5}
OUT_TEENAGER    {o2, o3, o5}    {o2, o3, o5}
IN_ADULT        {o4}            {o4}
Table 5. Value assignment ν
o1  [name: 〈{Le}, u, u〉, age: 〈{45}, u, u〉, address: 〈{Saigon}, u, u〉, check_date: 〈{20-3-08}, u, u〉, medical_history: 〈{{bronchitis}}, u, u〉[.32, .4], disease: 〈{lung cancer, tuberculosis}, .8u, 1.2u〉, duration: 〈{400, 500}, u, u〉, cost_per_day: 〈{300}, u, u〉, total_cost: 〈{120,000, 150,000}, u, u〉]
o2  [name: 〈{Tran}, u, u〉, age: 〈{16}, u, u〉, address: 〈{Hue}, u, u〉, check_date: 〈{20-3-08}, u, u〉, disease: 〈{flu}, u, u〉, duration: 〈{7}, u, u〉, cost_per_day: 〈{30}, u, u〉, total_cost: 〈{210}, u, u〉]
o3  [name: 〈{Nguyen}, u, u〉, age: 〈{young}, u, u〉, address: 〈{Hanoi}, u, u〉, check_date: 〈{21-3-08}, u, u〉, disease: 〈{angina}, u, u〉, duration: 〈{10}, u, u〉, check_again: 〈{27-3-08}, u, u〉[.9, 1], cost_per_day: 〈{160, 170}, .8u, u〉, total_cost: 〈{1600, 1700}, .8u, u〉]
o4  [name: 〈{Ho}, u, u〉, age: 〈{middle_aged}, u, u〉, address: 〈{Saigon}, u, u〉, check_date: 〈{21-3-08}, u, u〉, medical_history: 〈{{cholecystitis}}, u, u〉[.7, 1], disease: 〈{hepatitis, cirrhosis}, u, u〉, duration: 〈{30, 40}, u, u〉, bed_no: 〈{A35}, u, u〉, cost_per_day: 〈{about_60}, u, u〉, total_cost: 〈{about_1800, about_2400}, u, u〉]
o5  [name: 〈{Trinh}, u, u〉, age: 〈{young}, u, u〉, address: 〈{Danang}, u, u〉, check_date: 〈{22-6-08}, u, u〉, disease: 〈{bronchitis}, u, u〉, duration: 〈{10, 15}, u, u〉, cost_per_day: 〈{50}, u, u〉, total_cost: 〈{500, 750}, u, u〉]
4.2. UFOB probabilistic extents
In classical object bases, the extent of a class comprises all the objects that belong to that class. In UFOB, the probabilistic extent of a class specifies the probability for each object to belong to that class. The following definition is adapted from that of POB.
Definition 14. Let (π, ν) be a UFOB instance over the schema S = (𝒞, τ, ⇒, me, p, f). Then, for each class c∈𝒞, the probabilistic extent of c, denoted by ext(c), maps each o∈π(𝒞) to a set of rational numbers in [0, 1] as follows:
1. If o∈π*(c), then ext(c)(o) = {1}.
2. If there exists d such that o∈π*(d) and ε(c)∩ε(d) = ∅ for every model ε of S, then ext(c)(o) = {0}.
3. Otherwise, ext(c)(o) = {p | p is the product of the arc probabilities on a path from c up to d, where c ⇒* d with d being minimal and o∈π*(d)}.
Therefore, the probability interval for an object o to belong to a class c is [min(ext(c)(o)), max(ext(c)(o))].
Example 13: Let I be the UFOB instance in Example 12. The probabilistic extents of the classes OUT_TEENAGER and IN_ADULT are defined by:
ext(OUT_TEENAGER)(o1) = {.16}    ext(IN_ADULT)(o1) = {.12}
ext(OUT_TEENAGER)(o2) = {1}      ext(IN_ADULT)(o2) = {0}
ext(OUT_TEENAGER)(o3) = {1}      ext(IN_ADULT)(o3) = {0}
ext(OUT_TEENAGER)(o4) = {0}      ext(IN_ADULT)(o4) = {1}
ext(OUT_TEENAGER)(o5) = {1}      ext(IN_ADULT)(o5) = {0}
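Case 3 of Definition 14 amounts to multiplying arc probabilities along every upward path. The sketch below covers only that case, for ext(IN_ADULT)(o1), where the minimal class containing o1 is PATIENT. The arc values 0.2 and 0.6 come from the description of Fig. 1 and 0.4 from the membership of o1 in ADULT used in the paper's inheritance example; the value 0.3 for p(IN_ADULT | ADULT) is an assumption chosen to be consistent with the extent {.12}. The function name `path_products` is illustrative.

```python
from fractions import Fraction

def path_products(arcs, c, d):
    """Products of arc probabilities over all upward paths from c to d.

    arcs: {(subclass, superclass): p(subclass | superclass)} on an acyclic graph.
    """
    if c == d:
        return {Fraction(1)}
    products = set()
    for (sub, sup), p in arcs.items():
        if sub == c:
            products |= {p * q for q in path_products(arcs, sup, d)}
    return products

arcs = {
    ("IN_PATIENT", "PATIENT"): Fraction(1, 5),
    ("OUT_PATIENT", "PATIENT"): Fraction(4, 5),
    ("ADULT", "PATIENT"): Fraction(2, 5),
    ("IN_ADULT", "IN_PATIENT"): Fraction(3, 5),
    ("IN_ADULT", "ADULT"): Fraction(3, 10),  # assumed, not stated in the text
}
ext = path_products(arcs, "IN_ADULT", "PATIENT")
print(float(min(ext)), float(max(ext)))  # 0.12 0.12
```

Both paths yield the same product here, so the membership interval of o1 in IN_ADULT collapses to a single point.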
4.3. UFOB inheritance
In UFOB, both the applicability of a property to a class and the membership of an object to a class can be uncertain. We define the probability for an object to inherit a property from a class as the conjunction of the applicability uncertainty and the membership uncertainty.
Definition 15. Let [l, u] be the applicability probability interval of a property P to a class c, and [x, y] be the membership probability interval of an object o to c. The applicability probability interval of P to o is defined to be [l, u]⊗[x, y], where ⊗ is a probabilistic conjunction strategy.
Example 14: For the UFOB instance in Example 12, the membership of the object o1 to the class ADULT is .4. So it inherits the property medical_history from that class with the probability interval [.32, .4], under the independence probabilistic conjunction strategy.
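The interval in Example 14 follows from a single interval product. This sketch assumes the usual form of the independence conjunction, [L1·L2, U1·U2], with applicability [0.8, 1] of medical_history to ADULT and membership [0.4, 0.4] of o1 in ADULT; the function name is illustrative.

```python
from fractions import Fraction

def conj_independence(a, b):
    """Independence conjunction of two probability intervals (assumed form)."""
    return (a[0] * b[0], a[1] * b[1])

applicability = (Fraction(8, 10), Fraction(1))   # medical_history on ADULT
membership = (Fraction(4, 10), Fraction(4, 10))  # o1 in ADULT
inherited = conj_independence(applicability, membership)
print(tuple(float(x) for x in inherited))  # (0.32, 0.4)
```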
In order to compute the probability interval for an object having some property with some value, we define a combination of the uncertain applicability of the property to the object and the uncertain value of that property as a conjunction of these two events, as follows.
Definition 16. Let 〈V, α, β〉 be the fuzzy-probabilistic triple of a property P of an object o, and [l, u] be the applicability probability interval of P to o. Then the derived fuzzy-probabilistic triple of P for o is 〈V, α’, β’〉, where [α’(v), β’(v)] = [α(v), β(v)]⊗[l, u] for all v∈V, where ⊗ is a probabilistic conjunction strategy.
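A derived triple can be computed value by value. The sketch below applies Definition 16 to a medical_history value 〈{{cholecystitis}, {gall-stone}}, u, u〉 with applicability [0.7, 1], under the independence conjunction assumed to be [L1·L2, U1·U2]; the names `derived_triple` and `conj_independence` are illustrative.

```python
from fractions import Fraction

def conj_independence(a, b):
    """Independence conjunction of two probability intervals (assumed form)."""
    return (a[0] * b[0], a[1] * b[1])

def derived_triple(triple, applicability, conj=conj_independence):
    """Combine each value's interval with the property's applicability."""
    return {v: conj(interval, applicability) for v, interval in triple.items()}

half = Fraction(1, 2)
medical_history = {"cholecystitis": (half, half), "gall-stone": (half, half)}
applicability = (Fraction(7, 10), Fraction(1))
derived = derived_triple(medical_history, applicability)
print({v: tuple(float(x) for x in i) for v, i in derived.items()})
```

Passing the strategy as a parameter reflects that the definition leaves the choice of ⊗ open.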
5 UFOB Selection Operation
5.1. Syntax of selection conditions
As for general data or object bases, selection is a basic operation for UFOB. Intuitively, the result of a selection query on a UFOB instance I over a UFOB schema S is another UFOB instance I’ over S such that the objects of the classes in I’ and their property values satisfy the selection condition of the query.