An Ontological Approach to Lifelog Representation for Disclosure Control
Truong Thi Thu Hien1), Shin-ichiro Eitoku2), Tomohiro Yamada2), Shin-yo Muto2) and Masanobu Abe2)
1) Department of Information System, College of Technology, Vietnam National University, Hanoi
2) NTT Cyber Solutions Laboratories, Nippon Telegraph and Telephone Corporation
Abstract— In services using lifelog collected continuously and
over a long period of time, disclosure control is essential for protecting privacy, because this lifelog includes confidential information that users are unwilling to disclose freely. This paper proposes an ontology model for the disclosure control needed to handle each user's lifelog. With it, the user can easily set disclosure rules on his/her diverse data collected from various data sources. The features of the proposed model are as follows: (1) values can be differentiated by their context, (2) various time expressions can be handled, and (3) a disclosure attribute is set on each lifelog item (fine-grained control). We show an example implementation and the results of experiments using the implemented system.
Keywords: Lifelog, Disclosure Control, Ontology
I. INTRODUCTION
The miniaturization of mobile terminals equipped with various sensors has made it possible to collect lifelog data from sensors (GPS devices, acceleration sensors, etc.) in addition to lifelog data from PCs (schedules, search words, e-mail, etc.). This trend has stimulated the rise of services that use lifelogs. Because each user can carry one or more mobile terminals all the time, he/she can collect not only various kinds of lifelog data but also collect them continuously over long periods of time. Therefore, the amount of data will be huge and, moreover, various time expressions are needed to time-stamp the data. For example, search keywords might be given instantaneous time stamps, while places where the user stayed should be given time intervals. The other requirement is to deal with the privacy concerns of users. The collected data might include information that the users are unwilling to disclose. For example, place data in the lifelog could be useful for organizing business records, but the user may want to keep some places secret, such as a place where he met his girlfriend.
Taking this background into account, we have proposed the architecture of a lifelog processing system [1]. To receive a service, the user has to provide the lifelog data that the service requires. However, part of that lifelog data might be data the user is unwilling to disclose, as mentioned above. Therefore, it is very important to provide a framework that reassures users that their privacy will not be compromised.
Requirements with regard to disclosure control can be summarized as follows:
(1) Values can be differentiated by their context. Lifelogs collected by different information sources should have different meanings (e.g., locations actually visited (GPS data) differ from locations the user intended to visit (schedule data)).
(2) Various time expressions should be handled (e.g., time point, time interval, and social time), because the representation of time differs with the kind of information.
(3) A disclosure attribute is set on each lifelog item (fine-grained control). The judgment of disclosure depends on each event in the lifelogs, even if the kind of event (e.g., visited place, searched word, etc.) is the same.
In order to satisfy these requirements, we propose a domain ontology model that yields a semantic representation of everyday lifelogs. The model serves to represent a user’s lifelog data. The system employs this model to represent disclosure rules given by the end-user. We compare it to the existing ontology models for realizing lifelog disclosure control, implement a system based on this model, and show the results of experiments on this system.
II. LIFELOG PROCESSING SYSTEM
Recently, location-based services based on GPS sensors or cell tower information have become popular [2][3]. The data used in such services (a person’s location, transportation used, activity, etc.) is one kind of lifelog data. We foresee that the number of lifelogs captured during our daily life will increase and that concierge services based on lifelogs will become essential to us.
In order to realize these services, we are developing a lifelog processing system. Fig. 1 shows the architecture. It consists of a local system (user’s PC, home server, etc.), an aggregated lifelog manager, and an application system. The local system generates and manages the user’s lifelogs. The aggregated lifelog manager manages many users’ lifelogs uploaded from their local systems, and creates valuable information from these lifelogs. The information is used to provide the appropriate services to each user. In the local system, the system generates lifelogs from the data acquired by various sensors [4][5][6]. For example, by using latitude and longitude data acquired from GPS devices, the local system generates address information, transportation, etc. To receive a service, the user first obtains information about what lifelogs the application system needs. After the user decides which lifelogs are to be disclosed, the system uploads them to the aggregated lifelog manager. Therefore, only the lifelogs that the user has decided to disclose to the application system are stored in the aggregated lifelog manager. The user can control the lifelog data in his/her local system. The uploading of lifelogs to the aggregated lifelog manager is our focus point with regard to disclosure control.
The 13th IEEE International Symposium on Consumer Electronics (ISCE2009)
Figure 1. Lifelog processing system. [Diagram: the local system (user’s PC, home server, etc.) applies feature extraction to data from the GPS sensor, acceleration sensor, and other sensors to obtain “where”, “who”, and “what” type information (address, place, people, transport, etc.), kept in the user’s lifelog DB together with a disclosure rule DB used for disclosure control. The aggregated lifelog manager holds the user’s and other users’ lifelog DBs and performs data mining processing; the application system is the service provider. Numbered steps: 1. Generate lifelog from sensor data; 2. Download requirement for types of lifelogs; 3. Add disclosure attribute to each lifelog; 4. Upload lifelogs user decides to disclose; 5. Data mining on many users’ lifelog; 6. Send lifelogs to service provider.]
Ontologies play a pivotal role not only in the Semantic Web, but also in pervasive computing and next-generation mobile communication systems [7]. Some ontology models exist to capture context in pervasive computing. However, there is no ontology model that satisfies the three points mentioned in Section I. GLOSS [8] (GLObal Smart Space) supports interactions between people and places on a global scale. By exploiting the features of physical spaces, GLOSS software provides appropriate information or services. GLOSS provides an easy-to-extend ontology that includes personal profile, location, mode of transportation, time, and activity. It is a powerful tool for modeling the human activities that are closely related to geographical information. However, it lacks a representation for information sources (only activity is mentioned). Thus, for example, it cannot distinguish between events actually done and events scheduled to be done. SOUPA [9] (Standard Ontology for Ubiquitous and Pervasive Applications) is a well-structured model that can capture context and make policies, but the model cannot treat disclosure control for each item in context because represented events are not connected to the acquired channel (only to location and time attributes). COMANTO [10] (COntext MAnagement oNTOlogy) was implemented as middleware in order to support context-awareness services. It represents the wide range of concepts needed for context modeling; unfortunately, it pays no attention to user privacy. Moreover, only activities are mentioned and time can be described only in terms of absolute time. Accordingly, it cannot represent the time intervals or social time needed to handle lifelog events. CAUSB [11] (Context-Aware Ubiquitous Services Browsers) was proposed for classifying services based on the user’s context. It supports the policy adaptability needed in ubiquitous environments but does not support the classification of possible rule conditions (no condition, simple condition, or complex condition). Table I shows a comparison of these models.
TABLE I. COMPARISON OF MODELS
Requirement \ Model       GLOSS   SOUPA   COMANTO     CAUSB
Distinguishable meaning   No      No      Yes         Basic(*)
Various time expression   Yes     Yes     Basic(**)   Basic(**)
Fine-grained control      No      No      No          Limited(***)
(*) No fundamental class for supporting other types of information (e.g., search word or schedules).
(**) Time is only a data type property or an absolute value; therefore, it does not support policy description with abstract time or social time.
(***) Rule is only for type of data with pre-defined action; no attempt to control each item of event.

IV. PROPOSED ONTOLOGY MODEL
Fig. 2 shows our lifelog ontology model. It consists of five abstract basic classes: Sensor, Channel, Time, Event, and Policy. The Sensor class represents information sources; these include sensors (GPS, acceleration sensor, etc.) and PC-based applications (search word logger, scheduler, etc.). The Channel class corresponds to the various kinds of lifelog information related to the user. Generally, each kind of sensor collects a specific type of data. However, different sensors can collect the same type of data, and several sensors may be needed to collect a complex type of data. The Time class represents the time attribute. Time includes the Clock, Calendar, Time-Point, and Time-Interval classes [12][13]. The Event class represents actual personal lifelogs. The Policy class represents disclosure rules given by the user.

Figure 2. Overview of lifelog ontology model. [Diagram: basic classes and their sub-classes. SENSOR: GPS, PC Log, Scheduler, ... CHANNEL: Place, People, Transport, TV Prog., Search word, Address, ... TIME: Calendar (Year, Month, Week, Day, Week-Day), Clock (Hour, Minute, Second), Social-Time (Break-Time, Holiday, Office-Hour), Time-Interval, Time-Point. EVENT: Place-Event, Address-Event, Search-Event, TV Prog.-Event, People-Event, Transport-Event, ... POLICY: Head, Condition (Simple Condition, Complex Condition). Legend: pre-defined basic class, pre-defined sub-class, extended sub-class, sub-class-of relation.]

In this model, sub-classes can be added easily to the basic classes. They are pre-defined or attached by the system according to the lifelog data received. For example, once the system receives visited place information from GPS, GPS can be added to the Sensor class and Place to the Channel class as new sub-classes. A sub-class can be further extended to achieve detailed expressions of lifelog data.
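As a rough illustration of how linking each Event to a Sensor and a Channel lets the same value carry different meanings, the following Python sketch may help. It is not the OWL model itself: the `Event` layout, the field names, and the string time stamp are assumptions made only for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    """One lifelog item tied to its source, kind, and time (hypothetical layout)."""
    sensor: str   # Sensor sub-class, e.g. "GPS" or "Scheduler"
    channel: str  # Channel sub-class, e.g. "Place"
    value: str    # instance in that channel, e.g. "University"
    time: str     # simplified time stamp; the model uses Time instances instead


# The same place value recorded through two different information sources.
visited = Event(sensor="GPS", channel="Place", value="University", time="2008-12-22T14:00")
planned = Event(sensor="Scheduler", channel="Place", value="University", time="2008-12-22T14:00")


def actually_visited(events):
    """Keep only Place events that came from GPS, i.e. places actually visited."""
    return [e for e in events if e.channel == "Place" and e.sensor == "GPS"]
```

Because the sensor is part of each event, a place taken from the schedule is never confused with a place actually visited, even when the value and time coincide.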
The basic classes and extended sub-classes satisfy the requirements mentioned in Section I as follows. First, the Sensor and Channel classes serve to distinguish the semantic context of data from different information sources in different channels, so they address point (1). Second, the Time class covers time expressions, which is point (2). Additionally, in order to improve manageability for users, we introduce the Social-Time class to represent time that has different definitions in different communities (e.g., holiday, office-hour). Third, for point (3), the disclosure rules can be set in the Policy class as instances. Each rule consists of a head and conditions, instantiated in two sub-classes, SimpleCondition and ComplexCondition. A SimpleCondition is defined on lifelog data by the withType and withValue relations. A ComplexCondition consists of several simple conditions linked by the AND operator. In order to ease the user’s management burden, complex conditions exclude the operators NOT, OR, and XOR (following Rei [14]); instead, a complex condition is defined as a chain of simple conditions.
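The head-and-conditions structure described above can be sketched in Python as follows. This is only a minimal analogue under stated assumptions: the class and method names (`Rule`, `hides`, `holds`) and the dictionary form of an event are hypothetical, and the actual model expresses these as OWL classes and relations.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class SimpleCondition:
    with_type: str   # a class of the lifelog domain, e.g. "Sensor"
    with_value: str  # an instance of that class, e.g. "GPS"

    def holds(self, event: Dict[str, str]) -> bool:
        return event.get(self.with_type) == self.with_value


@dataclass
class ComplexCondition:
    parts: List[SimpleCondition]  # chained with AND only (no NOT, OR, XOR)

    def holds(self, event: Dict[str, str]) -> bool:
        return all(part.holds(event) for part in self.parts)


@dataclass
class Rule:
    head: str                                 # event class the rule hides
    conditions: list = field(default_factory=list)

    def hides(self, event_class: str, event: Dict[str, str]) -> bool:
        """True if the event belongs to the head class and every condition holds."""
        if event_class != self.head:
            return False
        return all(cond.holds(event) for cond in self.conditions)


# A rule with one complex condition: "Sensor is GPS" AND "Address is Tokyo".
tokyo_gps = Rule("Place-Event", [ComplexCondition([
    SimpleCondition("Sensor", "GPS"),
    SimpleCondition("Address", "Tokyo"),
])])
# A rule with a head only: every instance of the head class is hidden.
all_people = Rule("People-Event")
```

Restricting complex conditions to AND keeps evaluation a plain conjunction, which is why a rule without conditions naturally hides every instance of its head class.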
The links among classes indicate the relations from one class (domain) to another class (range). Table II shows these relations. The left column shows the relation name, the center column shows the domain class, and the right column shows the range class or instance. The relations sensorOf, channelOf, and timeOf associate an Event with its data source, the kind of information, and the time when the lifelog was recorded, respectively. In the Time-Point class, the inCalendar and atClock relations are used to represent a point of time in different formats. For example, January 1st has month and day attributes, so only the inCalendar relation is used. In contrast, January 1st 10:00:00 has month, day, hour, minute, and second attributes; therefore, both the inCalendar and atClock relations are used. The Time-Interval class has start and end relations. The Social-Time class has two relations: atPoint and inInterval. They associate one social time instance with a time point or a time interval. In the Policy class, the relations ruleOf, hasCondition, and hasHead serve for rule description. The relations withEvent, withType, and withValue make direct connections between Policy and classes or instances in the domain model. These relations permit all lifelog events to be connected semantically. They make it easy for users to manage and maintain disclosure rules.
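To make the time relations more concrete, here is a small Python sketch of a time point with optional calendar and clock parts, a time interval with start and end, and a social time tied to an interval. The class layout, the `matches` and `contains` helpers, and the 9:00 to 17:00 office hours are assumptions for illustration only, not part of the ontology.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional

FIELDS = ("year", "month", "day", "hour", "minute", "second")


@dataclass(frozen=True)
class TimePoint:
    # inCalendar parts
    year: Optional[int] = None
    month: Optional[int] = None
    day: Optional[int] = None
    # atClock parts
    hour: Optional[int] = None
    minute: Optional[int] = None
    second: Optional[int] = None

    def matches(self, stamp: "TimePoint") -> bool:
        """True if every part set on this point equals the same part of stamp."""
        return all(
            getattr(self, f) is None or getattr(self, f) == getattr(stamp, f)
            for f in FIELDS
        )


@dataclass(frozen=True)
class TimeInterval:
    start: datetime
    end: datetime

    def contains(self, moment: datetime) -> bool:
        return self.start <= moment <= self.end


@dataclass(frozen=True)
class SocialTime:
    name: str
    in_interval: TimeInterval  # analogue of the inInterval relation


new_year = TimePoint(month=1, day=1)      # calendar parts only
stamp = TimePoint(2009, 1, 1, 10, 0, 0)   # calendar and clock parts
office_hour = SocialTime(                 # assumed definition of office hours
    "Office-Hour",
    TimeInterval(datetime(2008, 12, 22, 9, 0), datetime(2008, 12, 22, 17, 0)),
)
```

Leaving unused parts unset mirrors the idea that "January 1st" needs only calendar information, while a full time stamp needs both calendar and clock information.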
TABLE II. RELATIONS IN THIS MODEL
Relation             Domain             Range
sensorOf             Event              Sensor
channelOf            Event              Channel
inInterval           Social-Time        Time-Interval
ruleOf               Rule               Policy
hasHead              Rule               Head
hasSimpleCondition   ComplexCondition   SimpleCondition
withEvent            Head               Event
withType             SimpleCondition    Any class of lifelog domain
withValue            SimpleCondition    Any instance of lifelog domain

V. DISCLOSURE CONTROL
This section explains how disclosure rules are defined using this ontology model. Each rule has two components: a head and conditions. The system treats the data identified by the head as undisclosed if all corresponding conditions are true. Some rules may have only a head and no condition, which means that all instances in the class related to the head are undisclosed. To show the potential of this model for lifelog data representation and the formation of disclosure rules, we use the example shown in Table III. The user can easily define disclosure rules to handle the data. The following three rules prevent the disclosure of the matching entries in Table III.
r1: Don’t disclose any information about accompanying people.
r2: Don’t disclose the search words from 08/12/21 00:00 to 08/12/21 10:00.
r3: Don’t disclose the visited places in Tokyo collected by the GPS sensor.
The corresponding disclosure rule representations in the system are illustrated in Fig. 3. Rule r1 has the head People-Event and no condition. Rule r2 has the head Search word-Event with one simple condition, namely that Time is 08/12/21 00:00 to 10:00. Similarly, rule r3 can be easily created with the head Place-Event and one complex condition consisting of two connected simple conditions: “Sensor is GPS” and “Address is Tokyo”.

TABLE III. EXAMPLE OF LIFELOG DATA
Time             Address    Place        Search word   Place        People
08/12/21 00:00   Yokohama   Home         Train-time
08/12/21 08:00   Yokohama   Home         Present       Home
08/12/22 09:00   Tokyo      Restaurant
08/12/22 14:00   Tokyo      University                 University   Professor
08/12/22 21:00   Yokohama   Home         Film
08/12/23 19:00   Yokohama   Cinema                     Cinema       Classmate

Figure 3. Example of setting rules using domain model.

VI. IMPLEMENTATION
A. OWL Representation and Disclosure Setting
The lifelog domain model and lifelog disclosure rules are defined in the OWL language and stored as RDF/XML documents. Fig. 4 shows the flow of this process. An example of a Place-Event, in which the place name is University at 2008-06-21T11:50:00 from GPS, is shown in Fig. 5. Disclosure rules, defined by the user, are created as instances of the Rule class. A rule can belong to one or several policies. Each policy is applied to the corresponding service provider(s). Fig. 6 shows the description of the simple condition “Address is Tokyo”.
Although OWL provides power for modeling lifelogs and disclosure rules, it has some limitations. OWL cannot capture the relationship of one property to another in the domain. For example, if a watched TV program and a place are recorded at the same time, it is possible to infer that the user watched TV at that place, but the relation atPlace cannot be defined directly without considering time. In order to overcome this restriction, we use additional definitions written in the Semantic Web Rule Language (SWRL) [15], which realizes extra relations in the lifelog domain.
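The kind of inference that the additional SWRL definitions provide can be pictured with a small Python analogue. This is not SWRL and not the authors' rule; it simply joins events that share a time stamp, and the tuple layout and function name `infer_at_place` are assumptions for illustration.

```python
from collections import defaultdict


def infer_at_place(events):
    """Pair each TV program with the places recorded at the same time stamp.

    events: iterable of (time, channel, value) tuples (hypothetical layout).
    Returns a list of (tv_program, place) pairs, an analogue of atPlace.
    """
    places_at = defaultdict(list)
    for time, channel, value in events:
        if channel == "Place":
            places_at[time].append(value)

    pairs = []
    for time, channel, value in events:
        if channel == "TV-program":
            for place in places_at.get(time, []):
                pairs.append((value, place))
    return pairs


sample = [
    ("2008-12-22T21:00", "Place", "Home"),
    ("2008-12-22T21:00", "TV-program", "News"),
    ("2008-12-23T19:00", "TV-program", "Film"),  # no place recorded at this time
]
```

Here `infer_at_place(sample)` relates "News" to "Home" only, since no place shares the time stamp of "Film"; time acts as the join key, as in the text.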
A simple rule editor is used to define disclosure rules with simple or complex conditions. The lifelog data in the ontology model is loaded and displayed to help the end-user make his/her own disclosure rules. For example, rule r3 (in Fig. 3) can be set with the rule editor as shown in Fig. 7.
We implemented a prototype of this model and the disclosure control system by using the Protégé library and the Jess (Java Expert System Shell) rule engine, which uses the well-known Rete algorithm for rule matching. Various personal lifelog data are represented in this model. For example, lifelogs of address, place, transport, people, TV-program, search word, and activity are represented in different sub-classes of the Channel class.
The domain model was utilized for setting disclosure rules. These rules were then translated into SWRL. Jess accepts rules formulated in SWRL, which are translated into Jess rules by the Java API SWRL factory. Jess fires the rules and returns disclosed/undisclosed attributes for each instance in the lifelog model.
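As a rough picture of what "returning a disclosed/undisclosed attribute for each instance" amounts to, the sketch below applies rules r1 to r3 of Section V to a few entries modeled on Table III. It is plain Python, not Jess or SWRL; the dictionary layout, the assumption that the listed places come from GPS, and the `label` function are all hypothetical.

```python
from datetime import datetime

# A few events modeled on Table III (sensor assignment is an assumption).
EVENTS = [
    {"event": "Search word-Event", "value": "Train-time", "time": datetime(2008, 12, 21, 0, 0)},
    {"event": "Search word-Event", "value": "Present", "time": datetime(2008, 12, 21, 8, 0)},
    {"event": "Place-Event", "value": "Restaurant", "time": datetime(2008, 12, 22, 9, 0),
     "sensor": "GPS", "address": "Tokyo"},
    {"event": "Place-Event", "value": "Home", "time": datetime(2008, 12, 22, 21, 0),
     "sensor": "GPS", "address": "Yokohama"},
    {"event": "Search word-Event", "value": "Film", "time": datetime(2008, 12, 22, 21, 0)},
    {"event": "People-Event", "value": "Classmate", "time": datetime(2008, 12, 23, 19, 0)},
]

# Each rule: (head, list of conditions that must all hold).
RULES = {
    "r1": ("People-Event", []),
    "r2": ("Search word-Event", [
        lambda e: datetime(2008, 12, 21, 0, 0) <= e["time"] <= datetime(2008, 12, 21, 10, 0),
    ]),
    "r3": ("Place-Event", [
        lambda e: e.get("sensor") == "GPS",
        lambda e: e.get("address") == "Tokyo",
    ]),
}


def label(events, rules):
    """Attach 'undisclosed' to every event matched by some rule, else 'disclosed'."""
    result = []
    for e in events:
        hidden = any(
            e["event"] == head and all(cond(e) for cond in conds)
            for head, conds in rules.values()
        )
        result.append((e["value"], "undisclosed" if hidden else "disclosed"))
    return result
```

Under these assumptions, `label(EVENTS, RULES)` marks "Train-time", "Present", "Restaurant", and "Classmate" as undisclosed and leaves "Home" and "Film" disclosed. A Rete-based engine such as Jess reaches the same kind of result without rescanning every rule for every event.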
Figure 4. Flow of disclosure processing in this implementation.
Figure 5. Example of instance in Place-Event.
Figure 6. Example of instance in Rule.
Figure 7. User interface for setting rules.
B. Experiments
In order to clarify the feasibility of the proposed ontology model for lifelog data representation, and to check its run-time performance when matching the domain data against user-defined disclosure rules, we performed two experiments. In these experiments, we used a laptop computer with an Intel Core 2 Duo 1.40 GHz CPU and 1.5 GB of RAM. Results are calculated as the average of 5 runs. Lifelog data and disclosure rules are represented in five basic classes and seven sub-classes: GPS-sensor, Remote-controller, Address, Place, Transport, People, and TV-program. The size of the dataset was measured in terms of the number of instances in the Event class.
<policy:Rule rdf:ID="Rule_2">
  <policy:hasHead>
    <policy:Head rdf:ID="Head_1">
      <policy:withEvent rdf:resource="#AddressEvent"/>
    </policy:Head>
  </policy:hasHead>
  <policy:hasComplexCondition>
    <policy:Complex rdf:ID="Complex_4">
      <policy:hasSimpleCondition>
        <policy:Simple rdf:ID="Simple_3">
          <policy:withType rdf:resource="#Address"/>
          <policy:withValue>
            <Address rdf:about="#Tokyo"/>
          </policy:withValue>
        </policy:Simple>
      </policy:hasSimpleCondition>
    </policy:Complex>
  </policy:hasComplexCondition>
</policy:Rule>
<PlaceEvent rdf:about="http://lifelog.owl#PlaceEvent_19">
  <domain:TimeOf rdf:resource="http://lifelog.owl#TimePoint_20"/>
  <domain:ChannelOf rdf:resource="http://lifelog.owl#University"/>
  <domain:SensorOf rdf:resource="&domain;GPSsensor"/>
</PlaceEvent>
<time:TimePoint rdf:about="http://lifelog.owl#TimePoint_20">
  <time:inCalendar rdf:resource="&time;year-2008"/>
  <time:inCalendar rdf:resource="&time;month-6"/>
  <time:inCalendar rdf:resource="&time;day-21"/>
  <time:inCalendar rdf:resource="&time;weekday-MONDAY"/>
  <time:atClock rdf:resource="&time;hour-11"/>
  <time:atClock rdf:resource="&time;minute-50"/>
  <time:atClock rdf:resource="&time;second-0"/>
</time:TimePoint>
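[Figure 3 diagram: rules r1, r2, and r3 in POLICY, each with a Head and, where present, a Simple Condition or Complex Condition, linked to Domain elements including People-Event, Search word-Event, Sensor, and Time 08/12/21 00:00-10:00. Italic: class. Underline: instance.]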
[Figure 4 diagram elements: Lifelog Data, Policy Editor, Lifelog OWL model (Domain (OWL), Policy (OWL)), Policy (SWRL), Rule engine, Disclosed data.]
In the first experiment, we measured the run time for matching disclosure rules against lifelog data with an increasing number of rules. Each rule in this case has a simple condition (Place is Home, Transportation is Private car, Address is Tokyo, etc.). The number of instances in the Event class was 1000. Data for this experiment were extracted from the real-life records of a person. In the second experiment, we measured the run time with an increasing number of lifelogs. The number of classes and the following three complex disclosure rules were kept constant. These complex rules require the combination of different events, where the time attribute is used as the key relation.
1. Don’t disclose the Place located in Tokyo which is collected by the GPS sensor.
2. Don’t disclose People accompanying me in a private car.
3. Don’t disclose TV-program at a friend’s home when the time is a weekend.
Fig. 8 and Fig. 9 show the results. When the number of lifelogs was 1000, the run time was under 5 s. If we assume that one lifelog is obtained every minute, the number of lifelogs per day is on the order of 1000 (at most 24 × 60 = 1440). Thus, the user can apply his/her rules to a day's lifelogs in about 5 seconds. Therefore, we think this result shows that the performance of our algorithm is acceptable for users.
Figure 8. Run time with growing number of disclosure rules.
Figure 9. Run time with growing number of lifelogs.
Run time grows linearly with the number of disclosure rules. However, it grows as a power function of the number of lifelog events added to the rule engine. This shows that the run time for extracting disclosed data according to the rules depends largely on the dataset and the nature of the rules rather than on the number of rules. Therefore, in order to deal with huge datasets collected continuously over long periods, we should manage the scale of the dataset input to the rule engine. One solution is for the system to perform the matching process on partial sets of events instead of on the entire dataset in a single step. The dataset could be divided into several parts depending on the structure of the rules and the time period (by one day, by one week, etc.).
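The partitioning idea can be sketched as follows. This is a hedged illustration, not the authors' implementation: the event layout, the helper names `partition_by_day` and `match_in_parts`, and the use of a plain predicate in place of the rule engine are all assumptions. It also presumes that no rule needs to combine events from different days.

```python
from collections import defaultdict
from datetime import datetime


def partition_by_day(events):
    """Group events by calendar day so each part can be matched separately."""
    parts = defaultdict(list)
    for event in events:
        parts[event["time"].date()].append(event)
    return dict(parts)


def match_in_parts(events, is_hidden):
    """Run the matching predicate one day at a time instead of on all events at once."""
    parts = partition_by_day(events)
    hidden = []
    for day in sorted(parts):
        hidden.extend(e["value"] for e in parts[day] if is_hidden(e))
    return hidden


log = [
    {"value": "Restaurant", "address": "Tokyo", "time": datetime(2008, 12, 22, 9, 0)},
    {"value": "Home", "address": "Yokohama", "time": datetime(2008, 12, 22, 21, 0)},
    {"value": "Cinema", "address": "Yokohama", "time": datetime(2008, 12, 23, 19, 0)},
]
```

With super-linear growth in the number of events, matching several small parts is expected to cost less in total than matching one large set, provided the chosen period matches the structure of the rules.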
This paper studied the representation of lifelogs collected continuously from sensors and PC-based applications, and the rules to control their disclosure. We proposed an ontology model that consists of five basic classes: Channel, Sensor, Time, Event, and Policy, together with their extensible sub-classes. We compared the model to previous works and showed its superiority with regard to disclosure control. In the future, we will investigate user-friendly interfaces for making rules that more closely mimic natural language policies. Also, we will apply this model to actual lifelog services and evaluate its performance and the ease with which rules can be set.
REFERENCES
[1] S. Eitoku, et al., “A study on Visualization of Personal Lifelog Considering Disclosure Control”, IEICE Technical Report, ISEC2008-88, OIS2008-64, pp. 97-104, 2008 (in Japanese).
[2] J. Hightower, et al., “Practical Lessons from Place Lab”, IEEE Pervasive Computing, vol. 5, no. 3, pp. 32-39, 2006.
[3] A. Tomioka, et al., “Information Distributing System Based on User Behavior”, NTT Docomo Technical Journal, vol. 9, no. 1, pp. 51-56, 2007.
[4] M. Abe, et al., “A Life Log Collector Integrated with a Remote-Controller for Enabling User Centric Services”, IEEE Transactions on Consumer Electronics, vol. 55, no. 1, 2009 (to appear).
[5] M. Nishino, et al., “A Place Prediction Algorithm Based on Time-Sensitive Frequent Patterns”, Proc. of Pervasive 2009 (to appear).
[6] S. Seko, et al., “An Algorithm to Estimate the Level of Friendship Based on the Mode of Transportation and the Time Spent Sharing Movement Tracks”, Proc. of Pervasive 2009 (to appear).
[7] P. Floreen, et al., “Towards a Context Management Framework for MobiLife”, IST Mobile & Wireless Communications Summit, 2005.
[8] G. Kirby, et al., “GloSS Ontology and Narratives”, GLOSS Consortium Report D7, 2002.
[9] H. Chen, et al., “SOUPA: Standard Ontology for Ubiquitous and Pervasive Applications”, Proc. of First Annual International Conference on Mobile and Ubiquitous Systems: Networking and Services (MobiQuitous'04), pp. 258-267, 2004.
[10] M. A. Strimpakou, et al., “A Context Ontology for Pervasive Service Provision”, Proc. of 20th International Conference on Advanced Information Networking and Applications, pp. 775-779, 2006.
[11] Hamdeh N. A., et al., “OWL-based Ontology for Secure and Adaptable Ubiquitous Environment”, Proc. of Third International Conference on Semantics, Knowledge and Grid, pp. 230-235, 2007.
[12] F. Pan, et al., “Temporal Aggregates in OWL-Time”, Proc. of 18th Inter. Conf. Florida Artificial Intelligence Research Society, pp. 560-565, AAAI Press, 2005.
[13] Q. Zhou, et al., “A Reusable Time Ontology”, Proc. of the AAAI Workshop on Ontologies for the Semantic Web, 2002.
[14] L. Kagal, “Rei: A Policy Language for the Me-Centric Project”, Technical report of HP Laboratories Palo Alto, 2002.
[15] M. O'Connor, et al., “Supporting Rule System Interoperability on the Semantic Web with SWRL”, The Semantic Web - ISWC 2005, pp. 974-986, 2005.