Due to the dynamic properties of sensor data, the probability of satisfying the condition in a sub-query at a node may change with time. Therefore the coordinator node needs to reorder the sequence of the participating nodes periodically. The reorder procedure is performed when the following condition is satisfied: the evaluation is stopped at the same node, called the false node, consecutively for a pre-defined period of time, and the false node is not the first node. Satisfaction of these conditions suggests that the sensor data values generated at the false node have a high probability of being false in the next evaluation. Hence the coordinator node reorders the sequence of the nodes using the following procedure, sketched in code below:
a. The false node becomes the first node in the sequence.
b. All the nodes following the false node remain in the same relative order to each other.
c. All the nodes in front of the false node remain in their original relative order. They rejoin the node sequence, but are now attached after the last node of the original sequence.
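In Java-like form (the actual coordinator logic runs in NesC on the motes, so this is only an illustrative sketch with hypothetical names), the reordering is:

    import java.util.ArrayList;
    import java.util.List;

    class Coordinator {
        // Moves the false node to the front (step a); the nodes that followed it
        // keep their relative order (step b); the nodes that preceded it rejoin
        // at the tail in their original relative order (step c).
        static <T> List<T> reorder(List<T> sequence, int falseIndex) {
            List<T> reordered = new ArrayList<>();
            reordered.add(sequence.get(falseIndex));                             // (a)
            reordered.addAll(sequence.subList(falseIndex + 1, sequence.size())); // (b)
            reordered.addAll(sequence.subList(0, falseIndex));                   // (c)
            return reordered;
        }
    }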
4 Implementation
CMQES is implemented with MICA Motes [MM]. In CMQES, one of the MSPUs is connected with the base station through a mote interface board; it is the base station MSPU. CMQES contains two main software components: the sensor program in the MSPU, and the server program in the base station. The sensor program is implemented in NesC, which is a C-like language for TinyOS [TINY], and is responsible for capturing sensor data values, evaluating queries, submitting sub-query results to the coordinator nodes, and collecting performance statistics at each MSPU. We have implemented SeqPush in the sensor program. The evaluation results of a CMQ and performance statistics are periodically sent to the base station through the base station MSPU for reporting.
The second program is the server program residing at the base station. This program is implemented in Java, with MS Windows 2000 and MS SQL Server chosen as the operating system and database, respectively. The server program is responsible for collecting results and performance statistics. In addition, the following parameters can be set using the server program at the base station for submitting a CMQ and controlling the operations at the MSPUs:
1. The sampling rate of the MSPUs.
2. The number of nodes participating in a CMQ.
3. The aggregation functions to be performed at a node, i.e., calculating the mean, maximum, and minimum of the sub-query values.
4. The condition for the sub-query of a CMQ at an MSPU.
5 Demonstration and Sample Results
In the demonstration, through the interface at the base station, we can submit CMQs for processing at the MSPUs, as shown in Figure 1. Currently, each MSPU is programmed to periodically capture the light intensity of its surrounding environment,
i.e., every 2 sec, as sensor data values, and a message is sent every 30 sec to the coordinator node. The sampling rate and message reporting period can be varied at the base station. The message size for communication in TinyOS is 34 bytes. Five bytes are reserved for the message header, and the message contains the results from 10 evaluations, with each reading taking up 2 bytes. The remaining 9 bytes are for the cycle number, message type, and destination address. Currently, the transmission delay of a single message from one MSPU to another in the testing environment is between 300 ms and 700 ms. As TinyOS only provides a best-effort message delivery service, a lost message is considered a missed evaluation cycle and logged accordingly by the coordinator node.
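The message layout just described can be summarized by the following sketch; the ordering of the trailing control fields is our assumption, not something the paper specifies:

    // Summary of the 34-byte CMQES message layout; field order among the
    // trailing 9 control bytes is assumed.
    final class CmqMessageLayout {
        static final int MESSAGE_SIZE  = 34; // TinyOS message size
        static final int HEADER_BYTES  = 5;  // reserved for the message header
        static final int READINGS      = 10; // results from 10 evaluations
        static final int READING_BYTES = 2;  // each reading takes up 2 bytes
        static final int CONTROL_BYTES = 9;  // cycle number, message type, destination

        static {
            // 5 + 10*2 + 9 = 34
            assert HEADER_BYTES + READINGS * READING_BYTES + CONTROL_BYTES == MESSAGE_SIZE;
        }
    }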
Fig 1. Program at the base station
Fig 2. Real-time display of received messages and statistics

Our experimental results show that the number of messages submitted in the central aggregation scheme (CAS), i.e., where all participating MSPUs periodically submit their sub-query results to a central MSPU for data aggregation, is much larger than that in
SeqPush. Two ammeters are connected to one of the participating nodes and the coordinator node to measure the energy consumption rates of the nodes when different operations are performed at the nodes.
The results captured by the base station are displayed in real time, as shown in Fig 2. The statistics include:
(1) the number of messages transmitted, including sent and received messages;
(2) the number of successful evaluations and the number of missed query results due to message loss.
eVitae: An Event-Based Electronic Chronicle
Bin Wu, Rahul Singh, Punit Gupta, and Ramesh Jain
Experiential Systems Group, Georgia Institute of Technology
Abstract. …of the data in a manner that is independent of media. To store events, a novel database called EventBase is developed, which is indexed by events. The unique characteristics of events make multidimensional querying and multiple-perspective explorations of personal history information feasible. In this demo we present the major functions of eVitae.
1 Introduction
Personal history systems electronically record important activities from a person's life in the form of photographs, text, video, and audio. Examples of existing systems such as [3] have shown that there are a number of salient challenges in this domain. First, information is fundamentally anchored to space and time, and people often exploit them as cues for querying information. Second, the data, as the carrier of information, stays in respective silos; this fragments meaningful information across data. Third, to break down these silos, an information model independent of media is required to depict the content of information. Lastly, presentation of the information must be dynamically generated according to individual users' preferences.
We have developed a personal eChronicle [1] called eVitae [4]. In eVitae we utilize a novel generative theory based upon the concept of an event to design an information system [2]. In this paper, we show how the eVitae system, as an access environment, ingests heterogeneous data into meaningful information conveyed by events, aids the user to quickly focus on what is of interest, and presents a multidimensional environment for exploration of events, with their details stored in appropriate media.
2 Event-Based Data Modeling
The approach we employ to design and implement eVitae is based on the notion of events [6]. An event is an occurrence or happening of significance that can be defined as a region or collection of regions in spatial-temporal-attribute space.
Given k events, each event is formally denoted as e_i = (t, s, a_1, …, a_n), 1 ≤ i ≤ k, and uniquely identified by eID (the event identifier). In this notation t characterizes the event temporally, s denotes the spatial location(s) associated with the event, and a_1, …, a_n are the attributes associated with the event. An event is defined by its event model, which includes the mandatory attributes space, time, transcluded-media, event-name, and event-topic, and a finite set of free attributes.
Events can be grouped together in collections, called event categories. Formally, an event category is a set of events that comprise the category. Event categories provide a powerful construct to support multiple ways of organizing information, the definition of complex queries and notifications, and personalized views of the parts of the information space in which the user is interested.
According to the definition of the event, the information layer implemented by events breaks down the data silos. This layer uses an event-based data model to construct a new index that is independent of data type. The organization, indexing, and storage of events conveying potentially changing information are accomplished by parsing the data as it is entered and storing all events in a database of events called EventBase. The data is parsed by the system and events are produced using the event model. The EventBase also stores links to the original data sources, which means the system can present the appropriate media in the context of a particular event. EventBase is an extension of a traditional database. In the implementation of the prototype eVitae system, we use MySQL as the database to store and index events.
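As an illustration, the event model above could be instantiated by a MySQL schema along the following lines; all table and column names here are our own sketch, not the system's actual schema:

    -- One row per event, keyed by eID, holding the mandatory attributes of
    -- the event model; free attributes and transcluded media go in side tables.
    CREATE TABLE event (
      eID         INT PRIMARY KEY,
      event_name  VARCHAR(255) NOT NULL,
      event_topic VARCHAR(255) NOT NULL,
      t_start     DATETIME NOT NULL,     -- temporal characterization t
      t_end       DATETIME,
      space       VARCHAR(255) NOT NULL  -- spatial location(s) s
    );

    CREATE TABLE event_attribute (       -- finite set of free attributes a_1..a_n
      eID   INT REFERENCES event(eID),
      name  VARCHAR(64),
      value VARCHAR(255)
    );

    CREATE TABLE event_media (           -- transcluded media: links to raw data
      eID INT REFERENCES event(eID),
      uri VARCHAR(255)
    );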
3 System Architecture
The architecture of the eVitae system comprises three modules, namely, Event Entry, EventBase, and a What-You-See-Is-What-You-Get (WYSIWYG) query and exploration environment. The key features of the system are briefly discussed as follows.

EventBase. EventBase is the backend of the eVitae system, which stores the events. The transclusion of media is maintained by storing links between an event and the data it is based upon. EventBase uses the eID attribute of events as the unified index and is supported by MySQL.
WYSIWYG query and exploration environment. This is an integrated interaction environment to explore the electronic chronicle of a person, as shown in Figure 1. By using temporal and spatial relationships as cues and by defining event categories to organize events, we create an exploratory environment for the user. The event exhibit panel (Fig 1) presents an overall picture of events. Options for zooming, filtering, extraction, viewing relations, history keeping, and details-on-demand make the environment appealing and powerful.
Event Entry. An event may include multifarious data from different sources, such as video, audio, sensor readings, and text. The Event Entry module of the system is designed to produce events using event models and to record the links between events and related data.
Fig 1. WYSIWYG Query and Exploration Environment
4 Conclusion
The eVitae system demonstrates an event-based approach to the organization, storage, management, and querying of personal history information comprising multifarious data. The system provides an event entry module for importing data from heterogeneous data sources and assimilating them into events. This information is stored in EventBase, which is indexed by events. Event-based modeling allows multidimensional querying and exploration of personal history information. Furthermore, flexibility in event definition and organization allows exploration of the data from multiple perspectives. Preliminary results indicate that an event-based system offers significant advantages not only in information organization, but also in the exploration and discovery of information from data.
References
1. R. Jain, "Multimedia Electronic Chronicles", IEEE MultiMedia, Volume 10, Issue 03, pp. 102-103, July 2003.
2. R. Jain, "Events in Heterogeneous Data Environments", Proc. International Conference on Data Engineering, Bangalore, March 2003.
3. J. Gemmell, G. Bell, R. Lueder, S. Drucker, and C. Wong, "MyLifeBits: fulfilling the Memex vision", ACM Multimedia, pp. 235-238, ACM, 2002.
4. R. Singh, B. Wu, P. Gupta, and R. Jain, "eVitae: Designing Experiential eChronicles", ESG Technical Report GT-ESG-01-10-03.
CAT: Correct Answers of Continuous Queries Using Triggers
Department of Computer Science, University of Illinois at Chicago,
Chicago, IL 60607
{wolfson,nnedunga}@cs.uic.edu
1 Introduction and Motivation
Consider the query Q1: Retrieve all the motels which will be no further than 1.5 miles from my route, sometime between 7:00PM and 8:30PM, which a mobile user posed to a Moving Objects Database (MOD). Processing such queries is of interest to a wide range of applications (e.g., tourist information systems and context awareness [1,2]). These queries pertain to the future of a dynamic world. Since the MOD is only a model of the objects moving in the real world, the accuracy of the representation has to be continuously verified and updated, and the answer-set of Q1 has to be re-evaluated in every clock-tick.¹ However, the re-evaluation of such queries can be avoided if an update to the MOD does not affect the answer-set.
The motion of a moving object is typically represented as a trajectory, whose projection in the X-Y plane is called a route. The details of the construction, based on electronic maps and the speed-profiles of the city blocks, are given in [5]. After a trajectory is constructed, a traffic abnormality may occur at a future time-point, due to an accident, road-work, etc., and once it occurs, we need to: identify the trajectories that are affected, and update them properly (cf. [5]). In the sequel, we focus on the impact of the abnormalities on the continuous queries.
Figure 1 shows three trajectories and their respective routes. Suppose a road-work starts at 4:30PM on the segment between A and B and lasts 5 hours, slowing down the speed between 4:30PM and 9:30PM. A trajectory that enters that segment after 4:30PM will need to have its future portion modified. As illustrated by the thicker portion of the affected trajectory, instead of being at the point B at 4:50, the object will be there at 5:05. A key observation is that if the object whose trajectory was affected issued the query Q1, we have to re-evaluate the answer.
* Research partially supported by NSF grant EIA-0000536.
1 Hence the name continuous queries – formally defined for MOD in [3].
There are tables which: store the trajectories (MOT) and landmarks of interest (BUILDINGS); keep track of the queries posed and their answers (PENDING_QUERIES and ANSWERS); and store the information about the traffic abnormalities (TRAFFIC_ABN). The trajectories and the landmarks were obtained using real maps of Cook County, Chicago.
At the heart of the CAT system is a set of triggers, part of which we illustrate in the context of the example query Q1. Suppose an object posed the query Q1 at 3:30PM, and its answer contains two motels to which the object should be close at 7:55PM and 8:20PM, respectively. When an abnormality is detected, its relevant parameters are inserted in the TRAFFIC_ABN table. This insertion satisfies the condition of one of the triggers, and its action part re-evaluates the query Q1, based on the new future-portion of the affected trajectory. Due to the delay, the trajectory will
Fig 2. Behavioral Aspects of the CAT
be near the first motel at 8:35PM, which is a bit too late for the user. On the other hand, the object will now be near another motel at 7:05PM; before the traffic incident this motel was not part of the answer (it would have had the desired proximity at 6:50PM, outside the interval of interest).
All the back-end components are implemented using Oracle 9i as a server. We used User-Defined Types (UDT) to model the entities and User-Defined Functions (UDF) to implement the processing, exploiting the Oracle Spatial predicates.
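A trigger of the kind described above might be sketched as follows; the TRAFFIC_ABN columns and the procedures invoked are hypothetical stand-ins for the system's actual UDFs:

    -- Sketch only: column names and the two procedures are illustrative.
    CREATE OR REPLACE TRIGGER on_traffic_abnormality
    AFTER INSERT ON TRAFFIC_ABN
    FOR EACH ROW
    BEGIN
      -- adjust the future portions of the trajectories crossing the segment
      update_affected_trajectories(:NEW.segment_id, :NEW.from_time, :NEW.to_time);
      -- re-evaluate the pending continuous queries whose answers may change
      reevaluate_pending_queries(:NEW.segment_id);
    END;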
The front-end client, which is the GUI presented to the end-user, is implemented in Java. The GUI gives the options of specifying the queries (i.e., time of submission; relevant time interval; objects of interest; etc.). Once the user clicks the SUBMIT button, the query is evaluated and its answer is displayed. In the server, the query is assigned an id number and stored in the PENDING_QUERIES table. Clearly, in a real MOD application, the client will be either a wireless (mobile) user or a web browser-based one, properly interfaced to the server.
To test the execution of the triggers and the updates of the answer(s) to the continuous queries posed, the GUI offers a window for generating a traffic abnormality. The user enters the beginning and the end times of the incident as well as its “type” (which determines the impact on the speed-profile). He also enters the route segments along which the traffic incident is spread. The moment this information is submitted to the server, the affected trajectories are updated and the new answer(s) to the posed continuous queries are displayed back to the user.
References
1. A. Hinze and A. Voisard. Location- and time-based information delivery in tourism. In SSTD, 2003.
2. A. Pashtan, R. Blatter, A. Heusser, and P. Scheuermann. CATIS: A context-aware tourist information system. In IMC, 2003.
3. A. P. Sistla, O. Wolfson, S. Chamberlain, and S. Dao. Modeling and querying moving objects. In ICDE, 1997.
4. G. Trajcevski and P. Scheuermann. Triggers and continuous queries in moving
Hippo: A System for Computing Consistent Answers to a Class of SQL Queries
Jan Chomicki¹, Jerzy Marcinkowski², and Slawomir Staworko¹
¹ Dept. of Computer Science and Engineering, University at Buffalo
{chomicki,staworko}@cse.buffalo.edu
² Instytut Informatyki, Wroclaw University, Poland
Jerzy.Marcinkowski@ii.uni.wroc.pl
1 Motivation and Introduction
Integrity constraints express important properties of data, but the task of preserving data consistency is becoming increasingly problematic with new database applications. For example, in the case of integration of several data sources, even if the sources are separately consistent, the integrated data can violate the integrity constraints. The traditional approach, removing the conflicting data, is not a good option because the sources can be autonomous. Another scenario is a long-running activity where consistency can be violated only temporarily and future updates will restore it. Finally, data consistency may be neglected because of efficiency or other reasons.
In [1] Arenas, Bertossi, and Chomicki proposed a theoretical framework for querying inconsistent databases. Consistent query answers are defined to be those query answers that are true in every repair of a given database instance. A repair is a consistent database instance obtained by changing the given instance using a minimal set of insertions/deletions. Intuitively, consistent query answers are independent of the way the inconsistencies in the data would be resolved.
Example 1. Assume that an instance of the relation Student(Name, Address), with the functional dependency Name → Address, is as follows:

    Name     Address
    Smith    Los Angeles
    Smith    New York
    Jones    Chicago

The above instance has two repairs: one obtained by deleting the first tuple, the other by deleting the second tuple. A query asking for the address of Jones returns Chicago as a consistent answer because the third tuple is in both repairs. However, the query asking for the address of Smith has no consistent answers because the addresses in different repairs are different. On the other hand, the query asking for those people who live in Los Angeles or New York returns Smith as a consistent answer.
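In SQL terms, assuming the instance above is stored in a table student(name, address), the three example queries and their consistent answers are:

    SELECT address FROM student WHERE name = 'Jones';
    -- consistent answer: Chicago (the tuple is in both repairs)

    SELECT address FROM student WHERE name = 'Smith';
    -- no consistent answers (each repair keeps a different address)

    SELECT name FROM student
     WHERE address = 'Los Angeles' OR address = 'New York';
    -- consistent answer: Smith (one of the two addresses holds in every repair)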
This conservative definition of consistent answers has one shortcoming: the
number of repairs. Even for a single functional dependency, the number of repairs
can be exponential in the number of tuples in the database [3]. Nevertheless, several practical mechanisms for the computation of consistent query answers without computing all repairs have been developed (see [5] for a survey): query rewriting [1], logic programs [2,4,9], and compact representations of repairs [6,7]. The first is based on rewriting the input query Q into a query Q' such that the evaluation of Q' returns the set of consistent answers to Q. This method works only for SJD¹ queries in the presence of universal binary constraints. The second approach uses disjunctive logic programs to specify all repairs, and then with the help of a disjunctive LP system [8] finds the consistent answers to a given query. Although this approach is applicable to very general queries in the presence of universal constraints, the complexity of evaluating disjunctive logic programs makes this method impractical for large databases.
The system Hippo is an implementation of the third approach. All information about integrity violations is stored in a conflict hypergraph, where every hyperedge connects the tuples that together violate an integrity constraint. Using the conflict hypergraph, we can determine whether a given tuple belongs to the set of consistent answers without constructing all repairs [6]. Because the conflict hypergraph has polynomial size, this method has polynomial data complexity and allows us to efficiently deal even with large databases [7]. Currently, our application computes consistent answers to SJUD queries in the presence of denial constraints (a class containing functional dependency constraints and exclusion constraints). Allowing union in the query language is crucial for being able to extract indefinite disjunctive information from an inconsistent database (see Example 1).
Future work includes support for restricted foreign key constraints, universal tuple-generating dependencies, and full PSJUD² queries. However, because computing consistent query answers for SPJ queries is co-NP-data-complete [3,6], polynomial data complexity cannot be guaranteed once projection is allowed.
The whole system is implemented in Java as an RDBMS frontend. Hippo works with any RDBMS that can execute SQL queries and provides a JDBC access interface (we use PostgreSQL). The data stored in the RDBMS need not be altered.
The flow of data in Hippo is presented in Figure 1. Before processing any input query, the system performs Conflict Detection and creates the Conflict Hypergraph for further usage. We are assuming that the number of conflicts is small enough for the hypergraph to be stored in main memory. The only output of this system is the Answer Set consisting of the consistent answers to the input
¹ When describing a query class, P stands for projection, S for selection, U for union, J for cartesian product, and D for difference.
² Currently, our application supports only those cases of projection that don't introduce existential quantifiers in the corresponding relational calculus query.
Fig 1. Data flow in Hippo
Query in the database instance DB with respect to a set of integrity constraints
IC.
The processing of the Query starts with Enveloping. As a result of this step we get a query defining the Candidates (candidate consistent query answers). This query subsequently undergoes Evaluation by the RDBMS. For every tuple from the set of candidates, the system uses the Prover to check if the tuple is a consistent answer to the Query. Depending on the result of this check, the tuple is either added to the Answer Set or not.
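The overall loop can be sketched in Java (the system's implementation language); the types and method names are our illustration of the steps in Figure 1, not Hippo's actual API:

    import java.util.ArrayList;
    import java.util.List;

    interface Tuple {}
    interface Query {}
    interface ConflictHypergraph {}
    interface Database { Iterable<Tuple> evaluate(Query q); }
    interface Prover {
        // membership check against the conflict hypergraph (algorithm of [6])
        boolean isConsistentAnswer(Tuple candidate, Query query, ConflictHypergraph hg);
    }

    class HippoFlow {
        // Enveloping: rewrites the input query into one whose result is a
        // superset (the Candidates) of the consistent answers; see [7].
        Query buildEnvelope(Query query) { throw new UnsupportedOperationException(); }

        List<Tuple> consistentAnswers(Query query, Database db,
                                      ConflictHypergraph hg, Prover prover) {
            List<Tuple> answerSet = new ArrayList<>();
            for (Tuple candidate : db.evaluate(buildEnvelope(query))) { // Evaluation
                if (prover.isConsistentAnswer(candidate, query, hg)) {  // Prover
                    answerSet.add(candidate);
                }
            }
            return answerSet;
        }
    }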
For every tuple that the Prover processes, several membership checks typically have to be performed. In the base version of the system this is done by simply executing the appropriate membership queries on the database. This is a costly procedure and it has a significant influence on the overall time performance of the system. We have introduced several optimizations addressing this problem. In general, by modifying the expression defining the envelope (the set of candidates), the optimizations allow us to answer the required membership checks without executing any queries on the database. Also, using an expression selecting a subset of the set of consistent query answers, we can significantly reduce the number of tuples that have to be processed by the Prover. A more detailed description of those techniques can be found in [7].
The presentation of the Hippo system will consist of three parts. First, we will demonstrate that using consistent query answers we can extract more information from an inconsistent database than in the approach where the input query is evaluated over the database from which the conflicting tuples have been removed. Secondly, we will show the advantages of our method over competing approaches by demonstrating the expressive power of the supported queries and integrity constraints. And finally, we will compare the running times of our approach and the query rewriting approach, showing that our approach is more efficient. For every query being tested, we will also measure the execution time of this query by the RDBMS backend (it corresponds to the approach where we ignore the fact that the database is inconsistent). This will allow us to conclude that the time overhead of our approach is acceptable.
References
1. M. Arenas, L. Bertossi, and J. Chomicki. Consistent Query Answers in Inconsistent Databases. In ACM Symposium on Principles of Database Systems (PODS), pages 68–79, 1999.
2. M. Arenas, L. Bertossi, and J. Chomicki. Answer Sets for Consistent Query Answering in Inconsistent Databases. Theory and Practice of Logic Programming, 3(4–5):393–424, 2003.
3. M. Arenas, L. Bertossi, J. Chomicki, X. He, V. Raghavan, and J. Spinrad. Scalar Aggregation in Inconsistent Databases. Theoretical Computer Science, 296(3):405–434, 2003.
4. P. Barcelo and L. Bertossi. Logic Programs for Querying Inconsistent Databases. In International Symposium on Practical Aspects of Declarative Languages (PADL), pages 208–222. Springer-Verlag, LNCS 2562, 2003.
5. L. Bertossi and J. Chomicki. Query Answering in Inconsistent Databases. In J. Chomicki, R. van der Meyden, and G. Saake, editors, Logics for Emerging Applications of Databases. Springer-Verlag, 2003.
6. J. Chomicki and J. Marcinkowski. Minimal-Change Integrity Maintenance Using Tuple Deletions. Technical Report cs.DB/0212004, arXiv.org e-Print archive, December 2002. Under journal submission.
7. J. Chomicki, J. Marcinkowski, and S. Staworko. Computing Consistent Query Answers Using Conflict Hypergraphs. In preparation.
8. T. Eiter, W. Faber, N. Leone, and G. Pfeifer. Declarative Problem-Solving in DLV. In J. Minker, editor, Logic-Based Artificial Intelligence, pages 79–103. Kluwer, 2000.
9. G. Greco, S. Greco, and E. Zumpano. A Logical Framework for Querying and Repairing Inconsistent Databases. IEEE Transactions on Knowledge and Data Engineering.
An Implementation of P3P Using Database Technology
Rakesh Agrawal, Jerry Kiernan, Ramakrishnan Srikant, and Yirong Xu
IBM Almaden Research Center
650 Harry Road, San Jose, CA 95120, USA
{ragrawal,kiernan,srikant}@almaden.ibm.com, xuyirong@cn.ibm.com
http://www.almaden.ibm.com/software/quest
1 Introduction
The privacy of personal information on the Internet has become a major concern for governments, businesses, media, and the public. The Platform for Privacy Preferences (P3P), developed by the World Wide Web Consortium (W3C), is the most significant effort underway to enable web users to gain more control over their private information. P3P provides mechanisms for a web site to encode its data-collection and data-use practices in a standard XML format, known as a P3P policy [3], which can be programmatically checked against a user's privacy preferences.
This demonstration presents an implementation of the server-centric architecture for P3P proposed in [1]. The novel aspect of this implementation is that it makes use of proven database technology, as opposed to the prevailing client-centric implementations based on specialized policy-preference matching engines. Not only does this implementation have qualitative advantages, our experiments indicate that it performs significantly better (15-30 times faster) than the sole public-domain client-centric implementation, and that the latency introduced by preference matching is small enough (0.16 second on average) for real-world deployments of P3P [1].
The P3P protocol has two parts:

Privacy Policies: An XML format in which a web site can encode its data-collection and data-use practices [3]. For example, an online bookseller can publish a policy which states that it uses a customer's name and home phone number for telemarketing purposes, but that it does not release this information to external parties.

Privacy Preferences: An XML format for specifying privacy preferences and an algorithm for programmatically matching preferences against policies. The W3C APPEL working draft provides such a format and a corresponding policy-preference matching algorithm [2]. For example, a privacy-conscious consumer may define a preference stating that she does not want retailers to use her personal information for telemarketing and product promotion.
Fig 1. Client-centric policy-preference matching
2.1 Client-Centric Implementation
A client-centric architecture for implementing P3P has been described in [4]. As a user browses a web site, the site's P3P policy is fetched to the client side. The policy is then checked by a specialized APPEL engine against the user's APPEL preference to see if the policy conforms to the preference (see Figure 1). There are two prominent implementations of this architecture: Microsoft IE6 and AT&T Privacy Bird.
2.2 Server-Centric Implementation
Figure 2 shows the server-centric architecture we have developed. A web site deploying P3P first installs its privacy policy in a database system. Then database querying is used for matching a user's privacy preference against privacy policies. The server-centric implementation has several advantages, including: setting up the infrastructure necessary for ensuring that web sites act according to their stated policies, allowing P3P to be deployed in thin, mobile clients that are likely to dominate Internet access in the future, and allowing site owners to refine their policies based on the privacy preferences of their users.
Our implementation consists of both client and server components.
3.1 Client Components
We extend Microsoft Internet Explorer to invoke preference checking at the server before a web page is accessed. The IE extension allows a user to specify her privacy preference at different sensitivity levels. It invokes the preference checking by sending the preference to the server.
Fig 2. Server-centric policy-preference matching
3.2 Server Components
We define a schema in DB2 for storing policy data in relational tables. This schema contains a table for every element defined in the P3P policy. The tables are linked using foreign keys reflecting the XML structure of the policies. We extend the IBM Tivoli Privacy Wizard (a web-based GUI tool for web site owners to define P3P policies) with the functionality of parsing and shredding P3P policies into a set of records in the database tables.
When the server receives the APPEL preference from the client, it translates the preference into SQL queries to be run against the policy tables. The SQL queries corresponding to the preference are submitted to the database engine. The result of the query evaluation yields the action to be taken, and the evaluation result is sent back to the client. If the policy does not conform to the preference, the IE extension blocks the web page and displays a message to the user; otherwise, the requested web page is displayed.
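For illustration, the shredded policy tables and a preference check might look as follows; this simplified schema and the names in it are our sketch, not the system's actual DB2 schema:

    -- One table per P3P element, linked by foreign keys (sketch).
    CREATE TABLE policy    (policy_id INT PRIMARY KEY, site VARCHAR(255));
    CREATE TABLE statement (stmt_id   INT PRIMARY KEY,
                            policy_id INT REFERENCES policy(policy_id));
    CREATE TABLE purpose   (stmt_id   INT REFERENCES statement(stmt_id),
                            name      VARCHAR(64));   -- e.g. 'telemarketing'
    CREATE TABLE data_item (stmt_id   INT REFERENCES statement(stmt_id),
                            name      VARCHAR(128));

    -- A preference such as "do not allow my data to be used for telemarketing"
    -- can be translated into a query that finds violating statements:
    SELECT s.policy_id
    FROM statement s JOIN purpose p ON p.stmt_id = s.stmt_id
    WHERE p.name = 'telemarketing';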
References
1. Rakesh Agrawal, Jerry Kiernan, Ramakrishnan Srikant, and Yirong Xu. Implementing P3P using database technology. In 19th Int'l Conference on Data Engineering, Bangalore, India, March 2003.
2. Lorrie Cranor, Marc Langheinrich, and Massimo Marchiori. A P3P Preference Exchange Language 1.0 (APPEL1.0). W3C Working Draft, April 2002.
3. Lorrie Cranor, Marc Langheinrich, Massimo Marchiori, Martin Presler-Marshall, and Joseph Reagle. The Platform for Privacy Preferences 1.0 (P3P1.0) Specification. W3C Recommendation, April 2002.
4. The World Wide Web Consortium. P3P 1.0: A New Standard in Online Privacy. Available from http://www.w3.org/P3P/brochure.html.
XQBE: A Graphical Interface for XQuery Engines
Daniele Braga, Alessandro Campi, and Stefano Ceri
Politecnico di Milano, Piazza Leonardo da Vinci 32, 20133 Milano, Italy
{braga,campi,ceri}@elet.polimi.it
Abstract. XQuery is increasingly popular among computer scientists with a SQL background, since queries in XQuery and SQL require comparable skills to be formulated. However, the number of these experts is limited, and the availability of easier XQuery “dialects” could be extremely valuable. Something similar happened with QBE, initially proposed as an alternative to SQL, which has since become popular as the user-friendly query language supported by MS Access. We designed and implemented XQBE, a visual dialect of XQuery that uses hierarchical structures, coherent with the hierarchical nature of XML, to denote the input and output documents.
Our demo consists of examples of queries in XQBE and shows how our prototype allows switching between equivalent representations of the same query.
The diffusion of XML creates a pressing need to provide the capability to query XML data to a wide spectrum of users, typically lacking computer programming skills. This demonstration presents a user-friendly interface, based on an intuitive visual query language (XQBE, XQuery By Example), that we developed for this purpose, inspired by QBE [2]. QBE showed that a visual interface to a query language is effective in supporting the intuitive formulation of queries when the basic graphical constructs are close to the visual abstraction of the underlying data model. Accordingly, while QBE is a relational query language, based on the representation of tables, XQBE is based on the use of annotated trees, to adhere to the hierarchical nature of XML. XQBE was designed with the objectives of being intuitive and easy to map directly to XQuery. Our interface is capable of generating the visual representation of many XQuery statements that belong to a subset of XQuery, defined by our translation algorithm (sketched later).
XQBE allows for arbitrarily deep nesting of XQuery FLWOR expressions, construction of new XML elements, and restructuring of existing documents. However, the expressive power of XQBE is limited in comparison with XQuery, which is Turing-complete. The particular purpose of XQBE makes usability one
Fig 1. A sample document (bib.xml)
of its critical success factors, and we considered this aspect during the whole design and implementation process. Still from a usability viewpoint, our prototype is a first step towards an integrated environment supporting both XQuery and XQBE, where users alternate between the XQBE and XQuery representations. XQBE is fully described in [1]. Here we only introduce its basics by means of the query (Q1) “List books published by Addison-Wesley after 1991, including their year and title”, on the data in Figure 1. Its XQBE version is in Figure 2(a), and its XQuery version is given below.
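The following is a standard XQuery formulation of Q1 over the document of Figure 1, in the style of the W3C XML Query use cases; the statement as printed in the paper may differ in minor details:

    <bib>
    {
      for $b in doc("bib.xml")/bib/book
      where $b/publisher = "Addison-Wesley" and $b/@year > 1991
      return
        <book year="{$b/@year}">
          { $b/title }
        </book>
    }
    </bib>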
A query is formed by a source part (on the left) and a construct part (on the right). Both parts contain labelled graphs that express properties of XML fragments: the source part describes the XML data to be matched, and the construct part specifies which parts are to be retained. The correspondence between the two parts is expressed by explicit bindings. XML elements in the target are represented as labelled rectangles, their attributes as black circles (with the name on the arc), and their PCDATA as an empty circle. In the construct part, the paths that branch out of a bound node indicate which of its contents are to be retained. In Figure 2(a) the source part matches the book elements with a year greater than 1991 and a publisher equal to “Addison-Wesley”. The binding edge between the book nodes states that the result shall contain as many book elements as those matched. The trapezoidal bib node means that all the generated books are to be contained in one bib element.

The translation process translates an XQBE query into a sentence of the XQuery subset defined by the grammar in Figure 3.
The translation generated for Q1 is an XQuery statement equivalent to the one given above.
It is also possible to obtain the XQBE version of an XQuery statement. The automatically generated XQBE version of Q1 is shown in Figure 2(b).
Fig 2. The XQBE version of Q1(a) and the automatically generated XQBE for Q1(b).
Fig 3. EBNF specification of the XQuery subset expressible with XQBE
3 Conclusions
The contribution of our work is the availability of an environment in which users can query XML data with a GUI, access the generated XQuery statement, and also visualize the XQBE version of a large class of XQuery statements. Moreover, they can modify either of the representations and observe the changes in the other representation.
References
1. D. Braga and A. Campi. A graphical environment to query XML data with XQuery. In Proc. of the 4th WISE, Roma (Italy), December 2003.
2. M. M. Zloof. Query-by-Example: A data base language. IBM Systems Journal, 1977.
P2P-DIET: One-Time and Continuous Queries in Super-Peer Networks
Stratos Idreos, Manolis Koubarakis, and Christos Tryfonopoulos
Dept. of Electronic and Computer Engineering, Technical University of Crete, GR-73100 Chania, Greece
{sidraios,manolis,trifon}@intelligence.tuc.gr
1 Introduction
In peer-to-peer (P2P) systems a very large number of autonomous computing nodes (the peers) pool together their resources and rely on each other for data and services. P2P systems are application-level virtual or overlay networks that have emerged as a natural way to share data and resources. The main application scenario considered in recent P2P data sharing systems is that of one-time querying: a user poses a query (e.g., “I want music by Moby”) and the system returns a list of pointers to matching files owned by various peers in the network. Then, the user can go ahead and download files of interest. The complementary scenario of selective dissemination of information (SDI) or selective information push is also very interesting. In an SDI scenario, a user posts a continuous query to the system to receive notifications whenever certain resources of interest appear in the system (e.g., when a song by Moby becomes available). SDI can be as useful as one-time querying in many target applications of P2P networks, ranging from file sharing to more advanced applications such as alert systems for digital libraries, e-commerce networks, etc.
At the Intelligent Systems Laboratory of the Technical University of Crete, we have recently concentrated on the problem of SDI in P2P networks in the context of project DIET (http://www.dfki.de/diet). Our work, summarized in [3], has culminated in the implementation of P2P-DIET, a service that unifies one-time and continuous query processing in P2P networks with super-peers. P2P-DIET is a direct descendant of DIAS, a Distributed Information Alert System for digital libraries, that was presented in [4]. P2P-DIET combines one-time querying as found in other super-peer networks and SDI as proposed in DIAS. P2P-DIET has been implemented on top of the open source DIET Agents Platform (http://diet-agents.sourceforge.net/) and is available at http://www.intelligence.tuc.gr/p2pdiet.
A high-level view of the P2P-DIET architecture is shown in Figure 1(a) and a layered view in Figure 1(b). There are two kinds of nodes: super-peers and clients. All super-peers are equal and have the same responsibilities, thus the
Fig 1. The architecture and the layered view of P2P-DIET
super-peer subnetwork is a pure P2P network. Each super-peer serves a fraction of the clients and keeps indices on the resources of those clients.
Clients can run on user computers. Resources (e.g., files in a file-sharing application) are kept at client nodes, although it is possible in special cases to store resources at super-peer nodes. Clients are equal to each other only in terms of download: clients download resources directly from the resource owner client. A client is connected to the network through a single super-peer node, which is the access point of the client. It is not necessary for a client to be connected to the same access point continuously, since client migration is supported in P2P-DIET. Clients can connect, disconnect or even leave the system silently at any time. To enable a higher degree of decentralization and dynamicity, we also allow clients to use dynamic IP addresses. Routing of queries (one-time or continuous) is implemented using minimum-weight spanning trees for the super-peer subnetwork. After connecting to the network, a client may publish resources by sending resource metadata to its access point, post a one-time query to discover matching resources, or subscribe with a continuous query to be notified when resources of interest are published in the future. A user may download a file at the time that he receives a notification, or save the notification in his saved-notifications folder for future use. Additionally, a client can download a resource even when he has migrated to another access point. The feature of stored notifications guarantees that notifications matching disconnected users will be delivered to them upon connection. If a resource owner is disconnected, the interested client may arrange a rendezvous with the resource. P2P-DIET also offers the ability to add or remove super-peers. Additionally, it supports a simple fault-tolerance protocol based on are-you-alive messages. Finally, P2P-DIET provides message authentication and message encryption. For the detailed protocols see [5].
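A sketch of how a super-peer might handle a newly published resource under this scheme is given below in Java; all interfaces and names are illustrative, not the actual P2P-DIET code:

    interface ResourceMetadata {}
    interface Client {
        boolean isConnected();
        void deliver(Notification n);
    }
    interface Subscription { Client subscriber(); }
    interface ContinuousQueryIndex {
        // quick identification of the continuous queries matching the metadata
        Iterable<Subscription> match(ResourceMetadata meta);
    }
    interface NotificationStore { void save(Client c, Notification n); }

    class Notification {
        final ResourceMetadata matched;
        Notification(ResourceMetadata matched) { this.matched = matched; }
    }

    class SuperPeer {
        ContinuousQueryIndex index;  // exploits commonalities between queries
        NotificationStore stored;    // notifications kept for disconnected clients

        void onPublish(ResourceMetadata meta) {
            for (Subscription sub : index.match(meta)) {
                Notification n = new Notification(meta);
                Client c = sub.subscriber();
                if (c.isConnected()) {
                    c.deliver(n);      // immediate notification
                } else {
                    stored.save(c, n); // delivered upon reconnection
                }
            }
        }
    }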
The current implementation of P2P-DIET to be demonstrated supports the data model of [4] and is currently being extended to support a richer model [4]. Each super-peer utilises efficient query processing algorithms based on indexing of resource metadata and queries, and a hierarchical organisation of queries (poset) that captures query subsumption as in [1]. A sophisticated index that exploits commonalities between continuous queries is maintained at each super-peer, enabling the quick identification of the continuous queries that match incoming resource metadata. In this area, our work extends and improves the indexing algorithms of SIFT [6], and is reported in [2].
References
1. A. Carzaniga, D. S. Rosenblum, and A. L. Wolf. Design and evaluation of a wide-area event notification service. ACM Transactions on Computer Systems, 19(3):332–383, August 2001.
2. C. Tryfonopoulos and M. Koubarakis. Selective Dissemination of Information in P2P Networks: Data Models, Query Languages, Algorithms and Computational Complexity. Technical Report TUC-ISL-02-2003, Intelligent Systems Laboratory, Dept. of Electronic and Computer Engineering, Technical University of Crete, July 2003.
3. M. Koubarakis, C. Tryfonopoulos, S. Idreos, and Y. Drougas. Selective Information Dissemination in P2P Networks: Problems and Solutions. ACM SIGMOD Record, Special issue on Peer-to-Peer Data Management, K. Aberer (editor), 32(3), September 2003.
4. M. Koubarakis, T. Koutris, C. Tryfonopoulos, and P. Raftopoulou. Information Alert in Distributed Digital Libraries: The Models, Languages and Architecture of DIAS. In Proceedings of the 6th European Conference on Research and Advanced Technology for Digital Libraries (ECDL 2002), volume 2458 of Lecture Notes in Computer Science, pages 527–542, September 2002.
5. S. Idreos and M. Koubarakis. P2P-DIET: A Query and Notification Service Based on Mobile Agents for Rapid Implementation of P2P Applications. Technical Report TUC-ISL-01-2003, Intelligent Systems Laboratory, Dept. of Electronic and Computer Engineering, Technical University of Crete, June 2003.
6. T. W. Yan and H. Garcia-Molina. Index structures for selective dissemination of information under the boolean model. ACM Transactions on Database Systems,
A Hierarchical Storage and Archive Environment for
Multidimensional Array Database Management Systems
Bernd Reiner and Karl Hahn
FORWISS (Bavarian Research Center for Knowledge Based Systems)
Technical University of Munich, Boltzmannstr. 3, D-85747 Garching b. München, Germany
{reiner,hahnk}@forwiss.tu-muenchen.de
Abstract. The intention of this paper is to present HEAVEN, a solution for the intelligent management of large-scale datasets held on tertiary storage systems. We introduce the common state-of-the-art technique for storage and retrieval of large spatio-temporal array data in the High Performance Computing (HPC) area. A major bottleneck identified today is fast and efficient access to, and evaluation of, high performance computing results. We address the necessity of developing techniques for efficient retrieval of requested subsets of large datasets from mass storage devices. Furthermore, we show the benefit of managing large spatio-temporal data sets, e.g. generated by simulations of climate models, with Database Management Systems (DBMS). This means DBMS need a smart connection to tertiary storage systems with optimized access strategies. HEAVEN is based on the multidimensional array DBMS RasDaMan.
1 Introduction
Large-scale scientific experiments often generate large amounts of multidimensional data sets. Data volume may reach hundreds of terabytes (up to petabytes). Typically, these data sets are stored permanently as files in an archival mass storage system on up to thousands of magnetic tapes. The access and/or transfer times of these kinds of tertiary storage devices, even if robotically controlled, are relatively slow. Nevertheless, tertiary storage systems are currently the state of the art for storing such large volumes of data. Concerning data access in the HPC area, the main disadvantages are high access latency compared to hard disk devices and the lack of direct access. A major bottleneck for scientific applications is the missing possibility of accessing specific subsets of data. If only a subset of such a large data set is required, the whole file must be transferred from the tertiary storage media. Taking into account the time required to load, search, read, rewind and unload several cartridges, it can take many hours or days to retrieve a subset of interest from a large data set. Entire files must be loaded from magnetic tape, even if only a subset of the file is needed for further processing. Processing that spans a multitude of data sets, for example time slices, is hard to support. Evaluation of search criteria requires network transfer of each required data set, implying a sometimes prohibitively immense