Multi-dimensional Indexing layer

1. Building Indexing Space

The multi-dimensional indexing layer is the essential part of the system model. Our system is based on CAN [5] and, like CAN, uses a logical d-dimensional Cartesian product space. For simplicity, we use a two-dimensional range query as the running example, although the approach handles range queries over any number of dimensions.

In general, a key set with d attributes (dimensions) forms a d-dimensional Cartesian product space, termed the key-space by Nievergelt et al. [20]. Each attribute forms one dimension of the coordinate space. For a specific P2P system, the schema used to describe the information corresponds to one virtual coordinate space. For instance, the example schema shown in Section 3.1.2, R (music_name, rank, author, album, year), is mapped into a five-dimensional space. Each tuple in this table is then mapped into one point, so the set of points corresponding to the tuples in the table is a subset of all the points in the space. This mapping moves the query platform from the table to the coordinate space.

This mapping is important because queries are likewise mapped into points or regions. For instance, a multi-dimensional equality selection query like Query 3 is mapped into one point in the coordinate space corresponding to the relational table R.

Query 3.3:

Select tuples From R (A,B,C…) Where A1≤A≤A2 AND B1≤B≤B2 AND C1≤C≤C2 AND …

More importantly, the range query we focus on is mapped into one rectangular region in the space. Consider the simplest case: a schema T (A, B) with two attributes A and B, each restricted by a range. We select A and B as the two coordinates (X, Y) of the virtual two-dimensional space and create the key set (A, B): attribute A is the horizontal coordinate X and attribute B is the vertical coordinate Y. The corresponding key space is a two-dimensional square, and the range query is mapped into the shaded rectangular region bounded by the red line, shown in Fig. 3.2.

Figure 3.2 Two-dimensional space and range query region

Hence, we have constructed a coordinate space in which each tuple of the table is mapped to a point. In the next section, we discuss how to use a space-filling curve to sort and index this space.
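As a minimal sketch of this mapping, the following Python fragment treats a tuple of T (A, B) as a point and a range query as a rectangle. The names `Rect` and `in_region` and all sample values are illustrative assumptions, not part of the system described here.

```python
from typing import NamedTuple, Sequence


class Rect(NamedTuple):
    """Axis-aligned query region (hypothetical helper): bounds per dimension."""
    low: tuple
    high: tuple


def in_region(point: Sequence[int], region: Rect) -> bool:
    """True if the point lies inside the region on every dimension (bounds inclusive)."""
    return all(lo <= v <= hi for v, lo, hi in zip(point, region.low, region.high))


# Tuples of T(A, B) become points (A, B); the values are made up.
tuples = [(3, 9), (5, 12), (11, 10), (7, 2)]
# The range query "4 <= A <= 10 AND 9 <= B <= 14" becomes a rectangle.
query = Rect(low=(4, 9), high=(10, 14))
matches = [t for t in tuples if in_region(t, query)]
print(matches)  # [(5, 12)]
```

The same helper works unchanged for more than two dimensions, since it only zips the point with the per-dimension bounds.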

2. Sorting the Indexing Space Using Hilbert Space Filling Curve

Sorting the data items is a standard way to speed up range queries, and we follow it here. The difficulty lies in sorting data that has multiple dimensions.

We choose the Hilbert space-filling curve as the tool to map multiple dimensions into a single dimension. The relevant properties of the Hilbert curve were examined in Chapter 2; here, we explore its application to sorting the indexing space. We again discuss the two-dimensional case for simplicity. The underlying assumption of a space-filling curve is that the values of any attribute can be represented by a fixed number of bits, say k bits. Thus the maximum number of values along each dimension is 2^k. We divide the square into an N × N grid with N^2 quadrants, where N = 2^k.

There is a one-to-one mapping between the possible combinations of (A, B) and the quadrants. A Hilbert curve of order k passes through every quadrant exactly once and assigns each quadrant a unique binary sequence number. As a result, any record in the relation R is mapped into one quadrant with a unique sequence number.

That is, all the tuples are sorted based on the combination of A and B, and a relation R corresponds to a set of quadrants, or equivalently a set of sequence numbers. A two-dimensional query “R.A(low, high), R.B(low, high)” is then a rectangular area in the space, shown as the shaded area in Fig. 3.3.
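The assignment of sequence numbers to quadrants can be sketched with the widely used iterative Hilbert conversion below. This is an illustration under assumptions: the function names are ours, and the orientation of the curve it produces (starting at the origin) need not match the orientation drawn in Fig. 3.3.

```python
def _rotate(n, x, y, rx, ry):
    """Rotate or flip a quadrant so that neighboring sub-curves connect."""
    if ry == 0:
        if rx == 1:
            x, y = n - 1 - x, n - 1 - y
        x, y = y, x
    return x, y


def xy_to_seq(k, x, y):
    """Sequence number of quadrant (x, y) on a Hilbert curve of order k."""
    n = 1 << k  # grid side N = 2^k
    d = 0
    s = n // 2
    while s > 0:
        rx = 1 if x & s else 0
        ry = 1 if y & s else 0
        d += s * s * ((3 * rx) ^ ry)
        x, y = _rotate(n, x, y, rx, ry)
        s //= 2
    return d


def seq_to_xy(k, d):
    """Inverse mapping: quadrant (x, y) that carries sequence number d."""
    n = 1 << k
    x = y = 0
    t = d
    s = 1
    while s < n:
        rx = 1 & (t // 2)
        ry = 1 & (t ^ rx)
        x, y = _rotate(s, x, y, rx, ry)
        x += s * rx
        y += s * ry
        t //= 4
        s *= 2
    return x, y
```

For the 16 × 16 grid used in the figures, k = 4 and the sequence numbers run from 0 to 255.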

Figure 3.3 System model

The Hilbert curve enters and leaves the requested region several times, so the query region encloses several disconnected curve sections. To answer the two-dimensional range query, we only need to find these curve segments.

(Fig. 3.3 shows a 16 × 16 grid whose axes are labeled 0000 to 1111, with the Hilbert sequence numbers 0 to 255 divided among Peer1 to Peer7.)

Thus, a multi-dimensional range query is converted into a lookup for the curve segments in the space.
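To make the notion of curve segments concrete, the sketch below groups the sequence numbers of the quadrants inside a query region into maximal runs of consecutive values. The function name and the sample numbers are assumptions chosen for illustration only.

```python
def curve_segments(seq_numbers):
    """Group sequence numbers into maximal runs of consecutive values.

    Returns a list of (first, last) pairs, one per curve segment.
    """
    segments = []
    for s in sorted(set(seq_numbers)):
        if segments and s == segments[-1][1] + 1:
            # s continues the current segment
            segments[-1] = (segments[-1][0], s)
        else:
            # the curve left the region before s: start a new segment
            segments.append((s, s))
    return segments


# Hypothetical sequence numbers of the quadrants inside some query region.
inside = [44, 42, 43, 47, 120, 121, 122]
print(curve_segments(inside))  # [(42, 44), (47, 47), (120, 122)]
```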

3. Zone Distribution

We have built an indexing layer for the multi-dimensional query space using the Hilbert space-filling curve. This offers a global abstraction of the data items stored in the system. However, DHT-based P2P systems are distributed: the indexing infrastructure is not stored centrally, and each peer owns only one part of the query space. The question is how to assign the index space among the peers.

For simplicity, the overlay P2P network is modeled as a system of seven peers, shown in Fig. 3.4.

Figure 3.4 P2P system with 7 peers

Initially, at system startup, only one peer, peer1, is in the system and covers the whole square. When the second peer, peer2, enters the system, peer1 splits the whole square into two rectangular zones, keeps one for itself and assigns the other to peer2. The space partitioning follows this strategy: because the points (quadrants) in the square are ordered, each zone must cover continuous units. Here, each unit is one second-order Hilbert curve, shown in Fig. 3.5 (a), regardless of orientation. A peer may own a zone whose curve sections are connected, as in Fig. 3.5 (a), but not a zone enclosing disconnected curve sections, as in Fig. 3.5 (b). This rule ensures that each peer owns a zone containing 16*k consecutive sequence numbers, where k here denotes the number of curve units covered. No two zones overlap. Because of the ordering of the space, each zone has four adjacent zones in general. Among these, one is the zone that starts with the quadrant whose sequence number equals the largest sequence number in the current zone plus one, and one is the zone that ends at the quadrant whose sequence number equals the starting sequence number of the current zone minus one. Section 4.3 describes the space partitioning strategy in detail.

Figure 3.5 Zone partitioning: (a) good splitting, (b) bad splitting

The first quadrant that the Hilbert curve passes through in a peer's zone has the lowest sequence number among the quadrants in that zone. This lowest sequence number is called the peer-ID of that peer. Each peer has a unique peer-ID.

When the zone of a peer splits, its peer-ID is adjusted accordingly. Each zone is essentially a range that starts with the peer-ID and ends with the largest sequence number in the zone. For example, in Fig. 3.3, the square is divided by the seven peers into seven zones: zone1(0~31) (zone1 covers the sequence numbers from 0 to 31 consecutively), zone2(32~63), zone3(64~95), zone4(96~127), zone5(128~191), zone6(192~223), zone7(224~255).
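Since every zone is a range that starts at its peer-ID, the owner of a sequence number can be found by a search over the sorted peer-IDs. The sketch below uses the seven zones of the example. It takes a global view purely for illustration (in the system each peer knows only its own zone and those of its neighbors), and the names `zone_of` and `owner_of` are assumptions.

```python
from bisect import bisect_right

# Peer-IDs (lowest sequence number of each zone) from the seven-zone example.
PEER_IDS = [0, 32, 64, 96, 128, 192, 224]
PEER_NAMES = ["peer1", "peer2", "peer3", "peer4", "peer5", "peer6", "peer7"]
MAX_SEQ = 255  # largest sequence number on the 16 x 16 grid


def zone_of(index):
    """Return (lowest, highest) sequence number of the zone at this list index."""
    lo = PEER_IDS[index]
    hi = PEER_IDS[index + 1] - 1 if index + 1 < len(PEER_IDS) else MAX_SEQ
    return lo, hi


def owner_of(seq):
    """Name of the peer whose zone contains the given sequence number."""
    if not 0 <= seq <= MAX_SEQ:
        raise ValueError("sequence number outside the indexing space")
    return PEER_NAMES[bisect_right(PEER_IDS, seq) - 1]


print(owner_of(42), owner_of(97), zone_of(4))  # peer2 peer4 (128, 191)
```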

To forward requests efficiently across the partitioned space, each peer keeps a routing table holding the IP addresses, covered zones and peer-IDs of its adjacent neighbors. These neighbors include its successor-peer (which covers the adjacent upper curve section) and its predecessor-peer (which covers the adjacent lower curve section). For example, in Fig. 3.3, peer2 keeps information about peer1, peer4, peer3 and peer6, but not about the other peers. Among these four neighbors, peer1 and peer3 are its predecessor and successor, respectively.

Compared with the traditional routing table in DHTs, the routing table here contains not only the neighbor information but also the range information of the neighbors. This information gives each peer sufficient direction to route a range query towards the requested query region in the space. A range query message includes the destination range. Using the neighbor coordinate set and the neighbors' range information, a peer routes a message towards the requested region by simply forwarding it greedily to its neighbors.
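One possible reading of this greedy forwarding is sketched below: each peer hands the request to the neighbor whose zone is closest to the target sequence number. Measuring closeness along the one-dimensional sequence is our assumption. Only peer2's neighbor list is taken from the text; the other routing tables are hypothetical and contain little more than predecessor and successor links.

```python
# Zones from the seven-peer example: name -> (lowest, highest) sequence number.
ZONES = {
    "peer1": (0, 31), "peer2": (32, 63), "peer3": (64, 95), "peer4": (96, 127),
    "peer5": (128, 191), "peer6": (192, 223), "peer7": (224, 255),
}
# Hypothetical routing tables (only peer2's entry follows the text).
NEIGHBORS = {
    "peer1": ["peer2"],
    "peer2": ["peer1", "peer3", "peer4", "peer6"],
    "peer3": ["peer2", "peer4"],
    "peer4": ["peer2", "peer3", "peer5"],
    "peer5": ["peer4", "peer6"],
    "peer6": ["peer2", "peer5", "peer7"],
    "peer7": ["peer6"],
}


def distance(seq, zone):
    """Distance along the sequence from seq to the nearest end of a zone."""
    lo, hi = zone
    if lo <= seq <= hi:
        return 0
    return min(abs(seq - lo), abs(seq - hi))


def route(start, seq):
    """Peers visited while greedily forwarding a lookup for seq."""
    path = [start]
    current = start
    while distance(seq, ZONES[current]) != 0:
        # Forward to the neighbor whose zone lies closest to the target.
        current = min(NEIGHBORS[current], key=lambda p: distance(seq, ZONES[p]))
        path.append(current)
    return path


print(route("peer2", 200))  # ['peer2', 'peer6']
```

Because every table here includes the predecessor and successor, each hop strictly reduces the distance to the target, so the loop terminates.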

3.3.4 Two-fold Property of Zone Partitioning Manner

The zone partitioning approach described above has a two-fold property that is central to range routing and range query processing.

On the one hand, the space is partitioned into zones that are assigned to individual peers, in the manner of traditional DHTs. The partitioning follows the rule that defines the neighbor relationship between two nodes: in a k-dimensional coordinate space, two nodes are neighbors if and only if their zones overlap along k-1 dimensions and abut along one dimension. This gives each node 2k neighbors. For example, in Fig. 3.4, peer4 is a neighbor of peer5 because their coordinate zones abut along the Y axis. The traditional partitioning style provides greedy request forwarding, which is essential to DHT-based P2P designs: a node forwards a query request to its neighbors when the requested items are not in its local storage. Essentially, this makes k-ary search possible when we process multi-dimensional range queries in the distributed environment of P2P systems.

However, such greedy forwarding is limited to single-attribute exact-match lookups. In this thesis, we adapt it to accommodate range forwarding by sorting the space.
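The neighbor rule stated above can be written down directly for box-shaped zones. The sketch assumes that each zone is a box given by one half-open interval per dimension, which is a simplification: a zone made of several curve units need not be a box. The function name is ours.

```python
def are_neighbors(zone_a, zone_b):
    """CAN-style neighbor test for two box-shaped zones.

    Each zone is a list of half-open intervals (low, high), one per dimension.
    The zones are neighbors if they abut along exactly one dimension and
    their intervals overlap along all the others.
    """
    abutting = 0
    for (a_lo, a_hi), (b_lo, b_hi) in zip(zone_a, zone_b):
        if a_hi == b_lo or b_hi == a_lo:
            abutting += 1
        elif not (a_lo < b_hi and b_lo < a_hi):
            return False  # separated along this dimension
    return abutting == 1


lower_left = [(0, 8), (0, 8)]
lower_right = [(8, 16), (0, 8)]
upper_right = [(8, 16), (8, 16)]
print(are_neighbors(lower_left, lower_right))  # True: abut along X, overlap along Y
print(are_neighbors(lower_left, upper_right))  # False: the zones touch only at a corner
```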

On the other hand, the proposed partitioning also divides a sequence. Because the zone assigned to each peer covers one continuous curve section, each zone corresponds to a one-dimensional sequence, and the whole one-dimensional sequence is divided among the peers. This second property, sequence division, makes range processing possible.

In the next chapter, we will see that it is this two-fold property that enables a range to be queried through lookups.

3.3.5 File Systems

File systems are left as they are, and we place no special requirements on them. The distributed indices are stored as part of the normal file system.

Chapter 4

Multi-dimensional Range Query Evaluation

In this chapter, we start by outlining the query scheme for processing multi-dimensional exact-match lookups in P2P systems. This is an important underpinning for the general multi-dimensional range query processing scheme that follows. The proposed general query scheme resembles a relay race. Building on this basic query processing approach, we propose strategies that aim at reducing the number of hops and the user-perceived response time. Finally, the query performance is examined by simulation.

4.1 Multi-dimensional Single Point Query Evaluation

A multi-dimensional single point query can be answered by issuing a lookup request. For example, in Fig. 4.1, a user at peer2 issues a query for “X=0100 (binary format) and Y=1100”. We first map the coordinates (0101, 0010) into the Hilbert sequence number 1100001 (97) at peer2. Peer2 checks whether this sequence number is within its own sequence number range. If it is, peer2 retrieves the record and returns it to the requester. If not, peer2 consults its routing table, which stores the IP addresses of the neighboring peers, and forwards the query request into the P2P system. Once the request reaches the node that covers this sequence number, the result is returned to the requesting peer. This process uses forward routing and essentially solves the multi-dimensional single point lookup.

4.2 Multi-dimensional Range Query Evaluation

Lawder [17] proposes a calculate-next-match (CNM) function that provides the following functionality for the Hilbert curve mapping: in a multi-dimensional space, given a hyper-rectangular query region and the sequence number of a multi-dimensional point as input, the function returns the sequence number of a point in the query region that is equal to or minimally greater than the input sequence number.

The CNM function and the lookup mechanism for multi-dimensional single point queries together underlie general multi-dimensional range query evaluation.
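The contract of the CNM function can be illustrated with a brute-force stand-in that simply scans the curve. This is not Lawder's algorithm, which avoids such a scan. The table below is one orientation of the order-2 Hilbert curve on a 4 × 4 grid, chosen by us for the illustration.

```python
# HILBERT_2[h] is the (x, y) quadrant with sequence number h
# (one orientation of the order-2 Hilbert curve, assumed for this example).
HILBERT_2 = [
    (0, 0), (1, 0), (1, 1), (0, 1), (0, 2), (0, 3), (1, 3), (1, 2),
    (2, 2), (2, 3), (3, 3), (3, 2), (3, 1), (2, 1), (2, 0), (3, 0),
]


def calculate_next_match(h, x_range, y_range, curve=HILBERT_2):
    """Smallest sequence number >= h whose quadrant lies in the query rectangle.

    Brute-force stand-in for CNM; returns None when no such number exists.
    """
    (x_lo, x_hi), (y_lo, y_hi) = x_range, y_range
    for seq in range(max(h, 0), len(curve)):
        x, y = curve[seq]
        if x_lo <= x <= x_hi and y_lo <= y <= y_hi:
            return seq
    return None


# Query region 1 <= X <= 2 and 1 <= Y <= 2 covers sequence numbers 2, 7, 8 and 13.
print(calculate_next_match(0, (1, 2), (1, 2)))   # 2
print(calculate_next_match(3, (1, 2), (1, 2)))   # 7
print(calculate_next_match(14, (1, 2), (1, 2)))  # None
```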

The approach to multi-dimensional range query processing proposed here resembles a relay race. We illustrate it with the example in Fig. 4.1: a two-dimensional range query “0100<=X<=1010 and 1001<=Y<=1110” (the shaded rectangle in Fig. 4.1) is issued by a user at peer1.

The query evaluation proceeds as follows:

Step 1: Peer1 calls the CNM function with the input sequence number 0 and gets a resulting point, called the next-match, with sequence number 42. Because the approach proceeds in ascending peer-ID order, it must start with the lowest sequence number, 0; otherwise, the query result may be incomplete. Peer1 checks whether sequence number 42 is in its own range of sequence numbers, and it is not.

Step 2: Peer1 issues a lookup request (essentially a single point query) for the next-match with sequence number 42 to its neighbors in the coordinate space. Because the space is both partitioned into zones and sorted, the routing table contains both the coordinate zones and the range information of the neighbors, and the request is forwarded according to it. This is similar to the traditional greedy forwarding mechanism of DHT lookup, which we intend to keep. The request finally reaches the destination node whose curve segment passes through the next-match point. Such a node is called the next-peer. Here, peer2 is the next-peer.

Step 3: Peer2 identifies and retrieves the points that it stores and that fall within the query region, and sends these results to peer1.

Step 4: At the same time, peer2 relays the query by calling the CNM function with its successor's peer-ID (64) as input. The resulting next-match point has sequence number 120.

Step 5: Peer2 issues a lookup request to its neighbors for the next-match 120.

Step 6: The next-peer, peer4, receives the query request, identifies and retrieves the qualifying points and sends them back to peer1. Meanwhile, peer4 calls the CNM function with its successor's peer-ID (128) as input and gets the next-match 128. Peer4 then issues a lookup request for the next-match 128.

Step 7: This relay-race-like process continues until the CNM function finds no further next-match. The query processing then finishes and the query results are complete. In this process, the total number of CNM calls equals the number of peers whose zones intersect the query region, plus one.
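The steps above can be sketched as a single loop over the zones of the seven-peer example. The sketch is a simplification under assumptions: it uses a global view of the zones, replaces CNM by a search in a sorted list of the sequence numbers inside the query region, and the sample numbers are hypothetical rather than read off Fig. 4.1.

```python
from bisect import bisect_left, bisect_right

PEER_IDS = [0, 32, 64, 96, 128, 192, 224]  # zones of the seven-peer example
MAX_SEQ = 255


def next_match(matching, h):
    """Stand-in for CNM: smallest element of the sorted list that is >= h."""
    i = bisect_left(matching, h)
    return matching[i] if i < len(matching) else None


def relay_query(matching):
    """Visit the intersecting zones in ascending peer-ID order.

    `matching` is the sorted list of sequence numbers inside the query region.
    Returns (matches per peer number, number of CNM calls).
    """
    results = {}
    cnm_calls = 0
    cursor = 0  # Step 1: start from the lowest sequence number
    while True:
        nm = next_match(matching, cursor)
        cnm_calls += 1
        if nm is None:
            break  # Step 7: no further next-match
        peer = bisect_right(PEER_IDS, nm) - 1  # lookup: zone that owns nm
        hi = PEER_IDS[peer + 1] - 1 if peer + 1 < len(PEER_IDS) else MAX_SEQ
        results[peer + 1] = [s for s in matching if nm <= s <= hi]
        cursor = hi + 1  # relay: continue from the successor's peer-ID
    return results, cnm_calls


region = [42, 43, 44, 47, 120, 121, 126, 128, 131, 135]
found, calls = relay_query(region)
print(found)  # {2: [42, 43, 44, 47], 4: [120, 121, 126], 5: [128, 131, 135]}
print(calls)  # 4, i.e. three intersecting peers plus one
```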

Figure 4.1 Query evaluation (16 × 16 grid with axes labeled 0000 to 1111, zones of Peer1 to Peer7, and the query region shaded)

4.3 Zone Maintenance

Because the points in the multi-dimensional space are ordered along the Hilbert curve, a new approach is needed to partition the square. Here, we illustrate the approach in the two-dimensional case, although the method applies to any number of dimensions. The partitioning method ensures that the zone of each peer covers a continuous curve section and that each peer keeps the IP addresses and range information of its neighbors (including its successor and predecessor).

Initially, the whole square is a single zone owned by a single peer. A second peer enters and takes over one half of the square in a manner that ensures each peer covers a continuous curve section consisting of a number of second-order curves. The partitioning method also tries to distribute the curve length roughly equally among the peers. The sequence numbers of the points held by peer2 should be lower than the sequence numbers held by peer1; that is, the new peer always gets the lower part. Peer1 is called the predecessor-peer of peer2. Conversely, peer2 is termed the successor-peer of peer1. Meanwhile, peer2 records the IP address of its predecessor-peer, peer1, in its routing table, and peer1 records the IP address of its successor-peer. As more peers join the system, the splitting proceeds in the same manner, and each peer records the IP addresses of its successor-peer and predecessor-peer.

Also, for the purpose of routing, each peer stores the IP addresses of its neighboring peers in its routing table. A zone also splits when its peer has a high workload. In this case, the owner peer contacts one of its neighbors and assigns it a portion of its curve section by transferring the corresponding data and neighbor information. Both peers then adjust their routing tables. Fig. 4.2 shows the partitioned zones after the zone owned by peer5 splits into two continuous curve sections and peer5 assigns the lower-end part to peer8. Originally, the IP address of peer5 is stored in peer4 as its successor-peer. After the split, peer8 contacts peer4 and informs it of the change of its successor. Meanwhile, peer5 updates the IP address and peer-ID of its predecessor-peer (now peer8) in its routing table.
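A split that respects the curve units can be sketched as follows. Cutting the zone at the unit boundary nearest its middle is our assumption (the text only asks for roughly equal curve lengths), and the function name is hypothetical.

```python
UNIT = 16  # quadrants covered by one second-order Hilbert curve


def split_zone(lo, hi):
    """Split zone [lo, hi] at a unit boundary into (lower, upper) parts.

    Assumes the zone is aligned to whole units and holds at least two of them.
    Following the rule in the text, the lower part goes to the new peer.
    """
    size = hi - lo + 1
    units = size // UNIT
    if size % UNIT or lo % UNIT or units < 2:
        raise ValueError("zone must consist of two or more whole curve units")
    cut = lo + (units // 2) * UNIT
    return (lo, cut - 1), (cut, hi)


# peer5's zone (128~191) splits; peer8 takes the lower-end part.
peer8_zone, peer5_zone = split_zone(128, 191)
print(peer8_zone, peer5_zone)  # (128, 159) (160, 191)
```

Under this sketch peer8 inherits the peer-ID 128 and the peer-ID of peer5 becomes 160, which is why peer4 must update its successor entry.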

Partitioned in this way, each peer has not only the coordinate neighbor information but also the range information, again thanks to the two-fold property of the partitioning. For example, suppose peer2 keeps a routing table with the entries (0, peer1IP), (64, peer3IP), (192, peer6IP), (96, peer4IP). The first two records give the peer-IDs and IP addresses of its predecessor-peer and successor-peer, respectively, and the other two records describe peer2's remaining neighbors, peer6 and peer4. Given a request point with sequence number 200, peer2 forwards the query request to the peers in its routing table.

Figure 4.2 Space Partitioning (16 × 16 grid with axes labeled 0000 to 1111, divided among Peer1 to Peer8)

4.4 System Performance Evaluation

We examine the multi-dimensional range query processing capability of our model by simulation. We built a multi-dimensional query processing simulator on top of a CAN overlay network. The simulator uses Transit-Stub (TS) topologies produced by the GT-ITM topology generator [21] (see Appendix C). TS topologies model networks using a two-level hierarchy of routing domains, with transit domains that interconnect lower-level stub domains. We use link latencies of 100 ms for intra-transit domain links, 10 ms for stub-transit links and 1 ms for intra-stub domain links. The Hilbert space-filling curve mapping (see Appendix A) is deployed and the zone partitioning approach is implemented.
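The effect of these latency settings on a single overlay hop can be illustrated with a small calculation. The dictionary keys, the function name and the example path are assumptions made for this sketch and do not reflect the simulator's actual interface.

```python
# Link latencies in milliseconds, as configured in the simulation.
LATENCY_MS = {"intra-transit": 100, "stub-transit": 10, "intra-stub": 1}


def path_latency_ms(links):
    """Total one-way latency of a path given as a sequence of link types."""
    return sum(LATENCY_MS[link] for link in links)


# Hypothetical path between two stub nodes in different stub domains.
example = ["intra-stub", "stub-transit", "intra-transit", "stub-transit", "intra-stub"]
print(path_latency_ms(example))  # 122
```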

All experiments are performed on a Sun Fire 4800 server running Sun Solaris, with 8 GB of RAM and eight 750 MHz UltraSPARC III CPUs, connected to two Sun T3 disk arrays.

Dynamicity also has to be considered. Nodes are connected dynamically; that is, a node may join or leave the system at will. This follows an important feature of the P2P paradigm: each node has full power to choose when to join or leave. Hence, a specific range query may produce different sets of answers at different times, because the number of nodes in the system changes from one query time to another. The same holds for the data distribution. To address this problem, we issue the same query several times and

One-dimensional Indexing for P2P designs