Research on Smart Caching and Content Sharing for Mobile Networks
Ph.D. Candidate: Ong Mau Dung
Major: Computer System Architecture
Supervisor: Prof. Chen Min
Huazhong University of Science and Technology
Wuhan 430074, P. R. China
May, 2015
摘 要

… to enable resource sharing. It was designed as a communication architecture that allows a conversation between two end hosts. Thus, today's Internet is used not merely as a traditional packet-delivery network but also as a distribution network. For the above reason, the network architecture needs to be adapted to distribute content effectively to a large number of users.

… a content-name-oriented approach for disseminating content to edge gateways/routers; CCN is an evolving distribution network architecture. While guaranteeing minimal upstream bandwidth demand and the shortest downstream latency, …

… the integrated architecture further improves the performance of MS2, because it makes the popular content cached in nearby radio access networks available to mobile users, helping to avoid the congestion caused by accessing popular content from remote servers.

Keywords: content centric networking; smart caching; cache hit rate; network traffic offloading; seamless service migration
Abstract
The current Internet architecture was designed and created in the late 1960s and early 1970s, primarily to enable resource sharing. It was designed as a communication architecture to enable a conversation between two end hosts. Thus, the Internet today is more than a network for traditional packet delivery; we now use it as a distribution network. For this reason, the network architecture needs to be adapted to support content distribution to a large number of users effectively.
Content Centric Networking (CCN) provides a network architecture to support the above requirements. CCN is an evolving distribution network architecture: a content-name-oriented approach for disseminating content to edge gateways/routers. To maximize the probability of content sharing while ensuring minimal upstream bandwidth demand and the lowest downstream latency, CCN routers/gateways should cache exchanged content as long as possible. For this reason, novel smart caching policies are first proposed and designed specifically for CCN. The obtained simulation results demonstrate the good performance of the proposed policies in achieving higher hit rates. Then, several CCN applications and network architecture solutions are presented for the Wireless Body Area Network (WBAN), the Vehicular Ad-hoc Network (VANET), the Data Center Network (DCN) and Multi-Source Mobile Streaming (MS2) to illustrate efficient and flexible CCN implementation.
In WBAN, a new type of hybrid system and a WBAN transmission network architecture are proposed to provide high Quality of Service (QoS) for remote monitoring in the context of the healthcare system. Long-Term Evolution (LTE) has emerged as the best choice for the healthcare system, and CCN with adaptive streaming is a suitable solution for mobile patients and physicians.
In VANET, motivated by the increasing demand for efficient and reliable information dissemination and retrieval, the Vehicular Named Data Networking (VENDNET) solution is proposed by applying the basic principles of CCN to VANET. The obtained results show that the CCN mechanism can improve the performance of VANET significantly.
In the era of cloud and mobile social computing, the mobility characteristic may cause a service on a mobile station (MS) to be migrated between different Data Centers (DCs); otherwise, packet delay increases due to the considerable geographical distance between the MS and the serving DC. With the current networking architecture, the IP address of either the MS or the Virtual Machine (VM) changes because of the VM migration; the IP session between the two peers is released and the service is disrupted as a result. Based on unique content name identification in CCN, instead of IP addresses, service migration can be continuous.
Finally, CCN is integrated into the recently proposed MS2 architecture for mobile multimedia streaming. The resulting architecture further improves the performance of MS2, as it makes popular content available to mobile users at caches placed near Radio Access Networks. This helps avoid accessing popular content from faraway servers along paths that could otherwise become congested.
Key words: Content centric networking; Smart caching; Cache hit rate; Network traffic offloading; Seamless service migration
Table of Contents
摘 要
Abstract
1 Introduction
1.1 Overview and motivation
1.2 Content Centric Networking (CCN): the Next Generation Network (NGN) for the Future Internet (FI)
1.3 Research on smart caching, content sharing, network architecture and smart services provisioning in mobile networks
1.4 Summary
2 CCN performance improvement with smart caching
2.1 Related works and their drawbacks
2.2 Popularity Prediction and Cooperative Caching (PPCC)
2.3 PPCC algorithm simulation and results
2.4 Fine-Grained Popularity-based Caching (FGPC)
2.5 FGPC algorithm simulation and results
2.6 Summary
3 Efficient content sharing over CCN network architecture
3.1 Current challenges in content sharing for Wireless Body Area Networks (WBAN)
3.2 The virtue of sharing: efficient content delivery in WBAN for ubiquitous healthcare
3.3 Performance evaluation of a new hybrid system in WBAN
3.4 Vehicular Ad Hoc Network (VANET) overview, wireless access standards and existing issues
3.5 Vehicular Named Data Network (VENDNET): efficient content distribution in VANET
3.6 VENDNET performance evaluation
3.7 Summary
4 Seamless service migration in mobile networks
4.1 Virtualization technology and Data Center (DC) overview
4.2 Motivation and problem statement
4.3 Seamless service migration framework
4.4 Simulation and results
4.5 Summary
5 Supporting rich media services via Named Data Networking
5.1 Related works and problem statement
5.2 MS2: Traditional solution to supporting rich media services in mobile networks
5.3 MM3C: Multi-Source Mobile Streaming in Cache-enabled CCN
5.4 MM3C performance evaluation
5.5 Summary
6 Mobile Data Offloading: A Named Data Networking Approach
6.1 Challenges in mobile traffic
6.2 Integrating NDN with LTE
6.3 Network architecture
6.4 Results analysis
6.5 Summary
7 Conclusions and future works
7.1 Conclusions
7.2 Recommendations for future works
Acknowledgements
References
Appendix 1 Academic papers published during the PhD period
Appendix 2 OPNET Modeler simulation tool
1 Introduction
This chapter introduces the history of Information Centric Networking (ICN), focusing on Content Centric Networking (CCN) and its technology. Starting from the motivation behind the CCN architecture project, we introduce our research on smart caching, content sharing, network architecture, and smart services provisioning in mobile networks.
1.1 Overview and motivation
The network technology for today's Internet was created in the 1960s, and it still speaks only of connections between hosts, e.g., client-server and Peer-to-Peer (P2P) connections. Therefore, the existing network architecture faces challenges. That is to say, host-to-host connection is highly ineffective at distributing content to large numbers of users of services such as YouTube in the US, Youku in China or Daum in Korea. Internet users and mobile subscribers are unsatisfied with the performance in terms of delay, jitter and throughput [1]. The existing P2P network architecture faces challenges as well: the increasing demand for mass distribution and replication of large amounts of resources has driven the development of P2P networking. The other challenge in P2P networking is how to perform optimal peer selection. If a suboptimal P2P peer is selected, expensive inter-provider traffic results [2-3].
Content Delivery Network (CDN) overlays try to solve a fundamental challenge for the Internet: how to distribute and retrieve content effectively while reducing the delay observed at the terminal host. CDNs such as Akamai and Limelight mitigate this problem by placing caches at strategic locations in the network. They are essentially a large overlay infrastructure comprised of a large number of caches, and they serve contracted data. The basic approach to addressing the performance problem is to move the content from the origin servers to places at the edge of the Internet. Serving content from a local replica server typically gives better performance (lower access latency, higher transfer rate) than serving it from the origin server, and using multiple replica servers to service requests costs less than using only the data communications network. CDNs take precisely this approach. On the technical level, a CDN provides content caching and services distributed over the Internet. On the physical level, a CDN consists of some central servers and a number of edge servers placed in wide geographic locations. Using a CDN, content requests from users are redirected to the nearest edge server, which relieves the backbone network bottleneck and provides better quality of service [4].
However, with the exponential growth of Internet traffic, especially video traffic, CDNs require very high costs for large storage in the edge servers [5-6]. Moreover, deployment considerations force them to place these content caches at the peering-point edges. Carriers are averse to having third parties such as Akamai and Limelight install their content caches in the carriers' Points of Presence (PoPs). Thus, these content caches are not integrated into the network; rather, they connect to the network like an external application. Another disadvantage of CDNs is that these services are specific to contracted applications that are specifically modified to use them.
The proliferation of video in the past few years and its demanding Constant Bit Rate (CBR) communication patterns have further limited the benefits of using CDNs. Content caches at peering edges do not reduce traffic on the network backbones; they only reduce peering traffic. Traffic demands on the network backbone are growing due to growing video traffic, and carriers cannot easily upgrade their backbones to handle this traffic surge. Besides CDNs, cloud computing provides elastic infrastructure and pay-as-you-go pricing. These characteristics make cloud computing a suitable solution to the large-storage drawback of CDNs.
Nowadays, Internet subscribers care about the data content they wish to get much more than about where the content comes from. Information-Centric Networking (ICN) has emerged as a promising candidate for the architecture of the future Internet [7-8]. Inspired by the fact that the Internet is increasingly used for information dissemination, rather than for pair-wise communication between end hosts, ICN aims to reflect current and future needs better than the existing Internet architecture. By naming information at the network layer, ICN favors the deployment of in-network caching and multicast mechanisms, thus facilitating the efficient and timely delivery of information to users. However, there is more to ICN than information distribution; related research initiatives employ information-awareness as the means for addressing a series of additional limitations in the current Internet architecture, for example mobility management and security enforcement, so as to fulfill the entire spectrum of future Internet requirements and objectives.
The ICN architectures leverage in-network storage for caching, multiparty communication through replication, and interaction models that decouple senders and receivers. The common goal is to achieve efficient and reliable distribution of content by providing a general platform for communication services that are today only available in dedicated systems such as P2P overlays and proprietary content distribution networks. The following recent projects represent four approaches being actively developed: Data-Oriented Network Architecture (DONA) [9]; Content-Centric Networking (CCN) [10], currently in the Named Data Networking (NDN) project [11]; Publish-Subscribe Internet Routing Paradigm (PSIRP) [12], now in the Publish-Subscribe Internet Technology (PURSUIT) project [13]; and Network of Information (NetInf) [14] from the Design for the Future Internet (4WARD) project [15], currently in the Scalable and Adaptive Internet Solutions (SAIL) project [16].
1.2 Content Centric Networking (CCN): the Next Generation Network (NGN) for the Future Internet (FI)
In order to alleviate the bandwidth problem while considering the behavior of Internet subscribers, CCN is proposed to effectively distribute popular data content to a huge number of users. CCN was first proposed by V. Jacobson in 2009 [10]. At this time, CCN is not only a theoretical proposal but also a capability for real-world implementation; thus, many research efforts, projects and prototypes have applied CCN [17-18]. Similar to NDN, CCN is a network architecture built on Internet Protocol (IP) engineering principles, but it treats content as a primitive. IP infrastructure services that have taken decades to evolve, such as Domain Name Service (DNS) naming conventions and namespace administration or inter-domain routing policies and conventions, can be readily used by CCN. Indeed, because CCN's hierarchically structured names are semantically compatible with IP's hierarchically structured addresses, the core IP routing protocols, Border Gateway Protocol (BGP), Intermediate System to Intermediate System (IS-IS) and Open Shortest Path First (OSPF), can be used as-is to deploy CCN co-existing or overlaid with IP.
It should be noted that CCN and NDN are used interchangeably in this thesis. CCN refers to the architecture project V. Jacobson started at PARC, which included leading the development of a software codebase that represents a baseline implementation of this architecture. NDN refers to the National Science Foundation (NSF)-funded Future Internet Architecture project, a 12-campus collaboration that began in 2010 and included PARC. The NDN project originally used CCN as its codebase, but as of 2013 it has forked a version to support needs specifically related to the NSF-funded architecture research and development.
CCN is built around secure name-based packet forwarding. While CCN is a new architecture for efficient, secure distribution of content, it is compatible with today's Internet. A CCN network is built around CCN nodes that perform name-based forwarding of packets between content consumers and content providers. CCN nodes also transparently cache content as it is being transferred, and respond with cached content when possible. Similar to IP, CCN may be deployed over any layer-2 technology. CCN can also be layered over IP, or over a higher protocol such as HTTP, should that be an expedient means to avoid firewalls. IP can also be deployed over CCN. Deploying CCN in the existing Internet will initially layer all CCN packets on IP. Explicit tunneling between CCN nodes will be used to bridge segments of the network where no CCN nodes are placed. This feature allows incremental deployment of CCN in the Internet. Eventually, we envision regions in the Internet that primarily use CCN, with residual IP traffic possibly layered on native CCN nodes.
1.2.1 CCN architecture and operation workflow
In the CCN architecture, the basic operation of a CCN node is similar to that of an IP node. CCN nodes receive and send packets over faces. A face is a connection point to an application, another CCN node, or some other kind of channel. A face may have attributes that indicate expected latency and bandwidth, broadcast or multicast capability, or other useful features. In CCN, two types of packets are envisioned, identified by a content name that is typically hierarchical and human-readable: interest packets (IntPk) and data packets (DataPk). A CCN node accepts an IntPk and either sends it out on an outgoing face or replies directly with a matching content packet. If the node has forwarded an IntPk, it should normally receive a responding content packet, which it will send to the original requesting face.
Figure 1.1 Example of a client sending an IntPk and receiving a DataPk
It is important to note that CCN provides a naming framework; CCN names need not be human-readable. For names that are intended to be human-readable, the current convention is to use UTF-8 encoding. A CCN name is similar to a URI, where each segment has a label and a value. The label identifies the purpose of the name component, such as a general name component used for routing, or a specialized component used for sequence numbers or timestamps. There are also application-specific labels. An example CCN name for content might look like: /hust.edu.com/epiclab/documents/CCN-Introduction.pdf

In the above example, the first segment can be a globally routable name under the control of some organization and can be derived from a Domain Name Service (DNS) name assigned to that organization. Other conventions can also be used. The remaining segments in the name are application-specific. Protocol-specific segments such as version numbers or chunk numbers can also be included in the name. CCN name matching is based on exact equality per segment. Exact matching of two names requires a match for all corresponding segments. Prefix matching requires that all segments in the prefix be equal to the corresponding segments in the name. Since name matching examines entire segments, simple hash techniques can provide efficient name matching.
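The segment-wise matching rules above can be sketched as follows; the helper names are illustrative, and the split on "/" follows the example name format used in this chapter:

```python
def segments(name: str):
    """Split a hierarchical CCN name into its components."""
    return [s for s in name.split("/") if s]

def exact_match(a: str, b: str) -> bool:
    """Exact matching: all corresponding segments must be equal."""
    return segments(a) == segments(b)

def prefix_match(prefix: str, name: str) -> bool:
    """Prefix matching: every segment of the prefix must equal the
    corresponding leading segment of the name."""
    p, n = segments(prefix), segments(name)
    return len(p) <= len(n) and n[:len(p)] == p

# The example name from the text
name = "/hust.edu.com/epiclab/documents/CCN-Introduction.pdf"
```

Because whole segments are compared, each segment (or segment list) can be hashed once and matched in constant time, which is the efficiency point made above.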
A typical CCN framework is illustrated in Fig. 1.2. It presents a simple and effective communication model. Typically, a CCN node includes two data tables for name-based routing and a cache: the Forwarding Information Base (FIB), the Pending Interest Table (PIT) and the Content Store (CS).
Once a CCN node receives an IntPk, it looks up its CS. If appropriate content is found, a DataPk is sent in response; otherwise, the IntPk is checked against the PIT. The PIT keeps track of unsatisfied IntPks. After the PIT creates a new entry for an unsatisfied IntPk, the IntPk is forwarded upstream towards a potential content source based on the FIB's information.
A returned DataPk is sent downstream and stored in the CS. In general, a content is cached at routers for a certain time; when the caching deadline expires, the content is removed to cope with the limited size of the content store. When the CS is full and new content arrives, existing content is evicted according to the underlying replacement policy to leave space for the new content. Least Recently Used (LRU), Least Frequently Used (LFU) and First In First Out (FIFO) are a few notable examples of replacement policies for CCN.
Figure 1.2 A typical CCN framework
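The lookup order described above (CS, then PIT, then FIB) can be sketched as below. The class and method names are illustrative, not part of any CCN codebase; the Content Store uses LRU, one of the replacement policies just mentioned, and the capacity is an assumed parameter:

```python
from collections import OrderedDict

class CCNNode:
    def __init__(self, cs_capacity=2):
        self.cs = OrderedDict()   # Content Store: name -> data, kept in LRU order
        self.pit = {}             # Pending Interest Table: name -> requesting faces
        self.fib = {}             # Forwarding Information Base: prefix -> upstream face
        self.cs_capacity = cs_capacity

    def on_interest(self, name, face):
        """Process an incoming IntPk received on a given face."""
        if name in self.cs:                      # CS hit: reply directly
            self.cs.move_to_end(name)            # refresh its LRU position
            return ("data", self.cs[name])
        if name in self.pit:                     # already pending: aggregate faces
            self.pit[name].add(face)
            return ("aggregated", None)
        self.pit[name] = {face}                  # new PIT entry, forward via FIB
        for prefix, upstream in self.fib.items():
            if name.startswith(prefix):
                return ("forwarded", upstream)
        return ("dropped", None)                 # no matching route

    def on_data(self, name, data):
        """Process a returned DataPk: satisfy PIT entries and cache in CS."""
        faces = self.pit.pop(name, set())
        if len(self.cs) >= self.cs_capacity:     # replacement: evict the LRU entry
            self.cs.popitem(last=False)
        self.cs[name] = data
        return faces                             # downstream faces to forward the DataPk to
```

Note how PIT aggregation means a second IntPk for pending content generates no extra upstream traffic; the single returning DataPk satisfies all recorded faces.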
1.2.2 Transport and congestion control
CCN can run over any layer-2 technology or above. The CCN protocol requires very little functionality from the underlying packet transport: CCN simply assumes stateless, unreliable, unordered, best-effort delivery of packets. Thus, IntPks and/or DataPks might be lost or corrupted in transit, or the requested content might be unavailable. In CCN, the return path for the DataPk is the reverse of the path for the IntPk. Intermediate CCN nodes might fail as well, resulting in a DataPk not reaching the consumer (the application that sent the initial IntPk). Reliability in CCN is achieved by retransmitting IntPks that have not yet been satisfied. The onus is on the content consumer to retransmit unsatisfied IntPks. The consumer can incorporate a timeout for every IntPk it sends; upon a timeout, an unsatisfied IntPk can be retransmitted.
CCN enforces flow balance, so that one IntPk results in at most one DataPk. As in TCP, it is not necessary, or even desirable, to wait for a reply before sending out the next IntPk, so many IntPks may be in flight simultaneously. The IntPks in CCN are analogous to window advertisements in TCP. Unlike TCP, however, lost packets do not stall the pipeline, as CCN packets are independently named and are not part of a specific conversation. CCN maintains flow balance to enable efficient communication over links with varying latencies and bandwidths and nodes with varying capabilities. Flow balance at every node allows for simple and effective techniques to avoid congestion. The underlying transport may receive duplicate IntPks and DataPks. All duplicate DataPks are dropped by CCN nodes. Duplicate IntPks received over different paths are also detected and dropped through the use of a hop limit. CCN transport can forward an IntPk on multiple faces; this feature is especially useful in dynamically determining the best face to use based on varying network and traffic conditions.
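The consumer-side behavior described in this section, keeping a window of Interests in flight and retransmitting any that time out, can be sketched as below; the window size and timeout are assumed parameters, not values mandated by CCN:

```python
class Consumer:
    """Illustrative consumer-side reliability: many IntPks may be in
    flight at once, and unsatisfied ones are resent after a timeout."""
    def __init__(self, timeout=1.0, window=4):
        self.timeout = timeout
        self.window = window
        self.pending = {}             # name -> time the IntPk was last sent

    def send(self, name, now):
        """Send an IntPk if the window allows it."""
        if len(self.pending) < self.window:
            self.pending[name] = now
            return True               # IntPk goes out on a face
        return False                  # window full: wait for a DataPk first

    def on_data(self, name):
        """One IntPk is satisfied by at most one DataPk (flow balance)."""
        self.pending.pop(name, None)

    def retransmit_due(self, now):
        """Names whose IntPk timed out and should be resent."""
        due = [n for n, t in self.pending.items() if now - t >= self.timeout]
        for n in due:
            self.pending[n] = now     # resend and restart the timer
        return due
```

Because each Interest is independently named, a timeout on one name does not stall the others, which is the contrast with TCP's ordered pipeline noted above.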
1.2.3 Security
CCN supports content-based security, rather than connection-based security. In CCN, every DataPk is authenticated with a digital signature, and private content is encrypted. This key attribute is what enables CCN to retrieve content from any CCN node without contacting the content's source. Since the name and the content are cryptographically bound together, it is not necessary to trust the nodes or the connection.

Signature verification provides a means for securely associating a name, a publisher and content together. With appropriate verification, only valid content will pass to the client. As a side benefit, verification provides error detection for DataPks with a high degree of reliability. Every DataPk has the following fields:
Signature
Name: the full CCN name for the content
SignedInfo: key hash, key locator, and other details
Data: the binary representation of the content
Every DataPk is signed with a signing key, such as the private key of a public/private key pair. The signature is computed over the entire packet content exclusive of the Signature field, and so includes the Name, the SignedInfo and the Data. The exact method for computing the signature depends on the cryptographic algorithm associated with the signing key.
The SignedInfo includes a hash of the key, a key locator, a timestamp and some additional details not within the scope of this document. A key locator may either include the public key directly, or it may contain a name used to find the key. Signature verification proves that the content can be trusted, provided that the key can be trusted. CCN does not mandate policy decisions about trust; the level of trust in the key depends on a trust model that is chosen by the client.
The information included in the DataPk must support a reasonable set of trust models, but is not dependent on them. In practice, we expect several trust models, including a choice of public key infrastructure. By providing a mechanism for transporting authenticated data, CCN makes key distribution itself more reliable, as keys can be distributed as CCN Content Objects. Privacy can be incorporated in CCN and can be layered on top of CCN.
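The rule that the signature covers the Name, SignedInfo and Data, exclusive of the Signature field itself, can be sketched as follows. To keep the sketch dependency-free, HMAC-SHA256 stands in for the public-key signature a real CCN node would use, and the packet layout is an illustrative assumption:

```python
import hashlib
import hmac
import json

def sign_data_packet(name, data, key):
    """Build a DataPk-like record whose Signature covers Name, SignedInfo
    and Data (HMAC here plays the role of the signing key)."""
    signed_info = {
        "key_hash": hashlib.sha256(key).hexdigest(),  # hash of the signing key
        "timestamp": 1234567890,                       # fixed for reproducibility
    }
    # Serialize everything except the Signature field, then sign it.
    payload = json.dumps([name, signed_info, data.hex()]).encode()
    return {
        "Name": name,
        "SignedInfo": signed_info,
        "Data": data,
        "Signature": hmac.new(key, payload, hashlib.sha256).hexdigest(),
    }

def verify(pkt, key):
    """Recompute the signature over everything except the Signature field."""
    payload = json.dumps([pkt["Name"], pkt["SignedInfo"], pkt["Data"].hex()]).encode()
    expected = hmac.new(key, payload, hashlib.sha256).hexdigest()
    return hmac.compare_digest(pkt["Signature"], expected)
```

Since the name is inside the signed payload, tampering with either the name or the data invalidates the signature, which is the name-to-content binding the text describes.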
1.2.4 Publishing
A publisher can publish its content under a given name prefix. For example, all content published by the EPIC lab can be under the "/epiclab" name prefix. When an IntPk is delivered to the publisher, a DataPk can be generated in response. The publisher signs each DataPk with the publisher's private key so it can be verified using the publisher's known public key.
In CCN, content that changes over time is represented as having multiple versions. It is possible to represent new versions that are minimally changed from old versions as a set of changes along with a reference to the older version, and this approach can provide significant bandwidth savings. However, such a convention would be layered on top of the core CCN protocol; CCN just provides a naming scheme that supports different versions. Since CCN is a form of packet switching, large content must be represented as multiple chunks. This feature allows for partial caching and delivery of partial results before the entire content can be generated. There is no mechanism in the core CCN protocol for deleting or revoking access to content, as it is not always possible to find all of the copies in an extended network. Rather, when publishing content, one gives it a time-to-live, so that cached copies expire. Thus, only the originating sources for the content need to have their copies deleted to eventually ensure that the content is no longer in the network (although clients cannot be guaranteed not to have copies).
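The versioning, chunking and time-to-live conventions described above can be sketched as follows; the name layout `/prefix/vN/chunkM`, the chunk size and the TTL value are illustrative conventions layered on top of the core protocol, not mandated by it:

```python
def publish(prefix, version, content: bytes, chunk_size=4, ttl=60.0, now=0.0):
    """Split content into named, versioned chunks; each carries an expiry
    time so cached copies age out of the network."""
    chunks = {}
    for i in range(0, len(content), chunk_size):
        name = f"{prefix}/v{version}/chunk{i // chunk_size}"
        chunks[name] = {"data": content[i:i + chunk_size], "expires": now + ttl}
    return chunks

def fetch(chunks, name, now):
    """Return cached chunk data, or None if absent or past its time-to-live."""
    entry = chunks.get(name)
    if entry is None or now > entry["expires"]:
        return None
    return entry["data"]
```

Because each chunk is independently named, a cache can hold and serve part of a large object before the rest is even generated, which is the partial-delivery property noted above.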
1.3 Research on smart caching, content sharing, network architecture and smart services provisioning in mobile networks
Against the background of the CCN protocol introduced above, caching decisions, replacement policies and content sharing play crucial roles in CCN's overall performance. We research smart caching to achieve a higher hit rate and faster convergence speed. Moreover, we research integrating CCN into existing network architectures and services to improve Quality of Service (QoS) and Quality of Experience (QoE).
1.3.1 Research on smart caching
(1) Prefix-based Prediction-oriented Cooperative Caching (PPCC)
We propose two novel replacement policies, named Prefix-based Prediction-oriented (PP) caching and Prefix-based Prediction-oriented Cooperative Caching (PPCC). PP maintains a prefix tree of all contents in the CS, from which it can determine popularity levels and assign a suitable lifetime to each content, including newly arriving content. PPCC extends PP with periodic exchange of prefix information among nearby CCN nodes.
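The prefix-tree idea can be illustrated as below. This is only a minimal sketch of counting requests per name prefix and deriving a lifetime from prefix popularity, not the PP/PPCC algorithm itself; the counting rule and the lifetime formula are assumptions made for the example:

```python
class PrefixTree:
    """Count requests along each name prefix, so a newly arriving content
    can inherit the popularity of its prefix."""
    def __init__(self):
        self.count = 0
        self.children = {}

    def record(self, name):
        """Record one request, incrementing counts along the whole prefix path."""
        node = self
        for seg in name.strip("/").split("/"):
            node = node.children.setdefault(seg, PrefixTree())
            node.count += 1

    def popularity(self, name):
        """Count accumulated at the deepest known prefix of the name."""
        node = self
        for seg in name.strip("/").split("/"):
            if seg not in node.children:
                break
            node = node.children[seg]
        return node.count

def lifetime(tree, name, base=10.0):
    """Illustrative rule: cache lifetime grows with prefix popularity."""
    return base * (1 + tree.popularity(name))
```

The point of the tree is visible in the last assertion below: a never-seen content under a popular prefix still receives a non-trivial popularity estimate.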
(2) Fine-Grained Popularity-based Caching (FGPC)
FGPC maintains a large table with three kinds of statistical information, namely i) content names, ii) popularity levels of content, obtained by counting the frequency of appearances of each content name, and iii) time stamps of used contents located in the cache. By filtering popular content based on this table, FGPC achieves effective caching with a high hit ratio.
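The three statistics FGPC tracks — content name, appearance count, and last-used timestamp — can be sketched as a table driving a cache-admission decision. This is an illustrative sketch only, and the popularity threshold is an assumed parameter, not one taken from the FGPC algorithm:

```python
class PopularityTable:
    def __init__(self, threshold=3):
        self.table = {}           # name -> [appearance_count, last_used_time]
        self.threshold = threshold

    def observe(self, name, now):
        """Record one appearance of a content name and refresh its timestamp."""
        entry = self.table.setdefault(name, [0, now])
        entry[0] += 1
        entry[1] = now

    def should_cache(self, name):
        """Admit into the cache only content whose appearance count has
        reached the popularity threshold (the filtering step)."""
        return self.table.get(name, [0, 0.0])[0] >= self.threshold
```

The timestamp field would additionally let stale entries be aged out of the table; that pruning step is omitted here for brevity.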
1.3.2 Research on content sharing and network architecture
(1) Wireless Body Area Network (WBAN)
A WBAN includes a set of body sensor nodes which are placed around the human body, collecting data and sending them to a medical center. In order to deliver the body signals to remote terminals in a timely fashion, an extended communication architecture dubbed "beyond-BAN communication" was proposed [19-20]. However, existing architectures are not suitable for scenarios with high mobility of both patients and physicians, due to the fluctuation of wireless links. Furthermore, when the amount of healthcare content is large, the quality of delivery is hard to guarantee.
To address these challenging issues, we propose a novel network architecture which integrates WBAN with Long Term Evolution (LTE) networking and CCN [21-22]. The integration with LTE enlarges the radio coverage and guarantees the quality of wireless transmissions, while the integration with CCN leverages edge-router caching to enhance the capacity of the WBAN coordinator, and avoids packet loss by adapting to dynamic wireless link conditions with the adaptive streaming technique. The experimental results obtained with OPNET Modeler prove that our solution improves the Quality of Service (QoS) of WBAN transmission significantly.
(2) Vehicular Ad-hoc Network (VANET)
VANET is a technique that uses moving vehicles as wireless nodes in a mobile network, in which each wireless node acts as both an end user and a wireless router to support wide-range communications. Motivated by the increasing demand for efficient and reliable information dissemination and retrieval, Named Data Networking (NDN)[1] presents a simple and effective communication model [10, 23]. We propose our solution, Vehicular Named Data Networking (VENDNET), by inheriting the basic principles of NDN. However, extending the NDN model to VANET is not straightforward due to the many challenges in the vehicular environment, such as limited and intermittent connectivity and node mobility.
We first introduce some of the challenges in different types of vehicle communication mechanisms [24-25]. Motivated by the NDN model simulation, the VENDNET performance is assessed by comparing VANET under two scenarios: with a typical client-server connection and with an NDN connection.
1.3.3 Research on smart services provisioning in mobile network
(1) Seamless service migration in mobile network
In the era of cloud and mobile social computing, a plethora of mobile applications demand that Mobile Stations (MSs) interact seamlessly with the cloud anywhere, in a real-time fashion. The mobility characteristic may cause a service for an MS to be migrated between different Data Centers (DCs); otherwise, packet delay increases due to the considerable geographical distance between the MS and the serving DC. With the current networking architecture, the IP address of either the MS or the Virtual Machine (VM) changes because of the VM migration, and the IP session between the two peers is released and the service is disrupted as a result.

[1] CCN and NDN are used interchangeably with the same meaning.
We leverage the emerging CCN as a straightforward solution to these issues. Based on unique content name identification, instead of IP addresses, service migration can be continuous. Furthermore, a seamless service migration framework is proposed to conduct the user's service request to the optimal DC, which satisfies user requirements, minimizes network usage and ensures application Quality of Experience (QoE).
(2) MM3C: Multi-Source Mobile Streaming in Cache-enabled CCN
Along with an ever-growing demand for rich video applications by an ever-increasing population of mobile users, it is becoming difficult for the Internet backbone to cope with constantly increasing mobile traffic. Some previous research works aim at solving the bottleneck issue of the Internet backbone by considering simultaneous multiple low-rate streaming transmissions to mobile users. However, this multi-source streaming approach does not consider the redundant transmission of popular contents.
A novel scheme integrating CCN with Multi-Source Mobile Streaming (MS2), dubbed MM3C, is presented as a better solution to alleviate the problem. If a content is popular, the previously queried content can be reused multiple times to save bandwidth capacity, reduce overall energy consumption and improve users' Quality of Experience (QoE). The performance of MM3C is evaluated using OPNET Modeler. Compared to MS2 under the same network configuration, the simulation results show that MM3C exhibits fewer bottleneck links and a smaller average round-trip time, while achieving better performance in terms of traffic offloading.
(3) Mobile Data Offloading: A Named Data Networking Approach
Nowadays, the mobile Internet is becoming increasingly popular and the number of mobile users is growing exponentially. Mobile Internet subscribers often access multimedia services, which consume a large amount of bandwidth in backbone transmissions as well as server resources. With traditional client-server connections, servers are usually overloaded by a huge number of users accessing the service at the same time. Moreover, in the era of the "green" Internet, content distribution to a huge number of users should be more efficient.
Named Data Networking (NDN) has been proposed as a promising solution to the above problem; it is a content-name-oriented approach for disseminating content to edge gateways/routers. In NDN, a content is cached at routers for a certain time. When the deadline is reached, the content is removed to yield space, due to the limited size of the content store. If some content is popular, the previously queried copy can be reused multiple times to save bandwidth.
In this section, we propose a solution for the Long Term Evolution (LTE) mobile network based on the concept of NDN. Furthermore, a novel caching policy dubbed Fine-Grained Popularity-based Caching (FGPC) is proposed. Through OPNET Modeler simulation, we carry out the evaluation in a realistic mobile network with a huge number of mobile LTE users accessing a single server, where content names and content sizes are obtained from a real trace of Internet traffic. The obtained results show that the Evolved Packet Core (EPC) caching scheme helps satisfy most of the mobile users' requests, which further increases the quality of service as well as offloading server traffic significantly.
1.4 Summary
While IP has exceeded all expectations for facilitating ubiquitous interconnectivity, it was designed for conversations between communication endpoints but is overwhelmingly used for content distribution. Furthermore, today’s applications are typically written in terms of what information they want rather than where it is located. The tremendous growth of the Internet and the introduction of new applications to fulfill emerging needs have given rise to new architectural requirements, such as support for scalable content distribution, mobility, security, and trust. From the above motivation, CCN was successfully designed for the Future Internet with a number of key functionalities: named routing, on-path caching, mobility support, and security.
It should be noted that OPNET Modeler is used as the simulation tool for this research [26-27]. OPNET Modeler was selected because most wired and wireless network components are available in OPNET Modeler 16.0, in which a number of different models can be created, simulated, analyzed, and compared. OPNET can also offer students broader insight into networking technologies, simulation techniques, and the impact of applications on network performance, making them feel like real network engineers.
In this section, I have presented the CCN architecture, operation workflow, and applications. I have also given a brief introduction to my research on smart caching, content sharing, network architecture, and smart service provisioning in mobile networks. In the next sections, these research areas are described one by one in detail.
2 CCN performance improvement with smart caching
Having described the CCN network architecture and operation workflow, and having given a brief introduction to my research, I now present novel smart caching policies that significantly improve CCN performance.
2.1 Related works and their drawbacks
Recently, CCN has become a hot research area, and several projects and prototypes have applied CCN [28-30]. The Least Recently Used (LRU) and Least Frequently Used (LFU) replacement policies were proposed in the original CCN [31]. At a CCN node, all contents in the CS are marked with a time stamp [31]. When a content is returned to satisfy an IntPk, the time stamp of the used content is updated. In the LRU (LFU) policy, the CS maintains the same lifetime for all contents, and the CS periodically refreshes the memory by checking the lifetime of all contents. At refresh time, a content is deleted if the difference between the current time and its time stamp is equal to or greater than the lifetime. In other words, recently used content is kept; otherwise, the content is deleted.
The drawback of the LRU (LFU) policy appears only when the buffer memory is full. Generally, LRU (LFU) makes a replacement decision (keep or delete) based on the time stamp (for LRU) or the access count (for LFU) of each content, so it cannot recognize the popularity level of incoming content. Consider a situation in which the cache is full and a new, unpopular content arrives: a popular content in the CS then has to be deleted and replaced by the arriving unpopular content. For this reason, the hitting ratio of LRU (LFU) cannot reach its maximum value. From the above observation, LRU and LFU ignore the advantages of CCN:
 LRU and LFU base replacement decisions only on the object name: LRU uses a time stamp per object name, while LFU uses the request frequency of the object name. Because they do not consider the popularity of the object's prefixes, they cannot recognize the popularity level of newly arriving content.
 All contents have the same lifetime, although their popularity levels are heavily skewed.
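The baseline behaviour described above can be sketched as follows; this is a minimal Python model, not the OPNET implementation, and the class name, the capacity measured in number of contents, and the use of logical time stamps are illustrative assumptions:

```python
class FixedLifetimeLRU:
    """Sketch of the baseline CS refresh: every content shares one fixed
    lifetime, and a periodic refresh deletes anything not used recently."""

    def __init__(self, capacity, lifetime):
        self.capacity = capacity          # max number of contents in the CS
        self.lifetime = lifetime          # identical lifetime for all contents
        self.store = {}                   # content name -> last-used time stamp

    def access(self, name, now):
        """Serve an IntPk: update the time stamp on a hit, insert on a miss."""
        hit = name in self.store
        if not hit and len(self.store) >= self.capacity:
            # Cache full: evict the least recently used content, even if the
            # incoming content is unpopular -- the drawback noted above.
            victim = min(self.store, key=self.store.get)
            del self.store[victim]
        self.store[name] = now
        return hit

    def refresh(self, now):
        """Periodic refresh: delete contents whose age reached the lifetime."""
        for name in [n for n, ts in self.store.items()
                     if now - ts >= self.lifetime]:
            del self.store[name]

cs = FixedLifetimeLRU(capacity=2, lifetime=30.0)
cs.access("/hust/a", now=0.0)
cs.access("/hust/b", now=1.0)
cs.access("/hust/c", now=2.0)          # cache full: "/hust/a" is evicted
cs.refresh(now=40.0)                   # both remaining contents have expired
```

Note how the eviction step has no notion of popularity: an unpopular newcomer displaces whatever is least recently used.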
In order to improve the precision of content replacement decisions, a novel caching strategy, named Most Popular Content (MPC), was proposed in [32]. In MPC, every router/gateway counts the local number of requests for each content name and stores the pair (content name, popularity count) in a content popularity table. Once the popularity of a content object reaches a predetermined threshold at a caching node, the object is tagged as popular and is stored in the cache. By storing only popular content, MPC caches less content, saves resources, and reduces the number of cache operations, which lets it achieve a higher hitting ratio than LRU and LFU.
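The MPC filtering rule can be sketched as follows; this is a simplified Python model, and the threshold value and class interface are illustrative assumptions rather than the implementation of [32]:

```python
from collections import defaultdict

class MPCCache:
    """Sketch of MPC: count requests per content name and cache a content
    only once its count reaches the popularity threshold."""

    def __init__(self, threshold):
        self.threshold = threshold
        self.popularity = defaultdict(int)   # content name -> request count
        self.cache = set()

    def on_request(self, name):
        """Return True on a cache hit; otherwise count the request and store
        the content once it is tagged as popular."""
        if name in self.cache:
            return True
        self.popularity[name] += 1
        if self.popularity[name] >= self.threshold:
            self.cache.add(name)             # tagged popular -> stored
        return False

mpc = MPCCache(threshold=3)
hits = [mpc.on_request("/video/top") for _ in range(5)]
# The first three requests miss (counts 1..3); the content is cached on the
# third request, so the fourth and fifth requests hit.
```

This also makes the convergence drawback visible: no content is served from the cache until its counter has crossed the threshold.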
Although some memory space is saved in MPC, cache utilization is typically not high. Based on our observation, higher hitting rates can be achieved by intelligently utilizing vacant cache memory: when there is available memory, even unpopular content can be stored, and once the cache becomes full, unpopular contents are removed to yield space for popular ones. Furthermore, MPC's strategy of caching less content results in a slow convergence of its hitting-rate performance.
From the above observations, we propose two novel replacement policies, named Prefix-based Prediction-oriented Cooperative Caching (PPCC) and Fine-Grained Popularity-based Caching (FGPC). The next two sections present them in detail.
2.2 Popularity Prediction and Cooperative Caching (PPCC)
In our proposed Prefix-based Prediction-oriented (PP) policy, a prefix tree (PT) is created to maintain all contents in the CS; PP can then determine popularity levels and assign a suitable lifetime to each content, including newly arriving content. PPCC is PP with a periodic exchange of prefix information among nearby CCN nodes. In contrast to the existing LRU and LFU solutions, the PP and PPCC schemes have the following unique features:
 We carefully investigate the characteristics of CCN, where the "name" of a data content includes prefixes, and add a table to handle prefix names and prefix counters.
 The way Internet users request content is well fitted by the Pareto principle; that is, the 20% of contents that are popular are requested by 80% of users [33-37].
 The prefixes of popular data contents always appear with high probability, and vice versa.
 Content with popular prefixes (e.g., from a popular publisher server, or related new content) will also be popular.
From the background of CCN, we import the CCN protocol into all network elements on top of the IP layer. The existing LRU and LFU policies and our proposed PP and PPCC replacement policies have been successfully constructed in the CS. Our simulation results show that CCN is a good solution to the existing challenges of the traditional IP network, and that PP and PPCC outperform LRU and LFU with highly effective caching.
2.2.1 Prefix handle
Names of content are hierarchically structured and human readable. A CCN node uses longest-prefix matching to find a name in the CS, PIT, and FIB when serving arriving IntPks. For notational convenience, names are presented like Uniform Resource Identifiers (URIs), with "/" characters separating components [10]. In our work, we use the format "root/prefix1/prefix2/prefix3/prefix4/prefix5/" to present a name, and the position of a prefix's name in the URI indicates the prefix's level. We add one more table to the CCN node, called the Prefix Tree (PT), to maintain prefix names and prefix counters. At the top of the tree structure, the "root" element is the globally routable name. The CCN publishing server broadcasts the root name to the whole network. When CCN nodes receive the root name, they add it to their routing table (FIB) and then continue forwarding the root name to their neighbors. The CCN publishing server needs to refresh the root name by sending it periodically (e.g., every 100 seconds), because either the publishing server or CCN nodes in the middle of the network may break down or move to other locations. All CCN nodes can thus detect a change in the network and reconfigure their FIBs by receiving the root name periodically.
To create a huge number of IntPk names, we set up 10 different prefix names for each prefix level from the first to the fourth level. For the fifth prefix level, i.e., the content name itself, there are 50 file names. If the prefix names of all levels are selected by a random distribution, the total number of IntPk names is up to 5*105 names.
To create the UGC simulation, all prefix names are generated by a power-law distribution (the 80-20 distribution). The probability density function is presented in Eq (2.1):

    f(x) = (c / b^c) · x^(c-1)                                  (2.1)

with 0 ≤ x ≤ b, where c is a shape parameter and b is a scale parameter. Hence, the mean and variance are:

    E(x) = c·b / (c + 1)
    Var(x) = c·b^2 / ((c + 2)(c + 1)^2)

When a CCN node receives an IntPk with the name "ccnx://hust.edu.cn/epiclab/talk/wban/zigbee.pdf", the CS increases the counter of each matching element by one, as presented below:
Here Cp[i] is the counter for the i-th prefix level. The lifetime of each content is then calculated by Eq (2.4):

    lifetime = t_u · Σ(i=1..5) w[i] · Cp[i]                     (2.4)

where t_u is a lifetime-unit variable and w[i] is the weight of Cp[i]. We note that the hitting rate is high when either popular or unpopular content can stay in the cache as long as possible. At CS refresh time, if the CS still has free space to store more content, it does not delete any content, even though the lifetime of some contents may already be over.
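The prefix-counter update and the lifetime computation of Eq (2.4) can be sketched as follows; this is a minimal Python model, and the helper names, the equal weights w[i], and the t_u value are illustrative assumptions:

```python
from collections import defaultdict

# One counter per prefix of each level, as in the PT of Section 2.2.1.
prefix_counters = defaultdict(int)

def register_name(name):
    """Increase the counter of every matching prefix level by one,
    as done when an IntPk such as .../wban/zigbee.pdf arrives."""
    parts = name.split("/")
    for level in range(1, len(parts) + 1):
        prefix_counters["/".join(parts[:level])] += 1

def lifetime(name, t_u, w):
    """Eq (2.4): lifetime = t_u * sum_i w[i] * Cp[i], where Cp[i] is the
    counter of the i-th prefix level of this content name."""
    parts = name.split("/")
    return t_u * sum(w[i] * prefix_counters["/".join(parts[:i + 1])]
                     for i in range(len(parts)))

register_name("hust.edu.cn/epiclab/talk/wban/zigbee.pdf")
register_name("hust.edu.cn/epiclab/talk/vanet/lte.pdf")
# Equal illustrative weights for the five prefix levels.
w = [1.0, 1.0, 1.0, 1.0, 1.0]
life = lifetime("hust.edu.cn/epiclab/talk/wban/zigbee.pdf", t_u=10.0, w=w)
# Cp = [2, 2, 2, 1, 1] for this name, so lifetime = 10 * 8 = 80
```

A content whose upper prefix levels are shared by many requests thus automatically receives a longer lifetime, even before the exact name becomes popular.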
Fig 2.1 shows simulation results for the total bits cached at CCN nodes under LRU (LFU) and PPCC (PP). For LRU (LFU), the lifetime of all contents is fixed at 30 minutes, while the lifetime for PPCC (PP) varies according to Eq (2.4). As shown in Fig 2.1, after the total cached bits reach the upper bound of the cache size, LRU (LFU) stays in a stable state in which an old content is replaced by a new one; this is the drawback of LRU (LFU) presented above. For PPCC (PP), a dynamic lifetime is used for each content, and some less popular contents are deleted only when the total cached bits reach the upper bound of the cache size. The total then rolls back to a lower value, keeping space to store newly arriving contents. Moreover, by handling the prefix tree and prefix counters, the most popular contents are selected to stay in the CS, which tends toward effective caching.
Figure 2.1 Total bits cached in a CCN node for LRU(LFU) and PPCC(PP)
Figure 2.2 Request rate and adaptive lifetime unit for PPCC(PP)
The number of arriving IntPks (f_req) is an important metric that strongly affects the replacement algorithm. The request rate arriving at a CCN node oscillates with the variable number of users, especially for mobile Internet users. We handle the dynamic request rate by controlling the lifetime unit t_u. The t_u value is adaptively adjusted based on measuring the number of content requests arriving every 60 seconds. When f_req is high, the CS fills up faster and LRU (LFU) replacement is triggered. To avoid LRU (LFU) deleting popular contents, we decrease t_u as f_req increases. Decreasing t_u reduces the lifetime of all contents, so unpopular contents are found and deleted sooner, before new contents arrive. When f_req is low, we increase t_u to keep contents in the CS as long as possible. For this reason, the product t_u · f_req is a constant. Fig 2.2 shows the request rate arriving at the CCN node every 60 seconds and the adaptively controlled lifetime unit.
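The constant-product rule t_u · f_req can be sketched as follows; this is a Python sketch, and the constant k and the clamping bounds are illustrative assumptions, not thesis parameters:

```python
def adapt_lifetime_unit(f_req, k=3000.0, t_min=1.0, t_max=120.0):
    """Keep t_u * f_req approximately constant: shrink the lifetime unit
    when the request rate measured over the last 60 s is high, enlarge it
    when the rate is low. k, t_min and t_max are illustrative values."""
    if f_req <= 0:
        return t_max                      # no load: keep contents longest
    return max(t_min, min(t_max, k / f_req))

# A high request rate yields a small t_u, so unpopular contents expire
# quickly before the CS overflows; a low rate keeps contents longer.
t_u_busy = adapt_lifetime_unit(300)       # 3000 / 300 = 10.0
t_u_idle = adapt_lifetime_unit(50)        # 3000 / 50 = 60.0
```

The clamping bounds simply keep t_u in a sane range when the measured rate spikes or drops to zero.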
2.2.3 PPCC algorithm
Nearby CCN nodes can cooperate with each other to find out (or predict) popular contents quickly. All CCN nodes notify their neighbors about the most popular prefixes they have found themselves. After CCN nodes receive popular prefixes from others, they assign a longer lifetime to contents that share those popular prefixes. With this simple cooperation between CCN nodes, a higher hitting rate and faster convergence to the final state are achieved.
Figure 2.3 Flowchart for CCN node with Prefix Tree (PT) integration
In our simulation, the cooperation interval is set to 30 seconds, and after CCN nodes find the same popular prefixes from others, they increase the lifetime unit t_u by 10%. Fig 2.3 shows a flowchart of the CCN processor node, illustrating the integration of our caching algorithm into the CCN protocol.
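The cooperative step can be sketched as follows; this is a Python sketch under simplifying assumptions, and the table contents, the number of exchanged prefixes, and the function names are illustrative:

```python
def top_prefixes(counters, n=3):
    """A node's announcement: the n most popular prefixes in its Prefix Tree."""
    return set(sorted(counters, key=counters.get, reverse=True)[:n])

def cooperate(t_u, own_counters, neighbour_top):
    """Every 30 s a node compares its own top prefixes with those announced
    by a neighbour; on any overlap, the lifetime unit t_u is raised by 10%,
    so contents under the shared popular prefixes live longer."""
    if top_prefixes(own_counters) & neighbour_top:
        return t_u * 1.10
    return t_u

node_a = {"/hust": 40, "/pku": 5, "/fudan": 2}
node_b_top = {"/hust", "/tsinghua"}
t_u = cooperate(10.0, node_a, node_b_top)   # shared "/hust" -> t_u grows 10%
```

Only prefix names and counters are exchanged, not content, so the cooperation overhead stays small.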
2.3 PPCC algorithm simulation and results
2.3.1 Network architecture
We implement the CCN protocol and perform the simulation using OPNET Modeler 16.0. At present, ndnSIM is the only well-known open-source simulator specifically designed for NDN [17]. ndnSIM is an NS-3 module specially optimized for simulation purposes. Following the NDN architecture, ndnSIM is implemented as a new network-layer protocol model that can run on top of any available link-layer protocol model. However, ndnSIM is limited in realistic mobile environment simulation (e.g., 2.5G/3G/4G), which OPNET fully supports.
Figure 2.4 CCN node processor model
Caching policy in ICN has been studied by several researchers [38-42]. They share a common method: collecting content with a tracing tool over a long time and then replaying the traced content directly in their caching algorithms. That is, a realistic CCN with packet flows from users to the server through CCN nodes is not considered and is difficult to simulate. From the CCN background, we emphasize that the CCN model is compatible with today’s Internet and has a clear, simple evolutionary strategy. At this moment, CCN is overlaid on the IP layer. We integrate a CCN processing module into all network elements, such as eNodeB, LTE Mobile Station, Evolved Packet Core (EPC), Gateway, PC, Server, and IP Cloud. In this way, we realize a realistic CCN simulation indeed.
Fig 2.4 shows our OPNET model for the CCN node processor, which is integrated into the eNodeB, Router, or Gateway. We apply our new caching policies in both the WAN and LTE networks. To the best of our knowledge, no prior work has addressed a CCN overlay on a mobile network, especially an LTE network, until now.
Fig 2.5 shows the network architecture, which includes a WAN and an LTE network. There are 3 cells in the LTE network. Each cell has an eNodeB, a CCN processor node, and 25 LTE mobile stations (MSs). All MSs request contents from the video server following the UGC pattern; that is, 20 MSs (80% of traffic) request popular contents, while the remaining 5 MSs (20% of traffic) request any contents. We import the CCN protocol into the LTE network via the CCN processor component described in Fig 2.4.
Figure 2.5 Network architecture for LRU(LFU) and PPCC(PP) simulation
There are 2 scenarios in the simulation. The first scenario considers 3 kinds of stand-alone replacement policy: LRU at eNodeB_1 (cell 1), LFU at eNodeB_2 (cell 2), and PP at eNodeB_3 (cell 3). The second scenario considers cooperative caching between cells, with PPCC set up in all 3 cells. The hitting rate, the convergence time to the final state, and the percentage of offloaded server traffic are the important metrics verified in the simulation results. The same configuration and scenarios are used for the WAN network.
2.3.2 Simulation parameters
D. Rossi et al. [43] summarize some of the most relevant system parameters used in related works. This information is believed to reflect the real world. In [43], the number of contents in the considered catalogs ranges from as low as 250 objects up to 20,000 objects. We note that these catalog sizes are extremely small compared to real-world Internet catalogs (108 contents for YouTube and 5,000,000 contents for BitTorrent). In other words, the numbers of contents are all underestimated because of simulation limits. Within the 3,000-second simulation time, we expect 500,000 contents to be sufficient for the nodes to converge to the final state. Because this topic focuses on the popularity of prefix names, we set all file sizes to 1 Mbit (Mb) for simplicity. The catalog size is therefore 500,000 Mb. The cache size is varied among 300 Mb, 500 Mb, 800 Mb, and 1,000 Mb to determine caching performance; the corresponding relative cache sizes are 0.06%, 0.1%, 0.16%, and 0.2%, respectively. Compared with [38] and [41], these relative cache sizes are all within the range considered important.
In [43], the arrival rate of IntPks at a CCN node is in the range [1, 10] Hz. We choose an arrival rate around 5 Hz: with 25 users in one cell, users send IntPks with an interval of 5 seconds. The requested contents are realistic, with randomized request start times and slightly randomized inter-arrival times around 5 seconds for each user. The number of potentially popular contents is created by the power-law distribution; we adjust the shape parameter (c) and the scale parameter (b) to obtain multiple skewness factors of popular content, which are used to consider the impact of popularity skewness on the hitting rate.
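The request generation of Eq (2.1) can be reproduced by inverse-CDF sampling: F(x) = (x/b)^c on [0, b], so x = b · u^(1/c) for uniform u. The sketch below uses illustrative parameter values, not the thesis configuration:

```python
import random

def sample_content_index(b, c, rng):
    """Inverse-CDF draw from Eq (2.1): F(x) = (x/b)**c on [0, b], so
    x = b * u**(1/c). The value is truncated to an integer content index
    in [0, b); with c < 1 the mass concentrates on the low indices."""
    u = rng.random()
    return min(int(b * u ** (1.0 / c)), int(b) - 1)

rng = random.Random(42)
# Illustrative setting: a catalog of 50 file names; the shape c is chosen
# so that F(10) = (10/50)**c is about 0.8, i.e. roughly 80% of requests
# fall on the first 20% of the names (the 80-20 rule).
draws = [sample_content_index(50, 0.14, rng) for _ in range(10_000)]
share_top20pct = sum(d < 10 for d in draws) / len(draws)
```

Sweeping c in this sampler reproduces the different skewness rules of Table 2.1.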
Table 2.1 Summary of multiple skewness factors of popular content

Rule       | Skewness factor (%) | Popular file names (share of requests) | Unpopular file names (share of requests) | Potential popular contents
90-10 rule | 90                  | 5 with 90%                             | 45 with 10%                              | 943
80-20 rule | 80                  | 10 with 80%                            | 40 with 20%                              | 1,677
70-30 rule | 70                  | 15 with 70%                            | 35 with 30%                              | 2,202
60-40 rule | 60                  | 20 with 60%                            | 30 with 40%                              | 2,516
50-50 rule | 50                  | 25 with 50%                            | 25 with 50%                              | 2,621
Table 2.1 summarizes the multiple skewness factors of popular content. The default 80-20 rule is used in the simulation. Over the 3,000-second simulation, the total number of IntPks received by a CCN node is 15,000. This value is much higher than the number of potentially popular contents, so the hitting rate can quickly converge to its final state within the limited simulation time.
Figure 2.6 LRU(LFU) and PPCC(PP) performance comparison: (a) PPCC with multiple relative cache sizes; (b) performance of multiple cache policies; (c) final state of cache policies with varied relative cache sizes; (d) impact of popularity skewness with relative cache size 0.16%
(1) Impact of cache size on hitting ratio
We compare the effectiveness of PPCC across a range of cache sizes. In Fig 2.6(a), the cache size at the CCN node is set in turn to 0.06%, 0.1%, 0.16%, and 0.2%, and the hitting-rate values at the final state are 45%, 55%, 63%, and 63%, respectively. The results show that expanding the cache volume yields a higher hitting rate. However, the hitting rate does not grow linearly with the cache volume. In Fig 2.6(a), we can observe that when the cache size is large enough, at 0.16%, PPCC can handle almost all popular contents. If we continue to enlarge the cache volume to 0.2%, the performance gain decreases. There is a trade-off between cache volume (cost) and algorithm performance, and a balance can be struck to choose a suitable cache size.
(2) Impact of replacement policy on hitting ratio
Fig 2.6(b) illustrates the effect of the replacement policies with the relative cache size fixed at 0.16%. PPCC shows the best performance with the highest hitting ratio, followed by PP, LRU, and LFU, respectively.
Fig 2.6(c) gives a big picture of the final state of the replacement policies at multiple relative cache sizes. PP and PPCC always outperform LRU (LFU) with a higher hitting rate in all situations. There is no big difference in hitting rate between PP and PPCC, or between LRU and LFU. However, with simple cooperation among adjacent cells, PPCC gains a hitting rate 1% or 2% higher than PP.
Fig 2.6(c) also illustrates the impact of cache size on performance. The maximum difference in hitting rate between PPCC and LFU is 10%, at a relative cache size of 0.1%. When the relative cache size increases to 0.16% and 0.2%, the difference reduces to 8% and 5%, respectively. Thus, we reach the same conclusion as [40].
(3) Impact of popularity skewness on hitting ratio
The request pattern follows the power-law distribution of content popularity in Eq (2.1), and the number of potentially popular contents was presented in Eq (2.3). Table 2.1 shows the effect of the skewness factor on the number of potentially popular contents. When the skewness factor is near 100%, a few contents attract the majority of the requests, while when the value is close to 50%, popularity is spread almost uniformly across contents.
Fig 2.6(d) shows the performance of the replacement policies when the skewness factor ranges from 50% to 90%, with a relative cache size of 0.16%. Our simulation shows that as the number of popular contents increases, PP and PPCC always maintain stable performance and gain much better performance than the existing algorithms.
(4) Impact of replacement policy on server load
In CCN, IntPks can be satisfied immediately by edge routers or by the publisher server. If CCN nodes perform with a higher hitting ratio, less request traffic is fetched from the server, so a higher percentage of traffic is offloaded. Fig 2.7 shows the amount of traffic served by the server under the various caching schemes, with a relative cache size of 0.1%. The offloaded server capacity of PPCC and PP is quite similar, because the hitting rate of PPCC is only slightly higher than that of PP (see Fig 2.6(b)). However, PPCC converges to the final state faster than PP: in Fig 2.7, PPCC reduces the server load to around 10 Mbps by the 500th second, versus the 700th second for PP.
Overall, PPCC and PP help the server offload total traffic more quickly and deeply than LRU or LFU. The server traffic reduces from 16 Mbps to around 10 Mbps for PPCC and PP, followed by LRU and LFU at 12 Mbps. The offloading percentages for PPCC (PP) and LRU (LFU) are thus 37.5% and 25%, respectively.
Figure 2.7 Server load in case of LRU(LFU) and PPCC(PP)
2.4 Fine-Grained Popularity-based Caching (FGPC)
Similar in spirit to Most Popular Content (MPC) [32], FGPC maintains a large table holding three kinds of statistical information: i) content names, ii) popularity levels of contents, obtained by counting the frequency of appearances of each content name, and iii) time stamps of used contents located in the cache.
In order to quickly achieve a high hitting rate, the cache always stores newly arriving contents while it has available memory. To avoid inadequate replacement decisions once the cache overflows, FGPC checks the counting value of new content names. If the counter reaches a predetermined popularity threshold, the new content is stored using the LRU replacement policy; otherwise, the new content is simply discarded.
Compared with the existing LRU/LFU or MPC solutions, the FGPC scheme has the following unique features:
 In FGPC, we carefully investigate the characteristics of CCN, where a content name includes prefixes. We add a content popularity table to handle the content name, content counter, and time stamp.
 The way Internet users request content is well fitted by the Pareto principle; that is, 20% of the popular contents are requested by 80% of users [33-37].
 Popular contents always appear with high probability, and vice versa.
Moreover, an enhanced version of FGPC, dubbed Dynamic FGPC (D-FGPC), is proposed. Because the number of contents that can be stored in the cache and the number of requests arriving at a gateway/router change dynamically over time, the popularity threshold value should also be adjusted adaptively on the fly.
From the background of CCN [31], we import the CCN strategy into all network elements on top of the IP layer in the OPNET simulator. The existing LRU and MPC policies and our proposed FGPC and D-FGPC policies are successfully constructed in CCN nodes. Our simulation results show that CCN is a good solution to the existing challenges of the traditional IP network, and that FGPC and D-FGPC outperform LRU and MPC with highly effective caching.
2.4.1 FGPC strategy
In FGPC, each CCN node maintains a table containing statistical information about the popularity of each content name, in the form of a content counter along with a time stamp. Indeed, FGPC keeps track of popular contents by locally counting the frequency of appearances of each content name. There are three main operations conducted by FGPC:
 FGPC constantly updates the three kinds of statistical information in the popularity table, i.e., content name, content counter, and time stamp, when receiving a content from upstream or delivering a content downstream.
 FGPC always stores newly arriving contents (regardless of their popularity) while the CS has available space.
 When the CS is about to become full and a new content arrives, FGPC compares the popularity level of the new content (PX) to a predefined popularity threshold value (Pth). If PX exceeds Pth, FGPC adopts the LRU policy to store the new content in the CS; otherwise, FGPC ignores the content and does not cache it.
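The three operations above can be sketched as follows; this is a minimal Python model, and the class interface, the capacity measured in number of contents, and the logical time stamps are illustrative assumptions:

```python
import itertools

class FGPC:
    """Sketch of the three FGPC operations: update the popularity table on
    every request, cache everything while space remains, and filter by the
    popularity threshold Pth (with LRU replacement) once the CS is full."""

    def __init__(self, capacity, p_th):
        self.capacity = capacity
        self.p_th = p_th
        self.clock = itertools.count()       # logical time stamps
        self.counter = {}                    # content name -> popularity count
        self.cs = {}                         # content name -> last-used stamp

    def on_request(self, name):
        self.counter[name] = self.counter.get(name, 0) + 1   # operation 1
        if name in self.cs:
            self.cs[name] = next(self.clock)
            return True                      # cache hit
        if len(self.cs) < self.capacity:     # operation 2: store while space
            self.cs[name] = next(self.clock)
        elif self.counter[name] >= self.p_th:    # operation 3: Px vs Pth
            lru = min(self.cs, key=self.cs.get)
            del self.cs[lru]                 # LRU replacement for popular content
            self.cs[name] = next(self.clock)
        # An unpopular content arriving at a full CS is simply not cached.
        return False

fgpc = FGPC(capacity=2, p_th=3)
for name in ["/a", "/b", "/junk", "/a", "/b"]:
    fgpc.on_request(name)
# "/junk" (count 1 < Pth) was rejected while the CS was full.
```

Unlike MPC, the early requests still populate the cache, so the filtering only kicks in once the CS is full.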
In FGPC, CCN nodes achieve effective caching when they recognize the popularity levels of all contents and keep popular contents longer than less popular ones. The trade-offs between performance, space availability, and computational complexity should be taken into account. In order to significantly reduce the overhead of the popularity table, techniques such as the MD5 message-digest hash or mapping content names to digital numbers can be effective [44]. For instance, to keep one million content names, given that an MD5 hash value uses 16 bytes (plus 1 B for the counter and 3 B for the time stamp [43]), a CCN node would need an additional memory of merely 20*106 B, or 19.0735 MB, for the popularity table. Nowadays, there is a clear technical trend of network devices being provided with powerful packet processing and large memory, so in-network caching with the FGPC approach imposes no extra cost on the provider or the users.
2.4.2 D-FGPC strategy
In the basic FGPC scheme, the popularity threshold (Pth) is fixed for filtering popular content. Admittedly, this is not realistic, as in real-life deployments the popularity of a content changes with time. In addition, the number of arriving interest packets (ninterest), the practical file sizes (FP), and the available cache sizes (Csize) of different routers/gateways change dynamically. In this section, we define D-FGPC as a new variant of FGPC with dynamic Pth adjustment. For better illustration, Fig 2.8 shows the main operations of both the FGPC and D-FGPC schemes.
Figure 2.8 FGPC and D-FGPC flowchart
Below, we discuss the relationship between the parameters Pth, ninterest, FP and Csize:
 Given a fixed FP, when Csize increases, the available memory of the CS increases too. Pth must then decrease to relax the popularity-based content filtering and help fill the available memory of the CS as soon as possible. For this reason, D-FGPC fully utilizes the cache and converges faster than FGPC in the case of a large Csize.
 Given a fixed Csize, when FP increases, the number of practical files stored in the CS decreases. Pth must then increase to strengthen the popular-content filtering. It should be noted that contents in memory are constructed from many chunk files, and users normally request complete contents, which comprise a group of chunk files together.
 Given a fixed Csize, when ninterest increases, the number of newly arriving contents also increases. In order to use the limited cache size efficiently and accommodate the largest number of content requests, Pth must be set to higher values to enhance the popular-content filtering.
Similarly, we have the reverse situations and operations when Csize, FP and ninterest decrease, respectively. Following the above discussion of the relationship between Pth, ninterest, FP and Csize, a dynamic setting of the popularity threshold can be achieved
using Eq (2.5), whereby β is a constant that reflects the content-filtering factor:

    Pth = β · ninterest · FP / Csize                            (2.5)

Taking middle values of each parameter's range, ninterest can be estimated every minute, and β is determined by setting Pth to five. With the calculated β, when either FP or Csize varies, Pth will vary around five as well.
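The calibration described here can be sketched as follows; this is a Python sketch of Eq (2.5), and the mid-range values are illustrative assumptions, not thesis parameters:

```python
def dynamic_threshold(beta, n_interest, f_p, c_size):
    """Eq (2.5): Pth = beta * n_interest * F_P / C_size. Pth rises with the
    request rate and the file size, and falls as the cache grows."""
    return beta * n_interest * f_p / c_size

# Calibration: pick mid-range operating values, fix Pth = 5, solve for beta.
n_mid, f_mid, c_mid = 300.0, 1.0, 500.0     # illustrative mid-range values
beta = 5.0 * c_mid / (n_mid * f_mid)        # beta such that Pth = 5 here

p_now = dynamic_threshold(beta, n_mid, f_mid, c_mid)       # back to ~5
p_big = dynamic_threshold(beta, n_mid, f_mid, 2 * c_mid)   # larger cache ->
# the threshold halves, relaxing the filtering as discussed above
```

The same function captures all three bullet relationships: Pth scales linearly with ninterest and FP, and inversely with Csize.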
2.5 FGPC algorithm simulation and results
2.5.1 Network architecture
Aiming to consider a typical Internet network topology, we envision the network topology shown in Fig 2.9. Almost all networks are built with three layers: i) the core layer provides high-speed data transport between core routers and distribution sites; ii) the distribution layer provides policy-based connectivity, peer reduction, and aggregation; iii) the access layer provides common group access to the internetworking environment.
Figure 2.9 Envisioned network architecture for FGPC simulation