
DOCUMENT INFORMATION

Title: User-Delay-Tolerance-Aware Edge Node Placement Optimization for Cost Minimization
Author: Xiaoyu Zhang
Supervisors: Prof. Zhifeng Bao, Dr. Hai Dong
Institution: School of Computing Technologies, College of Science, Technology, Engineering and Maths, RMIT University
Field: Computer Science / Network Optimization
Document type: Thesis
Year: 2022
City: Melbourne
Pages: 64
File size: 753.37 KB


Structure

  • 1.1 Motivation (13)
  • 1.2 Research Questions (16)
  • 1.3 Research Contributions (17)
  • 1.4 Thesis Organization (18)
  • 2.1 MEC Network and Key Components (19)
  • 2.2 Literature Review (20)
    • 2.2.1 Delay Aware Edge Nodes Placement (20)
    • 2.2.2 Cost and Delay Aware Edge Node Placement (22)
  • 3.1 Preliminaries (24)
  • 3.2 Coarse-grained Formulation (26)
    • 3.2.1 Workload Measurement (26)
    • 3.2.2 Delay Measurement (27)
    • 3.2.3 Qualified EN Placement Plan (28)
  • 3.3 Fine-Grained Formulation (28)
    • 3.3.1 Workload Measurement (29)
    • 3.3.2 Delay Measurement (29)
    • 3.3.3 Qualified EN Placement Plan (30)
  • 3.4 Cost Minimization in MEC Edge Node Placement Problem (31)
  • 5.1 MIP-based Methods (34)
    • 5.1.1 Mixed-Integer Programming (MIP) (34)
    • 5.1.2 Cluster-based MIP (35)
  • 5.2 Heuristic-based Methods (38)
    • 5.2.1 Coverage First Search (CFS) (38)
    • 5.2.2 Distance Aware Coverage First Search (DA-CFS) (39)
    • 5.2.3 Fine-grained Optimization (40)
  • 6.1 Experiment Settings (44)
    • 6.1.1 Dataset (44)
    • 6.1.2 Experiment Environment (44)
    • 6.1.3 Methods for Comparison (44)
    • 6.1.4 Parameter Settings (45)
    • 6.1.5 Evaluation Metrics (46)
  • 6.2 Experimental Results (46)
    • 6.2.1 Effectiveness Analysis (46)
    • 6.2.2 Efficiency Analysis (48)
    • 6.2.3 Discussion on Scalability (50)
    • 6.2.4 Transmission and Computation Delay in θ (50)
    • 6.2.5 The Impact of θ (52)
    • 6.2.6 Peak vs. AVG (54)
    • 6.2.7 Clustering Results for MIP+Cluster (55)
    • 6.2.8 Ablation Study on τ for DA-CFS (56)
  • 6.3 Summary (57)
  • Appendices (59)
    • Figure 1.1 Mobile Cloud Computing Network Architecture
    • Figure 1.2 Mobile Edge (Multi-Access) Computing Network Architecture
    • Figure 1.3 Example of Optimal EN Deployment under Different Delay Tolerance
    • Figure 6.1 Effectiveness and efficiency with different BS input scale
    • Figure 6.2 Cost generated by MIP+Cluster with running time limit of 10min and 1h per cluster
    • Figure 6.3 Distribution of BSs with different transmission delay proportions on θ
    • Figure 6.4 Effectiveness and efficiency with different θ value
    • Figure 6.5 Deployment cost: Peak vs. AVG
    • Figure 6.6 Visualise clusters on Shanghai Telecom Dataset
    • Table 1.1 Example of fine-grained workload in BS and EN
    • Table 2.1 Literature review of delay minimization problems
    • Table 2.2 Literature review of cost minimization problems
    • Table 3.1 Table of notations
    • Table 6.1 Parameter setting
    • Table 6.2 Estimated EN coverage with different θ value
    • Table 6.3 Cluster results on different BS input scale
    • Table 6.4 Cost and # of EN selected with different τ in DA-CFS

Content

User Delay Tolerance Aware Edge Node Placement Optimization for Cost Minimization. A thesis submitted in fulfillment of the requirements for the degree of Master of Science. Xiaoyu Zhang, School of Computing Technologies, RMIT University.

Motivation

Deploying edge nodes involves balancing deployment costs with the delay experienced by mobile users, which depends on site selection and resource allocation decisions. Increasing the number of edge nodes or adding more computing resources can reduce transmission and computation delays, but also raises overall deployment expenses. Fortunately, users typically accept certain delay thresholds, allowing for optimized deployment strategies that minimize costs while meeting these tolerance levels. An illustrative scenario highlights how edge node selection and resource allocation directly impact deployment costs across different user delay tolerances.

Figure 1.3 illustrates how users' tolerance for service delays influences the optimal deployment of edge nodes to maximize cost-efficiency. Higher delay tolerance allows for more flexible and potentially more cost-effective edge node placements. In the figure, blue numbers indicate the distances between base stations and edge nodes, while black numbers represent the workload at each station, emphasizing the balance between service delay and deployment cost. This underscores the significance of user experience in guiding edge infrastructure deployment decisions.

The placement of the six edge nodes is influenced by both workload and distance, which jointly determine transmission delay; workload additionally determines computation delay. As shown in Fig. 1.3a, the initial connections between base stations highlight potential locations for deploying edge nodes, which are colocated with the base stations.

Implementing an edge node at a base station incurs a setup cost, while expanding its computing capacity by adding servers incurs additional purchase expenses. To minimize cost while respecting users' delay tolerance thresholds, edge nodes must be placed at suitable base stations with the right number of servers. Doing so can significantly reduce overall infrastructure and operational costs.

To minimize total cost, the optimal edge node deployment depends on users' delay tolerance. For a delay threshold of 22 seconds, the most cost-effective solution deploys two edge nodes, s1 and s2. This placement meets the latency requirement with the fewest resources, highlighting the importance of adapting deployment strategies to the delay tolerance parameter.

Adding new edge nodes is more costly than adding servers to existing ones, yet when the delay tolerance decreases to 16 seconds, the optimal placement shifts to installing new edge nodes (Fig. 1.3c). To meet this stricter delay requirement, two strategies are possible: adding servers to existing edge nodes to reduce computation delay, or installing new edge nodes to lower transmission delay. In this example, deploying new edge nodes turns out to be the cheaper option.

Allocating appropriate resources to each edge node is essential for minimizing deployment cost. Coarse-grained workload measurement, based on the peak workload of each location, can overestimate capacity needs: s1 appears to require 13 units, s2 needs 16, and s3 demands 6. However, the peak workloads of different locations occur at different times. Table 1.1 details the time-specific workloads from t1 to t5, revealing that the aggregate peak of s1 (at t4) is only 9, so summing per-location peaks results in overly conservative resource provisioning.

Thus, a more accurate fine-grained workload estimation can decrease the deployment cost.

Figure 1.3: Example of Optimal EN Deployment under Different Delay Tolerance ((c) shows OPT with a 16 s delay threshold).

Table 1.1: Example of fine-grained workload in BS and EN (∗ marks each column's peak)

|    | s1 | b3 | b4 | b5 | s2 | b6 | b7 | b8 | b9 | s3 | b1 |
|----|----|----|----|----|----|----|----|----|----|----|----|
| t1 | 1  | 4∗ | 0  | 1  | 4∗ | 0  | 0  | 3  | 1  | 0  | 0  |
| t2 | 0  | 3  | 1  | 0  | 3  | 1  | 1  | 4∗ | 3  | 0  | 1  |
| t3 | 4∗ | 0  | 1  | 0  | 2  | 1  | 0  | 3  | 4∗ | 1  | 2  |
| t4 | 3  | 2  | 2∗ | 2  | 0  | 2∗ | 2∗ | 1  | 1  | 2∗ | 3  |
| t5 | 2  | 1  | 0  | 3∗ | 1  | 0  | 2  | 0  | 3  | 1  | 4∗ |

1. C, F represent the coarse-grained and fine-grained workload metrics, respectively.
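The gap between the two metrics can be reproduced directly from the s1 group in Table 1.1 (a minimal sketch; the dictionary layout is ours, and s1 itself also serves as a base station):

```python
# Per-timestamp task counts (t1..t5) for EN s1 and its assigned BSs,
# taken from Table 1.1.
counts = {
    "s1": [1, 0, 4, 3, 2],
    "b3": [4, 3, 0, 2, 1],
    "b4": [0, 1, 1, 2, 0],
    "b5": [1, 0, 0, 2, 3],
}

# Coarse-grained: sum every location's own peak -> 4 + 4 + 2 + 3 = 13.
coarse = sum(max(c) for c in counts.values())

# Fine-grained: peak of the aggregate load across timestamps -> 9 (at t4).
fine = max(sum(step) for step in zip(*counts.values()))

print(coarse, fine)  # -> 13 9
```

The fine-grained peak is never larger than the coarse-grained sum, which is exactly why fine-grained provisioning is cheaper.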

Research Questions

Motivated by the application scenario above, to deploy edge nodes in an MEC network we propose the following research questions:

• RQ1: How to find the optimal EN locations?

– Sub-RQ1: How to define our problem to make it consider the cost and the delay simultaneously?

– Sub-RQ2: How to accurately measure the workload and the delay so that it can guarantee the robustness of the network?

  • RQ2: How to properly allocate resources?

To effectively assess workload and delay, it is essential to analyze individual timestamps, which provide precise insight into system load and timing. By examining each timestamp, we can identify bottlenecks and measure the exact delays experienced during processing. However, handling large volumes of timestamped data requires efficient processing techniques, such as optimized algorithms and scalable infrastructure, to keep the analysis tractable without compromising accuracy.

In addressing RQ1, we formulate the Cost Minimization problem for MEC Edge Node Placement, aiming to minimize overall cost while adhering to users' delay tolerance constraints. To enhance robustness, we introduce a peak-based workload measurement approach. Our formulation also evaluates both computation delay and transmission delay.

In addressing RQ2, we extend the formulation from RQ1 to a fine-grained level by measuring workload and delay at the user request level. This finer measurement lets us refine resource allocation on edge nodes, reducing the required computational resources and lowering deployment cost. A summary of contributions is given in Section 1.3.

Research Contributions

Few researchers have explored the trade-off between cost and delay in computation resource allocation, leaving a significant research gap. Existing studies face two major limitations. First, scalability: their methods struggle on large-scale real-world deployments, such as Shanghai's projected 50 5G base stations per km², so highly scalable and efficient solutions are crucial. Second, the delay model: most studies overlook that increasing the number of edge servers can effectively reduce computation delay.

Overall, our main contributions in this thesis include:

  • We introduce a practical formulation that balances deployment cost, from the service provider's perspective, against the transmission and computation delays experienced by users. We develop a peak workload metric to improve the robustness of deployment plans, addressing the limitation of previous studies that rely on average workload metrics and are less effective during traffic peaks. We also adopt a realistic delay measurement so that our solutions remain feasible in real-world scenarios.

  • We define our problem at two granularities: the base-station-based coarse-grained formulation (Chapter 3.2) and the user-request-based fine-grained formulation (Chapter 3.3).

  • We prove that our problem is NP-hard (Chapter 4).

  • We introduce a set of strategies to tackle the above challenges: a Cluster-based Mixed-Integer Programming (MIP) method to improve efficiency, a Distance Aware Coverage First Search (DA-CFS) algorithm to improve effectiveness over the basic Coverage First Search (CFS), and a fine-grained optimization strategy that further boosts the performance of both CFS and DA-CFS.

  • Our experiments on a large-scale real-world dataset demonstrate the performance of the proposed solutions. We specifically evaluate the impact of the fine-grained resource allocation strategy, which significantly enhances the effectiveness of CFS and DA-CFS. The results show that our methods outperform baseline approaches in scalability while maintaining competitive efficiency and effectiveness.

Thesis Organization

This thesis reviews related work and identifies research gaps in Chapter 2. It formulates the research problem in Chapter 3 and establishes its NP-hardness in Chapter 4. Building on the formulation, a suite of solutions is proposed in Chapter 5, and an extensive evaluation is conducted in Chapter 6. The thesis concludes with a summary of key findings and future directions in Chapter 7.

MEC Network and Key Components

In an MEC (Mobile Edge Computing) network, the two key components are base stations (BSs) and edge nodes (ENs). Base stations, also known as cellular base stations, serve mobile users by providing network connectivity, while edge nodes are small-scale servers that offer both network services and computing resources. To optimize deployment and reduce cost, service providers often colocate edge nodes with existing base stations, leveraging existing infrastructure to add computing facilities such as server units. This allows upgrades from base stations to edge nodes without extensive new construction, saving significant cost. Mobile users access computing resources either by offloading tasks directly to edge nodes or by first offloading to a base station, which then forwards the tasks to a nearby edge node for processing.

The distributed network architecture of MEC brings many challenging research problems.

This thesis investigates the problem of edge node site selection and resource allocation. A literature review follows to examine existing research and the key methodologies related to this problem.

Literature Review

Delay Aware Edge Nodes Placement

Minimizing delay in edge node placement is a key focus of current research. Numerous edge site selection strategies are documented in the literature, many of which concentrate on deploying a fixed number of edge nodes, e.g., placing K edge nodes within the network. These approaches typically make one of two assumptions regarding edge node capacity: either each edge node has its own distinct capacity [5,7,18], or each edge node has an identical capped capacity [4,6,13,14,15,16,17,19,26,27].

Effective edge node placement for minimizing latency depends on node capacity and workload distribution. For nodes with varying capacities, larger edge nodes should be positioned in high-traffic areas such as business centers, while smaller nodes are best placed in rural or less-visited regions. Conversely, when all nodes have identical capacity, placement instead deploys different quantities of nodes across locations according to workload demand.

Table 2.1: Literature review of delay minimization problems

In Table 2.1, HOM and HET represent homogeneous and heterogeneous server capacities, respectively. Homogeneous edge nodes require denser deployment in busy areas and sparser deployment in idle zones; since transmission delay grows with distance, there is an inherent trade-off between edge node density and latency. Neither the purely homogeneous nor the purely heterogeneous model fully captures real-world scenarios, due to their limited flexibility; in practice, edge node capacities are adapted to the workload demands at each location.

Several studies focus on minimizing end-user-to-edge-node delay, with [4,5,6,7,27] primarily addressing this objective. Other efforts, such as [13,14,15,16,17,18,19], aim to optimize both access delay and workload balance across edge nodes. To obtain optimal solutions, the authors of [13,17] employed Mixed-Integer Programming (MIP); due to efficiency concerns, approximate methods such as cluster-based solutions are proposed in [16,17,18,27], while [4,5,7] utilized greedy heuristic algorithms. Furthermore, genetic algorithms are proposed in [14,19] to improve accuracy.

Minimizing network delay is a crucial aspect of network planning, as delay significantly impacts Quality of Service (QoS). However, deployment cost is also a vital consideration for service providers. Since budgets are limited, it is essential to balance delay reduction with cost efficiency; considering both factors simultaneously leads to more effective and sustainable planning decisions.

Cost and Delay Aware Edge Node Placement

Table 2.2: Literature review of cost minimization problems

Column headings: Ref | Server Setting (Capacity¹, # in EN) | Delay (Usage², Trans.) | Workload

1. HO, HE represent homogeneous and heterogeneous server capacity, respectively.
2. O, T indicate whether delay is formulated as an optimization objective or as a threshold, respectively.

Current research on edge node deployment mainly targets minimizing user delay or balancing workload among edge nodes; only a few studies address deployment cost, a critical factor for service providers. Cost-aware approaches fall into two categories: one deploys a flexible number of homogeneous servers to reduce setup and purchase costs, while the other places a minimal set of heterogeneous servers, with only one server per edge node.

Compared with the second category, the first category has higher flexibility in manipulating edge node capacity: the capacity of an edge node depends on the number of servers deployed, making server placement a crucial aspect of deployment. The first category optimizes the placement of multiple servers, resulting in a larger solution space; the second is more straightforward, selecting the single most suitable server from a set of heterogeneous options.

Cost-related problems are addressed with methods such as MIP, greedy heuristics, and genetic algorithms. MIP provides optimal solutions, but its efficiency degrades significantly on large-scale datasets. Greedy heuristics are faster but lack effectiveness guarantees, while genetic algorithms, being randomized, depend heavily on running time and provide no theoretical performance bounds.

Our work falls into the first category, as outlined in Table 2.2. Among previous studies, [21] balances cost and delay through multi-objective optimization, while [10] seeks the edge node deployment that minimizes cost within user delay thresholds. However, these approaches have limitations. Their scalability is unverified: both rely on MIP and were tested only on small datasets, such as 20 base stations in [21]. Their delay models also lack realism: [10] neglects computation delay entirely, whereas [21] considers computation delay but overlooks that increasing computational capacity at an edge node can reduce processing time.

To alleviate the aforementioned limitations, in this thesis we define a more practical delay measurement and propose an approximate solution with remarkable strengths in scalability.

Previous studies generally measure workload with an average-based metric, which may lack robustness in peak-hour scenarios. We instead propose a peak-based workload measurement that estimates workload by accounting for its variation over specific time periods.

This chapter begins by defining the MEC network and its core components. It then describes the workload and delay measurement methodologies, and formulates the optimization problem of minimizing deployment cost in both a coarse-grained (Chapter 3.2) and a fine-grained (Chapter 3.3) form. The coarse-grained formulation addresses the problem at the base station level, while the fine-grained formulation works at the user request level; notably, the fine-grained model subsumes the coarse-grained case as a special scenario. Frequently used notations are listed in Table 3.1.

Preliminaries

An MEC network comprises a set of base stations (BSs) and edge nodes (ENs), each represented by a tuple containing its ID, latitude, longitude, and number of allocated servers. Following established conventions, ENs are colocated with BSs: a base station is upgraded to an edge node when servers are added to provide computational resources, and multiple standard servers can be placed in an EN to ensure sufficient processing capacity. Formally, all base stations initially have zero servers, while each edge node has at least one server allocated.

  • S — set of edge nodes
  • B — set of base stations
  • b — base station
  • s.n — number of servers placed in edge node s
  • r — user request
  • d — distance
  • CT — number of concurrent tasks
  • ξ — single task size
  • µ — single server computation rate
  • C^trans — transmission capacity of the channel
  • C^comp — computation capacity of the edge node

Deployment cost in our network includes two components: EN setup cost and server cost. The EN setup cost (pr) covers the expenses of upgrading a base station to an edge node, such as infrastructure rental and construction fees. The server cost (ps) covers the purchase of new servers. For instance, installing a server at a plain base station incurs a combined cost of pr + ps, whereas adding a server to an existing edge node incurs only the server cost ps.

Connectivity between a BS and an EN is established when a specific delay threshold is met, considering both transmission and computation delay. The EN service range, denoted R(s), comprises all BSs connected to EN s, i.e., the area within which s can effectively provide service. This definition highlights how the delay constraint determines the coverage of each edge node.

A mobile user initiates a user request to a base station to access a data stream. A request is defined as r = (b_i, t_s, t_e), where b_i is the base station receiving the request, t_s is the request start time, and t_e is the request end time. To use the network's computation resources, mobile users send such requests to their designated base stations.

In the coarse-grained formulation, user requests are directed either directly to an EN or to a nearby BS, which then offloads them to an EN for processing. Throughout this thesis, the terms "user request" and "task" are used interchangeably unless specified otherwise.

Because the service ranges of ENs overlap, a base station may distribute user requests among multiple nearby ENs. However, since each user request requires continuous processing, it must be handled by a single EN and cannot be split across ENs.

An assignment determines which user requests are offloaded to which edge nodes from which base stations. Each base station may offload requests to multiple edge nodes, but every user request is assigned to exactly one EN, which processes it to completion without migration. The selected edge nodes must collectively cover all base stations in the network so that all user requests can be served.

We use A to denote assignments at the coarse-grained base station level and At at the fine-grained user request level. For instance, A(s1) = {b1, b2} indicates that base stations b1 and b2 are assigned to edge node s1, while At(s1) = {r1, r2, r3} indicates that, at a given timestamp, user requests r1, r2, and r3 are offloaded to edge node s1.
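The two assignment granularities can be sketched with plain dictionaries (a minimal illustration; the names A and At follow the thesis notation, and the values are the examples above):

```python
# Coarse-grained: EN -> set of assigned base stations.
A = {"s1": {"b1", "b2"}}
# Fine-grained: EN -> requests offloaded to it at one timestamp.
At = {"s1": ["r1", "r2", "r3"]}

# Each request maps to exactly one EN -- no splitting across ENs.
request_to_en = {r: s for s, reqs in At.items() for r in reqs}
print(request_to_en["r2"])  # -> s1
```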

Coarse-grained Formulation

Workload Measurement

Most existing studies measure the workload of a BS or an EN by the task's average request rate [14,15,17]. However, in real cases the workload usually fluctuates dramatically during a day [2], so the peak workload is non-negligible for the robustness of the network, especially during rush hour.¹

We introduce a peak workload metric to measure network capacity under heavy data traffic. Focusing on data-intensive computing tasks such as HD video, we want the MEC network to withstand overwhelming workloads. The task size is defined as ξ bits, and the peak workload occurs during the time period with the highest task arrival rate. Tasks are assumed to be processed immediately upon arrival, so this metric reflects the network's maximum processing burden.

Concurrent tasks are tasks whose processing times overlap. CT denotes the number of concurrent tasks at a given moment, while CT_max denotes the maximum number of concurrent tasks observed simultaneously. Using this peak metric, the workload of a BS b is characterized by the highest level of task concurrency observed during its operation.
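CT_max can be computed with a standard sweep over task start/end events (a sketch; the function name and interval data are ours, not the thesis's):

```python
def max_concurrent(intervals):
    """CT_max: peak number of overlapping [start, end) task intervals."""
    events = []
    for start, end in intervals:
        events.append((start, 1))   # a task begins
        events.append((end, -1))    # a task ends
    # Sort by time; process ends (-1) before starts (+1) at the same
    # instant, i.e., treat intervals as half-open.
    events.sort(key=lambda e: (e[0], e[1]))
    ct = ct_max = 0
    for _, delta in events:
        ct += delta
        ct_max = max(ct_max, ct)
    return ct_max

# Three tasks; at most two overlap at any instant.
print(max_concurrent([(0, 4), (2, 6), (4, 8)]))  # -> 2
```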

Delay Measurement

Task offloading between ENs is not permitted, so a task experiences two primary delays: transmission delay, incurred when the task is transmitted between a base station (BS) and an edge node (EN), and computation delay, incurred when the task is processed within the EN. These delays are determined by the channel's transmission capacity and the EN's computation capacity, respectively.

Transmission capacity. We adopt Shannon's channel capacity formula² to compute a channel's transmission capacity (denoted C^trans):

C^trans = B · log₂(1 + SP / N)

In this equation, B represents the channel's bandwidth, SP the average received signal power over the channel, and N the average noise power over the channel.

Assuming equal signal power across all channels, channel noise is in reality influenced by many factors, including distance, environmental conditions, and cable quality.³ To simplify the analysis, we follow a common approach in the existing literature [17,13,15,7] and assume that channel noise is determined primarily by distance.

1 https://www.ciscopress.com/articles/article.asp?p%259&seqNum=6

2 Shannon theorem: http://www.inf.fu-berlin.de/lehre/WS01/19548-U/shannon.html

3 Noise: https://documentation.meraki.com/MR/WiFi Basics and Best Practices

We define the noise as N = α·d(s, b), where d(s, b) denotes the distance between s and b, and α is a coefficient relating the noise to the distance.
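Under this noise model, the capacity formula can be evaluated directly (a sketch; the function name and the parameter values are illustrative, not from the thesis):

```python
import math

def transmission_capacity(bandwidth, signal_power, alpha, distance):
    """Shannon capacity C_trans = B * log2(1 + SP / N), with the
    distance-based noise model N = alpha * d(s, b)."""
    noise = alpha * distance
    return bandwidth * math.log2(1 + signal_power / noise)

# Illustrative numbers: capacity shrinks as the BS-EN distance grows.
near = transmission_capacity(20e6, 1.0, 0.01, 100)    # 100 m
far = transmission_capacity(20e6, 1.0, 0.01, 1000)    # 1 km
print(near > far)  # -> True
```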

Computation capacity. Adding servers to an EN increases its computation capacity. Assuming all servers placed at an EN have identical processing rate µ (in bits per second), the total computation capacity of an EN with s.n servers is C^comp = s.n · µ, i.e., capacity scales linearly with the number of servers deployed.

The delay between a BS and an EN is determined by both the data size and the processing capacity, and comprises two components: transmission delay and computation delay. Our delay model is designed to reflect both factors.

Here, the first term represents the transmission delay for transmitting tasks from b to s, while the second term represents the computation delay for s to process the task.
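The two-term model can be sketched as follows (an illustration under the definitions above, with C^comp = s.n · µ; the function name and numbers are ours):

```python
def delay(workload_bits, c_trans, n_servers, mu):
    """BS-to-EN delay = transmission term + computation term.
    c_trans: channel capacity (bit/s); mu: single-server rate (bit/s)."""
    transmission = workload_bits / c_trans
    computation = workload_bits / (n_servers * mu)
    return transmission + computation

# Doubling the servers halves only the computation term.
d1 = delay(1e6, 2e6, 1, 1e6)   # 0.5 + 1.0 s
d2 = delay(1e6, 2e6, 2, 1e6)   # 0.5 + 0.5 s
print(d1, d2)  # -> 1.5 1.0
```

This is exactly the trade-off from Section 1.1: adding servers reduces only computation delay, while moving the EN closer reduces only transmission delay.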

Qualified EN Placement Plan

Definition 1 (Qualified EN Placement Plan). Given a set of BSs B and a delay threshold θ, select a subset S ⊆ B as ENs such that the following constraints hold: (1) ∀s ∈ S, ∀b ∈ A[s]: D(s, b) ≤ θ; (2) ⋃_{s∈S} A[s] = B \ S; (3) ∀s_i, s_j ∈ S with s_i ≠ s_j: A[s_i] ∩ A[s_j] = ∅.

Constraint (1) ensures that the delay experienced by users stays within the threshold θ; constraint (2) ensures that the selected ENs S serve all base stations in B \ S; and constraint (3) ensures that each BS is assigned to exactly one EN for task offloading.
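The three constraints can be checked mechanically for a candidate plan (a sketch; the data structures and function name are ours):

```python
def is_qualified(B, S, A, D, theta):
    """Check Definition 1 for a candidate plan.
    B: all BSs; S: chosen ENs; A: EN -> set of assigned BSs;
    D: (en, bs) -> delay; theta: delay threshold."""
    # (1) every assigned BS meets the delay threshold.
    if any(D[(s, b)] > theta for s in S for b in A[s]):
        return False
    # (2) the ENs jointly cover every non-EN base station.
    covered = set().union(*(A[s] for s in S)) if S else set()
    if covered != B - S:
        return False
    # (3) assignments are pairwise disjoint (one EN per BS).
    if sum(len(A[s]) for s in S) != len(covered):
        return False
    return True

B = {"b1", "b2", "b3"}
S = {"b1"}                      # b1 upgraded to an EN
A = {"b1": {"b2", "b3"}}
D = {("b1", "b2"): 10, ("b1", "b3"): 15}
print(is_qualified(B, S, A, D, theta=16))  # -> True
```

Tightening θ below 15 makes the same plan unqualified, which is the effect illustrated by Fig. 1.3.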

Fine-Grained Formulation

Workload Measurement

By combining the concurrent-task definition with the peak workload metric, we now define workload measurement at the fine-grained user request level, subsuming the previous coarse-grained case.

Users submit requests lasting from r.ts to r.te for data-intensive computing tasks such as HD video, each requiring ξ bits of data at any given timestamp. This assumption ensures that the MEC network is provisioned to handle heavy workloads in real time.

During a time period from t1 to t2, two types of workload are considered: the peak transmission workload between EN s_j and BS b_i (Equation 3.5), and the peak computation workload at EN s_j (Equation 3.6).

W_{t1,t2}(s_j) = ξ · max{CT^comp(t, s_j) : t ∈ [t1, t2]}   (3.6)

where max{CT(t, ·) : t ∈ [t1, t2]} finds the maximum number of concurrent tasks appearing between t1 and t2.

Coarse-grained workload estimation at the edge nodes can be recovered at the base station level from concurrent tasks. Specifically, for each base station b ∈ A(s), the maximum number of simultaneous tasks at its peak timestamp t_i, denoted CT_{t_i}, gives the peak workload W_peak(b) = ξ · CT_{t_i}; the workload at any other time t' ≠ t_i cannot exceed this peak, i.e., W_{t'}(b) ≤ W_peak(b). Since the peaks of different base stations may occur at different times, a coarse-grained total workload for EN s can be approximated as W(s) = Σ_{b_i ∈ A(s)} W_peak(b_i), an upper bound that avoids detailed per-timestamp analysis.
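A small sketch of Eqs. (3.5)-(3.6): per-timestamp concurrent-task counts are reduced to peak workloads (the dictionary names and data are illustrative, not the thesis's):

```python
xi = 8  # single task size in bits (illustrative)

# Concurrent-task counts per timestamp in the window [t1, t2].
ct_trans = {("s1", "b1"): [1, 3, 2], ("s1", "b2"): [2, 1, 4]}  # (EN, BS)
ct_comp = {"s1": [3, 4, 5]}                                    # EN only

# Eq. (3.5): peak transmission workload between EN s and BS b.
W_trans = {k: xi * max(v) for k, v in ct_trans.items()}
# Eq. (3.6): peak computation workload at EN s.
W_comp = {s: xi * max(v) for s, v in ct_comp.items()}

print(W_trans[("s1", "b1")], W_comp["s1"])  # -> 24 40
```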

Delay Measurement

In the fine-grained case, two types of delay must be addressed: transmission delay, the time taken to transmit a user request between a BS and an EN, and computation delay, the time required to process the request at the EN.

The BS's transmission capacity and the EN's computation capacity are calculated in the same way as in the coarse-grained case introduced in Chapter 3.2.2.

Delay depends on the data size and the processing capacity [31]. Using the peak workload metric defined above, we estimate the delay D(r) of a user request r as the sum of the transmission delay between BS b_i and EN s_j and the computation delay in s_j:

D(r) = W_{r.ts,r.te}(b_i, s_j) / C(b_i, s_j) + W_{r.ts,r.te}(s_j) / (s_j.n · μ)    (3.7)

where the first term represents the transmission delay for task r from b_i to s_j (peak transmission workload over the BS–EN transmission capacity), and the second term represents the computation delay experienced by r in s_j (peak computation workload over s_j.n servers of capacity μ each).

A coarse-grained estimation of delay at the base station level can be achieved using the workload measurements in Section 3.2, which allows us to model the maximum delay experienced between a BS and its ENs.

Qualified EN Placement Plan

Definition 2 (Qualified EN Placement Plan). Given a set B of BSs, each with a set of user requests attached, and a delay threshold θ, select a subset S ⊆ B as ENs such that the following constraints hold:

(1) For each single user request r from a BS b ∈ B, the delay experienced is always within the given threshold θ, i.e., ∀b ∈ B, ∀r ∈ b, D(r) ≤ θ;

(2) A single BS can have multiple designated ENs for task offloading, so its user requests may be distributed among different ENs; however, each user request is processed by exactly one EN at any given timestamp, i.e., the processing assignments of a request to distinct ENs never overlap in time;

(3) Once a user request r has been offloaded to an EN s_j at timestamp t_k, it is not reassigned to another EN later: r is either completed at t_k or continues to be processed by the same EN in subsequent timestamps.


Cost Minimization in MEC Edge Node Placement Problem

Definition 3 (Cost Minimization in MEC Edge Node Placement). The CMMENP problem is to find an optimal EN placement plan S* which can minimise the total deployment cost, i.e.,

F(S*) = min_S Σ_{s ∈ S} (p_r + s.n · p_s)    (3.9)

where F(S*) denotes the total cost incurred by selecting S* as ENs, S is a qualified EN placement plan, s.n is the number of servers placed at s, p_r is the setup cost, and p_s is the server cost.
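The cost model can be sketched directly. In the sketch below, the server count per EN follows the coarse-grained sizing rule that appears later in the MIP formulation (enough servers that the computation delay fits within θ − D_trans); all parameter names are illustrative.

```python
from math import ceil

# Sketch of the deployment cost: each selected EN pays a fixed setup
# cost p_r plus p_s per server placed on it.

def servers_needed(workload, theta, d_trans, mu):
    # smallest n such that workload / (n * mu) <= theta - d_trans
    return ceil(workload / ((theta - d_trans) * mu))

def total_cost(workloads, theta, d_trans, mu, p_r, p_s):
    """workloads: one aggregate peak workload per selected EN."""
    return sum(p_r + servers_needed(w, theta, d_trans, mu) * p_s
               for w in workloads)
```

Minimising this quantity over all qualified placement plans is exactly the CMMENP objective; the hard part, addressed in the following chapters, is searching the space of plans.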

In this chapter, we prove the NP-hardness of our problem through a reduction from an NP-hard problem, namely the Dominating Set (DS) problem [32].

The Dominating Set (DS) problem: given a graph G = (V, E), find a minimum subset D* ⊆ V such that every vertex not in D* is adjacent to at least one vertex in D*.

Theorem 1 CMMENP is NP-hard.

Proof We prove that CMMENP is NP-hard through a reduction from the Dominating Set (DS) problem The reduction is performed in three main steps:

First, consider a special case of CMMENP in which the server cost p_s = 0 and the construction cost p_r = 1. Since placing servers is free, each EN can be given unlimited computation capacity, so the total cost is determined solely by the number of ENs deployed, and the computation delay becomes negligible by adding arbitrarily many servers. In this case the threshold θ is constrained by the transmission delay only.

Second, we model CMMENP with a graph G = (B, E′), where B is the set of base stations and E′ the connections between them: an edge (u, v) exists between two BSs u and v if and only if the transmission delay between them does not exceed the threshold θ.

Since p_s = 0 removes the influence of computation delay, edge weights can be disregarded and the input graph treated as an unweighted, undirected graph, which matches the input of the DS problem. Consequently, for any DS instance with vertex set V and edge set E, we can construct an instance of CMMENP by mapping V and E to the base station set B and the connections E′, respectively.

Third, the goal of CMMENP is to minimise the total cost. With p_r = 1 and p_s = 0, this is equivalent to finding the smallest subset S* ⊆ B that covers all BSs outside S*, which is exactly the optimal solution D* to the Dominating Set instance. Hence, if a polynomial-time algorithm could find S* for CMMENP, the DS problem could also be solved in polynomial time, which is impossible unless P = NP. Therefore, CMMENP is NP-hard.
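Under the reduction, an EN placement that covers every BS corresponds exactly to a dominating set of the unweighted graph G = (B, E′). A minimal checker over an adjacency map (names illustrative) makes the correspondence concrete:

```python
# Check whether D is a dominating set of the graph given as an
# adjacency map: every vertex is either in D or adjacent to some
# vertex in D. In the reduction, D plays the role of the EN set S.

def is_dominating(adj, D):
    """adj: {v: set of neighbours}; D: candidate vertex set."""
    return all(v in D or adj[v] & D for v in adj)
```

For a path 1–2–3, the set {2} dominates while {1} does not, mirroring how a central BS within θ of its neighbours can serve them all as an EN.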

Our solutions fall into two categories: MIP-based methods and heuristic methods. The MIP-based approach formulates CMMENP as a linear optimization problem solved by a MIP solver, but it cannot scale to large datasets; to mitigate this, we introduce a cluster-based MIP. The heuristic Coverage First Search (CFS) is highly efficient but less effective, which we improve with the Distance Aware Coverage First Search (DA-CFS). Finally, we develop a fine-grained resource allocation strategy on top of the heuristics to further reduce deployment cost at the user request level.

MIP-based Methods

Mixed-Integer Programming (MIP)

Mixed-Integer Programming (MIP) is a widely used technique for solving linear optimization problems. Although no existing work addresses our exact problem, we adapt the MIP formulation of [21], whose objectives align most closely with ours.

We solve CMMENP with MIP by adopting the peak workload metric. Because fine-grained workload measurement would increase the runtime exponentially, we formulate the MIP for the coarse-grained case only. To model CMMENP, we use a binary array y of size |B| to represent the EN selection scheme, where y[j] = 1 indicates that BS b_j is chosen as an EN and 0 otherwise, and a matrix x of size |B| × |B| to represent BS assignment, where x[j][i] = 1 indicates that BS b_i is assigned to EN s_j and 0 otherwise.

minimise Σ_{j=1}^{|B|} (y_j · p_r + Σ_{i=1}^{|B|} x_{j,i} · ⌈W(b_i, s_j) / ((θ − D_trans) · μ)⌉ · p_s)    (5.1)

subject to the following constraints:

(2) if y_j = 0, then Σ_{i=1}^{|B|} x_{i,j} = 1 and Σ_{i=1}^{|B|} x_{j,i} = 0;

Constraints (1) and (2) together ensure that at least one EN is deployed and that each BS is assigned to exactly one EN; Constraints (2) and (3) ensure that the chosen ENs cover all BSs in the network.

The MIP solution has an exponential time complexity (O(2^n)), where n represents the number of input base stations, making it impractical for large-scale datasets Experimental results indicate that, despite delivering high-quality deployment schemes, MIP cannot efficiently handle scalability issues in real-world, large-scale scenarios Additionally, MIP's formulation is limited to coarse-grained cases and cannot be applied to fine-grained deployment problems These limitations motivate the exploration of more efficient and scalable solutions for large-scale network deployment.
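The exponential blow-up can be seen in the exhaustive search underlying the exact formulation: enumerate subsets of BSs as candidate EN sets and keep the smallest feasible one. The sketch below uses the simplified special case from the NP-hardness proof (every BS within delay θ of some EN) as its feasibility test and omits server costs; it is an illustration of the O(2^n) search space, not the solver's algorithm.

```python
from itertools import combinations

# Brute-force search over all 2^n subsets of BSs: smaller EN sets are
# tried first, so the first feasible subset found is minimum-size.

def brute_force_placement(B, delay, theta):
    """B: list of BS ids; delay(u, v): pairwise delay function.
    Returns a minimum-size EN set covering every BS, or None."""
    for k in range(1, len(B) + 1):
        for S in combinations(B, k):
            if all(any(delay(b, s) <= theta for s in S) for b in B):
                return list(S)
    return None
```

Even at a few dozen base stations this enumeration is hopeless, which is precisely what motivates the cluster-based MIP and the heuristics in the following sections.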

Cluster-based MIP

To address the scalability challenge of MIP, whose running time grows exponentially with input size, we propose a cluster-based MIP that retains most of MIP's effectiveness while remaining feasible on larger inputs.

Input: Base Station set B, Minimum Cluster Size β

// cut tree at various heights

When partitioning the input base stations into clusters, three requirements must be met: (i) each cluster should be independent, with clear boundaries between internal and external points; (ii) clusters should be balanced in size, since overly large clusters hurt efficiency and very small ones degrade effectiveness; and (iii) the partitioning itself must be efficient. Since the clustering problem resembles the k-cut problem in graph theory, which is NP-hard, approximate solutions are necessary.

We adopt hierarchical agglomerative clustering, which groups data points by progressively merging the closest clusters. Because nearby clusters are merged gradually, the method avoids points spanning multiple clusters and keeps the resulting clusters independent and well defined.

However, hierarchical agglomerative clustering does not guarantee balance between clusters, especially on datasets with a single center such as ours (see Fig. 6.6). We therefore modify the original algorithm to produce more balanced clusters; the details are given in Algorithm 1.

We first build a dendrogram over the data points with traditional hierarchical clustering (line 3). Cutting the dendrogram directly at various heights yields agglomerative formations of different granularity but often produces unbalanced cluster sizes, so we instead perform a bottom-up search at each level. Once a cluster exceeding the minimum size threshold is found, it is extracted from the dataset (line 11), and the hierarchy tree is rebuilt on the remaining base stations. This repeats until too few base stations remain, at which point the remaining stations form the last cluster (line 20).
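The peel-and-rebuild idea of Algorithm 1 can be sketched with a naive single-linkage merge loop. This is a simplified illustration: merging stops as soon as some cluster reaches the minimum size β, that cluster is peeled off, and clustering restarts on the rest. The O(n³) merging here is for clarity only, not the O(n log² n) construction the thesis uses.

```python
# Simplified sketch of balanced clustering: repeatedly merge the two
# closest clusters (single linkage) until one reaches size beta, peel
# it off, and restart on the remaining points.

def balanced_clusters(points, beta, dist):
    """points: distinct hashable items; beta: minimum cluster size."""
    remaining = list(points)
    result = []
    while len(remaining) >= beta:
        clusters = [[p] for p in remaining]
        peeled = None
        while peeled is None and len(clusters) > 1:
            best = None  # (distance, i, j) of the closest cluster pair
            for i in range(len(clusters)):
                for j in range(i + 1, len(clusters)):
                    d = min(dist(a, b)
                            for a in clusters[i] for b in clusters[j])
                    if best is None or d < best[0]:
                        best = (d, i, j)
            _, i, j = best
            clusters[i].extend(clusters.pop(j))
            if len(clusters[i]) >= beta:
                peeled = clusters[i]  # first cluster to reach beta
        if peeled is None:
            break
        result.append(sorted(peeled))
        taken = set(peeled)
        remaining = [p for p in remaining if p not in taken]
    if remaining:
        result.append(sorted(remaining))  # leftovers form the last cluster
    return result
```

On two well-separated groups of points, each group is peeled as its own cluster, matching the intuition that the bottom-up search finds compact, size-bounded clusters before global merges occur.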

For the hierarchy tree we use an efficient hierarchical agglomerative clustering algorithm with time complexity O(n log² n), and searching for a suitably sized cluster takes O(n log n). Over all iterations, the total time complexity of our clustering method is O(n² log³ n).

Partitioning the base stations into clusters lets us apply MIP locally within each cluster, which greatly reduces the search space and yields near-optimal solutions per cluster. This trades a small loss in accuracy for much better scalability: given the same time budget (e.g., one hour), the cluster-based MIP consistently outperforms plain MIP in solution quality. Detailed experimental results are reported in Chapter 6.

Input : Base Station set B, Delay Threshold θ

1 S ← ∅, A ← ∅; // A: a set of ⟨s : {b1, b2, ...}⟩ for BS assignment

Heuristic-based Methods

Coverage First Search (CFS)

We introduce Coverage First Search (CFS), an efficient approximation algorithm for minimizing deployment cost. It is motivated by the observation that constructing a new EN is typically far more expensive than adding a standard server to an existing EN [10]; CFS therefore focuses on reducing the number of ENs deployed.

As shown in Algorithm 2, we first model the connections between BSs under the delay threshold θ. We then iteratively select the BS with the largest number of connections as an EN, assign it and all its connected BSs within its service range R to that EN, and remove them from the set; this repeats until every BS is assigned. The time complexity of CFS is dominated by computing the connections between BSs.
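The greedy loop above can be sketched compactly. This is an illustrative stand-in for Algorithm 2 (names are not from the thesis): coverage is recomputed over the still-uncovered BSs, the best-covering BS is promoted to an EN, and its coverage is removed.

```python
# Sketch of Coverage First Search: connect BSs whose pairwise delay is
# within theta, then greedily promote the BS covering the most
# uncovered BSs to an EN and remove everything it covers.

def cfs(B, delay, theta):
    """Returns {en: sorted list of BSs assigned to it}."""
    uncovered = set(B)
    assignment = {}
    while uncovered:
        def coverage(s):
            # uncovered BSs reachable from s within the delay threshold
            return {b for b in uncovered if delay(s, b) <= theta}
        en = max(uncovered, key=lambda s: len(coverage(s)))
        covered = coverage(en)
        assignment[en] = sorted(covered)
        uncovered -= covered
    return assignment
```

Because each iteration removes at least the chosen EN itself, the loop always terminates, and the number of keys in the returned map is the number of ENs the heuristic deploys.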

Input: Base Station set B, Delay threshold θ, Candidate number τ

1 S ← ∅, ← ∅,M ← ∅,A ← ∅; // : candidate set, M: inverted index of

8 if distance(b, bs)
