Network Slicing for Service-Oriented Networks with Flexible Routing and Guaranteed E2E Latency
Corresponding author: Ya-Feng Liu.
Abstract
Network function virtualization is a promising technology to simultaneously support multiple services with diverse characteristics and requirements in the fifth generation and beyond networks. In practice, each service consists of a predetermined sequence of functions, called a service function chain (SFC), running in a cloud environment. To make different service slices work properly in harmony, it is crucial to select the cloud nodes that deploy the functions in the SFC and to flexibly route the flow of each service such that these functions are processed in sequence, the end-to-end (E2E) latency constraints of all services are guaranteed, and all resource constraints are respected. In this paper, we propose a new mixed binary linear program (MBLP) formulation of the above network slicing problem that optimizes the system energy efficiency while jointly considering the resource budget, function instantiation, flow routing, and E2E latency requirements. Numerical results show the advantage of the proposed formulation compared to the existing ones.
Index Terms:
E2E delay, network function virtualization, network slicing, resource allocation, service function chain.

I Introduction
Network function virtualization (NFV) is considered one of the key technologies for the fifth generation (5G) and beyond 5G (B5G) networks [1]. In contrast to traditional networks, where service functions are processed by dedicated hardware at fixed locations, NFV can efficiently take advantage of cloud technologies to configure specific nodes in the network to process network service functions on demand, and then flexibly establish a customized virtual network for each service request. In an NFV-enabled network, classic network nodes are integrated with NFV-enabled nodes (i.e., cloud nodes), and each service consists of a predetermined sequence of virtual network functions (VNFs), called a service function chain (SFC) [2, 3], which can only be processed by certain specific cloud nodes [4]. In practice, each service flow has to pass through all of its VNFs in sequence, and its end-to-end (E2E) latency requirement must be satisfied. However, since all VNFs run over a shared common network infrastructure, it is crucial to allocate cloud and communication resources so as to meet the diverse service requirements, subject to the SFC constraints, the E2E latency constraints of all services, and the capacity constraints of all cloud nodes and links.
The above resource allocation problem in NFV-enabled networks is called network slicing in the literature, and considerable work has been done on it recently; see [4]-[10] and the references therein. More specifically, references [4, 5] considered the VNF deployment problem under limited network resource constraints but did not take the E2E latency constraint of each service into consideration. Reference [6] investigated a specific two-layer network consisting of a central cloud node and several edge cloud nodes, without considering the limited link capacities, which may lead to violations of resource constraints. Reference [7] simplified the routing strategy by selecting paths from a predetermined path set. Reference [8] simplified the VNF placement decision by assuming that all VNFs in an SFC must be instantiated at the same cloud node. Reference [9] proposed a way of analyzing the dependencies between traffic routing and VNF placement in NFV networks. Although the E2E latency requirement of each service was integrated into that formulation, only a single path was allowed to route the traffic flow of each service. Apparently, such a formulation does not fully exploit the flexibility of traffic routing and hence might degrade the performance of the whole network. Reference [10] assumed that the instantiation of a VNF can be split over multiple cloud nodes, which may result in high coordination overhead in practice.
To the best of our knowledge, none of the existing formulations/works on the network slicing problem simultaneously takes all of the above practical factors (e.g., flexible routing, E2E latency, and coordination overhead) into consideration. The goal of this work is to provide a mathematical formulation of the network slicing problem that simultaneously allows the traffic flows to be flexibly transmitted on (possibly) multiple paths, satisfies the E2E latency requirements of all services, and requires that each service function in an SFC be processed by exactly one cloud node. In particular, we formulate the problem as a novel mixed binary linear program (MBLP), which minimizes the total power consumption of the whole network subject to the SFC constraints, the E2E latency constraints of all services, and the capacity constraints of all cloud nodes and links. Numerical results show the effectiveness of our proposed formulation.
II System Model and Problem Formulation
In this section, we present an MBLP formulation of the network slicing problem.
II-A System Model
Consider a communication network $\mathcal{G} = (\mathcal{I}, \mathcal{L})$, where $\mathcal{I}$ is the set of nodes and $\mathcal{L}$ is the set of links. The network supports a set of flows $\mathcal{K}$. We assume that each link $(i, j) \in \mathcal{L}$ has an expected delay $d_{ij}$ [9], and a total data rate upper bounded by the capacity $C_{ij}$. Let $\mathcal{V} \subseteq \mathcal{I}$ denote the set of cloud nodes. Each cloud node $v \in \mathcal{V}$ has a computational capacity $\mu_v$, and we assume as in [4] that processing one unit of data rate requires one unit of (normalized) computational capacity. Let $S(k)$ and $D(k)$ be the source and destination of flow $k \in \mathcal{K}$, respectively, and suppose that $S(k), D(k) \notin \mathcal{V}$. Each flow relates to a distinct service, which is given by an SFC consisting of $\ell_k$ functions that have to be performed in sequence by the network:
$$f_1^k \to f_2^k \to \cdots \to f_{\ell_k}^k. \qquad (1)$$
As required in [4], [6], and [9], to minimize the coordination overhead, each function must be instantiated at exactly one cloud node. If function $f_s^k$, $s \in \mathcal{F}(k) := \{1, \ldots, \ell_k\}$, is processed by cloud node $v \in \mathcal{V}$, we assume the expected NFV delay $d_{v,s}(k)$ is known, which includes both processing delay and queuing delay, as in [9]. For flow $k$, denote by $\lambda_0(k)$ and $\lambda_s(k)$ the service function rates before receiving any function and after receiving function $f_s^k$, respectively. Each flow $k$ is required to have an E2E latency guarantee, denoted as $\Theta_k$.
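To fix ideas, the following minimal Python sketch (with hypothetical names; our illustration, not part of the formulation) maps the above notation onto plain data structures:

```python
# A minimal sketch (hypothetical names) of the system model's ingredients.
from dataclasses import dataclass

@dataclass
class Link:                  # a link (i, j) in L
    delay: float             # expected delay d_ij
    capacity: float          # data-rate capacity C_ij

@dataclass
class Flow:                  # a flow k in K
    source: str              # S(k), not a cloud node
    dest: str                # D(k), not a cloud node
    sfc: list[str]           # ordered functions f_1^k, ..., f_l^k
    rates: list[float]       # lambda_0(k), ..., lambda_l(k)
    e2e_budget: float        # E2E latency threshold Theta_k

@dataclass
class Network:
    nodes: set[str]                   # node set I
    links: dict[tuple, Link]          # (i, j) -> Link, the set L
    cloud_capacity: dict[str, float]  # v -> mu_v, for cloud nodes v in V
```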
II-B Preview of the New Formulation
The network slicing problem is to determine the functional instantiation of all flows as well as the routes and the associated data rates of all flows on these routes, while satisfying the SFC requirements, the E2E latency requirements, and the capacity constraints on all cloud nodes and links. In this section, we provide a new formulation of the network slicing problem that takes practical factors such as coordination overhead, flexible routing, and E2E latency requirements into consideration; see problem (21) further ahead.
Our proposed formulation builds upon those in two closely related works [9] and [4] but takes further steps. More specifically, in sharp contrast to the formulation in [9], where only a single path is allowed to route the traffic flow of each service (between two cloud nodes processing two adjacent functions of a service), our proposed formulation allows the traffic flow of each service to transmit on (possibly) multiple paths and hence fully exploits the flexibility of traffic routing; different from that in [4], our formulation guarantees the E2E delay of each service, which consists of two types of delays: the total communication delay on the links and the total NFV delay on the cloud nodes. Next, we describe the constraints and the objective function of our formulation in detail.
II-C VNF Placement and Node Capacity Constraints
We introduce the binary variable $x_{v,s}(k)$, $v \in \mathcal{V}$, $s \in \mathcal{F}(k)$, $k \in \mathcal{K}$, to indicate whether or not cloud node $v$ processes function $f_s^k$, i.e.,
$$x_{v,s}(k) = \begin{cases} 1, & \text{if cloud node } v \text{ processes function } f_s^k, \\ 0, & \text{otherwise.} \end{cases} \qquad (2)$$
For simplicity of presentation, as in [4], we require that each cloud node processes at most one function of each flow:
$$\sum_{s \in \mathcal{F}(k)} x_{v,s}(k) \le 1, \quad \forall\, v \in \mathcal{V}, \ k \in \mathcal{K}. \qquad (3)$$
For each flow $k \in \mathcal{K}$, we require that each service function in the chain is served by exactly one cloud node, i.e.,
$$\sum_{v \in \mathcal{V}} x_{v,s}(k) = 1, \quad \forall\, k \in \mathcal{K}, \ s \in \mathcal{F}(k). \qquad (4)$$
Since processing one unit of data rate consumes one unit of (normalized) computational capacity, we obtain the following node capacity constraints:
$$\sum_{k \in \mathcal{K}} \sum_{s \in \mathcal{F}(k)} \lambda_s(k)\, x_{v,s}(k) \le \mu_v, \quad \forall\, v \in \mathcal{V}. \qquad (5)$$
Let $y_v \in \{0, 1\}$ represent the activation of cloud node $v$: if $y_v = 1$, node $v$ is activated and powered on; otherwise, it is powered off. Thus
$$x_{v,s}(k) \le y_v, \quad \forall\, v \in \mathcal{V}, \ k \in \mathcal{K}, \ s \in \mathcal{F}(k). \qquad (6)$$
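To make the placement model concrete, here is a minimal gurobipy sketch of constraints (2)-(6) on a hypothetical toy instance; it is our illustration, not the paper's code, and all sizes, rates, and capacities below are assumed placeholders.

```python
# A minimal sketch of the VNF placement constraints (2)-(6) in gurobipy.
# The toy instance (node names, rates, capacities) is hypothetical.
import gurobipy as gp
from gurobipy import GRB

V = ["C", "E"]                          # cloud nodes (assumed)
K = ["I", "II"]                         # flows/services (assumed)
F = {"I": [1], "II": [1]}               # SFC function indices per flow
lam = {("I", 1): 2.0, ("II", 1): 2.0}   # service function rates (assumed)
mu = {"C": 4.0, "E": 4.0}               # node capacities mu_v (assumed)

m = gp.Model("vnf_placement")
x = m.addVars(((v, k, s) for v in V for k in K for s in F[k]),
              vtype=GRB.BINARY, name="x")   # x_{v,s}(k), cf. (2)
y = m.addVars(V, vtype=GRB.BINARY, name="y")  # activation y_v

# (3): each cloud node processes at most one function of each flow
m.addConstrs((x.sum(v, k, "*") <= 1 for v in V for k in K), name="c3")
# (4): each service function is served by exactly one cloud node
m.addConstrs((x.sum("*", k, s) == 1 for k in K for s in F[k]), name="c4")
# (5): one unit of data rate consumes one unit of capacity
m.addConstrs((gp.quicksum(lam[k, s] * x[v, k, s] for k in K for s in F[k])
              <= mu[v] for v in V), name="c5")
# (6): a node hosting any function must be activated
m.addConstrs((x[v, k, s] <= y[v] for v in V for k in K for s in F[k]),
             name="c6")

m.setObjective(y.sum(), GRB.MINIMIZE)   # minimize activated cloud nodes
m.optimize()
print({v: int(y[v].X) for v in V})      # activation pattern
```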
II-D Flexible Routing and Link Capacity Constraints
We assume that there are at most $P$ paths between any pair of cloud nodes that process two adjacent functions of a flow. In general, such an assumption on the number of paths may affect the solution quality. Indeed, the choice of $P$ offers a tradeoff between the flexibility of routing in the problem formulation and the computational complexity of solving it: the larger the parameter $P$, the more flexible the routing and the higher the computational complexity.
Denote $\mathcal{P} = \{1, \ldots, P\}$. If cloud nodes $v_s$ and $v_{s+1}$ are used to host the $s$-th and $(s+1)$-th functions of flow $k$ (i.e., functions $f_s^k$ and $f_{s+1}^k$), respectively, and path $p \in \mathcal{P}$ is used to route the traffic flow, let $r(k, s, p, v_s, v_{s+1})$ be the associated amount of the data rate. We need to introduce this variable in our formulation, as the traffic flow of each service is allowed to transmit on (possibly) multiple paths in order to exploit the flexibility of traffic routing. In particular, if $s = 0$, we set $v_0 = S(k)$ with the convention $x_{S(k),0}(k) = 1$, and if $s = \ell_k$, we set $v_{\ell_k+1} = D(k)$ with the convention $x_{D(k),\ell_k+1}(k) = 1$. Notice that by (3) and the fact that $S(k), D(k) \notin \mathcal{V}$, we must have $v_s \neq v_{s+1}$. For each $k \in \mathcal{K}$, $s \in \{0\} \cup \mathcal{F}(k)$, and pair $(v_s, v_{s+1})$ with $v_s \neq v_{s+1}$, from the definitions of $x_{v,s}(k)$ and $r(k, s, p, v_s, v_{s+1})$, we have
$$\sum_{p \in \mathcal{P}} r(k, s, p, v_s, v_{s+1}) = \lambda_s(k)\, x_{v_s, s}(k)\, x_{v_{s+1}, s+1}(k). \qquad (7)$$
Constraint (7) indicates that if the $s$-th and $(s+1)$-th functions of flow $k$ (i.e., functions $f_s^k$ and $f_{s+1}^k$) are hosted at cloud nodes $v_s$ and $v_{s+1}$, respectively, then the sum of all the data rates sent from $v_s$ to $v_{s+1}$ must be equal to $\lambda_s(k)$. We then use $z_{ij}(k, s, p, v_s, v_{s+1}) = 1$ to denote that the $s$-th and $(s+1)$-th functions of flow $k$ (i.e., functions $f_s^k$ and $f_{s+1}^k$) are processed by cloud nodes $v_s$ and $v_{s+1}$, respectively, that path $p$ is used to route the associated traffic flow, and that link $(i, j)$ is on path $p$; otherwise, $z_{ij}(k, s, p, v_s, v_{s+1}) = 0$. By definition, for all $(i, j) \in \mathcal{L}$, $p \in \mathcal{P}$, and pairs $(v_s, v_{s+1})$, we have
$$z_{ij}(k, s, p, v_s, v_{s+1}) \le x_{v_s, s}(k)\, x_{v_{s+1}, s+1}(k). \qquad (8)$$
If $z_{ij}(k, s, p, v_s, v_{s+1}) = 1$, let $r_{ij}(k, s, p, v_s, v_{s+1})$ denote the associated amount of data rate on link $(i, j)$. By definition, for each $(i, j) \in \mathcal{L}$, $p \in \mathcal{P}$, and pair $(v_s, v_{s+1})$, we have the following coupling constraints:
$$r_{ij}(k, s, p, v_s, v_{s+1}) \le \lambda_s(k)\, z_{ij}(k, s, p, v_s, v_{s+1}). \qquad (9)$$
The total data rate on link $(i, j)$ is upper bounded by its capacity $C_{ij}$:
$$\sum_{k \in \mathcal{K}} \sum_{s \in \{0\} \cup \mathcal{F}(k)} \sum_{p \in \mathcal{P}} \sum_{(v_s, v_{s+1})} r_{ij}(k, s, p, v_s, v_{s+1}) \le C_{ij}, \quad \forall\, (i, j) \in \mathcal{L}. \qquad (10)$$
II-E SFC Constraints
To ensure that the functions of each flow are performed in the prespecified order in (1), we introduce several constraints below. For each $k \in \mathcal{K}$, $s \in \{0\} \cup \mathcal{F}(k)$, pair $(v_s, v_{s+1})$ with $v_s \neq v_{s+1}$, $p \in \mathcal{P}$, and $i \in \mathcal{I}$, we have
$$\sum_{j : (i,j) \in \mathcal{L}} r_{ij}(k, s, p, v_s, v_{s+1}) - \sum_{j : (j,i) \in \mathcal{L}} r_{ji}(k, s, p, v_s, v_{s+1}) = r(k, s, p, v_s, v_{s+1}), \ \text{if } i = v_s; \qquad (11)$$
$$\sum_{j : (i,j) \in \mathcal{L}} r_{ij}(k, s, p, v_s, v_{s+1}) - \sum_{j : (j,i) \in \mathcal{L}} r_{ji}(k, s, p, v_s, v_{s+1}) = 0, \ \text{if } i \neq v_s, v_{s+1}; \qquad (12)$$
$$\sum_{j : (j,i) \in \mathcal{L}} r_{ji}(k, s, p, v_s, v_{s+1}) - \sum_{j : (i,j) \in \mathcal{L}} r_{ij}(k, s, p, v_s, v_{s+1}) = r(k, s, p, v_s, v_{s+1}), \ \text{if } i = v_{s+1}; \qquad (13)$$
and
$$\sum_{j : (i,j) \in \mathcal{L}} z_{ij}(k, s, p, v_s, v_{s+1}) = x_{v_s, s}(k)\, x_{v_{s+1}, s+1}(k), \ \text{if } i = v_s; \qquad (14)$$
$$\sum_{j : (j,i) \in \mathcal{L}} z_{ji}(k, s, p, v_s, v_{s+1}) = \sum_{j : (i,j) \in \mathcal{L}} z_{ij}(k, s, p, v_s, v_{s+1}) \le x_{v_s, s}(k)\, x_{v_{s+1}, s+1}(k), \ \text{if } i \neq v_s, v_{s+1}; \qquad (15)$$
$$\sum_{j : (j,i) \in \mathcal{L}} z_{ji}(k, s, p, v_s, v_{s+1}) = x_{v_s, s}(k)\, x_{v_{s+1}, s+1}(k), \ \text{if } i = v_{s+1}. \qquad (16)$$
First, note that constraints (11), (12), and (13) are flow conservation constraints for the data rates. Second, we need another three flow conservation constraints, (14), (15), and (16), for the path-selection variables. To be more precise, for each pair of cloud nodes $v_s$ and $v_{s+1}$, considering constraints (14), (15), and (16), we only need to look at the case that $x_{v_s, s}(k) = 1$ and $x_{v_{s+1}, s+1}(k) = 1$, since otherwise, by constraint (8), all the variables in (14), (15), and (16) must be equal to zero. Constraint (15) enforces that, for every node that does not host functions $f_s^k$ or $f_{s+1}^k$, if one of its incoming links is used to route the flow from node $v_s$ to node $v_{s+1}$ on path $p$, then one of its outgoing links is also assigned to this path. Similarly, constraints (14) and (16) imply that, if function $f_s^k$ is hosted at node $v_s$, one of the outgoing links of node $v_s$ must be assigned to path $p$ and, if function $f_{s+1}^k$ is hosted at node $v_{s+1}$, one of the incoming links of node $v_{s+1}$ must be assigned to path $p$, respectively.
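To illustrate how (14)-(16) select a single path, here is a minimal gurobipy sketch on an assumed toy digraph; the hosting pair is fixed, so the bilinear right-hand sides reduce to the constant 1.

```python
# A minimal sketch (assumed toy digraph, not the paper's code) of the
# path-selection constraints (14)-(16) for one flow segment whose
# hosting pair (vs, vt) is fixed, i.e., x_{vs,s}(k) * x_{vt,s+1}(k) = 1.
import gurobipy as gp
from gurobipy import GRB

nodes = ["A", "B", "C", "D"]
links = [("A", "B"), ("B", "D"), ("A", "C"), ("C", "D")]   # assumed
delay = {("A", "B"): 1.0, ("B", "D"): 2.0,
         ("A", "C"): 2.0, ("C", "D"): 2.0}                 # assumed d_ij
vs, vt = "A", "D"   # nodes hosting two adjacent functions

m = gp.Model("path_selection")
z = m.addVars(links, vtype=GRB.BINARY, name="z")

# (14): exactly one outgoing link of vs starts the path
m.addConstr(z.sum(vs, "*") == 1, name="c14")
# (15): every other node forwards the path if it receives it
m.addConstrs((z.sum("*", i) == z.sum(i, "*")
              for i in nodes if i not in (vs, vt)), name="c15")
# (16): exactly one incoming link of vt terminates the path
m.addConstr(z.sum("*", vt) == 1, name="c16")

# pick the path with the smallest total delay, cf. constraint (17)
m.setObjective(gp.quicksum(delay[l] * z[l] for l in links), GRB.MINIMIZE)
m.optimize()
print([l for l in links if z[l].X > 0.5])   # -> [('A','B'), ('B','D')]
```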
II-F E2E Latency Constraints
Next, we consider the delay constraints. Let $\theta(k, s)$ be the variable denoting the communication delay due to the traffic flow from the cloud node hosting function $f_s^k$ to the cloud node hosting function $f_{s+1}^k$. Then, $\theta(k, s)$ should be the largest delay among the $P$ candidate paths, i.e.,
$$\theta(k, s) \ge \sum_{(i,j) \in \mathcal{L}} d_{ij}\, z_{ij}(k, s, p, v_s, v_{s+1}), \quad \forall\, p \in \mathcal{P}, \ (v_s, v_{s+1}). \qquad (17)$$
Hence, the total communication delay on the links of flow $k$, denoted as $\theta_L(k)$, can be written as
$$\theta_L(k) = \sum_{s \in \{0\} \cup \mathcal{F}(k)} \theta(k, s). \qquad (18)$$
Now, for each flow $k$, we consider the total NFV delay on the cloud nodes, denoted as $\theta_N(k)$. This can be written as
$$\theta_N(k) = \sum_{s \in \mathcal{F}(k)} \sum_{v \in \mathcal{V}} d_{v,s}(k)\, x_{v,s}(k). \qquad (19)$$
The E2E delay of flow $k$ is the sum of the total communication delay $\theta_L(k)$ and the total NFV delay $\theta_N(k)$. The following delay constraint ensures that flow $k$'s E2E delay is less than or equal to its threshold $\Theta_k$:
$$\theta_L(k) + \theta_N(k) \le \Theta_k, \quad \forall\, k \in \mathcal{K}. \qquad (20)$$
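As an illustration, the following gurobipy sketch (with assumed toy values, not the paper's code) encodes the pattern of (17)-(20): the per-segment delay variable dominates every candidate path's delay, and the aggregated delay is kept within the budget.

```python
# A minimal sketch of the E2E latency constraints (17)-(20). In problem
# (21) the right-hand side of (17) is the linear expression
# sum_{(i,j)} d_ij * z_ij(...), which vanishes for unused paths; with
# the constant placeholders below, theta is simply the max over paths.
import gurobipy as gp
from gurobipy import GRB

K = ["I"]                  # a single flow (assumed)
segs = {"I": [0, 1]}       # segments: source -> f_1, f_1 -> destination
P = [1, 2]                 # two candidate paths per segment (assumed)
path_delay = {("I", 0, 1): 2.0, ("I", 0, 2): 3.0,   # assumed values
              ("I", 1, 1): 1.0, ("I", 1, 2): 2.0}
nfv_delay = {"I": 1.0}     # total NFV delay, cf. (19) (assumed)
Theta = {"I": 7.0}         # E2E threshold Theta_k (assumed)

m = gp.Model("latency")
theta = m.addVars(((k, s) for k in K for s in segs[k]), name="theta")

# (17): theta(k,s) is at least the delay of every candidate path
m.addConstrs((theta[k, s] >= path_delay[k, s, p]
              for k in K for s in segs[k] for p in P), name="c17")
# (18)-(20): total communication delay plus NFV delay within threshold
m.addConstrs((theta.sum(k, "*") + nfv_delay[k] <= Theta[k] for k in K),
             name="c20")

m.optimize()
print(m.Status == GRB.OPTIMAL)   # True: the delay budget can be met
```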
II-G A New MBLP Formulation
The power consumption of a cloud node is the combination of the dynamic load-dependent power consumption (which increases linearly with the load) and the static power consumption [12]. Our objective is to minimize the total power consumption of the whole network:
$$\sum_{v \in \mathcal{V}} \left( P_{\mathrm{on}}\, y_v + P_{\mathrm{off}}\, (1 - y_v) \right) + P_{\mathrm{pro}} \sum_{v \in \mathcal{V}} \sum_{k \in \mathcal{K}} \sum_{s \in \mathcal{F}(k)} \lambda_s(k)\, x_{v,s}(k).$$
In the above, the parameters $P_{\mathrm{on}}$ and $P_{\mathrm{off}}$ are the power consumptions of an activated and an inactivated cloud node, respectively, satisfying $P_{\mathrm{on}} > P_{\mathrm{off}}$; the parameter $P_{\mathrm{pro}}$ is the power consumption of processing one unit of data rate. From (4), the above objective function can be simplified as $(P_{\mathrm{on}} - P_{\mathrm{off}}) \sum_{v \in \mathcal{V}} y_v + C$, where $C$ is a constant. Hence, minimizing the total power consumption is equivalent to minimizing the total number of activated cloud nodes. Based on the above analysis, we obtain the following problem formulation:
$$\min_{x,\, y,\, z,\, r,\, \theta} \ \sum_{v \in \mathcal{V}} y_v \quad \text{s.t.} \ (3)\text{--}(20), \ \ x,\, y,\, z \ \text{binary}, \ \ r,\, r_{ij},\, \theta \ge 0. \qquad (21)$$
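For completeness, the objective simplification invoked above can be spelled out (using the parameter names adopted in our presentation):

```latex
% Simplification of the power objective using constraint (4):
\begin{align*}
  &\sum_{v\in\mathcal{V}}\bigl(P_{\mathrm{on}}y_v + P_{\mathrm{off}}(1-y_v)\bigr)
   + P_{\mathrm{pro}}\sum_{v\in\mathcal{V}}\sum_{k\in\mathcal{K}}
     \sum_{s\in\mathcal{F}(k)}\lambda_s(k)\,x_{v,s}(k) \\
  &\qquad = (P_{\mathrm{on}}-P_{\mathrm{off}})\sum_{v\in\mathcal{V}}y_v
   + \underbrace{P_{\mathrm{off}}\,|\mathcal{V}|
   + P_{\mathrm{pro}}\sum_{k\in\mathcal{K}}\sum_{s\in\mathcal{F}(k)}
     \lambda_s(k)}_{\text{constant }C},
\end{align*}
% where the last step uses \sum_{v\in\mathcal{V}} x_{v,s}(k) = 1 from (4).
```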
II-H Analysis Results of Problem (21)
We now present some analysis results for problem (21) (without proofs due to space limitations). First, problem (21) can be recast as an MBLP (with the numbers of binary variables and linear constraints both polynomial in the size of the network), since the bilinear terms of binary variables in (7), (8), and (14)--(16) can be equivalently linearized. More specifically, we can replace each bilinear term $x_{v_s, s}(k)\, x_{v_{s+1}, s+1}(k)$ by an auxiliary binary variable $w$ and add the linear constraints $w \le x_{v_s, s}(k)$, $w \le x_{v_{s+1}, s+1}(k)$, and $w \ge x_{v_s, s}(k) + x_{v_{s+1}, s+1}(k) - 1$. The linearity of problem (21) is vital, as it allows us to leverage efficient integer programming solvers such as Gurobi [11] to solve the problem to global optimality. Second, problem (21) is strongly NP-hard. Therefore, the above approach can only solve problem (21) for small networks. In future work, we shall develop polynomial-time heuristic algorithms for solving problem (21) that trade off performance against time complexity. Third, if we set $P = 1$ in (21), our proposed formulation reduces to that in [9]. In particular, the variables $r_{ij}(k, s, p, v_s, v_{s+1})$ in (10) can be replaced by the right-hand side of (9), and all constraints related to these variables (e.g., (7), (9), (11), (12), and (13)) can be removed. Our proposed formulation with $P \ge 2$ allows the traffic flows to transmit over possibly multiple paths and thus fully exploits the flexibility of traffic routing.
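The standard linearization described above can be sanity-checked directly; the following gurobipy sketch (our illustration, not the paper's code) verifies that the three linear constraints force $w = x_1 x_2$ for all binary combinations:

```python
# A minimal sketch of linearizing the binary product w = x1 * x2, as
# used to remove the bilinear terms in (7), (8), and (14)-(16).
import gurobipy as gp
from gurobipy import GRB

m = gp.Model("linearize")
x1 = m.addVar(vtype=GRB.BINARY, name="x1")
x2 = m.addVar(vtype=GRB.BINARY, name="x2")
w = m.addVar(vtype=GRB.BINARY, name="w")

m.addConstr(w <= x1)            # w = 0 whenever x1 = 0
m.addConstr(w <= x2)            # w = 0 whenever x2 = 0
m.addConstr(w >= x1 + x2 - 1)   # w = 1 whenever x1 = x2 = 1

# check all four binary combinations by fixing the bounds of x1, x2
for a in (0, 1):
    for b in (0, 1):
        x1.LB = x1.UB = a
        x2.LB = x2.UB = b
        m.optimize()
        assert round(w.X) == a * b   # w equals the product in each case
```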
III Numerical Results
In this section, we present numerical results to illustrate the effectiveness of the proposed formulation.
III-A An Illustrative Example
In this subsection, we show the performance of the proposed formulation by using an illustrative example.
Consider the example in Fig. 1. There are two different functions available. Cloud node C can process only one of them, while cloud node E can process both.
Suppose there are two services: service I is from node A to node D and service II is from node A to node B, each with an E2E delay threshold.
One of the two functions needs to be processed for service I and the other for service II; for each service $k$, the service function rates $\lambda_0(k)$ and $\lambda_1(k)$ are equal; and the NFV delays of the functions at (possible) cloud node C and cloud node E are identical.
Solving problem (21) gives a solution in which both cloud nodes C and E are activated and each service's function is processed by one of them. Given the communication delay on each link (as shown in Fig. 1) and the NFV delay of each function at cloud node C or cloud node E, the E2E delays of service I and service II in this solution both satisfy the E2E latency requirements.
Now, consider the case where there are no E2E latency constraints, i.e., constraints (17)-(20) are removed from problem (21).
Notice that this reduces to the formulation considered in [4].
Since the objective is to minimize the number of activated cloud nodes, the obtained solution processes both functions at cloud node E.
Service II then traverses the links from node A to node B via cloud node E, and the resulting communication delay, plus the NFV delay, clearly violates its E2E latency constraint.
Next, suppose that there is only one service, from node A to node D. The considered service contains a single function, and the service function rates $\lambda_0$ and $\lambda_1$ are equal. If only a single path is allowed to transmit the traffic flow, as in [9], no solution exists for this example due to the limited link capacities. In sharp contrast, using our formulation (21) with $P \ge 2$, the traffic flow can be flexibly transmitted on multiple paths, which gives a feasible solution as follows: first use two paths to route the flow from node A to cloud node E, with the data rate split equally between them; after the function is processed by cloud node E, route the flow to the destination node D over the link between E and D. This toy example clearly shows the benefit of flexible routing in our proposed formulation (21): it places a weaker requirement on the link capacities needed to support the services.
III-B Simulation Results
In this subsection, we present more simulation results to illustrate the effectiveness of the proposed formulation compared to those in [4] and [9].
We randomly generate a network consisting of a fixed number of nodes placed in a square region of the Euclidean plane, several of which are cloud nodes. A link is generated for each pair of nodes $i$ and $j$ with a fixed probability. The communication delay on link $(i, j)$ is calculated as the length of link $(i, j)$ divided by the average length of all shortest paths between every pair of nodes. The cloud node and link capacities are randomly chosen from bounded intervals. A set of different service functions is available. Among the cloud nodes, all but one are randomly chosen to process a subset of the service functions, and the remaining one can process all of the service functions. The processing delay of each function at each cloud node is randomly chosen from a bounded interval. For each service $k$, nodes $S(k)$ and $D(k)$ are randomly chosen from the network nodes excluding the cloud nodes; the SFC $\mathcal{F}(k)$ is an ordered sequence of three functions randomly chosen from the available service functions (i.e., $\ell_k = 3$); the service function rates are all set to a common value; and the E2E delay threshold $\Theta_k$ is set to a multiple of the length of the shortest path between nodes $S(k)$ and $D(k)$, where the multiplier is randomly chosen from a bounded interval. The above parameters are carefully chosen such that the constraints of problem (21) are neither too tight nor too loose.
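As an illustration of this procedure, the following Python sketch (with assumed sizes and probabilities, not the paper's exact values) generates one random instance and assigns link delays as described:

```python
# A minimal sketch of the random instance generation described above.
# N_NODES, N_CLOUD, P_LINK, and SIDE are hypothetical placeholders.
import math
import random
import networkx as nx

random.seed(0)
N_NODES, N_CLOUD, P_LINK = 10, 4, 0.4
SIDE = 100.0   # side length of the square region

# nodes with random positions; links added with probability P_LINK
G = nx.Graph()
pos = {i: (random.uniform(0, SIDE), random.uniform(0, SIDE))
       for i in range(N_NODES)}
G.add_nodes_from(pos)
for i in range(N_NODES):
    for j in range(i + 1, N_NODES):
        if random.random() < P_LINK:
            G.add_edge(i, j, length=math.dist(pos[i], pos[j]))

# link delay: link length over the average length of all shortest
# paths between node pairs, as described above
sp = dict(nx.all_pairs_dijkstra_path_length(G, weight="length"))
pairs = [(u, v) for u in G for v in G if u < v and v in sp[u]]
avg_sp = sum(sp[u][v] for u, v in pairs) / len(pairs)
for u, v, data in G.edges(data=True):
    data["delay"] = data["length"] / avg_sp

cloud = random.sample(sorted(G.nodes), N_CLOUD)   # cloud node subset
print(G.number_of_edges(), "links; cloud nodes:", cloud)
```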
In our simulations, we randomly generate 100 problem instances for each fixed number of services, and the results presented below are based on statistics from all 100 instances. In problem (21), the path-number parameter $P$ is kept fixed across all instances. We use Gurobi 9.0.1 [11] to solve all MBLP problems.
[Fig. 2: the number of feasible problem instances versus the number of services, for the proposed formulation (21) and the formulations in [4] and [9].]
Since the formulation in [4] does not explicitly take the latency constraints into consideration, the blue curve in Fig. 2 is obtained as follows. We solve the formulation in [4] after changing its objective to minimizing the number of activated cloud nodes and then substitute the obtained solution into the latency constraints in (20): if the solution satisfies all latency constraints, we count the corresponding problem instance as feasible; otherwise, it is infeasible. We can see from Fig. 2 that the number of feasible problem instances obtained by solving our proposed formulation (21) is significantly larger than that obtained by solving the formulation in [4]. This clearly shows the advantage of our proposed formulation (i.e., its guaranteed E2E latency) over that in [4]. In addition, the flexibility of traffic routing in our proposed formulation (21) enables it to solve a larger number of problem instances than can be solved using the formulation in [9]. These results further illustrate the effectiveness of our proposed formulation (21) as compared to those in [4] and [9].
[Fig. 3: the average number of activated cloud nodes and the average NFV, communication, and total delays versus the number of services.]
We now show the performance of problem (21) versus the number of services. Fig. 3 is obtained by averaging the results over all feasible problem instances. We can observe from Fig. 3 that, as the number of services increases, more cloud nodes need to be activated, while the average NFV delay remains almost unchanged. The latter is due to the fact that the number of functions in each service is fixed (to be 3) and the difference between NFV delays at different cloud nodes is small. However, the average communication delay (and hence the average total delay) increases rapidly. This is mainly due to the limited link capacities. More specifically, as the network traffic gets heavier, a traffic flow may have to use a path with a larger communication delay when some link's capacity (on a path with a smaller communication delay) is insufficient for its data rate.
IV Conclusions
In this paper, we have investigated the network slicing problem, which plays a crucial role in 5G and B5G networks. We have proposed a new MBLP formulation for the network slicing problem, which can be solved to global optimality by standard solvers such as Gurobi. Our proposed formulation minimizes the total power consumption of the whole network (equivalently, the total number of activated cloud nodes) subject to the SFC constraints, the E2E latency constraints of all services, and the capacity constraints of all cloud nodes and links. Numerical results demonstrate the advantage of our proposed formulation over the existing ones in [4] and [9].
References
- [1] R. Mijumbi, J. Serrat, J.-L. Gorricho, N. Bouten, et al., “Network function virtualization: State-of-the-art and research challenges,” IEEE Commun. Surveys Tuts., vol. 18, no. 1, pp. 236-262, 2016.
- [2] Y. Zhang, N. Beheshti, L. Beliveau, G. Lefebvre, et al., “StEERING: A software-defined networking for inline service chaining,” in Proc. 21st IEEE Int. Conf. Netw. Protocols (ICNP), 2013, pp. 1-10.
- [3] G. Mirjalily and Z.-Q. Luo, “Optimal network function virtualization and service function chaining: A survey,” Chin. J. Electron., vol. 27, no. 4, pp. 704-717, 2018.
- [4] N. Zhang, Y.-F. Liu, H. Farmanbar, T.-H. Chang, et al., “Network slicing for service-oriented networks under resource constraints,” IEEE J. Sel. Areas Commun., vol. 35, no. 11, pp. 2512-2521, 2017.
- [5] N. Zhang, Y.-F. Liu, H. Farmanbar, T.-H. Chang, et al., “System and method for network slicing for service-oriented networks,” US Patent App. 16/557,169, 2019.
- [6] A. De Domenico, Y.-F. Liu, and W. Yu, “Optimal computational resource allocation and network slicing deployment in 5G hybrid C-RAN,” in Proc. IEEE Int. Conf. Commun., 2019, pp. 1-6.
- [7] J. W. Jiang, T. Lan, S. Ha, M. Chen, et al., “Joint VM placement and routing for data center traffic engineering,” in Proc. IEEE INFOCOM, 2012, pp. 2876-2880.
- [8] B. Addis, D. Belabed, M. Bouet, and S. Secci, “Virtual network functions placement and routing optimization,” in Proc. IEEE 4th Int. Conf. Cloud Netw. (CloudNet), 2015, pp. 171-177.
- [9] Y. T. Woldeyohannes, A. Mohammadkhan, K. K. Ramakrishnan, and Y. Jiang, “ClusPR: Balancing multiple objectives at scale for NFV resource allocation,” IEEE Trans. Netw. Service Manag., vol. 15, no. 4, pp. 1307-1321, 2018.
- [10] M. Charikar, Y. Naamad, J. Rexford, and X. K. Zou, “Multi-commodity flow with in-network processing,” in Proc. Int. Symp. Algorithmic Aspects Cloud Comput. (ALGOCLOUD), 2018, pp. 73-101.
- [11] Gurobi Optimization, “Gurobi optimizer reference manual,” 2019. [Online]. Available: http://gurobi.com
- [12] 3GPP TSG SA5, “TR 32.972, Telecommunication management, Study on system and functional aspects of energy efficiency in 5G networks,” Release 16, V. 16.1.0, 2019.